Publication


Featured research published by Petersen F. Curt.


Field-Programmable Custom Computing Machines | 2006

Floating-Point Accumulation Circuit for Matrix Applications

Michael R. Bodnar; John R. Humphrey; Petersen F. Curt; Dennis W. Prather

Many scientific algorithms require floating-point reduction operations, or accumulations, including matrix-vector-multiply (MVM), vector dot-products, and the discrete cosine transform (DCT). Because FPGA implementations of each of these algorithms are desirable, it is clear that a high-performance, floating-point accumulation unit is necessary. However, this type of circuit is difficult to design in an FPGA environment due to the deep pipelining of the floating-point arithmetic units, which is needed in order to attain high-performance designs (Durbano et al., 2004; Leeser and Wang, 2004). A deep pipeline requires special handling in feedback circuits because of the long delay, which is further complicated by a continuous input data stream. Proposed accumulator architectures, which overcome such performance bottlenecks, are described in Zhuo et al. (2005) and Zhuo and Prasanna (2005). This paper presents a floating-point accumulation circuit that is a natural evolution of this work. The system can handle streams of arbitrary length, requires modest area, and can handle interrupted data inputs. In contrast to the designs proposed by Zhuo et al., the proposed architecture maintains buffers for partial result storage that use significantly fewer embedded memory resources, while maintaining fixed size and speed characteristics regardless of stream length. The results for both single- and double-precision accumulation architectures were verified in a Virtex-II 8000-4 part clocked at more than 150 MHz, and the power of this design was demonstrated in a computationally intense matrix-matrix-multiply application.
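
To see why a deeply pipelined adder complicates accumulation, the sketch below models the standard partial-sum buffering idea in software: with an adder latency of L cycles, keeping L interleaved partial sums lets a new input issue every cycle, at the cost of a final merge. The latency value and structure here are illustrative assumptions, not the buffer scheme actually described in the paper.

```python
# Minimal software model of deep-pipeline accumulation (illustrative only).
ADDER_LATENCY = 8  # assumed pipeline depth of the floating-point adder

def pipelined_accumulate(stream):
    # One partial sum per pipeline stage: input i lands in slot i mod L,
    # so no slot is touched again until its in-flight addition would have
    # completed in hardware.
    partials = [0.0] * ADDER_LATENCY
    for i, x in enumerate(stream):
        partials[i % ADDER_LATENCY] += x
    # Final merge of the L partial sums into a single result.
    total = 0.0
    for p in partials:
        total += p
    return total

print(pipelined_accumulate([0.5] * 1000))  # 500.0
```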


IEEE Antennas and Propagation Society International Symposium | 2004

Hardware acceleration of the 3D finite-difference time-domain method

James P. Durbano; John R. Humphrey; Fernando E. Ortiz; Petersen F. Curt; Dennis W. Prather; Mark S. Mirotznik

Although the importance of fast, accurate computational electromagnetic (CEM) solvers is readily apparent, how to construct them is not. By nature, CEM algorithms are both computationally and memory intensive. Furthermore, the serial nature of most software-based implementations does not take advantage of the inherent parallelism found in many CEM algorithms. In an attempt to exploit parallelism, supercomputers and computer clusters are employed. However, these solutions can be prohibitively expensive and frequently impractical. Thus, a CEM accelerator or CEM co-processor would provide the community with much-needed processing power. This would enable iterative designs and designs that would otherwise be impractical to analyze. To this end, we are developing a full-3D, hardware-based accelerator for the finite-difference time-domain (FDTD) method (K.S. Yee, IEEE Trans. Antennas and Propag., vol. 14, pp. 302-307, 1966). This accelerator provides speedups of up to three orders of magnitude over single-PC solutions and will surpass the throughputs of PC clusters. In this paper, we briefly summarize previous work in this area, where it has fallen short, and how our work fills the void. We then describe the current status of this project, summarizing our achievements to date and the work that remains. We conclude with the projected results of our accelerator.
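
As a point of reference for the algorithm being accelerated, here is a minimal 1-D FDTD leapfrog update in the spirit of Yee's scheme. The grid size, source, and normalized constants are arbitrary choices for illustration; the hardware design itself is a full-3D implementation.

```python
import numpy as np

nx, nt = 200, 500
ez = np.zeros(nx)        # electric field samples
hy = np.zeros(nx - 1)    # magnetic field, staggered half a cell
c = 0.5                  # normalized Courant factor, < 1 for stability

for n in range(nt):
    hy += c * (ez[1:] - ez[:-1])                      # H update from curl of E
    ez[1:-1] += c * (hy[1:] - hy[:-1])                # E update from curl of H
    ez[nx // 2] += np.exp(-((n - 30.0) / 10.0) ** 2)  # soft Gaussian source
```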


Proceedings of SPIE | 2009

Real-time Embedded Atmospheric Compensation for Long-Range Imaging Using the Average Bispectrum Speckle Method

Petersen F. Curt; Michael R. Bodnar; Fernando E. Ortiz; Carmen J. Carrano; Eric J. Kelmelis

While imaging over long distances is critical to a number of security and defense applications, such as homeland security and launch tracking, current optical systems are limited in resolving power. This is largely a result of the turbulent atmosphere in the path between the region under observation and the imaging system, which can severely degrade captured imagery. There are a variety of post-processing techniques capable of recovering this obscured image information; however, the computational complexity of such approaches has prohibited real-time deployment and hampers the usability of these technologies in many scenarios. To overcome this limitation, we have designed and manufactured an embedded image processing system based on commodity hardware which can compensate for these atmospheric disturbances in real-time. Our system consists of a reformulation of the average bispectrum speckle method coupled with a high-end FPGA processing board, and employs modular I/O capable of interfacing with most common digital and analog video transport methods (composite, component, VGA, DVI, SDI, HD-SDI, etc.). By leveraging the custom, reconfigurable nature of the FPGA, we have achieved performance twenty times faster than a modern desktop PC, in a form-factor that is compact, low-power, and field-deployable.
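
The key property the average-bispectrum method exploits is that the bispectrum is invariant to image translation, so averaging it over many short exposures preserves object phase that tilt and turbulence would otherwise scramble. The toy 1-D sketch below demonstrates only that averaging step, under assumed shifts and noise; the recursive phase recovery and the 2-D FPGA pipeline are not shown.

```python
import numpy as np

rng = np.random.default_rng(0)
obj = np.zeros(64)
obj[28:36] = 1.0                      # toy 1-D "object"
u = np.arange(64)

bisp = np.zeros((64, 64), dtype=complex)
for _ in range(100):                  # many short-exposure "frames"
    shift = rng.integers(64)          # random translation ~ tilt/turbulence
    frame = np.roll(obj, shift) + 0.05 * rng.standard_normal(64)
    F = np.fft.fft(frame)
    # B(u, v) = F(u) F(v) conj(F(u + v)) is unchanged by the shift
    bisp += F[u, None] * F[None, u] * np.conj(F[(u[:, None] + u[None, :]) % 64])
bisp /= 100                           # phase of <B> estimates the object phase
```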


Proceedings of SPIE | 2006

Modeling and simulation of nanoscale devices with a desktop supercomputer

Eric J. Kelmelis; James P. Durbano; John R. Humphrey; Fernando E. Ortiz; Petersen F. Curt

Designing nanoscale devices presents a number of unique challenges. As device features shrink, the computational demands of the simulations necessary to accurately model them increase significantly. This is a result of not only the increasing level of detail in the device design itself, but also the need to use more accurate models. The approximations that are generally made when dealing with larger devices break down as feature sizes decrease. This can be seen in the optics field when contrasting the complexity of physical optics models with those requiring a rigorous solution to Maxwells equations. This added complexity leads to more demanding calculations, stressing computational resources and driving research to overcome these limitations. There are traditionally two means of improving simulation times as model complexity grows beyond available computational resources: modifying the underlying algorithms to maintain sufficient precision while reducing overall computations and increasing the power of the computational system. In this paper, we explore the latter. Recent advances in commodity hardware technologies, particularly field-programmable gate arrays (FPGAs) and graphics processing units (GPUs), have allowed the creation of desktop-style devices capable of outperforming PC clusters. We will describe the key hardware technologies required to build such a device and then discuss their application to the modeling and simulation of nanophotonic devices. We have found that FPGAs and GPUs can be used to significantly reduce simulation times and allow for the solution of much large problems.


Unmanned/Unattended Sensors and Sensor Networks VIII | 2011

Nonmechanical beam steering using optical phased arrays

Thomas E. Dillon; Christopher A. Schuetz; Richard D. Martin; Daniel G. Mackrides; Petersen F. Curt; James Bonnett; Dennis W. Prather

Beam steering is an enabling technology for establishment of ad hoc communication links, directed energy for infrared countermeasures, and other in-theater defense applications. The development of nonmechanical beam steering techniques is driven by requirements for low size, weight, and power, and high slew rate, among others. The predominant beam steering technology currently in use relies on gimbal mounts, which are relatively large, heavy, and slow, and furthermore create drag on the airframes to which they are mounted. Nonmechanical techniques for beam steering are currently being introduced or refined, such as those based on liquid crystal spatial light modulators; however, drawbacks inherent to some of these approaches include narrow field of regard, low speed operation, and low optical efficiency. An attractive method that we explore is based on optical phased arrays, which has the potential to overcome the aforementioned issues associated with other mechanical and nonmechanical beam steering techniques. The optical phased array phase-locks a number of coherent optical emitters and applies arbitrary phase profiles across the array, thereby synthesizing beam shapes that can be steered and utilized for a diverse range of applications.
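
As a concrete illustration of the phase profiles mentioned above, steering a uniform linear array by an angle theta amounts to applying a linear phase ramp phi_n = -(2*pi/lambda) * n * d * sin(theta) across the emitters. The wavelength, emitter pitch, and array size below are assumed values for illustration, not the parameters of the system described.

```python
import numpy as np

lam = 1.55e-6                 # assumed wavelength: 1550 nm
d = 5e-6                      # assumed emitter pitch
theta = np.deg2rad(10.0)      # desired steering angle
n = np.arange(16)             # assumed 16-emitter linear array

phi = -2 * np.pi / lam * n * d * np.sin(theta)   # phase applied per emitter
phi = np.mod(phi, 2 * np.pi)                     # wrap into [0, 2*pi)
print(phi)
```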


Proceedings of SPIE | 2009

A practical enhanced-resolution integrated optical-digital imaging camera (PERIODIC)

Mark S. Mirotznik; Scott A. Mathews; Robert J. Plemmons; Paul Pauca; Todd C. Torgersen; Ryan T. Barnard; Brian Gray; Qiang Zhang; J. van der Gracht; Petersen F. Curt; M. Bodnar; Sudhakar Prasad

An integrated array computational imaging system, dubbed PERIODIC, is presented which is capable of exploiting a diverse variety of optical information including sub-pixel displacements, phase, polarization, intensity, and wavelength. Several applications of this technology are presented, including digital super-resolution, enhanced dynamic range, and multi-spectral imaging. Other applications include polarization-based dehazing, extended depth of field, and 3D imaging. The optical hardware system and software algorithms are described, and sample results are shown.
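
To make the sub-pixel-displacement idea concrete, here is a toy shift-and-add reconstruction: low-resolution frames with known integer offsets on a finer grid are interleaved into a higher-resolution image. Real PERIODIC reconstruction involves registration and regularized inversion; this sketch assumes ideal, known shifts.

```python
import numpy as np

def upsample_accumulate(frames, shifts, factor=2):
    """Interleave sub-pixel-shifted low-res frames onto a finer grid."""
    h, w = frames[0].shape
    hi = np.zeros((h * factor, w * factor))
    weight = np.zeros_like(hi)
    for frame, (dy, dx) in zip(frames, shifts):
        # Place each low-res sample at its sub-pixel offset on the fine grid.
        hi[dy::factor, dx::factor] += frame
        weight[dy::factor, dx::factor] += 1
    return hi / np.maximum(weight, 1)

# Hypothetical usage: four frames offset by half a pixel in each direction,
# e.g. shifts = [(0, 0), (0, 1), (1, 0), (1, 1)] for a 2x upsampling factor.
```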


Proceedings of SPIE | 2013

Realization of a video-rate distributed aperture millimeter-wave imaging system using optical upconversion

Christopher A. Schuetz; Richard K. Martin; Thomas E. Dillon; Peng Yao; Daniel G. Mackrides; Charles Harrity; Alicia Zablocki; Kevin Shreve; James Bonnett; Petersen F. Curt; Dennis W. Prather

Passive imaging using millimeter waves (mmWs) has many advantages and applications in the defense and security markets. All terrestrial bodies emit mmW radiation, and these wavelengths are able to penetrate smoke, fog/clouds/marine layers, and even clothing. One primary obstacle to imaging in this spectrum is that longer wavelengths require larger apertures to achieve the resolutions desired for many applications. Accordingly, lens-based focal plane systems and scanning systems tend to require large aperture optics, which increase the size and weight of such systems beyond what many applications can support. To overcome this limitation, a distributed aperture detection scheme is used in which the effective aperture size can be increased without the associated volumetric increase in imager size. This distributed aperture system is realized through conversion of the received mmW energy into sidebands on an optical carrier. This conversion serves, in essence, to scale the mmW sparse aperture array signals onto a complementary optical array. The sidebands are subsequently stripped from the optical carrier and recombined to provide a real-time snapshot of the mmW signal. Using this technique, we have constructed a real-time, video-rate imager operating at 75 GHz. A distributed aperture consisting of 220 upconversion channels is used to realize 2.5k pixels with passive sensitivity. Details of the construction and operation of this imager as well as field testing results will be presented herein.
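
Numerically, the upconversion described above can be pictured as amplitude-modulating a carrier with the received mmW tone, which places sidebands at f_carrier +/- f_mmW that retain the mmW amplitude and phase. The frequencies below are scaled down arbitrarily so a short FFT can resolve them; this illustrates only the sideband generation, not the sparse-aperture recombination.

```python
import numpy as np

fs, n = 1e6, 1000
t = np.arange(n) / fs
f_c, f_mmw = 2e5, 3e4      # stand-ins for the optical carrier and 75 GHz tone
m = 0.2                    # small assumed modulation depth

sig = (1 + m * np.cos(2 * np.pi * f_mmw * t)) * np.cos(2 * np.pi * f_c * t)
spec = np.abs(np.fft.rfft(sig))
freqs = np.fft.rfftfreq(n, 1 / fs)
print(freqs[spec > 0.05 * spec.max()])  # carrier plus f_c - f_mmw, f_c + f_mmw
```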


Proceedings of SPIE | 2010

Comparing FPGAs and GPUs for high-performance image processing applications

Eric J. Kelmelis; Fernando E. Ortiz; Petersen F. Curt; Michael R. Bodnar; Kyle E. Spagnoli; Aaron Paolini; Daniel K. Price

Modern image enhancement techniques have been shown to be effective in improving the quality of imagery. However, the computational requirements of applying such algorithms to streams of video in real-time often cannot be satisfied by standard microprocessor-based systems. While a scaled solution involving clusters of microprocessors may provide the necessary arithmetic capacity, deployment is limited to data-center scenarios. What is needed is a way to perform these techniques in real time on embedded platforms. A new paradigm of computing utilizing special-purpose commodity hardware, including Field-Programmable Gate Arrays (FPGAs) and Graphics Processing Units (GPUs), has recently emerged as an alternative to parallel computing using clusters of traditional CPUs. Recent research has shown that for many applications, such as image processing techniques requiring intense computations and large memory spaces, these hardware platforms significantly outperform microprocessors. Furthermore, while microprocessor technology has begun to stagnate, GPUs and FPGAs have continued to improve exponentially. FPGAs, flexible and powerful, are best targeted at embedded, low-power systems and specific applications. GPUs, cheap and readily available, are accessible to most users through standard desktop machines. Additionally, as fabrication scale continues to shrink, heat and power consumption issues typically limiting GPU deployment to high-end desktop workstations are becoming less of a factor. The ability to include these devices in embedded environments opens up entire new application domains. In this paper, we investigate two state-of-the-art image processing techniques, super-resolution and the average-bispectrum speckle method, and compare FPGA and GPU implementations in terms of performance, development effort, cost, deployment options, and platform flexibility.


Proceedings of SPIE | 2009

An embedded processor for real-time atmospheric compensation

Michael R. Bodnar; Petersen F. Curt; Fernando E. Ortiz; Carmen J. Carrano; Eric J. Kelmelis

Imaging over long distances is crucial to a number of defense and security applications, such as homeland security and launch tracking. However, the image quality obtained from current long-range optical systems can be severely degraded by the turbulent atmosphere in the path between the region under observation and the imager. While this obscured image information can be recovered using post-processing techniques, the computational complexity of such approaches has prohibited deployment in real-time scenarios. To overcome this limitation, we have coupled a state-of-the-art atmospheric compensation algorithm, the average-bispectrum speckle method, with a powerful FPGA-based embedded processing board. The end result is a lightweight, low-power image processing system that improves the quality of long-range imagery in real-time, and uses modular video I/O to provide a flexible interface to most common digital and analog video transport methods. By leveraging the custom, reconfigurable nature of the FPGA, a 20x speed increase over a modern desktop PC was achieved in a form-factor that is compact, low-power, and field-deployable.


Proceedings of SPIE | 2013

ATCOM: accelerated image processing for terrestrial long-range imaging through atmospheric effects

Petersen F. Curt; Aaron Paolini

Long-range video surveillance performance is often severely diminished due to atmospheric turbulence. The larger apertures typically used for video-rate operation at long range are particularly susceptible to scintillation and blurring effects that limit the overall diffraction efficiency and resolution. In this paper, we present research progress made toward a digital signal processing technique which aims to mitigate the effects of turbulence in real-time. Our previous work in this area focused on an embedded implementation for portable applications. Our more recent research has focused on functional enhancements to the same algorithm using general-purpose hardware. We present some techniques that were successfully employed to accelerate processing of high-definition color video streams and study performance under nonideal conditions involving moving objects and panning cameras. Finally, we compare the real-time performance of two implementations using a CPU and a GPU.

Collaboration


Dive into Petersen F. Curt's collaboration.

Top Co-Authors

Carmen J. Carrano

Lawrence Livermore National Laboratory
