Publication


Featured research published by Eugene John.


Great Lakes Symposium on VLSI | 1999

A novel low power energy recovery full adder cell

R. Shalem; Eugene John; Lizy Kurian John

A novel low power and low transistor count static energy recovery full adder (SERF) is presented in this paper. The power consumption and general characteristics of the SERF adder are then compared against three low power full adders: the transmission function adder (TFA), the dual value logic (DVL) adder, and the fourteen transistor (14T) full adder. The proposed SERF adder design proved superior to the other three designs in power dissipation and area, and second only to the DVL adder in propagation delay. The combination of low power and low transistor count makes the new SERF cell a viable option for low power design.
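As a point of reference for what any full adder cell computes, the following is a minimal behavioral sketch of the one-bit sum and carry functions. It models only the Boolean logic, not the SERF transistor-level circuit or its energy recovery behavior, which the abstract does not detail. The function names `full_adder` and `ripple_add` are illustrative assumptions.

```python
def full_adder(a: int, b: int, cin: int) -> tuple[int, int]:
    """Return (sum, carry_out) for one-bit inputs a, b and carry-in cin."""
    s = a ^ b ^ cin                     # sum is the XOR of all three inputs
    cout = (a & b) | (cin & (a ^ b))    # carry is generated or propagated
    return s, cout


def ripple_add(x: int, y: int, width: int = 8) -> int:
    """Add two unsigned integers by chaining `width` full adder cells."""
    carry = 0
    result = 0
    for i in range(width):
        s, carry = full_adder((x >> i) & 1, (y >> i) & 1, carry)
        result |= s << i
    # Keep the final carry-out as the most significant bit of the result.
    return result | (carry << width)
```

Chaining cells this way is the simplest use of a full adder; the power and delay of each cell then determine the cost of the whole adder.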


IEEE Transactions on Very Large Scale Integration Systems | 1998

A dynamically reconfigurable interconnect for array processors

Lizy Kurian John; Eugene John

Reconfigurability of processor arrays is important for two reasons: (1) to efficiently execute different algorithms and (2) to isolate faulty processors. An array processor that is reconfigurable by the user any number of times to yield a different topology or to isolate faults is envisaged in this paper. The system has a host or controller that broadcasts a command to the interconnect to configure itself in a particular fashion. The interconnect uses static-RAM programming technology and can be programmed to different configurations by sending a different set of bits to the configuration random access memory (RAM) in the interconnect. We present three designs reconfigurable into array, ring, mesh, or Illiac mesh topologies. The first design provides no redundancy or fault tolerance. The second design is capable of graceful degradation by bypassing faulty elements. The third design is capable of graceful degradation by rerouting. The details of the interconnect and the configuration RAM contents for typical configurations are illustrated. It is seen that a reconfigurable interconnect results in a highly reconfigurable or polymorphic computer.
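To make the four target topologies concrete, here is a small sketch that lists each processor's neighbors under the textbook definitions of a linear array, ring, mesh, and Illiac mesh. It is an assumption-laden illustration: the function name `neighbors`, the row-major numbering, and the square processor count are choices made here, and the sketch does not model the paper's configuration RAM contents or fault bypassing.

```python
import math


def neighbors(i: int, n: int, topology: str) -> list[int]:
    """Return the sorted neighbor indices of processor i among n processors."""
    if topology == "array":      # linear array, no wraparound
        return [j for j in (i - 1, i + 1) if 0 <= j < n]
    if topology == "ring":       # linear array with wraparound
        return sorted({(i - 1) % n, (i + 1) % n})

    r = math.isqrt(n)
    if r * r != n:
        raise ValueError("mesh topologies here assume a square processor count")

    if topology == "mesh":       # r x r grid, row-major numbering, no wraparound
        row, col = divmod(i, r)
        out = []
        if row > 0:
            out.append(i - r)
        if row < r - 1:
            out.append(i + r)
        if col > 0:
            out.append(i - 1)
        if col < r - 1:
            out.append(i + 1)
        return sorted(out)
    if topology == "illiac":     # Illiac mesh: i +/- 1 and i +/- r, modulo n
        return sorted({(i - 1) % n, (i + 1) % n, (i - r) % n, (i + r) % n})
    raise ValueError(f"unknown topology: {topology}")
```

Switching the `topology` argument plays the role of the host's broadcast command in this toy model: the same set of processors is given a different connectivity without any physical change.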


Journal of Low Power Electronics | 2005

Implementation of Low Power Digital Multipliers Using 10 Transistor Adder Blocks

Dhireesha Kudithipudi; Eugene John

The increasing demand for high-fidelity portable devices has placed emphasis on the development of low power and high performance systems. In next-generation processors, low power design has to be incorporated into fundamental computation units, such as multipliers. The characterization and optimization of such low power multipliers will aid in the comparison and choice of multiplier modules in system design. In this paper we performed a comparative analysis of the power, delay, and power delay product (PDP) optimization characteristics of four parallel digital multipliers implemented using low power 10 transistor (10T) adders and conventional CMOS adder cells. In order to achieve optimal power savings at smaller geometry sizes, we proposed a heuristic approach known as hybrid adder models. Multipliers realized using the Static Energy Recovery Full adder (SERF) circuit consumed considerably less power compared to 10T and static CMOS based multipliers for all the configurations studied. Furthermore, the difference between the power consumption of the 10 transistor based multipliers and 28T multipliers is significant at 180 nm, but not at 70 nm. For smaller geometry sizes down to 70 nm, the propagation delay of the multipliers implemented with 10 transistors translates to a better performance measure. Carry-Save Multipliers had a better PDP range than the other multipliers for all three adder sub-module designs. The PDP measure for optimal scaled gate width resulted in a best-case scenario for the SERF Wallace tree multiplier as compared to the other three SERF based multipliers. This can be attributed to the fast computational capability of the Wallace Tree multiplier and the SERF adders' energy recovery logic saving more power at deep sub-micron sizes. The proposed SERF-10T Hybrid adder model multipliers consumed the least power of all the Hybrid and regular models with no deterioration in performance. Taken together, these results suggest that SERF-10T Hybrid model based multipliers are suited for ultra low power design and fast computation at smaller geometry sizes.
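The power delay product used as the comparison metric above is simply average power multiplied by propagation delay, which gives an energy figure. The sketch below shows how candidate designs could be ranked by PDP. The names `power_delay_product`, `rank_by_pdp`, and `example`, and every number in `example`, are placeholders invented for illustration; they are not measurements from the paper.

```python
def power_delay_product(power_w: float, delay_s: float) -> float:
    """PDP in joules: average power (W) times propagation delay (s)."""
    return power_w * delay_s


def rank_by_pdp(designs: dict[str, tuple[float, float]]) -> list[tuple[str, float]]:
    """Sort designs, given as name -> (power_w, delay_s), by ascending PDP."""
    scored = ((name, power_delay_product(p, d)) for name, (p, d) in designs.items())
    return sorted(scored, key=lambda item: item[1])


# Placeholder values only, NOT data from the paper.
example = {
    "design_a": (2.0e-3, 1.5e-9),   # 2.0 mW, 1.5 ns
    "design_b": (1.0e-3, 2.0e-9),   # 1.0 mW, 2.0 ns
    "design_c": (3.0e-3, 1.2e-9),   # 3.0 mW, 1.2 ns
}

for name, pdp in rank_by_pdp(example):
    print(f"{name}: {pdp:.2e} J")
```

Note that the fastest placeholder design is not the one with the lowest PDP, which is why PDP is reported alongside power and delay rather than in place of them.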


IEEE Transactions on Electron Devices | 1994

Design and performance analysis of InP-based high-speed and high-sensitivity optoelectronic integrated receivers

Eugene John; Mukunda B. Das

A novel transimpedance optoelectronic receiver amplifier suitable for monolithic integration is proposed and analyzed by exploiting state-of-the-art high-speed MSM photodiodes and HBTs based on lattice-matched InGaAs-InAlAs heterostructures on InP substrates. The projected performance characteristics of this amplifier indicate a high transimpedance ($\approx 3.6\ \mathrm{k}\Omega$), a large bandwidth (17 GHz), and an excellent optical detection sensitivity ($-26.8$ dBm) at 17 Gb/s for the standard bit-error-rate of $10^{-9}$. The latter corresponds to an input noise spectral density, $\sqrt{i_{in}^{2}/B}$, of $2.29\ \mathrm{pA}/\sqrt{\mathrm{Hz}}$ for the full bandwidth. The bandwidth of the amplifier can be increased to 30 GHz for a reduced transimpedance ($0.82\ \mathrm{k}\Omega$) and a lower detection sensitivity, i.e., $-21$ dBm at 30 Gb/s. The amplifier also achieves a detected optical-to-electrical power gain of 21.5 dBm into a $50\ \Omega$ load termination. The design utilizes small emitter-area HBTs for the input cascoded-pair stage, followed by a two-step emitter-follower involving one small and one large emitter-area HBT. The design strategy of using small emitter-area HBTs is matched by a novel low-capacitance series/parallel connected MSM photodiode. This combined approach has yielded this amplifier's combined high performance characteristics, which exceed either achieved or projected performances of any receiver amplifier reported to date. The paper also discusses the issues concerning IC implementation of the receiver, including the means of realizing a high-value feedback resistor.


International Midwest Symposium on Circuits and Systems | 2010

A quasi-power-gated low-leakage stable SRAM cell

Pradeep Nair; Savithra Eratne; Eugene John

Leakage power dissipation and stability continue to be major concerns in deep-submicron SRAM cell design. In this paper, a quasi-power-gating approach that reduces the leakage power dissipation in an SRAM cell while maintaining stability is proposed. As compared to a standard 6-transistor SRAM, it consists of four additional NMOS transistors. In the active mode, the cell is activated by enabling two NMOS transistors in the pull-down path of the inverter. In the idle mode, a quasi-power-gating scheme is employed to reduce leakage by utilizing the stack effect. It was found that this cell resulted in about 39.54 percent and 30.5 percent leakage power savings at supply voltages of 1 V and 300 mV, respectively. A stability increase was also observed when compared to the standard non-power-gated 6-transistor SRAM cell.


International Midwest Symposium on Circuits and Systems | 2009

A comparative analysis of coarse-grain and fine-grain power gating for FPGA lookup tables

Pradeep Nair; Santosh Koppa; Eugene John

Leakage power dissipation is becoming a concern in field-programmable gate arrays (FPGAs) due to scaling in FPGA technology. Widely available commercial FPGAs are based on lookup tables (LUTs) consisting of SRAM arrays and multiplexers. In this paper, we analyze the leakage power dissipation in the SRAM array of an FPGA LUT for a 65 nm CMOS process. We apply power gating to an FPGA LUT SRAM array in two different ways, namely, coarse-grain power gating and fine-grain power gating. We carry out a comparative analysis of the two methods. In our research, we found that power gating can be employed to drastically reduce the leakage power dissipation in the SRAM. More leakage savings were obtained with coarse-grain power gating than with fine-grain power gating. The coarse-grain and fine-grain power gating techniques yielded approximately 99 percent and 81 percent leakage savings, respectively, over the case where no power gating is applied.


IEEE Photonics Technology Letters | 1992

Speed and sensitivity limitations of optoelectronic receivers based on MSM photodiode and millimeter-wave HBTs on InP substrate

Eugene John; Mukunda B. Das

Heterostructure bipolar transistors and MSM photodetectors based on compound semiconductors have demonstrated high-frequency performance beyond 100 GHz. By combining these state-of-the-art devices in a realistic integrated optoelectronic receiver, this letter demonstrates that it is possible to achieve a receiver sensitivity of $-19.04$ dBm at 16 Gb/s at a bit-error-rate of $10^{-9}$. Further improvement of noise and bit-rate can be achieved by designing HBTs with lower junction capacitances.


IEEE Transactions on Multimedia | 2008

Caches for Multimedia Workloads: Power and Energy Tradeoffs

Dhireesha Kudithipudi; Stefan Petko; Eugene John

One of the significant workloads in current generation desktop processors and mobile devices is multimedia processing. Large on-chip caches are common in modern processors, but large caches result in increased power consumption and increased access delays. Regular data access patterns in streaming multimedia applications and video processing applications can provide high hit rates, but due to issues associated with access time, power, and energy, caches cannot be made very large. Characterizing and optimizing the memory system is conducive to designing power- and performance-efficient multimedia application processors. Performance tradeoffs for multimedia applications have been studied in the past; however, power and energy tradeoffs for caches for multimedia processing have not been adequately studied. In this paper, we characterize multimedia applications for I-cache and D-cache power and energy using a multilevel cache hierarchy. Both dynamic and static power increase with increasing cache sizes; however, the increase in dynamic power is small. The increase in static power is significant, and becomes increasingly relevant for smaller feature sizes. There is significant static power dissipation, about 45%, in L1 and L2 caches at 70 nm technology sizes, emphasizing the fact that future multimedia systems must be designed by taking leakage power reduction techniques into account. The energy consumption of on-chip L2 caches is seen to be very sensitive to cache size variations. Sizes larger than 16 k for I-caches and 32 k for D-caches will not be efficient choices to maintain power and performance balance. Since multimedia applications spend significant amounts of time in integer operations, to improve the performance, we propose implementing low power full adders and hybrid multipliers in the data path, which results in 9% to 21% savings in the overall power consumption.


IEEE Transactions on Very Large Scale Integration Systems | 2006

Architectural enhancements for network congestion control applications

Byeong Kil Lee; Lizy Kurian John; Eugene John

Complex network protocols and various network services require significant processing capability for modern network applications. One of the important features in modern networks is differentiated service. Along with differentiated service, rapidly changing network environments result in congestion problems. In this paper, we analyze the characteristics of representative congestion control applications (scheduling and queue management algorithms), and we propose application-specific acceleration techniques that use instruction-level parallelism (ILP) and packet-level parallelism (PLP) in these applications. From the PLP perspective, we propose a hardware acceleration model based on detailed analysis of congestion control applications. In order to get large throughputs, a large number of processing elements (PEs) and a parallel comparator are designed. Such hardware accelerators provide large parallelism proportional to the number of processing elements added. A 32-PE enhancement yields a 24× speedup for weighted fair queueing (WFQ) and a 27× speedup for random early detection (RED). For ILP, new instruction set extensions for fast conditional operations are applied for congestion control applications. Based on our experiments, the proposed architectural extensions show a 10%-12% improvement in performance for instruction set enhancements. As the performance of general-purpose processors rapidly increases, defining architectural extensions (e.g., multi-media extensions (MMX) as in multimedia applications) for general-purpose processors could be an alternative solution for a wide range of network applications.
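The reported speedups can be put in terms of parallel efficiency, meaning speedup divided by the number of processing elements. The short sketch below works that out for the two figures quoted in the abstract; the function name `parallel_efficiency` is an assumption, and efficiency is a derived quantity that the abstract itself does not report.

```python
def parallel_efficiency(speedup: float, num_pes: int) -> float:
    """Fraction of ideal linear speedup achieved with num_pes processing elements."""
    if num_pes <= 0:
        raise ValueError("num_pes must be positive")
    return speedup / num_pes


# Speedups quoted in the abstract for the 32-PE enhancement.
wfq_efficiency = parallel_efficiency(24, 32)   # weighted fair queueing
red_efficiency = parallel_efficiency(27, 32)   # random early detection

print(f"WFQ: {wfq_efficiency:.1%} of linear speedup")
print(f"RED: {red_efficiency:.1%} of linear speedup")
```

Under this measure the 32-PE configuration reaches 75% of ideal scaling for WFQ and about 84% for RED.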


SPEC International Performance Evaluation Workshop | 2009

A Tale of Two Processors: Revisiting the RISC-CISC Debate

Ciji Isen; Lizy Kurian John; Eugene John

The contentious debates between RISC and CISC have died down, and a CISC ISA, the x86, continues to be popular. Nowadays, processors with CISC ISAs translate the CISC instructions into RISC-style micro-operations (e.g., uops in Intel processors and ROPs in AMD processors). The use of uops (or ROPs) allows the use of RISC-style execution cores and of various micro-architectural techniques that can be easily implemented in RISC cores. This can easily allow CISC processors to approach RISC performance. However, CISC ISAs do have the additional burden of translating instructions to micro-operations. In a 1991 study comparing VAX and MIPS, Bhandarkar and Clark showed that after canceling out the code size advantage of CISC and the CPI advantage of RISC, the MIPS processor had an average 2.7x advantage over the studied CISC processor (VAX). A 1997 study on the Alpha 21064 and the Intel Pentium Pro still showed a 5% to 200% advantage for RISC for various SPEC CPU95 programs. A decade later, and after the introduction of interesting techniques such as fusion of micro-operations in the x86, we set off to compare a recent RISC and a recent CISC processor, the IBM POWER5+ and the Intel Woodcrest. We find that the SPEC CPU2006 programs are divided between those showing an advantage on POWER5+ and those showing an advantage on Woodcrest, narrowing down the 2.7x advantage to nearly 1.0. Our study points to the fact that if aggressive micro-architectural techniques for ILP and high performance can be carefully applied, a CISC ISA can be implemented to yield performance similar to that of RISC processors. Another interesting observation is that approximately 40% of all work done on the Woodcrest is wasteful execution in the mispredicted path.

Collaboration

Top Co-Authors

Pradeep Nair
University of Texas at San Antonio

Lizy Kurian John
University of Texas at Austin

Savithra Eratne
University of Texas at San Antonio

Dhireesha Kudithipudi
Rochester Institute of Technology

Byeong Kil Lee
University of Texas at San Antonio

Mukunda B. Das
Pennsylvania State University

Santosh Koppa
University of Texas at San Antonio

Fred Hudson
University of Texas at San Antonio

Wei Ming Lin
University of Texas at San Antonio