Ketan Padalia
Altera
Publication
Featured research published by Ketan Padalia.
field programmable gate arrays | 2005
David Lewis; Elias Ahmed; Gregg William Baeckler; Vaughn Betz; Mark Bourgeault; David Cashman; David Galloway; Michael D. Hutton; Christopher F. Lane; Andy L. Lee; Paul Leventis; Sandy Marquardt; Cameron McClintock; Ketan Padalia; Bruce B. Pedersen; Giles Powell; Boris Ratchev; Srinivas T. Reddy; Jay Schleicher; Kevin Stevens; Richard Yuan; Richard G. Cliff; Jonathan Rose
This paper describes the Altera Stratix II™ logic and routing architecture. This architecture features a novel adaptive logic module (ALM) that is based on a 6-LUT, but can be partitioned into two smaller LUTs to efficiently implement circuits containing a range of LUT sizes that arises in conventional synthesis flows. This provides a performance increase of 15% in the Stratix II architecture while reducing area by 2%. The ALM also includes a more powerful arithmetic structure that can perform two bits of arithmetic per ALM, and perform a sum of up to three inputs. The routing fabric adds a new set of fast inputs to the routing multiplexers for another 3% improvement in performance, while other improvements in routing efficiency cause another 6% reduction in area. These changes in combination with other circuit and architecture changes in Stratix II contribute 27% of an overall 51% performance improvement (including architecture and process improvement). The architecture changes reduce area by 10% in the same process, and by 50% after including process migration.
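As a rough illustration of what "a 6-LUT that can be partitioned into two smaller LUTs" means, the sketch below treats a 6-LUT as a 64-bit truth table and, in fractured mode, reads its two 32-bit halves as two 5-input LUTs. This is a simplified model and not the actual ALM circuit: the abstract does not describe how inputs are shared between the two halves, so the sketch assumes each half gets its own five inputs. The names `FracturableLut6`, `eval6`, `eval_two5`, `truth_table`, and `address` are hypothetical.

```python
def truth_table(fn, n_inputs):
    """Pack fn's output for every n_inputs-bit address into an integer mask."""
    mask = 0
    for addr in range(1 << n_inputs):
        bits = [(addr >> k) & 1 for k in range(n_inputs)]
        mask |= (fn(bits) & 1) << addr
    return mask


def address(bits):
    """Turn a list of 0/1 input values (bits[0] is the LSB) into a table index."""
    return sum(b << k for k, b in enumerate(bits))


class FracturableLut6:
    """Simplified model: 64 configuration bits, usable whole or as two halves."""

    def __init__(self, mask):
        self.mask = mask & ((1 << 64) - 1)

    def eval6(self, bits):
        # Unfractured mode: one function of six inputs.
        assert len(bits) == 6
        return (self.mask >> address(bits)) & 1

    def eval_two5(self, bits_a, bits_b):
        # Fractured mode (assumed): the low 32 bits act as one 5-LUT and the
        # high 32 bits as another, each with its own inputs in this sketch.
        assert len(bits_a) == 5 and len(bits_b) == 5
        out_a = (self.mask >> address(bits_a)) & 1
        out_b = (self.mask >> (32 + address(bits_b))) & 1
        return out_a, out_b


# One 6-input XOR in unfractured mode.
xor6 = FracturableLut6(truth_table(lambda b: sum(b) % 2, 6))

# A 5-input AND and a 5-input OR packed into the same 64 bits.
and5 = truth_table(lambda b: int(all(b)), 5)
or5 = truth_table(lambda b: int(any(b)), 5)
pair = FracturableLut6(and5 | (or5 << 32))
```

The two modes use the same 64 configuration bits because any 6-input function is a 2:1 multiplexer, driven by the sixth input, over two 5-input functions (Shannon expansion).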
field-programmable logic and applications | 2004
Michael D. Hutton; Jay Schleicher; David Lewis; Bruce B. Pedersen; Richard Yuan; Sinan Kaptanoglu; Gregg William Baeckler; Boris Ratchev; Ketan Padalia; Mark Bourgeault; Andy L. Lee; Henry Kim; Rahul Saini
This paper proposes a new adaptable FPGA logic element based on fracturable 6-LUTs, which fundamentally alters the longstanding belief that a 4-LUT is the most efficient area/delay tradeoff. We will describe theory and benchmarking results showing a 15% performance increase with 12% area decrease vs. a standard BLE4. The ALM structure is one of a number of architectural improvements giving Altera's 90nm Stratix II architecture a 50% performance advantage over its 130nm Stratix predecessor.
field programmable gate arrays | 2008
Adrian Ludwin; Vaughn Betz; Ketan Padalia
In this paper, we describe the application of two parallelization strategies to the Quartus II FPGA placer. The first uses a pipelining approach and achieves speedups of 1.3x on two processing cores. The second uses a parallel moves approach and achieves speedups of 2.2x on four cores. Unlike all previous parallel moves algorithms, ours is deterministic and always gives the same answer as the serial version of the algorithm, without any significant reduction in performance. We also describe a process to quantify multi-core performance effects, such as memory subsystem limitations and explicit synchronization overhead, and fully describe these effects on a CAD tool for the first time. Memory limitations alone are found to cost up to 35% of total runtime. Unlike previous algorithms, our algorithms have negligible explicit synchronization overhead. These results are relevant to both CAD designers and to any developers seeking to parallelize existing software.
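The property claimed above, a parallel-moves scheme that always matches the serial result, can be illustrated with a generic speculative scheme. The sketch below is not the Quartus II algorithm, whose details are not given in this abstract. It assumes a toy 1-D placement, two-pin nets, greedy acceptance of cost-reducing swaps, and hypothetical names throughout. Moves in a batch are evaluated concurrently against a snapshot, then committed in the original order, and any move whose inputs were changed by an earlier accepted move in the same batch is re-evaluated.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def swap_delta(move, pos, nets_of):
    """Wirelength change if cells i and j swap positions (pos is not modified)."""
    i, j = move

    def at(c):  # position of cell c after the hypothetical swap
        return pos[j] if c == i else pos[i] if c == j else pos[c]

    nets = set(nets_of[i]) | set(nets_of[j])
    before = sum(abs(pos[a] - pos[b]) for a, b in nets)
    after = sum(abs(at(a) - at(b)) for a, b in nets)
    return after - before


def footprint(move, nets_of):
    """Every cell whose position influences this move's delta."""
    cells = set(move)
    for c in move:
        for a, b in nets_of[c]:
            cells.update((a, b))
    return cells


def place_serial(pos, moves, nets_of):
    pos = list(pos)
    for i, j in moves:
        if swap_delta((i, j), pos, nets_of) < 0:
            pos[i], pos[j] = pos[j], pos[i]
    return pos


def place_parallel(pos, moves, nets_of, batch=4):
    pos = list(pos)
    with ThreadPoolExecutor(max_workers=batch) as pool:
        for start in range(0, len(moves), batch):
            chunk = moves[start:start + batch]
            # Speculative phase: all deltas computed against the same snapshot.
            deltas = list(pool.map(lambda m: swap_delta(m, pos, nets_of), chunk))
            moved = set()  # cells moved by accepted moves in this batch
            # Commit phase: strictly in the original move order.
            for m, d in zip(chunk, deltas):
                if footprint(m, nets_of) & moved:
                    d = swap_delta(m, pos, nets_of)  # stale speculation, redo
                if d < 0:
                    i, j = m
                    pos[i], pos[j] = pos[j], pos[i]
                    moved.update(m)
    return pos


def demo(seed=0, n_cells=12, n_nets=20, n_moves=200):
    rng = random.Random(seed)
    nets = sorted({tuple(sorted(rng.sample(range(n_cells), 2))) for _ in range(n_nets)})
    nets_of = {c: [n for n in nets if c in n] for c in range(n_cells)}
    pos = list(range(n_cells))
    rng.shuffle(pos)
    moves = [tuple(rng.sample(range(n_cells), 2)) for _ in range(n_moves)]
    return place_serial(pos, moves, nets_of), place_parallel(pos, moves, nets_of)
```

Calling `demo()` returns the serial and parallel placements, which should be identical for any seed. Because of Python's global interpreter lock, this sketch demonstrates only the determinism argument and says nothing about the speedups reported in the abstract.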
field programmable gate arrays | 2003
Ketan Padalia; Ryan Fung; Mark Bourgeault; Aaron Egier; Jonathan Rose
One of the most difficult and time-consuming steps in the creation of an FPGA is its transistor-level design and physical layout. Modern commercial FPGAs typically consume anywhere from 50 to 200 man-years simply in the layout step. To date, automated tools have only been employed in small parts of the periphery and programming circuitry. The core tiles, which are repeated many times, are subject to painstaking manual design and layout. In this paper we present a new system (called GILES, for Good Instant Layout of Erasable Semiconductors) that automatically generates a transistor-level schematic from a high-level architectural specification of an FPGA. It also generates a cell-level netlist that is placed and routed automatically. The architectural specification is the one used as input to the VPR [3] architectural exploration tool. The output is the mask-level layout of a single tile that can be replicated to form an FPGA array. We describe a new placement tool that simultaneously places and compacts the layout to minimize white space and wiring demand, and a special-purpose router built for this task. GILES can place and route a tile consisting of four 4-input LUT logic cells and all of its programmable wires in a 0.18μm CMOS process using 8 layers of metal and 25983 μm² of area. When we generate the layout of an architecture similar to the Xilinx Virtex-E FPGA (built in a 0.18μm process) GILES requires only 47% more area than the original. The layout area of an architecture similar to the Altera Apex 20K400E (also built in a 0.18μm process) constructed by GILES requires 97% more area than the original.
Archive | 2005
Ryan Fung; Ketan Padalia
Archive | 2006
Jason Peters; Ketan Padalia; Adrian Ludwin
Archive | 2005
Ketan Padalia; Jason Peters; Vaughn Betz
Archive | 2013
Ketan Padalia; Ryan Fung
Archive | 2005
David Cashman; David Lewis; Gregg William Baeckler; Ketan Padalia
Archive | 2007
Ketan Padalia; Adrian Ludwin; Vaughn Betz; Ryan Fung