Se-Joong Lee | Researchain

Publication

Featured research published by Se-Joong Lee.

International Solid-State Circuits Conference | 2003

An 800MHz star-connected on-chip network for application to systems on a chip

Se-Joong Lee; Seong-Jun Song; Kangmin Lee; Jeong-Ho Woo; Sung-Eun Kim; Byeong-Gyu Nam; Hoi-Jun Yoo

A 10.8 × 6.0 mm² prototype chip is implemented with a star-connected on-chip network. The chip consists of a PLL, 1 KB SRAM, two 2 × 2 crossbar switches, up/down-samplers, two off-chip gateways, and synchronizers. The on-chip network contains 81k transistors, dissipates 264 mW at 2.3 V and 800 MHz, and provides 1.6 GB/s per port and 12.8 GB/s aggregate bandwidth, supporting plesiochronous communication without global synchronization.
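The bandwidth figures quoted above can be cross-checked with a little arithmetic. The sketch below is only an illustration: the function names are hypothetical, 1 GB is assumed to mean 10^9 bytes, and the derived port count and bytes-per-cycle values are inferences from the quoted numbers, not figures stated in the abstract.

```python
def implied_port_count(aggregate_gb_per_s: float, per_port_gb_per_s: float) -> int:
    """Number of equal-bandwidth ports whose sum gives the aggregate figure."""
    ratio = aggregate_gb_per_s / per_port_gb_per_s
    ports = round(ratio)
    if abs(ratio - ports) > 1e-9:
        raise ValueError("aggregate is not an integer multiple of per-port bandwidth")
    return ports


def bytes_per_cycle(per_port_gb_per_s: float, clock_mhz: float) -> float:
    """Bytes moved per clock cycle on one port (assumes 1 GB = 1e9 bytes)."""
    return per_port_gb_per_s * 1e9 / (clock_mhz * 1e6)


if __name__ == "__main__":
    # Figures taken from the abstract: 12.8 GB/s aggregate, 1.6 GB/s per port, 800 MHz.
    print(implied_port_count(12.8, 1.6))
    print(bytes_per_cycle(1.6, 800.0))
```

Under these assumptions the quoted numbers are consistent with eight ports, each moving two bytes per 800 MHz cycle.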

IEEE Transactions on Circuits and Systems II: Express Briefs | 2005

Packet-switched on-chip interconnection network for system-on-chip applications

Se-Joong Lee; Kangmin Lee; Seong-Jun Song; Hoi-Jun Yoo

The increasing complexity of system-on-chip designs demands an efficient on-chip interconnection architecture, such as an on-chip network, to overcome the limitations of bus architectures. In this brief, we propose a packet-switched on-chip interconnection network architecture through which multiple processing units with different clock frequencies can communicate with each other without global synchronization. The architecture is analyzed in terms of area and energy consumption, and implementation issues in the building blocks are addressed for cost-effective design. A test chip is implemented using 0.38-μm CMOS technology, and its operation is measured at 800 MHz to demonstrate its feasibility.

International Solid-State Circuits Conference | 2001

An 80/20-MHz 160-mW multimedia processor integrated with embedded DRAM, MPEG-4 accelerator and 3-D rendering engine for mobile applications

Chi-Weon Yoon; Ramchan Woo; Jeonghoon Kook; Se-Joong Lee; Kangmin Lee; Young-Don Bae; In-Cheol Park; Hoi-Jun Yoo

An 84 mm² 160 mW programmable processor in 0.18 μm EMC technology consists of a 32 b RISC with MAC, a 20 MHz motion compensation accelerator for MPEG-4 at SP, a 3D rendering engine with 2.2 M polygon/s at 20 MHz, and a 7.125 Mb embedded DRAM with a single bitline writing scheme.

IEEE Design & Test of Computers | 2005

Analysis and implementation of practical, cost-effective networks on chips

Se-Joong Lee; Kangmin Lee; Hoi-Jun Yoo

This article describes design issues in three networks on chip (NoCs) that exploit star and mesh networks, with the objective of comparing area and energy costs. We present new solutions based on mesochronous communication and burst packet transactions.

Symposium on VLSI Circuits | 2005

Adaptive network-on-chip with wave-front train serialization scheme

Se-Joong Lee; Kwanho Kim; Hyejung Kim; Namjun Cho; Hoi-Jun Yoo

An adaptive network-on-chip (NoC) is implemented with self-calibration and dynamic bandwidth control. The chip automatically calibrates the skew between clock domains for reliable mesochronous communication. Link bandwidth is controlled adaptively according to network traffic for energy-efficient packet transmission. A new on-chip serialization scheme, wave-front train (WAFT), is used in the NoC chip to realize a high-performance serial link with minimum overhead. The chip is fabricated using 0.18 μm CMOS technology. The overall network and WAFT operations are successfully measured at 1.2 Gb/s and 3 Gb/s, respectively.

International Symposium on Circuits and Systems | 2005

A reconfigurable crossbar switch with adaptive bandwidth control for networks-on-chip

Donghyun Kim; Kangmin Lee; Se-Joong Lee; Hoi-Jun Yoo

We propose a new crossbar switch structure with adaptive bandwidth control. In a complex SoC design, the proposed crossbar switch efficiently incorporates various IPs with different bandwidth requirements. Simulation under various traffic scenarios shows that the throughput of the proposed crossbar switch is as high as that of a conventional switch operating at twice the speed. The proposed crossbar switch shows a maximum improvement of 27% in throughput and 41% in latency compared to the conventional one. The proposed crossbar switch is implemented in Verilog HDL, synthesized with a 0.18 μm process library, and verified on FPGAs. Its area and power overheads are 21% and 15%, respectively, compared to the conventional crossbar switch.

Custom Integrated Circuits Conference | 2003

A distributed crossbar switch scheduler for on-chip networks

Kangmin Lee; Se-Joong Lee; Hoi-Jun Yoo

A scheduling algorithm is proposed for a lightweight on-chip crossbar switch in on-chip networks. The proposed NA-MOO algorithm distributes the arbitration computation over all of the crossbar fabric nodes. Its implementation shows that it can reduce area by more than 60% and computation delay by more than 20% compared to the conventional round-robin-based SLIP algorithm. Its feasibility is analyzed using an SoC for HDTV as an example. The proposed techniques are area-efficient and show higher performance for on-chip interconnection networks.
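For readers unfamiliar with the baseline mentioned above, the following is a minimal sketch of a plain centralized round-robin arbiter for a single output port. It is not the NA-MOO algorithm, whose details the abstract does not give, and it is not a full SLIP scheduler either; the class and method names are hypothetical and chosen only for illustration.

```python
from typing import Optional, Sequence


class RoundRobinArbiter:
    """Grants one requester per call, rotating priority after each grant."""

    def __init__(self, num_requesters: int) -> None:
        if num_requesters <= 0:
            raise ValueError("num_requesters must be positive")
        self.n = num_requesters
        self.pointer = 0  # requester with the highest priority in the next round

    def grant(self, requests: Sequence[bool]) -> Optional[int]:
        """Return the index of the granted requester, or None if nobody requests."""
        if len(requests) != self.n:
            raise ValueError("requests must have one entry per requester")
        for offset in range(self.n):
            idx = (self.pointer + offset) % self.n
            if requests[idx]:
                # The granted requester becomes the lowest priority next time.
                self.pointer = (idx + 1) % self.n
                return idx
        return None


if __name__ == "__main__":
    arbiter = RoundRobinArbiter(4)
    pending = [True, False, True, True]
    print([arbiter.grant(pending) for _ in range(4)])
```

The single scan over all requesters is the centralized step that a distributed scheduler would spread across the crossbar nodes.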

International Symposium on Circuits and Systems | 2000

A 670 ps, 64 bit dynamic low-power adder design

Ramchan Woo; Se-Joong Lee; Hoi-Jun Yoo

A 64 bit dynamic low-power adder has been designed and fabricated in a 2.5 V 0.25-μm 1-poly 5-metal CMOS technology. Fast carry propagation is obtained by fast P generation, a parallel quaternary-tree form of group carry (GC) selection, and conditional sum selection. Results for the proposed adder architecture show that the propagation delay, power consumption, and area are 670 ps, 100 mW, and 0.16 mm², respectively.
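The idea behind conditional sum selection can be illustrated with a small behavioral model. This is a generic carry-select style sketch, not the circuit from the paper: each block computes both possible sums (carry-in 0 and carry-in 1) and the real carry then selects one. The function name and the 4-bit block size are assumptions made for the example, and the loop models sequentially what hardware would evaluate in parallel.

```python
from typing import Tuple


def carry_select_add(a: int, b: int, width: int = 64, block: int = 4) -> Tuple[int, int]:
    """Add two non-negative integers block by block, returning (sum, carry_out)."""
    if a < 0 or b < 0:
        raise ValueError("operands must be non-negative")
    if width % block != 0:
        raise ValueError("width must be a multiple of block")
    mask = (1 << block) - 1
    result = 0
    carry = 0
    for shift in range(0, width, block):
        x = (a >> shift) & mask
        y = (b >> shift) & mask
        sum_if_carry0 = x + y        # precomputed assuming carry-in = 0
        sum_if_carry1 = x + y + 1    # precomputed assuming carry-in = 1
        chosen = sum_if_carry1 if carry else sum_if_carry0  # the selection step
        result |= (chosen & mask) << shift
        carry = chosen >> block      # carry into the next block
    return result, carry


if __name__ == "__main__":
    print(carry_select_add(123456789, 987654321))
```

Because both candidate sums are ready before the carry arrives, only the selection sits on the critical path, which is why this family of techniques speeds up carry propagation.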

IEEE Journal of Solid-State Circuits | 2002

A reconfigurable multilevel parallel texture cache memory with 75-GB/s parallel cache replacement bandwidth

Se-Jeong Park; Jeong-Su Kim; Ramchan Woo; Se-Joong Lee; Kangmin Lee; Tae-Hum Yang; Jin-Yong Jung; Hoi-Jun Yoo

Recently, the level of realism in PC graphics applications has been approaching that of high-end graphics workstations, necessitating a more sophisticated texture data cache memory to overcome the finite bandwidth of the AGP or PCI bus. This paper proposes a multilevel parallel texture cache memory to reduce the required data bandwidth on the AGP or PCI bus and to accelerate the operations of parallel graphics pipelines in PC graphics cards. The proposed cache memory is fabricated in a 0.16-μm DRAM-based SOC technology. It is composed of four components: an 8-MB DRAM L2 cache, 8-way parallel SRAM L1 caches, pipelined texture data filters, and a serial-to-parallel loader. For high-speed parallel L1 cache data replacement, the internal bus bandwidth has been maximized up to 75 GB/s with a newly proposed hidden double data transfer scheme. In addition, the cache memory has a reconfigurable line size for optimal caching performance in various graphics applications, from three-dimensional (3-D) games to high-quality 3-D movies.

IEEE Journal of Solid-State Circuits | 2002

Race logic architecture (RALA): a novel logic concept using the race scheme of input variables

Se-Joong Lee; Hoi-Jun Yoo

A novel logic concept, Race Logic Architecture (RALA), is proposed. RALA is a new logic operation architecture in which the racing between input variables along the interconnection lines functions as an active logic element instead of logic gates, while the logic gates play a simple passive role. Logic operations in RALA are based on wired-OR, which utilizes shared space, and serial-AND, which utilizes the triggering sequence of input variables. With these two concepts, RALA can implement arbitrary Boolean operations. Various kinds of combinational circuits are simulated and compared with their RALA counterparts. RALA shows the best performance in delay time, area, and power product results. A 64-bit carry-look-ahead adder with RALA is fabricated in 0.25-μm CMOS technology to verify its feasibility and functionality. The area of the adder is 800 μm × 150 μm, and the delay time from the clock to Sum31 is measured at 0.9 ns.