Publication


Featured research published by Guojie Luo.


Asia and South Pacific Design Automation Conference | 2007

Thermal-Aware 3D IC Placement Via Transformation

Jason Cong; Guojie Luo; Jie Wei; Yan Zhang

3D IC technologies can help to improve circuit performance and lower power consumption by reducing wirelength. 3D IC technology can also be used to realize heterogeneous system-on-chip designs by integrating different modules with less interference between them. In this paper, we propose a novel thermal-aware 3D cell placement approach, named T3Place, based on transforming a 2D placement with good wirelength into a 3D placement, with the objectives of half-perimeter wirelength, through-the-silicon (TS) via number, and temperature. T3Place is composed of two steps: transformation from a 2D placement to a 3D placement, and refinement of the resulting 3D placement. We proposed and compared several different transformation techniques, including local stacking transformation (LST), folding-2, folding-4, and window-based stacking/folding transformation, and concluded that (i) LST can generate 3D placements with the least wirelength, (ii) the folding-based transformations result in 3D placements with the fewest TS vias, and (iii) the window-based stacking/folding transformations provide good tradeoffs between TS via number and wirelength. For example, with four device layers, LST can reduce the wirelength by over 2× compared to the initial 2D placement, while window-based stacking/folding can provide over 10× variation in the TS via number, making it adaptable to different manufacturing capabilities for TS via density. Moreover, we proposed a novel relaxed conflict-net (RCN) graph-based layer assignment method to further refine the 3D placements. Compared to LST results, the thermal-aware RCN graph-based layer assignment algorithm (r = 10%) can further reduce the maximum on-chip temperature by 37%, with only a 6% increase in TS via number and an 8% increase in wirelength.
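
To make the folding idea concrete, the following is a minimal Python sketch, not the T3Place implementation. It assumes a single fold along the vertical line at half the chip width (one plausible reading of "folding-2"), point-sized cells, and a via count of one TS via per layer boundary spanned by a net. The names `fold2`, `hpwl`, and `ts_via_count` are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple

Placement2D = Dict[str, Tuple[float, float]]
Placement3D = Dict[str, Tuple[float, float, int]]


def fold2(placement: Placement2D, width: float) -> Placement3D:
    """Fold a 2D placement once along the line x = width / 2 (assumed fold axis).

    Cells on the left half stay on layer 0; cells on the right half are
    mirrored across the fold line and moved to layer 1.
    """
    half = width / 2.0
    folded: Placement3D = {}
    for cell, (x, y) in placement.items():
        if x <= half:
            folded[cell] = (x, y, 0)
        else:
            folded[cell] = (width - x, y, 1)
    return folded


def hpwl(placement: Dict[str, Sequence[float]], nets: List[List[str]]) -> float:
    """Half-perimeter wirelength in the x-y plane; works for 2D or 3D tuples."""
    total = 0.0
    for net in nets:
        xs = [placement[c][0] for c in net]
        ys = [placement[c][1] for c in net]
        total += (max(xs) - min(xs)) + (max(ys) - min(ys))
    return total


def ts_via_count(placement: Placement3D, nets: List[List[str]]) -> int:
    """Assumed via model: one TS via per layer boundary a net spans."""
    total = 0
    for net in nets:
        layers = [placement[c][2] for c in net]
        total += max(layers) - min(layers)
    return total
```

On a toy netlist with cells at x = 1, 3, and 9 on a chip of width 10, folding pulls the far-right cell back over the left half, so the net that crossed the chip becomes shorter at the cost of one via. This mirrors the wirelength versus TS via tradeoff the abstract describes, though the real transformations operate on full placements with cell areas.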


Asia and South Pacific Design Automation Conference | 2009

A multilevel analytical placement for 3D ICs

Jason Cong; Guojie Luo

In this paper we propose a multilevel non-linear programming based 3D placement approach that minimizes a weighted sum of total wirelength and TS via number subject to area density constraints. This approach relaxes the discrete layer assignments so that they are continuous in the z-direction and the problem can be solved by an analytical global placer. A key idea is to do the overlap removal and device layer assignment simultaneously by adding a density penalty function for both area and TS via density constraints. Experimental results show that this analytical placer in a multilevel framework is effective at achieving trade-offs between wirelength and TS via number. Compared to the recently published transformation-based 3D placement method [1], we are able to achieve on average 12% shorter wirelength and 29% fewer TS vias compared to their cases with best wirelength; we are also able to achieve on average 20% shorter wirelength and 50% fewer TS vias compared to their cases with best TS via numbers.
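
One way to write down the relaxed problem described above is the penalized objective below. This is a sketch under stated assumptions, not the paper's exact formulation: the weight \alpha, the penalty multiplier \mu, the penalty function P, the per-net terms, and the layer count K are all notation introduced here for illustration.

```latex
\min_{x,\,y,\,z} \;
  \sum_{e \in E} \Bigl( \mathrm{WL}_e(x, y) + \alpha \, \mathrm{TSV}_e(z) \Bigr)
  \; + \; \mu \, P(x, y, z),
\qquad z_i \in [1, K] \ \text{(relaxed from } z_i \in \{1, \dots, K\}\text{)}

\mathrm{TSV}_e(z) \;=\; \max_{i \in e} z_i \; - \; \min_{i \in e} z_i
```

Here WL_e is the wirelength of net e, TSV_e approximates its via count by the layer span of its cells, and P stands for the density penalty that handles overlap removal and layer assignment together. Once z is continuous, all three coordinates can be optimized by the same analytical solver.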


Asia and South Pacific Design Automation Conference | 2013

Optimizing routability in large-scale mixed-size placement

Jason Cong; Guojie Luo; Kalliopi Tsota; Bingjun Xiao

One of the necessary requirements for the placement process is that it should be capable of generating routable solutions. This paper describes a simple but effective method leading to the reduction of the routing congestion and the final routed wirelength for large-scale mixed-size designs. In order to reduce routing congestion and improve routability, we propose blocking narrow regions on the chip. We also propose dummy-cell insertion inside regions characterized by reduced fixed-macro density. Our placer consists of three major components: (i) narrow channel reduction by performing neighbor-based fixed-macro inflation; (ii) dummy-cell insertion inside large regions with reduced fixed-macro density; and (iii) pre-placement inflation by detecting tangled logic structures in the netlist and minimizing the maximum pin density. We evaluated the quality of our placer using the newly released DAC 2012 routability-driven placement contest designs and we compared our results to the top four teams that participated in the placement contest. The experimental results reveal that our placer improves the routability of the DAC 2012 placement contest designs and effectively reduces the routing congestion.


IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems | 2013

An Analytical Placement Framework for 3-D ICs and Its Extension on Thermal Awareness

Guojie Luo; Yiyu Shi; Jason Cong

In this paper, we present a high-quality analytical 3-D placement framework. We propose using a Huber-based local smoothing technique to work with a Helmholtz-based global smoothing technique to handle the nonoverlapping constraints. The experimental results show that this analytical approach is effective for achieving tradeoffs between the wirelength and the through-silicon-via (TSV) number. Compared to the state-of-the-art 3-D placer ntuplace3d, our placer achieves more than 20% wirelength reduction, on average, with a similar number of TSVs. Furthermore, we extend this analytical 3-D placement framework with thermal awareness. While 2-D thermal-aware placement simply follows uniform power distribution to minimize temperature, we show that the same criterion does not work for 3-D ICs. Instead, we are able to prove that when the TSV area in each bin is proportional to the lumped power consumption of that bin and the bins in all tiers directly above it, the peak temperature is minimized. Based on this criterion, we implement thermal awareness in our analytical 3-D placement framework. Compared with a TSV-oblivious method, which only results in an 8% peak temperature reduction, our method reduces the peak temperature by 34%, on average, with slightly less wirelength overhead. These results suggest that considering the thermal effects of TSVs is necessary and effective during the placement stage.
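
The proportionality criterion above can be illustrated with a short Python sketch. It is not the paper's placer: it assumes tiers are listed bottom to top (so "above" means a higher tier index), that bins are aligned across tiers, and that a fixed total TSV area budget is split across all bins. The function names and the budget parameter are hypothetical.

```python
from typing import List


def lumped_power(power: List[List[float]]) -> List[List[float]]:
    """For each bin, sum its own power and that of the bins directly above it.

    power[t][b] is the power of bin b on tier t; tier 0 is assumed to be the
    bottom tier, so "above" means tiers with a larger index.
    """
    tiers, bins = len(power), len(power[0])
    lumped = [[0.0] * bins for _ in range(tiers)]
    running = [0.0] * bins
    for t in range(tiers - 1, -1, -1):  # walk from the top tier downward
        for b in range(bins):
            running[b] += power[t][b]
            lumped[t][b] = running[b]
    return lumped


def tsv_area_targets(power: List[List[float]], total_tsv_area: float) -> List[List[float]]:
    """Split an assumed total TSV area budget in proportion to lumped power."""
    lumped = lumped_power(power)
    norm = sum(sum(row) for row in lumped)
    if norm == 0:
        raise ValueError("total power must be positive")
    return [[total_tsv_area * v / norm for v in row] for row in lumped]
```

For two tiers with bin powers [[1, 3], [1, 1]], the lumped powers are [[2, 4], [1, 1]], so the bottom-tier bin under the hotter column receives the largest share of TSV area. This shows why a uniform power distribution alone is not the right target in 3-D: lower bins must also conduct the heat generated above them.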


IEEE Transactions on Mobile Computing | 2016

Sextant: Towards Ubiquitous Indoor Localization Service by Photo-Taking of the Environment

Ruipeng Gao; Yang Tian; Fan Ye; Guojie Luo; Kaigui Bian; Yizhou Wang; Tao Wang; Xiaoming Li

Mainstream indoor localization technologies rely on RF signatures that require extensive human efforts to measure and periodically recalibrate signatures. The progress to ubiquitous localization remains slow. In this study, we explore Sextant, an alternative approach that leverages environmental reference objects such as store logos. A user uses a smartphone to obtain relative position measurements to such static reference objects for the system to triangulate the user location. Sextant leverages image matching algorithms to automatically identify the chosen reference objects by photo-taking, and we propose two methods to systematically address image matching mistakes that cause large localization errors. We formulate the benchmark image selection problem, prove its NP-completeness, and propose a heuristic algorithm to solve it. We also propose a couple of geographical constraints to further infer unknown reference objects. To enable fast deployment, we propose a lightweight site survey method for service providers to quickly estimate the coordinates of reference objects. Extensive experiments have shown that the Sextant prototype achieves 2-5 m accuracy at 80-percentile, comparable to the industry state-of-the-art, while covering a 150 x 75 m mall and a 300 x 200 m train station requires a one-time investment of only 2-3 man-hours from service providers.


International Conference on Computer-Aided Design | 2012

Memory partitioning and scheduling co-optimization in behavioral synthesis

Peng Li; Yuxin Wang; Peng Zhang; Guojie Luo; Tao Wang; Jason Cong

Achieving optimal throughput by extracting parallelism in behavioral synthesis often exaggerates memory bottleneck issues. Data partitioning is an important technique for increasing memory bandwidth by scheduling multiple simultaneous memory accesses to different memory banks. In this paper we present a vertical memory partitioning and scheduling algorithm that can generate a valid partition scheme for arbitrary affine memory inputs. It does this by arranging non-conflicting memory accesses across the border of loop iterations. A mixed memory partitioning and scheduling algorithm is also proposed to combine the advantages of the vertical and other state-of-the-art algorithms. A set of theorems is provided as criteria for selecting a valid partitioning scheme. This is followed by an optimal and scalable memory scheduling algorithm. By utilizing the property of constant strides between memory addresses in successive loop iterations, an address translation optimization technique for an arbitrary partition factor is proposed to improve performance, area and energy efficiency. Experimental results show that on a set of real-world medical image processing kernels, the proposed mixed algorithm with address translation optimization can gain speed-up, area reduction and power savings of 15.8%, 36% and 32.4% respectively, compared to the state-of-the-art memory partitioning algorithm.
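
As background for the partitioning problem, here is a minimal Python sketch of plain cyclic partitioning, a common baseline scheme and not the vertical or mixed algorithm proposed in the paper. It assumes a unit-stride loop whose body reads fixed offsets from the loop index, so the accesses of one iteration are conflict-free exactly when the offsets fall into distinct banks. All function names are hypothetical.

```python
from typing import List, Optional, Tuple


def conflict_free(offsets: List[int], banks: int) -> bool:
    """True if accesses a[i + off] hit distinct banks under bank = addr % banks.

    Adding the loop index i shifts every address by the same amount, so it is
    enough to check the offsets themselves.
    """
    distinct = set(offsets)
    return len({off % banks for off in distinct}) == len(distinct)


def min_banks(offsets: List[int], max_banks: int = 64) -> Optional[int]:
    """Smallest cyclic partition factor with no bank conflict, or None."""
    lower = max(1, len(set(offsets)))
    for banks in range(lower, max_banks + 1):
        if conflict_free(offsets, banks):
            return banks
    return None


def translate(addr: int, banks: int) -> Tuple[int, int]:
    """Map a flat address to (bank index, address inside that bank)."""
    return addr % banks, addr // banks
```

For a three-point stencil with offsets -1, 0, 1, three banks suffice. For offsets 0, 3, 6, three banks collide and the smallest conflict-free factor is four. Non-power-of-two factors like these make the modulo and division in `translate` costly in hardware, which is the kind of overhead the address translation optimization in the abstract targets.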


IEEE International 3D Systems Integration Conference | 2010

Logic-on-logic 3D integration and placement

Thorlindur Thorolfsson; Guojie Luo; Jason Cong; Paul D. Franzon

In this paper we describe three 3D standard cell placement algorithms, which are: “3D Placement using Sequential Off-the-Shelf 2D Placement Tools”, “True-3D Analytical Placement with mPL” and “3D Placement using Simultaneous 2D Placements with mPL”. We use these algorithms to place three case studies in a real face-to-face 3D integration process. The three case studies are a 2-point FFT butterfly processing element (PE), an Advanced Encryption Standard encryption block (AES) and a multiple-input and multiple-output wireless decoder (MIMO). The placements are then fully routed and compared to 2D placements in terms of performance and power consumption. Using this methodology we show that using 3D face-to-face integration with microbumps in conjunction with the three placement algorithms we can improve the maximum clock speed of the AES module by 15.3% and the PE by 22.6%, while reducing the power of the AES module and the PE by 2.6% and 12.9% respectively.


International Symposium on Low Power Electronics and Design | 2016

Energy-Efficient CNN Implementation on a Deeply Pipelined FPGA Cluster

Chen Zhang; Di Wu; Jiayu Sun; Guangyu Sun; Guojie Luo; Jason Cong

Recently, FPGA-based CNN accelerators have demonstrated superior energy efficiency compared to high-performance devices like GPGPUs. However, due to constrained on-chip resources and many other factors, single-board FPGA designs may have difficulties in achieving optimal energy efficiency. In this paper we present a deeply pipelined multi-FPGA architecture that expands the design space for optimal performance and energy efficiency. A dynamic programming algorithm is proposed to map the CNN computing layers efficiently to different FPGA boards. To demonstrate the potential of the architecture, we built a prototype system with seven FPGA boards connected with high-speed serial links. The experimental results on AlexNet and VGG-16 show that the prototype can achieve up to 21x and 2x energy efficiency compared to optimized multi-core CPU and GPU implementations, respectively.
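
The abstract does not give the cost model behind its dynamic programming mapper, so the Python sketch below only illustrates the general shape of such a mapping. It assumes each layer has a single scalar cost, that each board runs a contiguous group of layers, and that the goal is to minimize the slowest pipeline stage. The function `map_layers` and its cost inputs are hypothetical.

```python
from typing import List, Tuple


def map_layers(costs: List[float], boards: int) -> Tuple[float, List[List[int]]]:
    """Split layers into `boards` contiguous groups, minimizing the largest group cost.

    Returns the bottleneck stage cost and the layer indices assigned to each board.
    """
    n = len(costs)
    if not 1 <= boards <= n:
        raise ValueError("need between 1 and len(costs) boards")

    prefix = [0.0]
    for c in costs:
        prefix.append(prefix[-1] + c)

    inf = float("inf")
    # best[k][j]: minimal bottleneck when the first j layers use k boards.
    best = [[inf] * (n + 1) for _ in range(boards + 1)]
    cut = [[0] * (n + 1) for _ in range(boards + 1)]
    best[0][0] = 0.0
    for k in range(1, boards + 1):
        for j in range(k, n + 1):
            for i in range(k - 1, j):  # layers i..j-1 go on board k
                stage = prefix[j] - prefix[i]
                candidate = max(best[k - 1][i], stage)
                if candidate < best[k][j]:
                    best[k][j] = candidate
                    cut[k][j] = i

    # Walk the recorded cut points backward to recover the groups.
    groups: List[List[int]] = []
    j = n
    for k in range(boards, 0, -1):
        i = cut[k][j]
        groups.append(list(range(i, j)))
        j = i
    groups.reverse()
    return best[boards][n], groups
```

With layer costs 4, 2, 3, 7, 1 and three boards, the sketch places layer 0 alone, layers 1 and 2 together, and layers 3 and 4 together, for a bottleneck of 8. A real mapper would also have to account for board resources and inter-board link bandwidth, which this toy objective ignores.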


International Symposium on Physical Design | 2013

FF-bond: multi-bit flip-flop bonding at placement

Chang-Cheng Tsai; Yiyu Shi; Guojie Luo; Iris Hui-Ru Jiang

Clock power contributes a significant portion of chip power in modern IC design. Applying multi-bit flip-flops can effectively reduce clock power. State-of-the-art work performs multi-bit flip-flop clustering at the post-placement stage. However, the solution quality may be limited because the combinational gates are immovable during the clustering process. To overcome the deficiency, in this paper, we propose multi-bit flip-flop bonding at placement. Inspired by ionic bonding in chemistry, we direct flip-flops to merging-friendly locations, thus facilitating flip-flop merging. Experimental results show that our algorithm, called FF-Bond, can save 27% clock power on average. Compared with state-of-the-art post-placement multi-bit flip-flop clustering, FF-Bond can further reduce clock power by 14%.


IEEE Transactions on Mobile Computing | 2016

Multi-Story Indoor Floor Plan Reconstruction via Mobile Crowdsensing

Ruipeng Gao; Mingmin Zhao; Tao Ye; Fan Ye; Guojie Luo; Yizhou Wang; Kaigui Bian; Tao Wang; Xiaoming Li

The lack of floor plans is a critical reason behind the current sporadic availability of indoor localization service. Service providers have to go through effort-intensive and time-consuming business negotiations with building operators, or hire dedicated personnel to gather such data. In this paper, we propose Jigsaw, a floor plan reconstruction system that leverages crowdsensed data from mobile users. It extracts the position, size, and orientation information of individual landmark objects from images taken by users. It also obtains the spatial relation between adjacent landmark objects from inertial sensor data, then computes the coordinates and orientations of these objects on an initial floor plan. By combining user mobility traces and locations where images are taken, it produces complete floor plans with hallway connectivity, room sizes, and shapes. It also identifies different types of connection areas (e.g., escalators and stairs) between stories, and employs a refinement algorithm to correct detection errors. Our experiments on three stories of two large shopping malls show that the 90-percentile errors of positions and orientations of landmark objects are about 1-2 m and 5-9°, while the hallway connectivity and connection areas between stories are 100 percent correct.

Collaboration


Dive into Guojie Luo's collaborations.

Top Co-Authors

Jason Cong

University of California

Ruipeng Gao

Beijing Jiaotong University

Fan Ye

Stony Brook University

Yiyu Shi

University of Notre Dame
