Jean Jyh-Jiun Shann | Researchain

Publication

Featured research published by Jean Jyh-Jiun Shann.

Information Processing and Management | 2003

Inverted file compression through document identifier reassignment

Wann-Yun Shieh; Tien-Fu Chen; Jean Jyh-Jiun Shann; Chung-Ping Chung

The inverted file is the most popular indexing mechanism for document search in an information retrieval system. Compressing an inverted file can greatly improve the document search rate. Traditionally, the d-gap technique is used in inverted file compression: document identifiers are replaced with gap values that are usually much smaller. However, fluctuating gap values cannot be compressed efficiently by some well-known prefix-free codes. To smooth and reduce the gap values, we propose a document-identifier reassignment algorithm. The reassignment is based on a similarity factor between documents. We generate a reassignment order for all documents according to this similarity, so that documents with closer relationships receive closer identifiers. Simulation results show that the average gap values of sample inverted files can be reduced by 30%, and the compression rate of a d-gapped inverted file with prefix-free codes can be improved by 15%.
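To make the d-gap idea concrete, here is a minimal sketch of how gaps are derived from a posting list and how renumbering documents can shrink them. The toy inverted file and the `new_id` mapping are hand-picked assumptions for illustration; the similarity-based ordering proposed in the paper is not reproduced here.

```python
def d_gaps(doc_ids):
    """Replace sorted document identifiers with the gaps between neighbours."""
    ids = sorted(doc_ids)
    return [ids[0]] + [b - a for a, b in zip(ids, ids[1:])]


def reassign(postings, new_id):
    """Rewrite every posting list using a new document numbering."""
    return {term: sorted(new_id[d] for d in docs)
            for term, docs in postings.items()}


def average_gap(postings):
    """Mean gap value over all posting lists of the inverted file."""
    gaps = [g for docs in postings.values() for g in d_gaps(docs)]
    return sum(gaps) / len(gaps)


# Toy inverted file: documents 1, 5, 9 share one term; 2, 6, 10 share another.
postings = {"cache": [1, 5, 9], "index": [2, 6, 10]}

# Hypothetical reassignment that gives related documents adjacent identifiers.
new_id = {1: 1, 5: 2, 9: 3, 2: 4, 6: 5, 10: 6}

before = average_gap(postings)
after = average_gap(reassign(postings, new_id))
print(before, after)
```

In this toy case the gaps for "cache" drop from [1, 4, 4] to [1, 1, 1], which is the kind of smoothing that lets a prefix-free code spend fewer bits per gap.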

Design Automation Conference | 2008

ETAHM: an energy-aware task allocation algorithm for heterogeneous multiprocessor

Po-Chun Chang; I-Wei Wu; Jean Jyh-Jiun Shann; Chung-Ping Chung

Driven by the demand for more computing power and lower energy use, multiprocessors with power management facilities are emerging in embedded system design. Dynamic voltage scaling (DVS) is one such facility: it varies clock speed and supply voltage to save energy. In this paper, we propose ETAHM to allocate tasks on a target multiprocessor system. In pursuit of a globally optimal solution, it combines task scheduling, mapping, and DVS utilization in a single phase and couples them with an ant colony optimization algorithm. Extensive experiments show that ETAHM can save 22.71% more energy than CASPER (V. Kianzad et al., 2005), a state-of-the-art integrated framework that tackles the identical problem with a genetic algorithm instead.

Information Processing and Management | 2006

Unique-order interpolative coding for fast querying and space-efficient indexing in information retrieval systems

Cher-Sheng Cheng; Jean Jyh-Jiun Shann; Chung-Ping Chung

This paper presents a size reduction method for the inverted file, the most suitable indexing structure for an information retrieval system (IRS). We notice that in an inverted file the document identifiers for a given word are usually clustered. While this clustering property can be used to reduce the size of the inverted file, good compression and fast decompression must both be available. In this paper, we present a method that facilitates the coding and decoding processes of interpolative coding using recursion elimination and loop unwinding. We call this method unique-order interpolative coding. It can calculate the lower and upper bounds of every document identifier for a binary code without using a recursive process; hence the decompression time can be greatly reduced. Moreover, it can also exploit document identifier clustering to compress the inverted file efficiently. Compared with other well-known compression methods, our method provides fast decoding and excellent compression. It can also be used to support a self-indexing strategy. Our work therefore provides a feasible way to build a fast and space-economical IRS.
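For readers unfamiliar with interpolative coding, the sketch below shows the textbook recursive formulation, in which each identifier is coded within lower and upper bounds implied by its already-coded neighbours. This is the recursive baseline that the abstract says the unique-order method avoids; it is not the paper's algorithm, and the function names are assumptions.

```python
def interpolative_bounds(ids, lo, hi):
    """Return (value, low, high) triples in the order they would be encoded.

    ids is a sorted list of distinct document identifiers known to lie
    within the inclusive range [lo, hi].
    """
    if not ids:
        return []
    mid = len(ids) // 2
    value = ids[mid]
    low = lo + mid                     # `mid` smaller ids must fit below
    high = hi - (len(ids) - 1 - mid)   # the remaining ids must fit above
    return ([(value, low, high)]
            + interpolative_bounds(ids[:mid], lo, value - 1)
            + interpolative_bounds(ids[mid + 1:], value + 1, hi))


def bits_needed(low, high):
    """Bits for a plain binary code over the range [low, high]."""
    return (high - low).bit_length()


# A tightly clustered list: identifiers 3, 4, 5 in a universe of 1..5.
triples = interpolative_bounds([3, 4, 5], 1, 5)
costs = [bits_needed(low, high) for _, low, high in triples]
print(triples, costs)
```

The last identifier is pinned to a single possible value and costs zero bits, which illustrates why clustered identifiers compress well under this scheme.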

International Conference on Parallel and Distributed Systems | 1998

An x86 load/store unit with aggressive scheduling of load/store operations

Hui-Yue Hwang; R-Ming Shiu; Jean Jyh-Jiun Shann

Because of the register-memory instruction set architecture and the limited register set, x86 microprocessors execute a significant number of memory access instructions. As superscalar microprocessors provide higher issue degrees, an aggressive scheduling policy for load/store operations becomes crucial. We examine the scheduling policies of loads/stores on x86 superscalar microprocessors and propose a new aggressive scheduling policy called load speculation, which allows loads to precede earlier unresolved pending stores. Simulation results show that load speculation achieves higher performance than traditional scheduling policies such as load bypassing and load forwarding. Furthermore, by reducing the number of pipeline stages, load speculation can achieve even higher performance.

Design, Automation, and Test in Europe | 2008

Instruction set extension exploration in multiple-issue architecture

I-Wei Wu; Zhi-Yuan Chen; Jean Jyh-Jiun Shann; Chung-Ping Chung

To satisfy the high-performance computing demand in modern embedded devices, current embedded processor architectures let designers either define a customized instruction set extension (ISE) or increase the instruction issue width. Previous studies have shown that deploying ISEs in a multiple-issue architecture can significantly improve performance. However, identifying ISEs for a multiple-issue architecture with current ISE exploration algorithms wastes silicon area and limits the performance improvement. This is because most algorithms overlook two important considerations: (1) only packing the operations lying on the critical path into an ISE can improve performance; (2) the critical path usually changes after operations are packed into an ISE. With these considerations, this paper presents an algorithm for ISE exploration based on list scheduling and Ant Colony Optimization (ACO), which combines ISE exploration with critical path identification (i.e., instruction scheduling). Results indicate that our approach outperforms previous work in both performance improvement and area efficiency.
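Consideration (2) can be illustrated with a small longest-path computation over a data-flow graph. The graph, latencies, and the packed node `ab` below are invented for illustration, and the sketch ignores issue width, list scheduling, and ACO entirely.

```python
def critical_path(latency, preds):
    """Return (length, path) of the longest-latency path in a DAG.

    latency maps each operation to its latency in cycles; preds maps an
    operation to the operations whose results it consumes.
    """
    memo = {}

    def finish(node):
        if node not in memo:
            best = max((finish(p) for p in preds.get(node, [])),
                       key=lambda t: t[0], default=(0, ()))
            memo[node] = (best[0] + latency[node], best[1] + (node,))
        return memo[node]

    return max((finish(n) for n in latency), key=lambda t: t[0])


# Hypothetical data-flow graph: a -> b -> e and c -> e.
latency = {"a": 2, "b": 2, "c": 3, "e": 1}
preds = {"b": ["a"], "e": ["b", "c"]}
before = critical_path(latency, preds)

# Pack a and b into one assumed ISE "ab" that takes 2 cycles in total.
packed_latency = {"ab": 2, "c": 3, "e": 1}
packed_preds = {"e": ["ab", "c"]}
after = critical_path(packed_latency, packed_preds)

print(before, after)
```

Before packing, the critical path runs through a and b; afterwards it runs through c, so packing further operations from the old path would add area without shortening the schedule.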

International Computer Symposium | 2010

File-based sharing for dynamically compiled code on Dalvik virtual machine

Yao-Chih Huang; Yu-Sheng Chen; Wuu Yang; Jean Jyh-Jiun Shann

Memory footprint is considered an important design issue for embedded systems. Sharing dynamically compiled code among virtual machines can reduce memory footprint and recompilation overhead. On the other hand, sharing writable native code may cause security problems, because native function calls, such as those made through the Java Native Interface (JNI), are supported. We propose a native-code sharing mechanism that ensures security for the Dalvik virtual machine (VM) on the Android platform. Dynamically generated code is saved in a file and is shared through memory mapping when other VMs need the same code. Protection is enforced by controlling file write permissions. To improve security, we implement a daemon process, named Query Agent, that controls all accesses to the native code and maintains all the information about traces, which are the units of compilation in the Dalvik VM. We implement our code sharing mechanism on an Android version 2.1 system and run experiments on an ARM-based system. We obtain a 45% code-cache size reduction and a 9% performance improvement from eliminating recompilation overhead.
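The general pattern of publishing generated bytes in a file and letting other processes map it read-only can be sketched as follows. This is only an analogy in Python: the file name, the `publish` and `attach` helpers, and the byte string are assumptions, and a real code cache would also need executable mappings and the Query Agent's access control, neither of which is modelled here.

```python
import mmap
import os
import stat
import tempfile


def publish(code: bytes, path: str) -> None:
    """Write generated bytes to a file, then drop all write permissions."""
    with open(path, "wb") as f:
        f.write(code)
    os.chmod(path, stat.S_IRUSR | stat.S_IRGRP | stat.S_IROTH)


def attach(path: str) -> mmap.mmap:
    """Map the published file read-only, as a consumer process would."""
    with open(path, "rb") as f:
        return mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)


# Hypothetical trace file standing in for one unit of compiled code.
path = os.path.join(tempfile.mkdtemp(), "trace_0001.bin")
publish(b"\x01\x02\x03\x04", path)
view = attach(path)
print(view[:4])
```

Because the mapping is backed by the file, every process that attaches sees the same pages, and attempts to modify the read-only view are rejected.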

Microprocessors and Microsystems | 2002

Design of an optimal folding mechanism for Java processors

Lee-Ren Ton; Lung-Chung Chang; Jean Jyh-Jiun Shann; Chung-Ping Chung

Java has become the most important language in the Internet area, but its execution performance is severely limited by the true data dependency inherited from the stack architecture defined by Sun's Java Virtual Machine (JVM). To enhance the performance of the JVM, Sun Microsystems proposed a stack operations folding mechanism for the picoJava-II processor that folds 42.3% of stack push/pop instructions. A systematic folding algorithm, the Producer, Operator, and Consumer (POC) folding model, was proposed in earlier research to eliminate up to 82.9% of stack push/pop instructions. The remaining push and pop instructions cannot be folded because of the sequential checking characteristic of the POC folding model. A new folding algorithm, the enhanced POC (EPOC) folding model, is proposed in this paper to fold the remaining push and pop instructions. In the EPOC folding model, stack push/pop instructions are folded with the proposed Stack Reorder Buffer (SROB) architecture. With a small SROB size of 584 bits, almost all of the stack push/pop instructions can be folded while retaining precise exception handling. Statistical data show that 98.8% of the stack push/pop instructions can be folded, and the average execution speedup of a 4-foldable processor with a 7-byte instruction buffer is 1.74 compared with a traditional single-pipelined stack machine without folding.
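As a rough illustration of what folding means, the toy pass below collapses one fixed bytecode pattern (two pushes, an arithmetic operator, one pop) into a single register-style operation. It is a deliberately simplified stand-in, not the POC or EPOC model and not picoJava-II's mechanism; the tuple encoding of instructions is an assumption.

```python
PRODUCERS = {"iload"}                  # push a local variable onto the stack
OPERATORS = {"iadd", "isub", "imul"}   # pop two operands, push the result
CONSUMERS = {"istore"}                 # pop the top of stack into a local


def fold(code):
    """Fold producer, producer, operator, consumer runs into one operation."""
    out = []
    i = 0
    while i < len(code):
        window = code[i:i + 4]
        if (len(window) == 4
                and window[0][0] in PRODUCERS
                and window[1][0] in PRODUCERS
                and window[2][0] in OPERATORS
                and window[3][0] in CONSUMERS):
            # (op, first source, second source, destination)
            out.append((window[2][0], window[0][1], window[1][1], window[3][1]))
            i += 4
        else:
            out.append(code[i])
            i += 1
    return out


# Bytecode for c = a + b on a stack machine.
code = [("iload", "a"), ("iload", "b"), ("iadd",), ("istore", "c")]
folded = fold(code)
print(folded)
```

Four stack instructions become one operation with explicit sources and destination, so the two pushes and the pop no longer serialize execution through the stack.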

Annual ACIS International Conference on Computer and Information Science | 2013

Improving performance of JNA by using LLVM JIT compiler

Yu-Hsin Tsai; I-Wei Wu; I-Chun Liu; Jean Jyh-Jiun Shann

Java Native Access (JNA) has been proposed to alleviate the burden of programming with the Java Native Interface (JNI). JNA allows programmers to call native functions without writing any JNI code. However, JNA suffers from some performance degradation. To overcome this problem, we modify the JNA source code and integrate the LLVM JIT compiler into JNA to improve performance. Our experiments show about 8% to 16% performance improvement when calling a native function with different types and numbers of arguments. Furthermore, our design is a non-traditional way of using a runtime compiler, and the challenges we encountered may help other researchers facing similar situations.

Computational Science and Engineering | 2009

Methods for Precise False-Overlap Detection in Tile-Based Rendering

Hsiu-ching Hsieh; Chih-Chieh Hsiao; Hui-Chin Yang; Chung-Ping Chung; Jean Jyh-Jiun Shann

In graphics processing, the overlap test is a crucial step before tile-binning in tile-based rendering for embedded devices. An object in a frame is decomposed into primitives, which are triangles of different sizes, for processing. In the tile-binning process, these triangular primitives are typically represented by bounding boxes. However, the bounding box of a primitive usually covers a significant number of tiles that are not overlapped by the primitive. These tiles are called false-overlap tiles and account for approximately 70% of the tiles of a bounding box. Therefore, in tile-based rendering, it is attractive to identify and eliminate the false-overlap tiles in a bounding box, reducing both the storage pressure in tile-binning and the external memory accesses of the rasterizer. Existing false-overlap detection algorithms are either too tedious to reduce computation or too rough to gain high coverage. In this paper, we propose three methods to eliminate all false-overlap tiles: Cross-Product Test (CPT), Edge-Walk Test (EWT), and Counting X-Ratio (CXR). We partition the bounding box of a primitive into at most three rectangles, according to the number of primitive vertices that are also vertices of the bounding box. The edges of the primitive then become the diagonals of these rectangles, and false-overlap detection becomes a well-formulated mathematical problem. The false-overlap detection of these three rectangles may be processed in parallel to improve performance further. The proposed methods are tested using Doom3 and Quake4 for different screen sizes.
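The notion of a false-overlap tile can be shown with a generic cross-product (edge function) test. The sketch below is a standard separating-axis check written for illustration, assuming a non-degenerate triangle; it is not the paper's CPT, EWT, or CXR method, and it does not use the rectangle partitioning described above.

```python
import math


def edge(ax, ay, bx, by, px, py):
    """Cross product (b - a) x (p - a); positive when p lies left of a->b."""
    return (bx - ax) * (py - ay) - (by - ay) * (px - ax)


def tile_overlaps(tri, x0, y0, x1, y1):
    """True if the tile [x0, x1] x [y0, y1] shares area with the triangle."""
    xs = [x for x, _ in tri]
    ys = [y for _, y in tri]
    if max(xs) <= x0 or min(xs) >= x1 or max(ys) <= y0 or min(ys) >= y1:
        return False                     # separated by a bounding-box side
    corners = [(x0, y0), (x1, y0), (x0, y1), (x1, y1)]
    sign = 1 if edge(*tri[0], *tri[1], *tri[2]) > 0 else -1
    for i in range(3):
        a, b = tri[i], tri[(i + 1) % 3]
        if all(sign * edge(*a, *b, *c) <= 0 for c in corners):
            return False                 # whole tile is outside this edge
    return True


def bin_triangle(tri, tile):
    """Return (tiles in the bounding box, tiles truly overlapped)."""
    xs = [x for x, _ in tri]
    ys = [y for _, y in tri]
    cols = range(math.floor(min(xs) / tile), math.ceil(max(xs) / tile))
    rows = range(math.floor(min(ys) / tile), math.ceil(max(ys) / tile))
    boxed = [(c, r) for r in rows for c in cols]
    hits = [(c, r) for c, r in boxed
            if tile_overlaps(tri, c * tile, r * tile,
                             (c + 1) * tile, (r + 1) * tile)]
    return boxed, hits


# Right triangle spanning an 8 x 8 bounding box, binned into 2 x 2 tiles.
boxed, hits = bin_triangle([(0, 0), (8, 0), (0, 8)], 2)
print(len(boxed), len(hits))
```

In this example the bounding box covers 16 tiles but only 10 share area with the triangle, so 6 tiles are false overlaps that a bounding-box binner would store needlessly.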

High Performance Embedded Architectures and Compilers | 2007

Instruction set extension generation with considering physical constraints

I-Wei Wu; Shih-Chia Huang; Chung-Ping Chung; Jean Jyh-Jiun Shann

In this paper, we propose new algorithms for both ISE exploration and selection that consider important physical constraints such as pipestage timing, instruction set architecture (ISA) format, silicon area, and the register file. To handle these considerations, an ISE exploration algorithm is proposed. It explores not only ISE candidates but also their implementation options, minimizing execution time while using less silicon area. In ISE selection, many studies take only silicon area into account, which is not comprehensive. In this paper, we formulate ISE selection as a multiconstrained 0-1 knapsack problem so that multiple constraints can be considered. Results with MiBench indicate that, for the same number of ISEs, our approach achieves 69.43%, 1.26%, and 33.8% (max., min., and avg., respectively) further reduction in silicon area and up to 1.6% performance improvement compared with the previous approach.
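A multiconstrained 0-1 knapsack can be solved for small instances with dynamic programming over the consumed resources. The sketch below uses two assumed constraints (silicon area and free opcode slots) and made-up candidates with made-up numbers; the paper's actual constraint set and solver are not described in the abstract.

```python
def select_ises(candidates, area_budget, opcode_budget):
    """Pick candidates maximizing cycles saved under two budgets.

    candidates is a list of (name, cycles_saved, area, opcodes) tuples.
    Returns (total cycles saved, names of the chosen candidates).
    """
    # best[(area used, opcodes used)] = (cycles saved, chosen names)
    best = {(0, 0): (0, ())}
    for name, saved, area, opcodes in candidates:
        # Iterate over a snapshot so each candidate is taken at most once.
        for (a, o), (total, chosen) in list(best.items()):
            na, no = a + area, o + opcodes
            if na > area_budget or no > opcode_budget:
                continue
            if total + saved > best.get((na, no), (-1, ()))[0]:
                best[(na, no)] = (total + saved, chosen + (name,))
    return max(best.values(), key=lambda v: v[0])


# Hypothetical ISE candidates: (name, cycles saved, area units, opcodes).
candidates = [
    ("mac", 40, 30, 1),
    ("sad", 35, 25, 1),
    ("crc", 50, 60, 1),
    ("bitrev", 10, 5, 1),
]

two_slots = select_ises(candidates, area_budget=60, opcode_budget=2)
three_slots = select_ises(candidates, area_budget=60, opcode_budget=3)
print(two_slots, three_slots)
```

With the same area budget, the selection changes when the opcode budget changes, which is the reason a single-constraint formulation based on area alone can pick an infeasible or suboptimal set.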
