Publication


Featured research published by Tien-Fu Chen.


Applied Physics Letters | 2007

Flexible field emitter made of carbon nanotubes microwave welded onto polymer substrates

Chen-Chan Wang; Tien-Fu Chen; Shih-Chieh Chang; T.S. Chin; Syh-Yuh Cheng

The drastic temperature rise of multiwall carbon nanotubes (MWCNTs) in response to microwave irradiation was applied to weld a MWCNT paste onto a polymer substrate within a few seconds. This provides strong bonding between the MWCNTs and the polymer without thermal damage to the substrate. A flexible field emitter was made from MWCNTs microwave welded onto polycarbonate, showing excellent electrical conduction and field-emission properties even under bending. The field emitter works with a turn-on voltage of 0.8 V/μm due to the direct electron transfer. By this method, printed circuits and field-emission devices can be processed simultaneously within seconds, leading to important applications in flexible electronic devices.


Symposium on VLSI Circuits | 2014

ReRAM-based 4T2R nonvolatile TCAM with 7x NVM-stress reduction, and 4x improvement in speed-wordlength-capacity for normally-off instant-on filter-based search engines used in big-data processing

Li-Yue Huang; Meng-Fan Chang; Ching-Hao Chuang; Chia-Chen Kuo; Chien-Fu Chen; Geng-Hau Yang; Hsiang-Jen Tsai; Tien-Fu Chen; Shyh-Shyuan Sheu; Keng-Li Su; Frederick T. Chen; Tzu-Kun Ku; Ming-Jinn Tsai; Ming-Jer Kao

This study proposes an RC-filtered stress-decoupled (RCSD) 4T2R nonvolatile TCAM (nvTCAM) to 1) suppress match-line (ML) leakage current from match cells (IML-M), 2) reduce ML parasitic load (CML), and 3) decouple NVM-stress from wordlength (WDL) and IML-MIS. RCSD reduces NVM-stress by 7+x and achieves a 4+x improvement in speed-WDL-capacity-product. A 128×32b RCSD nvTCAM macro was fabricated using HfO ReRAM and a 180nm CMOS. This paper presents the first ReRAM-based nvTCAM featuring the shortest (1.2ns) search delay (TSD) among nvTCAMs with WDL≥32b.


IEEE Transactions on Very Large Scale Integration Systems | 2012

A Scalable High-Performance Virus Detection Processor Against a Large Pattern Set for Embedded Network Security

Chieh-Jen Cheng; Chao-Ching Wang; Wei-Chun Ku; Tien-Fu Chen; Jinn-Shyan Wang

Contemporary network security applications generally require the ability to perform powerful pattern matching to protect against attacks such as viruses and spam. Traditional hardware solutions are intended for firewall routers. However, the solutions in the literature for firewalls are not scalable, and they do not address the difficulty of an antivirus with an ever-larger pattern set. The goal of this work is to provide a systematic virus detection hardware solution for network security for embedded systems. Instead of placing entire matching patterns on a chip, our solution is a two-phase dictionary-based antivirus processor that works by condensing as much of the important filtering information as possible onto a chip and infrequently accessing off-chip data to make the matching mechanism scalable to large pattern sets. In the first stage, the filtering engine can filter out more than 93.1% of data as safe, using a merged shift table. Only 6.9% or less of potentially unsafe data must be precisely checked in the second stage by the exact-matching engine from off-chip memory. To reduce the impact of the memory gap, we also propose three enhancement algorithms to improve performance: 1) a skipping algorithm; 2) a cache method; and 3) a prefetching mechanism.
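
The two-stage idea in this abstract can be illustrated with a small software sketch. This is not the paper's merged shift table or its hardware design; it assumes a generic Wu-Manber-style shift table as the cheap first-stage filter and a plain byte comparison as the second-stage exact check. The names `build_shift_table` and `scan`, the block size of 2, and the sample patterns are all hypothetical.

```python
def build_shift_table(patterns, block=2):
    """Stage-1 filter data: for each byte block, how far the window may safely skip."""
    m = min(len(p) for p in patterns)          # only the first m bytes of each pattern are indexed
    assert m >= block, "patterns must be at least one block long"
    default = m - block + 1                    # skip used for blocks that occur in no pattern
    shift = {}
    for p in patterns:
        for end in range(block, m + 1):        # end = exclusive end offset of the block
            key = p[end - block:end]
            shift[key] = min(shift.get(key, default), m - end)
    return shift, m, default


def scan(data, patterns, block=2):
    """Return (hits, checked): matches found and how many windows needed an exact check."""
    shift, m, default = build_shift_table(patterns, block)
    hits, checked = [], 0
    pos = m                                    # exclusive end of the current window
    while pos <= len(data):
        skip = shift.get(data[pos - block:pos], default)
        if skip > 0:
            pos += skip                        # stage 1: window cannot end a match, skip ahead
            continue
        checked += 1                           # stage 2: exact comparison against full patterns
        start = pos - m
        for p in patterns:
            if data.startswith(p, start):
                hits.append((start, p))
        pos += 1
    return hits, checked


patterns = [b"virus", b"malware"]              # hypothetical signature set
data = b"clean data with a virus inside and malware too"
hits, checked = scan(data, patterns)
print(hits, "exact checks:", checked, "of", len(data), "bytes")
```

In this sketch the shift table plays the role of the small on-chip filter, and the exact comparison stands in for the infrequent off-chip lookup; most window positions are skipped without ever reaching the second stage.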


Design Automation Conference | 2015

Energy-efficient non-volatile TCAM search engine design using priority-decision in memory technology for DPI

Hsiang-Jen Tsai; Keng-Hao Yang; Yin-Chi Peng; Chien-Chen Lin; Ya-Han Tsao; Meng-Fan Chang; Tien-Fu Chen

TCAM-based search engines are widely used in regular expression matching across multiple packets. However, the use of a priority encoder results in increased energy consumption for pattern updates and search operations. This work proposes a promising memory technology, called Priority-Decision in Memory (PDM), which eliminates the need for priority encoders and removes restrictions on ordering, meaning that patterns can be stored in an arbitrary order without sorting by length. Moreover, we present a Sequential Input-State Search (SIS) scheme to disable the mass of redundant search operations in state segments, based on an analysis of the distribution of hex signatures in a virus database. Experimental results demonstrate that PDM-based technology can improve the update energy consumption of nvTCAM search engines by 36% to 67%, because most of the energy in the latter is used for reordering. By adopting the SIS-based method to avoid unnecessary search operations in a TCAM array, the search energy of nvTCAM search engines is reduced by around 64%.


IEEE Transactions on Very Large Scale Integration Systems | 2015

Soft-Error-Tolerant Design Methodology for Balancing Performance, Power, and Reliability

Hsuan-Ming Chou; Ming-Yi Hsiao; Yi-Chiao Chen; Keng-Hao Yang; Jean Tsao; Chiao-Ling Lung; Shih-Chieh Chang; Wen-Ben Jone; Tien-Fu Chen

Soft errors have become an important reliability issue in advanced technologies. To tolerate soft errors, solutions suggested in previous works incur significant performance and power penalties, especially when a design with fault-tolerant structures is overprotected. In this paper, we present a soft-error-tolerant design methodology to trade off performance, power, and reliability for different applications. First, four novel detection and correction flip-flop (FF) structures are proposed to provide different levels of tolerance capability against soft errors. Second, architecture-level vulnerability and logic-level susceptibility analyses are employed to identify weak FFs that can easily cause program execution errors. Third, an optimization framework is developed to synthesize the proposed four novel FF structures into weak and highly observable storage bits with the flexibility of trading off performance, power, and reliability. A five-stage pipeline RISC core (UniRISC) is adopted to demonstrate the usefulness of our methodology. Experimental results show that the proposed method can accomplish design goals by balancing performance, power, and reliability. For example, we can not only satisfy the reliability requirement that no more than five errors occur per one billion hours in a design but also reduce up to 87% of the performance overhead and 91% of the power overhead when compared with previous works.


IEEE Transactions on Very Large Scale Integration Systems | 2017

Energy-Efficient TCAM Search Engine Design Using Priority-Decision in Memory Technology

Hsiang-Jen Tsai; Keng-Hao Yang; Yin-Chi Peng; Chien-Chen Lin; Ya-Han Tsao; Meng-Fan Chang; Tien-Fu Chen

Ternary content-addressable memory (TCAM)-based search engines generally need a priority encoder (PE) to select the highest priority match entry for resolving the multiple match problem due to the don’t care (X) features of TCAM. In contemporary network security, TCAM-based search engines are widely used in regular expression matching across multiple packets to protect against attacks, such as by viruses and spam. However, the use of a PE results in increased energy consumption for pattern updates and search operations. Instead of using PEs to determine the match, our solution is a three-phase search operation that utilizes the length information of the matched patterns to determine the longest matching pattern. This paper proposes a promising memory technology called priority-decision in memory (PDM), which eliminates the need for PEs and removes restrictions on ordering, implying that patterns can be stored in an arbitrary order without sorting by length. Moreover, we present a sequential input-state (SIS) scheme to disable the mass of redundant search operations in state segments on the basis of an analysis of the distribution of hex signatures in a virus database. Experimental results demonstrate that the PDM-based technology can improve the update energy consumption of nonvolatile TCAM (nvTCAM) search engines by 36%–67%, because most of the energy in these search engines is used for reordering. By adopting the SIS-based method to avoid unnecessary search operations in a TCAM array, the search energy of nvTCAM search engines is reduced by around 64%.
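
The contrast between a priority encoder and a length-based decision can be sketched in software. This is only an analogy under stated assumptions, not the PDM circuit or its three-phase search: entries are modeled as ternary strings over `0`, `1`, and `X`, and an entry's "length" is assumed to be its count of non-X symbols. The names `Entry`, `search_with_priority_encoder`, and `search_by_length` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    pattern: str  # ternary pattern, e.g. "10XX"; 'X' is a don't-care symbol

    @property
    def length(self):
        # Assumption for this sketch: "length" = number of specified symbols.
        return sum(c != "X" for c in self.pattern)

    def matches(self, key):
        return len(key) == len(self.pattern) and all(
            p in ("X", k) for p, k in zip(self.pattern, key)
        )


def search_with_priority_encoder(table, key):
    """First matching index wins, so the table must be kept sorted by priority."""
    for i, entry in enumerate(table):
        if entry.matches(key):
            return i
    return None


def search_by_length(table, key):
    """Pick the matching entry with the largest stored length; order is irrelevant."""
    best = None
    for i, entry in enumerate(table):
        if entry.matches(key) and (best is None or entry.length > table[best].length):
            best = i
    return best


unsorted_table = [Entry("1XXX"), Entry("101X"), Entry("10XX")]
key = "1011"
print(unsorted_table[search_with_priority_encoder(unsorted_table, key)].pattern)
print(unsorted_table[search_by_length(unsorted_table, key)].pattern)
```

With the unsorted table, the encoder-style search returns the first match rather than the longest one, so every update would have to re-sort the entries. The length-based search gives the longest match for any insertion order, which is the ordering freedom the abstract describes.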


International Symposium on VLSI Design, Automation and Test | 2012

IMITATOR: A deterministic multicore replay system with refining techniques

Shing-Yu Chen; Chi-Neng; Geng-Hau Yang; Wen-Ben Jone; Tien-Fu Chen

Developing parallel programs imposes many debugging challenges on multicore systems. Many researchers have succeeded in detecting parallel faults in the background with hardware assistance. However, reproducing the same faulty circumstances after faults have occurred remains an urgent issue. Tracing the causality between events is a popular solution in current multicore systems, but it is limited by on-chip storage and tracing bandwidth. As a result, an intelligent record and replay system is the key to future multicore debugging problems. This paper proposes IMITATOR for both trace compression and deterministic replay. In contrast to most other record and replay systems, IMITATOR introduces an additional phase, the refining phase, between the record and replay phases to significantly reduce the recorder overhead while enabling faster replay. Results with the SPLASH2 benchmark on a 32-core system show that IMITATOR can (a) significantly reduce trace size through its trace refining techniques (~16% of the native trace) and (b) achieve a replay speed 1.96 times faster on average than a replayer using the Sigrace scheme.


IEEE Transactions on Very Large Scale Integration Systems | 2017

A Flexible Wildcard-Pattern Matching Accelerator via Simultaneous Discrete Finite Automata

Hsiang-Jen Tsai; Chien-Chih Chen; Yin-Chi Peng; Ya-Han Tsao; Yen-Ning Chiang; Wei-Cheng Zhao; Meng-Fan Chang; Tien-Fu Chen

Regular expression matching has become an indispensable element of Internet of Things network security. However, a traditional ternary content addressable memory (TCAM) search engine is unable to handle patterns with wildcards, as it precisely tracks only one active state with a single transition. This paper proposes a promising simultaneous pattern matching methodology for wildcard patterns that uses two separate engines to represent discrete finite automata. A key preprocessing step, which encodes each possible postfix pattern with a unique key, ensures that follow-up patterns can accurately traverse all possible matches with limited hardware resources. This approach is practical and scalable for achieving good performance and low space consumption in network security, and it is applicable to any regular expression, even one with multiwildcard patterns. The experimental results demonstrate that this scheme can efficiently and accurately recognize wildcard patterns by simultaneously tracking only two active states. By adopting SRAM TCAM in the proposed architecture, the energy consumption is reduced to around 39%, compared with the energy consumption using a computing system that contains a large memory lookup and comparison overhead.
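
As a rough software illustration of splitting a wildcard pattern between two cooperating matchers, the sketch below treats each rule as `prefix.*postfix`. It is an assumption-laden analogy, not the paper's architecture or its key encoding: one step "arms" a rule key when the prefix is seen, and a second step reports a hit when the postfix appears later under an armed key. The function name `find_wildcard_matches` and the sample rules are hypothetical.

```python
def find_wildcard_matches(data, rules):
    """rules maps a rule key to (prefix, postfix), read as the pattern prefix.*postfix.

    Returns a list of (key, end_offset) for every postfix occurrence that starts
    at or after the end of the earliest prefix occurrence for the same key.
    """
    armed = {}   # rule key -> offset just past the earliest prefix occurrence
    hits = []
    for end in range(1, len(data) + 1):
        for key, (prefix, postfix) in rules.items():
            # Prefix step: remember the key once the prefix has been seen.
            if key not in armed and data.endswith(prefix, 0, end):
                armed[key] = end
            # Postfix step: only fires for keys armed before the postfix starts.
            start = end - len(postfix)
            if key in armed and armed[key] <= start and data.endswith(postfix, 0, end):
                hits.append((key, end))
    return hits


rules = {"r1": (b"GET", b".exe"), "r2": (b"ab", b"bc")}   # hypothetical rules
print(find_wildcard_matches(b"GET /tmp/a.exe abc", rules))
```

The per-key state here is a single offset, so the amount of bookkeeping stays fixed regardless of how many bytes the wildcard spans. The overlap check prevents `abc` from counting as a match for `ab.*bc`.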


ACM Transactions on Design Automation of Electronic Systems | 2017

Leak Stopper: An Actively Revitalized Snoop Filter Architecture with Effective Generation Control

Yin-Chi Peng; Chien-Chih Chen; Hsiang-Jen Tsai; Keng-Hao Yang; Pei-Zhe Huang; Shih-Chieh Chang; Wen-Ben Jone; Tien-Fu Chen

To alleviate the high energy dissipation of unnecessary snooping accesses, snoop filters have been designed to reduce snoop lookups. These filters suffer from decreasing filtering efficiency, and thus usually rely on a partial or whole filter reset triggered by detecting block evictions. Unfortunately, the reset conditions occur infrequently or unevenly (called passive filter deletion). This work proposes the concept of revitalized snoop filter (RSF) design, which can actively renew the destination filter by employing a generation wrapping-around scheme for various reference behaviors. We further utilize a sampling mechanism for RSF to trigger precise filter revitalizations in a timely manner, so that unnecessary RSF flushing can be minimized. The proposed RSF can be integrated into various existing inclusive snoop filters with only a minor change to their designs. We evaluate our proposed design and demonstrate that RSF eliminates 58.6% of snoop energy compared to JETTY on average while inducing only 6.5% of revitalization energy overhead. In addition, RSF eliminates 45.5% of snoop energy compared to stream registers on average and only induces 2.5% of revitalization energy overhead. Overall, these RSFs reduce the total L2 cache energy consumption by 52.1% (58.6% - 6.5%) as compared to JETTY and by 43% (45.5% - 2.5%) as compared to stream registers. Furthermore, RSF improves the overall performance by 1% to 1.4% on average compared to JETTY and stream registers for various benchmark suites.


IEEE Transactions on Very Large Scale Integration Systems | 2016

High-Performance Deadlock-Free ID Assignment for Advanced Interconnect Protocols

Hsuan-Ming Chou; Yi-Chiao Chen; Keng-Hao Yang; Jean Tsao; Shih-Chieh Chang; Wen-Ben Jone; Tien-Fu Chen

In a modern system-on-chip design, hundreds of cores and intellectual properties can be integrated into a single chip. To be suitable for high-performance interconnects, designers increasingly adopt advanced interconnect protocols that support novel mechanisms of parallel accessing, including outstanding transactions and out-of-order completion of transactions. To implement those novel mechanisms, a master tags each transaction with an ID to decide its in-order or out-of-order properties. However, these advanced protocols may lead to transaction deadlocks that do not occur in traditional protocols. To prevent the deadlock problem, current solutions stall suspicious transactions, and in certain cases many such stalls can incur a serious performance penalty. In this brief, we propose a novel ID assignment mechanism that guarantees that issued transactions are deadlock-free and significantly reduces the number of transaction stalls issued by masters. Our experimental results show encouraging performance improvements compared with previous works, with little hardware and power overhead.

Collaboration


Dive into Tien-Fu Chen's collaborations.

Top Co-Authors

Hsiang-Jen Tsai
National Chiao Tung University

Keng-Hao Yang
National Chiao Tung University

Meng-Fan Chang
National Tsing Hua University

Yin-Chi Peng
National Chiao Tung University

Chien-Chih Chen
National Chiao Tung University

Shih-Chieh Chang
National Tsing Hua University

Ya-Han Tsao
National Chiao Tung University

Chien-Chen Lin
National Tsing Hua University

Geng-Hau Yang
National Chiao Tung University

Hsuan-Ming Chou
National Tsing Hua University