
Surin Kittitornkun

King Mongkut's Institute of Technology Ladkrabang


Publication

Featured research published by Surin Kittitornkun.

international parallel and distributed processing symposium | 2006

MT-ClustalW: multithreading multiple sequence alignment

Kridsadakorn Chaichoompu; Surin Kittitornkun; Sissades Tongsima

ClustalW is the most widely used tool for aligning multiple protein or nucleotide sequences. The alignment is achieved via three stages: pairwise alignment, guide tree generation, and progressive alignment. This paper analyzes and enhances a multithreaded implementation of ClustalW called ClustalW-SMP for higher throughput. Our goal is to maximize the degree of parallelism in a multithreaded ClustalW called MultiThreading-ClustalW (MT-ClustalW). As a result, bioinformatics laboratories can run MT-ClustalW on multicore and SMP (symmetric multiprocessor) machines with much less energy consumption than on PC clusters. The experimental results show that the MT-ClustalW framework achieves a considerable speedup over both the sequential ClustalW and the original multithreaded ClustalW-SMP implementations.
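The first of the three stages compares every pair of sequences, and each comparison is independent of the others, which is what makes that stage a natural target for multithreading. The sketch below is a minimal illustration of that idea in Python, not the MT-ClustalW implementation: `toy_distance` is a hypothetical stand-in for a real pairwise alignment score, and `distance_matrix` and the sample sequences are assumptions made for the example. In CPython, threads show the task structure rather than a true speedup for CPU-bound work.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import combinations


def toy_distance(a, b):
    """Hypothetical stand-in for a pairwise alignment score.

    Counts mismatched positions plus the length difference, divided by the
    longer length. A real aligner would use dynamic programming instead.
    """
    mismatches = sum(1 for x, y in zip(a, b) if x != y) + abs(len(a) - len(b))
    return mismatches / max(len(a), len(b))


def distance_matrix(seqs, workers=4):
    """Score every unordered pair of sequences as an independent task."""
    pairs = list(combinations(range(len(seqs)), 2))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves input order, so scores line up with pairs.
        scores = pool.map(lambda p: toy_distance(seqs[p[0]], seqs[p[1]]), pairs)
        return dict(zip(pairs, scores))


# Sample sequences are made up for illustration.
distances = distance_matrix(["ACGT", "ACGA", "TTTT"])
```

The resulting dictionary maps index pairs to distances, which is the kind of input a guide tree construction step would consume.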

field-programmable technology | 2007

Applying Cuckoo Hashing for FPGA-based Pattern Matching in NIDS/NIPS

Tran Ngoc Thinh; Surin Kittitornkun; Shigenori Tomiyama

Pattern matching for network intrusion detection/prevention requires extremely high throughput with frequent updates to support new attack patterns. Most current hardware implementations outperform software implementations by a wide margin. However, dynamically updating the pattern set remains challenging for hardware researchers. This paper describes a novel FPGA-based pattern matching architecture using a recent hashing algorithm called Cuckoo Hashing. The proposed architecture features on-the-fly pattern updates without reconfiguration, more efficient hardware utilization, and higher performance. Through various algorithmic changes to Cuckoo Hashing, we can implement parallel pattern matching on an SRAM-based FPGA. Our system can accommodate the latest rule set of Snort, an open source network intrusion detection/prevention system, and achieves the highest utilization in terms of SRAM per character and logic cells per character, at 17 bits/character and 0.043 logic cells/character, respectively, on major Xilinx Virtex architectures. Ours is much more efficient than any other Xilinx FPGA architecture.
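For readers unfamiliar with Cuckoo Hashing, the following is a minimal software sketch of the textbook two-table scheme, not the FPGA architecture or the algorithmic changes described in the paper. Each key has exactly one candidate slot per table, so a lookup needs at most two probes, and inserting or removing a pattern only touches table entries. The class name `CuckooSet`, the salted BLAKE2b hash functions, the table size, and the sample patterns are all assumptions made for illustration.

```python
import hashlib


class CuckooSet:
    """Two-table cuckoo hash set for byte-string patterns (illustrative only)."""

    def __init__(self, size=256, max_kicks=32):
        self.size = size
        self.max_kicks = max_kicks
        self.tables = [[None] * size, [None] * size]

    def _slot(self, which, key):
        # Hypothetical hash functions: BLAKE2b with a different salt per table.
        digest = hashlib.blake2b(key, digest_size=8, salt=bytes([which])).digest()
        return int.from_bytes(digest, "big") % self.size

    def contains(self, key):
        # Exactly one probe per table; hardware could issue both in parallel.
        return any(self.tables[t][self._slot(t, key)] == key for t in (0, 1))

    def insert(self, key):
        if self.contains(key):
            return
        which = 0
        for _ in range(self.max_kicks):
            idx = self._slot(which, key)
            evicted = self.tables[which][idx]
            self.tables[which][idx] = key
            if evicted is None:
                return
            # The displaced key must move to its slot in the other table.
            key, which = evicted, which ^ 1
        # Too many displacements: `key` is now homeless and a rebuild is needed.
        raise RuntimeError(f"insert failed; rehash needed, displaced key: {key!r}")


# Sample patterns are made up for illustration.
patterns = CuckooSet()
for p in (b"/etc/passwd", b"cmd.exe", b"xp_cmdshell"):
    patterns.insert(p)
```

Because `contains` compares the stored key in full, a lookup never reports a pattern that was not inserted. The bounded probe count is the property that makes this family of hashing attractive for fixed-latency hardware.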

symposium/workshop on electronic design, test and applications | 2010

Massively Parallel Cuckoo Pattern Matching Applied for NIDS/NIPS

Tran Ngoc Thinh; Surin Kittitornkun

This paper describes a Cuckoo-based Pattern Matching (CPM) engine based on a recently developed hashing algorithm called Cuckoo Hashing. We implement an improved parallel Cuckoo Hashing suitable for hardware-based matching of multiple patterns of arbitrary length. CPM can rapidly update the static pattern set without reconfiguration while consuming the lowest amount of hardware. With the power of massively parallel processing, CPM achieves a speedup of up to 128x over a serial Cuckoo implementation. Compared to other hardware systems, CPM is far better in performance and saves 30% of the area.

international conference on electrical engineering electronics computer telecommunications and information technology | 2011

A multithreading methodology with OpenMP on multi-core CPUs: SNPHAP case study

Udom Ranok; Surin Kittitornkun; Sissades Tongsima

This paper presents a multithreading methodology for the OpenMP library. The methodology can be applied to convert existing computationally demanding sequential programs into multithreaded programs with OpenMP running on multi-core CPUs. In our experiments, we apply this methodology to SNPHAP, which is one of the best haplotype inference bioinformatics programs in terms of speed. The results show maximum speedups over the sequential version of 316% on the Intel Xeon E5405 (8-core, 2.0 GHz) and 410% on the Intel Xeon E5520 (8-core with HyperThreading, 2.66 GHz).

asia pacific network operations and management symposium | 2007

FPGA-based cuckoo hashing for pattern matching in NIDS/NIPS

Thinh Ngoc Tran; Surin Kittitornkun

Pattern matching for network intrusion detection/prevention demands exceptionally high throughput with frequent updates to support new attack patterns. This paper describes a novel FPGA-based pattern matching architecture using a recent hashing algorithm called Cuckoo Hashing. The proposed architecture features on-the-fly pattern updates without reconfiguration, more efficient hardware utilization, and higher throughput. Through various algorithmic changes to Cuckoo Hashing, we can implement parallel pattern matching on an SRAM-based FPGA. Our system can accommodate the newest rule set of Snort, an open source network intrusion detection/prevention system, and achieves the highest utilization in terms of SRAM per character and logic cells per character, at 15.63 bits/character and 0.033 logic cells/character, respectively, on major Xilinx Virtex FPGA architectures. Ours is more efficient than any other Xilinx FPGA architecture.

international symposium on communications and information technologies | 2006

Multithreaded ClustalW with Improved Optimization for Intel Multi-core Processor

Kridsadakorn Chaichoompu; Surin Kittitornkun

This paper presents a methodology that assists the compiler in optimizing ClustalW, the most widely used tool for aligning multiple text-based protein or nucleotide sequences in bioinformatics. Our goal is to minimize the latency and maximize the throughput of MT-ClustalW, the multithreaded ClustalW from our previous work. As a result, the optimized MT-ClustalW is able to fully utilize the machine resources and achieves higher throughput on multicore computers. The experimental results show that our methodology helps the compiler optimize the code better than compiler optimization alone and runs over 2 times faster than the sequential ClustalW. Finally, we analyze the overall result with Amdahl's Law.
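Amdahl's Law bounds the speedup of a program in which only a fraction p of the work can be spread over n workers: S(n) = 1 / ((1 - p) + p / n). The sketch below evaluates the law and inverts it to estimate p from a measured speedup. The function names and the example numbers are assumptions for illustration and are not taken from the paper's measurements.

```python
def amdahl_speedup(parallel_fraction, workers):
    """Upper bound on speedup when `parallel_fraction` of the work scales."""
    if not 0.0 <= parallel_fraction <= 1.0 or workers < 1:
        raise ValueError("need 0 <= parallel_fraction <= 1 and workers >= 1")
    return 1.0 / ((1.0 - parallel_fraction) + parallel_fraction / workers)


def parallel_fraction_from_speedup(speedup, workers):
    """Invert Amdahl's Law: the fraction p implied by an observed speedup."""
    if workers < 2 or speedup <= 0:
        raise ValueError("need workers >= 2 and speedup > 0")
    return (1.0 - 1.0 / speedup) / (1.0 - 1.0 / workers)


# Hypothetical example: a 2x speedup on 4 cores implies that about
# two thirds of the run time was parallelizable.
implied_p = parallel_fraction_from_speedup(2.0, 4)
```

Feeding `implied_p` back into `amdahl_speedup` with a larger worker count gives a rough ceiling on what adding cores could achieve under the same assumptions.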

computer science and software engineering | 2012

Optimizing and multithreading SNPHAP on a multi-core APU with OpenCL

Apisit Rattanatranurak; Surin Kittitornkun; Sissades Tongsima

In this paper, we have optimized and multithreaded SNPHAP, a bioinformatics program, with OpenCL to reduce the computation time and thus accelerate the execution. Our method, called the Radix Comparison algorithm, runs in both sequential and parallel (multithreaded) modes. Based on the recent multi-core AMD A6-3650 APU (Accelerated Processing Unit), the achievable speedups of Sequential Radix and Parallel Radix SNPHAP compared with the original SNPHAP are 260% and 271%, respectively.

international conference on electrical engineering/electronics, computer, telecommunications and information technology | 2008

Optimizing RSA encryption for ARM microprocessor

Pitcha Tyoviriyakul; Surin Kittitornkun

Most compiler optimization techniques are concerned primarily with speed. In this paper, we present two high-level memory optimization methods for ARM-based secure applications on mobile phones, pocket PCs, etc. The experiments using RSA encryption on ARM920T with 1024-bit random public keys show that the proposed techniques can complement the existing speed-oriented ones to achieve fewer memory accesses, shorter execution time, and lower memory allocations at all ARM C++ optimization levels, despite the 16-KB instruction and 16-KB data caches of the ARM920T core.

The Journal of Supercomputing | 2003

Processor Array Synthesis from Shift-Variant Deep Nested Do Loops

Surin Kittitornkun; Yu Hen Hu

The consolidation of Internet devices into a universal/portable device will soon be achievable through the incorporation of reconfigurable computing in system-on-a-chip (SOC). At any particular moment, it could be a video/audio mobile phone, an MP3 song player, or another device. The basic construct of these multimedia processing algorithms can be described as deep nested Do loop algorithms. They are considered the most demanding data-intensive algorithms and hence ideal candidates for an array of reconfigurable nanoprocessors. Therefore, an algorithm-to-hardware synthesis methodology is important for an efficient exploitation of both spatial parallelism and temporal pipelining. In this paper, we propose a processor array synthesis methodology. It can map an n-level nested Do loop represented by a nonuniform or shift-variant data dependence graph to a near-optimal one- or two-dimensional processor array under the available resource constraints to satisfy high-throughput computation demands.

international computer science and engineering conference | 2015

Predicting SET50 stock prices using CARIMA (Cross Correlation ARIMA)

Sornpon Wichaidit; Surin Kittitornkun

Investing in stocks is one of the most popular ways to invest money. This paper aims to predict short-term stock prices of the SET50 of the Stock Exchange of Thailand (SET). The proposed method is called CARIMA (Cross Correlation Autoregressive Integrated Moving Average). The basic idea of CARIMA is to find the most highly correlated stock to predict the target one, in addition to the ARIMA-predicted price. The results of the CARIMA model yield better price trends (measured by the 10-day correlation coefficient), while the percentage MAEs (Mean Absolute Errors) are quite similar to those of ARIMA.
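The selection step described above, finding the stock most highly correlated with the target, can be sketched as a lagged Pearson correlation search. This is a minimal illustration under stated assumptions, not the CARIMA method itself: the abstract does not say how lags are chosen or how the correlated stock is combined with the ARIMA forecast, so that part is omitted. The names `pearson` and `best_correlated`, the `max_lag` parameter, and the tickers `LEAD` and `NOISE` with their prices are all hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0  # a constant series carries no correlation information
    return cov / (sx * sy)


def best_correlated(target, candidates, max_lag=3):
    """Return (name, lag, r) for the candidate that best leads the target.

    A lag of k compares the candidate's price on day t - k with the
    target's price on day t, so only past candidate prices are used.
    """
    best = None
    for name, series in candidates.items():
        for lag in range(1, max_lag + 1):
            r = pearson(series[:-lag], target[lag:])
            if best is None or abs(r) > abs(best[2]):
                best = (name, lag, r)
    return best


# Synthetic prices: LEAD shows the target's moves one day early.
target = [10, 11, 13, 12, 15, 16, 14, 17, 19, 18]
candidates = {
    "LEAD": [11, 13, 12, 15, 16, 14, 17, 19, 18, 20],
    "NOISE": [5.0, 5.2, 4.9, 5.1, 5.0, 4.8, 5.3, 5.1, 4.9, 5.0],
}
name, lag, r = best_correlated(target, candidates)
```

On this synthetic data the search picks `LEAD` at a lag of one day, since its shifted series matches the target exactly.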
