Banpot Dolwithayakul | Researchain

Publication

Featured research published by Banpot Dolwithayakul.

International Symposium on Intelligent Signal Processing and Communication Systems | 2011

GPU-based total variation image restoration using Sliding Window Gauss-Seidel algorithm

Banpot Dolwithayakul; Chantana Chantrapornchai; Noppadol Chumchob

Image restoration has been deeply investigated over the last two decades. As is well known, total variation (TV) minimization by Rudin, Osher, and Fatemi [6] offers superior image restoration quality and involves solving a second-order nonlinear partial differential equation (PDE). In recent years, some effort has been made to improve computational speed, but solving the associated PDE has remained a bottleneck, preventing its application to high-resolution digital images. In this paper, we improve a novel parallel Gauss-Seidel algorithm on GPU, called QL-SWGS. The algorithm is improved from the original Sliding Window Gauss-Seidel proposed in [1]. As expected, our numerical results on realistic and synthetic images confirm not only that the proposed algorithm on GPU delivers quality results but also that it is substantially faster than those algorithms on multicore CPU, by up to 80% in our benchmark.

Computer Science and Software Engineering | 2012

An efficient asynchronous approach for Gauss-Seidel iterative solver for FDM/FEM equations on multi-core processors

Banpot Dolwithayakul; Chantana Chantrapornchai; Noppadol Chumchob

In this paper, we propose a new parallel asynchronous iterative method for Gauss-Seidel and Successive Over-Relaxation (SOR) solvers applied to the finite difference method (FDM) and the finite element method (FEM). The approach attempts to minimize thread synchronization, which incurs a lot of thread idle time due to the dependency of computation. Our proposed method maximizes thread utilization on multi-core processors at the cost of some extra space for storing current states. We implement our proposed method for Poisson's equation with FDM. We find that our proposed algorithm runs 5.88 times faster than the original Gauss-Seidel and achieves a speedup of up to 1.25 compared with the parallel Sliding Window version.
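For readers unfamiliar with the baseline, the following is a minimal sequential sketch of Gauss-Seidel and SOR for Poisson's equation discretized with the five-point finite difference stencil. It is not the asynchronous method of the paper. The function name `sor_poisson`, the zero Dirichlet boundary, the square grid, and the stopping rule are all assumptions made for illustration.

```python
def sor_poisson(f, h, omega=1.0, tol=1e-8, max_iters=10_000):
    """Solve -laplace(u) = f on a square grid with u = 0 on the boundary.

    f     : list of n rows of n floats (values at boundary nodes are ignored)
    h     : grid spacing
    omega : 1.0 gives plain Gauss-Seidel; 1 < omega < 2 gives SOR
    Returns (u, iterations_used).
    """
    n = len(f)
    u = [[0.0] * n for _ in range(n)]
    for iteration in range(1, max_iters + 1):
        max_change = 0.0
        for i in range(1, n - 1):
            for j in range(1, n - 1):
                # Gauss-Seidel value: uses neighbours already updated in this
                # sweep (i-1, j-1), which is the dependency that makes the
                # method hard to parallelize.
                gs = 0.25 * (u[i - 1][j] + u[i + 1][j]
                             + u[i][j - 1] + u[i][j + 1]
                             + h * h * f[i][j])
                new = (1.0 - omega) * u[i][j] + omega * gs
                max_change = max(max_change, abs(new - u[i][j]))
                u[i][j] = new
        if max_change < tol:
            return u, iteration
    return u, max_iters
```

Each update reads values written earlier in the same sweep, so a naive parallel version must synchronize between dependent points; this is the idle time the abstract refers to.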

International Computer Science and Engineering Conference | 2013

Real-time video denoising for 2D ultrasound streaming video on GPUs

Banpot Dolwithayakul; Chantana Chantrapornchai; Noppadol Chumchob

Ultrasound videos are mainly contaminated by multiplicative noise but also by additive noise. Over the past few decades, several studies have addressed removing noise from ultrasound images, as in the JY model [1] and the variational model, which removes both types of noise. However, denoising ultrasound video is a time-consuming process. The advancement of multi-core and many-core processors makes the denoising process much faster and makes it possible to render while denoising in real time. In this study, we propose a strategy modified from [2] to denoise streaming ultrasound video in real time. Our proposed model retains the frame order and achieves a satisfactory frame rate (about 14.98 fps). The proposed strategy speeds up frame denoising by 3.79 times compared to sequential computation.

Computer Science and Software Engineering | 2012

Real-time parallel spatial video denoising schemes on multi-core processors

Banpot Dolwithayakul; Chantana Chantrapornchai; Noppadol Chumchob

Noise in videos can occur during recording and transmission. Advances in processor technology make real-time video denoising possible on multi-core processors. In this paper, we investigate parallel techniques for denoising real-time video on a multi-core processor. We compare two strategies: a block strategy, which assigns a group of threads to each block of video frames, and a distributor strategy, which uses one thread to distribute the frame data to each worker thread. From our experiments with the total variation based image denoising technique, we found that the distributor strategy achieves a speedup of 1.28 times over the block strategy and increases the video frame rate by 18.44%.
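The distributor idea can be sketched roughly as follows: one thread hands out indexed frames through a queue, workers denoise them, and results are stored by index so the frame order is preserved. This is only an illustration under stated assumptions. The names `denoise_stream` and `smooth` are hypothetical, frames are modelled as flat lists of pixel values, and a 3-point moving average stands in for the total variation denoiser used in the paper.

```python
import queue
import threading


def smooth(frame):
    """Stand-in denoiser (NOT total variation): 3-point moving average."""
    n = len(frame)
    return [
        (frame[max(i - 1, 0)] + frame[i] + frame[min(i + 1, n - 1)]) / 3.0
        for i in range(n)
    ]


def denoise_stream(frames, num_workers=4, denoise=smooth):
    """Denoise frames with one distributor and several worker threads."""
    tasks = queue.Queue()
    results = [None] * len(frames)  # slot per frame keeps the output ordered

    def worker():
        while True:
            item = tasks.get()
            if item is None:  # sentinel: no more frames
                break
            index, frame = item
            results[index] = denoise(frame)

    workers = [threading.Thread(target=worker) for _ in range(num_workers)]
    for thread in workers:
        thread.start()

    # Distributor role (here the calling thread): hand out indexed frames.
    for index, frame in enumerate(frames):
        tasks.put((index, frame))
    for _ in workers:
        tasks.put(None)  # one sentinel per worker

    for thread in workers:
        thread.join()
    return results
```

In CPython the global interpreter lock prevents pure-Python threads from running this arithmetic in parallel, so the sketch shows the control flow of the strategy rather than its speedup.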

International Conference on Multimedia, Computer Graphics, and Broadcasting | 2009

Parallel Mass Transfer Simulation of Nanoparticles Using Nonblocking Communications

Chantana Chantrapornchai; Banpot Dolwithayakul; Sergei Gorlatch

This paper presents experiences and results obtained in optimizing parallelization of the mass transfer simulation in the High Gradient Magnetic Separation (HGMS) of nanoparticles using nonblocking communication techniques in the point-to-point and collective model. We study the dynamics of mass transfer statistically in terms of particle volume concentration and the continuity equation, which is solved numerically by using the finite-difference method to compute concentration distribution in the simulation domain at a given time. In the parallel simulation, total concentration data in the simulation domain are divided row-wise and distributed equally to a group of processes. We propose two parallel algorithms based on the row-wise partitioning: algorithms with nonblocking send/receive and nonblocking scatter/gather using the NBC library. We compare the performance of both versions by measuring their parallel speedup and efficiency. We also investigate the communication overhead in both versions. Our results show that the nonblocking collective communication can improve the performance of the simulation when the number of processes is large.
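The row-wise partitioning described above can be illustrated with a small serial sketch. It splits the rows of a grid as evenly as possible among a number of processes, gives each block one halo row from each neighbour, updates the blocks independently, and gathers the owned rows. The communication itself is only simulated with list slicing; no MPI or NBC calls are made, and the function names and the simple four-neighbour averaging stencil are assumptions for illustration, not the HGMS update from the paper.

```python
def row_ranges(n_rows, n_procs):
    """Split n_rows into n_procs contiguous (start, stop) ranges, near-equal."""
    base, extra = divmod(n_rows, n_procs)
    ranges, start = [], 0
    for rank in range(n_procs):
        stop = start + base + (1 if rank < extra else 0)
        ranges.append((start, stop))
        start = stop
    return ranges


def stencil_step(grid):
    """One averaging step on interior points; first/last rows and columns fixed."""
    rows, cols = len(grid), len(grid[0])
    new = [row[:] for row in grid]
    for i in range(1, rows - 1):
        for j in range(1, cols - 1):
            new[i][j] = 0.25 * (grid[i - 1][j] + grid[i + 1][j]
                                + grid[i][j - 1] + grid[i][j + 1])
    return new


def partitioned_step(grid, n_procs):
    """Same result as stencil_step(grid), computed block by block."""
    n_rows = len(grid)
    gathered = []
    for start, stop in row_ranges(n_rows, n_procs):
        lo = max(start - 1, 0)        # halo row above, if any
        hi = min(stop + 1, n_rows)    # halo row below, if any
        local = [row[:] for row in grid[lo:hi]]      # simulated scatter
        updated = stencil_step(local)
        offset = start - lo
        gathered.extend(updated[offset:offset + (stop - start)])  # gather
    return gathered
```

In a real distributed run, the halo rows would be exchanged between neighbouring processes at every time step, which is where nonblocking communication can overlap with computation.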

Journal of Computer Applications in Technology | 2015

Utilising the pipeline framework and state-based non-linear Gauss-Seidel for large satellite image denoising based on CPU-GPU cores

Banpot Dolwithayakul; Chantana Chantrapornchai; Noppadol Chumchob

Satellite images are usually large and are contaminated with noise during the acquisition process. Typically, the noise comprises both additive and multiplicative components. Denoising such images requires numerical processes that are time-consuming. In this paper, we propose a framework for denoising both multiplicative and additive noise at the same time based on the modern denoising technique in Chumchob et al. 2013. Our framework is able to fully utilise all available computing units, both CPU cores and GPU cores, effectively. We carefully divide the computation into stages, which allows the computing units to work on each data partition in a pipeline fashion, and we tested our framework with different chunk sizes from 256 × 256 to 1024 × 1024. The experiments show that the speedup for the chunk size of 2048 × 2048 can be up to 70.98 times compared with the normal denoising algorithm. Moreover, we also modified the state-based Gauss-Seidel from Dolwithayakul et al. 2012 to make it suitable for GPU. We also change the data structure to avoid the use of pointers and implement the memory hierarchy to reduce the single point of synchronisation and guarantee mutual exclusion on the job table.

Computer Science and Software Engineering | 2014

Parallel simulation of nanoparticles transport in high gradient magnetic field

Kanok Hournkumnuard; Chantana Chantrapornchai; Banpot Dolwithayakul; Prach Chaisiri

The transport of weakly magnetic nanoparticles, dispersed in an irrotational flow of inviscid fluid, in the region of a high gradient magnetic field is simulated by a parallel algorithm based on OpenMP. The high gradient magnetic field here is produced by applying a uniform background magnetic field perpendicular to the axes of parallel ferromagnetic wires. The direction of the incoming fluid flow is perpendicular to both the wire axes and the applied magnetic field. According to the features of the magnetic field and fluid flow, the particle transport is simulated on a two-dimensional domain enclosing a representative wire. The continuity equation describing the transport of particles on the domain is solved numerically as an initial and boundary value problem. The simulation result shows the distribution of particle concentration on the domain at various times. The performance results show a consistent speedup, up to 20-30%, when increasing the number of cores, due to the iteration dependency and the domain size.

The Scientific World Journal | 2014

Parallel Simulation of HGMS of Weakly Magnetic Nanoparticles in Irrotational Flow of Inviscid Fluid

Kanok Hournkumnuard; Banpot Dolwithayakul; Chantana Chantrapornchai

The process of high gradient magnetic separation (HGMS) using a microferromagnetic wire for capturing weakly magnetic nanoparticles in the irrotational flow of inviscid fluid is simulated by using a parallel algorithm based on OpenMP. The two-dimensional problem of particle transport under the influences of magnetic force and fluid flow is considered in an annular domain surrounding the wire, with inner radius equal to that of the wire and outer radius equal to various multiples of the wire radius. The differential equations governing particle transport are solved numerically as an initial and boundary value problem by using the finite-difference method. The concentration distribution of the particles around the wire is investigated and compared with some previously reported results, and shows good agreement with them. The results show the feasibility of accumulating weakly magnetic nanoparticles in specific regions on the wire surface, which is useful for applications in biomedical and environmental work. The speedup of the parallel simulation ranges from 1.8 to 21 depending on the number of threads and the problem domain size as well as the number of iterations. With the nature of computing in the application and current multicore technology, it is observed that 4–8 threads are sufficient to obtain the optimized speedup.

International Joint Conference on Computer Science and Software Engineering | 2013

On the parallel simulation of magnetic targeting of nanoparticles in capillaries

Chantana Chantrapornchai; Banpot Dolwithayakul

We develop a parallel algorithm, using CUDA, for calculating the concentration of multifunctional magnetic nanoparticles in capillaries under the influences of magnetic force, blood flow and diffusion. The task of computing the particle concentration on the considered plane is distributed to computational threads. The continuity equation describing the time rate of change of the multifunctional particle concentration in each small element on the considered plane is solved via the explicit finite difference method. The simulation results show the distributions of particle concentration on the focused plane in the blood vessel, which provide useful visualizations for biomedical researchers. The performance of parallel computing is also examined.
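As a rough illustration of what an explicit finite difference update of a continuity equation looks like, the sketch below advances a one-dimensional concentration profile under constant drift and diffusion. It is a heavily simplified stand-in, not the two-dimensional CUDA model of the paper. The function name, the constant non-negative velocity, the first-order upwind scheme, and the periodic boundaries are all assumptions.

```python
def step_concentration(c, velocity, diffusivity, dx, dt):
    """Advance c by one explicit step of dc/dt = D d2c/dx2 - v dc/dx.

    Assumes velocity >= 0 (upwind difference looks to the left) and periodic
    boundaries. For stability, choose dt so that
    2 * D * dt / dx**2 + v * dt / dx <= 1.
    """
    n = len(c)
    new = [0.0] * n
    for i in range(n):
        left = c[i - 1]            # c[-1] wraps around: periodic boundary
        right = c[(i + 1) % n]
        diffusion = diffusivity * (right - 2.0 * c[i] + left) / (dx * dx)
        advection = -velocity * (c[i] - left) / dx   # first-order upwind
        new[i] = c[i] + dt * (diffusion + advection)
    return new
```

Every new value depends only on values from the previous time step, so each grid point can be assigned to its own thread without synchronization inside a step, which is what makes explicit schemes a natural fit for GPUs.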

International Computer Science and Engineering Conference | 2013

Parallel simulation of magnetic targeting of nano-carriers in capillary using OpenMP and MPI

Kanok Hournkumnuard; Banpot Dolwithayakul; Chantana Chantrapornchai

A parallel algorithm for simulating the concentration distribution of nano-carriers in a capillary is developed by using a combination of OpenMP and MPI. The transport of the carriers under the influences of diffusion, blood flow and magnetic driving force is investigated in two dimensions on a plane that symmetrically slices through the capillary diameter and is parallel to the capillary axis. The continuity equation governing carrier transport is solved numerically as an initial and boundary value problem by using the finite difference method. The computing tasks of updating the carrier concentration at each time step are distributed to a group of nodes and threads in the parallel simulation. The patterns of carrier concentration distribution, which are the simulation results, show the progress of carrier accumulation within the considered region. These data can be visualized and are useful for biomedical researchers. The performance of parallel computing with OpenMP and MPI is also evaluated.
