Publication


Featured research published by Baifeng Wu.


International Conference on Information Science and Engineering | 2009

High Performance Computing via a GPU

Gang Chen; Guobo Li; Songwen Pei; Baifeng Wu

Graphics processing units (GPUs), such as the AMD FireStream series, offer tremendous computing power, frequently an order of magnitude greater than that of even the most modern multi-core CPUs. Their low cost relative to conventional PC clusters makes them an attractive platform for high performance computing (HPC). General-purpose computing on GPUs (GPGPU) is becoming popular in HPC because of its high peak performance. This paper investigates current technology that enables a GPU to accelerate HPC applications. Matrix multiplication is a representative HPC kernel, and its speed plays an important role in the overall performance of an application. We introduce a new parallel algorithm to accelerate this operation. A comparison of computation speed on the CPU and on the GPU demonstrates that calculations are performed considerably faster on the GPU.


Computer Supported Cooperative Work in Design | 2009

GPGPU supported cooperative acceleration in molecular dynamics

Gang Chen; Guobo Li; Songwen Pei; Baifeng Wu

Molecular dynamics (MD) simulations have become a significant computational approach to studying complicated physical phenomena at the atomic level. Nevertheless, accurate simulations are limited in size and timescale by the available computing resources, which makes them very time-consuming and leads to tremendous computational requirements. Speeding up this process is therefore crucial. In this paper, we present a novel implementation that accelerates molecular dynamics simulations with GPGPU (general-purpose computing on graphics processing units). Our goal is to reduce the total computation time of MD simulations at a very high performance/cost ratio by introducing a GPGPU algorithm. This is motivated by the enhanced programmability, attractive cost/performance ratio, and rapid growth in speed of GPUs. To demonstrate that GPGPUs already provide an inexpensive alternative for scientific applications, we used AMD's Brook+ stream programming environment to implement a new parallel algorithm. Our experimental results show that the novel approach achieves a speedup by a factor of fifteen over the corresponding sequential implementation.


Computer Supported Cooperative Work in Design | 2011

GPU accelerate parallel Odd-Even merge sort: An OpenCL method

Keliang Zhang; Jiajia Li; Gang Chen; Baifeng Wu

Odd-even merge sort is a basic problem in the area of computer supported cooperative work in design. However, it is not efficient on a CPU platform because of its high complexity, O(n lg² n). In this paper, we present a novel implementation based on the OpenCL programming model on a recent GPU (graphics processing unit). Our implementation is based on Knuth's algorithm with some modifications. Due to limitations of OpenCL, we use a flag variable to avoid direct backward control flow. As a result, our implementation achieves an 18× speedup over the CPU C++ STL quick sort. It should also scale almost linearly on next-generation GPUs because each iteration is completely parallel. This high performance makes odd-even merge sort effective in practice. Furthermore, the approach used in this paper for coordinating thousands of processing units to work in parallel can also be applied in other cooperation areas.
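The abstract gives no code, so the following is only a minimal sequential sketch of Batcher's odd-even merge sorting network in Python, not the authors' OpenCL kernel. The function name `odd_even_merge_sort` is hypothetical, and the sketch assumes the input length is a power of two. It illustrates why each iteration is completely parallel: within one `(p, k)` step every compare-exchange touches a disjoint pair of elements, so each pair could be handled by a separate GPU work-item.

```python
def odd_even_merge_sort(values):
    """Sort a list with Batcher's odd-even merge sorting network.

    Assumption: len(values) is a power of two (pad the input otherwise).
    """
    a = list(values)
    n = len(a)
    p = 1
    while p < n:          # p: length of the sorted runs being merged
        k = p
        while k >= 1:     # k: distance between compared elements
            # All compare-exchanges in this (p, k) step are independent,
            # so a GPU could run them concurrently.
            for j in range(k % p, n - k, 2 * k):
                for i in range(min(k, n - j - k)):
                    lo, hi = i + j, i + j + k
                    # Only compare elements inside the same 2p-wide block.
                    if lo // (2 * p) == hi // (2 * p) and a[lo] > a[hi]:
                        a[lo], a[hi] = a[hi], a[lo]
            k //= 2
        p *= 2
    return a
```

The sequence of comparisons depends only on `n`, never on the data, which is what makes the network a good fit for data-parallel hardware.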


Tsinghua Science & Technology | 2007

Novel Software Automated Testing System Based on J2EE

Songwen Pei; Baifeng Wu; Kun Zhu; Qiang Yu

Automated software testing is one of the critical research subjects in the field of computer applications. In this paper, a novel architecture design called the automated testing system (ATS) is proposed. Based on J2EE-related techniques, including the MVC design pattern and the Struts framework, ATS can in theory support any black-box testing task, provided the relevant APIs are programmed beforehand in the Tcl scripting language. Moreover, because the core of ATS is built in Java, it can work in different environments without being recompiled. The efficiency of the new system is validated by many applications in the communication industry, and the results also show the effectiveness and flexibility of the approach.


Computer Science and Information Engineering | 2009

SpMT WaveCache: Exploiting Thread-Level Parallelism in WaveScalar

Songwen Pei; Baifeng Wu; Min Du; Gang Chen; Leandro A. J. Marzulo; Felipe M. G. França

Speculative Multithreading (SpMT) increases performance by executing multiple threads speculatively to exploit thread-level parallelism. By combining software and hardware approaches, we have improved the capabilities of the previous WaveScalar ISA on the basis of a Transactional Memory system for the WaveCache architecture. Threads are extracted during static compilation and executed speculatively as thread-level transactions, supported by extra hardware components such as the Thread-Context-Table (TCT) and Thread-Memory-History (TMH). We have evaluated the SpMT WaveCache with 6 real benchmarks from SPEC, MediaBench and MiBench. On the whole, the SpMT WaveCache outperforms the superscalar architecture by 2X to 3X, and large performance gains are also achieved over the original WaveCache and the Transactional WaveCache.


Computer Supported Cooperative Work in Design | 2006

Towards Model-Driven Methodology: a Novel Testing Approach for Collaborative Embedded System Design

Yi Jiao; Kun Zhu; Qiang Yu; Baifeng Wu

Under the influence of CSCW, model-integrated computing combined with platform-based rapid prototyping has become a promising co-design paradigm in the embedded system domain. Testing remains a key issue and has become a bottleneck. Model-driven testing (MDT) is an ongoing approach that aims to solve this problem. Capturing and analyzing run-time testing results in the prototype system is one of the four key techniques in MDT and has not yet been well studied. This paper describes a novel solution, implemented as an event-driven hardware/software collaborative monitor system, which allows system-level monitoring of the target system and observation at different abstraction levels. It is composed of a special hardware unit, software probes, and dedicated software: the embedded analysis tools (EAT). The software probes collect run-time events in the target system and export them via the hardware unit to a host database for further analysis by EAT. As an auxiliary supporting tool in the design environment, this monitor system can collaborate seamlessly with the other tools in the MDT testing tool chain. A complete MDT architecture is also presented.


Journal of Internet Technology | 2014

Inter-Block Multi-Erasure Coding Scheme for Cloud-Based Big Bulk Data Transmission

Songwen Pei; Gang Chen; Shile Zhang; Baifeng Wu; Naixue Xiong

Big data is one of the hottest topics in the area of information technology. The intersection of cloud computing and big data brings many challenges to both academia and industry. In cloud computing applications based on GPU clusters, high-speed transmission of big bulk data is prone to errors, which lead to incorrect computation and communication. Erasure codes have been widely used to protect big data against errors or erasures by providing redundancy in storage and transmission in cloud systems. However, most existing erasure codes were originally designed for bits or symbols and are based on the theory of Reed-Solomon codes. Because of the complexity of both encoding and decoding, they are not suitable for tolerating the failure of an entire big bulk data block transmission in real time. Therefore, building on our prior research, we propose an effective inter-block multi-erasure coding scheme (IMEC) with relatively simple encoding and decoding procedures, designed for big bulk data transmission in cloud computing applications. IMEC extends traditional parity-check codes by incorporating features of the Galois field GF(2^8). It can tolerate the simultaneous failure (or loss) of 4 entire data blocks with only 4 redundancy data blocks for a group of continuous data, and its encoding and decoding complexities are linear in the size of a data block. Furthermore, the IMEC scheme exhibits the maximum distance separable (MDS) property, providing optimal erasure capability for a given amount of redundancy information.
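The abstract does not specify IMEC's construction, so the sketch below shows only the traditional single-parity baseline that IMEC is described as extending, not IMEC itself. Addition in GF(2^8) is byte-wise XOR, so one XOR parity block lets a receiver rebuild any single lost block at a cost linear in the block size. Tolerating four losses, as IMEC does, would additionally require GF(2^8) multiplication by distinct coefficients, which is omitted here. The helper names `xor_blocks`, `encode`, and `recover` are hypothetical.

```python
def xor_blocks(blocks):
    """Return the byte-wise XOR (GF(2^8) addition) of equally sized blocks."""
    out = bytearray(len(blocks[0]))
    for block in blocks:
        for idx, byte in enumerate(block):
            out[idx] ^= byte
    return bytes(out)


def encode(data_blocks):
    """Compute one parity block for a group of data blocks."""
    return xor_blocks(data_blocks)


def recover(received, parity):
    """Rebuild the single missing block, marked as None in `received`.

    Returns (index_of_missing_block, restored_block).
    """
    missing = received.index(None)
    survivors = [b for b in received if b is not None]
    # XOR of all surviving blocks and the parity cancels everything
    # except the missing block.
    return missing, xor_blocks(survivors + [parity])


group = [b"abcd", b"efgh", b"ijkl"]
parity = encode(group)
index, block = recover([group[0], None, group[2]], parity)
```

Here `block` equals the lost `group[1]`. A second simultaneous loss cannot be repaired with this single-parity code, which is the limitation that multi-erasure schemes address.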


Computer Supported Cooperative Work in Design | 2010

A GPU-based computing framework for CSCW

Gang Chen; Guobo Li; Baifeng Wu; Songwen Pei

Graphics processing units (GPUs) have evolved from fixed graphics pipeline processors into more flexible and powerful data-parallel processors. Their ever-increasing computing power makes them an attractive platform for high performance computing at a low cost. To date, most efforts to exploit GPUs have targeted graphics and scientific applications, and little attention has been paid to harnessing these highly parallel devices to support collaborative work and design. In this paper, we propose a GPU-based framework built on stream computing technology, which aims to provide computation services more efficiently among the collaborators in a CSCW system. The framework consists of two parts: a CPU-based data service that mainly handles management and a GPU-based compute service that primarily focuses on computation.


Computer Supported Cooperative Work in Design | 2005

Towards a systematic conflict resolution policy in multi-agent system: a conceptual framework

Yi Jiao; Baifeng Wu; Kun Zhu; Qiang Yu

Complex modern artifacts are often designed collaboratively by both human and machine agents with different areas of expertise in a multi-agent system (MAS) environment. The interaction of such agents inevitably raises exceptions, which are not well addressed because of their complexity. This paper focuses on conflicts, the primary form that exceptions take among agents, and mainly considers conflict resolution (CR) in a knowledge-based system consisting of both machine-based and human designer agents. Based on previous studies, we propose a generic taxonomy of conflicts, a preliminary integrated conflict management mechanism, and a general CR scheme. A new system architecture is also presented, together with a discussion of a case study.


Computer Supported Cooperative Work in Design | 2004

A concurrent design approach for data flow dominated embedded systems

Baifeng Wu; Chenglian Peng

The increased complexity and short life cycle of today's embedded systems increase the need for a more aggressive methodology capable of designing systems more quickly and easily. In this paper, with the help of a dynamic data flow modeling technique and a component-oriented implementation architecture, we present a concurrent design approach for data flow dominated embedded systems. The approach decomposes the design process of an embedded system into many separate design items while maintaining consistency among them. Generating the implementation code framework for hardware/software components, along with implementation code for different types of data paths, not only accelerates the design process but also helps guarantee the integrity of the target system.
