Sun Ninghui
Chinese Academy of Sciences
Publication
Featured research published by Sun Ninghui.
IEEE International Conference on High Performance Computing Data and Analytics | 2005
Xu Lin; Zhang Peiheng; Bu Dongbo; Feng Shengzhong; Sun Ninghui
Multiple sequence alignment is a fundamental and challenging problem in computational molecular biology. ClustalW, the most widely used multiple sequence alignment software, runs very slowly on hundreds of sequences. Here, we analyze the algorithmic complexity of ClustalW as well as its time profile in practice, and then propose a strategy that uses reconfigurable FPGA hardware to accelerate ClustalW. Comparison with other coarse-grained parallel strategies shows that this strategy achieves a good speedup and saves computing resources.
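As background on why run time grows quickly with the number of sequences: ClustalW starts by aligning every pair of input sequences, so N sequences of length about L need N(N-1)/2 alignments, each filling roughly L × L dynamic-programming cells. The sketch below shows that cost structure only. It is not the paper's FPGA design, and the function names and scoring values (`match`, `mismatch`, `gap`) are illustrative assumptions rather than ClustalW defaults.

```python
from itertools import combinations


def global_alignment_score(a, b, match=1, mismatch=-1, gap=-2):
    """Needleman-Wunsch global alignment score with a linear gap penalty.

    Scoring values are illustrative assumptions, not ClustalW defaults.
    Only two rows of the DP table are kept, so memory is O(len(b)).
    """
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr.append(max(diag, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]


def all_pairs_scores(seqs):
    """Score every unordered pair of sequences: N(N-1)/2 alignments."""
    return {
        (i, j): global_alignment_score(seqs[i], seqs[j])
        for i, j in combinations(range(len(seqs)), 2)
    }
```

Each cell update depends only on its three neighbours, which is the kind of regular, fine-grained computation that is commonly mapped onto hardware.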
Grid and Cooperative Computing | 2008
Song Huaiming; Wang Yang; An Mingyuan; Wang Weiping; Sun Ninghui
Access to hot-spot events has recently received considerable attention in systems for historical analysis of event streams. In such systems, the predicates in SQL (Structured Query Language) requests tend to be similar within a short time window; that is, events queried frequently in the recent past are likely to be queried again in the near future. This paper proposes a prediction model that forecasts query predicates and selects them for speculative execution. We propose an adaptive two-level scoring (TLS) prediction algorithm, which adjusts its parameters according to system resource usage. We introduce two metrics for evaluating query prediction, accuracy rate and efficiency rate, and analyze the system costs in detail. Our experimental results on the DBroker system demonstrate that the TLS algorithm and the local speculative execution method can significantly reduce query response time.
Frontiers of Computer Science in China | 2007
Sun Ninghui; Meng Dan
Dawning4000A is an AMD Opteron-based Linux cluster with 11.2 Tflops peak performance and 8.06 Tflops Linpack performance. It was developed for the Shanghai Supercomputer Center (SSC) as one of the computing power stations of the China National Grid (CNGrid) project. The Massively Cluster Computer (MCC) architecture is proposed to add value to the industry-standard system. Several grid-enabling components were developed to support the running environment of the CNGrid. It demonstrates that a high-performance computer can be built with a low-cost approach.
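The two performance figures above imply a Linpack efficiency (measured performance divided by theoretical peak) of about 72%. A small worked calculation, where the function name is an illustrative assumption:

```python
def linpack_efficiency(linpack_tflops, peak_tflops):
    """Fraction of theoretical peak performance achieved on Linpack."""
    return linpack_tflops / peak_tflops


# Figures quoted in the abstract for Dawning4000A.
efficiency = linpack_efficiency(8.06, 11.2)
print(f"{efficiency:.1%}")  # prints 72.0%
```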
Journal of Computer Science and Technology | 1997
Sun Ninghui; Liu Wenzhuo; Liu Hong; Wang Chuanbao; Lu Xuelin; Zhang Hao
PROOS is a distributed operating system running on the computing nodes of the massively parallel processing computer Dawning-1000. It is an efficient and easily extensible microkernel operating system. It supports the Intel NX message passing interface for communication.
Networking Architecture and Storages | 2009
Yuan Qingbo; Bao Yungang; Chen Mingyu; Sun Ninghui
The rapid development of multi-core technology brings plenty of logical processors to the symmetric multiprocessing (SMP) system. Because all cores share the same system bus and memory bandwidth, the additional computing resources cannot be fully utilized. This is the basic limit on the scalability of such a system. Furthermore, the operating system running on such a system typically provides complex abstractions implemented over shared data structures protected by locks. In this type of kernel, contention grows as the number of cores increases. Through detailed experiments with 5 different types of benchmarks, we identify these problems in the multi-core SMP system. Finally, we analyze the causes of the problems and briefly propose corresponding solutions.
Journal of Computer Science and Technology | 1999
Sun Ninghui
The Scalable I/O (SIO) Initiative’s Low-Level Application Programming Interface (SIO LLAPI) provides file system implementers with a simple low-level interface to support high-level parallel I/O interfaces efficiently and effectively. This paper describes a reference implementation and the evaluation of the SIO LLAPI on the Intel Paragon multicomputer. The implementation provides the file system structure and striping algorithm, compatible with the Parallel File System (PFS) of the Intel Paragon, and runs either inside the kernel or as a user-level library. Scatter-gather addressing read/write, asynchronous I/O, client caching and prefetching mechanisms, a file access hint mechanism, collective I/O and highly efficient file copy have been implemented. Preliminary experience shows that the SIO LLAPI offers opportunities for significant performance improvement and is easy to implement. Some high-level file system interfaces and applications, such as PFS, ADIO and a Hartree-Fock application, are also implemented on top of SIO. The performance of PFS is at least the same as that of Intel’s native PFS, and in many cases, such as small sequential file access, huge I/O requests and collective I/O, it is stable and much better. The SIO features help to support high-level interfaces easily, quickly and more efficiently, and the cache, prefetching and hints are useful for getting better performance under different access models. The scalability and performance of SIO are limited by network latency, scalable network bandwidth, memory copy bandwidth, memory size and the pattern of I/O requests. The tradeoff between generality and efficiency should be considered in implementation.
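The abstract does not spell out the striping algorithm. As a generic illustration of round-robin striping, in which a file is cut into fixed-size stripe units that are placed on I/O nodes in turn, the following sketch maps a file offset to a node and a node-local offset, and splits a contiguous request into per-node pieces. The function names and the `stripe_unit` and `num_nodes` parameters are hypothetical and are not taken from PFS or the SIO LLAPI.

```python
def locate(offset, stripe_unit, num_nodes):
    """Map a file byte offset to (node, node-local offset), round-robin."""
    unit_index = offset // stripe_unit           # which stripe unit of the file
    node = unit_index % num_nodes                # units are dealt out in turn
    local = (unit_index // num_nodes) * stripe_unit + offset % stripe_unit
    return node, local


def split_request(offset, length, stripe_unit, num_nodes):
    """Break a contiguous byte range into (node, local_offset, size) pieces.

    Each piece stays inside one stripe unit, so it can be sent to a single
    node; a large request therefore spreads across several nodes.
    """
    pieces = []
    end = offset + length
    while offset < end:
        node, local = locate(offset, stripe_unit, num_nodes)
        size = min(stripe_unit - offset % stripe_unit, end - offset)
        pieces.append((node, local, size))
        offset += size
    return pieces
```

For example, with a 64-byte stripe unit on 4 nodes, a 100-byte read starting at offset 32 touches three nodes, which suggests why large requests can benefit from aggregate bandwidth.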
Journal of Software | 2006
Tan Guang-Ming; Feng Shengzhong; Sun Ninghui
Archive | 2004
Liu Tao; Zhang Peiheng; Sun Ninghui
Archive | 2014
Liu Xiaoli; Cao Zheng; An Xuejun; Zhang Peiheng; Sun Ninghui; Wang Zhan; Su Yong
Archive | 2013
Sun Ninghui; Cao Zheng; Liu Xiaoli; An Xuejun; Zhang Peiheng