Dongbo Dai | Researchain

Publication

Featured research published by Dongbo Dai.

PLOS ONE | 2014

Parallel Clustering Algorithm for Large-Scale Biological Data Sets

Minchao Wang; Wu Zhang; Wang Ding; Dongbo Dai; Huiran Zhang; Hao Xie; Luonan Chen; Yike Guo; Jiang Xie

Background: The recent explosion of biological data poses a great challenge for traditional clustering algorithms. As data sets grow, cluster identification requires much more memory and longer runtimes. The affinity propagation algorithm outperforms many classical clustering algorithms and is widely applied in biological research. However, its time and space complexity become a major bottleneck on large-scale data sets. Moreover, because the algorithm clusters data based on the similarities between data pairs, a similarity matrix must be constructed before it runs, and that construction itself takes a long time. Methods: Two types of parallel architecture are proposed in this paper to accelerate similarity matrix construction and the affinity propagation algorithm. A shared-memory architecture is used to construct the similarity matrix, and a distributed system is used for the affinity propagation algorithm because of its large memory size and great computing capacity. A suitable scheme for data partitioning and reduction is designed to minimize the global communication cost among processes. Results: A speedup of 100 is gained with 128 cores. The runtime is reduced from several hours to a few seconds, which indicates that the parallel algorithm can handle large-scale data sets effectively. The parallel affinity propagation also performs well when clustering large-scale gene (microarray) data and detecting families in large protein superfamilies.
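The row-wise data partition mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the similarity measure (negative squared Euclidean distance, a common choice for affinity propagation), the helper names `row_blocks` and `similarity_matrix`, and the use of a thread pool are all assumptions made for illustration. In CPython, threads will not speed up this CPU-bound loop; the sketch only shows how rows can be split into contiguous blocks and computed independently.

```python
from concurrent.futures import ThreadPoolExecutor


def neg_sq_dist(a, b):
    # Assumed similarity: negative squared Euclidean distance.
    return -sum((x - y) ** 2 for x, y in zip(a, b))


def row_blocks(n, workers):
    # Split row indices 0..n-1 into at most `workers` contiguous, non-empty blocks.
    base, extra = divmod(n, workers)
    blocks, start = [], 0
    for w in range(workers):
        stop = start + base + (1 if w < extra else 0)
        if stop > start:
            blocks.append((start, stop))
        start = stop
    return blocks


def similarity_rows(points, start, stop):
    # Each worker computes the full rows of its own block; no communication is needed.
    return [[neg_sq_dist(points[i], p) for p in points] for i in range(start, stop)]


def similarity_matrix(points, workers=4):
    blocks = row_blocks(len(points), workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        parts = pool.map(lambda b: similarity_rows(points, *b), blocks)
        # pool.map preserves block order, so rows come back in index order.
        return [row for part in parts for row in part]


if __name__ == "__main__":
    pts = [(0, 0), (3, 4), (6, 8)]
    for row in similarity_matrix(pts, workers=2):
        print(row)
```

Because each block of rows depends only on the read-only input points, the blocks can be computed without synchronization, which is why this step suits a shared-memory setting.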

PLOS ONE | 2013

CNNcon: Improved Protein Contact Maps Prediction Using Cascaded Neural Networks

Wang Ding; Jiang Xie; Dongbo Dai; Huiran Zhang; Hao Xie; Wu Zhang

Background: Despite continuing progress in X-ray crystallography and high-field NMR spectroscopy for determining three-dimensional protein structures, the number of unsolved and newly discovered sequences grows much faster than the number of determined structures. With the development of computational science, protein modeling methods may bridge this huge sequence-structure gap. Predicting three-dimensional protein structure from the primary structure (residue sequence) alone remains a grand challenge. Predicting residue contact maps, however, is a crucial and promising intermediate step towards final three-dimensional structure prediction. Better predictions of local and non-local contacts between residues can transform protein sequence alignment into structure alignment, which can in turn greatly improve template-based three-dimensional protein structure predictors. Methods: CNNcon, an improved contact map predictor based on multiple neural networks, with six sub-networks and one final cascade network, was developed in this paper. Both the sub-networks and the final cascade network were trained and tested with their corresponding data sets. For testing, the target protein was first encoded and then input to its corresponding sub-networks for prediction. The intermediate results were then input to the cascade network to produce the final prediction. Results: CNNcon accurately predicts 58.86% of contacts on average at a distance cutoff of 8 Å for proteins with lengths ranging from 51 to 450. The comparison results show that the present method performs better than the compared state-of-the-art predictors. In particular, the prediction accuracy remains steady as protein sequence length increases, which indicates that CNNcon overcomes the thin density problem that troubles other current predictors. This advantage makes the method valuable for predicting long proteins.
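To make the prediction target concrete, the following minimal sketch derives a binary contact map from residue coordinates using the 8 Å cutoff quoted above. It only illustrates what a contact map is and says nothing about the CNNcon networks themselves. The abstract does not state which atom represents each residue or whether near-neighbours along the chain are excluded, so the one-point-per-residue input and the `min_separation` parameter are assumptions, and `contact_map` is a hypothetical helper name.

```python
import math


def contact_map(coords, cutoff=8.0, min_separation=1):
    """Return a symmetric 0/1 matrix; 1 marks residue pairs closer than `cutoff` (in Å).

    coords: one (x, y, z) point per residue (assumed, e.g. a C-alpha position).
    min_separation: minimum sequence distance |i - j| for a pair to be considered.
    """
    n = len(coords)
    cmap = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + min_separation, n):
            if math.dist(coords[i], coords[j]) < cutoff:
                cmap[i][j] = cmap[j][i] = 1
    return cmap


if __name__ == "__main__":
    # Four residues spaced 3.8 Å apart along a straight line (toy data).
    chain = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
    for row in contact_map(chain):
        print(row)
```

In the toy chain, residues one or two positions apart fall within the cutoff, while the two end residues (11.4 Å apart) do not.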

statistical and scientific database management | 2012

Efficient range queries over uncertain strings

Dongbo Dai; Jiang Xie; Huiran Zhang; Jiaqi Dong

Edit-distance-based string range queries are used extensively in data integration, keyword search, biological function prediction, and many other applications. In the presence of uncertainty, however, answering range queries is more challenging than in deterministic scenarios, since there are exponentially many possible worlds to consider. This work extends existing filtering techniques tailored for deterministic strings to uncertain settings. We first design a probabilistic q-gram filtering method that works both efficiently and effectively. Another filtering technique, frequency-distance-based filtering, is also adapted to work with uncertain strings. To achieve further speed-up, we combine two state-of-the-art approaches, based on cumulative distribution functions and local perturbation, to improve the lower and upper bounds. Comprehensive experimental results show that, in uncertain settings, our filter-based scheme is more efficient than existing methods that leverage only cumulative distribution functions or local perturbation.
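For orientation, here is a minimal sketch of the classical q-gram count filter for deterministic strings, the kind of filter the abstract says is extended to uncertain strings. It does not implement the paper's probabilistic variant. It relies on the well-known bound that two strings within edit distance k share at least max(|s|, |t|) - q + 1 - k·q q-grams (counted as multisets); the function names and the default q = 2 are assumptions chosen for illustration.

```python
from collections import Counter


def qgrams(s, q):
    # Multiset of all length-q substrings of s.
    return Counter(s[i:i + q] for i in range(len(s) - q + 1))


def passes_count_filter(s, t, k, q=2):
    # Strings within edit distance k must share at least this many q-grams.
    needed = max(len(s), len(t)) - q + 1 - k * q
    if needed <= 0:
        return True  # the bound is vacuous, so the filter cannot prune
    shared = sum((qgrams(s, q) & qgrams(t, q)).values())
    return shared >= needed


def edit_distance(s, t):
    # Standard dynamic program, keeping only one previous row.
    prev = list(range(len(t) + 1))
    for i, cs in enumerate(s, 1):
        cur = [i]
        for j, ct in enumerate(t, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (cs != ct)))
        prev = cur
    return prev[-1]


def range_query(query, strings, k, q=2):
    # Cheap filter first; the exact edit distance is computed only for survivors.
    return [s for s in strings
            if passes_count_filter(query, s, k, q) and edit_distance(query, s) <= k]


if __name__ == "__main__":
    print(range_query("kitten", ["sitten", "sitting", "mitten", "banana"], k=1))
```

The filter never discards a true answer; it only skips the quadratic edit-distance computation for candidates that provably cannot qualify.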

annual acis international conference on computer and information science | 2012

PRT-HMM: A Novel Hidden Markov Model for Protein Secondary Structure Prediction

Wang Ding; Dongbo Dai; Jiang Xie; Huiran Zhang; Wu Zhang; Hao Xie

Protein secondary structure prediction is one of the most important and challenging problems in structural bioinformatics and an essential task in determining the structure and function of proteins. Despite significant progress in recent years, protein structure prediction remains one of the prime unsolved problems in computational biology. This paper presents a novel hidden Markov model based on a probability revise table (PRT-HMM), which takes the dependencies among state transitions into account. The initially predicted protein structure is revised by looking up the probability revise table, which is learned from the dataset. Theoretical analysis and experimental results indicate that the proposed method is reasonable and that the accuracy of protein secondary structure prediction increases compared with the original hidden Markov model (HMM).

Physica Scripta | 2014

Nonuniformity mitigation of beam illumination in heavy ion inertial fusion

Shigeo Kawata; K. Noguchi; T. Suzuki; T. Kurosaki; Daisuke Barada; A.I. Ogoyski; Wu Zhang; Jiang Xie; Huiran Zhang; Dongbo Dai

In inertial fusion, a target DT fuel should be compressed to typically 1000 times the solid density. The target implosion nonuniformity is introduced by a driver beam's illumination nonuniformity, for example. The target implosion should be robust against the implosion nonuniformities. In this paper, the requirement for implosion uniformity is first discussed. The implosion non-uniformity should be less than a few percent. The implosion dynamics is also briefly reviewed in heavy ion inertial fusion (HIF). Heavy ions deposit their energy inside the target energy absorber, and the energy deposition layer is rather thick, depending on the ion particle energy. Then nonuniformity mitigation mechanisms of the heavy ion beam (HIB) illumination in HIF are discussed. A density valley appears in the energy absorber, and the large-scale density valley also works as a radiation energy confinement layer, which contributes to a radiation energy smoothing. In HIF, wobbling heavy ion beam illumination was also introduced to realize a uniform implosion. The wobbling HIB axis oscillation is precisely controlled. In the wobbling HIBs' illumination, the illumination nonuniformity oscillates in time and space on an HIF target. The oscillating-HIB energy deposition may contribute to the reduction of the HIBs' illumination nonuniformity by its smoothing effect on the HIB illumination nonuniformity and also by a growth mitigation effect on the Rayleigh–Taylor instability.

international conference of the ieee engineering in medicine and biology society | 2012

A network clustering algorithm for detection of protein families

Jiang Xie; Minchao Wang; Dongbo Dai; Huiran Zhang; Wu Zhang

Detecting protein families in large-scale databases is a difficult but important biological problem, and computational clustering methods can address it effectively. Although many clustering algorithms exist, most of them are based solely on a threshold. Their computational performance is greatly affected by the weight distribution, and they are valid only for certain special networks. This paper proposes a new network clustering algorithm, Markov Finding and Clustering (MFC), to cluster proteins accurately into their functionally specific families. The MFC algorithm improves the random walk process and reduces the effect of noise on the clustering result. It performs well on networks that are not handled well by existing, noise-sensitive algorithms. Finally, experiments on protein sequence datasets demonstrate that the algorithm is effective in detecting protein families and outperforms current algorithms.

High Power Laser Science and Engineering | 2014

Controllability of intense-laser ion acceleration

Shigeo Kawata; Toshihiro Nagashima; Masahiro Takano; T. Izumiyama; Daiki Kamiyama; Daisuke Barada; Q. Kong; Y. J. Gu; Ping Xiao Wang; Y. Y. Ma; Wei Ming Wang; Wu Zhang; Jiang Xie; Huiran Zhang; Dongbo Dai

international conference on information science electronics and electrical engineering | 2014