Publication


Featured research published by Tonghai Jiang.


Archives of Virology | 2007

The N-terminal hydrophobic sequence of Autographa californica nucleopolyhedrovirus PIF-3 is essential for oral infection

Xuegang Li; Junqiang Song; Tonghai Jiang; Cuiyi Liang; Xueran Chen

The Autographa californica nucleopolyhedrovirus (AcMNPV) open reading frame 115 has been identified as a per os infection factor (pif-3) and is essential for oral infection. Here, we have characterized the pif-3 of AcMNPV in more detail. The pif-3 transcripts were detected from 12 to 96 h post-infection (hpi) in Sf9 cells infected with AcMNPV. Polyclonal antiserum first recognized a 25-kDa protein at 36 hpi. Western blot analysis indicated that PIF-3 is a component of occlusion-derived virus but not of budded virus. Subcellular localization demonstrated that the 21-amino-acid (aa) N-terminal hydrophobic domain of PIF-3, which is conserved in PIF-1, PIF-2 and PIF-3, acts as a nuclear localization signal and is essential for trafficking the protein to the nucleus. Deletion of either pif-3 or the 21-aa N-terminal hydrophobic domain of pif-3 from AcMNPV abolished per os infectivity but had no effect on the infectivity of the budded virus phenotype.


International Journal of Molecular Sciences | 2017

PCVMZM: Using the Probabilistic Classification Vector Machines Model Combined with a Zernike Moments Descriptor to Predict Protein–Protein Interactions from Protein Sequences

Yan-Bin Wang; Zhu-Hong You; Xiao Li; Xing Chen; Tonghai Jiang; Jingting Zhang

Protein–protein interactions (PPIs) are essential for the processes of most living organisms. Thus, detecting PPIs is extremely important for understanding the molecular mechanisms of biological systems. Although much PPI data has been generated by high-throughput technologies for a variety of organisms, the whole interactome is still far from complete. In addition, the high-throughput technologies for detecting PPIs have some unavoidable defects, including time consumption, high cost, and high error rate. In recent years, with the development of machine learning, computational methods have been broadly used to predict PPIs and can achieve good prediction rates. In this paper, we present PCVMZM, a computational method based on a Probabilistic Classification Vector Machines (PCVM) model and a Zernike moments (ZM) descriptor for predicting PPIs from protein amino acid sequences. Specifically, a ZM descriptor is used to extract protein evolutionary information from the Position-Specific Scoring Matrix (PSSM) generated by the Position-Specific Iterated Basic Local Alignment Search Tool (PSI-BLAST). Then, a PCVM classifier is used to infer the interactions among proteins. When applied to PPI datasets of Yeast and H. pylori, the proposed method achieves average prediction accuracies of 94.48% and 91.25%, respectively. To further evaluate the performance of the proposed method, the state-of-the-art support vector machine (SVM) classifier is used and compared with the PCVM model. Experimental results on the Yeast dataset show that the PCVM classifier performs better than the SVM classifier. The experimental results indicate that our proposed method is robust, powerful and feasible, and can be used as a helpful tool for proteomics research.


Molecular Therapy. Nucleic Acids | 2018

A Deep Learning Framework for Robust and Accurate Prediction of ncRNA-Protein Interactions Using Evolutionary Information

Hai-Cheng Yi; Zhu-Hong You; De-Shuang Huang; Xiao Li; Tonghai Jiang; Li-Ping Li

The interactions between non-coding RNAs (ncRNAs) and proteins play an important role in many biological processes, and the biological functions of ncRNAs are primarily achieved by binding with a variety of proteins. High-throughput biological techniques are used to identify protein molecules bound to specific ncRNAs, but they are usually expensive and time-consuming. Deep learning provides a powerful way to predict RNA-protein interactions computationally. In this work, we propose the RPI-SAN model, which uses a deep-learning stacked auto-encoder network to mine hidden high-level features from RNA and protein sequences and feeds them into a random forest (RF) model to predict ncRNA-binding proteins. Stacked assembling is further used to improve the accuracy of the proposed method. Four benchmark datasets, including RPI2241, RPI488, RPI1807, and NPInter v2.0, were employed for the unbiased evaluation of five established prediction tools: RPI-Pred, IPMiner, RPISeq-RF, lncPro, and RPI-SAN. The experimental results show that our RPI-SAN model achieves much better performance than the other methods, with accuracies of 90.77%, 89.7%, 96.1%, and 99.33%, respectively. It is anticipated that RPI-SAN can be used as an effective computational tool for future biomedical research and can accurately predict potential interacting ncRNA-protein pairs, providing reliable guidance for biological research.


Journal of Sensors | 2018

A Type-Based Blocking Technique for Efficient Entity Resolution over Large-Scale Data

Hui-Juan Zhu; Zheng-Wei Zhu; Tonghai Jiang; Li Cheng; Wei-Lei Shi; Xi Zhou; Fan Zhao; Bo Ma

In data integration, entity resolution (ER) is an important technique for improving data quality. Existing research typically assumes that the target dataset contains only string-type data and uses a single similarity metric. For larger high-dimensional datasets, redundant information needs to be verified using traditional blocking or windowing techniques. In this work, we propose a novel ER method using a hybrid approach that includes type-based multiblocks, varying window size, and more flexible similarity metrics. In our new ER workflow, we reduce the search space for entity pairs by the constraint of redundant attributes and matching likelihood. We develop a reference implementation of our proposed approach and validate its performance using a real-life dataset from an Internet of Things project. We evaluate the data processing system using five standard metrics: effectiveness, efficiency, accuracy, recall, and precision. Experimental results indicate that the proposed approach could be a promising alternative for entity resolution and could be feasibly applied in real-world data cleaning for large datasets.
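The general idea of combining blocking, a sliding window, and type-aware similarity can be sketched roughly as follows. This is only an illustration under stated assumptions, not the paper's implementation: the record fields, the choice of `city` as the blocking key, the two similarity functions, the window size, and the 0.8 threshold are all hypothetical.

```python
from difflib import SequenceMatcher
from itertools import groupby


def string_sim(a, b):
    """Similarity for string-typed attributes, in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def numeric_sim(a, b):
    """Similarity for numeric attributes: 1 when equal, lower as the relative gap grows."""
    if a == b:
        return 1.0
    return max(0.0, 1.0 - abs(a - b) / max(abs(a), abs(b)))


def record_sim(r1, r2, fields):
    """Average per-attribute similarity, choosing the metric by value type (assumed scheme)."""
    scores = []
    for f in fields:
        v1, v2 = r1[f], r2[f]
        if isinstance(v1, (int, float)) and isinstance(v2, (int, float)):
            scores.append(numeric_sim(v1, v2))
        else:
            scores.append(string_sim(str(v1), str(v2)))
    return sum(scores) / len(scores)


def candidate_pairs(records, block_key, sort_key, window=3):
    """Yield pairs that share a block and lie within `window` positions after sorting."""
    ordered = sorted(records, key=lambda r: (block_key(r), sort_key(r)))
    for _, group in groupby(ordered, key=block_key):
        block = list(group)
        for i, left in enumerate(block):
            for right in block[i + 1:i + window]:
                yield left, right


def resolve(records, fields, block_key, sort_key, window=3, threshold=0.8):
    """Return id pairs whose similarity reaches the (hypothetical) threshold."""
    return [
        (a["id"], b["id"])
        for a, b in candidate_pairs(records, block_key, sort_key, window)
        if record_sim(a, b, fields) >= threshold
    ]


# Toy records, invented for illustration only.
records = [
    {"id": 1, "name": "Acme Fuel Station", "city": "Urumqi", "pumps": 8},
    {"id": 2, "name": "ACME Fuel Staton", "city": "Urumqi", "pumps": 8},
    {"id": 3, "name": "Tianshan Garage", "city": "Urumqi", "pumps": 2},
    {"id": 4, "name": "Acme Fuel Station", "city": "Kashgar", "pumps": 8},
]

matches = resolve(
    records,
    fields=["name", "pumps"],
    block_key=lambda r: r["city"],
    sort_key=lambda r: r["name"].lower(),
)
print(matches)  # [(1, 2)]
```

Records 1 and 4 are never compared because they fall into different blocks, which is how blocking shrinks the search space at the cost of possibly missing cross-block duplicates.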


Recent Advances in Natural Language Processing | 2017

Log-linear Models for Uyghur Segmentation in Spoken Language Translation

Chenggang Mi; Yating Yang; Rui Dong; Xi Zhou; Lei Wang; Xiao Li; Tonghai Jiang

To alleviate data sparsity in spoken Uyghur machine translation, we propose a log-linear morphological segmentation approach. Instead of learning a model only from a monolingual annotated corpus, this approach optimizes Uyghur segmentation for spoken translation based on both bilingual and monolingual corpora. Our approach relies on several features, such as a traditional conditional random field (CRF) feature, a bilingual word alignment feature and a monolingual suffix-word co-occurrence feature. Experimental results show that our proposed segmentation model for Uyghur spoken translation achieves an improvement of 1.6 BLEU points over the state-of-the-art baseline.
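A log-linear model of this kind scores each candidate segmentation as a weighted sum of feature values and normalizes with a softmax. The sketch below shows only that generic scoring step; the candidate strings, the three feature tables, and the weights are invented toy values, not data or parameters from the paper.

```python
import math


def log_linear_probs(candidates, feature_fns, weights):
    """Return P(c) proportional to exp(sum_k weight_k * h_k(c)) for each candidate."""
    scores = [
        sum(w * h(c) for w, h in zip(weights, feature_fns))
        for c in candidates
    ]
    top = max(scores)  # subtract the max before exponentiating, for numerical stability
    exps = [math.exp(s - top) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]


# Hypothetical candidate segmentations of one word ("+" marks a morpheme boundary).
candidates = ["stem+suf1+suf2", "stem+suf1suf2", "stemsuf1suf2"]

# Toy feature values standing in for the three feature types named in the abstract.
crf_logprob = {"stem+suf1+suf2": -0.4, "stem+suf1suf2": -1.2, "stemsuf1suf2": -2.5}
align_score = {"stem+suf1+suf2": 0.9, "stem+suf1suf2": 0.6, "stemsuf1suf2": 0.2}
cooc_score = {"stem+suf1+suf2": 0.7, "stem+suf1suf2": 0.5, "stemsuf1suf2": 0.1}

feature_fns = [crf_logprob.get, align_score.get, cooc_score.get]
weights = [1.0, 0.5, 0.5]  # assumed weights; in practice these would be tuned

probs = log_linear_probs(candidates, feature_fns, weights)
best = candidates[probs.index(max(probs))]
print(best)  # stem+suf1+suf2
```

In a real system the weights would typically be tuned against translation quality on held-out data rather than set by hand.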


Neural Computing and Applications | 2017

HEMD: a highly efficient random forest-based malware detection framework for Android

Hui-Juan Zhu; Tonghai Jiang; Bo Ma; Zhu-Hong You; Wei-Lei Shi; Li Cheng

Mobile phones are rapidly becoming the most widespread and popular form of communication; thus, they are also the most important attack target of malware. The amount of malware on mobile phones is increasing exponentially and poses a serious security threat. Google’s Android is the most popular smartphone platform in the world, and its permission-declaration access control mechanisms cannot identify malware. In this paper, we propose an ensemble machine learning system for the detection of malware on Android devices. More specifically, four groups of features, including permissions, monitoring system events, sensitive API and permission rate, are extracted to characterize each Android application (app). Then an ensemble random forest classifier is trained to detect whether an app is potentially malicious. The performance of our proposed method is evaluated on an actual data set using tenfold cross-validation. The experimental results demonstrate that the proposed method can achieve a high accuracy of 89.91%. To further assess the performance of our method, we compared it with the state-of-the-art support vector machine classifier. Comparison results demonstrate that the proposed method is extremely promising and could provide a cost-effective alternative for Android malware detection.


Mathematical Problems in Engineering | 2017

Filtering Reordering Table Using a Novel Recursive Autoencoder Model for Statistical Machine Translation

Jinying Kong; Yating Yang; Lei Wang; Xi Zhou; Tonghai Jiang; Xiao Li

In phrase-based machine translation (PBMT) systems, the reordering table and phrase table are very large and redundant. Unlike most previous work, which aims to filter the phrase table, this paper proposes a novel deep neural network model to prune the reordering table. We cast the task as a deep learning problem in which we jointly train two models: a generative model to implement rule embedding and a discriminative model to classify rules. The main contribution of this paper is that we optimize the reordering model in PBMT by filtering the reordering table using a recursive autoencoder model. To evaluate the performance of the proposed model, we ran it on a public corpus to measure its reordering ability. The experimental results show that our approach obtains a large improvement in BLEU score with a smaller reordering table on two language pairs: English-Chinese (


IEEE Access | 2017

A Novel Data Integration Framework Based on Unified Concept Model

Bo Ma; Tonghai Jiang; Xi Zhou; Fan Zhao; Yating Yang

Data is now being generated, collected, and analyzed at an unprecedented scale. Data integration is the problem of combining data from heterogeneous, autonomous data sources and providing users with a unified view of the integrated data. To design a data integration framework, we need to address challenges such as schema mapping, data cleaning, record linkage, and data fusion. In this paper, we briefly introduce traditional data integration approaches and then propose a novel graph-based data integration framework based on a unified concept model (UCM) to address real-world refueling data integration problems. Within this framework, schema mapping is carried out and metadata from heterogeneous sources is integrated in a UCM. The UCM is easy to update, and it is also important for effective schema mapping and data transformation. By following the structure of the UCM, data from different sources is automatically transformed into instance data and linked together using semantic similarity metrics; finally, the data is stored in a graph database. Experiments are carried out on heterogeneous data from refueling records, social networks of astroturfers, and vehicle trajectories. Experimental results and reference implementation demonstrations show good precision and recall for the proposed framework.


Applications of Natural Language to Data Bases | 2015

Optimized Uyghur Segmentation for Statistical Machine Translation

Chenggang Mi; Yating Yang; Rui Dong; Xi Zhou; Lei Wang; Xiao Li; Tonghai Jiang; Turghun Osman

In this paper, we propose an optimized method for segmenting Uyghur words. We treat the optimization as a classification problem; the features are extracted from a Uyghur-Chinese bilingual corpus. Experimental results show that our method significantly improves the performance of Uyghur-Chinese machine translation.


Molecular BioSystems | 2017

Predicting protein–protein interactions from protein sequences by a stacked sparse autoencoder deep neural network

Yan-Bin Wang; Zhu-Hong You; Xiao Li; Tonghai Jiang; Xing Chen; Xi Zhou; Lei Wang

Collaboration


Dive into Tonghai Jiang's collaborations.

Top Co-Authors

Xi Zhou
Chinese Academy of Sciences

Lei Wang
Chinese Academy of Sciences

Yating Yang
Chinese Academy of Sciences

Xiao Li
Chinese Academy of Sciences

Chenggang Mi
Chinese Academy of Sciences

Bo Ma
Chinese Academy of Sciences

Zhu-Hong You
Chinese Academy of Sciences

Fan Zhao
Chinese Academy of Sciences

Li Cheng
Chinese Academy of Sciences

Hui-Juan Zhu
Chinese Academy of Sciences