Dianwei Han | Researchain

Publication

Featured research published by Dianwei Han.

Knowledge and Information Systems | 2006

Singular value decomposition based data distortion strategy for privacy protection

Shuting Xu; Jun Zhang; Dianwei Han; Jie Wang

Privacy preservation is a major concern in the application of data mining techniques to datasets containing personal, sensitive, or confidential information. Data distortion is a critical component of preserving privacy in security-related data mining applications, such as data mining-based terrorist analysis systems. We propose a sparsified Singular Value Decomposition (SVD) method for data distortion. We also put forth a few metrics to measure the difference between the distorted dataset and the original dataset, and the degree of privacy protection. Our experimental results on synthetic and real-world datasets show that the sparsified SVD method preserves privacy well while maintaining the utility of the datasets.
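
The abstract does not spell out how the SVD is sparsified. As a minimal sketch, assuming that "sparsified" means truncating to a chosen rank and then zeroing small entries of the singular vector matrices, the distortion step might look like the following. The function names, the `rank` and `threshold` parameters, and the relative-difference metric are illustrative assumptions, not the authors' definitions.

```python
import numpy as np

def sparsified_svd_distort(data, rank, threshold):
    """Return a distorted copy of `data` (assumed reading of sparsified SVD)."""
    u, s, vt = np.linalg.svd(data, full_matrices=False)
    # Keep only the leading `rank` singular triplets.
    u_k, s_k, vt_k = u[:, :rank].copy(), s[:rank], vt[:rank, :].copy()
    # Sparsify: drop entries of the singular vectors with small magnitude.
    u_k[np.abs(u_k) < threshold] = 0.0
    vt_k[np.abs(vt_k) < threshold] = 0.0
    return u_k @ np.diag(s_k) @ vt_k

def relative_difference(original, distorted):
    """Frobenius-norm distance between the datasets, relative to the original."""
    return np.linalg.norm(original - distorted) / np.linalg.norm(original)

rng = np.random.default_rng(0)
original = rng.random((6, 4))  # toy data, not from the paper
distorted = sparsified_svd_distort(original, rank=2, threshold=0.05)
print(relative_difference(original, distorted))
```

A larger `threshold` or a smaller `rank` distorts the data more, which is the trade-off between privacy and utility that the abstract refers to.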

Intelligence and Security Informatics | 2005

Data distortion for privacy protection in a terrorist analysis system

Shuting Xu; Jun Zhang; Dianwei Han; Jie Wang

Data distortion is a critical component of preserving privacy in security-related data mining applications, such as data mining-based terrorist analysis systems. We propose a sparsified Singular Value Decomposition (SVD) method for data distortion. We also put forth a few metrics to measure the difference between the distorted dataset and the original dataset. Our experimental results on synthetic and real-world datasets show that the sparsified SVD method preserves privacy well while maintaining the utility of the datasets.

International Conference on Data Mining | 2007

Simultaneous Pattern and Data Hiding in Unsupervised Learning

Jie Wang; Jun Zhang; Lian Liu; Dianwei Han

How to control the level of knowledge disclosure and secure certain confidential patterns is a subtask comparable to confidential data hiding in privacy preserving data mining. We propose a technique to simultaneously hide data values and confidential patterns without undesirable side effects on distorting nonconfidential patterns. We use nonnegative matrix factorization technique to distort the original dataset and preserve its overall characteristics. A factor swapping method is designed to hide particular confidential patterns for k-means clustering. The effectiveness of this novel hiding technique is examined on a benchmark dataset. Experimental results indicate that our technique can produce a single modified dataset to achieve both pattern and data value hiding. Under certain constraints on the nonnegative matrix factorization iterations, an optimal solution can be computed in which the user-specified confidential memberships or relationships are hidden without undesirable alterations on nonconfidential patterns.
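
The abstract does not say which NMF update rule or iteration constraint is used. As a minimal sketch, assuming the standard Lee-Seung multiplicative updates for the Frobenius objective, the data-distortion half of the idea could be illustrated as below. The factor-swapping step for hiding confidential cluster memberships is not shown, and all names and parameters are illustrative.

```python
import numpy as np

def nmf(v, rank, iterations=200, seed=0, eps=1e-9):
    """Approximate a nonnegative matrix v as w @ h with nonnegative factors."""
    rng = np.random.default_rng(seed)
    n, m = v.shape
    w = rng.random((n, rank)) + eps
    h = rng.random((rank, m)) + eps
    for _ in range(iterations):
        # Multiplicative updates keep every entry nonnegative.
        h *= (w.T @ v) / (w.T @ w @ h + eps)
        w *= (v @ h.T) / (w @ h @ h.T + eps)
    return w, h

rng = np.random.default_rng(1)
v = rng.random((8, 5))   # toy nonnegative dataset, not from the paper
w, h = nmf(v, rank=3)
distorted = w @ h        # low-rank reconstruction serves as the distorted data
print(np.linalg.norm(v - distorted))
```

The reconstruction `w @ h` differs from `v` entry by entry while retaining its dominant low-rank structure, which is the sense in which such a factorization can distort values yet preserve overall characteristics.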

ACM Southeast Regional Conference | 2009

A novel method for MicroRNA secondary structure prediction using a bottom-up algorithm

Dianwei Han; Guiliang Tang; Jun Zhang

MicroRNAs (miRNAs) are newly discovered endogenous small non-coding RNAs (21-25 nt) that are thought to regulate the expression of target genes by direct interaction with mRNAs. MicroRNAs have been identified through both experimental and computational methods, and microRNA secondary structure prediction is essential. Generally, there are two classes of methods to predict the secondary structure of RNAs. Thermodynamics-based methods have been the dominant strategy for single-stranded RNA secondary structure prediction for many years. Recently, probabilistic-based methods have emerged to replace the free energy minimization methods for modeling RNA structures. However, the accuracies of the best currently available probabilistic-based models have yet to match those of the best thermodynamics-based methods. This motivates us to develop a new prediction algorithm that focuses on microRNA structure prediction with high accuracy. A new model, nucleotide cyclic motifs (NCM), was recently proposed by Major et al. to predict RNA secondary structure. We propose and implement a novel model based on a Modified NCM (MNCM) model with a physics-based scoring strategy to tackle the problem of microRNA folding. By making use of a global optimal algorithm based on the bottom-up local optimal solutions, we implement MicroRNAfold. Our experimental results show that MicroRNAfold outperforms the current leading prediction tools.
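
The MNCM model and its scoring are not described in enough detail here to reproduce. To illustrate only the general idea of building a globally optimal structure bottom-up from optimal solutions on shorter subsequences, the sketch below uses the classic Nussinov base-pair maximization, which is a different and much simpler technique than MicroRNAfold. The `min_loop` parameter and the allowed pair set are assumptions made for the example.

```python
# Watson-Crick pairs plus the G-U wobble pair.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}

def nussinov_pairs(seq, min_loop=3):
    """Maximum number of nested base pairs, filled bottom-up by span length."""
    n = len(seq)
    if n == 0:
        return 0
    best = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            score = best[i][j - 1]                # case: base j stays unpaired
            for k in range(i, j - min_loop):      # case: base j pairs with base k
                if (seq[k], seq[j]) in PAIRS:
                    left = best[i][k - 1] if k > i else 0
                    score = max(score, left + 1 + best[k + 1][j - 1])
            best[i][j] = score
    return best[0][n - 1]

print(nussinov_pairs("GGGAAAACCC"))
```

Each table entry `best[i][j]` depends only on entries for shorter spans, so filling the table in order of increasing span guarantees that every local optimum is available when the larger problem needs it.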

International Conference on Machine Learning and Applications | 2007

A comparison of two algorithms for predicting the condition number

Dianwei Han; Jun Zhang

We present experimental results of comparing the modified K-nearest neighbor (MkNN) algorithm with support vector machine (SVM) in the prediction of condition numbers of sparse matrices. The condition number of a matrix is an important measure in numerical analysis and linear algebra. However, the direct computation of the condition number of a matrix is very expensive in terms of CPU and memory cost, and becomes prohibitive for large matrices. We use data mining techniques to estimate the condition number of a given sparse matrix. In our previous work, we used support vector machine (SVM) to predict the condition numbers. While SVM is considered a state-of-the-art classification/regression algorithm, kNN is usually used for collaborative filtering tasks. Since prediction can also be interpreted as a classification/regression task, virtually any supervised learning algorithm (such as kNN) can also be applied. Experiments are performed on a publicly available dataset. We conclude that modified kNN (MkNN) performs much better than SVM on this particular dataset.
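
The abstract does not describe the modification that turns kNN into MkNN. As a rough sketch of the underlying idea only, the following plain distance-weighted kNN regressor predicts a target value (for instance, a log condition number) from a feature vector. The feature values and targets are made-up toy numbers, and the weighting scheme is an assumption rather than the authors' method.

```python
import math

def knn_predict(train_x, train_y, query, k=3):
    """Distance-weighted average of the targets of the k nearest neighbors."""
    by_distance = sorted((math.dist(x, query), y) for x, y in zip(train_x, train_y))
    nearest = by_distance[:k]
    # Closer neighbors get larger weights; the tiny constant avoids division by zero.
    weights = [1.0 / (d + 1e-12) for d, _ in nearest]
    return sum(w * y for w, (_, y) in zip(weights, nearest)) / sum(weights)

# Hypothetical matrix features paired with hypothetical log10 condition numbers.
train_x = [(0.10, 5.0), (0.20, 4.0), (0.80, 1.0), (0.90, 0.5)]
train_y = [6.0, 5.5, 2.0, 1.5]

print(knn_predict(train_x, train_y, (0.15, 4.5), k=2))
```

Because the query above is equidistant from its two nearest neighbors, the prediction is the simple average of their targets.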

Data Mining in Bioinformatics | 2012

MicroRNAfold: pre-microRNA secondary structure prediction based on modified NCM model with thermodynamics-based scoring strategy

Dianwei Han; Jun Zhang; Guiliang Tang

An accurate prediction of the pre-microRNA secondary structure is important in miRNA informatics. Based on a recently proposed model, nucleotide cyclic motifs (NCM), to predict RNA secondary structure, we propose and implement a Modified NCM (MNCM) model with a physics-based scoring strategy to tackle the problem of pre-microRNA folding. Our microRNAfold is implemented using a global optimal algorithm based on the bottom-up local optimal solutions. Our experimental results show that microRNAfold outperforms the current leading prediction tools in terms of True Negative rate, False Negative rate, Specificity, and Matthews coefficient ratio.

International Conference on Machine Learning and Applications | 2010

A Parallel Algorithm for Predicting the Secondary Structure of Polycistronic MicroRNAs

Dianwei Han; Guiliang Tang; Jun Zhang

MicroRNAs (miRNAs) are newly discovered endogenous small non-coding RNAs (21-25 nt) that target their complementary gene transcripts for degradation or translational repression. The biogenesis of a functional miRNA is largely dependent on the secondary structure of the miRNA precursor (pre-miRNA). Recently, it has been shown that miRNAs are present in the genome in the form of polycistronic transcriptional units in plants and animals. It will be important to design methods to predict such structures for miRNA discovery and its applications in gene silencing. In this paper, we propose a parallel algorithm based on the master-slave architecture to predict the secondary structure from an input sequence. First, the master processor partitions the input sequence into subsequences and distributes them to the slave processors. The slave processors then predict the secondary structure of their assigned subsequences. Afterward, the slave processors return their results to the master processor. Finally, the master processor merges the partial structures from the slave processors into a whole candidate secondary structure. The optimal structure is obtained by sorting the candidate structures according to their scores. Our experimental results indicate that the actual speed-ups match the trend of the theoretical values.
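
The partition, distribute, merge, and rank steps described above can be sketched as follows. This is only an outline of the control flow under loud assumptions: `toy_fold` is a hypothetical placeholder for the real structure predictor, candidates are generated by trying several fixed chunk sizes, the score is simply the number of base pairs, and a thread pool stands in for separate processors (a real implementation would more likely use processes or message passing).

```python
from concurrent.futures import ThreadPoolExecutor

PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}

def toy_fold(subseq):
    """Placeholder worker task: pair bases from both ends inward (one hairpin)."""
    structure = ["."] * len(subseq)
    i, j = 0, len(subseq) - 1
    while j - i > 3 and (subseq[i], subseq[j]) in PAIRS:
        structure[i], structure[j] = "(", ")"
        i += 1
        j -= 1
    return "".join(structure)

def partition(sequence, size):
    """Coordinator step 1: split the input into fixed-size subsequences."""
    return [sequence[i:i + size] for i in range(0, len(sequence), size)]

def candidate(sequence, size, workers=4):
    """Distribute chunks to workers, then merge the partial structures."""
    chunks = partition(sequence, size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial = list(pool.map(toy_fold, chunks))  # results keep chunk order
    structure = "".join(partial)
    return structure.count("("), structure          # (score, dot-bracket string)

def predict(sequence, chunk_sizes=(10, 12, 15)):
    """Rank the candidate structures by score and return the best one."""
    return max(candidate(sequence, size) for size in chunk_sizes)

print(predict("GGGAAAACCC" * 2))
```

Since each worker returns a balanced dot-bracket string for its own chunk, concatenating the partial results in chunk order always yields a valid whole structure.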

ACM Southeast Regional Conference | 2008

An online condition number query system

Dianwei Han; Shuting Xu; Jun Zhang

The condition number of a matrix is an important measure in numerical analysis and linear algebra. It is a measure of the stability or sensitivity of a matrix to numerical operations. However, the direct computation of the condition number of a matrix is very expensive in terms of CPU and memory cost, and becomes prohibitive for large matrices. We propose to use data mining techniques to estimate the condition number of a given sparse matrix. In particular, we use Support Vector Machine (SVM) to predict the condition numbers. That is, after computing the sparsity pattern features of a matrix, we use support vector regression (SVR) to predict its condition number. This Online Condition Number Query System (OCNQS) allows users to submit their matrices and obtain predicted condition numbers for them. Our prediction methods may not be as accurate as direct computation, but they are much faster. Our online system accepts matrices in Harwell-Boeing (HB) format and in standard MATLAB format. Users can also use our system to estimate the condition number of their matrices through LAPACK software.
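
For reference, the quantity being predicted can be computed directly for small dense matrices. The sketch below shows the 2-norm condition number as the ratio of the largest to the smallest singular value; the two example matrices are illustrative, and the sparsity pattern features used by the SVR model are not specified in the abstract and are therefore not shown.

```python
import numpy as np

def condition_number_2(matrix):
    """2-norm condition number: largest singular value over the smallest."""
    singular_values = np.linalg.svd(matrix, compute_uv=False)  # sorted descending
    return singular_values[0] / singular_values[-1]

well = np.array([[2.0, 0.0], [0.0, 1.0]])      # well conditioned
ill = np.array([[1.0, 1.0], [1.0, 1.0001]])    # nearly singular

print(condition_number_2(well))
print(condition_number_2(ill))
print(np.linalg.cond(ill, 2))  # NumPy's built-in 2-norm condition number
```

A dense SVD of an n-by-n matrix costs on the order of n cubed operations, which is why direct computation becomes impractical for large matrices and a fast learned estimate is attractive.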

Methods of Molecular Biology | 2016

Design, Construction, and Validation of Artificial MicroRNA Vectors Using Agrobacterium-Mediated Transient Expression System

Basdeo Bhagwat; Ming Chi; Dianwei Han; Haifeng Tang; Guiliang Tang

Artificial microRNA (amiRNA) technology utilizes the microRNA (miRNA) biogenesis pathway to produce artificially selected small RNAs using a miRNA gene backbone. It provides a feasible strategy for inducing loss of gene function, and has been applied in functional genomics studies, improvement of crop quality, and plant virus disease resistance. A big challenge in amiRNA applications is the unpredictability of the silencing efficacy of the designed amiRNAs: not all constructed amiRNA candidates are expressed effectively in plant cells. We and others found that high efficiency and specificity in RNA silencing can be achieved by designing amiRNAs with perfect or almost perfect sequence complementarity to their targets. In addition, we recently demonstrated that the Agrobacterium-mediated transient expression system can be used to validate amiRNA constructs, which provides a simple, rapid, and effective method to select highly expressible amiRNA candidates for stable genetic transformation. Here, we describe the methods for design of amiRNA candidates with perfect or almost perfect base-pairing to the target gene or gene groups, incorporation of amiRNA candidates in the miR168a gene backbone by one-step inverse PCR amplification, construction of plant amiRNA expression vectors, and assay of transient expression of amiRNAs in Nicotiana benthamiana through agro-infiltration, small RNA extraction, and amiRNA Northern blot.

International Journal of Bioinformatics Research and Applications | 2013

A parallel strategy for predicting the secondary structure of polycistronic microRNAs

Dianwei Han; Guiliang Tang; Jun Zhang

The biogenesis of a functional microRNA is largely dependent on the secondary structure of the microRNA precursor (pre-miRNA). Recently, it has been shown that microRNAs are present in the genome in the form of polycistronic transcriptional units in plants and animals. It will be important to design efficient computational methods to predict such structures for microRNA discovery and its applications in gene silencing. In this paper, we propose a parallel algorithm based on the master-slave architecture to predict the secondary structure from an input sequence. We conducted experiments to verify the effectiveness of our parallel algorithm. The experimental results show that our algorithm is able to produce the optimal secondary structure of polycistronic microRNAs.
