Ya Xue
Duke University
Publications
Featured research published by Ya Xue.
IEEE Transactions on Pattern Analysis and Machine Intelligence | 2007
David A. Williams; Xuejun Liao; Ya Xue; Lawrence Carin; Balaji Krishnapuram
We address the incomplete-data problem in which feature vectors to be classified are missing data (features). A (supervised) logistic regression algorithm for the classification of incomplete data is developed. Single or multiple imputation for the missing data is avoided by performing analytic integration with an estimated conditional density function (conditioned on the observed data). Conditional density functions are estimated using a Gaussian mixture model (GMM), with parameter estimation performed using both expectation-maximization (EM) and variational Bayesian EM (VB-EM). The proposed supervised algorithm is then extended to the semisupervised case by incorporating graph-based regularization. The semisupervised algorithm utilizes all available data: both incomplete and complete, as well as labeled and unlabeled. Experimental results of the proposed classification algorithms are shown.
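A minimal sketch of the prediction step this abstract describes, assuming a one-component Gaussian (the simplest GMM) for the conditional density and the standard probit approximation to the logistic-Gaussian integral; with a full GMM one would mix this computation over components. Function and variable names are illustrative, not from the paper.

```python
import numpy as np

def conditional_gaussian(mu, Sigma, x_obs, obs_idx, mis_idx):
    """Condition a joint Gaussian N(mu, Sigma) on the observed entries x_obs.
    Returns the mean and covariance of the missing entries."""
    Soo = Sigma[np.ix_(obs_idx, obs_idx)]
    Smo = Sigma[np.ix_(mis_idx, obs_idx)]
    Smm = Sigma[np.ix_(mis_idx, mis_idx)]
    K = np.linalg.solve(Soo, Smo.T).T          # Smo @ inv(Soo)
    mu_m = mu[mis_idx] + K @ (x_obs - mu[obs_idx])
    Sig_m = Smm - K @ Smo.T
    return mu_m, Sig_m

def predict_incomplete(w, b, mu, Sigma, x, mis_mask):
    """P(y=1 | observed features): integrate the logistic output over the
    conditional Gaussian of the missing features, using the probit
    approximation E[sigmoid(a)] ~ sigmoid(m / sqrt(1 + pi * s2 / 8))."""
    obs_idx = np.where(~mis_mask)[0]
    mis_idx = np.where(mis_mask)[0]
    if len(mis_idx) == 0:
        return 1.0 / (1.0 + np.exp(-(w @ x + b)))
    mu_m, Sig_m = conditional_gaussian(mu, Sigma, x[obs_idx], obs_idx, mis_idx)
    # Mean and variance of the linear activation w.x + b under the conditional.
    m = w[obs_idx] @ x[obs_idx] + w[mis_idx] @ mu_m + b
    s2 = w[mis_idx] @ Sig_m @ w[mis_idx]
    return 1.0 / (1.0 + np.exp(-m / np.sqrt(1.0 + np.pi * s2 / 8.0)))
```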
Journal of the American Statistical Association | 2008
David B. Dunson; Ya Xue; Lawrence Carin
In analyzing data from multiple related studies, it often is of interest to borrow information across studies and to cluster similar studies. Although parametric hierarchical models are commonly used, of concern is sensitivity to the form chosen for the random-effects distribution. A Dirichlet process (DP) prior can allow the distribution to be unknown, while clustering studies; however, the DP does not allow local clustering of studies with respect to a subset of the coefficients without making independence assumptions. Motivated by this problem, we propose a matrix stick-breaking process (MSBP) as a prior for a matrix of random probability measures. Properties of the MSBP are considered, and methods are developed for posterior computation using Markov chain Monte Carlo. Using the MSBP as a prior for a matrix of study-specific regression coefficients, we demonstrate advantages over parametric modeling in simulated examples. The methods are further illustrated using a multinational uterotrophic bioassay study.
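The stick-breaking representation underlying both the DP and the MSBP is easy to sketch. The following draws truncated stick-breaking weights in Python; this is the building block only, not the MSBP posterior computation developed in the paper.

```python
import numpy as np

def stick_breaking_weights(alpha, K, rng=None):
    """Draw K truncated stick-breaking weights:
    V_k ~ Beta(1, alpha), pi_k = V_k * prod_{j<k} (1 - V_j)."""
    rng = np.random.default_rng() if rng is None else rng
    V = rng.beta(1.0, alpha, size=K)
    pi = V * np.concatenate(([1.0], np.cumprod(1.0 - V[:-1])))
    return pi

# Smaller alpha concentrates mass on few sticks (fewer clusters of studies).
print(stick_breaking_weights(alpha=0.5, K=10))
print(stick_breaking_weights(alpha=5.0, K=10))
```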
International Conference on Machine Learning | 2005
David A. Williams; Xuejun Liao; Ya Xue; Lawrence Carin
A logistic regression classification algorithm is developed for problems in which the feature vectors may be missing data (features). Single or multiple imputation for the missing data is avoided by performing analytic integration with an estimated conditional density function (conditioned on the non-missing data). Conditional density functions are estimated using a Gaussian mixture model (GMM), with parameter estimation performed using both expectation maximization (EM) and Variational Bayesian EM (VB-EM). Using widely available real data, we demonstrate the general advantage of the VB-EM GMM estimation for handling incomplete data, vis-à-vis the EM algorithm. Moreover, it is demonstrated that the approach proposed here is generally superior to standard imputation procedures.
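The EM-versus-VB-EM comparison can be illustrated with scikit-learn, whose GaussianMixture (EM) and BayesianGaussianMixture (variational inference) estimators serve as stand-ins for the two density estimators compared in the paper. This toy example, not the paper's experiments, shows the variational prior shrinking unused mixture components:

```python
import numpy as np
from sklearn.mixture import GaussianMixture, BayesianGaussianMixture

rng = np.random.default_rng(0)
# Toy data: two Gaussian clusters in 2-D.
X = np.vstack([rng.normal(-2, 1, (200, 2)), rng.normal(2, 1, (200, 2))])

# Plain EM: point estimates, prone to overfitting with many components.
em = GaussianMixture(n_components=10, covariance_type='full',
                     random_state=0).fit(X)

# Variational Bayes: the prior drives unused components toward zero weight.
vb = BayesianGaussianMixture(n_components=10, covariance_type='full',
                             weight_concentration_prior_type='dirichlet_process',
                             random_state=0).fit(X)

print("EM weights:", np.round(em.weights_, 3))
print("VB weights:", np.round(vb.weights_, 3))  # most are near zero
```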
International Conference on Machine Learning | 2007
Ya Xue; David B. Dunson; Lawrence Carin
In multi-task learning our goal is to design regression or classification models for each of the tasks and appropriately share information between tasks. A Dirichlet process (DP) prior can be used to encourage task clustering. However, the DP prior does not allow local clustering of tasks with respect to a subset of the feature vector without making independence assumptions. Motivated by this problem, we develop a new multitask-learning prior, termed the matrix stick-breaking process (MSBP), which encourages cross-task sharing of data. However, the MSBP allows separate clustering and borrowing of information for the different feature components. This is important when tasks are more closely related for certain features than for others. Bayesian inference proceeds by a Gibbs sampling algorithm and the approach is illustrated using a simulated example and a multi-national application.
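A toy draw from a matrix stick-breaking construction in the spirit of the MSBP: task-level and feature-level stick variables combine multiplicatively, so two tasks can share an atom for one feature component while differing on another. The Beta hyperparameters and the truncation level are illustrative choices, not the paper's.

```python
import numpy as np

def msbp_assignments(n_tasks, n_features, K, a=1.0, b=1.0, rng=None):
    """Toy matrix stick-breaking draw: task sticks U[t, k] and feature
    sticks W[f, k] combine as p_k = U[t, k] * W[f, k], giving local
    (per-feature) clustering of tasks."""
    rng = np.random.default_rng() if rng is None else rng
    U = rng.beta(1.0, a, size=(n_tasks, K))
    W = rng.beta(1.0, b, size=(n_features, K))
    Z = np.empty((n_tasks, n_features), dtype=int)
    for t in range(n_tasks):
        for f in range(n_features):
            p = U[t] * W[f]
            pi = p * np.concatenate(([1.0], np.cumprod(1.0 - p[:-1])))
            pi = np.append(pi, max(1.0 - pi.sum(), 0.0))  # leftover stick mass
            Z[t, f] = rng.choice(K + 1, p=pi / pi.sum())
    return Z  # Z[t, f] = atom index, shared across tasks per feature

print(msbp_assignments(n_tasks=4, n_features=3, K=5,
                       rng=np.random.default_rng(1)))
```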
IEEE Signal Processing Letters | 2008
Jun Fang; Shihao Ji; Ya Xue; Lawrence Carin
We consider the problem of multitask learning (MTL), in which we simultaneously learn classifiers for multiple data sets (tasks), with sharing of intertask data as appropriate. We introduce a set of relevance parameters that control the degree to which data from other tasks are used in estimating the current task's classifier parameters. The relevance parameters are learned by maximizing their posterior probability, yielding an expectation-maximization (EM) algorithm. We illustrate the effectiveness of our approach through experimental results on a practical data set.
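A simplified alternating scheme in the spirit of this letter, using scikit-learn's sample_weight to weight other tasks' data by a per-task relevance score. The relevance update below is a heuristic stand-in for the paper's EM posterior maximization, and it assumes binary 0/1 labels.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def relevance_weighted_mtl(target, others, n_iter=10):
    """Alternate between (1) fitting the target classifier on its own data
    plus other tasks' data weighted by relevance rho, and (2) updating each
    rho as the mean predicted probability of that task's true labels."""
    Xt, yt = target
    rho = np.ones(len(others))            # initial relevance of each task
    clf = LogisticRegression()
    for _ in range(n_iter):
        Xs = [Xt] + [X for X, _ in others]
        ys = [yt] + [y for _, y in others]
        w = [np.ones(len(yt))] + [np.full(len(y), r)
                                  for r, (_, y) in zip(rho, others)]
        clf.fit(np.vstack(Xs), np.concatenate(ys),
                sample_weight=np.concatenate(w))
        # Relevance = how well the current classifier explains each task.
        rho = np.array([clf.predict_proba(X)[np.arange(len(y)), y].mean()
                        for X, y in others])
    return clf, rho
```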
International Conference on Multimedia Information Networking and Security | 2007
Jason R. Stack; F. Crosby; R. J. McDonald; Ya Xue; Lawrence Carin
The purpose of this research is to jointly learn multiple classification tasks by appropriately sharing information between similar tasks. In this setting, examples of different tasks include the discrimination of targets from non-targets by different sonars, or by the same sonar operating in sufficiently different environments. This is known as multi-task learning (MTL) and is accomplished via a Bayesian approach whereby the learned parameters for classifiers of similar tasks are drawn from a common prior. To learn which tasks are similar and the appropriate priors, a Dirichlet process is employed and solved using mean-field variational Bayesian inference. The result is that, for many real-world instances where training data is limited, MTL exhibits a significant improvement over both learning individual classifiers for each task and pooling all data to train one overall classifier. The performance of this method is demonstrated on simulated data and on experimental data from multiple imaging sonars operating over multiple environments.
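As a crude stand-in for the DP-based task clustering, one can cluster per-task classifier coefficients and retrain one classifier per cluster. This sketch illustrates the middle ground between the two baselines the abstract compares (fully individual and fully pooled classifiers); the paper instead infers the clustering, including the number of clusters, with mean-field variational Bayes.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.linear_model import LogisticRegression

def cluster_then_pool(tasks, n_clusters=2):
    """tasks: list of (X, y) pairs. Fit a classifier per task, cluster tasks
    by their coefficient vectors with k-means, then retrain one classifier
    per cluster on the pooled data of that cluster."""
    coefs = np.vstack([LogisticRegression().fit(X, y).coef_.ravel()
                       for X, y in tasks])
    labels = KMeans(n_clusters=n_clusters, n_init=10,
                    random_state=0).fit_predict(coefs)
    models = {}
    for c in np.unique(labels):
        idx = np.where(labels == c)[0]
        Xc = np.vstack([tasks[t][0] for t in idx])
        yc = np.concatenate([tasks[t][1] for t in idx])
        models[c] = LogisticRegression().fit(Xc, yc)
    return labels, models
```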
IEEE International Workshop on Computational Advances in Multi-Sensor Adaptive Processing | 2007
Lawrence Carin; Dehong Liu; Ya Xue
Compressive sensing (CS) is a framework that exploits the compressible character of most natural signals, allowing the accurate measurement of an m-dimensional real signal u in terms of n << m real measurements v. The CS measurements may be represented in terms of an n × m matrix that defines the linear relationship between v and u. In this paper we demonstrate that similar linear mappings of the form u → v are manifested naturally by wave propagation in complex media, and therefore in situ CS measurements may be performed simply by exploiting the complex propagation and scattering properties of natural environments. A similar phenomenon is observed in time-reversal imaging, to which connections are made. In addition to presenting the basic in situ CS framework, a simple but practical example problem is considered.
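The measurement-and-recovery pipeline the abstract refers to can be sketched with a random matrix standing in for the propagation medium: take n << m linear measurements v = Φu of a sparse signal u and recover it by iterative soft thresholding (ISTA). The ℓ1 solver and the parameter values are illustrative choices, not the paper's.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n, k = 200, 50, 5                         # signal dim, measurements, nonzeros
u = np.zeros(m)
u[rng.choice(m, k, replace=False)] = rng.normal(size=k)
Phi = rng.normal(size=(n, m)) / np.sqrt(n)   # random measurement matrix
v = Phi @ u                                  # n << m linear measurements

# ISTA for min 0.5*||v - Phi x||^2 + lam*||x||_1, step size 1/L.
lam = 0.01
L = np.linalg.norm(Phi, 2) ** 2              # Lipschitz constant of the gradient
x = np.zeros(m)
for _ in range(500):
    g = x + (Phi.T @ (v - Phi @ x)) / L      # gradient step
    x = np.sign(g) * np.maximum(np.abs(g) - lam / L, 0.0)  # soft threshold

print("relative recovery error:", np.linalg.norm(x - u) / np.linalg.norm(u))
```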
IEEE Transactions on Signal Processing | 2008
Shihao Ji; Ya Xue; Lawrence Carin
Journal of Machine Learning Research | 2007
Ya Xue; Xuejun Liao; Lawrence Carin; Balaji Krishnapuram
International Conference on Machine Learning | 2005
Xuejun Liao; Ya Xue; Lawrence Carin