Jieping Ye | Researchain

Publication

Featured research published by Jieping Ye.

International Conference on Computer Vision | 2009

Tensor completion for estimating missing values in visual data

Ji Liu; Przemyslaw Musialski; Peter Wonka; Jieping Ye

In this paper we propose an algorithm to estimate missing values in tensors of visual data. The values can be missing due to problems in the acquisition process, or because the user manually identified unwanted outliers. Our algorithm works even with a small number of samples and it can propagate structure to fill larger missing regions. Our methodology is built on recent studies about matrix completion using the matrix trace norm. The contribution of our paper is to extend the matrix case to the tensor case by laying out the theoretical foundations and then by building a working algorithm. First, we propose a definition for the tensor trace norm that generalizes the established definition of the matrix trace norm. Second, similar to matrix completion, the tensor completion is formulated as a convex optimization problem. Unfortunately, the straightforward problem extension is significantly harder to solve than the matrix case because of the dependency among multiple constraints. To tackle this problem, we employ a relaxation technique to separate the dependent relationships and use the block coordinate descent (BCD) method to achieve a globally optimal solution. Our experiments show potential applications of our algorithm and the quantitative evaluation indicates that our method is more accurate and robust than heuristic approaches.
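
The abstract does not spell out the tensor trace norm. As a rough sketch only, one natural generalization (assumed here, not quoted from the abstract) averages the matrix trace norms of the mode-i unfoldings of the tensor. The symbols n, \Omega, \mathcal{T}, and \mathcal{X}_{(i)} are illustrative notation.

```latex
% Matrix trace (nuclear) norm: the sum of the singular values of X
\|X\|_{*} = \sum_{k} \sigma_k(X)

% Assumed generalization to an n-mode tensor \mathcal{X}:
% the average trace norm of its mode-i unfoldings \mathcal{X}_{(i)}
\|\mathcal{X}\|_{*} = \frac{1}{n} \sum_{i=1}^{n} \|\mathcal{X}_{(i)}\|_{*}

% Completion as a convex problem: agree with the data tensor \mathcal{T}
% on the set \Omega of observed entries
\min_{\mathcal{X}} \; \|\mathcal{X}\|_{*}
\quad \text{s.t.} \quad \mathcal{X}_{\Omega} = \mathcal{T}_{\Omega}
```

Under this assumed form, every unfolding contains the same entries of the tensor, so the n terms cannot be minimized independently. That coupling is one way to read the "dependency" the abstract says the relaxation is meant to separate.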

NeuroImage | 2017

ENIGMA and the Individual: Predicting Factors that Affect the Brain in 35 Countries Worldwide

Paul M. Thompson; Ole A. Andreassen; Alejandro Arias-Vasquez; Carrie E. Bearden; Premika S.W. Boedhoe; Rachel M. Brouwer; Randy L. Buckner; Jan K. Buitelaar; Kazima Bulayeva; Dara M. Cannon; Ronald A. Cohen; Patricia J. Conrod; Anders M. Dale; Ian J. Deary; Emily L. Dennis; Marcel A. de Reus; Sylvane Desrivières; Danai Dima; Gary Donohoe; Simon E. Fisher; Jean-Paul Fouche; Clyde Francks; Sophia Frangou; Barbara Franke; Habib Ganjgahi; Hugh Garavan; David C. Glahn; Hans Joergen Grabe; Tulio Guadalupe; Boris A. Gutman

In this review, we discuss recent work by the ENIGMA Consortium (http://enigma.ini.usc.edu) – a global alliance of over 500 scientists spread across 200 institutions in 35 countries collectively analyzing brain imaging, clinical, and genetic data. Initially formed to detect genetic influences on brain measures, ENIGMA has grown to over 30 working groups studying 12 major brain diseases by pooling and comparing brain data. In some of the largest neuroimaging studies to date – of schizophrenia and major depression – ENIGMA has found replicable disease effects on the brain that are consistent worldwide, as well as factors that modulate disease effects. In partnership with other consortia including ADNI, CHARGE, IMAGEN and others, ENIGMA's genomic screens – now numbering over 30,000 MRI scans – have revealed at least 8 genetic loci that affect brain volumes. Downstream of gene findings, ENIGMA has revealed how these individual variants – and genetic variants in general – may affect both the brain and risk for a range of diseases. The ENIGMA consortium is discovering factors that consistently affect brain structure and function that will serve as future predictors linking individual brain scans and genomic data. It is generating vast pools of normative data on brain measures – from tens of thousands of people – that may help detect deviations from normal development or aging in specific groups of subjects. We discuss challenges and opportunities in applying these predictors to individual subjects and new cohorts, as well as lessons we have learned in ENIGMA's efforts so far.

Knowledge Discovery and Data Mining | 2015

Multi-Task Learning for Spatio-Temporal Event Forecasting

Liang Zhao; Qian Sun; Jieping Ye; Feng Chen; Chang-Tien Lu; Naren Ramakrishnan

Spatial event forecasting from social media is an important problem but encounters critical challenges, such as dynamic patterns of features (keywords) and geographic heterogeneity (e.g., spatial correlations, imbalanced samples, and different populations in different locations). Most existing approaches (e.g., LASSO regression, dynamic query expansion, and burst detection) are designed to address some of these challenges, but not all of them. This paper proposes a novel multi-task learning framework which aims to concurrently address all the challenges. Specifically, given a collection of locations (e.g., cities), we propose to build forecasting models for all locations simultaneously by extracting and utilizing appropriate shared information that effectively increases the sample size for each location, thus improving the forecasting performance. We combine both static features derived from a predefined vocabulary by domain experts and dynamic features generated from dynamic query expansion in a multi-task feature learning framework; we investigate different strategies to balance homogeneity and diversity between static and dynamic terms. Efficient algorithms based on Iterative Group Hard Thresholding are developed to achieve efficient and effective model training and prediction. Extensive experimental evaluations on Twitter data from four different countries in Latin America demonstrated the effectiveness of our proposed approach.
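
The abstract names Iterative Group Hard Thresholding without giving details. The sketch below illustrates only a group hard thresholding step, assuming each group is one row of a weight matrix (one row per feature, one column per location) and that the step keeps the k rows with the largest Euclidean norm. The function name, the matrix layout, and the example values are hypothetical.

```python
import math


def group_hard_threshold(weights, k):
    """Keep the k rows with the largest Euclidean norm and zero out the rest.

    weights: list of rows, one row per feature and one column per task.
    Illustrative projection step only, not the paper's full algorithm.
    """
    norms = [math.sqrt(sum(v * v for v in row)) for row in weights]
    # Indices of the k largest row norms (ties broken by lower index).
    order = sorted(range(len(weights)), key=lambda i: (-norms[i], i))
    keep = set(order[:k])
    return [
        list(row) if i in keep else [0.0] * len(row)
        for i, row in enumerate(weights)
    ]


# Four features shared across two locations; keep the two strongest features.
W = [[0.1, -0.2], [3.0, 4.0], [0.0, 0.5], [-2.0, 1.0]]
sparse_W = group_hard_threshold(W, k=2)
```

Because a whole row is kept or dropped, a surviving feature is shared by all locations, which is one way the models could pool information across tasks.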

Knowledge Discovery and Data Mining | 2015

Deep Model Based Transfer and Multi-Task Learning for Biological Image Analysis

Wenlu Zhang; Tao Zeng; Qian Sun; Sudhir Kumar; Jieping Ye; Shuiwang Ji

A central theme in learning from image data is to develop appropriate image representations for the specific task at hand. Traditional methods used handcrafted local features combined with high-level image representations to generate image-level representations. Thus, a practical challenge is to determine what features are appropriate for specific tasks. For example, in the study of gene expression patterns in Drosophila melanogaster, texture features based on wavelets were particularly effective for determining the developmental stages from in situ hybridization (ISH) images. Such an image representation is, however, not suitable for controlled vocabulary (CV) term annotation because each CV term is often associated with only a part of an image. Here, we developed problem-independent feature extraction methods to generate hierarchical representations for ISH images. Our approach is based on deep convolutional neural networks (CNNs) that can act on image pixels directly. To make the extracted features generic, the models were trained using a natural image set with millions of labeled examples. These models were transferred to the ISH image domain and used directly as feature extractors to compute image representations. Furthermore, we employed a multi-task learning method to fine-tune the pre-trained models with labeled ISH images, and also extracted features from the fine-tuned models. Experimental results showed that feature representations computed by deep models based on transfer and multi-task learning significantly outperformed other methods for annotating gene expression patterns at different stage ranges. We also demonstrated that the intermediate layers of deep models produced the best gene expression pattern representations.

BMC Bioinformatics | 2015

Deep convolutional neural networks for annotating gene expression patterns in the mouse brain

Tao Zeng; Ravi Mukkamala; Jieping Ye; Shuiwang Ji

Background: Profiling gene expression in brain structures at various spatial and temporal scales is essential to understanding how genes regulate the development of brain structures. The Allen Developing Mouse Brain Atlas provides high-resolution 3-D in situ hybridization (ISH) gene expression patterns in multiple developing stages of the mouse brain. Currently, the ISH images are annotated with anatomical terms manually. In this paper, we propose a computational approach to annotate gene expression pattern images in the mouse brain at various structural levels over the course of development. Results: We applied a deep convolutional neural network that was trained on a large set of natural images to extract features from the ISH images of the developing mouse brain. As a baseline representation, we applied invariant image feature descriptors to capture local statistics from ISH images and used the bag-of-words approach to build image-level representations. Both types of features from multiple ISH image sections of the entire brain were then combined to build 3-D, brain-wide gene expression representations. We employed regularized learning methods for discriminating gene expression patterns in different brain structures. Results show that our approach of using a convolutional model as a feature extractor achieved superior performance in annotating gene expression patterns at multiple levels of brain structures throughout four developing ages. Overall, we achieved an average AUC of 0.894 ± 0.014, as compared with 0.820 ± 0.046 yielded by the bag-of-words approach. Conclusions: A deep convolutional neural network model trained on natural image sets and applied to gene expression pattern annotation tasks yielded superior performance, demonstrating that its transfer learning property is applicable to such biological image sets.

International Symposium on Biomedical Imaging | 2015

Detecting genetic risk factors for Alzheimer's disease in whole genome sequence data via Lasso screening

Tao Yang; Jie Wang; Qian Sun; Derrek P. Hibar; Neda Jahanshad; Li Liu; Yalin Wang; Liang Zhan; Paul M. Thompson; Jieping Ye

Genetic factors play a key role in Alzheimer's disease (AD). The Alzheimer's Disease Neuroimaging Initiative (ADNI) whole genome sequence (WGS) data offers new power to investigate mechanisms of AD by combining entire genome sequences with neuroimaging and clinical data. Here we explore the ADNI WGS SNP (single nucleotide polymorphism) data in depth and extract approximately six million valid SNP features. We investigate imaging genetics associations using Lasso regression - a widely used sparse learning technique. To solve the large-scale Lasso problem more efficiently, we employ a highly efficient screening rule for Lasso - called dual polytope projections (DPP) - to remove irrelevant features from the optimization problem. Experiments demonstrate that the DPP can effectively identify irrelevant features and leads to a 400× speedup. This allows us for the first time to run the compute-intensive model selection procedure called stability selection to rank SNPs that may affect the brain and AD risk.
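
To make the idea of discarding features before solving the Lasso concrete, here is a minimal sketch in plain Python. It does not implement the DPP rule. It shows a coordinate descent solver for the objective (1/2)||y - Xb||^2 + lam * ||b||_1 together with the simplest screening fact: once lam reaches max_j |x_j^T y|, every coefficient is zero. The function names and the toy data are assumptions made for illustration.

```python
def soft_threshold(z, t):
    """Shrink z toward zero by t (the scalar Lasso update)."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def lasso_cd(X, y, lam, n_iter=200):
    """Coordinate descent for (1/2)*||y - X b||^2 + lam*||b||_1.

    X is a list of rows, y a list of targets.
    """
    n, p = len(X), len(X[0])
    b = [0.0] * p
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Correlation of column j with the residual that leaves feature j out.
            rho = sum(
                X[i][j] * (y[i] - sum(X[i][k] * b[k] for k in range(p) if k != j))
                for i in range(n)
            )
            b[j] = soft_threshold(rho, lam) / col_sq[j]
    return b


def lambda_max(X, y):
    """Smallest lam at which the all-zero vector solves the problem."""
    n, p = len(X), len(X[0])
    return max(abs(sum(X[i][j] * y[i] for i in range(n))) for j in range(p))


X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
y = [2.0, 0.1, 2.0]
lam_hi = lambda_max(X, y)
b_zero = lasso_cd(X, y, lam_hi)   # every feature is screened out
b_sparse = lasso_cd(X, y, 1.0)    # only the first feature survives
```

A screening rule such as DPP aims for the same kind of guarantee at smaller values of lam, so that most of the roughly six million SNP features never enter the solver.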

Knowledge Discovery and Data Mining | 2016

A Multi-Task Learning Formulation for Survival Analysis

Yan Li; Jie Wang; Jieping Ye; Chandan K. Reddy

Predicting the occurrence of a particular event of interest at future time points is the primary goal of survival analysis. The presence of incomplete observations due to time limitations or loss of data traces is known as censoring, which brings unique challenges in this domain and differentiates survival analysis from other standard regression methods. Popular survival analysis methods such as the Cox proportional hazards model and parametric survival regression rely on strict assumptions and hypotheses that are not realistic in most real-world applications. To overcome the weaknesses of these two types of methods, in this paper, we reformulate the survival analysis problem as a multi-task learning problem and propose a new multi-task learning based formulation to predict the survival time by estimating the survival status at each time interval during the study duration. We propose an indicator matrix to enable the multi-task learning algorithm to handle censored instances and incorporate some of the important characteristics of survival problems such as non-negative non-increasing list structure into our model through max-heap projection. We employ the L2,1-norm penalty, which enables the model to learn a shared representation across related tasks and hence select important features and alleviate over-fitting in high-dimensional feature spaces, thus reducing the prediction error of each task. To efficiently handle the two non-smooth constraints, we propose an optimization method which employs the Alternating Direction Method of Multipliers (ADMM) algorithm to solve the proposed multi-task learning problem. We demonstrate the performance of the proposed method using real-world high-dimensional microarray gene expression benchmark datasets and show that our method outperforms state-of-the-art methods.
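
The L2,1-norm penalty mentioned above can be illustrated with a short sketch. It assumes the common multi-task layout in which each row of the weight matrix holds one feature's coefficients across all tasks (here, time intervals). The proximal operator shown is the standard row-wise shrinkage that an ADMM-style solver would typically call, and it is not taken from the paper. The function names and example values are hypothetical.

```python
import math


def l21_norm(W):
    """Sum of the Euclidean norms of the rows of W."""
    return sum(math.sqrt(sum(v * v for v in row)) for row in W)


def prox_l21(W, t):
    """Proximal operator of t * ||W||_{2,1}: shrink each row's norm by t."""
    out = []
    for row in W:
        norm = math.sqrt(sum(v * v for v in row))
        # Rows whose norm is at most t are set to zero as a whole.
        scale = max(0.0, 1.0 - t / norm) if norm > 0.0 else 0.0
        out.append([scale * v for v in row])
    return out


# Three features, two tasks: one strong feature, one weak, one unused.
W = [[3.0, 4.0], [0.3, 0.4], [0.0, 0.0]]
penalty = l21_norm(W)
shrunk = prox_l21(W, 1.0)
```

The weak second row is removed for every task at once while the strong first row is only scaled down, which is the joint feature selection effect the abstract attributes to the penalty.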

Knowledge Discovery and Data Mining | 2015

Structural Graphical Lasso for Learning Mouse Brain Connectivity

Sen Yang; Qian Sun; Shuiwang Ji; Peter Wonka; Ian Davidson; Jieping Ye

Investigations into brain connectivity aim to recover networks of brain regions connected by anatomical tracts or by functional associations. The inference of brain networks has recently attracted much interest due to the increasing availability of high-resolution brain imaging data. Sparse inverse covariance estimation with lasso and group lasso penalty has been demonstrated to be a powerful approach to discover brain networks. Motivated by the hierarchical structure of the brain networks, we consider the problem of estimating a graphical model with tree-structural regularization in this paper. The regularization encourages the graphical model to exhibit a brain-like structure. Specifically, in this hierarchical structure, hundreds of thousands of voxels serve as the leaf nodes of the tree. A node in the intermediate layer represents a region formed by voxels in the subtree rooted at that node. The whole brain is considered as the root of the tree. We propose to apply the tree-structural regularized graphical model to estimate the mouse brain network. However, the dimensionality of whole-brain data, usually on the order of hundreds of thousands, poses significant computational challenges. Efficient algorithms that are capable of estimating networks from high-dimensional data are highly desired. To address the computational challenge, we develop a screening rule which can quickly identify many zero blocks in the estimated graphical model, thereby dramatically reducing the computational cost of solving the proposed model. It is based on a novel insight on the relationship between screening and the so-called proximal operator that we first establish in this paper. We perform experiments on both synthetic data and real data from the Allen Developing Mouse Brain Atlas; results demonstrate the effectiveness and efficiency of the proposed approach.

Knowledge Discovery and Data Mining | 2016

Scalable Fast Rank-1 Dictionary Learning for fMRI Big Data Analysis

Xiang Li; Milad Makkie; Binbin Lin; Mojtaba Sedigh Fazli; Ian Davidson; Jieping Ye; Tianming Liu; Shannon Quinn

Various functional neuroimaging studies have shown that sparsity-regularized dictionary learning can achieve superior performance in decomposing comprehensive and neuroscientifically meaningful functional networks from massive fMRI signals. However, the computational cost of solving the dictionary learning problem is known to be very demanding, especially when dealing with large-scale data sets. Thus in this work, we propose a novel distributed rank-1 dictionary learning (D-r1DL) model and apply it to fMRI big data analysis. The model estimates one rank-1 basis vector with a sparsity constraint on its loading coefficient from the input data at each learning step through alternating least squares updates. By iteratively learning the rank-1 basis and deflating the input data at each step, the model is then capable of decomposing the whole set of functional networks. We implement and parallelize the rank-1 dictionary learning algorithm using the Spark engine and deploy the resilient distributed dataset (RDD) abstraction for data distribution and operations. Experimental results from applying the model to the Human Connectome Project (HCP) data show that the proposed D-r1DL model is efficient and scalable for fMRI big data analytics, thus enabling data-driven neuroscientific discovery from massive fMRI data in the future.
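
The learn-then-deflate loop described above can be sketched on a single machine as follows. This is not the distributed Spark implementation. It assumes the sparsity constraint keeps the r largest-magnitude loading coefficients and that the basis vector is kept at unit norm. The names rank1_step and deflate and the toy matrix are hypothetical.

```python
import math


def rank1_step(S, r, n_iter=50):
    """Estimate a unit-norm basis u and an r-sparse loading v with S ~ u v^T."""
    T, P = len(S), len(S[0])
    u = [1.0 / math.sqrt(T)] * T  # deterministic start; random starts also work
    v = [0.0] * P
    for _ in range(n_iter):
        # Least-squares update of v for a fixed unit-norm u: v = S^T u.
        v = [sum(S[t][p] * u[t] for t in range(T)) for p in range(P)]
        # Sparsity (assumed form): keep the r largest-magnitude loadings.
        order = sorted(range(P), key=lambda p: (-abs(v[p]), p))
        keep = set(order[:r])
        v = [v[p] if p in keep else 0.0 for p in range(P)]
        # Least-squares update of u for a fixed v, then renormalize.
        u_new = [sum(S[t][p] * v[p] for p in range(P)) for t in range(T)]
        norm = math.sqrt(sum(x * x for x in u_new))
        if norm == 0.0:
            break
        u = [x / norm for x in u_new]
    return u, v


def deflate(S, u, v):
    """Subtract the learned rank-1 component u v^T from the data."""
    return [[S[t][p] - u[t] * v[p] for p in range(len(v))] for t in range(len(u))]


# Toy data that is exactly rank 1: outer([0.6, 0.8], [5, 0, 10]).
S = [[3.0, 0.0, 6.0], [4.0, 0.0, 8.0]]
u, v = rank1_step(S, r=2)
residual = deflate(S, u, v)
```

Calling rank1_step again on the residual would give the next component. Each step only needs products of the data matrix with a vector, which is the kind of operation that is straightforward to distribute over row partitions.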

Knowledge Discovery and Data Mining | 2015

Dynamic Poisson Autoregression for Influenza-Like-Illness Case Count Prediction

Zheng Wang; Prithwish Chakraborty; Sumiko R. Mekaru; John S. Brownstein; Jieping Ye; Naren Ramakrishnan

Influenza-like-illness (ILI) is among the most common diseases worldwide, and reliable forecasting of it can have significant public health benefits. Recently, new forms of disease surveillance based upon digital data sources have been proposed and are continuing to attract attention over traditional surveillance methods. In this paper, we focus on short-term ILI case count prediction and develop a dynamic Poisson autoregressive model with exogenous input variables (DPARX) for flu forecasting. In this model, we allow the autoregressive model to change over time. In order to control the variation in the model, we construct a model similarity graph to specify the relationship between pairs of models at two time points and embed prior knowledge in terms of the structure of the graph. We formulate ILI case count forecasting as a convex optimization problem, whose objective balances the autoregressive loss and the model similarity regularization induced by the structure of the similarity graph. We then propose an efficient algorithm to solve this problem by block coordinate descent. We apply our model and the corresponding learning method on historical ILI records for 15 countries around the world using a variety of syndromic surveillance data sources. Our approach provides consistently better forecasting results than state-of-the-art models available for short-term ILI case count forecasting.
