Cheng Li | Researchain

Archive Network Publication Hotspot Collaboration

Network

Latest external collaboration on country level. Dive into details by clicking on the dots.

Explore More

Hotspot

Dive into the research topics where Cheng Li is active.

Explore More

Publication

Featured researches published by Cheng Li.

Scientific Reports | 2017

Rapid Bayesian optimisation for synthesis of short polymer fiber materials.

Cheng Li; David Rubin de Celis Leal; Santu Rana; Sunil Kumar Gupta; Alessandra Sutti; Stewart Greenhill; Teo Slezak; Murray Height; Svetha Venkatesh

The discovery of processes for the synthesis of new materials involves many decisions about process design, operation, and material properties. Experimentation is crucial but as complexity increases, exploration of variables can become impractical using traditional combinatorial approaches. We describe an iterative method which uses machine learning to optimise process development, incorporating multiple qualitative and quantitative objectives. We demonstrate the method with a novel fluid processing platform for synthesis of short polymer fibers, and show how the synthesis process can be efficiently directed to achieve material and process objectives.

international conference on data mining | 2016

Budgeted Batch Bayesian Optimization

Vu Nguyen; Santu Rana; Sunil Kumar Gupta; Cheng Li; Svetha Venkatesh

Parameter settings profoundly impact the performance of machine learning algorithms and laboratory experiments. The classical trial-error methods are exponentially expensive in large parameter spaces, and Bayesian optimization (BO) offers an elegant alternative for global optimization of black box functions. In situations where the functions can be evaluated at multiple points simultaneously, batch Bayesian optimization is used. Current batch BO approaches are restrictive in fixing the number of evaluations per batch, and this can be wasteful when the number of specified evaluations is larger than the number of real maxima in the underlying acquisition function. We present the budgeted batch Bayesian optimization (B3O) for hyper-parameter tuning and experimental design - we identify the appropriate batch size for each iteration in an elegant way. In particular, we use the infinite Gaussian mixture model (IGMM) for automatically identifying the number of peaks in the underlying acquisition functions. We solve the intractability of estimating the IGMM directly from the acquisition function by formulating the batch generalized slice sampling to efficiently draw samples from the acquisition function. We perform extensive experiments for benchmark functions and two real world applications - machine learning hyper-parameter tuning and experimental design for alloy hardening. We show empirically that the proposed B3O outperforms the existing fixed batch BO approaches in finding the optimum whilst requiring a fewer number of evaluations, thus saving cost and time.

international conference on multimedia and expo | 2013

Exploiting side information in distance dependent Chinese restaurant processes for data clustering

Cheng Li; Dinh Q. Phung; Santu Rana; Svetha Venkatesh

Multimedia contents often possess weakly annotated data such as tags, links and interactions. The weakly annotated data is called side information. It is the auxiliary information of data and provides hints for exploring the link structure of data. Most clustering algorithms utilize pure data for clustering. A model that combines pure data and side information, such as images and tags, documents and keywords, can perform better at understanding the underlying structure of data. We demonstrate how to incorporate different types of side information into a recently proposed Bayesian nonparametric model, the distance dependent Chinese restaurant process (DD-CRP). Our algorithm embeds the affinity of this information into the decay function of the DD-CRP when side information is in the form of subsets of discrete labels. It is flexible to measure distance based on arbitrary side information instead of only the spatial layout or time stamp of observations. At the same time, for noisy and incomplete side information, we set the decay function so that the DD-CRP reduces to the traditional Chinese restaurant process, thus not inducing side effects of noisy and incomplete side information. Experimental evaluations on two real-world datasets NUS WIDE and 20 Newsgroups show exploiting side information in DD-CRP significantly improves the clustering performance.

Knowledge Based Systems | 2016

Hierarchical Bayesian nonparametric models for knowledge discovery from electronic medical records

Cheng Li; Santu Rana; Dinh Q. Phung; Svetha Venkatesh

Electronic Medical Record (EMR) has established itself as a valuable resource for large scale analysis of health data. A hospital EMR dataset typically consists of medical records of hospitalized patients. A medical record contains diagnostic information (diagnosis codes), procedures performed (procedure codes) and admission details. Traditional topic models, such as latent Dirichlet allocation (LDA) and hierarchical Dirichlet process (HDP), can be employed to discover disease topics from EMR data by treating patients as documents and diagnosis codes as words. This topic modeling helps to understand the constitution of patient diseases and offers a tool for better planning of treatment. In this paper, we propose a novel and flexible hierarchical Bayesian nonparametric model, the word distance dependent Chinese restaurant franchise (wddCRF), which incorporates word-to-word distances to discover semantically-coherent disease topics. We are motivated by the fact that diagnosis codes are connected in the form of ICD-10 tree structure which presents semantic relationships between codes. We exploit a decay function to incorporate distances between words at the bottom level of wddCRF. Efficient inference is derived for the wddCRF by using MCMC technique. Furthermore, since procedure codes are often correlated with diagnosis codes, we develop the correspondence wddCRF (Corr-wddCRF) to explore conditional relationships of procedure codes for a given disease pattern. Efficient collapsed Gibbs sampling is derived for the Corr-wddCRF. We evaluate the proposed models on two real-world medical datasets - PolyVascular disease and Acute Myocardial Infarction disease. We demonstrate that the Corr-wddCRF model discovers more coherent topics than the Corr-HDP. We also use disease topic proportions as new features and show that using features from the Corr-wddCRF outperforms the baselines on 14-days readmission prediction. Beside these, the prediction for procedure codes based on the Corr-wddCRF also shows considerable accuracy.

pacific-asia conference on knowledge discovery and data mining | 2016

Toxicity Prediction in Cancer Using Multiple Instance Learning in a Multi-task Framework

Cheng Li; Sunil Kumar Gupta; Santu Rana; Wei Luo; Svetha Venkatesh; David Ashely; Dinh Q. Phung

Treatments of cancer cause severe side effects called toxicities. Reduction of such effects is crucial in cancer care. To impact care, we need to predict toxicities at fortnightly intervals. This toxicity data differs from traditional time series data as toxicities can be caused by one treatment on a given day alone, and thus it is necessary to consider the effect of the singular data vector causing toxicity. We model the data before prediction points using the multiple instance learning, where each bag is composed of multiple instances associated with daily treatments and patient-specific attributes, such as chemotherapy, radiotherapy, age and cancer types. We then formulate a Bayesian multi-task framework to enhance toxicity prediction at each prediction point. The use of the prior allows factors to be shared across task predictors. Our proposed method simultaneously captures the heterogeneity of daily treatments and performs toxicity prediction at different prediction points. Our method was evaluated on a real-word dataset of more than 2000 cancer patients and had achieved a better prediction accuracy in terms of AUC than the state-of-art baselines.

international joint conference on artificial intelligence | 2017

High dimensional bayesian optimization using dropout

Cheng Li; Sunil Kumar Gupta; Santu Rana; Vu Nguyen; Svetha Venkatesh; Alistair Shilton

Scaling Bayesian optimization to high dimensions is challenging task as the global optimization of high-dimensional acquisition function can be expensive and often infeasible. Existing methods depend either on limited active variables or the additive form of the objective function. We propose a new method for high-dimensional Bayesian optimization, that uses a dropout strategy to optimize only a subset of variables at each iteration. We derive theoretical bounds for the regret and show how it can inform the derivation of our algorithm. We demonstrate the efficacy of our algorithms for optimization on two benchmark functions and two real-world applications- training cascade classifiers and optimizing alloy composition.

Knowledge and Information Systems | 2016

Data clustering using side information dependent Chinese restaurant processes

Cheng Li; Santu Rana; Dinh Q. Phung; Svetha Venkatesh

Side information, or auxiliary information associated with documents or image content, provides hints for clustering. We propose a new model, side information dependent Chinese restaurant process, which exploits side information in a Bayesian nonparametric model to improve data clustering. We introduce side information into the framework of distance dependent Chinese restaurant process using a robust decay function to handle noisy side information. The threshold parameter of the decay function is updated automatically in the Gibbs sampling process. A fast inference algorithm is proposed. We evaluate our approach on four datasets: Cora, 20 Newsgroups, NUS-WIDE and one medical dataset. Types of side information explored in this paper include citations, authors, tags, keywords and auxiliary clinical information. The comparison with the state-of-the-art approaches based on standard performance measures (NMI, F1) clearly shows the superiority of our approach.

international conference on pattern recognition | 2014

Regularizing Topic Discovery in EMRs with Side Information by Using Hierarchical Bayesian Models

Cheng Li; Santu Rana; Dinh Q. Phung; Svetha Venkatesh

We propose a novel hierarchical Bayesian framework, word-distance-dependent Chinese restaurant franchise (wd-dCRF) for topic discovery from a document corpus regularized by side information in the form of word-to-word relations, with an application on Electronic Medical Records (EMRs). Typically, a EMRs dataset consists of several patients (documents) and each patient contains many diagnosis codes (words). We exploit the side information available in the form of a semantic tree structure among the diagnosis codes for semantically-coherent disease topic discovery. We introduce novel functions to compute word-to-word distances when side information is available in the form of tree structures. We derive an efficient inference method for the wddCRF using MCMC technique. We evaluate on a real world medical dataset consisting of about 1000 patients with PolyVascular disease. Compared with the popular topic analysis tool, hierarchical Dirichlet process (HDP), our model discovers topics which are superior in terms of both qualitative and quantitative measures.

international conference on pattern recognition | 2016

Multiple adverse effects prediction in longitudinal cancer treatment

Cheng Li; Sunil Kumar Gupta; Santu Rana; Vu Nguyen; Svetha Venkatesh; David M. Ashley; Trish M. Livingston

Adverse effects, such as voice change and fatigue, are prevalent in cancer treatment duration. These adverse effects have been significant burden for patients physically and emotionally. Predicting multiple adverse effects becomes important for patients and oncologists. In this paper, we formulate the prediction of multiple adverse effects in cancer treatment as a longitudinal multiple-output regression problem. The correlated multiple outputs are first decoupled to uncorrelated ones in a new output space. We then propose a comprehensive framework to capture the empirical loss between the predicted value and the ground truth in the transformed space and the temporal smoothness at neighboring prediction points. Experiments were performed on one synthetic data and two real-world datasets including radiotherapy and chemotherapy treatments. Results in terms of root mean square errors (RMSE) and R-value show that our proposed approach is promising for the longitudinal multiple-output regression problem.

pacific-asia conference on knowledge discovery and data mining | 2015

Small-Variance Asymptotics for Bayesian Nonparametric Models with Constraints

Cheng Li; Santu Rana; Dinh Q. Phung; Svetha Venkatesh

The users often have additional knowledge when Bayesian nonparametric models (BNP) are employed, e.g. for clustering there may be prior knowledge that some of the data instances should be in the same cluster (must-link constraint) or in different clusters (cannot-link constraint), and similarly for topic modeling some words should be grouped together or separately because of an underlying semantic. This can be achieved by imposing appropriate sampling probabilities based on such constraints. However, the traditional inference technique of BNP models via Gibbs sampling is time consuming and is not scalable for large data. Variational approximations are faster but many times they do not offer good solutions. Addressing this we present a small-variance asymptotic analysis of the MAP estimates of BNP models with constraints. We derive the objective function for Dirichlet process mixture model with constraints and devise a simple and efficient K-means type algorithm. We further extend the small-variance analysis to hierarchical BNP models with constraints and devise a similar simple objective function. Experiments on synthetic and real data sets demonstrate the efficiency and effectiveness of our algorithms.

Explore More