Minmin Chen
Washington University in St. Louis
Publications
Featured research published by Minmin Chen.
conference on information and knowledge management | 2012
Zhixiang Eddie Xu; Minmin Chen; Kilian Q. Weinberger; Fei Sha
In text mining, information retrieval, and machine learning, text documents are commonly represented through variants of sparse Bag of Words (sBoW) vectors (e.g. TF-IDF [1]). Although simple and intuitive, sBoW-style representations suffer from their inherent over-sparsity and fail to capture word-level synonymy and polysemy. Especially when labeled data is limited (e.g. in document classification), or the text documents are short (e.g. emails or abstracts), many features are rarely observed within the training corpus. This leads to overfitting and reduced generalization accuracy. In this paper we propose Dense Cohort of Terms (dCoT), an unsupervised algorithm to learn improved sBoW document features. dCoT explicitly models absent words by removing and reconstructing random subsets of words in the unlabeled corpus. With this approach, dCoT learns to reconstruct frequent words from co-occurring infrequent words and maps the high-dimensional sparse sBoW vectors into a low-dimensional dense representation. We show that the feature removal can be marginalized out and that the reconstruction can be solved in closed form. We demonstrate empirically, on several benchmark datasets, that dCoT features significantly improve the classification accuracy across several document classification tasks.
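The idea of marginalizing out feature removal and solving the reconstruction in closed form can be sketched as follows. This is a minimal illustration in the style of marginalized denoising, not the authors' exact formulation: the names `expected_moments`, `reconstruction_map`, `encode`, the removal probability `p`, the ridge term `reg`, and the toy corpus are all assumptions, and any nonlinearity or stacking is omitted.

```python
def solve(A, B):
    """Solve A X = B by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    M = [list(A[i]) + list(B[i]) for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        pv = M[col][col]
        M[col] = [v / pv for v in M[col]]
        for r in range(n):
            if r != col:
                factor = M[r][col]
                M[r] = [a - factor * b for a, b in zip(M[r], M[col])]
    return [row[n:] for row in M]


def expected_moments(docs, targets, p):
    """Expected moments when each term is removed independently with prob. p.

    Q[i][j] sums E[x~_i x~_j] over documents (x~ is the corrupted vector);
    P[r][j] sums E[x_t x~_j] for each target term t = targets[r].
    No corrupted copies are ever sampled: the expectation is exact.
    """
    d = len(docs[0])
    keep = 1.0 - p
    Q = [[0.0] * d for _ in range(d)]
    P = [[0.0] * d for _ in targets]
    for x in docs:
        for i in range(d):
            for j in range(d):
                # A term survives with prob. keep; two distinct terms both
                # survive with prob. keep**2.
                scale = keep if i == j else keep * keep
                Q[i][j] += scale * x[i] * x[j]
        for r, t in enumerate(targets):
            for j in range(d):
                P[r][j] += keep * x[t] * x[j]
    return P, Q


def reconstruction_map(docs, targets, p, reg=1e-6):
    """Closed-form least-squares map W = P (Q + reg*I)^-1 (assumed form)."""
    P, Q = expected_moments(docs, targets, p)
    d = len(Q)
    for i in range(d):
        Q[i][i] += reg  # small ridge term keeps the system well conditioned
    # Q is symmetric, so W Q = P is equivalent to Q W^T = P^T.
    Pt = [[P[r][j] for r in range(len(targets))] for j in range(d)]
    Wt = solve(Q, Pt)
    return [[Wt[j][r] for j in range(d)] for r in range(len(targets))]


def encode(W, x):
    """Map a sparse term vector to the dense, lower-dimensional output."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]


# Toy corpus: 4 documents over 3 terms; reconstruct terms 0 and 1 only.
docs = [[1, 1, 0], [0, 1, 1], [1, 0, 1], [2, 0, 0]]
W = reconstruction_map(docs, targets=[0, 1], p=0.5)
dense = encode(W, docs[0])
```

Restricting `targets` to a subset of terms is what makes the output lower-dimensional than the input; which terms to reconstruct is a modeling choice left open here.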
international conference on data mining | 2011
Yi Mao; Yixin Chen; Gregory Hackmann; Minmin Chen; Chenyang Lu; Marin H. Kollef; Thomas C. Bailey
Data mining on medical data has great potential to improve the treatment quality of hospitals and increase the survival rate of patients. Every year, 4-17% of patients undergo cardiopulmonary or respiratory arrest while in hospitals. Early prediction techniques have become an apparent need in many clinical areas. Clinical study has found early detection and intervention to be essential for preventing clinical deterioration in patients at general hospital units. In this paper, based on data mining technology, we propose an early warning system (EWS) designed to identify the signs of clinical deterioration and provide early warning for serious clinical events. Our EWS is designed to provide reliable early alarms for patients at the general hospital wards (GHWs). EWS automatically identifies patients at risk of clinical deterioration based on their existing electronic medical record. The main task of EWS is a challenging classification problem on high-dimensional stream data with irregular, multi-scale data gaps, measurement errors, outliers, and class imbalance. In this paper, we propose a novel data mining framework for analyzing such medical data streams. The framework addresses the above challenges and represents a practical approach for early prediction and prevention based on data that would realistically be available at GHWs. We assess the feasibility of the proposed EWS approach through a retrospective study that includes data from 28,927 visits at a major hospital. Finally, we apply our system in a real-time clinical trial and obtain promising results. This project is an example of multidisciplinary cyber-physical systems involving researchers in clinical science, data mining, and nursing staff in the hospital. Our early warning algorithm shows promising results: the transfer of patients to ICU was predicted with sensitivity of 0.4127 and specificity of 0.950 in the real-time system.
american medical informatics association annual symposium | 2011
Gregory Hackmann; Minmin Chen; Octav Chipara; Chenyang Lu; Yixin Chen; Marin H. Kollef; Thomas C. Bailey
Computational Optimization and Applications | 2010
Yixin Chen; Minmin Chen
International Journal of Knowledge Discovery in Bioinformatics | 2011
Yi Mao; Yixin Chen; Gregory Hackmann; Minmin Chen; Chenyang Lu; Marin H. Kollef; Thomas C. Bailey
conference on information and knowledge management | 2011
Minmin Chen; Jian-Tao Sun; Xiaochuan Ni; Yixin Chen
knowledge discovery and data mining | 2009
Minmin Chen; Yixin Chen; Michael R. Brent; Aaron E. Tenney
Clinical study has found early detection and intervention to be essential for preventing clinical deterioration in patients at general hospital units. In this paper, we envision a two-tiered early warning system designed to identify the signs of clinical deterioration and provide early warning of serious clinical events. The first tier of the system automatically identifies patients at risk of clinical deterioration from existing electronic medical record databases. The second tier performs real-time clinical event detection based on real-time vital sign data collected from on-body wireless sensors attached to those high-risk patients. We employ machine-learning techniques to analyze data from both tiers, assigning scores to patients in real time. The assigned scores can then be used to trigger early-intervention alerts. A preliminary study of an early warning system component and a wireless clinical monitoring system component demonstrates the feasibility of this two-tiered approach.
international conference on tools with artificial intelligence | 2009
Minmin Chen; Yixin Chen; Michael R. Brent; Aaron E. Tenney
Duality is an important notion for nonlinear programming (NLP). It provides a theoretical foundation for many optimization algorithms. Duality can be used to directly solve NLPs as well as to derive lower bounds on solution quality, which are widely used in other high-level search techniques such as branch and bound. However, conventional duality theory has the fundamental limitation that it leads to duality gaps for nonconvex problems, including discrete and mixed-integer problems where the feasible sets are generally nonconvex. In this paper, we propose an extended duality theory for nonlinear optimization in order to overcome some limitations of previous dual methods. Based on a new dual function, the extended duality theory leads to zero duality gap for general nonconvex problems defined in discrete, continuous, and mixed spaces under mild conditions. Compared to recent developments in nonlinear Lagrangian functions and exact penalty functions, the proposed theory always requires a smaller penalty to achieve zero duality gap. This is very desirable as the lower function value leads to smoother search terrains and alleviates the ill conditioning of dual optimization. Based on the extended duality theory, we develop a general search framework for global optimization. Experimental results on engineering benchmarks and a sensor-network optimization application show that our algorithm achieves better performance than searches based on conventional duality and Lagrangian theory.
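To make the notion of a duality gap concrete, the following toy example compares the conventional Lagrangian dual bound with a penalized dual on a three-point discrete problem. The abstract does not state the paper's dual function, so the absolute-value penalty below is only a familiar exact-penalty stand-in chosen for illustration; the problem, the function names, and the search grids are all assumptions.

```python
def f(x):
    """Toy objective on the discrete domain {0, 1, 2}."""
    return 2.0 if x == 1 else 0.0


def h(x):
    """Equality constraint h(x) = 0, satisfied only by x = 1."""
    return x - 1


def lagrangian_dual(domain, multipliers):
    """max over lam of min over x of f(x) + lam * h(x)."""
    return max(min(f(x) + lam * h(x) for x in domain) for lam in multipliers)


def abs_penalty_dual(domain, penalties):
    """max over a >= 0 of min over x of f(x) + a * |h(x)| (illustrative)."""
    return max(min(f(x) + a * abs(h(x)) for x in domain) for a in penalties)


domain = [0, 1, 2]
multipliers = [k / 10 for k in range(-50, 51)]  # lam in [-5, 5]
penalties = [k / 10 for k in range(0, 51)]      # a in [0, 5]

# Primal optimum: best objective among feasible points.
primal = min(f(x) for x in domain if h(x) == 0)

dual = lagrangian_dual(domain, multipliers)
penalized = abs_penalty_dual(domain, penalties)

print("primal optimum:        ", primal)
print("Lagrangian dual bound: ", dual)
print("duality gap:           ", primal - dual)
print("penalized dual bound:  ", penalized)
```

In this example the linear Lagrangian term can always be driven negative at one of the infeasible points, so the dual bound stays strictly below the primal optimum, whereas the absolute-value term penalizes both infeasible points once the penalty is large enough.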
international conference on machine learning | 2012
Minmin Chen; Zhixiang Xu; Fei Sha; Kilian Q. Weinberger
Data mining on medical data has great potential to improve the treatment quality of hospitals and increase the survival rate of patients. Every year, 4-17% of patients undergo cardiopulmonary or respiratory arrest while in hospitals. Clinical study has found early detection and intervention to be essential for preventing clinical deterioration in patients at general hospital units. This paper proposes an early warning system (EWS) designed to identify the signs of clinical deterioration and provide early warning for serious clinical events. The EWS is designed to provide reliable early alarms for patients at the general hospital wards (GHWs). The main task of EWS is a challenging classification problem on high-dimensional stream data with irregular, multi-scale data gaps, measurement errors, outliers, and class imbalance. This paper proposes a novel data mining framework for analyzing such medical data streams. The authors assess the feasibility of the proposed EWS approach through a retrospective study that includes data from 41,503 visits at a major hospital. Finally, the system is applied in a clinical trial at a major hospital and obtains promising results. This project is an example of multidisciplinary cyber-physical systems involving researchers in clinical science, data mining, and nursing staff.
neural information processing systems | 2011
Minmin Chen; Kilian Q. Weinberger; John Blitzer
Topical classification of user queries is critical for general-purpose web search systems. It is also a challenging task, due to the sparsity of query terms and the lack of labeled queries. On the other hand, search contexts embedded in query sessions and unlabeled queries freely available on the web have not been fully utilized in most query classification systems. In this work, we leverage this information to improve query classification accuracy. We first incorporate search contexts into our framework using a Conditional Random Field (CRF) model. Discriminative training of CRFs is favored over the traditional maximum likelihood training because of its robustness to noise. We then adapt self-training with our model to exploit the information in unlabeled queries. By investigating different confidence measurements and model selection strategies, we effectively avoid the error-reinforcing nature of self-training. In extensive experiments on real search logs, we achieve an average improvement of around 20% in classification accuracy over other state-of-the-art baselines.
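The self-training step described above can be sketched generically: train on labeled data, pseudo-label only the unlabeled items the model is confident about, and repeat. The sketch below is not the paper's system. It swaps the CRF for a one-dimensional nearest-centroid classifier to stay short, and the margin-based confidence score, the `threshold` value, and all function names are assumptions made for illustration.

```python
def fit_centroids(labeled):
    """Fit a nearest-centroid model: mean feature value per label."""
    sums = {}
    for x, y in labeled:
        total, count = sums.get(y, (0.0, 0))
        sums[y] = (total + x, count + 1)
    return {y: total / count for y, (total, count) in sums.items()}


def predict(centroids, x):
    """Return (label, confidence); needs at least two classes.

    Confidence is the normalized margin between the two nearest
    centroids: 0 when x is equidistant, 1 when x sits on a centroid.
    """
    ranked = sorted(centroids, key=lambda y: abs(x - centroids[y]))
    d1 = abs(x - centroids[ranked[0]])
    d2 = abs(x - centroids[ranked[1]])
    confidence = (d2 - d1) / (d1 + d2) if d1 + d2 > 0 else 0.0
    return ranked[0], confidence


def self_train(labeled, unlabeled, threshold=0.6, max_rounds=10):
    """Iteratively add confidently pseudo-labeled items to the training set."""
    labeled = list(labeled)
    pool = list(unlabeled)
    for _ in range(max_rounds):
        centroids = fit_centroids(labeled)
        confident, uncertain = [], []
        for x in pool:
            label, confidence = predict(centroids, x)
            if confidence >= threshold:
                confident.append((x, label))
            else:
                uncertain.append(x)
        if not confident:
            break  # nothing passes the threshold; stop instead of guessing
        labeled.extend(confident)
        pool = uncertain
    return fit_centroids(labeled), pool


model, leftover = self_train(
    labeled=[(0.0, "a"), (10.0, "b")],
    unlabeled=[1.0, 9.0, 5.0],
)
```

Leaving low-confidence items in the pool, rather than forcing a label on them, is the simplest guard against the error reinforcement the abstract mentions: a wrong pseudo-label would otherwise shift the model and attract further mistakes.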
