Publication


Featured research published by Lyle H. Ungar.


International ACM SIGIR Conference on Research and Development in Information Retrieval | 2002

Methods and metrics for cold-start recommendations

Andrew I. Schein; Alexandrin Popescul; Lyle H. Ungar; David M. Pennock

We have developed a method for recommending items that combines content and collaborative data under a single probabilistic framework. We benchmark our algorithm against a naïve Bayes classifier on the cold-start problem, where we wish to recommend items that no one in the community has yet rated. We systematically explore three testing methodologies using a publicly available data set, and explain how these methods apply to specific real-world applications. We advocate heuristic recommenders when benchmarking to give competent baseline performance. We introduce a new performance metric, the CROC curve, and demonstrate empirically that the various components of our testing strategy combine to obtain deeper understanding of the performance characteristics of recommender systems. Though the emphasis of our testing is on cold-start recommending, our methods for recommending and evaluation are general.


Knowledge Discovery and Data Mining | 2000

Efficient clustering of high-dimensional data sets with application to reference matching

Andrew McCallum; Kamal Nigam; Lyle H. Ungar

Many important problems involve clustering large datasets. Although naive implementations of clustering are computationally expensive, there are established efficient techniques for clustering when the dataset has either (1) a limited number of clusters, (2) a low feature dimensionality, or (3) a small number of data points. However, there has been much less work on methods of efficiently clustering datasets that are large in all three ways at once: for example, having millions of data points that exist in many thousands of dimensions representing many thousands of clusters. We present a new technique for clustering these large, high-dimensional datasets. The key idea involves using a cheap, approximate distance measure to efficiently divide the data into overlapping subsets we call canopies. Then clustering is performed by measuring exact distances only between points that occur in a common canopy. Using canopies, large clustering problems that were formerly impossible become practical. Under reasonable assumptions about the cheap distance metric, this reduction in computational cost comes without any loss in clustering accuracy. Canopies can be applied to many domains and used with a variety of clustering approaches, including Greedy Agglomerative Clustering, K-means and Expectation-Maximization. We present experimental results on grouping bibliographic citations from the reference sections of research papers. Here the canopy approach reduces computation time over a traditional clustering approach by more than an order of magnitude and decreases error in comparison to a previously used algorithm by 25%.
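The canopy idea in this abstract can be illustrated with a minimal sketch. The abstract does not give thresholds or a concrete cheap distance, so the names `loose` and `tight`, the random choice of canopy centers, and the one-dimensional absolute difference used in the usage note are all assumptions made for illustration, not the authors' implementation.

```python
import random
from itertools import combinations


def build_canopies(points, cheap_distance, loose, tight, seed=0):
    """Group point indices into overlapping canopies using a cheap distance.

    Assumed scheme: pick a remaining point as a center, put every point within
    `loose` of it into a canopy, then drop every point within `tight` of it
    from the list of candidate centers. Requires tight <= loose.
    """
    if tight > loose:
        raise ValueError("tight threshold must not exceed loose threshold")
    rng = random.Random(seed)
    remaining = set(range(len(points)))
    canopies = []
    while remaining:
        center = rng.choice(sorted(remaining))
        distances = [cheap_distance(points[center], p) for p in points]
        # Canopy membership uses the loose threshold, so canopies may overlap.
        canopies.append([i for i, d in enumerate(distances) if d <= loose])
        # The center itself is at distance 0, so the loop always makes progress.
        remaining -= {i for i, d in enumerate(distances) if d <= tight}
    return canopies


def candidate_pairs(canopies):
    """Return the index pairs that share a canopy.

    Only these pairs would need the expensive, exact distance computation.
    """
    pairs = set()
    for members in canopies:
        pairs.update(combinations(sorted(members), 2))
    return pairs
```

As a usage example, calling `build_canopies([0.0, 0.1, 0.2, 5.0, 5.1, 9.9], lambda a, b: abs(a - b), loose=1.0, tight=0.5)` groups the six values into three canopies, and `candidate_pairs` then yields 4 pairs for exact comparison instead of all 15.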


IEEE Transactions on Neural Networks | 1992

Using radial basis functions to approximate a function and its error bounds

James A. Leonard; Mark A. Kramer; Lyle H. Ungar

A novel network called the validity index network (VI net) is presented. The VI net, derived from radial basis function networks, fits functions and calculates confidence intervals for its predictions, indicating local regions of poor fit and extrapolation.


Neurology | 2006

Identification of potential CSF biomarkers in ALS

Giulio Maria Pasinetti; Lyle H. Ungar; Dale J. Lange; S. Yemul; H. Deng; X. Yuan; Robert H. Brown; Merit Cudkowicz; Kristyn Newhall; Elaine R. Peskind; S. Marcus; Lap Ho

Background: The clinical diagnosis of ALS is based entirely on clinical features. Identification of biomarkers for ALS would be important for diagnosis and might also provide clues to pathogenesis. Objective: To determine if there is a specific protein profile in the CSF that distinguishes patients with ALS from those with purely motor peripheral neuropathy (PN) and healthy control subjects. Methods: CSF obtained from patients with ALS, disease controls (patients with other neurologic disorders), and normal controls were analyzed using the surface-enhanced laser desorption/ionization time-of-flight mass spectrometry proteomics technique. Biomarker sensitivity and specificity was calculated with receiver operating characteristic curve methodology. ALS biomarkers were purified and sequence identified by mass spectrometry–directed peptide sequencing. Results: In initial proteomic discovery studies, three protein species (4.8-, 6.7-, and 13.4-kDa) that were significantly lower in concentration in the CSF from patients with ALS (n = 36) than in normal controls (n = 21) were identified. A combination of three protein species (the “three-protein” model) correctly identified patients with ALS with 95% accuracy, 91% sensitivity, and 97% specificity from the controls. Independent validation studies using separate cohorts of ALS (n = 13), healthy control (n = 25), and PN (n = 7) subjects confirmed the ability of the three CSF protein species to separate patients with ALS from other diseases. Protein sequence analysis identified the 13.4-kDa protein species as cystatin C and the 4.8-kDa protein species as a peptic fragment of the neurosecretory protein VGF. Conclusion: Additional application of a “three-protein” biomarker model to current diagnostic criteria may provide an objective biomarker pattern to help identify patients with ALS.
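For readers unfamiliar with the reported measures, the following sketch shows how accuracy, sensitivity, and specificity follow from a confusion matrix. The function name is hypothetical and the counts in the usage note are illustrative only; they are not taken from the study.

```python
def diagnostic_metrics(true_positive, false_negative, true_negative, false_positive):
    """Compute standard diagnostic measures from confusion-matrix counts."""
    total = true_positive + false_negative + true_negative + false_positive
    return {
        # Fraction of diseased subjects the test flags correctly.
        "sensitivity": true_positive / (true_positive + false_negative),
        # Fraction of control subjects the test clears correctly.
        "specificity": true_negative / (true_negative + false_positive),
        # Fraction of all subjects classified correctly.
        "accuracy": (true_positive + true_negative) / total,
    }
```

With hypothetical counts of 9 true positives, 1 false negative, 18 true negatives, and 2 false positives, all three measures come out to 0.9.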


Computers & Chemical Engineering | 1990

Adaptive networks for fault diagnosis and process control

Lyle H. Ungar; B.A. Powell; S.N. Kamens

The use of adaptive (artificial neural) networks for fault diagnosis and process control is explored. Adaptive networks can be used as fault recognition systems, as adaptive nonlinear process models, and as controllers. Connection strengths representing correlations between inputs (alarms and sensor measurements) and outputs (faults, future sensor measurements or control actions) are learned using the LMS (Widrow-Hoff) rule and the backpropagation algorithm. The resulting system is a pattern recognizer which is able to learn nonlinear and logical relationships as well as linear correlations. Results are presented for two problems: diagnosing failures in a small model chemical plant, and controlling a highly nonlinear bioreactor. For the diagnosis problem, learning in adaptive networks given qualitative (alarm) and quantitative (sensor) data are compared, and the effect of noise is studied. The importance of nonlinear networks is demonstrated using simple problems which require context sensitivity and problems where optimal alarm thresholds are learned. Results using an adaptive model-based controller using two neural networks (one for the model and one for the controller) are presented and extensions to the standard layered feedforward network are suggested which greatly increase the utility of neural networks for process control. Adaptive networks provide pattern recognition facilities which can be used and interpreted in several ways: they perform multiple nonlinear regression on input/output pairs (current sensor readings and alarms are associated with faults, future readings or control actions) that may be both quantitative and qualitative. Although adaptive networks have many similarities with well-established statistical techniques for system identification, they still offer promise of major benefit, primarily in suggesting new equations, architectures and algorithms. 
Although the adaptive networks can be viewed as a special form of nonlinear regression, the network formalism suggests several powerful nonlinear functional forms to use. Use of highly interconnected nonlinear systems allows unexpected interactions to be captured. Recurrent networks can learn to recognize arbitrary delays. Temporal difference methods can speed learning when feedback is not immediate. Artificial neural networks will not in any way replace control algorithms, but rather are good for learning that which we are ignorant of: alarm thresholds, patterns of disturbances and model-mismatch (including process delays). Adaptive networks can be thought of as a very data-intensive approach to system identification: many parameters are used in a format that allows interactions between all of the variables. Thus more specific patterns can be learned than when the system is described using a relatively simple equation with a small number of parameters. When the exact form of the equation is known or little data is available, an equation with fewer parameters is of course preferable and neural networks should be avoided. We looked at two example problems: fault diagnosis and adaptive model-based control. Both quantitative (sensor) and qualitative (alarm) information can be used in fault diagnosis. Optimal thresholds for triggering alarms are learned; these can be dependent on the context provided by the states of other variables. Nonlinear networks are required for all but the simplest problems. In adaptive model-based control, a highly nonlinear model was learned without using a priori knowledge of the equational forms. This approach is expected to yield the most benefit in MIMO systems which contain complex nonlinear interactions and in systems in which recurring disturbances must be recognized and forecast.
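The abstract names the LMS (Widrow-Hoff) rule as one of the learning rules used. As a minimal sketch, the rule for a single linear unit updates each weight in proportion to the prediction error times the corresponding input. The function name, learning rate, epoch count, and toy data below are assumptions for illustration; the paper's networks, plant model, and bioreactor problem are not reproduced here.

```python
def lms_fit(inputs, targets, learning_rate=0.05, epochs=500):
    """Fit a single linear unit with the LMS (Widrow-Hoff) rule.

    After each example the weights move by
    learning_rate * (target - prediction) * input.
    """
    weights = [0.0] * len(inputs[0])
    for _ in range(epochs):
        for x, target in zip(inputs, targets):
            prediction = sum(w * xi for w, xi in zip(weights, x))
            error = target - prediction
            weights = [w + learning_rate * error * xi for w, xi in zip(weights, x)]
    return weights


# Toy data (hypothetical): target = 1 + 2 * reading, with a constant 1.0 as bias input.
readings = [0.0, 1.0, 2.0, 3.0]
toy_inputs = [[1.0, r] for r in readings]
toy_targets = [1.0 + 2.0 * r for r in readings]
learned = lms_fit(toy_inputs, toy_targets)
```

On this noise-free toy problem the learned weights approach the bias 1 and slope 2. A single linear unit like this captures only linear correlations, which is consistent with the abstract's point that nonlinear networks are needed for all but the simplest problems.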


Psychological Science | 2015

Psychological Language on Twitter Predicts County-Level Heart Disease Mortality

Johannes C. Eichstaedt; Hansen Andrew Schwartz; Margaret L. Kern; Gregory Park; Darwin R. Labarthe; Raina M. Merchant; Sneha Jha; Megha Agrawal; Lukasz Dziurzynski; Maarten Sap; Christopher Weeg; Emily E. Larson; Lyle H. Ungar; Martin E. P. Seligman

Hostility and chronic stress are known risk factors for heart disease, but they are costly to assess on a large scale. We used language expressed on Twitter to characterize community-level psychological correlates of age-adjusted mortality from atherosclerotic heart disease (AHD). Language patterns reflecting negative social relationships, disengagement, and negative emotions—especially anger—emerged as risk factors; positive emotions and psychological engagement emerged as protective factors. Most correlations remained significant after controlling for income and education. A cross-sectional regression model based only on Twitter language predicted AHD mortality significantly better than did a model that combined 10 common demographic, socioeconomic, and health risk factors, including smoking, diabetes, hypertension, and obesity. Capturing community psychological characteristics through social media is feasible, and these characteristics are strong markers of cardiovascular mortality at the community level.


Machine Learning | 2007

Active learning for logistic regression: an evaluation

Andrew I. Schein; Lyle H. Ungar

Which active learning methods can we expect to yield good performance in learning binary and multi-category logistic regression classifiers? Addressing this question is a natural first step in providing robust solutions for active learning across a wide variety of exponential models including maximum entropy, generalized linear, log-linear, and conditional random field models. For the logistic regression model we re-derive the variance reduction method known in experimental design circles as ‘A-optimality.’ We then run comparisons against different variations of the most widely used heuristic schemes: query by committee and uncertainty sampling, to discover which methods work best for different classes of problems and why. We find that among the strategies tested, the experimental design methods are most likely to match or beat a random sample baseline. The heuristic alternatives produced mixed results, with an uncertainty sampling variant called margin sampling and a derivative method called QBB-MM providing the most promising performance at very low computational cost. Computational running times of the experimental design methods were a bottleneck to the evaluations. Meanwhile, evaluation of the heuristic methods led to an accumulation of negative results. We explore alternative evaluation design parameters to test whether these negative results are merely an artifact of settings where experimental design methods can be applied. The results demonstrate a need for improved active learning methods that will provide reliable performance at a reasonable computational cost.
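Margin sampling, the uncertainty sampling variant mentioned in this abstract, is commonly described as querying the unlabeled example whose two most probable classes are closest in predicted probability. The sketch below follows that common description. The function names and the probability values in the usage note are hypothetical, and a trained classifier is assumed to have already produced the class probabilities.

```python
def margin(class_probabilities):
    """Gap between the two most probable classes for one example."""
    top, second = sorted(class_probabilities, reverse=True)[:2]
    return top - second


def margin_sampling(pool_probabilities):
    """Return the index of the pool example with the smallest margin.

    `pool_probabilities` holds one list of class probabilities per
    unlabeled example, as produced by some already-trained classifier.
    """
    return min(range(len(pool_probabilities)),
               key=lambda i: margin(pool_probabilities[i]))
```

For a hypothetical pool `[[0.9, 0.05, 0.05], [0.4, 0.35, 0.25], [0.5, 0.1, 0.4]]`, the second example has the smallest margin between its top two classes, so it would be selected for labeling first.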


Journal of Personality and Social Psychology | 2015

Automatic personality assessment through social media language

Gregory Park; H. Andrew Schwartz; Johannes C. Eichstaedt; Margaret L. Kern; Michal Kosinski; David Stillwell; Lyle H. Ungar; Martin E. P. Seligman

Language use is a psychologically rich, stable individual difference with well-established correlations to personality. We describe a method for assessing personality using an open-vocabulary analysis of language from social media. We compiled the written language from 66,732 Facebook users and their questionnaire-based self-reported Big Five personality traits, and then we built a predictive model of personality based on their language. We used this model to predict the 5 personality factors in a separate sample of 4,824 Facebook users, examining (a) convergence with self-reports of personality at the domain- and facet-level; (b) discriminant validity between predictions of distinct traits; (c) agreement with informant reports of personality; (d) patterns of correlations with external criteria (e.g., number of friends, political attitudes, impulsiveness); and (e) test-retest reliability over 6-month intervals. Results indicated that language-based assessments can constitute valid personality measures: they agreed with self-reports and informant reports of personality, added incremental validity over informant reports, adequately discriminated between traits, exhibited patterns of correlations with external criteria similar to those found with self-reported personality, and were stable over 6-month intervals. Analysis of predictive language can provide rich portraits of the mental life associated with traits. This approach can complement and extend traditional methods, providing researchers with an additional measure that can quickly and cheaply assess large groups of participants with minimal burden.


Technometrics | 1998

Prediction intervals for neural networks via nonlinear regression

Richard D. De Veaux; Jason Schweinsberg; Jennifer Schumi; Lyle H. Ungar

Standard methods for computing prediction intervals in nonlinear regression can be effectively applied to neural networks when the number of training points is large. Simulations show, however, that these methods can generate unreliable prediction intervals on smaller datasets when the network is trained to convergence. Stopping the training algorithm prior to convergence, to avoid overfitting, reduces the effective number of parameters but can lead to prediction intervals that are too wide. We present an alternative approach to estimating prediction intervals using weight decay to fit the network and show via a simulation study that this method may be effective in overcoming some of the shortcomings of the other approaches.


Computers & Chemical Engineering | 1993

A comparison of two nonparametric estimation schemes: MARS and neural networks

R.D. De Veaux; Dimitris C. Psichogios; Lyle H. Ungar

The most popular form of artificial neural networks, feedforward networks with sigmoidal activation functions, and a new statistical technique, multivariate adaptive regression splines (MARS), are compared in terms of both their accuracy in learning different types of functions and their speed. Test problems that have been used for demonstrating the efficacy of each method are used to compare the two methods. Both methods can be classified as nonlinear, nonparametric function estimation techniques, and both show great promise for fitting general nonlinear multivariate functions. We find that MARS is in most cases both more accurate and much faster than neural networks. In addition, MARS is more interpretable due to the choice of basis functions which make up the final predictive model. This suggests that MARS could be used on many of the applications where neural networks are currently being used.

Collaboration


Top Co-Authors

Dean P. Foster

University of Pennsylvania

Gregory Park

University of Pennsylvania

Maarten Sap

University of Pennsylvania

Andrew I. Schein

University of Pennsylvania
