Yacine Jernite | Researchain

Archive Network Publication Hotspot Collaboration

Network

Latest external collaboration on country level. Dive into details by clicking on the dots.

Explore More

Hotspot

Dive into the research topics where Yacine Jernite is active.

Explore More

Publication

Featured researches published by Yacine Jernite.

The Annals of Applied Statistics | 2014

The random subgraph model for the analysis of an ecclesiastical network in Merovingian Gaul

Yacine Jernite; Pierre Latouche; Charles Bouveyron; Patrick Rivera; Laurent Jegou; Stéphane Lamassé

In the last two decades, many random graph models have been proposed to extract knowledge from networks. Most of them look for communities or more generally clusters of vertices with homogeneous connection profiles. While the first models focused on networks with binary edges only, extensions now allow to deal with valued networks. Recently, new models were also introduced in order to characterize connection patterns in networks through mixed memberships. This work was motivated by the need of analyzing a historical network where a partition of the vertices is given and where edges are typed. A known partition is seen as a decomposition of a network into subgraphs that we propose to model using a stochastic model with unknown latent clusters. Each subgraph has its own mixing vector and sees its vertices associated to the clusters. The vertices then connect with a probability depending on the subgraphs only, while the types of the edges are assumed to be sampled from the latent clusters. A variational Bayes expectation-maximization algorithm is proposed for inference as well as a model selection criterion for the estimation of the cluster number. Experiments are carried out on simulated data to assess the approach. The proposed methodology is then applied to an ecclesiastical network in merovingian Gaul. An R package, called Rambo, implementing the inference algorithm is available on the CRAN.

PLOS ONE | 2017

Creating an automated trigger for sepsis clinical decision support at emergency department triage using machine learning

Steven Horng; David Sontag; Yoni Halpern; Yacine Jernite; Nathan I. Shapiro; Larry A. Nathanson

Objective To demonstrate the incremental benefit of using free text data in addition to vital sign and demographic data to identify patients with suspected infection in the emergency department. Methods This was a retrospective, observational cohort study performed at a tertiary academic teaching hospital. All consecutive ED patient visits between 12/17/08 and 2/17/13 were included. No patients were excluded. The primary outcome measure was infection diagnosed in the emergency department defined as a patient having an infection related ED ICD-9-CM discharge diagnosis. Patients were randomly allocated to train (64%), validate (20%), and test (16%) data sets. After preprocessing the free text using bigram and negation detection, we built four models to predict infection, incrementally adding vital signs, chief complaint, and free text nursing assessment. We used two different methods to represent free text: a bag of words model and a topic model. We then used a support vector machine to build the prediction model. We calculated the area under the receiver operating characteristic curve to compare the discriminatory power of each model. Results A total of 230,936 patient visits were included in the study. Approximately 14% of patients had the primary outcome of diagnosed infection. The area under the ROC curve (AUC) for the vitals model, which used only vital signs and demographic data, was 0.67 for the training data set, 0.67 for the validation data set, and 0.67 (95% CI 0.65–0.69) for the test data set. The AUC for the chief complaint model which also included demographic and vital sign data was 0.84 for the training data set, 0.83 for the validation data set, and 0.83 (95% CI 0.81–0.84) for the test data set. The best performing methods made use of all of the free text. In particular, the AUC for the bag-of-words model was 0.89 for training data set, 0.86 for the validation data set, and 0.86 (95% CI 0.85–0.87) for the test data set. The AUC for the topic model was 0.86 for the training data set, 0.86 for the validation data set, and 0.85 (95% CI 0.84–0.86) for the test data set. Conclusion Compared to previous work that only used structured data such as vital signs and demographic information, utilizing free text drastically improves the discriminatory ability (increase in AUC from 0.67 to 0.86) of identifying infection.

bioRxiv | 2017

Contextual Autocomplete: A Novel User Interface Using Machine Learning to Improve Ontology Usage and Structured Data Capture for Presenting Problems in the Emergency Department

Nathaniel R Greenbaum; Yacine Jernite; Yoni Halpern; Shelley Calder; Larry A. Nathanson; David Sontag; Steven Horng

Objective To determine the effect of contextual autocomplete, a user interface that uses machine learning, on the efficiency and quality of documentation of presenting problems (chief complaints) in the emergency department (ED). Materials and Methods We used contextual autocomplete, a user interface that ranks concepts by their predicted probability, to help nurses enter data about a patient’s reason for visiting the ED. Predicted probabilities were calculated using a previously derived model based on triage vital signs and a brief free text note. We evaluated the percentage and quality of structured data captured using a prospective before-and-after study design. Results A total of 279,231 patient encounters were analyzed. Structured data capture improved from 26.2% to 97.2% (p<0.0001). During the post-implementation period, presenting problems were more complete (3.35 vs 3.66; p=0.0004), as precise (3.59 vs. 3.74; p=0.1), and higher in overall quality (3.38 vs. 3.72; p=0.0002). Our system reduced the mean number of keystrokes required to document a presenting problem from 11.6 to 0.6 (p<0.0001), a 95% improvement. Discussion We have demonstrated a technique that captures structured data on nearly all patients. We estimate that our system reduces the number of man-hours required annually to type presenting problems at our institution from 92.5 hours to 4.8 hours. Conclusion Implementation of a contextual autocomplete system resulted in improved structured data capture, ontology usage compliance, and data quality.

national conference on artificial intelligence | 2016