Publication


Featured research published by Emma Pierson.


Nature Methods | 2017

Visualization and analysis of single-cell RNA-seq data by kernel-based similarity learning

Bo Wang; Junjie Zhu; Emma Pierson; Daniele Ramazzotti; Serafim Batzoglou

We present single-cell interpretation via multikernel learning (SIMLR), an analytic framework and software which learns a similarity measure from single-cell RNA-seq data in order to perform dimension reduction, clustering and visualization. On seven published data sets, we benchmark SIMLR against state-of-the-art methods. We show that SIMLR is scalable and greatly enhances clustering performance while improving the visualization and interpretability of single-cell sequencing data.
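The multikernel idea in this abstract can be illustrated with a minimal sketch: several Gaussian kernels with different bandwidths are combined into one cell-to-cell similarity matrix. This is not the SIMLR implementation. SIMLR learns the similarity from the data, whereas the kernel weights here are fixed by hand, and the function names, bandwidths, and toy expression vectors are all hypothetical.

```python
import math
from typing import List, Sequence


def gaussian_kernel(x: Sequence[float], y: Sequence[float], sigma: float) -> float:
    """Gaussian (RBF) kernel between two expression vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def combined_similarity(
    cells: List[Sequence[float]],
    sigmas: Sequence[float],
    weights: Sequence[float],
) -> List[List[float]]:
    """Weighted sum of Gaussian kernels, one kernel per bandwidth.

    Assumption: the weights are supplied by the caller. A learning
    method would estimate them from the data instead.
    """
    n = len(cells)
    return [
        [
            sum(w * gaussian_kernel(cells[i], cells[j], s)
                for w, s in zip(weights, sigmas))
            for j in range(n)
        ]
        for i in range(n)
    ]


# Hypothetical toy data: three cells, two genes each.
TOY_CELLS = [(1.0, 0.0), (1.1, 0.1), (5.0, 4.0)]
SIM = combined_similarity(TOY_CELLS, sigmas=[0.5, 2.0], weights=[0.5, 0.5])
```

In this toy example the first two cells have nearly identical expression, so their combined similarity is higher than the similarity of either one to the third cell.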


PLOS Computational Biology | 2015

Sharing and Specificity of Co-expression Networks across 35 Human Tissues

Emma Pierson; Daphne Koller; Alexis Battle

To understand the regulation of tissue-specific gene expression, the GTEx Consortium generated RNA-seq expression data for more than thirty distinct human tissues. This data provides an opportunity for deriving shared and tissue-specific gene regulatory networks on the basis of co-expression between genes. However, only a small number of samples are available for a majority of the tissues, and therefore statistical inference of networks in this setting is highly underpowered. To address this problem, we infer tissue-specific gene co-expression networks for 35 tissues in the GTEx dataset using a novel algorithm, GNAT, that uses a hierarchy of tissues to share data between related tissues. We show that this transfer learning approach increases the accuracy with which networks are learned. Analysis of these networks reveals that tissue-specific transcription factors are hubs that preferentially connect to genes with tissue-specific functions. Additionally, we observe that genes with tissue-specific functions lie at the peripheries of our networks. We identify numerous modules enriched for Gene Ontology functions, and show that modules conserved across tissues are especially likely to have functions common to all tissues, while modules that are upregulated in a particular tissue are often instrumental to tissue-specific function. Finally, we provide a web tool, available at mostafavilab.stat.ubc.ca/GNAT, which allows exploration of gene function and regulation in a tissue-specific manner.


Knowledge Discovery and Data Mining | 2017

Algorithmic Decision Making and the Cost of Fairness

Sam Corbett-Davies; Emma Pierson; Avi Feller; Sharad Goel; Aziz Z. Huq

Algorithms are now regularly used to decide whether defendants awaiting trial are too dangerous to be released back into the community. In some cases, black defendants are substantially more likely than white defendants to be incorrectly classified as high risk. To mitigate such disparities, several techniques have recently been proposed to achieve algorithmic fairness. Here we reformulate algorithmic fairness as constrained optimization: the objective is to maximize public safety while satisfying formal fairness constraints designed to reduce racial disparities. We show that for several past definitions of fairness, the optimal algorithms that result require detaining defendants above race-specific risk thresholds. We further show that the optimal unconstrained algorithm requires applying a single, uniform threshold to all defendants. The unconstrained algorithm thus maximizes public safety while also satisfying one important understanding of equality: that all individuals are held to the same standard, irrespective of race. Because the optimal constrained and unconstrained algorithms generally differ, there is tension between improving public safety and satisfying prevailing notions of algorithmic fairness. By examining data from Broward County, Florida, we show that this trade-off can be large in practice. We focus on algorithms for pretrial release decisions, but the principles we discuss apply to other domains, and also to human decision makers carrying out structured decision rules.
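The contrast between a single uniform threshold and race-specific thresholds can be sketched in a few lines. This is an illustration under stated assumptions, not the paper's method or data: the group labels, risk scores, and threshold values are hypothetical, and equal detention rates across groups stands in for one possible fairness constraint.

```python
from typing import Dict, List, Tuple

# Hypothetical record: (group label, estimated risk score in [0, 1]).
Defendant = Tuple[str, float]


def detain_uniform(defendants: List[Defendant], threshold: float) -> List[bool]:
    """Detain everyone whose risk meets one shared threshold."""
    return [risk >= threshold for _, risk in defendants]


def detain_group_specific(
    defendants: List[Defendant], thresholds: Dict[str, float]
) -> List[bool]:
    """Detain using a separate threshold for each group."""
    return [risk >= thresholds[group] for group, risk in defendants]


def detention_rates(
    defendants: List[Defendant], decisions: List[bool]
) -> Dict[str, float]:
    """Fraction of each group that is detained."""
    totals: Dict[str, int] = {}
    detained: Dict[str, int] = {}
    for (group, _), decision in zip(defendants, decisions):
        totals[group] = totals.get(group, 0) + 1
        detained[group] = detained.get(group, 0) + int(decision)
    return {g: detained[g] / totals[g] for g in totals}


# Hypothetical toy data, not taken from the Broward County data set.
TOY = [("a", 0.2), ("a", 0.6), ("a", 0.8), ("b", 0.1), ("b", 0.4), ("b", 0.7)]

UNIFORM = detain_uniform(TOY, threshold=0.5)
PARITY = detain_group_specific(TOY, thresholds={"a": 0.7, "b": 0.5})

print(detention_rates(TOY, UNIFORM))
print(detention_rates(TOY, PARITY))
```

With these toy numbers the uniform rule holds both groups to the same standard but detains them at different rates, while the group-specific rule equalizes detention rates by releasing a group "a" defendant whose risk score is above the uniform threshold.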


Conference on Computer Supported Cooperative Work | 2015

Outnumbered but Well-Spoken: Female Commenters in the New York Times

Emma Pierson

Using eight months of online comments on New York Times articles, we find that only 28% of commenters of identifiable gender are female, but that their comments receive more recommendations from other readers. Comments from women are more common on forums about parenting, fashion, and health, and on articles written by women. The number of recommendations comments from women receive is positively correlated with the percentage of men on a forum, and the number of recommendations men receive is negatively correlated with the percentage of men on a forum. Female commenters are more likely to remain anonymous and anonymous commenters receive fewer recommendations. Male and female commenters differ in their choice of topics to emphasize, backgrounds, and language; we find three specific examples in responses to articles about sexual assault, contraception, and farm subsidies. We discuss the implications of these gender differences for democratic discourse and suggest ways to increase gender parity.


Proteomics | 2018

SIMLR: A Tool for Large-Scale Genomic Analyses by Multi-Kernel Learning

Bo Wang; Daniele Ramazzotti; Luca De Sano; Junjie Zhu; Emma Pierson; Serafim Batzoglou

SIMLR (Single‐cell Interpretation via Multi‐kernel LeaRning), an open‐source tool that implements a novel framework to learn a sample‐to‐sample similarity measure from expression data observed for heterogeneous samples, is presented here. SIMLR can be effectively used to perform tasks such as dimension reduction, clustering, and visualization of heterogeneous populations of samples. SIMLR was benchmarked against state‐of‐the‐art methods for these three tasks on several public datasets, showing it to be scalable and capable of greatly improving clustering performance, as well as providing valuable insights by making the data more interpretable via a better visualization. SIMLR is available on GitHub (https://github.com/BatzoglouLabSU/SIMLR) in both R and MATLAB implementations. Furthermore, it is also available as an R package on http://bioconductor.org.


bioRxiv | 2017

SIMLR: a tool for large-scale single-cell analysis by multi-kernel learning.

Bo Wang; Daniele Ramazzotti; Luca De Sano; Junjie Zhu; Emma Pierson; Serafim Batzoglou

Motivation: We here present SIMLR (Single-cell Interpretation via Multi-kernel LeaRning), an open-source tool that implements a novel framework to learn a cell-to-cell similarity measure from single-cell RNA-seq data. SIMLR can be effectively used to perform tasks such as dimension reduction, clustering, and visualization of heterogeneous populations of cells. SIMLR was benchmarked against state-of-the-art methods for these three tasks on several public datasets, showing it to be scalable and capable of greatly improving clustering performance, as well as providing valuable insights by making the data more interpretable via a better visualization. Availability and Implementation: SIMLR is available on GitHub in both R and MATLAB implementations. Furthermore, it is also available as an R package on bioconductor.org. Contact: [email protected] or [email protected] Supplementary Information: Supplementary data are available at Bioinformatics online.


PLOS ONE | 2014

Uncertainty and denial: a resource-rational model of the value of information.

Emma Pierson; Noah D. Goodman

Classical decision theory predicts that people should be indifferent to information that is not useful for making decisions, but this model often fails to describe human behavior. Here we investigate one such scenario, where people desire information about whether an event (the gain/loss of money) will occur even though there is no obvious decision to be made on the basis of this information. We find a curious dual trend: if information is costless, as the probability of the event increases people want the information more; if information is not costless, people's desire for the information peaks at an intermediate probability. People also want information more as the importance of the event increases, and less as the cost of the information increases. We propose a model that explains these results, based on the assumption that people have limited cognitive resources and obtain information about which events will occur so they can determine whether to expend effort planning for them.


International World Wide Web Conferences | 2018

Modeling Individual Cyclic Variation in Human Behavior

Emma Pierson; Tim Althoff; Jure Leskovec

Cycles are fundamental to human health and behavior. Examples include mood cycles, circadian rhythms, and the menstrual cycle. However, modeling cycles in time series data is challenging because in most cases the cycles are not labeled or directly observed and need to be inferred from multidimensional measurements taken over time. Here, we present Cyclic Hidden Markov Models (CyHMMs) for detecting and modeling cycles in a collection of multidimensional heterogeneous time series data. In contrast to previous cycle modeling methods, CyHMMs deal with a number of challenges encountered in modeling real-world cycles: they can model multivariate data with both discrete and continuous dimensions; they explicitly model and are robust to missing data; and they can share information across individuals to accommodate variation both within and between individual time series. Experiments on synthetic and real-world health-tracking data demonstrate that CyHMMs infer cycle lengths more accurately than existing methods, with 58% lower error on simulated data and 63% lower error on real-world data compared to the best-performing baseline. CyHMMs can also perform functions which baselines cannot: they can model the progression of individual features/symptoms over the course of the cycle, identify the most variable features, and cluster individual time series into groups with distinct characteristics. Applying CyHMMs to two real-world health-tracking datasets -- of human menstrual cycle symptoms and physical activity tracking data -- yields important insights including which symptoms to expect at each point during the cycle. We also find that people fall into several groups with distinct cycle patterns, and that these groups differ along dimensions not provided to the model. For example, by modeling missing data in the menstrual cycles dataset, we are able to discover a medically relevant group of birth control users even though information on birth control is not given to the model.


Clinical Cancer Research | 2018

Higher Absolute Lymphocyte Counts Predict Lower Mortality from Early-Stage Triple-Negative Breast Cancer

Anosheh Afghahi; Natasha Purington; Summer S. Han; Manisha Desai; Emma Pierson; Maya B. Mathur; Tina Seto; Caroline A. Thompson; Joseph Rigdon; Melinda L. Telli; Sunil Badve; Christina Curtis; Robert B. West; Kathleen C. Horst; Scarlett Lin Gomez; James M. Ford; George W. Sledge; Allison W. Kurian

Purpose: Tumor-infiltrating lymphocytes (TIL) in pretreatment biopsies are associated with improved survival in triple-negative breast cancer (TNBC). We investigated whether higher peripheral lymphocyte counts are associated with lower breast cancer–specific mortality (BCM) and overall mortality (OM) in TNBC. Experimental Design: Data on treatments and diagnostic tests from electronic medical records of two health care systems were linked with demographic, clinical, pathologic, and mortality data from the California Cancer Registry. Multivariable regression models adjusted for age, race/ethnicity, socioeconomic status, cancer stage, grade, neoadjuvant/adjuvant chemotherapy use, radiotherapy use, and germline BRCA1/2 mutations were used to evaluate associations between absolute lymphocyte count (ALC), BCM, and OM. For a subgroup with TIL data available, we explored the relationship between TILs and peripheral lymphocyte counts. Results: A total of 1,463 stage I–III TNBC patients were diagnosed from 2000 to 2014; 1,113 (76%) received neoadjuvant/adjuvant chemotherapy within 1 year of diagnosis. Of 759 patients with available ALC data, 481 (63.4%) were ever lymphopenic (minimum ALC <1.0 K/μL). On multivariable analysis, higher minimum ALC, but not absolute neutrophil count, predicted lower OM [HR = 0.23; 95% confidence interval (CI), 0.16–0.35] and BCM (HR = 0.19; CI, 0.11–0.34). Five-year probability of BCM was 15% for patients who were ever lymphopenic versus 4% for those who were not. An exploratory analysis (n = 70) showed a significant association between TILs and higher peripheral lymphocyte counts during neoadjuvant chemotherapy. Conclusions: Higher peripheral lymphocyte counts predicted lower mortality from early-stage, potentially curable TNBC, suggesting that immune function may enhance the effectiveness of early TNBC treatment. Clin Cancer Res; 24(12); 2851–8. ©2018 AACR.


arXiv: Applications | 2017

A large-scale analysis of racial disparities in police stops across the United States

Emma Pierson; Camelia Simoiu; Jan Overgoor; Sam Corbett-Davies; Cheryl Phillips; Sharad Goel

Collaboration


Emma Pierson's collaborations.

Top Co-Authors

Caroline A. Thompson

Palo Alto Medical Foundation
