Daniel B. Larremore | Researchain

Publication

Featured research published by Daniel B. Larremore.

Science Advances | 2015

Systematic inequality and hierarchy in faculty hiring networks

Aaron Clauset; Samuel Arbesman; Daniel B. Larremore

An analysis of networks of graduate-to-faculty hires reveals systematic hiring biases and patterns. The faculty job market plays a fundamental role in shaping research priorities, educational outcomes, and career trajectories among scientists and institutions. However, a quantitative understanding of faculty hiring as a system is lacking. Using a simple technique to extract the institutional prestige ranking that best explains an observed faculty hiring network—who hires whose graduates as faculty—we present and analyze comprehensive placement data on nearly 19,000 regular faculty in three disparate disciplines. Across disciplines, we find that faculty hiring follows a common and steeply hierarchical structure that reflects profound social inequality. Furthermore, doctoral prestige alone better predicts ultimate placement than a U.S. News & World Report rank, women generally place worse than men, and increased institutional prestige leads to increased faculty production, better faculty placement, and a more influential position within the discipline. These results advance our ability to quantify the influence of prestige in academia and shed new light on the academic system.
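As a rough illustration of what a ranking that "best explains" a hiring network could mean, the sketch below scores a candidate ordering by the number of hires that run up the hierarchy (a graduate hired by a more prestigious institution than the one that trained them) and searches for the ordering with the fewest such violations. The violation criterion, the brute-force search, the function names, and the toy data are all assumptions made for illustration; the abstract does not specify the technique.

```python
from itertools import permutations

def count_violations(ranking, hires):
    """Count hires whose doctoral institution ranks below the hiring institution."""
    position = {inst: i for i, inst in enumerate(ranking)}  # 0 = most prestigious
    return sum(1 for phd, employer in hires if position[phd] > position[employer])

def best_ranking(institutions, hires):
    """Exhaustive search for an ordering with the fewest violations (toy sizes only)."""
    return min(permutations(institutions), key=lambda r: count_violations(r, hires))

# Hypothetical toy data: (doctoral institution, hiring institution)
hires = [("A", "B"), ("A", "C"), ("B", "C"), ("B", "C"), ("C", "B")]
institutions = ["A", "B", "C"]

ranking = best_ranking(institutions, hires)
violations = count_violations(ranking, hires)
print(ranking, violations)
```

Exhaustive search is only feasible for a handful of institutions; a network of realistic size would need a heuristic or sampling-based search instead.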

Physical Review Letters | 2011

Predicting criticality and dynamic range in complex networks: effects of topology

Daniel B. Larremore; Woodrow L. Shew; Juan G. Restrepo

The collective dynamics of a network of coupled excitable systems in response to an external stimulus depends on the topology of the connections in the network. Here we develop a general theoretical approach to study the effects of network topology on dynamic range, which quantifies the range of stimulus intensities resulting in distinguishable network responses. We find that the largest eigenvalue of the weighted network adjacency matrix governs the network dynamic range. When the largest eigenvalue is exactly one, the system is in a critical state and its dynamic range is maximized. Further, we examine higher order behavior of the steady state system, which predicts that networks with more homogeneous degree distributions should have higher dynamic range. Our analysis, confirmed by numerical simulations, generalizes previous studies in terms of the largest eigenvalue of the adjacency matrix.
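The criterion in this abstract depends on a single number, the largest eigenvalue of the weighted adjacency matrix. A minimal sketch of estimating it by power iteration follows, assuming a non-negative matrix for which power iteration converges. The function names, the tolerance, and the "subcritical"/"supercritical" labels are illustrative choices, not taken from the paper.

```python
def largest_eigenvalue(matrix, iterations=200):
    """Estimate the largest eigenvalue of a non-negative square matrix by power iteration."""
    n = len(matrix)
    v = [1.0] * n
    eigenvalue = 0.0
    for _ in range(iterations):
        # Multiply the current vector by the matrix.
        w = [sum(matrix[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = max(abs(x) for x in w)
        if norm == 0.0:
            return 0.0
        eigenvalue = norm              # growth factor under max-norm scaling
        v = [x / norm for x in w]      # rescale so the largest entry is 1
    return eigenvalue

def regime(eigenvalue, tol=1e-6):
    """Label the network relative to the critical point at eigenvalue 1."""
    if abs(eigenvalue - 1.0) <= tol:
        return "critical"
    return "subcritical" if eigenvalue < 1.0 else "supercritical"

# Hypothetical 3-node weighted network whose rows each sum to 1.
weights = [[0.0, 0.5, 0.5],
           [0.5, 0.0, 0.5],
           [0.5, 0.5, 0.0]]
print(regime(largest_eigenvalue(weights)))
```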

Science Advances | 2017

The ground truth about metadata and community detection in networks

Leto Peel; Daniel B. Larremore; Aaron Clauset

Troubles with community detection in networks: No ground truth, no free lunch, and the complex coupling of metadata with structure. Across many scientific domains, there is a common need to automatically extract a simplified view or coarse-graining of how a complex system’s components interact. This general task is called community detection in networks and is analogous to searching for clusters in independent vector data. It is common to evaluate the performance of community detection algorithms by their ability to find so-called ground truth communities. This works well in synthetic networks with planted communities because these networks’ links are formed explicitly based on those known communities. However, there are no planted communities in real-world networks. Instead, it is standard practice to treat some observed discrete-valued node attributes, or metadata, as ground truth. We show that metadata are not the same as ground truth and that treating them as such induces severe theoretical and practical problems. We prove that no algorithm can uniquely solve community detection, and we prove a general No Free Lunch theorem for community detection, which implies that there can be no algorithm that is optimal for all possible community detection tasks. However, community detection remains a powerful tool and node metadata still have value, so a careful exploration of their relationship with network structure can yield insights of genuine worth. We illustrate this point by introducing two statistical techniques that can quantify the relationship between metadata and community structure for a broad class of models. We demonstrate these techniques using both synthetic and real-world networks, and for multiple types of metadata and community structures.

Physical Review E | 2014

Efficiently inferring community structure in bipartite networks

Daniel B. Larremore; Aaron Clauset; Abigail Z. Jacobs

Bipartite networks are a common type of network data in which there are two types of vertices, and only vertices of different types can be connected. While bipartite networks exhibit community structure like their unipartite counterparts, existing approaches to bipartite community detection have drawbacks, including implicit parameter choices, loss of information through one-mode projections, and lack of interpretability. Here we solve the community detection problem for bipartite networks by formulating a bipartite stochastic block model, which explicitly includes vertex type information and may be trivially extended to k-partite networks. This bipartite stochastic block model yields a projection-free and statistically principled method for community detection that makes clear assumptions and parameter choices and yields interpretable results. We demonstrate this model's ability to efficiently and accurately find community structure in synthetic bipartite networks with known structure and in real-world bipartite networks with unknown structure, and we characterize its performance in practical contexts.

Physical Review E | 2012

Statistical properties of avalanches in networks

Daniel B. Larremore; Marshall Y. Carpenter; Edward Ott; Juan G. Restrepo

We characterize the distributions of size and duration of avalanches propagating in complex networks. By an avalanche we mean the sequence of events initiated by the externally stimulated excitation of a network node, which may, with some probability, then stimulate subsequent excitations of the nodes to which it is connected, resulting in a cascade of excitations. This type of process is relevant to a wide variety of situations, including neuroscience, cascading failures on electrical power grids, and epidemiology. We find that the statistics of avalanches can be characterized in terms of the largest eigenvalue and corresponding eigenvector of an appropriate adjacency matrix that encodes the structure of the network. By using mean-field analyses, previous studies of avalanches in networks have not considered the effect of network structure on the distribution of size and duration of avalanches. Our results apply to individual networks (rather than network ensembles) and provide expressions for the distributions of size and duration of avalanches starting at particular nodes in the network. These findings might find application in the analysis of branching processes in networks, such as cascading power grid failures and critical brain dynamics. In particular, our results show that some experimental signatures of critical brain dynamics (i.e., power-law distributions of size and duration of neuronal avalanches) are robust to complex underlying network topologies.
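The avalanche process described here can be sketched as a simple simulation: one node is excited, and each excited node then excites each of its out-neighbors with some probability at the next time step. The sketch below assumes discrete time steps, counts size as the total number of excitations and duration as the number of steps with at least one excited node, and uses a hypothetical matrix `prob` of transmission probabilities. These conventions are assumptions and may differ from those in the paper.

```python
import random

def simulate_avalanche(prob, start, rng, max_steps=10_000):
    """Run one avalanche; prob[i][j] is the chance that excited node i excites node j."""
    n = len(prob)
    active = {start}
    size, duration = 0, 0
    while active and duration < max_steps:
        size += len(active)        # every excitation counts toward the size
        duration += 1              # one more time step with activity
        next_active = set()
        for i in active:
            for j in range(n):
                if prob[i][j] > 0 and rng.random() < prob[i][j]:
                    next_active.add(j)
        active = next_active
    return size, duration

# Hypothetical directed chain 0 -> 1 -> 2 with certain transmission.
chain = [[0.0, 1.0, 0.0],
         [0.0, 0.0, 1.0],
         [0.0, 0.0, 0.0]]
print(simulate_avalanche(chain, start=0, rng=random.Random(0)))
```

Repeating the simulation many times from a fixed start node gives empirical size and duration distributions for that node.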

PLOS Computational Biology | 2013

A Network Approach to Analyzing Highly Recombinant Malaria Parasite Genes

Daniel B. Larremore; Aaron Clauset; Caroline O. Buckee

The var genes of the human malaria parasite Plasmodium falciparum present a challenge to population geneticists due to their extreme diversity, which is generated by high rates of recombination. These genes encode a primary antigen protein called PfEMP1, which is expressed on the surface of infected red blood cells and elicits protective immune responses. Var gene sequences are characterized by pronounced mosaicism, precluding the use of traditional phylogenetic tools that require bifurcating tree-like evolutionary relationships. We present a new method that identifies highly variable regions (HVRs), and then maps each HVR to a complex network in which each sequence is a node and two nodes are linked if they share an exact match of significant length. Here, networks of var genes that recombine freely are expected to have a uniformly random structure, but constraints on recombination will produce network communities that we identify using a stochastic block model. We validate this method on synthetic data, showing that it correctly recovers populations of constrained recombination, before applying it to the Duffy Binding Like-α (DBLα) domain of var genes. We find nine HVRs whose network communities map in distinctive ways to known DBLα classifications and clinical phenotypes. We show that the recombinational constraints of some HVRs are correlated, while others are independent. These findings suggest that this micromodular structuring facilitates independent evolutionary trajectories of neighboring mosaic regions, allowing the parasite to retain protein function while generating enormous sequence diversity. Our approach therefore offers a rigorous method for analyzing evolutionary constraints in var genes, and is also flexible enough to be easily applied more generally to any highly recombinant sequences.
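The network construction step, in which two sequences are linked if they share an exact match of significant length, can be sketched as follows. Two sequences share an exact match of at least `min_len` characters exactly when they have a length-`min_len` substring in common, so the sketch compares substring sets. The fixed `min_len` threshold, the function names, and the toy sequences are assumptions; how the abstract's "significant length" is determined is not specified here.

```python
from itertools import combinations

def kmers(seq, k):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def shared_match_network(sequences, min_len):
    """Link two sequences if they share an exact match of at least min_len characters."""
    kmer_sets = {name: kmers(seq, min_len) for name, seq in sequences.items()}
    edges = set()
    for a, b in combinations(sorted(sequences), 2):
        if kmer_sets[a] & kmer_sets[b]:   # any common k-mer implies a shared match
            edges.add((a, b))
    return edges

# Hypothetical toy sequences; real input would be sequences from one HVR.
sequences = {"s1": "ACGTACGGTT", "s2": "TTACGGTTAA", "s3": "GGGGCCCCAA"}
edges = shared_match_network(sequences, min_len=5)
print(edges)
```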

Nature Communications | 2015

Ape parasite origins of human malaria virulence genes

Daniel B. Larremore; Sesh A. Sundararaman; Weimin Liu; William R. Proto; Aaron Clauset; Dorothy E. Loy; Sheri Speede; Lindsey J. Plenderleith; Paul M. Sharp; Beatrice H. Hahn; Julian C. Rayner; Caroline O. Buckee

Antigens encoded by the var gene family are major virulence factors of the human malaria parasite Plasmodium falciparum, exhibiting enormous intra- and interstrain diversity. Here we use network analysis to show that var architecture and mosaicism are conserved at multiple levels across the Laverania subgenus, based on var-like sequences from eight single-species and three multi-species Plasmodium infections of wild-living or sanctuary African apes. Using select whole-genome amplification, we also find evidence of multi-domain var structure and synteny in Plasmodium gaboni, one of the ape Laverania species most distantly related to P. falciparum, as well as a new class of Duffy-binding-like domains. These findings indicate that the modular genetic architecture and sequence diversity underlying var-mediated host-parasite interactions evolved before the radiation of the Laverania subgenus, long before the emergence of P. falciparum.

Chaos | 2011

Effects of network topology, transmission delays, and refractoriness on the response of coupled excitable systems to a stochastic stimulus

Daniel B. Larremore; Woodrow L. Shew; Edward Ott; Juan G. Restrepo

We study the effects of network topology on the response of networks of coupled discrete excitable systems to an external stochastic stimulus. We extend recent results that characterize the response in terms of spectral properties of the adjacency matrix by allowing distributions in the transmission delays and in the number of refractory states and by developing a nonperturbative approximation to the steady state network response. We confirm our theoretical results with numerical simulations. We find that the steady state response amplitude is inversely proportional to the duration of refractoriness, which reduces the maximum attainable dynamic range. We also find that transmission delays alter the time required to reach steady state. Importantly, neither delays nor refractoriness impact the general prediction that criticality and maximum dynamic range occur when the largest eigenvalue of the adjacency matrix is unity.

SIAM Review | 2018

Configuring Random Graph Models with Fixed Degree Sequences

Bailey K. Fosdick; Daniel B. Larremore; Joel Nishimura; Johan Ugander

Random graph null models have found widespread application in diverse research communities analyzing network datasets, including social, information, and economic networks, as well as food webs, protein-protein interactions, and neuronal networks. The most popular family of random graph null models, called configuration models, are defined as uniform distributions over a space of graphs with a fixed degree sequence. Commonly, properties of an empirical network are compared to properties of an ensemble of graphs from a configuration model in order to quantify whether empirical network properties are meaningful or whether they are instead a common consequence of the particular degree sequence. In this work we study the subtle but important decisions underlying the specification of a configuration model, and investigate the role these choices play in graph sampling procedures and a suite of applications. We place particular emphasis on the importance of specifying the appropriate graph labeling (stub-labeled or vertex-labeled) under which to consider a null model, a choice that closely connects the study of random graphs to the study of random contingency tables. We show that the choice of graph labeling is inconsequential for studies of simple graphs, but can have a significant impact on analyses of multigraphs or graphs with self-loops. The importance of these choices is demonstrated through a series of three vignettes, analyzing network datasets under many different configuration models and observing substantial differences in study conclusions under different models. We argue that in each case, only one of the possible configuration models is appropriate. While our work focuses on undirected static networks, it aims to guide the study of directed networks, dynamic networks, and all other network contexts that are suitably studied through the lens of random graph null models.
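One common way to draw a graph with a fixed degree sequence is stub matching: give each vertex one "stub" per unit of degree, shuffle the stubs, and pair them off. The sketch below assumes that procedure and allows self-loops and multi-edges, so it samples loopy multigraphs rather than simple graphs. The function names and the example degree sequence are illustrative, and this is not presented as the sampling procedure the paper recommends for any particular graph space.

```python
import random

def stub_matching(degrees, seed=None):
    """Pair shuffled stubs into edges; self-loops and multi-edges may occur."""
    if sum(degrees) % 2 != 0:
        raise ValueError("degree sequence must have an even sum")
    rng = random.Random(seed)
    # One stub per unit of degree, labeled by its vertex.
    stubs = [v for v, d in enumerate(degrees) for _ in range(d)]
    rng.shuffle(stubs)
    return [(stubs[i], stubs[i + 1]) for i in range(0, len(stubs), 2)]

def degree_sequence(edges, n):
    """Recompute degrees from an edge list; a self-loop adds 2 to its vertex."""
    deg = [0] * n
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return deg

degrees = [3, 2, 2, 1]  # hypothetical degree sequence
edges = stub_matching(degrees, seed=1)
print(edges, degree_sequence(edges, len(degrees)))
```

Comparing a statistic of an observed network against its values across many such samples is the null-model comparison the abstract describes; whether self-loops and multi-edges should be allowed is one of the specification choices it highlights.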

Proceedings of the National Academy of Sciences of the United States of America | 2017

The misleading narrative of the canonical faculty productivity trajectory

Samuel F. Way; Allison C. Morgan; Aaron Clauset; Daniel B. Larremore

Significance: Scholarly productivity impacts nearly every aspect of a researcher’s career, from their initial placement as faculty to funding and tenure decisions. Historically, expectations for individuals rely on 60 years of research on aggregate trends, which suggest that productivity rises rapidly to an early-career peak and then gradually declines. Here we show, using comprehensive data on the publication and employment histories of an entire field of research, that the canonical narrative of “rapid rise, gradual decline” describes only about one-fifth of individual faculty, and the remaining four-fifths exhibit a rich diversity of productivity patterns. This suggests existing models and expectations for faculty productivity require revision, as they capture only one of many ways to have a successful career in science.

A scientist may publish tens or hundreds of papers over a career, but these contributions are not evenly spaced in time. Sixty years of studies on career productivity patterns in a variety of fields suggest an intuitive and universal pattern: Productivity tends to rise rapidly to an early peak and then gradually declines. Here, we test the universality of this conventional narrative by analyzing the structures of individual faculty productivity time series, constructed from over 200,000 publications and matched with hiring data for 2,453 tenure-track faculty in all 205 PhD-granting computer science departments in the United States and Canada. Unlike prior studies, which considered only some faculty or some institutions, or lacked common career reference points, here we combine a large bibliographic dataset with comprehensive information on career transitions that covers an entire field of study. We show that the conventional narrative confidently describes only one-fifth of faculty, regardless of department prestige or researcher gender, and the remaining four-fifths of faculty exhibit a rich diversity of productivity patterns. To explain this diversity, we introduce a simple model of productivity trajectories and explore correlations between its parameters and researcher covariates, showing that departmental prestige predicts overall individual productivity and the timing of the transition from first- to last-author publications. These results demonstrate the unpredictability of productivity over time and open the door for new efforts to understand how environmental and individual factors shape scientific productivity.
