Publication


Featured research published by Karl Rohe.


Methods in Ecology and Evolution | 2016

Fast and accurate detection of evolutionary shifts in Ornstein–Uhlenbeck models

Mohammad Khabbazian; Ricardo Kriebel; Karl Rohe; Cécile Ané

Summary: The detection of evolutionary shifts in trait evolution from extant taxa is motivated by the study of convergent evolution, or to correlate shifts in traits with habitat changes or with changes in other phenotypes. We propose here a phylogenetic lasso method to study trait evolution from comparative data and detect past changes in the expected mean trait values. We use the Ornstein–Uhlenbeck process, which can model a changing adaptive landscape over time and over lineages. Our method is very fast, running in minutes for hundreds of species, and can handle multiple traits. We also propose a phylogenetic Bayesian information criterion that accounts for the phylogenetic correlation between species, as well as for the complexity of estimating an unknown number of shifts at unknown locations in the phylogeny. This criterion does not suffer from model overfitting and has high precision, so it offers a conservative alternative to other information criteria. Our re-analysis of Anolis lizard data suggests a more conservative scenario of morphological adaptation and convergence than previously proposed. Software is available on GitHub.


Biometrika | 2017

Covariate-assisted spectral clustering

Norbert Binkiewicz; Joshua T. Vogelstein; Karl Rohe

Summary: Biological and social systems consist of myriad interacting units. The interactions can be represented in the form of a graph or network. Measurements of these graphs can reveal the underlying structure of these interactions, which provides insight into the systems that generated the graphs. Moreover, in applications such as connectomics, social networks, and genomics, graph data are accompanied by contextualizing measures on each node. We utilize these node covariates to help uncover latent communities in a graph, using a modification of spectral clustering. Statistical guarantees are provided under a joint mixture model that we call the node-contextualized stochastic blockmodel, including a bound on the misclustering rate. The bound is used to derive conditions for achieving perfect clustering. For most simulated cases, covariate-assisted spectral clustering yields results superior both to regularized spectral clustering without node covariates and to an adaptation of canonical correlation analysis. We apply our clustering method to large brain graphs derived from diffusion MRI data, using the node locations or neurological region membership as covariates. In both cases, covariate-assisted spectral clustering yields clusters that are easier to interpret neurologically.
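As a rough illustration of how node covariates can enter spectral clustering, the sketch below builds a similarity matrix from a regularized graph Laplacian plus a weighted covariate term and returns its leading eigenvectors. The specific form `L @ L + alpha * X @ X.T`, the default regularizer `tau` (mean degree), and the function name `casc_embedding` are assumptions made for this example and are not taken from the summary above. The rows of the returned matrix would then be clustered, for example with k-means, which is omitted here.

```python
import numpy as np

def casc_embedding(A, X, k, alpha, tau=None):
    """Leading-k eigenvectors of an assumed covariate-assisted similarity matrix.

    A     : (n, n) symmetric adjacency matrix
    X     : (n, p) node covariate matrix
    alpha : weight balancing graph structure against covariates (tuning parameter)
    tau   : degree regularizer; defaults to the mean degree (an assumed choice)
    """
    A = np.asarray(A, dtype=float)
    X = np.asarray(X, dtype=float)
    deg = A.sum(axis=1)
    if tau is None:
        tau = deg.mean()
    # Regularized Laplacian: D_tau^{-1/2} A D_tau^{-1/2}, with D_tau = D + tau * I
    scale = 1.0 / np.sqrt(deg + tau)
    L = A * scale[:, None] * scale[None, :]
    # Assumed similarity: graph term plus weighted covariate term
    S = L @ L + alpha * (X @ X.T)
    # eigh returns eigenvalues in ascending order, so take the last k columns
    _, eigvecs = np.linalg.eigh(S)
    return eigvecs[:, -k:]
```

Setting `alpha = 0` reduces this to an embedding based on the graph alone, so `alpha` controls how much the covariates influence the result.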


Proceedings of the National Academy of Sciences of the United States of America | 2016

Co-clustering directed graphs to discover asymmetries and directional communities

Karl Rohe; Tai Qin; Bin Yu

Significance: This paper adds to the continuing and long-running interest in networks and network clustering. For directed networks, we propose the di-sim algorithm to capture asymmetries of connections and discover directional clusters. We illustrate this algorithm with three data examples: the Enron email network, the hyperlinked blog network during the 2004 US presidential election, and the chemical connections among the neurons in Caenorhabditis elegans. We identify informative and bottleneck nodes in all three networks. In particular, for the third example, di-sim finds bottleneck nodes that create a feedforward structure among clusters of nodes.

Abstract: In directed graphs, relationships are asymmetric and these asymmetries contain essential structural information about the graph. Directed relationships lead to a new type of clustering that is not feasible in undirected graphs. We propose a spectral co-clustering algorithm called di-sim for asymmetry discovery and directional clustering. A Stochastic co-Blockmodel is introduced to show favorable properties of di-sim. To account for the sparse and highly heterogeneous nature of directed networks, di-sim uses the regularized graph Laplacian and projects the rows of the eigenvector matrix onto the sphere. A nodewise asymmetry score and di-sim are used to analyze the clustering asymmetries in the networks of Enron emails, political blogs, and the Caenorhabditis elegans chemical connectome. In each example, a subset of nodes have clustering asymmetries; these nodes send edges to one cluster, but receive edges from another cluster. Such nodes yield insightful information (e.g., communication bottlenecks) about directed networks, but are missed if the analysis ignores edge direction.
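The two ingredients named above, a regularized graph Laplacian and projection of the rows onto the sphere, can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `disim_embedding`, the default `tau` (average out-degree), and the use of a singular value decomposition to obtain separate sending and receiving embeddings are assumptions made for this example. The final clustering step (for instance k-means on each embedding) is left out.

```python
import numpy as np

def disim_embedding(A, k, tau=None):
    """Row-normalized sending and receiving embeddings of a directed graph.

    A   : (n, n) adjacency matrix, A[i, j] = 1 if there is an edge i -> j
    k   : number of singular vectors to keep
    tau : degree regularizer; defaults to the average out-degree (assumed choice)
    """
    A = np.asarray(A, dtype=float)
    out_deg = A.sum(axis=1)
    in_deg = A.sum(axis=0)
    if tau is None:
        tau = A.sum() / A.shape[0]
    if tau <= 0:
        raise ValueError("tau must be positive")
    # Regularized Laplacian: (O + tau I)^{-1/2} A (P + tau I)^{-1/2}
    L = A / np.sqrt(out_deg + tau)[:, None] / np.sqrt(in_deg + tau)[None, :]
    U, _, Vt = np.linalg.svd(L, full_matrices=False)

    def to_sphere(M):
        # Scale each row to unit length; all-zero rows are left unchanged
        norms = np.linalg.norm(M, axis=1, keepdims=True)
        norms[norms == 0] = 1.0
        return M / norms

    # Left vectors describe sending patterns, right vectors receiving patterns
    return to_sphere(U[:, :k]), to_sphere(Vt[:k, :].T)
```

Clustering the two embeddings separately is what allows a node to fall into one cluster as a sender and a different cluster as a receiver.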


Electronic Journal of Statistics | 2017

Central limit theorems for network driven sampling

Xiao Li; Karl Rohe

Respondent-Driven Sampling is a popular technique for sampling hidden populations. This paper models Respondent-Driven Sampling as a Markov process indexed by a tree. Our main results show that the Volz-Heckathorn estimator is asymptotically normal below a critical threshold. The key technical difficulties stem from (i) the dependence between samples and (ii) the tree structure which characterizes the dependence. The theorems allow the growth rate of the tree to exceed one and suggest that this growth rate should not be too large. To illustrate the usefulness of these results beyond their obvious use, an example shows that in certain cases the sample average is preferable to inverse probability weighting. We provide a test statistic to distinguish between these two cases.
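To make the comparison between the sample average and inverse probability weighting concrete, here is a small sketch of both estimators. It assumes the common form of the Volz-Heckathorn estimator, in which each respondent is weighted by the inverse of their reported degree; the function names and the toy data are hypothetical.

```python
def volz_heckathorn(values, degrees):
    """Inverse-degree-weighted mean (the assumed Volz-Heckathorn form)."""
    if len(values) != len(degrees):
        raise ValueError("values and degrees must have the same length")
    if any(d <= 0 for d in degrees):
        raise ValueError("degrees must be positive")
    weights = [1.0 / d for d in degrees]
    return sum(w * y for w, y in zip(weights, values)) / sum(weights)

def sample_average(values):
    """Unweighted mean of the sampled values."""
    return sum(values) / len(values)

# Toy sample: outcome indicator and reported degree for four respondents
y = [1, 0, 1, 0]
d = [4, 2, 4, 1]
print(volz_heckathorn(y, d))  # down-weights the high-degree respondents
print(sample_average(y))
```

When every respondent reports the same degree, the two estimators coincide; they differ only when degree varies across the sample.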


Statistica Sinica | 2018

Asymptotic theory for estimating the singular vectors and values of a partially-observed low rank matrix with noise

Juhee Cho; Donggyu Kim; Karl Rohe

Matrix completion algorithms recover a low rank matrix from a small fraction of the entries, each entry contaminated with additive errors. In practice, the singular vectors and singular values of the low rank matrix play a pivotal role for statistical analyses and inferences. This paper proposes estimators of these quantities and studies their asymptotic behavior. Under the setting where the dimensions of the matrix increase to infinity and the probability of observing each entry is identical, Theorem 4.1 gives the rate of convergence for the estimated singular vectors; Theorem 4.3 gives a multivariate central limit theorem for the estimated singular values. Even though the estimators use only a partially observed matrix, they achieve the same rates of convergence as the fully observed case. These estimators combine to form a consistent estimator of the full low rank matrix that is computed with a non-iterative algorithm. In the cases studied in this paper, this estimator achieves the minimax lower bound in Koltchinskii et al. (2011). The numerical experiments corroborate our theoretical results.


New Media & Society | 2018

Attention and amplification in the hybrid media system: The composition and activity of Donald Trump’s Twitter following during the 2016 presidential election

Yini Zhang; Chris Wells; Song Wang; Karl Rohe

Building on studies of the hybrid media system and attention economy, we develop the concept of amplification to explore how the activities of social media–based publics may enlarge the attention paid to a given person or message. We apply the concept to the 2016 US election, asking who constituted Donald Trump’s enormous Twitter following and how that following contributed to his success at attracting attention, including from the mainstream press. Using spectral clustering based on social network similarity, we identify key publics that constituted Trump’s Twitter following and demonstrate how particular publics amplified his social media presence in different ways. Our discussion raises questions about how algorithms “read” metrics to guide content on social media platforms, how journalists draw on social media metrics in their determinations of news value and worthiness, and how the process of amplification relates to possibilities of citizen action through digital communication.


Journal of Computational and Graphical Statistics | 2018

Intelligent Initialization and Adaptive Thresholding for Iterative Matrix Completion; Some Statistical and Algorithmic Theory for Adaptive-Impute

Juhee Cho; Donggyu Kim; Karl Rohe

Abstract: Over the past decade, various matrix completion algorithms have been developed. Thresholded singular value decomposition (SVD) is a popular technique in implementing many of them. A sizable number of studies have shown its theoretical and empirical excellence, but choosing the right threshold level remains a key empirical difficulty. This article proposes a novel matrix completion algorithm which iterates thresholded SVD with theoretically justified and data-dependent values of thresholding parameters. The estimate of the proposed algorithm enjoys the minimax error rate and shows outstanding empirical performance. The thresholding scheme that we use can be viewed as a solution to a nonconvex optimization problem, for which the understanding of theoretical convergence guarantees is known to be limited. We investigate this problem by introducing a simpler algorithm, generalized-softImpute, analyzing its convergence behavior, and connecting it to the proposed algorithm.
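The general idea of iterating a thresholded SVD can be sketched as below. This is a softImpute-style loop with a single fixed soft threshold `lam`, chosen here only for illustration; it is not the Adaptive-Impute algorithm from the article, whose data-dependent thresholding parameters are not reproduced. The function name and arguments are assumptions.

```python
import numpy as np

def iterate_thresholded_svd(M, mask, lam, n_iter=100):
    """Fill unobserved entries by repeatedly soft-thresholding singular values.

    M      : (m, n) matrix; values at unobserved positions are ignored
    mask   : (m, n) boolean array, True where M is observed
    lam    : fixed soft-threshold level (an assumed, user-chosen constant)
    n_iter : number of iterations
    """
    M = np.asarray(M, dtype=float)
    mask = np.asarray(mask, dtype=bool)
    Z = np.zeros_like(M)  # current low-rank estimate
    for _ in range(n_iter):
        # Keep observed entries, fill the rest from the current estimate
        filled = np.where(mask, M, Z)
        U, s, Vt = np.linalg.svd(filled, full_matrices=False)
        # Soft-threshold the singular values
        s = np.maximum(s - lam, 0.0)
        Z = (U * s) @ Vt
    return Z
```

With `lam = 0` and a fully observed matrix the loop simply returns the input, while a very large `lam` shrinks the estimate to zero; choosing a useful level in between is the difficulty the abstract refers to.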


Proceedings of the National Academy of Sciences of the United States of America | 2018

Generalized least squares can overcome the critical threshold in respondent-driven sampling

Sebastien Roch; Karl Rohe

Significance: Respondent-driven sampling (RDS) is a popular technique to sample marginalized or hard-to-reach populations, where participants can refer multiple contacts into the sample. Using the sampled participants, we wish to estimate properties of the population, often the proportion of individuals that are HIV+. Because contacts often share the same HIV status, adjacent samples are dependent. As a result, RDS can lead to highly variable estimates of HIV prevalence. This paper studies an estimation technique for HIV prevalence that is based upon the classical idea of generalized least squares.

Abstract: To sample marginalized and/or hard-to-reach populations, respondent-driven sampling (RDS) and similar techniques reach their participants via peer referral. Under a Markov model for RDS, previous research has shown that if the typical participant refers too many contacts, then the variance of common estimators does not decay like O(n^{-1}), where n is the sample size. This implies that confidence intervals will be far wider than under a typical sampling design. Here we show that generalized least squares (GLS) can effectively reduce the variance of RDS estimates. In particular, a theoretical analysis indicates that the variance of the GLS estimator is O(n^{-1}). We then derive two classes of feasible GLS estimators. The first class is based upon a Degree Corrected Stochastic Blockmodel for the underlying social network. The second class is based upon a rank-two model. It might be of independent interest that in both model classes, the theoretical results show that it is possible to estimate the spectral properties of the population network from a random walk sample of the nodes. These theoretical results point the way to entirely different classes of estimators that account for the network structure beyond node degree. Diagnostic plots help to identify situations where feasible GLS estimators are more appropriate. The computational experiments show the potential benefits and also indicate that there is room to further develop these estimators in practical settings.


Journal of Educational and Behavioral Statistics | 2017

Latent Factors in Student–Teacher Interaction Factor Analysis

Thu Le; Daniel M. Bolt; Eric M. Camburn; Peter Goff; Karl Rohe

Classroom interactions between students and teachers form a two-way or dyadic network. Measurements such as days absent, test scores, student ratings, or student grades can indicate the “quality” of the interaction. Together with the underlying bipartite graph, these values create a valued student–teacher dyadic interaction network. To study the broad structure of these values, we propose using interaction factor analysis (IFA), a recently developed statistical technique that can be used to investigate the hidden factors underlying the quality of student–teacher interactions. Our empirical study indicates there are latent teacher (i.e., teaching style) and student (i.e., preference for teaching style) types that influence the quality of interactions. Students and teachers of the same type tend to have more positive interactions, and those of differing types tend to have more negative interactions. IFA has the advantage of traditional factor analysis in that the types are not presupposed; instead, the types are identified by IFA and can be interpreted in post hoc analysis. Whereas traditional factor analysis requires one to observe all interactions, IFA performs well even when only a small fraction of potential interactions are actually observed.


Annals of Statistics | 2011

Spectral clustering and the high-dimensional stochastic blockmodel

Karl Rohe; Sourav Chatterjee; Bin Yu

Collaboration


Dive into Karl Rohe's collaborations.

Top Co-Authors

Bin Yu

University of California

Tai Qin

University of Wisconsin-Madison

Jinzhu Jia

University of California

Juhee Cho

University of Wisconsin-Madison

Chris Wells

University of Wisconsin-Madison

Donggyu Kim

University of Wisconsin-Madison

Mohammad Khabbazian

University of Wisconsin-Madison

Norbert Binkiewicz

University of Wisconsin-Madison

Sebastien Roch

University of Wisconsin-Madison

Yilin Zhang

University of Wisconsin-Madison
