Rajmonda S. Caceres

Massachusetts Institute of Technology

Publication

Featured research published by Rajmonda S. Caceres.

European Conference on Machine Learning | 2015

Handling oversampling in dynamic networks using link prediction

Benjamin Fish; Rajmonda S. Caceres

Oversampling is a common characteristic of data representing dynamic networks. It introduces noise into representations of dynamic networks, but there has been little work so far to compensate for it. Oversampling can affect the quality of many important algorithmic problems on dynamic networks, including link prediction. Link prediction seeks to predict edges that will be added to the network given previous snapshots. We show that not only does oversampling affect the quality of link prediction, but that we can use link prediction to recover from the effects of oversampling. We also introduce a novel generative model of noise in dynamic networks that represents oversampling. We demonstrate the results of our approach on both synthetic and real-world data.
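As a rough illustration of what link prediction on a snapshot means, here is a minimal sketch that scores every currently absent edge by its number of common neighbors. This is a standard textbook baseline chosen for illustration, not necessarily the predictor used in the paper; the function name `common_neighbor_scores` and the toy snapshot are assumptions.

```python
from itertools import combinations

def common_neighbor_scores(edges):
    """Score each absent edge by how many neighbors its endpoints share."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    scores = {}
    for u, v in combinations(sorted(adj), 2):
        if v not in adj[u]:  # only score edges missing from the snapshot
            scores[(u, v)] = len(adj[u] & adj[v])
    return scores

# Toy snapshot (hypothetical data): every pair is linked except a-d.
snapshot = [("a", "b"), ("a", "c"), ("b", "c"), ("b", "d"), ("c", "d")]
scores = common_neighbor_scores(snapshot)
best = max(scores, key=scores.get)  # most likely future edge under this heuristic
print(best, scores[best])
```

Higher scores mark pairs that the heuristic considers more likely to connect in a later snapshot.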

Conference on Decision and Control | 2015

Control of epidemics on graphs

Christopher Ho; Mykel J. Kochenderfer; Vineet Mehta; Rajmonda S. Caceres

The control of complex dynamic networks has many applications ranging from the management of electric power grids to the regulation of biological cellular networks. Prior work has focused on understanding the relationship between cascading behavior and network structure, the role of critical nodes, and conditions necessary for sustained cascade propagation. Recent work has begun to examine general approaches for dynamically influencing processes to guide the network or a targeted subnetwork towards a desired state. This paper models the control of an epidemic as a Markov Decision Process and applies a variety of different state-of-the-art planning algorithms to arrive at targeted control strategies. The results reveal the conditions under which certain approaches succeed and fail.

Physical Review X | 2017

Super-Resolution Community Detection for Layer-Aggregated Multilayer Networks

Dane Taylor; Rajmonda S. Caceres; Peter J. Mucha

Applied network science often involves preprocessing network data before applying a network-analysis method, and there is typically a theoretical disconnect between these steps. For example, it is common to aggregate time-varying network data into windows prior to analysis, and the trade-offs of this preprocessing are not well understood. Focusing on the problem of detecting small communities in multilayer networks, we study the effects of layer aggregation by developing random-matrix theory for modularity matrices associated with layer-aggregated networks with $N$ nodes and $L$ layers, which are drawn from an ensemble of Erdős–Rényi networks with communities planted in subsets of layers. We study phase transitions in which eigenvectors localize onto communities (allowing their detection) and which occur for a given community provided its size surpasses a detectability limit $K^*$. When layers are aggregated via a summation, we obtain $K^* \propto \mathcal{O}(\sqrt{NL}/T)$, where $T$ is the number of layers across which the community persists. Interestingly, if $T$ is allowed to vary with $L$, then summation-based layer aggregation enhances small-community detection even if the community persists across a vanishing fraction of layers, provided that $T/L$ decays more slowly than $\mathcal{O}(L^{-1/2})$. Moreover, we find that thresholding the summation can, in some cases, cause $K^*$ to decay exponentially, decreasing by orders of magnitude in a phenomenon we call super-resolution community detection. In other words, layer aggregation with thresholding is a nonlinear data filter enabling detection of communities that are otherwise too small to detect. Importantly, different thresholds generally enhance the detectability of communities having different properties, illustrating that community detection can be obscured if one analyzes network data using a single threshold.
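The two aggregation steps named in this abstract, summation across layers and thresholding of the sum, can be sketched as follows. This is only a minimal illustration on toy data; the function names, the threshold parameter `tau`, and the three small layers are assumptions, and the random-matrix analysis of the paper is not reproduced here.

```python
def aggregate_sum(layers):
    """Entrywise sum of equally sized adjacency matrices (one per layer)."""
    n = len(layers[0])
    return [[sum(layer[i][j] for layer in layers) for j in range(n)]
            for i in range(n)]

def threshold(summed, tau):
    """Keep an edge only if it appears in at least tau layers."""
    return [[1 if weight >= tau else 0 for weight in row] for row in summed]

# Three toy layers on three nodes (hypothetical data).
layers = [
    [[0, 1, 0], [1, 0, 1], [0, 1, 0]],
    [[0, 1, 0], [1, 0, 0], [0, 0, 0]],
    [[0, 1, 1], [1, 0, 0], [1, 0, 0]],
]
summed = aggregate_sum(layers)
filtered = threshold(summed, 2)
print(summed)
print(filtered)
```

Varying `tau` changes which edges survive, which is the sense in which different thresholds can favor communities with different properties.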

Asilomar Conference on Signals, Systems and Computers | 2015

Residuals-based subgraph detection with cue vertices

Benjamin A. Miller; Stephen Kelley; Rajmonda S. Caceres; Steven T. Smith

A common problem in modern graph analysis is the detection of communities, an example of which is the detection of a single anomalously dense subgraph. Recent results have demonstrated a fundamental limit for this problem when using spectral analysis of modularity. In this paper, we demonstrate the implication of these results on subgraph detection when a cue vertex is provided, indicating one of the vertices in the community of interest. Several recent algorithms for local community detection are applied in this context, and we compare their empirical performance to that of the simple method used to derive the theoretical detection limits.

SIGKDD Explorations | 2016

Current and Future Challenges in Mining Large Networks: Report on the Second SDM Workshop on Mining Networks and Graphs

Lawrence B. Holder; Rajmonda S. Caceres; David F. Gleich; E. Jason Riedy; Maleq Khan; Nitesh V. Chawla; Ravi Kumar; Yinghui Wu; Christine Klymko; Tina Eliassi-Rad; B. Aditya Prakash

We report on the Second Workshop on Mining Networks and Graphs held at the 2015 SIAM International Conference on Data Mining. This half-day workshop consisted of a keynote talk, four technical paper presentations, one demonstration, and a panel on future challenges in mining large networks. We summarize the main highlights of the workshop, including expanded written summaries of the future challenges provided by the panelists. The current and future challenges discussed at the workshop and elaborated here provide valuable guidance for future research in the field.

International Conference on Acoustics, Speech, and Signal Processing | 2017

Network discovery using content and homophily

Steven T. Smith; Rajmonda S. Caceres; Kenneth D. Senne; Molly McMahon; Timothy Greer

A new approach for targeted graph sampling is proposed in which graph sampling and classification occur together, and content-based homophily is exploited to achieve improved classification performance. The application of network discovery of relevant content is considered using an approach that may be generalized to a broad class of vertex properties. The resulting procedure provides the initial step of a graph analytic processing chain whose performance is directly affected by the quality of graph sampling. The performance of the algorithm is measured with real network data and content observed on a social media site. Precision-Recall performance improvements of 30% are demonstrated with this dataset, compared to a baseline approach that does not exploit homophily. Because real-world graphs grow exponentially, this performance improvement may have a significant impact on graph analytic algorithms with sensitivities to the graph sampling quality.

Computational Intelligence and Data Mining | 2014

Evaluating topic quality using model clustering

Vineet Mehta; Rajmonda S. Caceres; Kevin M. Carter

Topic modeling continues to grow as a popular technique for finding hidden patterns, as well as grouping collections of new types of text and non-text data. Recent years have witnessed a growing body of work in developing metrics and techniques for evaluating the quality of topic models and the topics they generate. This is particularly true for text data, where significant attention has been given to the semantic interpretability of topics using measures such as coherence. It has been shown, however, that topic assessments based on coherence metrics do not always align well with human judgment. Other efforts have examined the utility of information-theoretic distance metrics for evaluating topic quality in connection with semantic interpretability. Although there has been progress in evaluating interpretability of topics, the existing intrinsic evaluation metrics do not address some of the other aspects of concern in topic modeling, such as the number of topics to select, the ability to align topics from different models, and the assessment of training-data quality. Here we propose an alternative metric for characterizing topic quality that addresses all three aforementioned issues. Our approach is based on clustering topics and using the silhouette measure, a popular clustering index, for characterizing the quality of topics. We illustrate the utility of this approach in addressing the other topic modeling concerns noted above. Since this metric is not focused on interpretability, we believe it can be applied more broadly to text as well as non-text data. In this paper, however, we focus on the application of this metric to archival and non-archival text data.
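For readers unfamiliar with the silhouette measure, the sketch below computes the mean silhouette of a clustering of topic vectors. It assumes topics are represented as word-probability vectors compared with Euclidean distance and that at least two clusters exist; the paper may use a different distance, and the toy topics and both labelings are invented for illustration.

```python
import math

def silhouette_score(points, labels):
    """Mean silhouette over all points; requires at least two clusters."""
    def dist(p, q):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(p, q)))

    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)

    scores = []
    for i, lab in enumerate(labels):
        own = [j for j in clusters[lab] if j != i]
        if not own:
            scores.append(0.0)  # usual convention for singleton clusters
            continue
        # a: mean distance to the point's own cluster
        a = sum(dist(points[i], points[j]) for j in own) / len(own)
        # b: mean distance to the nearest other cluster
        b = min(
            sum(dist(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items() if other != lab
        )
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)

# Four toy topics over a three-word vocabulary (hypothetical data).
topics = [(0.7, 0.2, 0.1), (0.6, 0.3, 0.1), (0.1, 0.2, 0.7), (0.1, 0.3, 0.6)]
good = silhouette_score(topics, [0, 0, 1, 1])  # similar topics grouped together
bad = silhouette_score(topics, [0, 1, 0, 1])   # dissimilar topics grouped together
print(good, bad)
```

A value near 1 indicates tight, well-separated topic clusters, while values near or below 0 indicate clusters that overlap.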

Applications of Natural Language to Data Bases | 2017

Challenges and Solutions with Alignment and Enrichment of Word Embedding Models

Cem Şafak Şahin; Rajmonda S. Caceres; Brandon Oselio; William M. Campbell

Word embedding models offer continuous vector representations that can capture rich semantics of word co-occurrence patterns. Although these models have improved the state-of-the-art on a number of NLP tasks, many open research questions remain. We study the semantic consistency and alignment of these models and show that their local properties are sensitive to even slight variations in the training datasets and parameters. We propose a solution that improves alignment of different word embedding models by leveraging carefully generated synthetic data points. Our approach leads to substantial improvements in recovering consistent and richer embeddings of local semantics.

IEEE Global Conference on Signal and Information Processing | 2016

Intersection and convex combination in multi-source spectral planted cluster detection

Benjamin A. Miller; Rajmonda S. Caceres; Steven T. Smith

Planted cluster detection is an important form of signal detection when the data are in the form of a graph. When there are multiple graphs representing multiple connection types, the method of aggregation can have significant impact on the results of a detection algorithm. This paper addresses the tradeoff between two possible aggregation methods: convex combination and intersection. For a spectral detection method, convex combination dominates when the cluster is relatively sparse in at least one graph, while the intersection method dominates in cases where it is dense across graphs. Experimental results confirm the theory. We consider the context of adversarial cluster placement, and determine how an adversary would distribute connections among the graphs to best avoid detection.
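The two aggregation methods compared in this abstract can be written down in a few lines. The sketch below assumes two unweighted graphs given as 0/1 adjacency matrices on the same vertex set; the function names, the mixing weight `alpha`, and the toy graphs are illustrative assumptions, and the spectral detection step itself is not shown.

```python
def convex_combination(a, b, alpha):
    """Entrywise alpha * a + (1 - alpha) * b, with 0 <= alpha <= 1."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return [[alpha * x + (1.0 - alpha) * y for x, y in zip(row_a, row_b)]
            for row_a, row_b in zip(a, b)]

def intersection(a, b):
    """Keep an edge only if it is present in both 0/1 adjacency matrices."""
    return [[x * y for x, y in zip(row_a, row_b)]
            for row_a, row_b in zip(a, b)]

# Two toy graphs on three vertices (hypothetical data).
graph_a = [[0, 1, 1], [1, 0, 0], [1, 0, 0]]
graph_b = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
mixed = convex_combination(graph_a, graph_b, 0.5)
shared = intersection(graph_a, graph_b)
print(mixed)
print(shared)
```

The convex combination retains every edge seen in either graph with a reduced weight, whereas the intersection discards any edge not confirmed by both sources.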

Asilomar Conference on Signals, Systems and Computers | 2015

Improved hidden clique detection by optimal linear fusion of multiple adjacency matrices

Himanshu Nayar; Benjamin A. Miller; Kelly Geyer; Rajmonda S. Caceres; Steven T. Smith; Raj Rao Nadakuditi

Graph fusion has emerged as a promising research area for addressing challenges associated with noisy, uncertain, multi-source data. While many ad-hoc graph fusion techniques exist in the current literature, an analytical approach for analyzing the fundamentals of the graph fusion problem is lacking. We consider the setting where we are given multiple Erdős–Rényi modeled adjacency matrices containing a common hidden or planted clique. The objective is to combine them linearly so that the principal eigenvectors of the resulting matrix best reveal the vertices associated with the clique. We utilize recent results from random matrix theory to derive the optimal weighting coefficients and use these insights to develop a data-driven fusion algorithm. We demonstrate the improved performance of the algorithm relative to other simple heuristics.
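To make the fusion objective concrete, the sketch below linearly combines two adjacency matrices and reads clique candidates off the principal eigenvector, computed by plain power iteration. The equal weights are placeholders and not the optimal coefficients derived in the paper; the helper names and the two toy graphs (a clique on vertices 0, 1, 2 plus one noise edge each) are assumptions.

```python
import math

def edges_to_matrix(n, edges):
    """Symmetric 0/1 adjacency matrix for an undirected edge list."""
    m = [[0.0] * n for _ in range(n)]
    for u, v in edges:
        m[u][v] = m[v][u] = 1.0
    return m

def linear_fusion(matrices, weights):
    """Weighted entrywise sum of the given adjacency matrices."""
    n = len(matrices[0])
    return [[sum(w * m[i][j] for w, m in zip(weights, matrices))
             for j in range(n)] for i in range(n)]

def principal_eigenvector(matrix, iterations=200):
    """Power iteration starting from the all-ones vector."""
    n = len(matrix)
    vec = [1.0] * n
    for _ in range(iterations):
        nxt = [sum(matrix[i][j] * vec[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in nxt))
        vec = [x / norm for x in nxt]
    return vec

clique = [(0, 1), (0, 2), (1, 2)]
graph_a = edges_to_matrix(5, clique + [(3, 4)])  # noise edge 3-4
graph_b = edges_to_matrix(5, clique + [(2, 3)])  # noise edge 2-3
fused = linear_fusion([graph_a, graph_b], [0.5, 0.5])  # placeholder weights
vec = principal_eigenvector(fused)
top3 = set(sorted(range(5), key=lambda i: vec[i], reverse=True)[:3])
print(top3)
```

Edges that appear in only one source are down-weighted by the fusion, so the eigenvector mass concentrates on the vertices that are densely connected in both graphs.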
