Sebastian Moreno | Researchain

Publication

Featured research published by Sebastian Moreno.

International World Wide Web Conferences | 2014

Attributed graph models: modeling network structure with correlated attributes

Joseph J. Pfeiffer; Sebastian Moreno; Timothy La Fond; Jennifer Neville; Brian Gallagher

Online social networks have become ubiquitous in today's society, and the study of data from these networks has improved our understanding of the processes by which relationships form. Research in statistical relational learning focuses on methods to exploit correlations among the attributes of linked nodes to predict user characteristics with greater accuracy. Concurrently, research on generative graph models has primarily focused on modeling network structure without attributes, producing several models that are able to replicate structural characteristics of networks such as power law degree distributions or community structure. However, there has been little work on how to generate networks with real-world structural properties and correlated attributes. In this work, we present the Attributed Graph Model (AGM) framework to jointly model network structure and vertex attributes. Our framework learns the attribute correlations in the observed network and exploits a generative graph model, such as the Kronecker Product Graph Model (KPGM) or the Chung Lu Graph Model (CL), to compute structural edge probabilities. AGM then combines the attribute correlations with the structural probabilities to sample networks conditioned on attribute values, while keeping the expected edge probabilities and degrees of the input graph model. We outline an efficient method for estimating the parameters of AGM, as well as a sampling method based on Accept-Reject sampling to generate edges with correlated attributes. We demonstrate the efficiency and accuracy of our AGM framework on two large real-world networks, showing that AGM scales to networks with hundreds of thousands of vertices while achieving high attribute correlation.
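
As a rough illustration of the accept-reject idea in this abstract, the sketch below proposes edges from a simple degree-weighted structural model (in the spirit of Chung Lu) and accepts each proposal with a probability that depends on the attribute values of its endpoints. The function name, the acceptance table, and the toy degrees are hypothetical and chosen only for illustration. This is not the authors' implementation, and it omits the step by which AGM chooses acceptance probabilities so that expected degrees are preserved.

```python
import random


def sample_attributed_edges(degrees, attrs, accept_prob, n_edges, rng):
    """Accept-reject sketch: structural proposals filtered by attribute pairs.

    degrees     -- target degree per node, used as proposal weights (assumption)
    attrs       -- one discrete attribute value per node
    accept_prob -- hypothetical table: sorted attribute pair -> acceptance probability
    n_edges     -- number of distinct undirected edges to return
    """
    nodes = list(range(len(degrees)))
    edges = set()
    while len(edges) < n_edges:
        # Propose both endpoints proportionally to degree (structural model).
        u, v = rng.choices(nodes, weights=degrees, k=2)
        if u == v:
            continue  # skip self-loops
        key = tuple(sorted((attrs[u], attrs[v])))
        # Accept or reject the proposal based on the endpoint attributes.
        if rng.random() < accept_prob[key]:
            edges.add((min(u, v), max(u, v)))
    return edges


if __name__ == "__main__":
    rng = random.Random(0)
    attrs = [0, 0, 0, 1, 1, 1]
    degrees = [3, 2, 2, 3, 2, 2]
    # Same-attribute pairs are always accepted, mixed pairs only rarely.
    accept = {(0, 0): 1.0, (0, 1): 0.1, (1, 1): 1.0}
    print(sorted(sample_attributed_edges(degrees, attrs, accept, 5, rng)))
```

Lowering the acceptance probability for mixed-attribute pairs raises the attribute correlation across sampled edges, at the cost of more rejected proposals.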

Allerton Conference on Communication, Control, and Computing | 2010

Tied Kronecker product graph models to capture variance in network populations

Sebastian Moreno; Sergey Kirshner; Jennifer Neville; S.V.N. Vishwanathan

Much of the past work on mining and modeling networks has focused on understanding the observed properties of single example graphs. However, in many real-life applications it is important to characterize the structure of populations of graphs. In this work, we investigate the distributional properties of Kronecker product graph models (KPGMs) [1]. Specifically, we examine whether these models can represent the natural variability in graph properties observed across multiple networks and find, surprisingly, that they cannot. By considering KPGMs from a new viewpoint, we can show the reason for this lack of variance theoretically: it is primarily due to the generation of each edge independently from the others. Based on this understanding, we propose a generalization of KPGMs that uses tied parameters to increase the variance of the model while preserving the expectation. We then show experimentally that our mixed-KPGM can adequately capture the natural variability across a population of networks.
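
To make the independent-edge generation mentioned in this abstract concrete, here is a minimal sketch of a standard KPGM: the probability of edge (i, j) is a product of initiator-matrix entries, one per Kronecker level, and each edge is then drawn as an independent Bernoulli trial. The initiator values and function names are assumptions for illustration, and the quadratic loop is only practical for tiny graphs. The tied-parameter (mixed) variant proposed in the paper is not shown.

```python
import random


def kpgm_edge_prob(theta, k, i, j):
    """Entry (i, j) of the k-th Kronecker power of the initiator matrix theta."""
    b = len(theta)
    p = 1.0
    for _ in range(k):
        # Each level contributes the entry selected by one base-b digit of i and j.
        p *= theta[i % b][j % b]
        i //= b
        j //= b
    return p


def sample_kpgm(theta, k, rng):
    """Sample a directed graph by drawing every edge independently."""
    n = len(theta) ** k
    return [
        (i, j)
        for i in range(n)
        for j in range(n)
        if rng.random() < kpgm_edge_prob(theta, k, i, j)
    ]


if __name__ == "__main__":
    theta = [[0.9, 0.5], [0.5, 0.2]]  # hypothetical 2x2 initiator matrix
    graph = sample_kpgm(theta, 3, random.Random(1))
    print(len(graph), "edges on", 2 ** 3, "nodes")
```

Because every edge is an independent draw, the edge count is a sum of independent Bernoulli variables with mean equal to the sum of the initiator entries raised to the power k, which is consistent with the low variance across sampled graphs that the abstract describes.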

International Conference on Data Mining | 2013

Network Hypothesis Testing Using Mixed Kronecker Product Graph Models

Sebastian Moreno; Jennifer Neville

The recent interest in networks (social, physical, communication, information, etc.) has fueled a great deal of research on the analysis and modeling of graphs. However, many of the analyses have focused on a single large network (e.g., a subnetwork sampled from Facebook). Although several studies have compared networks from different domains or samples, they largely focus on empirical exploration of network similarities rather than explicit tests of hypotheses. This is in part due to a lack of statistical methods to determine whether two large networks are likely to have been drawn from the same underlying graph distribution. Research on across-network hypothesis testing methods has been limited by (i) difficulties associated with obtaining a set of networks to reason about the underlying graph distribution, and (ii) limitations of current statistical models of graphs that make it difficult to represent variations across networks. In this paper, we exploit the recent development of mixed Kronecker Product Graph Models, which accurately capture the natural variation in real-world graphs, to develop a model-based approach for hypothesis testing in networks.

Knowledge Discovery and Data Mining | 2013

Learning mixed Kronecker product graph models with simulated method of moments

Sebastian Moreno; Jennifer Neville; Sergey Kirshner

There has recently been a great deal of work focused on developing statistical models of graph structure, with the goal of modeling probability distributions over graphs from which new, similar graphs can be generated by sampling from the estimated distributions. Although current graph models can capture several important characteristics of social network graphs (e.g., degree, path lengths), many of them do not generate graphs with sufficient variation to reflect the natural variability in real-world graph domains. One exception is the mixed Kronecker Product Graph Model (mKPGM), a generalization of the Kronecker Product Graph Model, which uses parameter tying to capture variance in the underlying distribution [10]. The enhanced representation of mKPGMs enables them to match both the mean graph statistics and their spread as observed in real network populations, but unfortunately, to date, the only method to estimate mKPGMs involves an exhaustive search over the parameters. In this work, we present the first learning algorithm for mKPGMs. The O(|E|) algorithm searches over the continuous parameter space using constrained line search and is based on simulated method of moments, where the objective function minimizes the distance between the observed moments in the training graph and the empirically estimated moments of the model. We evaluate the mKPGM learning algorithm by comparing it to several different graph models, including KPGMs. We use multi-dimensional KS distance to compare the generated graphs to the observed graphs, and the results show that mKPGMs are able to produce a closer match to real-world graphs (10-90% reduction in KS distance), while still providing natural variation in the generated graphs.
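
The evaluation in this abstract relies on a multi-dimensional KS distance. As a simplified illustration only, the sketch below computes the ordinary one-dimensional two-sample Kolmogorov-Smirnov distance (the largest gap between two empirical CDFs), applied to something like the degree sequences of an observed and a generated graph. The function name and the toy degree sequences are assumptions. The multi-dimensional variant used in the paper is not reproduced here.

```python
from bisect import bisect_right


def ks_distance(sample_a, sample_b):
    """Two-sample KS distance: max absolute difference between empirical CDFs."""
    a = sorted(sample_a)
    b = sorted(sample_b)

    def ecdf(sorted_vals, x):
        # Fraction of values less than or equal to x.
        return bisect_right(sorted_vals, x) / len(sorted_vals)

    # The maximum gap can only occur at an observed value of either sample.
    points = sorted(set(a) | set(b))
    return max(abs(ecdf(a, x) - ecdf(b, x)) for x in points)


if __name__ == "__main__":
    observed_degrees = [1, 2, 2, 3, 5, 8]   # hypothetical observed graph
    generated_degrees = [1, 1, 2, 3, 4, 9]  # hypothetical generated graph
    print(ks_distance(observed_degrees, generated_degrees))
```

A distance of 0 means the two empirical distributions coincide, and 1 means they do not overlap at all, so a smaller value indicates a generated graph whose statistic is distributed more like the observed one.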

International Conference on Data Mining | 2014

A Scalable Method for Exact Sampling from Kronecker Family Models

Sebastian Moreno; Joseph J. Pfeiffer; Jennifer Neville; Sergey Kirshner

The recent interest in modeling complex networks has fueled the development of generative graph models, such as the Kronecker Product Graph Model (KPGM) and mixed KPGM (mKPGM). The Kronecker family of models is appealing because of its elegant fractal structure, as well as its ability to capture important network characteristics such as degree, diameter, and (in the case of mKPGM) clustering and population variance. In addition, scalable sampling algorithms for KPGMs made the analysis of large-scale, sparse networks feasible for the first time. In this work, we show that the scalable sampling methods, in contrast to prior belief, do not in fact sample from the underlying KPGM distribution and often produce graphs that are very unlikely under that distribution. To address this issue, we develop a new representation that exploits the structure of Kronecker models and facilitates the development of novel grouped sampling methods that are provably correct. In this paper, we outline efficient algorithms to sample from mKPGMs and KPGMs based on these ideas. Notably, our mKPGM algorithm is the first available scalable sampling method for this model, and our KPGM algorithm is both faster and more accurate than previous scalable methods. We conduct both theoretical analysis and empirical evaluation to demonstrate the strengths of our algorithms and show that we can sample a network with 75 million edges in 87 seconds on a single processor.

Knowledge Discovery and Data Mining | 2016

Sampling of Attributed Networks from Hierarchical Generative Models

Pablo Robles; Sebastian Moreno; Jennifer Neville

Network sampling is a widely used procedure in social network analysis where a random network is sampled from a generative network model (GNM). Recently proposed GNMs allow generation of networks with more realistic structural characteristics than earlier ones. This facilitates tasks such as hypothesis testing and sensitivity analysis. However, sampling of networks with correlated vertex attributes remains a challenging problem. While the recent work of Pfeiffer et al. (2014) has provided a promising approach for attributed-network sampling, the approach was developed for use with relatively simple GNMs and does not work well with more complex hierarchical GNMs (which can model the range of characteristics and variation observed in real-world networks more accurately). In contrast to simple GNMs, where the probability mass is spread throughout the space of edges more evenly, hierarchical GNMs concentrate the mass in smaller regions of the space to reflect dependencies among edges in the network. This produces more realistic network characteristics, but also makes it more difficult to identify candidate networks from the sampling space. In this paper, we propose a novel sampling method, CSAG, to sample from hierarchical GNMs and generate networks with correlated attributes. CSAG constrains every step of the sampling process to consider the structure of the GNM, in order to bias the search toward regions of the space with higher likelihood. We implemented CSAG using mixed Kronecker Product Graph Models and evaluated our approach on three real-world datasets. The results show that CSAG jointly models the correlation and structure of the networks better than the state of the art. Specifically, CSAG maintains the variability of the underlying GNM while providing a ≥ 5X reduction in attribute correlation error.

ACM Transactions on Knowledge Discovery From Data | 2018

Tied Kronecker Product Graph Models to Capture Variance in Network Populations

Sebastian Moreno; Jennifer Neville; Sergey Kirshner

International Joint Conference on Artificial Intelligence | 2017

Unified Representation and Lifted Sampling for Generative Models of Social Networks

Pablo Robles-Granda; Sebastian Moreno; Jennifer Neville

Statistical models of network structure are widely used in network science to reason about the properties of complex systems, where the nodes and edges represent entities and their relationships. Recently, a number of generative network models (GNMs) have been developed that accurately capture characteristics of real-world networks, but since they are typically defined in a procedural manner, it is difficult to identify commonalities in their structure. Moreover, procedural definitions make it difficult to develop statistical sampling algorithms that are both efficient and correct. In this paper, we identify a family of GNMs that share a common latent structure and create a Bayesian network (BN) representation that captures their common form. We show how to reduce two existing GNMs to this representation. Then, using the BN representation, we develop a generalized, efficient, and provably correct sampling method that exploits parametric symmetries and deterministic context-specific dependence. Finally, we use the new representation to design a novel GNM and evaluate it empirically.

International Conference on Data Mining | 2015

Analyzing the Transferability of Collective Inference Models Across Networks

Ransen Niu; Sebastian Moreno; Jennifer Neville

Collective inference models have recently been used to significantly improve the predictive accuracy of node classifications in network domains. However, these methods have generally assumed a fully labeled network is available for learning. There has been relatively little work on transfer learning methods for collective classification, i.e., to exploit labeled data in one network domain to learn a collective classification model to apply in another network. While there has been some work on transfer learning for link prediction and node classification, the proposed methods focus on developing algorithms to adapt the models without a deep understanding of how the network structure impacts transferability. Here we make the key observation that collective classification models are generally composed of local model templates that are rolled out across a heterogeneous network to construct a larger model for inference. Thus, the transferability of a model could depend on similarity of the local model templates and/or the global structure of the data networks. In this work, we study the performance of basic relational models when learned on one network and transferred to another network to apply collective inference. We show, using both synthetic and real data experiments, that transferability of models depends on both the graph structure and local model parameters. Moreover, we show that a probability calibration process (that removes bias due to propagation errors in collective inference) improves transferability.

Privacy, Security, Risk and Trust | 2012