Joseph J. Pfeiffer | Researchain

Publication

Featured research published by Joseph J. Pfeiffer.

international world wide web conferences | 2014

Attributed graph models: modeling network structure with correlated attributes

Joseph J. Pfeiffer; Sebastian Moreno; Timothy La Fond; Jennifer Neville; Brian Gallagher

Online social networks have become ubiquitous in today's society, and the study of data from these networks has improved our understanding of the processes by which relationships form. Research in statistical relational learning focuses on methods to exploit correlations among the attributes of linked nodes to predict user characteristics with greater accuracy. Concurrently, research on generative graph models has primarily focused on modeling network structure without attributes, producing several models that are able to replicate structural characteristics of networks such as power law degree distributions or community structure. However, there has been little work on how to generate networks with real-world structural properties and correlated attributes. In this work, we present the Attributed Graph Model (AGM) framework to jointly model network structure and vertex attributes. Our framework learns the attribute correlations in the observed network and exploits a generative graph model, such as the Kronecker Product Graph Model (KPGM) or the Chung Lu Graph Model (CL), to compute structural edge probabilities. AGM then combines the attribute correlations with the structural probabilities to sample networks conditioned on attribute values, while keeping the expected edge probabilities and degrees of the input graph model. We outline an efficient method for estimating the parameters of AGM, as well as a sampling method based on Accept-Reject sampling to generate edges with correlated attributes. We demonstrate the efficiency and accuracy of our AGM framework on two large real-world networks, showing that AGM scales to networks with hundreds of thousands of vertices while preserving high attribute correlation.
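
The accept-reject idea in the abstract above can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `sample_attributed_edges`, the hand-written acceptance table `accept_prob`, and the toy degrees are all assumptions made for illustration. In AGM the acceptance probabilities are learned from the observed network; here they are simply given. Edges are proposed from a Chung-Lu-style distribution (endpoints drawn proportionally to degree) and kept with a probability that depends on the attribute values of the two endpoints.

```python
import random

def sample_attributed_edges(degrees, attrs, accept_prob, num_edges, seed=0):
    """Propose edges from a Chung-Lu-style distribution, then accept or
    reject each proposal based on the endpoints' attribute values.

    Hypothetical helper for illustration only. The caller must make sure
    that at least `num_edges` distinct acceptable pairs exist, otherwise
    the loop never terminates.
    """
    rng = random.Random(seed)
    nodes = list(range(len(degrees)))
    edges = set()
    while len(edges) < num_edges:
        # Structural proposal: each endpoint is drawn proportionally to degree.
        i, j = rng.choices(nodes, weights=degrees, k=2)
        if i == j:
            continue  # skip self-loops
        # Attribute step: keep the proposal with a probability that
        # depends on the attribute pair (assumed given, not learned here).
        if rng.random() < accept_prob[attrs[i]][attrs[j]]:
            edges.add((min(i, j), max(i, j)))
    return edges

# Toy input: two attribute groups, cross-group edges are always rejected.
degrees = [3, 2, 2, 1, 3, 2, 2, 1]
attrs = [0, 0, 0, 0, 1, 1, 1, 1]
accept_prob = [[1.0, 0.0], [0.0, 1.0]]
sampled = sample_attributed_edges(degrees, attrs, accept_prob, num_edges=6)
```

With this extreme acceptance table every sampled edge joins two vertices with the same attribute value, which makes the effect of the rejection step easy to see.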

IEEE Transactions on Robotics | 2011

Using Bayesian Filtering to Localize Flexible Materials During Manipulation

Robert Platt; Frank Noble Permenter; Joseph J. Pfeiffer

Localization and manipulation of features such as buttons, snaps, or grommets embedded in fabrics and other flexible materials is a difficult robotics problem. Approaches that rely too much on sensing and localization that occurs before touching the material are likely to fail because the flexible material can move when the robot actually makes contact. This paper experimentally explores the possibility to use proprioceptive and load-based tactile information to localize features embedded in flexible materials during robot manipulation. In our experiments, Robonaut 2, a robot with human-like hands and arms, uses particle filtering to localize features based on proprioceptive and tactile measurements. Our main contribution is to propose a method to interact with flexible materials that reduces the state space of the interaction by forcing the material to comply in repeatable ways. Measurements are matched to a “haptic map,” which is created during a training phase, that describes expected measurements as a low-dimensional function of state. We evaluate localization performance when using proprioceptive information alone and when tactile data are also available. The two types of measurements are shown to contain complementary information. We find that the tactile measurement model is critical to localization performance and propose a series of models that offer increasingly better accuracy. Finally, this paper explores the localization approach in the context of two flexible material insertion tasks that are relevant to manufacturing applications.

workshop on applications of computer vision | 2009

A general framework for reconciling multiple weak segmentations of an image

Soumya Ghosh; Joseph J. Pfeiffer; Jane Mulligan

Segmentation, or partitioning images into internally homogeneous regions, is an important first step in many Computer Vision tasks. In this paper, we attack the segmentation problem using an ensemble of low cost image segmentations. These segmentations are reconciled by applying recent techniques from the consensus clustering literature which exploit a Non-negative Matrix Factorization (NMF) framework. We describe extensions to these methods that scale them for large images and also incorporate smoothness constraints. This framework allows us to uniformly and easily combine segmentations from different algorithms or feature modalities. We then demonstrate that popular bottom up image segmentation algorithms, Mean Shift and Efficient Graph Based segmentation, perform no better than our simple combination of multiple image segmentations derived from k-means clustering (of various feature spaces) or from “naive” RGB quantizations. The algorithms are evaluated on the Berkeley image segmentation dataset.

conference on information and knowledge management | 2014

Active Exploration in Networks: Using Probabilistic Relationships for Learning and Inference

Joseph J. Pfeiffer; Jennifer Neville; Paul N. Bennett

Many interesting domains in machine learning can be viewed as networks, with relationships (e.g., friendships) connecting items (e.g., individuals). The Active Exploration (AE) task is to identify all items in a network with a desired trait (i.e., positive labels) given only partial information about the network. The AE process iteratively queries for labels or network structure within a limited budget; thus, accurate predictions prior to making each query are critical to maximizing the number of positives gathered. However, the targeted AE query process produces partially observed networks that can create difficulties for predictive modeling. In particular, we demonstrate that these partial networks can exhibit extreme label correlation bias, which makes it difficult for conventional relational learning methods to accurately estimate relational parameters. To overcome this issue, we model the joint distribution of possible edges and labels to improve learning and inference. Our proposed method, Probabilistic Relational Expectation Maximization (PR-EM), is the first AE approach to accurately learn the complex dependencies between attributes, labels, and structure to improve predictions. PR-EM utilizes collective inference over the missing relationships in the partial network to jointly infer unknown item traits. Further, we develop a linear inference algorithm to facilitate efficient use of PR-EM in large networks. We test our approach on four real-world networks, showing that AE with PR-EM gathers significantly more positive items compared to state-of-the-art methods.

international conference on data mining | 2014

A Scalable Method for Exact Sampling from Kronecker Family Models

Sebastian Moreno; Joseph J. Pfeiffer; Jennifer Neville; Sergey Kirshner

The recent interest in modeling complex networks has fueled the development of generative graph models, such as the Kronecker Product Graph Model (KPGM) and mixed KPGM (mKPGM). The Kronecker family of models is appealing because of its elegant fractal structure, as well as its ability to capture important network characteristics such as degree, diameter, and (in the case of mKPGM) clustering and population variance. In addition, scalable sampling algorithms for KPGMs made the analysis of large-scale, sparse networks feasible for the first time. In this work, we show that the scalable sampling methods, in contrast to prior belief, do not in fact sample from the underlying KPGM distribution and often produce graphs that are very unlikely. To address this issue, we develop a new representation that exploits the structure of Kronecker models and facilitates the development of novel grouped sampling methods that are provably correct. In this paper, we outline efficient algorithms to sample from mKPGMs and KPGMs based on these ideas. Notably, our mKPGM algorithm is the first available scalable sampling method for this model and our KPGM algorithm is both faster and more accurate than previous scalable methods. We conduct both theoretical analysis and empirical evaluation to demonstrate the strengths of our algorithms and show that we can sample a network with 75 million edges in 87 seconds on a single processor.

international conference on data mining | 2014

Composite Likelihood Data Augmentation for Within-Network Statistical Relational Learning

Joseph J. Pfeiffer; Jennifer Neville; Paul N. Bennett

The prevalence of datasets that can be represented as networks has recently fueled a great deal of work in the area of Relational Machine Learning (RML). Due to the statistical correlations between linked nodes in the network, many RML methods focus on predicting node features (i.e., labels) using the network relationships. However, many domains consist of a single, partially labeled network. Thus, relational versions of Expectation Maximization (i.e., R-EM), which jointly learn parameters and infer the missing labels, can outperform methods that learn parameters from the labeled data and apply them for inference on the unlabeled nodes. Although R-EM methods can significantly improve predictive performance in networks that are densely labeled, they do not achieve the same gains in sparsely labeled networks and can perform worse than RML methods. In this work, we show that the fixed-point methods that R-EM uses for approximate learning and inference result in errors that prevent convergence in sparsely labeled networks. We then propose two methods that do not experience this problem. First, we develop a Relational Stochastic EM (R-SEM) method, which uses stochastic parameters that are not as susceptible to approximation errors. Then we develop a Relational Data Augmentation (R-DA) method, which integrates over a range of stochastic parameter values for inference. R-SEM and R-DA can use any collective RML algorithm for learning and inference in partially labeled networks. We analyze their performance with two RML learners over four real-world datasets, and show that they outperform independent learning, RML, and R-EM, particularly in sparsely labeled networks.

social network mining and analysis | 2014

Assortativity in Chung Lu Random Graph Models

Stephen Mussmann; John Moore; Joseph J. Pfeiffer; Jennifer Neville

Due to the widespread interest in networks as a representation to investigate the properties of complex systems, there has been a great deal of interest in generative models of graph structure that can capture the properties of networks observed in the real world. Recent models have focused primarily on accurate characterization of sparse networks with skewed degree distributions, short path lengths, and local clustering. While assortativity (degree correlation among linked nodes) is used as a measure to both describe and evaluate connectivity patterns in networks, there has been little effort to explicitly incorporate patterns of assortativity into model representations. This is because many graph models are edge-based (modeling whether a link should be placed between a pair of nodes i and j) and assortativity is a second-order characteristic that depends on the global properties of the graph (i.e., the final degree of i and j). As such, it is difficult to incorporate direct optimization of assortativity into edge-based generative models. One exception is the BTER method [5], which generates graphs with positive assortativity (e.g., high degree nodes link to each other). However, BTER does not directly estimate assortativity and is not applicable to networks with negative assortativity (e.g., high degree nodes link primarily to low degree nodes). In this work, we present a novel approach to directly model observed assortativity (both positive and negative) via accept-reject sampling. Our key observation is to use a coarse approximation of the observed joint degree distribution and modify the likelihood that two nodes i, j should link based on the output properties of the original model. We implement our approach as an augmentation of Chung-Lu models and refer to it as Binning Chung Lu (BCL). We apply our method to six network datasets and show that it captures assortativity significantly more accurately than other methods while maintaining other graph properties of the original CL models. Also, our BCL approach is efficient (linear in the number of observed edges), and thus scales easily to large networks.
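
To make the notion of assortativity used above concrete, the following sketch computes degree assortativity as the Pearson correlation between the degrees found at the two ends of each edge. It is only an illustration of the measure, not the BCL method itself; the function name `degree_assortativity` is hypothetical, and the input is assumed to be a simple undirected graph given as a list of distinct edges.

```python
from collections import Counter

def degree_assortativity(edges):
    """Pearson correlation of endpoint degrees over all edges.

    Assumes a simple undirected graph given as distinct (u, v) pairs.
    Returns NaN when the correlation is undefined (no edges, or all
    endpoint degrees equal).
    """
    if not edges:
        return float("nan")
    deg = Counter()
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    # Count every undirected edge in both orientations so that the two
    # endpoint sequences have the same mean and variance.
    xs, ys = [], []
    for u, v in edges:
        xs += [deg[u], deg[v]]
        ys += [deg[v], deg[u]]
    n = len(xs)
    mean = sum(xs) / n
    cov = sum((x - mean) * (y - mean) for x, y in zip(xs, ys)) / n
    var = sum((x - mean) ** 2 for x in xs) / n
    return cov / var if var > 0 else float("nan")

# A star graph is maximally disassortative: the hub links only to leaves.
star = [(0, 1), (0, 2), (0, 3), (0, 4)]
star_r = degree_assortativity(star)

# A path on four nodes mixes degree-1 and degree-2 endpoints.
path = [(0, 1), (1, 2), (2, 3)]
path_r = degree_assortativity(path)
```

Negative values, as in both toy graphs here, correspond to the case the abstract describes where high degree nodes link primarily to low degree nodes.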

advances in social networks analysis and mining | 2017

Optimizing the Effectiveness of Incentivized Social Sharing

Joseph J. Pfeiffer; Elena Zheleva

Social media has become an important tool for companies interested in increasing the reach of their products and services. Some companies even offer monetary incentives to customers for recommending products to their social circles. However, the effectiveness of such incentives is often hard to optimize due to the large space of incentive parameters and the inherent tradeoff between the incentive attractiveness for the customer and the return on investment for the company. To address this problem, we propose a novel graph evolution model, the Me+N model, which provides flexibility in exploring the effect of different incentive parameters on a company's profits by capturing the probabilistic nature of customer behavior over time. We look at a specific family of incentives in which customers get a reward if they convince a certain number of friends to purchase a given product. Our analysis shows that simple monetary incentives can be surprisingly effective in social media strategies.

international acm sigir conference on research and development in information retrieval | 2015

Modeling Website Topic Cohesion at Scale to Improve Webpage Classification

Dhivya Eswaran; Paul N. Bennett; Joseph J. Pfeiffer

Considerable work in web page classification has focused on incorporating the topical structure of the web (e.g., the hyperlink graph) to improve prediction accuracy. However, the majority of work has primarily focused on relational or graph-based methods that are impractical to run at scale or in an online environment. This raises the question of whether it is possible to leverage the topical structure of the web while incurring nearly no additional prediction-time cost. To this end, we introduce an approach which adjusts a page content-only classification from that obtained with a global prior to the posterior obtained by incorporating a prior which reflects the topic cohesion of the site. Using ODP data, we empirically demonstrate that our approach yields significant performance increases over a range of topics.
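
The prior adjustment described above can be sketched with the standard Bayes-rule prior replacement: divide out the global prior, multiply in the site-level prior, and renormalize. This is an assumed, generic formulation rather than the paper's exact estimator, and the name `adjust_to_site_prior` and the toy numbers are hypothetical. How the site-level prior is estimated is left out; it is assumed to be given, and every global prior entry is assumed to be nonzero.

```python
def adjust_to_site_prior(posterior, global_prior, site_prior):
    """Re-weight a content-only class posterior from a global prior to a
    site-specific prior, then renormalize so the result sums to one.

    All three arguments are lists indexed by class (topic).
    """
    unnormalized = [
        p * s / g for p, s, g in zip(posterior, site_prior, global_prior)
    ]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]

# Toy example with two topics: the content-only classifier is undecided,
# but the site is (by assumption) mostly about the first topic.
adjusted = adjust_to_site_prior(
    posterior=[0.5, 0.5],
    global_prior=[0.5, 0.5],
    site_prior=[0.8, 0.2],
)
unchanged = adjust_to_site_prior([0.3, 0.7], [0.4, 0.6], [0.4, 0.6])
```

The adjustment costs one multiplication and one division per class at prediction time, which is consistent with the abstract's goal of adding almost no prediction-time cost.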

international conference on weblogs and social media | 2011