Parag Singla

Indian Institute of Technology Delhi

Publication

Featured research published by Parag Singla.

international conference on data mining | 2006

Entity Resolution with Markov Logic

Parag Singla; Pedro M. Domingos

Entity resolution is the problem of determining which records in a database refer to the same entities, and is a crucial and expensive step in the data mining process. Interest in it has grown rapidly, and many approaches have been proposed. However, they tend to address only isolated aspects of the problem, and are often ad hoc. This paper proposes a well-founded, integrated solution to the entity resolution problem based on Markov logic. Markov logic combines first-order logic and probabilistic graphical models by attaching weights to first-order formulas, and viewing them as templates for features of Markov networks. We show how a number of previous approaches can be formulated and seamlessly combined in Markov logic, and how the resulting learning and inference problems can be solved efficiently. Experiments on two citation databases show the utility of this approach, and evaluate the contribution of the different components.
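To make the weight semantics concrete, here is a minimal sketch, assuming a toy domain with two records and a single weighted formula. The predicate names and the weight value are hypothetical and are not taken from the paper. In Markov logic, a world x has probability proportional to exp(sum_i w_i * n_i(x)), where n_i(x) counts the true groundings of formula i in x; the sketch enumerates every world explicitly, which is only feasible for tiny domains.

```python
import itertools
import math

# Hypothetical toy domain: two records, A and B, and two ground atoms.
ATOMS = ["SameName(A,B)", "SameEntity(A,B)"]

# One weighted formula (the weight is an illustrative assumption):
#   1.5  SameName(A,B) => SameEntity(A,B)
WEIGHT = 1.5


def n_true_groundings(world):
    """Count true groundings of the implication in a world (0 or 1 here)."""
    same_name = world["SameName(A,B)"]
    same_entity = world["SameEntity(A,B)"]
    return int((not same_name) or same_entity)


def world_probabilities():
    """Enumerate all worlds and normalise exp(w * n(x)) into probabilities."""
    worlds = [dict(zip(ATOMS, values))
              for values in itertools.product([False, True], repeat=len(ATOMS))]
    scores = [math.exp(WEIGHT * n_true_groundings(w)) for w in worlds]
    z = sum(scores)  # partition function
    return [(w, s / z) for w, s in zip(worlds, scores)]


def p_same_entity_given_same_name():
    """P(SameEntity | SameName), summing over the worlds consistent with each."""
    probs = world_probabilities()
    num = sum(p for w, p in probs
              if w["SameName(A,B)"] and w["SameEntity(A,B)"])
    den = sum(p for w, p in probs if w["SameName(A,B)"])
    return num / den
```

In this toy model the conditional probability equals the logistic function of the weight, so a larger weight pushes the formula closer to a hard constraint.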

international world wide web conferences | 2008

Yes, there is a correlation: - from social networks to personal behavior on the web

Parag Singla; Matthew Richardson

Characterizing the relationship that exists between a person's social group and his/her personal behavior has been a long-standing goal of social network analysts. In this paper, we apply data mining techniques to study this relationship for a population of over 10 million people, by turning to online sources of data. The analysis reveals that people who chat with each other (using instant messaging) are more likely to share interests (their Web searches are the same or topically similar). The more time they spend talking, the stronger this relationship is. People who chat with each other are also more likely to share other personal characteristics, such as their age and location (and they are likely to be of opposite gender). Similar findings hold for people who do not necessarily talk to each other but do have a friend in common. Our analysis is based on a well-defined mathematical formulation of the problem, and is the largest such study we are aware of.

european conference on machine learning | 2005

Object identification with attribute-mediated dependences

Parag Singla; Pedro M. Domingos

Object identification is the problem of determining whether different observations correspond to the same object. It occurs in a wide variety of fields, including vision, natural language, citation matching, and information integration. Traditionally, the problem is solved separately for each pair of observations, followed by transitive closure. We propose solving it collectively, performing simultaneous inference for all candidate match pairs, and allowing information to propagate from one candidate match to another via the attributes they have in common. Our formulation is based on conditional random fields, and allows an optimal solution to be found in polynomial time using a graph cut algorithm. Parameters are learned using a voted perceptron algorithm. Experiments on real and synthetic datasets show that this approach outperforms the standard one.
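The traditional baseline mentioned above (independent pairwise decisions followed by transitive closure) can be sketched with a union-find structure. This is a minimal illustration of that baseline only, not the collective conditional random field approach the paper proposes; the function name and the record identifiers are hypothetical.

```python
def transitive_closure_clusters(records, matched_pairs):
    """Group records into clusters given independently decided pairwise matches."""
    parent = {r: r for r in records}

    def find(x):
        # Follow parent links to the root, halving the path along the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in matched_pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a  # merge the two clusters

    clusters = {}
    for r in records:
        clusters.setdefault(find(r), set()).add(r)
    # Sort for a deterministic result.
    return sorted(sorted(c) for c in clusters.values())
```

For example, if r1 matches r2 and r2 matches r3, the closure places r1 and r3 in the same cluster even though that pair was never compared directly. Because each pairwise decision is made in isolation, one wrong match can merge two unrelated clusters.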

logic in computer science | 2016

Unifying Logical and Statistical AI

Pedro M. Domingos; Daniel Lowd; Stanley Kok; Aniruddh Nath; Hoifung Poon; Matthew Richardson; Parag Singla

Intelligent agents must be able to handle the complexity and uncertainty of the real world. Logical AI has focused mainly on the former, and statistical AI on the latter. Markov logic combines the two by attaching weights to first-order formulas and viewing them as templates for features of Markov networks. Inference algorithms for Markov logic draw on ideas from satisfiability, Markov chain Monte Carlo and knowledge-based model construction. Learning algorithms are based on the voted perceptron, pseudo-likelihood and inductive logic programming. Markov logic has been successfully applied to a wide variety of problems in natural language understanding, vision, computational biology, social networks and others, and is the basis of the open-source Alchemy system.

computer vision and pattern recognition | 2008

Discovery of social relationships in consumer photo collections using Markov Logic

Parag Singla; Henry A. Kautz; Jiebo Luo; Andrew C. Gallagher

We identify the social relationships between individuals in consumer photos. Consumer photos generally do not contain a random gathering of strangers but rather groups of friends and families. Detecting and identifying these relationships are important steps towards understanding consumer image collections. Similar to the approach that a human might use, we use a rule-based system to quantify the domain knowledge (e.g. children tend to be photographed more often than adults; parents tend to appear with their kids). The weight of each rule reflects its importance in the overall prediction model. Learning and inference are based on a sound mathematical formulation using the theory developed in the area of statistical relational models. In particular, we use the language called Markov Logic [14]. We evaluate our model using cross validation on a set of about 4500 photos collected from 13 different users. Our experiments show the potential of our approach by improving the accuracy (as well as other statistical measures) over a set of two different relationship prediction tasks when compared with different baselines. We conclude with directions for future work.

international semantic web conference | 2008

Just Add Weights: Markov Logic for the Semantic Web

Pedro M. Domingos; Daniel Lowd; Stanley Kok; Hoifung Poon; Matthew Richardson; Parag Singla

In recent years, it has become increasingly clear that the vision of the Semantic Web requires uncertain reasoning over rich, first-order representations. Markov logic brings the power of probabilistic modeling to first-order logic by attaching weights to logical formulas and viewing them as templates for features of Markov networks. This gives natural probabilistic semantics to uncertain or even inconsistent knowledge bases with minimal engineering effort. Inference algorithms for Markov logic draw on ideas from satisfiability, Markov chain Monte Carlo and knowledge-based model construction. Learning algorithms are based on the conjugate gradient algorithm, pseudo-likelihood and inductive logic programming. Markov logic has been successfully applied to problems in entity resolution, link prediction, information extraction and others, and is the basis of the open-source Alchemy system.

principles and practice of constraint programming | 2011

Constraint propagation for efficient inference in Markov logic

Tivadar Papai; Parag Singla; Henry A. Kautz

Many real-world problems can be modeled using a combination of hard and soft constraints. Markov Logic is a highly expressive language which represents the underlying constraints by attaching real-valued weights to formulas in first-order logic. The weight of a formula represents the strength of the corresponding constraint. Hard constraints are represented as formulas with infinite weight. The theory is compiled into a ground Markov network over which probabilistic inference can be done. For many problems, hard constraints pose a significant challenge to the probabilistic inference engine. However, solving the hard constraints (partially or fully) beforehand, outside of the probabilistic engine, can hugely simplify the ground Markov network and speed up probabilistic inference. In this work, we propose a generalized arc consistency algorithm that prunes the domains of predicates by propagating hard constraints. Our algorithm effectively performs unit propagation at a lifted level, avoiding the need to explicitly ground the hard constraints during the pre-processing phase, yielding potentially exponential savings in space and time. Our approach results in much simplified domains, thereby making the inference significantly more efficient in terms of both time and memory. Experimental evaluation over one artificial and two real-world datasets shows the benefit of our approach.
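For readers unfamiliar with unit propagation, here is a minimal ground-level sketch. It assumes the hard constraints are already grounded into clauses of integer literals (a negative number denotes a negated atom), which is exactly the grounding step the paper's lifted algorithm avoids; it illustrates the basic operation only and is not the paper's method.

```python
def unit_propagate(clauses):
    """Repeatedly assign literals forced by unit clauses.

    clauses: iterable of sets of non-zero ints (negative = negated atom).
    Returns (assignment, remaining_clauses), or None if a conflict is found.
    """
    clauses = [set(c) for c in clauses]
    assignment = {}
    while True:
        unit = next((c for c in clauses if len(c) == 1), None)
        if unit is None:
            return assignment, clauses
        lit = next(iter(unit))
        assignment[abs(lit)] = lit > 0
        remaining = []
        for clause in clauses:
            if lit in clause:
                continue  # clause is satisfied by the forced literal
            if -lit in clause:
                clause = clause - {-lit}  # the falsified literal drops out
                if not clause:
                    return None  # empty clause: the constraints conflict
            remaining.append(clause)
        clauses = remaining
```

Each forced assignment removes satisfied clauses and shortens others, which is the simplification that makes the subsequent probabilistic inference cheaper.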

Computer Vision and Image Understanding | 2016

Lazy Generic Cuts

Dinesh Khandelwal; Kush Bhatia; Chetan Arora; Parag Singla

An efficient algorithm for inference in binary higher order MRF-MAP is proposed. Scalable to mid sized cliques which could not be done by state of the art. The algorithm is a flow based combinatorial algorithm based on Generic Cuts. In a typical inference problem minimum depends only on small set of constraints. The experiments validate the observation and show superiority over state of the art.

LP relaxation based message passing and flow-based algorithms are two of the popular techniques for performing MAP inference in graphical models. Generic Cuts (GC) (Arora et al., 2015) combines the two approaches to generalize the traditional max-flow min-cut based algorithms for binary models with higher order clique potentials. The algorithm has been shown to be significantly faster than the state of the art algorithms. The time and memory complexities of Generic Cuts are linear in the number of constraints, which in turn is exponential in the clique size. This limits the applicability of the approach to small cliques only. In this paper, we propose a lazy version of Generic Cuts exploiting the property that in most of such inference problems a large fraction of the constraints are never used during the course of minimization. We start with a small set of constraints (called the active constraints) which are expected to play a role during the minimization process. GC is then run with this reduced set allowing it to be efficient in time and memory. The set of active constraints is adaptively learnt over multiple iterations while guaranteeing convergence to the optimum for submodular clique potentials. Our experiments show that the number of constraints required by the algorithm is typically less than 3% of the total number of constraints. Experiments on computer vision datasets show that our approach can significantly outperform the state of the art both in terms of time and memory and is scalable to clique sizes that could not be handled by existing approaches.

international conference on data engineering | 2014

Characterizing comparison shopping behavior: A case study

Mona Gupta; Happy Mittal; Parag Singla; Amitabha Bagchi

In this work we study the behavior of users on online comparison shopping using session traces collected over one year from an Indian mobile phone comparison website: http://smartprix.com. There are two aspects to our study: data analysis and behavior prediction. The first aspect of our study, data analysis, is geared towards providing insights into user behavior that could enable vendors to offer the right kinds of products and prices, and that could help the comparison shopping engine to customize the search based on user preferences. We discover the correlation between the search queries which users write before arriving at the site and their subsequent behavior on it. We have also studied the distribution of users based on geographic location, time of the day, day of the week, number of sessions which have a click to buy (convert), repeat users, and phones/brands visited and compared. We analyze the impact of price change on the popularity of a product and how special events such as the launch of a new model affect the popularity of a brand. Our analysis corroborates intuitions such as that an increase in price leads to a decrease in popularity and vice versa. Further, we characterize the time lag in the effect of such phenomena on popularity. We characterize the user behavior on the website in terms of a sequence of transitions between multiple states (defined in terms of the kind of page being visited, e.g. home, visit, compare etc.). We use KL divergence to show that a time-homogeneous Markov chain is the right model for session traces when the number of clicks varies from 5 to 30. Finally, we build a model using Markov logic that uses the history of the user's activity in a session to predict whether a user is going to click to convert in that session. Our methodology of combining data analysis with machine learning is, in our opinion, a new approach to the empirical study of such data sets.
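The two ingredients named above, a Markov chain over page states and KL divergence, can be sketched as follows. The abstract does not say exactly how the divergence is applied, so this only shows maximum-likelihood transition estimates and the divergence itself. One plausible use, stated here as an assumption, is to compare transition rows estimated from different click positions and check that they stay close. The state names in the usage are illustrative.

```python
import math
from collections import Counter, defaultdict


def estimate_transitions(sessions):
    """Maximum-likelihood transition probabilities from lists of page states."""
    counts = defaultdict(Counter)
    for session in sessions:
        for current, nxt in zip(session, session[1:]):
            counts[current][nxt] += 1
    return {
        state: {nxt: c / sum(row.values()) for nxt, c in row.items()}
        for state, row in counts.items()
    }


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) in nats for discrete distributions given as dicts.

    Outcomes missing from q are floored at eps to avoid division by zero.
    """
    return sum(
        pv * math.log(pv / max(q.get(k, 0.0), eps))
        for k, pv in p.items()
        if pv > 0
    )
```

A divergence near zero between rows estimated at different points in a session would be consistent with time-homogeneity; a large value would argue against it.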

Social Network Analysis and Mining | 2015

On the role of conductance, geography and topology in predicting hashtag virality

Siddharth Bora; Harvineet Singh; Anirban Sen; Amitabha Bagchi; Parag Singla

We focus on three aspects of the early spread of a hashtag in order to predict whether it will go viral: the network properties of the subset of users tweeting the hashtag, its geographical properties, and, most importantly, its conductance-related properties. One of our significant contributions is to discover the critical role played by the conductance-based features for the successful prediction of virality. More specifically, we show that the second derivative of the conductance gives an early indication of whether the hashtag is going to go viral or not. We present a detailed experimental evaluation of the effect of our various categories of features on the virality prediction task. When compared to the baselines and the state-of-the-art techniques proposed in the literature, our feature set is able to achieve significantly better accuracy on a large dataset of 7.7 million users and all their tweets over a period of a month, as well as on existing datasets.
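As a rough illustration of the conductance feature, the sketch below uses the standard graph-theoretic definition (edges leaving a set divided by the smaller of the two side volumes) and a discrete second difference as a stand-in for the second derivative. Both choices are assumptions; the paper's exact definitions and time discretization may differ.

```python
def conductance(adjacency, subset):
    """Conductance of `subset` in an undirected graph given as adjacency lists."""
    subset = set(subset)
    # Edges crossing from the subset to the rest of the graph.
    cut = sum(1 for u in subset for v in adjacency[u] if v not in subset)
    vol_in = sum(len(adjacency[u]) for u in subset)
    vol_out = sum(len(adjacency[u]) for u in adjacency if u not in subset)
    denom = min(vol_in, vol_out)
    return cut / denom if denom else 0.0


def second_difference(series):
    """Discrete second derivative of a time series sampled at equal intervals."""
    return [
        series[i + 1] - 2 * series[i] + series[i - 1]
        for i in range(1, len(series) - 1)
    ]
```

In use, `subset` would be the set of users who have tweeted the hashtag up to each time step, and `second_difference` would be applied to the resulting conductance series.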
