Nagarajan Natarajan | Researchain

Publication

Featured research published by Nagarajan Natarajan.

Conference on Information and Knowledge Management | 2011

Exploiting longer cycles for link prediction in signed networks

Kai-Yang Chiang; Nagarajan Natarajan; Ambuj Tewari; Inderjit S. Dhillon

We consider the problem of link prediction in signed networks. Such networks arise on the web in a variety of ways when users can implicitly or explicitly tag their relationship with other users as positive or negative. The signed links thus created reflect social attitudes of the users towards each other in terms of friendship or trust. Our first contribution is to show how any quantitative measure of social imbalance in a network can be used to derive a link prediction algorithm. Our framework allows us to reinterpret some existing algorithms as well as derive new ones. Second, we extend the approach of Leskovec et al. (2010) by presenting a supervised machine learning based link prediction method that uses features derived from longer cycles in the network. The supervised method outperforms all previous approaches on 3 networks drawn from sources such as Epinions, Slashdot and Wikipedia. The supervised approach easily scales to these networks, the largest of which has 132k nodes and 841k edges. Most real-world networks have an overwhelmingly large proportion of positive edges and it is therefore easy to get a high overall accuracy at the cost of a high false positive rate. We see that our supervised method not only achieves good accuracy for sign prediction but is also especially effective in lowering the false positive rate.
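As a minimal sketch of the cycle-based idea, restricted to cycles of length 3 only, the sign of a missing edge can be guessed from the signed two-step paths between its endpoints. The function name `predict_sign`, the toy network, and the tie-breaking rule are illustrative assumptions, not the features or classifier used in the paper.

```python
def predict_sign(adj, i, j):
    """Predict the sign of edge (i, j) from signed two-step paths i-k-j.

    adj is a symmetric signed adjacency matrix with entries in {-1, 0, +1}.
    Each common neighbor k votes adj[i][k] * adj[k][j]: giving (i, j) that
    sign makes the triangle i-k-j balanced (product of its signs positive).
    """
    n = len(adj)
    score = sum(adj[i][k] * adj[k][j] for k in range(n) if k not in (i, j))
    # Assumption: ties go to the positive class, the majority in most networks.
    return 1 if score >= 0 else -1


# Toy network: 0 and 1 trust each other, both distrust 2, and 2 trusts 3.
adj = [
    [0, 1, -1, 0],
    [1, 0, -1, 0],
    [-1, -1, 0, 1],
    [0, 0, 1, 0],
]

sign_0_3 = predict_sign(adj, 0, 3)  # 0 distrusts 2, and 2 trusts 3
sign_0_1 = predict_sign(adj, 0, 1)  # 0 and 1 share a distrusted neighbor
```

A supervised variant in the spirit of the abstract would feed counts of such signed paths, for several path lengths, into a classifier instead of thresholding a single score.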

Bioinformatics | 2014

Inductive matrix completion for predicting gene–disease associations

Nagarajan Natarajan; Inderjit S. Dhillon

Motivation: Most existing methods for predicting causal disease genes rely on specific types of evidence, and are therefore limited in terms of applicability. More often than not, the type of evidence available for diseases varies; for example, we may know linked genes, keywords associated with the disease obtained by mining text, or co-occurrence of disease symptoms in patients. Similarly, the type of evidence available for genes varies; for example, specific microarray probes convey information only for certain sets of genes. In this article, we apply a novel matrix-completion method called Inductive Matrix Completion to the problem of predicting gene–disease associations; it combines multiple types of evidence (features) for diseases and genes to learn latent factors that explain the observed gene–disease associations. We construct features from different biological sources such as microarray expression data and disease-related textual data. A crucial advantage of the method is that it is inductive; it can be applied to diseases not seen at training time, unlike traditional matrix-completion approaches and network-based inference methods that are transductive. Results: Comparison with state-of-the-art methods on diseases from the Online Mendelian Inheritance in Man (OMIM) database shows that the proposed approach is substantially better: it has a close to one-in-four chance of recovering a true association in the top 100 predictions, compared to the recently proposed Catapult method (second best), which has a <15% chance. We demonstrate that the inductive method is particularly effective for a query disease with no previously known gene associations, and for predicting novel genes, i.e. genes not previously linked to diseases. Thus the method is capable of predicting novel genes even for well-characterized diseases. We also validate the novelty of predictions by evaluating the method on recently reported OMIM associations and on associations recently reported in the literature. Availability: Source code and datasets can be downloaded from http://bigdata.ices.utexas.edu/project/gene-disease. Contact: [email protected]
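The inductive property can be illustrated with a small scoring sketch. It assumes a bilinear model in which a gene feature vector x and a disease feature vector y are mapped to latent factors by matrices W and H and scored as x^T W H^T y. The names `project`, `imc_score`, `W`, and `H` are hypothetical, and the matrices are set by hand here rather than learned from observed associations.

```python
def project(features, factor):
    """Map a feature vector (length f) to a latent vector (length k) via an f x k matrix."""
    k = len(factor[0])
    return [sum(features[a] * factor[a][c] for a in range(len(features))) for c in range(k)]


def imc_score(gene_features, disease_features, W, H):
    """Bilinear association score x^T W H^T y between one gene and one disease."""
    u = project(gene_features, W)      # latent factors of the gene
    v = project(disease_features, H)   # latent factors of the disease
    return sum(p * q for p, q in zip(u, v))


# Hand-set factor matrices (assumption: in practice these are fit to data).
W = [[1, 0], [0, 1], [1, 1]]   # 3 gene features -> 2 latent dimensions
H = [[1, 0], [0, 2]]           # 2 disease features -> 2 latent dimensions

# A disease with no known gene associations can still be scored,
# because only its feature vector is needed.
new_disease = [0, 1]
score = imc_score([1, 0, 1], new_disease, W, H)
```

Because the score depends on features rather than on a row or column index, a disease absent from training needs no retraining to be ranked against all genes.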

ACM Transactions on Intelligent Systems and Technology | 2011

Scalable Affiliation Recommendation using Auxiliary Networks

Vishvas Vasuki; Nagarajan Natarajan; Zhengdong Lu; Berkant Savas; Inderjit S. Dhillon

Social network analysis has attracted increasing attention in recent years. In many social networks, besides friendship links among users, the phenomenon of users associating themselves with groups or communities is common. Thus, two networks exist simultaneously: the friendship network among users, and the affiliation network between users and groups. In this article, we tackle the affiliation recommendation problem, where the task is to predict or suggest new affiliations between users and communities, given the current state of the friendship and affiliation networks. More generally, affiliations need not be community affiliations; they can be a user’s taste, so affiliation recommendation algorithms have applications beyond community recommendation. In this article, we show that information from the friendship network can indeed be fruitfully exploited in making affiliation recommendations. Using a simple way of combining these networks, we suggest two models of user-community affinity for the purpose of making affiliation recommendations: one based on graph proximity, and another using latent factors to model users and communities. We explore the affiliation recommendation algorithms suggested by these models and evaluate these algorithms on two real-world networks, Orkut and YouTube. In doing so, we motivate and propose a way of evaluating recommenders, by measuring how good the top 50 recommendations are for the average user, and demonstrate the importance of choosing the right evaluation strategy. The algorithms suggested by the graph proximity model turn out to be the most effective. We also introduce scalable versions of these algorithms, and demonstrate their effectiveness. This use of link prediction techniques for the purpose of affiliation recommendation is, to our knowledge, novel.

PLOS ONE | 2013

Prediction and Validation of Gene-Disease Associations Using Methods Inspired by Social Network Analyses

U. Martin Singh-Blom; Nagarajan Natarajan; Ambuj Tewari; John O. Woods; Inderjit S. Dhillon; Edward M. Marcotte

Correctly identifying associations of genes with diseases has long been a goal in biology. With the emergence of large-scale gene-phenotype association datasets in biology, we can leverage statistical and machine learning methods to help us achieve this goal. In this paper, we present two methods for predicting gene-disease associations based on functional gene associations and gene-phenotype associations in model organisms. The first method, the Katz measure, is motivated by its success in social network link prediction, and is very closely related to some of the recent methods proposed for gene-disease association inference. The second method, called Catapult (Combining dATa Across species using Positive-Unlabeled Learning Techniques), is a supervised machine learning method that uses a biased support vector machine where the features are derived from walks in a heterogeneous gene-trait network. We study the performance of the proposed methods and related state-of-the-art methods using two different evaluation strategies, on two distinct data sets, namely OMIM phenotypes and drug-target interactions. Finally, by measuring the performance of the methods using two different evaluation strategies, we show that even though both methods perform very well, the Katz measure is better at identifying associations between traits and poorly studied genes, whereas Catapult is better suited to correctly identifying gene-trait associations overall. The authors want to thank Jon Laurent and Kris McGary for some of the data used, and Li and Patra for making their code available. Most of Ambuj Tewari's contribution to this work happened while he was a postdoctoral fellow at the University of Texas at Austin.
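The Katz measure scores a pair of nodes by summing over walks between them, with each walk of length l damped by beta^l. The sketch below truncates the sum at a fixed length and runs on a toy path graph. The truncation, the function names, and the graph are assumptions for illustration; the paper applies the measure to a heterogeneous gene-trait network.

```python
def matmul(a, b):
    """Plain dense matrix product for small lists of lists."""
    m, p = len(b), len(b[0])
    return [[sum(row[k] * b[k][j] for k in range(m)) for j in range(p)] for row in a]


def truncated_katz(adj, beta, max_len):
    """Return the matrix sum of beta**l * adj**l for l = 1..max_len."""
    n = len(adj)
    scores = [[0.0] * n for _ in range(n)]
    power = [row[:] for row in adj]  # adj**1: number of walks of length 1
    weight = beta
    for _ in range(max_len):
        for i in range(n):
            for j in range(n):
                scores[i][j] += weight * power[i][j]
        power = matmul(power, adj)   # walks one step longer
        weight *= beta               # longer walks count less
    return scores


# Toy path graph 0 - 1 - 2 - 3.
path = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
katz = truncated_katz(path, beta=0.5, max_len=3)
```

On this graph node 2 scores higher than node 3 as a candidate link for node 0, since it is reachable by a shorter walk.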

Conference on Recommender Systems | 2013

Which app will you use next?: collaborative filtering with interactional context

Nagarajan Natarajan; Donghyuk Shin; Inderjit S. Dhillon

The application a smartphone user will launch next intuitively depends on the sequence of apps used recently. More generally, when users interact with systems such as shopping websites or online radio, they click on items that are of interest in the current context. We call the sequence of clicks made in the current session interactional context. It is desirable for a recommender system to use the context set by the user to update recommendations. Most current context-aware recommender systems focus on a relatively less dynamic representational context defined by attributes such as season, location and tastes. In this paper, we study the problem of collaborative filtering with interactional context, where the goal is to make personalized and dynamic recommendations to a user engaged in a session. To this end, we propose an algorithm that works in two stages. First, users are clustered by their transition behavior (one-step Markov transition probabilities between items), and cluster-level Markov models are computed. Then personalized PageRank is computed for a given user on the corresponding cluster Markov graph, with a personalization vector derived from the current context. We give an interpretation of the second stage of the algorithm as adding an appropriate context bias, in addition to click bias (or rating bias), to a classical neighborhood-based collaborative filtering model, where the neighborhood is determined from a Markov graph. Experimental results on two real-life datasets demonstrate the superior performance of our algorithm, where we achieve at least 20% (up to 37%) improvement over competitive methods in the recall level at top-20.
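The second stage described above can be sketched as follows, assuming a single cluster (the user clustering step is omitted). One-step transition probabilities are estimated from sessions, and personalized PageRank is then run by power iteration with the personalization vector concentrated on the current context. The function names, the damping factor of 0.85, and the uniform treatment of items with no outgoing transitions are illustrative assumptions, not details taken from the paper.

```python
def transition_matrix(sessions, n_items):
    """Row-normalized one-step transition counts between consecutive items."""
    counts = [[0.0] * n_items for _ in range(n_items)]
    for session in sessions:
        for a, b in zip(session, session[1:]):
            counts[a][b] += 1.0
    matrix = []
    for row in counts:
        total = sum(row)
        # Assumption: items never followed by anything get a uniform row.
        matrix.append([c / total for c in row] if total else [1.0 / n_items] * n_items)
    return matrix


def personalized_pagerank(P, personalization, alpha=0.85, iters=100):
    """Power iteration for rank = alpha * rank * P + (1 - alpha) * personalization."""
    n = len(P)
    rank = personalization[:]
    for _ in range(iters):
        rank = [
            alpha * sum(rank[i] * P[i][j] for i in range(n)) + (1 - alpha) * personalization[j]
            for j in range(n)
        ]
    return rank


# Toy sessions over 4 apps, identified by index.
sessions = [[0, 1, 2], [0, 1, 3], [2, 0, 1]]
P = transition_matrix(sessions, 4)
# Current context: the user has just used app 0.
rank = personalized_pagerank(P, [1.0, 0.0, 0.0, 0.0])
```

Excluding the app already in use, the highest-ranked entry is app 1, which follows app 0 in every toy session.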

Conference on Recommender Systems | 2010

Affiliation recommendation using auxiliary networks

Vishvas Vasuki; Nagarajan Natarajan; Zhengdong Lu; Inderjit S. Dhillon

Social network analysis has attracted increasing attention in recent years. In many social networks, besides friendship links amongst users, the phenomenon of users associating themselves with groups or communities is common. Thus, two networks exist simultaneously: the friendship network among users, and the affiliation network between users and groups. In this paper, we tackle the affiliation recommendation problem, where the task is to predict or suggest new affiliations between users and communities, given the current state of the friendship and affiliation networks. More generally, affiliations need not be community affiliations; they can be a user's taste, so affiliation recommendation algorithms have applications beyond community recommendation. In this paper, we show that information from the friendship network can indeed be fruitfully exploited in making affiliation recommendations. Using a simple way of combining these networks, we suggest two models of user-community affinity for the purpose of making affiliation recommendations: one based on graph proximity, and another using latent factors to model users and communities. We explore the two classes of affiliation recommendation algorithms suggested by these models. We evaluate these algorithms on two real-world networks, Orkut and YouTube. In doing so, we motivate and propose a way of evaluating recommenders, by measuring how good the top 50 recommendations are for the average user, and demonstrate the importance of choosing the right evaluation strategy. The algorithms suggested by the graph proximity model turn out to be the most effective and efficient. This use of link prediction techniques for the purpose of affiliation recommendation is, to our knowledge, novel.

Advances in Social Networks Analysis and Mining | 2013

Community detection in content-sharing social networks

Nagarajan Natarajan; Prithviraj Sen; Vineet Chaoji

Network structure and content in microblogging sites like Twitter influence each other: user A on Twitter follows user B for the tweets that B posts on the network, and A may then re-tweet the content shared by B to his/her own followers. In this paper, we propose a probabilistic model to jointly model link communities and content topics by leveraging both the social graph and the content shared by users. We model a community as a distribution over users, use it as a source for topics of interest, and jointly infer both communities and topics using Gibbs sampling. While modeling communities using the social graph and modeling topics using content have each received a great deal of attention, only a few recent approaches try to model topics in content-sharing platforms using both content and social graph. Our work differs from the existing generative models in that we explicitly model the social graph of users along with the user-generated content, mimicking how the two entities co-evolve in content-sharing platforms. Recent studies have found Twitter to be more of a content-sharing network and less a social network, and it seems hard to detect tightly knit communities from the follower-followee links. Still, the question of whether we can extract Twitter communities using both links and content is open. In this paper, we answer this question in the affirmative. Our model discovers coherent communities and topics, as evinced by qualitative results on sub-graphs of Twitter users. Furthermore, we evaluate our model on the task of predicting follower-followee links. We show that joint modeling of links and content significantly improves link prediction performance on a sub-graph of Twitter (consisting of about 0.7 million users and over 27 million tweets), compared to generative models based on only structure or only content and to path-based methods such as Katz.

IEEE International Workshop on Computational Advances in Multi-Sensor Adaptive Processing | 2015

PU matrix completion with graph information

Nagarajan Natarajan; Nikhil Rao; Inderjit S. Dhillon

Motivated by applications in recommendation systems and bioinformatics, we consider the problem of completing a low-rank, partially observed binary matrix with graph information. We show that the corresponding problem can be set up in a positive and unlabeled data learning framework (referred to as PU learning in the literature). We make connections to convex optimization and show that existing greedy methods can be used to solve the problem. Experiments on simulated data as well as gene-disease association data from bioinformatics show that using graphs and adapting matrix completion to the PU learning setting yield advantages over standard binary matrix completion.

Neural Information Processing Systems | 2013