Suju Rajan | Researchain

Archive Network Publication Hotspot Collaboration

Network

Latest external collaboration on country level. Dive into details by clicking on the dots.

Explore More

Hotspot

Dive into the research topics where Suju Rajan is active.

Explore More

Publication

Featured researches published by Suju Rajan.

international world wide web conferences | 2009

Large scale multi-label classification via metalabeler

Lei Tang; Suju Rajan; Vijay K. Narayanan

The explosion of online content has made the management of such content non-trivial. Web-related tasks such as web page categorization, news filtering, query categorization, tag recommendation, etc. often involve the construction of multi-label categorization systems on a large scale. Existing multi-label classification methods either do not scale or have unsatisfactory performance. In this work, we propose MetaLabeler to automatically determine the relevant set of labels for each instance without intensive human involvement or expensive cross-validation. Extensive experiments conducted on benchmark data show that the MetaLabeler tends to outperform existing methods. Moreover, MetaLabeler scales to millions of multi-labeled instances and can be deployed easily. This enables us to apply the MetaLabeler to a large scale query categorization problem in Yahoo!, yielding a significant improvement in performance.

knowledge discovery and data mining | 2015

Large-Scale Distributed Bayesian Matrix Factorization using Stochastic Gradient MCMC

Sungjin Ahn; Anoop Korattikara; Nathan Liu; Suju Rajan; Max Welling

Despite having various attractive qualities such as high prediction accuracy and the ability to quantify uncertainty and avoid over-fitting, Bayesian Matrix Factorization has not been widely adopted because of the prohibitive cost of inference. In this paper, we propose a scalable distributed Bayesian matrix factorization algorithm using stochastic gradient MCMC. Our algorithm, based on Distributed Stochastic Gradient Langevin Dynamics, can not only match the prediction accuracy of standard MCMC methods like Gibbs sampling, but at the same time is as fast and simple as stochastic gradient descent. In our experiments, we show that our algorithm can achieve the same level of prediction accuracy as Gibbs sampling an order of magnitude faster. We also show that our method reduces the prediction error as fast as distributed stochastic gradient descent, achieving a 4.1% improvement in RMSE for the Netflix dataset and an 1.8% for the Yahoo music dataset.

web search and data mining | 2011

Scalable clustering of news search results

Choon Hui Teo; Suju Rajan; Kunal Punera; Byron Dom; Alexander J. Smola; Yi Chang; Zhaohui Zheng

In this paper, we present a system for clustering the search results of a news search engine. The news search interface includes the relevant news articles to a given query organized in terms of related news stories. Here each cluster corresponds to a news story and the news articles are clustered into stories. We present a system that clusters the search results of a news search system in a fast and scalable manner. The clustering system is organized into three components including offline clustering, incremental clustering and realtime clustering. We propose novel techniques for clustering the search results in realtime. The experimental results with large collections of news documents reveal that our system is both scalable and also achieves good accuracy in clustering the news search results.

international world wide web conferences | 2010

A large-scale active learning system for topical categorization on the web

Suju Rajan; Dragomir Yankov; Scott Gaffney; Adwait Ratnaparkhi

Many web applications such as ad matching systems, vertical search engines, and page categorization systems require the identification of a particular type or class of pages on the Web. The sheer number and diversity of the pages on the Web, however, makes the problem of obtaining a good sample of the class of interest hard. In this paper, we describe a successfully deployed end-to-end system that starts from a biased training sample and makes use of several state-of-the-art machine learning algorithms working in tandem, including a powerful active learning component, in order to achieve a good classification system. The system is evaluated on traffic from a real-world ad-matching platform and is shown to achieve high categorization effectiveness with a significant reduction in editorial effort and labeling time.

knowledge discovery and data mining | 2015

Building Discriminative User Profiles for Large-scale Content Recommendation

Erheng Zhong; Nathan Liu; Yue Shi; Suju Rajan

Content recommendation systems are typically based on one of the following paradigms: user based customization, or recommendations based on either collaborative filtering or low rank matrix factorization methods, or with systems that impute user interest profiles based on content browsing behavior and retrieve items similar to the interest profiles. All of these systems have a distinct disadvantage, namely data sparsity and cold-start on items or users. Furthermore, very few content recommendation solutions explicitly model the wealth of information in implicit negative feedback from the users. In this paper, we propose a hybrid solution that makes use of a latent factor model to infer user interest vectors. The hybrid approach enables us to overcome both the data sparsity and cold-start problems. Our proposed method is learned purely on implicit user feedback, both positive and negative. Exploiting the information in the negative feedback allows the user profiles generated to be discriminative. We also provide a Map/Reduce framework based implementation that enables scaling our solution to real-world recommendation problems. We demonstrate the efficacy of our proposed approach with both offline experiments and A/B tests on live traffic on Yahoo properties.

conference on information and knowledge management | 2016

Scaling Factorization Machines with Parameter Server

Erheng Zhong; Yue Shi; Nathan Liu; Suju Rajan

Factorization Machines (FM) have been recognized as an effective learning paradigm for incorporating complex relations to improve item recommendation in recommender systems. However, one open issue of FM lies in its factorized representation (latent factors) for each feature in the observed feature space, a characteristic often resulting in a large parameter space. Therefore, training FM (in other words, learning a large number of parameters in FM) is a computationally expensive task. Our work targets to improve the scalability of FM by building it in a distributed environment. We propose a new system framework that integrates Parameter Server (PS) with the Map/Reduce (MR) framework. In addition to the data parallelism achieved via MR, our framework particularly benefits from PS for model parallelism, a critical characteristic for learning with a large number of parameters in FM. We further address two specific challenges in our system, namely, communication cost and parameter update collision. Through both offline and online experiments on recommendation tasks, we demonstrate that the proposed system framework succeeds in scaling up FM for very large datasets, while it also maintains competitive performance on recommendation quality compared to alternative baselines.

web search and data mining | 2016

Geographic Segmentation via Latent Poisson Factor Model

Rose Yu; Andrew Gelfand; Suju Rajan; Cyrus Shahabi; Yan Liu

Discovering latent structures in spatial data is of critical importance to understanding the user behavior of location-based services. In this paper, we study the problem of geographic segmentation of spatial data, which involves dividing a collection of observations into distinct geo-spatial regions and uncovering abstract correlation structures in the data. We introduce a novel, Latent Poisson Factor (LPF) model to describe spatial count data. The model describes the spatial counts as a Poisson distribution with a mean that factors over a joint item-location latent space. The latent factors are constrained with weak labels to help uncover interesting spatial dependencies. We study the LPF model on a mobile app usage data set and a news article readership data set. We empirically demonstrate its effectiveness on a variety of prediction tasks on these two data sets.

international conference on data mining | 2010

Integer Programming for Multi-class Active Learning

Dragomir Yankov; Suju Rajan; Adwait Ratnaparkhi

Active learning has been demonstrated to be a powerful tool for improving the effectiveness of binary classifiers. It iteratively identifies informative unlabeled examples which after labeling are used to augment the initial training set. Adapting the procedure to large-scale, multi-class classification problems, however, poses certain challenges. For instance, to guarantee improvement by the method we may need to select a large number of examples that require prohibitive labeling resources. Furthermore, the notion of informative examples also changes significantly when multiple classes are considered. In this paper we show that multi-class active learning can be cast into an integer programming framework, where a subset of examples that are informative across maximum number of classes is selected. We test our approach on several large-scale document categorization problems. We demonstrate that in the case of limited labeling resources and large number of classes the proposed method is more effective compared to other known approaches.

conference on recommender systems | 2014