Narayan Bhamidipati
Yahoo!
Network
Latest external collaboration on country level. Dive into details by clicking on the dots.
Publication
Featured researches published by Narayan Bhamidipati.
knowledge discovery and data mining | 2015
Mihajlo Grbovic; Vladan Radosavljevic; Nemanja Djuric; Narayan Bhamidipati; Jaikit Savla; Varun Bhagwan; Doug Sharp
In recent years online advertising has become increasingly ubiquitous and effective. Advertisements shown to visitors fund sites and apps that publish digital content, manage social networks, and operate e-mail services. Given such large variety of internet resources, determining an appropriate type of advertising for a given platform has become critical to financial success. Native advertisements, namely ads that are similar in look and feel to content, have had great success in news and social feeds. However, to date there has not been a winning formula for ads in e-mail clients. In this paper we describe a system that leverages user purchase history determined from e-mail receipts to deliver highly personalized product ads to Yahoo Mail users. We propose to use a novel neural language-based algorithm specifically tailored for delivering effective product recommendations, which was evaluated against baselines that included showing popular products and products predicted based on co-occurrence. We conducted rigorous offline testing using a large-scale product purchase data set, covering purchases of more than 29 million users from 172 e-commerce websites. Ads in the form of product recommendations were successfully tested on online traffic, where we observed a steady 9% lift in click-through rates over other ad formats in mail, as well as comparable lift in conversion rates. Following successful tests, the system was launched into production during the holiday season of 2014.
international world wide web conferences | 2015
Nemanja Djuric; Hao Wu; Vladan Radosavljevic; Mihajlo Grbovic; Narayan Bhamidipati
We consider the problem of learning distributed representations for documents in data streams. The documents are represented as low-dimensional vectors and are jointly learned with distributed vector representations of word tokens using a hierarchical framework with two embedded neural language models. In particular, we exploit the context of documents in streams and use one of the language models to model the document sequences, and the other to model word sequences within them. The models learn continuous vector representations for both word tokens and documents such that semantically similar documents and words are close in a common vector space. We discuss extensions to our model, which can be applied to personalized recommendation and social relationship mining by adding further user layers to the hierarchy, thus learning user-specific vectors to represent individual preferences. We validated the learned representations on a public movie rating data set from MovieLens, as well as on a large-scale Yahoo News data comprising three months of user activity logs collected on Yahoo servers. The results indicate that the proposed model can learn useful representations of both documents and word tokens, outperforming the current state-of-the-art by a large margin.
international acm sigir conference on research and development in information retrieval | 2015
Mihajlo Grbovic; Nemanja Djuric; Vladan Radosavljevic; Fabrizio Silvestri; Narayan Bhamidipati
Search engines represent one of the most popular web services, visited by more than 85% of internet users on a daily basis. Advertisers are interested in making use of this vast business potential, as very clear intent signal communicated through the issued query allows effective targeting of users. This idea is embodied in a sponsored search model, where each advertiser maintains a list of keywords they deem indicative of increased user response rate with regards to their business. According to this targeting model, when a query is issued all advertisers with a matching keyword are entered into an auction according to the amount they bid for the query, and the winner gets to show their ad. One of the main challenges is the fact that a query may not match many keywords, resulting in lower auction value, lower ad quality, and lost revenue for advertisers and publishers. Possible solution is to expand a query into a set of related queries and use them to increase the number of matched ads, called query rewriting. To this end, we propose rewriting method based on a novel query embedding algorithm, which jointly models query content as well as its context within a search session. As a result, queries with similar content and context are mapped into vectors close in the embedding space, which allows expansion of a query via simple K-nearest neighbor search in the projected space. The method was trained on more than 12 billion sessions, one of the largest corpuses reported thus far, and evaluated on both public TREC data set and in-house sponsored search data set. The results show the proposed approach significantly outperformed existing state-of-the-art, strongly indicating its benefits and the monetization potential.
international world wide web conferences | 2015
Mihajlo Grbovic; Nemanja Djuric; Vladan Radosavljevic; Narayan Bhamidipati
Determining user audience for online ad campaigns is a critical problem to companies competing in online advertising space. One of the most popular strategies is search retargeting, which involves targeting users that issued search queries related to advertisers core business, commonly specified by advertisers themselves. However, advertisers often fail to include many relevant queries, which results in suboptimal campaigns and negatively impacts revenue for both advertisers and publishers. To address this issue, we use recently proposed neural language models to learn low-dimensional, distributed query embeddings, which can be used to expand query lists with related queries through simple nearest neighbor searches in the embedding space. Experiments on real-world data set strongly suggest benefits of the approach.
IEEE Transactions on Knowledge and Data Engineering | 2009
Narayan Bhamidipati; Sankar K. Pal
Often, ranking is performed on the the basis of some scores available for each item. The existing practice for comparing scoring functions is to compare the induced rankings by one of the multitude of rank comparison methods available in the literature. We suggest that it may be better to compare the underlying scores themselves. To this end, a generalized Kendall distance is defined, which takes into consideration not only the final ordering that the two schemes produce, but also at the spacing between pairs of scores. This is shown to be equivalent to comparing the scores after fusing with another set of scores, making it theoretically interesting. A top k version of the score comparison methodology is also provided. Experimental results clearly show the advantages score comparison has over rank comparison.
systems man and cybernetics | 2007
Narayan Bhamidipati; Sankar K. Pal
A novel corpus-based method for stemmer refinement, which can provide improvement in both classification and retrieval, is described. The method models the given words as generated from a multinomial distribution over the topics available in the corpus and includes a procedurelike sequential hypothesis testing that enables grouping together distributionally similar words. The system can refine any stemmer, and its strength can be controlled with parameters that reflect the amount of tolerance to be allowed in computing the similarity between the distributions of two words. Although obtaining the morphological roots of the given words is not the primary objective, the algorithm automatically does that to some extent. Despite a huge reduction in dictionary size, classification accuracies are seen to improve significantly when the proposed system is applied on some existing stemmers for classifying 20 Newsgroups and WebKB data. The refinements obtained are also suitable for cross-corpus stemming. Regarding retrieval, its superiority is extensively demonstrated with respect to four existing methods
international conference on data mining | 2014
Nemanja Djuric; Vladan Radosavljevic; Mihajlo Grbovic; Narayan Bhamidipati
Estimating a users propensity to click on a display ad or purchase a particular item is a critical task in targeted advertising, a burgeoning online industry worth billions of dollars. Better and more accurate estimation methods result in improved online user experience, as only relevant and interesting ads are shown, and may also lead to large benefits for advertisers, as targeted users are more likely to click or make a purchase. In this paper we address this important problem, and propose an approach for improved estimation of ad click or conversion probability based on a sequence of users online actions, modeled using Hidden Conditional Random Fields (HCRF) model. In addition, in order to address the sparsity issue at the input side of the HCRF model, we propose to learn distributed, low-dimensional representations of user actions through a directed skip-gram, a neural architecture suitable for sequential data. Experimental results on a real-world data set comprising thousands of user sessions collected at Yahoo servers clearly indicate the benefits and the potential of the proposed approach, which outperformed competing state-of-the-art algorithms and obtained significant improvements in terms of retrieval measures.
international world wide web conferences | 2011
Michael Grabchak; Narayan Bhamidipati; Rushi Bhatt; Dinesh Garg
Stochastic knapsack problems deal with selecting items with potentially random sizes and rewards so as to maximize the total reward while satisfying certain capacity constraints. A novel variant of this problem, where items are worthless unless collected in bundles, is introduced here. This setup is similar to the Groupon model, where a deal is off unless a minimum number of users sign up for it. Since the optimal algorithm to solve this problem is not practical, several adaptive greedy approaches with reasonable time and memory requirements are studied in detail - theoretically, as well as, experimentally. Worst case performance guarantees are provided for some of these greedy algorithms, while results of experimental evaluation demonstrate that they are much closer to optimal than what the theoretical bounds suggest. Applications include optimizing for online advertising pricing models where advertisers pay only when certain goals, in terms of clicks or conversions, are met. We perform extensive experiments for the situation where there are between two and five ads. For typical ad conversion rates, the greedy policy of selecting items having the highest individual expected reward obtains a value within 5% of optimal over 95% of the time for a wide selection of parameters.
international conference on pattern recognition | 2006
Narayan Bhamidipati; Sankar K. Pal
Several methods for comparing rankings are available. These rankings are generally performed on the basis of some scores available for each item. We suggest that it may be better to compare the underlying scores themselves. To this effect, a metric has been provided which is equivalent to comparing the scores after fusing with another set of scores, making it theoretically interesting. This metric is a generalization of Kendall distance. In the present article, we assume the distribution of the scores to be fused with to be unknown. Some characteristics of the proposed methodology are studied and preliminary experimental results are reported
international world wide web conferences | 2016
Vladan Radosavljevic; Mihajlo Grbovic; Nemanja Djuric; Narayan Bhamidipati; Daneo Zhang; Jack Wang; Jiankai Dang; Haiying Huang; Ananth Nagarajan; Peiji Chen
Last decade has witnessed a tremendous expansion of mobile devices, which brought an unprecedented opportunity to reach a large number of mobile users at any point in time. This resulted in a surge of interest of mobile operators and ad publishers to understand usage patterns of mobile apps and allow more relevant content recommendations. Due to a large input space, a critical step in understanding app usage patterns is reducing sparseness by classifying apps into predefined interest taxonomies. However, besides short name and noisy description majority of apps have very limited information available, which makes classification a challenging task. We address this issue and present a novel method to classify apps into interest categories by: 1) embedding apps into low-dimensional space using a neural language model applied on smartphone logs; and 2) applying k-nearest-neighbors classification in the embedding space. To validate the method we run experiments on more than one billion device logs covering hundreds of thousands of apps. To the best of our knowledge this is the first app categorization study at this scale. Empirical results show that the proposed method outperforms the current state-of-the-art.