Nemanja Djuric
Yahoo!
Publications
Featured research published by Nemanja Djuric.
Knowledge Discovery and Data Mining | 2015
Mihajlo Grbovic; Vladan Radosavljevic; Nemanja Djuric; Narayan Bhamidipati; Jaikit Savla; Varun Bhagwan; Doug Sharp
In recent years online advertising has become increasingly ubiquitous and effective. Advertisements shown to visitors fund sites and apps that publish digital content, manage social networks, and operate e-mail services. Given such a large variety of internet resources, determining an appropriate type of advertising for a given platform has become critical to financial success. Native advertisements, namely ads that are similar in look and feel to content, have had great success in news and social feeds. However, to date there has not been a winning formula for ads in e-mail clients. In this paper we describe a system that leverages user purchase history determined from e-mail receipts to deliver highly personalized product ads to Yahoo Mail users. We propose a novel neural language-based algorithm specifically tailored for delivering effective product recommendations, which was evaluated against baselines that included showing popular products and products predicted based on co-occurrence. We conducted rigorous offline testing using a large-scale product purchase data set, covering purchases of more than 29 million users from 172 e-commerce websites. Ads in the form of product recommendations were successfully tested on online traffic, where we observed a steady 9% lift in click-through rates over other ad formats in mail, as well as a comparable lift in conversion rates. Following successful tests, the system was launched into production during the holiday season of 2014.
International World Wide Web Conference | 2015
Nemanja Djuric; Hao Wu; Vladan Radosavljevic; Mihajlo Grbovic; Narayan Bhamidipati
We consider the problem of learning distributed representations for documents in data streams. The documents are represented as low-dimensional vectors and are jointly learned with distributed vector representations of word tokens using a hierarchical framework with two embedded neural language models. In particular, we exploit the context of documents in streams and use one of the language models to model the document sequences, and the other to model word sequences within them. The models learn continuous vector representations for both word tokens and documents such that semantically similar documents and words are close in a common vector space. We discuss extensions to our model, which can be applied to personalized recommendation and social relationship mining by adding further user layers to the hierarchy, thus learning user-specific vectors that represent individual preferences. We validated the learned representations on a public movie rating data set from MovieLens, as well as on a large-scale Yahoo News data set comprising three months of user activity logs collected on Yahoo servers. The results indicate that the proposed model can learn useful representations of both documents and word tokens, outperforming the current state of the art by a large margin.
International ACM SIGIR Conference on Research and Development in Information Retrieval | 2015
Mihajlo Grbovic; Nemanja Djuric; Vladan Radosavljevic; Fabrizio Silvestri; Narayan Bhamidipati
Search engines represent one of the most popular web services, visited by more than 85% of internet users on a daily basis. Advertisers are interested in tapping this vast business potential, as the very clear intent signal communicated through the issued query allows effective targeting of users. This idea is embodied in the sponsored search model, where each advertiser maintains a list of keywords they deem indicative of increased user response rate with regard to their business. According to this targeting model, when a query is issued all advertisers with a matching keyword enter an auction according to the amount they bid for the query, and the winner gets to show their ad. One of the main challenges is that a query may not match many keywords, resulting in lower auction value, lower ad quality, and lost revenue for advertisers and publishers. A possible solution, called query rewriting, is to expand a query into a set of related queries and use them to increase the number of matched ads. To this end, we propose a rewriting method based on a novel query embedding algorithm, which jointly models query content as well as its context within a search session. As a result, queries with similar content and context are mapped to nearby vectors in the embedding space, which allows expansion of a query via a simple K-nearest-neighbor search in the projected space. The method was trained on more than 12 billion sessions, one of the largest corpora reported thus far, and evaluated on both a public TREC data set and an in-house sponsored search data set. The results show the proposed approach significantly outperformed the existing state of the art, strongly indicating its benefits and monetization potential.
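The K-nearest-neighbor expansion step described above can be sketched as follows. This is a minimal illustration with invented 2-d vectors and cosine similarity; the actual model learns high-dimensional embeddings from billions of sessions, and the query strings here are hypothetical:

```python
import numpy as np

def expand_query(query, embeddings, k=3):
    """Return the k nearest neighbors of `query` under cosine similarity.
    Toy stand-in for query rewriting via nearest-neighbor search in an
    embedding space; vectors are made up for illustration."""
    q = embeddings[query]
    scores = {}
    for other, vec in embeddings.items():
        if other == query:
            continue
        scores[other] = np.dot(q, vec) / (np.linalg.norm(q) * np.linalg.norm(vec))
    return sorted(scores, key=scores.get, reverse=True)[:k]

# Invented 2-d embeddings where related queries sit close together.
toy = {
    "cheap flights":   np.array([0.90, 0.10]),
    "airline tickets": np.array([0.80, 0.20]),
    "flight deals":    np.array([0.85, 0.15]),
    "pizza near me":   np.array([0.10, 0.90]),
}
print(expand_query("cheap flights", toy, k=2))  # → ['flight deals', 'airline tickets']
```

The rewritten queries returned this way can then be matched against advertiser keyword lists to enter more ads into the auction.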
Knowledge Discovery and Data Mining | 2011
Zhuang Wang; Nemanja Djuric; Koby Crammer; Slobodan Vucetic
Support Vector Machines (SVMs) are among the most popular and successful classification algorithms. Kernel SVMs often reach state-of-the-art accuracies, but suffer from the curse of kernelization: on noisy data the model grows linearly with data size. Linear SVMs can efficiently learn from truly large data, but are applicable to a limited number of domains due to their low representational power. To fill the representability and scalability gap between linear and nonlinear SVMs, we propose the Adaptive Multi-hyperplane Machine (AMM) algorithm, which accomplishes fast training and prediction and has the capability to solve nonlinear classification problems. The AMM model consists of a set of hyperplanes (weights), each assigned to one of the classes, and predicts the class associated with the weight that produces the largest score. The number of weights is determined automatically through an iterative procedure based on stochastic gradient descent, which is guaranteed to converge to a local optimum. Since the generalization bound decreases with the number of weights, a weight pruning mechanism is proposed and analyzed. Experiments on several large data sets show that AMM is nearly as fast during training and prediction as the state-of-the-art linear SVM solver and can be orders of magnitude faster than a kernel SVM. In accuracy, AMM falls between linear and kernel SVMs. For example, on an OCR task with 8 million high-dimensional training examples, AMM trained in 300 seconds on a single-core processor had a 0.54% error rate, significantly lower than the 2.03% error rate of a linear SVM trained in the same time and comparable to the 0.43% error rate of a kernel SVM trained in 2 days on 512 processors. The results indicate that AMM could be an attractive option when solving large-scale classification problems. The software is available at www.dabi.temple.edu/~vucetic/AMM.html.
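The AMM prediction rule (each class owns several hyperplanes; the class whose best hyperplane scores highest wins) can be sketched as below. The weight values are made up for illustration; SGD training and weight pruning from the paper are omitted:

```python
import numpy as np

def amm_predict(x, weights_per_class):
    """Adaptive Multi-hyperplane Machine prediction rule: score each class
    by the maximum dot product over its set of weight vectors, and predict
    the class with the largest such score."""
    best_class, best_score = None, -np.inf
    for label, weights in weights_per_class.items():
        score = max(np.dot(w, x) for w in weights)
        if score > best_score:
            best_class, best_score = label, score
    return best_class

# Two classes, each with two hyperplanes (hypothetical trained weights).
model = {
    0: [np.array([1.0, 0.0]), np.array([0.5, -0.5])],
    1: [np.array([-1.0, 0.5]), np.array([0.0, 1.0])],
}
print(amm_predict(np.array([0.2, 0.9]), model))  # → 1
```

With a single weight per class this reduces to a multiclass linear SVM; allowing multiple weights per class is what gives the model its nonlinear decision boundaries.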
International World Wide Web Conference | 2015
Mihajlo Grbovic; Nemanja Djuric; Vladan Radosavljevic; Narayan Bhamidipati
Determining the user audience for online ad campaigns is a critical problem for companies competing in the online advertising space. One of the most popular strategies is search retargeting, which involves targeting users who issued search queries related to an advertiser's core business, commonly specified by advertisers themselves. However, advertisers often fail to include many relevant queries, which results in suboptimal campaigns and negatively impacts revenue for both advertisers and publishers. To address this issue, we use recently proposed neural language models to learn low-dimensional, distributed query embeddings, which can be used to expand query lists with related queries through simple nearest neighbor searches in the embedding space. Experiments on a real-world data set strongly suggest the benefits of the approach.
Transportation Research Record | 2011
Nemanja Djuric; Vladan Radosavljevic; Vladimir Coric; Slobodan Vucetic
This paper explores the application of the recently proposed continuous conditional random fields (CCRF) to travel forecasting. CCRF is a flexible, probabilistic framework that can seamlessly incorporate multiple traffic predictors and exploit the spatial and temporal correlations inherently present in traffic data. In addition to improving prediction accuracy, the probabilistic approach provides information about prediction uncertainty. Moreover, information about the relative importance of particular predictors and of spatial–temporal correlations can be easily extracted from the model. CCRF is fault-tolerant and can provide predictions even when some observations are missing. Several CCRF models, with increasing levels of complexity, were applied to the problem of travel speed prediction in a range from 10 to 60 min ahead and evaluated on loop detector data from a 5.71-mi section of I-35W in Minneapolis, Minnesota. When these CCRF models were compared with a linear regression model, they reduced the mean absolute error by around 4%. The results imply that modeling spatial and temporal neighborhoods in traffic data and combining various baseline predictors under the CCRF framework can be beneficial.
International Conference on Data Mining | 2014
Nemanja Djuric; Vladan Radosavljevic; Mihajlo Grbovic; Narayan Bhamidipati
Estimating a user's propensity to click on a display ad or purchase a particular item is a critical task in targeted advertising, a burgeoning online industry worth billions of dollars. Better and more accurate estimation methods result in an improved online user experience, as only relevant and interesting ads are shown, and may also lead to large benefits for advertisers, as targeted users are more likely to click or make a purchase. In this paper we address this important problem and propose an approach for improved estimation of ad click or conversion probability based on a sequence of a user's online actions, modeled using a Hidden Conditional Random Field (HCRF). In addition, to address the sparsity issue at the input side of the HCRF model, we propose to learn distributed, low-dimensional representations of user actions through a directed skip-gram, a neural architecture suitable for sequential data. Experimental results on a real-world data set comprising thousands of user sessions collected at Yahoo servers clearly indicate the benefits and potential of the proposed approach, which outperformed competing state-of-the-art algorithms and obtained significant improvements in terms of retrieval measures.
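A toy version of the directed skip-gram idea (each action's vector is trained to predict only the actions that follow it in a session) might look like the sketch below. Action names and sessions are invented, a full softmax replaces negative sampling, and the downstream HCRF model is omitted:

```python
import numpy as np

def directed_skipgram(sessions, dim=8, window=2, lr=0.1, epochs=40, seed=0):
    """Tiny directed skip-gram trained by full-softmax SGD: an action's
    input vector is updated to predict only FUTURE actions within
    `window` steps, reflecting the one-directional context."""
    vocab = sorted({a for s in sessions for a in s})
    idx = {a: i for i, a in enumerate(vocab)}
    rng = np.random.default_rng(seed)
    W_in = rng.normal(0.0, 0.1, (len(vocab), dim))   # action (input) vectors
    W_out = rng.normal(0.0, 0.1, (len(vocab), dim))  # context (output) vectors
    for _ in range(epochs):
        for s in sessions:
            for i, a in enumerate(s):
                for b in s[i + 1 : i + 1 + window]:  # future context only
                    v, u = W_in[idx[a]].copy(), W_out[idx[b]].copy()
                    scores = W_out @ v               # softmax over tiny vocab
                    p = np.exp(scores - scores.max())
                    p /= p.sum()
                    W_in[idx[a]] -= lr * (W_out.T @ p - u)  # grad wrt input vec
                    p[idx[b]] -= 1.0
                    W_out -= lr * np.outer(p, v)            # grad wrt outputs
    return {a: W_in[idx[a]] for a in vocab}

# Hypothetical sessions: "view_ad" and "click_ad" occur in identical contexts.
sessions = [["search", "view_ad", "purchase"],
            ["search", "click_ad", "purchase"]] * 10
emb = directed_skipgram(sessions)
```

The learned dense vectors can then replace sparse one-hot action encodings as inputs to a sequence model such as the HCRF.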
International ACM SIGIR Conference on Research and Development in Information Retrieval | 2016
Mihajlo Grbovic; Nemanja Djuric; Vladan Radosavljevic; Fabrizio Silvestri; Ricardo A. Baeza-Yates; Andrew Feng; Erik Ordentlich; Lee Yang; Gavin Owens
Sponsored search represents a major source of revenue for web search engines. The advertising model brings a unique opportunity for advertisers to target the direct user intent communicated through a search query, usually by displaying their ads alongside organic search results for queries deemed relevant to their products or services. However, due to the large number of unique queries, it is particularly challenging for advertisers to identify all relevant queries. For this reason search engines often provide a service of advanced matching, which automatically finds additional relevant queries for advertisers to bid on. We present a novel advanced matching approach based on the idea of semantic embeddings of queries and ads. The embeddings were learned using a large data set of user search sessions, consisting of search queries, clicked ads, and search links, while utilizing contextual information such as dwell time and skipped ads. To address the large-scale nature of our problem, both in terms of data and vocabulary size, we propose a novel distributed algorithm for training the embeddings. Finally, we present an approach for overcoming the cold-start problem associated with new ads and queries. We report results of editorial evaluation and online tests on actual search traffic. The results show that our approach significantly outperforms baselines in terms of relevance, coverage, and incremental revenue. Lastly, as part of this study, we open-sourced query embeddings that can be used to advance the field.
IEEE Geoscience and Remote Sensing Letters | 2015
Nemanja Djuric; Vladan Radosavljevic; Zoran Obradovic; Slobodan Vucetic
We present a Gaussian conditional random field model for the aggregation of aerosol optical depth (AOD) retrievals from multiple satellite instruments into a joint retrieval. The model provides aggregated retrievals with higher accuracy and coverage than any of the individual instruments while also providing an estimation of retrieval uncertainty. The proposed model finds an optimal temporally smoothed combination of individual retrievals that minimizes the root-mean-squared error of AOD retrieval. We evaluated the model on five years (2006-2010) of satellite data over North America from five instruments (Aqua and Terra MODIS, MISR, SeaWiFS, and the Ozone Monitoring Instrument), collocated with ground-based Aerosol Robotic Network ground-truth AOD readings, clearly showing that the aggregation of different sources leads to improvements in the accuracy and coverage of AOD retrievals.
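In spirit, the aggregation step amounts to finding a combination of instrument retrievals that minimizes RMSE against collocated ground truth. The sketch below uses a plain least-squares combination on synthetic data (instrument noise levels and all numbers are invented; the temporal-smoothing terms of the actual Gaussian CRF are omitted):

```python
import numpy as np

# Synthetic setup: three "instruments" observe the same true AOD signal
# with different noise levels (hypothetical values, for illustration).
rng = np.random.default_rng(0)
true_aod = rng.uniform(0.05, 0.5, size=200)
retrievals = np.stack([
    true_aod + rng.normal(0, 0.02, 200),   # accurate instrument
    true_aod + rng.normal(0, 0.08, 200),   # noisier instrument
    true_aod + rng.normal(0, 0.15, 200),   # noisiest instrument
])

# Least-squares combination weights fit against ground truth -- the
# unary part of the aggregation, without the temporal smoothing terms.
w, *_ = np.linalg.lstsq(retrievals.T, true_aod, rcond=None)
combined = retrievals.T @ w

def rmse(pred):
    return np.sqrt(np.mean((pred - true_aod) ** 2))

print(f"combined RMSE: {rmse(combined):.4f}, "
      f"best single instrument: {min(rmse(r) for r in retrievals):.4f}")
```

Because the least-squares solution can always fall back to putting all weight on the best single instrument, the combined retrieval's fitting error is never worse than the best individual one, which mirrors the accuracy gains from aggregation reported above.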
Scientific Reports | 2016
Djordje Gligorijevic; Jelena Stojanovic; Nemanja Djuric; Vladan Radosavljevic; Mihajlo Grbovic; Rob J. Kulathinal; Zoran Obradovic
Data-driven phenotype analyses of Electronic Health Record (EHR) data have recently yielded benefits across many areas of clinical practice, uncovering new links in the medical sciences that can potentially affect the well-being of millions of patients. In this paper, EHR data is used to discover novel relationships between diseases by studying their comorbidities (co-occurrences in patients). A novel embedding model is designed to extract knowledge from disease comorbidities by learning from a large-scale EHR database comprising more than 35 million inpatient cases spanning nearly a decade, revealing significant improvements in disease phenotyping over current computational approaches. In addition, the proposed methodology is extended to discover novel disease-gene associations by incorporating valuable domain knowledge from genome-wide association studies. The approach was evaluated against a held-out set, where it again produced very compelling results. For selected diseases, we further identify candidate gene lists for which disease-gene associations were not studied previously. Our approach thus provides biomedical researchers with new tools to filter genes of interest and reduce costly lab studies.