Samuel Ieong | Researchain

Publication

Featured researches published by Samuel Ieong.

web search and data mining | 2009

Diversifying search results

Rakesh Agrawal; Sreenivas Gollapudi; Alan Halverson; Samuel Ieong

We study the problem of answering ambiguous web queries in a setting where there exists a taxonomy of information and both queries and documents may belong to more than one category according to this taxonomy. We present a systematic approach to diversifying results that aims to minimize the risk of dissatisfaction of the average user. We propose an algorithm that well approximates this objective in general, and is provably optimal for a natural special case. Furthermore, we generalize several classical IR metrics, including NDCG, MRR, and MAP, to explicitly account for the value of diversification. We demonstrate empirically that our algorithm scores higher in these generalized metrics compared to results produced by commercial search engines.
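
The idea of diversifying results across taxonomy categories can be illustrated with a minimal greedy sketch. It is not taken from the abstract: it assumes a coverage-style objective in which each category `c` has a probability of being the user's intent and each document has a per-category value in [0, 1] read as the chance it satisfies that intent. The function name, input layout, and the example numbers are all hypothetical.

```python
def greedy_diversify(category_probs, doc_values, k):
    """Pick up to k documents, greedily covering likely intents.

    category_probs: dict mapping category -> probability it is the intent (assumed input)
    doc_values: dict mapping doc id -> {category: value in [0, 1]} (assumed input)
    """
    # Probability mass of each category that is still "unsatisfied".
    residual = dict(category_probs)
    remaining = set(doc_values)
    selected = []

    def gain(doc):
        # Expected additional satisfaction from adding this document.
        return sum(residual.get(c, 0.0) * v for c, v in doc_values[doc].items())

    while remaining and len(selected) < k:
        # Sorting makes tie-breaking deterministic (first maximal doc id wins).
        best = max(sorted(remaining), key=gain)
        selected.append(best)
        remaining.remove(best)
        # Discount categories the chosen document already covers.
        for c, v in doc_values[best].items():
            if c in residual:
                residual[c] *= 1.0 - v
    return selected


# Hypothetical ambiguous query with two intents.
probs = {"car": 0.6, "animal": 0.4}
docs = {
    "d1": {"car": 0.9},
    "d2": {"car": 0.8},
    "d3": {"animal": 0.7},
}
print(greedy_diversify(probs, docs, 2))  # ['d1', 'd3']
```

In this toy input, ranking by raw value alone would return two car pages; the discounting step is what lets the animal page take the second slot.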

web search and data mining | 2012

Domain bias in web search

Samuel Ieong; Nina Mishra; Eldar Sadikov; Li Zhang

This paper uncovers a new phenomenon in web search that we call domain bias: a user's propensity to believe that a page is more relevant just because it comes from a particular domain. We provide evidence of the existence of domain bias in click activity as well as in human judgments via a comprehensive collection of experiments. We begin by studying the difference between domains that a search engine surfaces and that users click. Surprisingly, we find that despite changes in the overall distribution of surfaced domains, there has not been a comparable shift in the distribution of clicked domains. Users seem to have learned the landscape of the internet and their click behavior has thus become more predictable over time. Next, we run a blind domain test, akin to a Pepsi/Coke taste test, to determine whether domains can shift a user's opinion of which page is more relevant. We find that domains can actually flip a user's preference about 25% of the time. Finally, we demonstrate the existence of systematic domain preferences, even after factoring out confounding issues such as position bias and relevance, two factors that have been used extensively in past work to explain user behavior. The existence of domain bias has numerous consequences including, for example, the importance of discounting click activity from reputable domains.

international acm sigir conference on research and development in information retrieval | 2014

Time-critical search

Nina Mishra; Ryen W. White; Samuel Ieong; Eric Horvitz

We study time-critical search, where users have urgent information needs in the context of an acute problem. As examples, users may need to know how to stem a severe bleed, help a baby who is choking on a foreign object, or respond to an epileptic seizure. While time-critical situations and actions have been studied in the realm of decision-support systems, little has been done with time-critical search and retrieval, and little direct support is offered by search systems. Critical challenges with time-critical search include accurately inferring when users have urgent needs and providing relevant information that can be understood and acted upon quickly. We leverage surveys and search log data from a large mobile search provider to (a) characterize the use of search engines for time-critical situations, and (b) develop predictive models to accurately predict urgent information needs, given a query and a diverse set of features spanning topical, temporal, behavioral, and geospatial attributes. The methods and findings highlight opportunities for extending search and retrieval to consider the urgency of queries.

conference on information and knowledge management | 2011

Efficient query rewrite for structured web queries

Sreenivas Gollapudi; Samuel Ieong; Alexandros Ntoulas; Stelios Paparizos

Web search engines incorporate results from structured data sources to answer semantically rich user queries; for example, the query "Samsung 50 inch led tv" can be answered from a table of television data. However, users are not domain experts and quite often enter values that do not precisely match the underlying data, so a literal execution will return zero results. A search engine would prefer to return at least a minimum number of results as close to the original query as possible while providing a time-bound execution guarantee. In this paper, we formalize these requirements, show that the problem is NP-hard, and present approximation algorithms that produce rewrites that work in practice. We empirically validate our algorithms on large-scale data from a major search engine.

knowledge discovery and data mining | 2012

Aggregating web offers to determine product prices

Rakesh Agrawal; Samuel Ieong

Historical prices are important information that can help consumers decide whether the time is right to buy a product. They both provide context to users and facilitate the use of prediction algorithms for forecasting future prices. To produce a representative price history, one needs to consider all offers for the product. However, matching offers to a product is a challenging problem, and mismatches could lead to glaring errors in price history. We propose a principled approach to filter out erroneous matches based on a probabilistic model of prices. We give an efficient algorithm for performing inference that takes advantage of the structure of the problem. We evaluate our results empirically using merchant offers collected from a search engine, and measure the proximity of the price history generated by our approach to the true price history. Our method outperforms alternatives based on robust statistics both in tracking the true price levels and the true price trends.

knowledge discovery and data mining | 2011

Ameliorating buyer's remorse

Rakesh Agrawal; Samuel Ieong; Raja P. Velu

Keeping pace with the increasing importance of commerce conducted over the Web, several e-commerce websites now provide admirable facilities for helping consumers decide what product to buy and where to buy it. However, since the prices of durable and high-tech products generally fall over time, a buyer of such products is often faced with a dilemma: Should she buy the product now or wait for cheaper prices? We present the design and implementation of Prodcast, an experimental system whose goal is to help consumers decide when to buy a product. The system makes use of forecasts of future prices based on price histories of the products, incorporating features such as sales volume, seasonality, and competition in making its recommendation. We describe techniques that are well-suited for this task and present a comprehensive evaluation of their relative merits using retail sales data for electronic products. Our back-testing of the system indicates that the system is capable of helping consumers time their purchase, resulting in significant savings to them.

conference on information and knowledge management | 2011

Timing when to buy

Rakesh Agrawal; Samuel Ieong; Raja P. Velu

Most e-commerce sites to date have focused on helping consumers decide what to buy and where to buy. We study the complementary question of helping consumers decide when to buy, focusing on consumer durables. We introduce a utility-based model for evaluating different approaches to this question. We focus on how best to make use of forecasts in making recommendations, and propose three natural strategies. We establish a relationship between these strategies, and show that one of them is optimal. We conduct a large-scale experimental study to test the performance and robustness of these strategies. Across a wide range of conditions, the best strategy obtains 90% of the maximum possible gains.

international acm sigir conference on research and development in information retrieval | 2011

Indexing strategies for graceful degradation of search quality

Shuai Ding; Sreenivas Gollapudi; Samuel Ieong; Krishnaram Kenthapadi; Alexandros Ntoulas

Large web search engines process billions of queries each day over tens of billions of documents, often with very stringent requirements for a user's search experience, in particular low latency and highly relevant search results. Index generation and serving are key to satisfying both these requirements. For example, the load on search engines can vary drastically when popular events happen around the world. When the load exceeds what the search engine can serve, queries get dropped. This results in an ungraceful degradation in search quality. Another example that could increase the query load and affect the user's search experience is ambiguous queries, which often result in the execution of multiple query alterations in the back end. In this paper, we look into the problem of designing robust indexing strategies, i.e., strategies that allow for a graceful degradation of search quality in both the above scenarios. We study the problems of index generation and serving using the notions of document allocation, server selection, and document replication. We explore the space of efficient algorithms for these problems and empirically corroborate with existing theory that it is hard to optimally solve the allocation and selection problems without any replication. We propose a greedy replication algorithm and study its performance under different choices of allocation and selection. Further, we show that under random selection and allocation, our algorithm is optimal.

conference on information and knowledge management | 2012

Structured query reformulations in commerce search

Sreenivas Gollapudi; Samuel Ieong; Anitha Kannan

Recent work in commerce search has shown that understanding the semantics in user queries enables more effective query analysis and retrieval of relevant products. However, due to lack of sufficient domain knowledge, user queries often include terms that cannot be mapped directly to any product attribute. For example, a user looking for "designer handbags" might start with such a query because she is not familiar with the manufacturers, the price ranges, and/or the material that gives a handbag designer appeal. Current commerce search engines treat terms such as "designer" as keywords and attempt to match them to contents such as product reviews and product descriptions, often resulting in poor user experience. In this study, we propose to address this problem by reformulating queries involving terms such as "designer", which we call modifiers, to queries that specify precise product attributes. We learn to rewrite the modifiers to attribute values by analyzing user behavior and leveraging structured data sources such as the product catalog that serves the queries. We first produce a probabilistic mapping between the modifiers and attribute values based on user behavioral data. These initial associations are then used to retrieve products from the catalog, over which we infer sets of attribute values that best describe the semantics of the modifiers. We evaluate the effectiveness of our approach based on a comprehensive Mechanical Turk study. We find that users agree with the attribute values selected by our approach in about 95% of the cases and they prefer the results surfaced for our reformulated queries to ones for the original queries 87% of the time.

web search and data mining | 2011

Optimizing merchant revenue with rebates

Rakesh Agrawal; Samuel Ieong; Raja P. Velu

We study an online advertising model in which the merchant reimburses a portion of the transacted amount to the customer in the form of a rebate. The customer referral and the rebate transfer might be mediated by a search engine. We investigate how merchants can set rebate rates across different products to maximize their revenue. We consider two widely used demand models in economics, linear and log-linear, and explain how the effects of rebates can be incorporated in these models. Treating the parameters estimated as inputs to a revenue maximization problem, we develop convex optimization formulations of the problem and combinatorial algorithms for solving them. We validate our modeling assumptions using real transaction data. We conduct an extensive simulation study to evaluate the performance of our approach on maximizing revenue, and find that it generates significantly higher revenues for merchants compared to other rebate strategies. The rebate rates selected are extremely close to the optimal rates selected in hindsight.
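
To make the rebate-setting question concrete, here is a small single-product sketch under a linear demand model. The abstract does not say how rebates enter the demand models, so everything here is an assumption for illustration: demand is taken as `a - b * x` where `x = price * (1 - rate)` is the effective price the customer pays, and the merchant's revenue per unit is that same effective price. The function names and parameters are hypothetical.

```python
def revenue(a, b, price, rate):
    """Merchant revenue under the assumed linear demand model.

    Assumption: demand depends on the effective price x = price * (1 - rate),
    with demand = a - b * x (floored at zero), and revenue = x * demand.
    """
    effective = price * (1.0 - rate)
    demand = max(0.0, a - b * effective)
    return effective * demand


def optimal_rebate_linear(a, b, price):
    """Rebate rate in [0, 1] maximizing revenue() for one product.

    x * (a - b * x) is concave in x and peaks at x = a / (2 * b), so the
    best rate moves the effective price to that point, clipped to [0, 1].
    """
    if a <= 0 or b <= 0 or price <= 0:
        raise ValueError("a, b, and price must be positive")
    target_price = a / (2.0 * b)
    rate = 1.0 - target_price / price
    return min(1.0, max(0.0, rate))


# Hypothetical parameters: demand intercept 100, slope 1, list price 80.
best = optimal_rebate_linear(100.0, 1.0, 80.0)
print(best, revenue(100.0, 1.0, 80.0, best), revenue(100.0, 1.0, 80.0, 0.0))
```

Because the objective is concave in the effective price, the single-product case has a closed form; setting rates jointly across many products is where an optimization formulation becomes necessary.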
