Imed Zitouni | Researchain

Publication

Featured researches published by Imed Zitouni.

web search and data mining | 2014

Modeling dwell time to predict click-level satisfaction

Youngho Kim; Ahmed Hassan; Ryen W. White; Imed Zitouni

Clicks on search results are the most widely used behavioral signals for predicting search satisfaction. Even though clicks are correlated with satisfaction, they can also be noisy. Previous work has shown that clicks are affected by position bias, caption bias, and other factors. A popular heuristic for reducing this noise is to only consider clicks with long dwell time, usually equaling or exceeding 30 seconds. The rationale is that the more time a searcher spends on a page, the more likely they are to be satisfied with its contents. However, having a single threshold value assumes that users need a fixed amount of time to be satisfied with any result click, irrespective of the page chosen. In reality, clicked pages can differ significantly. Pages have different topics, readability levels, content lengths, etc. All of these factors may affect the amount of time spent by the user on the page. In this paper, we study the effect of different page characteristics on the time needed to achieve search satisfaction. We show that the topic of the page, its length and its readability level are critical in determining the amount of dwell time needed to predict whether any click is associated with satisfaction. We propose a method to model and provide a better understanding of click dwell time. We estimate click dwell time distributions for SAT (satisfied) or DSAT (dissatisfied) clicks for different click segments and use them to derive features to train a click-level satisfaction model. We compare the proposed model to baseline methods that use dwell time and other search performance predictors as features, and demonstrate that the proposed model achieves significant improvements.
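As a rough illustration of the contrast this abstract draws, the sketch below compares the fixed 30-second dwell-time heuristic with a rule whose threshold depends on the kind of page clicked. The segment names, the per-segment threshold values, and the function names are hypothetical and do not come from the paper, which instead estimates dwell-time distributions for SAT and DSAT clicks per segment and derives model features from them.

```python
from dataclasses import dataclass

FIXED_THRESHOLD_SECONDS = 30.0  # the common heuristic cited in the abstract

# Hypothetical per-segment thresholds, chosen only for illustration.
SEGMENT_THRESHOLDS = {
    "short_reference_page": 10.0,
    "long_technical_article": 60.0,
}


@dataclass
class Click:
    dwell_seconds: float
    segment: str  # assumed label for the page's topic/length/readability group


def is_sat_fixed(click: Click) -> bool:
    """Label a click as satisfied using one threshold for every page."""
    return click.dwell_seconds >= FIXED_THRESHOLD_SECONDS


def is_sat_by_segment(click: Click) -> bool:
    """Label a click using a segment-specific threshold, falling back to 30 s."""
    threshold = SEGMENT_THRESHOLDS.get(click.segment, FIXED_THRESHOLD_SECONDS)
    return click.dwell_seconds >= threshold


quick_lookup = Click(dwell_seconds=15.0, segment="short_reference_page")
long_read = Click(dwell_seconds=40.0, segment="long_technical_article")
```

Under these assumed thresholds the two rules disagree on both example clicks, which is the kind of mismatch a single global cutoff cannot express.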

international world wide web conferences | 2015

Automatic Online Evaluation of Intelligent Assistants

Jiepu Jiang; Ahmed Hassan Awadallah; Rosie Jones; Umut Ozertem; Imed Zitouni; Ranjitha Gurunath Kulkarni; Omar Zia Khan

Voice-activated intelligent assistants, such as Siri, Google Now, and Cortana, are prevalent on mobile devices. However, it is challenging to evaluate them due to the varied and evolving number of tasks supported, e.g., voice command, web search, and chat. Since each task may have its own procedure and a unique form of correct answers, it is expensive to evaluate each task individually. This paper is the first attempt to solve this challenge. We develop consistent and automatic approaches that can evaluate different tasks in voice-activated intelligent assistants. We use implicit feedback from users to predict whether users are satisfied with the intelligent assistant as well as its components, i.e., speech recognition and intent classification. Using this approach, we can potentially evaluate and compare different tasks within and across intelligent assistants according to the predicted user satisfaction rates. Our approach is characterized by an automatic scheme of categorizing user-system interaction into task-independent dialog actions, e.g., the user is commanding, selecting, or confirming an action. We use the action sequence in a session to predict user satisfaction and the quality of speech recognition and intent classification. We also incorporate other features to further improve our approach, including features derived from previous work on web search satisfaction prediction, and those utilizing acoustic characteristics of voice requests. We evaluate our approach using data collected from a user study. Results show our approach can accurately identify satisfactory and unsatisfactory sessions.

international world wide web conferences | 2016

Detecting Good Abandonment in Mobile Search

Kyle Williams; Julia Kiseleva; Aidan C. Crook; Imed Zitouni; Ahmed Hassan Awadallah; Madian Khabsa

Web search queries for which there are no clicks are referred to as abandoned queries and are usually considered as leading to user dissatisfaction. However, there are many cases where a user may not click on any search result page (SERP) but still be satisfied. This scenario is referred to as good abandonment and presents a challenge for most approaches measuring search satisfaction, which are usually based on clicks and dwell time. The problem is exacerbated further on mobile devices where search providers try to increase the likelihood of users being satisfied directly by the SERP. This paper proposes a solution to this problem using gesture interactions, such as reading times and touch actions, as signals for differentiating between good and bad abandonment. These signals go beyond clicks and characterize user behavior in cases where clicks are not needed to achieve satisfaction. We study different good abandonment scenarios and investigate the different elements on a SERP that may lead to good abandonment. We also present an analysis of the correlation between user gesture features and satisfaction. Finally, we use this analysis to build models to automatically identify good abandonment in mobile search achieving an accuracy of 75%, which is significantly better than considering query and session signals alone. Our findings have implications for the study and application of user satisfaction in search systems.

web search and data mining | 2015

Toward Predicting the Outcome of an A/B Experiment for Search Relevance

Lihong Li; Jin Young Kim; Imed Zitouni

A standard approach to estimating online click-based metrics of a ranking function is to run it in a controlled experiment on live users. While reliable and popular in practice, configuring and running an online experiment is cumbersome and time-intensive. In this work, inspired by recent successes of offline evaluation techniques for recommender systems, we study an alternative that uses historical search logs to reliably predict online click-based metrics of a new ranking function, without actually running it on live users. To tackle novel challenges encountered in Web search, variations of the basic techniques are proposed. The first is to take advantage of diversified behavior of a search engine over a long period of time to simulate randomized data collection, so that our approach can be used at very low cost. The second is to replace exact matching (of recommended items in previous work) by fuzzy matching (of search result pages) to increase data efficiency, via a better trade-off of bias and variance. Extensive experimental results based on large-scale real search data from a major commercial search engine in the US market demonstrate our approach is promising and has potential for wide use in Web search.
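To make the idea of predicting a metric from logged data more concrete, here is a minimal sketch of a generic inverse-propensity-scored estimator with exact matching. It is a textbook-style illustration, not the method of the paper: the record fields, the `ips_estimate` function, and the toy data are all assumptions, and the fuzzy matching of result pages described in the abstract is not implemented.

```python
from typing import Callable, NamedTuple, Sequence


class LogRecord(NamedTuple):
    query: str
    shown_page: str    # identifier of the result page that was actually shown
    propensity: float  # assumed probability that the logging system showed it
    reward: float      # e.g. 1.0 if the page received a click, else 0.0


def ips_estimate(logs: Sequence[LogRecord],
                 new_ranker: Callable[[str], str]) -> float:
    """Estimate the average reward of new_ranker from logged interactions.

    A record contributes only when the new ranker would have shown exactly
    the logged page (exact matching); its reward is re-weighted by the
    inverse of the logging propensity.
    """
    if not logs:
        raise ValueError("logs must not be empty")
    total = 0.0
    for rec in logs:
        if new_ranker(rec.query) == rec.shown_page:
            total += rec.reward / rec.propensity
    return total / len(logs)


# Toy log: two queries, each with two pages shown at known probabilities.
toy_logs = [
    LogRecord("a", "p1", 0.5, 1.0),
    LogRecord("a", "p2", 0.5, 0.0),
    LogRecord("b", "p3", 0.25, 1.0),
    LogRecord("b", "p4", 0.75, 0.0),
]
toy_ranker = {"a": "p1", "b": "p4"}.get
estimate = ips_estimate(toy_logs, toy_ranker)
```

Exact matching discards every record where the logged page differs from the new ranker's choice, which is why a looser matching rule can trade some bias for lower variance.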

international acm sigir conference on research and development in information retrieval | 2016

Predicting User Satisfaction with Intelligent Assistants

Julia Kiseleva; Kyle Williams; Ahmed Hassan Awadallah; Aidan C. Crook; Imed Zitouni; Tasos Anastasakos

There is a rapid growth in the use of voice-controlled intelligent personal assistants on mobile devices, such as Microsoft's Cortana, Google Now, and Apple's Siri. They significantly change the way users interact with search systems, not only because of the voice control use and touch gestures, but also due to the dialogue-style nature of the interactions and their ability to preserve context across different queries. Predicting success and failure of such search dialogues is a new problem, and an important one for evaluating and further improving intelligent assistants. While clicks in web search have been extensively used to infer user satisfaction, their significance in search dialogues is lower due to the partial replacement of clicks with voice control, direct and voice answers, and touch gestures. In this paper, we propose an automatic method to predict user satisfaction with intelligent assistants that exploits all the interaction signals, including voice commands and physical touch gestures on the device. First, we conduct an extensive user study to measure user satisfaction with intelligent assistants, and simultaneously record all user interactions. Second, we show that the dialogue style of interaction makes it necessary to evaluate the user experience at the overall task level as opposed to the query level. Third, we train a model to predict user satisfaction, and find that interaction signals that capture the user reading patterns have a high impact: when including all available interaction signals, we are able to improve the prediction accuracy of user satisfaction from 71% to 81% over a baseline that utilizes only click and query features.

international acm sigir conference on research and development in information retrieval | 2016

Is This Your Final Answer?: Evaluating the Effect of Answers on Good Abandonment in Mobile Search

Kyle Williams; Julia Kiseleva; Aidan C. Crook; Imed Zitouni; Ahmed Hassan Awadallah; Madian Khabsa

Answers on mobile search result pages have become a common way to attempt to satisfy users without them needing to click on search results. Many different types of answers exist, such as weather, flight and currency answers. Understanding the effect that these different answer types have on mobile user behavior and how they contribute to satisfaction is important for search engine evaluation. We study these two aspects by analyzing the logs of a commercial search engine and through a user study. Our results show that user click, abandonment and engagement behavior differs depending on the answer types present on a page. Furthermore, we find that satisfaction rates differ in the presence of different answer types with simple answer types, such as time zone answers, leading to more satisfaction than more complex answers, such as news answers. Our findings have implications for the study and application of user satisfaction for search systems.

IEEE Transactions on Audio, Speech, and Language Processing | 2014

Aligned-Parallel-Corpora Based Semi-Supervised Learning for Arabic Mention Detection

Imed Zitouni; Yassine Benajiba

In the last two decades, significant effort has been put into annotating linguistic resources in several languages. Despite this valiant effort, there are still many languages left that have only small amounts of such resources. The goal of this article is to present and investigate a method of propagating information (specifically mentions) from a resource-rich language such as English into a relatively less-resourced language such as Arabic. We also compare this approach to its equivalent counterpart using monolingual resources. Part of the investigation is to quantify the contribution of propagating information in different conditions, based on the availability of resources in the target language. Experiments on the language pair Arabic-English show that one can achieve relatively decent performance by propagating information from a language with richer resources such as English into Arabic alone (no resources or models in the source language Arabic). Furthermore, results show that propagated features from English do help improve the Arabic system performance even when used in conjunction with all feature types built from the source language. Experiments also show that using propagated features in conjunction with lexically-derived features only (as can be obtained directly from a mention annotated corpus) brings the system performance to the level obtained in the target language by using features derived from many linguistic resources, therefore improving the system when such resources are not available.

conference on information and knowledge management | 2016

Learning to Account for Good Abandonment in Search Success Metrics

Madian Khabsa; Aidan C. Crook; Ahmed Hassan Awadallah; Imed Zitouni; Tasos Anastasakos; Kyle Williams

Abandonment in web search has been widely used as a proxy to measure user satisfaction. Initially it was considered a signal of dissatisfaction; however, with search engines moving towards providing answer-like results, a new category of abandonment was introduced and referred to as Good Abandonment. Predicting good abandonment is a hard problem and it was the subject of several previous studies. All those studies have focused, though, on predicting good abandonment in offline settings using manually labeled data. Thus, it remained a challenge how to have an online metric that accounts for good abandonment. In this work we describe how a search success metric can be augmented to account for good abandonment sessions using a machine learned metric that depends on users' viewport information. We use real user traffic from millions of users to evaluate the proposed metric in an A/B experiment. We show that taking good abandonment into consideration has a significant effect on the overall performance of the online metric.

conference on information and knowledge management | 2014

Machine-Assisted Search Preference Evaluation

Ahmed Hassan Awadallah; Imed Zitouni

Information Retrieval systems are traditionally evaluated using the relevance of web pages to individual queries. Other work on IR evaluation has focused on exploring the use of preference judgments over two search result lists. Unlike traditional query-document evaluation, collecting preference judgments over two search result lists takes the context of documents, and hence the interaction between search results, into consideration. Moreover, preference judgments have been shown to produce more accurate results compared to absolute judgments. On the other hand, result list preference judgments have very high annotation cost. In this work, we investigate how machine learned models can assist human judges in order to collect reliable result list preference judgments at large scale with lower judgment cost. We build novel models that can predict user preference automatically. We investigate the effect of different features on the prediction quality. We focus on predicting preferences with high confidence and show that these models can be effectively used to assist human judges, resulting in a significant reduction in annotation cost.

conference on information and knowledge management | 2018

Measuring User Satisfaction on Smart Speaker Intelligent Assistants Using Intent Sensitive Query Embeddings

Seyyed Hadi Hashemi; Kyle Williams; Ahmed El Kholy; Imed Zitouni; Paul A. Crook

Intelligent assistants are increasingly being used on smart speaker devices, such as Amazon Echo, Google Home, Apple HomePod, and Harman Kardon Invoke with Cortana. Typically, user satisfaction measurement relies on user interaction signals, such as clicks and scroll movements, in order to determine if a user was satisfied. However, these signals do not exist for smart speakers, which creates a challenge for user satisfaction evaluation on these devices. In this paper, we propose a new signal, user intent, as a means to measure user satisfaction. We propose to use this signal to model user satisfaction in two ways: 1) by developing intent sensitive word embeddings and then using sequences of these intent sensitive query representations to measure user satisfaction; 2) by representing a user's interactions with a smart speaker as a sequence of user intents and thus using this sequence to identify user satisfaction. Our experimental results indicate that our proposed user satisfaction models based on the intent-sensitive query representations have statistically significant improvements over several baselines in terms of common classification evaluation metrics. In particular, our proposed task satisfaction prediction model based on intent-sensitive word embeddings has an 11.81% improvement over a generative model baseline and a 6.63% improvement over a user satisfaction prediction model based on Skip-gram word embeddings in terms of the F1 metric. Our findings have implications for the evaluation of Intelligent Assistant systems.
