An Empirical Study of Clarifying Question-Based Systems
Jie Zou
University of Amsterdam, Amsterdam, The Netherlands ([email protected])
Evangelos Kanoulas
University of Amsterdam, Amsterdam, The Netherlands ([email protected])
Yiqun Liu
BNRist, DCST, Tsinghua University, Beijing, China ([email protected])
ABSTRACT
Search and recommender systems that take the initiative to ask clarifying questions to better understand users' information needs are receiving increasing attention from the research community. However, to the best of our knowledge, there is no empirical study to quantify whether and to what extent users are willing or able to answer these questions. In this work, we conduct an online experiment by deploying an experimental system, which interacts with users by asking clarifying questions against a product repository. We collect both implicit interaction behavior data and explicit feedback from users, showing that: (a) users are willing to answer a good number of clarifying questions (11-21 on average), but not many more than that; (b) most users answer questions until they reach the target product, but a fraction of them stop due to fatigue or due to receiving irrelevant questions; (c) part of the users' answers (12-17%) are actually opposite to the description of the target product; while (d) most of the users (66-84%) find the question-based system helpful towards completing their tasks. Some of the findings of the study contradict current assumptions underlying simulation-based evaluations in the field, while they point towards improvements of the evaluation framework and can inspire future interactive search/recommender system designs.
KEYWORDS
Empirical Study; Question-based Systems; Asking Clarifying Questions; Conversational Search; Conversational Recommendation
ACM Reference Format:
Jie Zou, Evangelos Kanoulas, and Yiqun Liu. 2020. An Empirical Study of Clarifying Question-Based Systems. In Proceedings of ACM Conference (Conference'17). ACM, New York, NY, USA, 6 pages. https://doi.org/10.475/123_4
INTRODUCTION

One of the key components of conversational search and recommender systems [3, 9, 11] is the construction and selection of good clarifying questions to gather item information from users in a searchable repository. Most current studies either collect and learn from human-to-human conversations [2, 5, 8], or create a pool of questions on the basis of some "anchor" text (e.g. item aspects [1, 9],
entities [10-13], grounding text [6, 7]) that characterizes the searchable items themselves. Although the aforementioned works have demonstrated success in helping systems better understand users, most of them evaluate algorithms by means of simulations which assume that users are willing to provide answers to as many questions as the system generates, and that users can always answer the questions correctly, i.e. they always know what the target item should look like in its finest details. On the basis of such assumptions, their evaluations (e.g. Bi et al. [1], Zhang et al. [9], Zou and Kanoulas [11]) focus on whether the system can place the target item at a high ranking position. To the best of our knowledge, there is no empirical study to quantify whether and to what extent users can respond to these questions, and what usefulness users perceive while interacting with the system.

In this paper we conduct a user study by deploying an online question-based system to answer the following research questions:

(1) To what extent are users willing to engage with a question-based system?
(2) To what extent can users provide correct answers to the generated questions?
(3) How useful do users find interacting with a question-based system?

The study is repeated under two conditions: (a) the question-based system uses an oracle to always obtain the right answer to the questions asked, and (b) the system uses the user's answers, even if they are imperfect, in ranking items and choosing the next question to ask. We believe that answering these research questions can help the community design better evaluation frameworks and more robust question-based systems.
STUDY DESIGN

In our study, the users interact with a question-based system in the domain of online retail. The user answers questions prompted by the system with a "Yes", a "No", or a "Not Sure", in order to find a target product to buy. The architecture of our system is shown in Figure 1, with the user going through four steps.
Step 1: Category selection.
In this step, the users select an Amazon category that they feel most familiar with and that fits their interests, e.g. a category from which they have purchased products before.

Step 2: Target product assignment.
We randomly assign a target product to the user from the selected category. The user is requested to read the title and the description of the product carefully. A picture of the product is also provided. Asking the users to carefully read the description simulates a use case in which the user really knows what she is looking for, as opposed to an exploratory use case. (Categories and dataset we used: http://jmcauley.ucsd.edu/data/amazon/links.html)
[Figure 1 UI pages, summarized. Step 1, category selection: "Imagine that you want to buy a product. Please select the product category you are most familiar with (e.g., most frequently purchased category)." Step 2, target product assignment: the product picture, title (e.g. "Hog Wild Fish Sticks (Sold Individually)"), and description are shown, with a "Change target product" button and a "start conversational search" button. Step 3, finding the target product: the system asks questions such as "Is 'rosewood' relevant to the product you are looking for?" with Yes / No / Not Sure options, Next and Stop buttons, and a ranked list of search results (top 1 to top 4); users are instructed to choose "yes" when the selected term is present in the title and description, "no" when it is absent, and "not sure" otherwise. Step 4, the exit questionnaire: Q1, whether the question-based system was helpful towards locating the target product; Q2, whether the user would use such a system in the future; Q3, overall experience (1, very negative, to 5); Q4, how many questions the user is willing to answer; Q5/Q6, why the user stopped answering; Q7, whether the generated questions are easy to answer.]
Figure 1: System architecture and main UI pages
If the user is not familiar with the target product, or the description is not clear to her, the user can request a new randomly selected product.
Step 3: Find the target product.
After the user indicates that the conversation with the system can start, the target product disappears from the screen and the system selects a question to ask the user. The user needs to provide an answer on the basis of the target product information she read in the previous step. To help the user better understand her task of answering questions, an example target product along with an example conversation is also shown to the user. Once the user answers the question, a 4-by-4 grid of the pictures of the top sixteen ranked products is shown to the user, along with the next clarifying question. The user can stop answering questions at any time during her interaction with the system.

To select which clarifying question to ask, a state-of-the-art algorithm [11] is deployed to first extract important entities from each product description (e.g. product aspects) and construct questions of the form "Is [entity] relevant to the product you are looking for?". Then, it selects the information-theoretically optimal question, that is, the question that splits the probability mass of predicted user preferences over items as close to two halves as possible, and updates this predicted preference on the basis of the user's answer [11]. In this work, we compare the results under two conditions: (a) the system updates the predicted preference using the correct answer, i.e. the answer which agrees with the description of the product, independent of the user's answer, and (b) the system updates its belief using the user's noisy answers. Under the first condition, we study user behavior under a system that is perfect from an information-theoretic point of view, leading to a best-case analysis and conclusions, while under the second condition we study user behavior when the system gets confused and becomes suboptimal due to the user's mistakes.
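To make the selection criterion concrete, the following Python sketch illustrates such an information-theoretic question loop. It is an illustration of the idea rather than the implementation of [11]: the entity-occurrence matrix, the fixed answer-noise probability `noise`, and the function names are our own simplifying assumptions.

```python
import numpy as np

def select_question(belief, has_entity, asked):
    """Pick the entity question whose expected "yes" probability mass
    best halves the current belief over products.

    belief:     (n_items,) probability distribution over products
    has_entity: (n_questions, n_items) boolean matrix; entry [q, i] is
                True if entity q occurs in product i's title/description
    asked:      set of indices of questions already asked
    """
    yes_mass = has_entity.astype(float) @ belief  # P(answer = "yes") per question
    split_score = np.abs(yes_mass - 0.5)          # 0 = perfect halving
    split_score[list(asked)] = np.inf             # never repeat a question
    return int(np.argmin(split_score))

def update_belief(belief, entity_row, answer, noise=0.1):
    """Bayesian update of the belief after one user answer.

    answer is "yes", "no", or "not sure"; noise is an assumed probability
    that the user answers a question incorrectly.
    """
    if answer == "not sure":                      # treated as uninformative
        return belief
    agrees = entity_row if answer == "yes" else ~entity_row
    likelihood = np.where(agrees, 1.0 - noise, noise)
    posterior = belief * likelihood
    return posterior / posterior.sum()
```

In this sketch, condition (a) corresponds to deriving `answer` from the target product's own description (with `noise=0`), while condition (b) passes the user's possibly erroneous answer in unchanged.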
Step 4: Questionnaire.
In this step users are asked a number of questions about their experience with the system, for further analysis.
RESULTS AND ANALYSIS

Our research questions revolve around user engagement and the perceived value of the system:

RQ1: Are users willing to answer the clarifying questions, how many of them, when do they stop and why, and how fast do they provide the answers?

RQ2: To what extent can users provide correct answers given a target product, and what factors affect this?

RQ3: How useful do users find the clarifying questions, what is their overall experience, and how likely are they to use such a system in the future?
Prior to the actual study, we ran a pilot study with a small number of users, in a controlled environment, and iterated over the experimental design and the user interface until no issues or concerns were reported. Then we considered two conditions. Under the first one, the system used an oracle to obtain the correct answers to the questions it asked. For the actual study, 53 participants located in the USA were recruited through Amazon Mechanical Turk and 1025 conversations were collected. The participants were of varying gender, age, career field, English skills, and online shopping experience. In particular, gender: 34 male, 19 female; age: 2 in 18-23, 8 in 23-27, 14 in 27-35, 29 older than 35 years old; career field: 22 in science, computers & technology, 8 in management, business & finance, 7 in hospitality, tourism & the service industry, 3 in education and social services, 2 in arts and communications, 2 in trades and transportation, 9 did not specify their career field; English skills: all of them were native speakers; online shopping experience: 44 were mostly shopping online, 9 did online shopping once or twice per year. Under the second condition, the system actually used the users' answers; 48 users participated in this one, also with varying demographic and skill characteristics, and 1833 conversations were collected. Participants were paid 2.5 dollars to complete the study. For quality control, we only engaged Master Workers (high-performing workers identified by Mechanical Turk who have demonstrated excellence across a wide range of tasks), filtered out users who spent less than 3 seconds on reading the product title and description, and filtered out users who gave random answers (~50% correct/wrong).
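As an illustration of how such quality-control filtering might be applied to the collected logs, consider the sketch below; the record layout, the helper name `keep_worker`, and the 0.10 band around chance accuracy are hypothetical choices of ours, not the paper's published pipeline.

```python
def keep_worker(sessions, min_read_s=3.0, chance_band=0.10):
    """Decide whether a worker's data passes quality control.

    sessions: list of dicts, one per conversation, e.g.
      {"read_time_s": 12.4,
       "answers": [{"label": "yes", "correct": True}, ...]}
    """
    # Drop workers who skimmed the product title and description.
    if any(s["read_time_s"] < min_read_s for s in sessions):
        return False
    # Pool all yes/no answers (ignore "not sure", which is neither
    # correct nor incorrect) and compare accuracy against chance.
    decided = [a for s in sessions for a in s["answers"]
               if a["label"] != "not sure"]
    if not decided:
        return False
    accuracy = sum(a["correct"] for a in decided) / len(decided)
    return abs(accuracy - 0.5) > chance_band
```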
To answer RQ1, we attempt to answer the following sub-questions: (1) Are users willing to answer the system's questions? (2) How many questions are the users willing to answer? (3) When do they stop and why? (4) How fast are they able to answer?

We first investigate whether users are willing to answer the system's questions and how many of them, both by observing the actual number of questions users answered when interacting with the question-based system and what they declared in the exit questionnaire. The findings under the oracle condition are summarized in Figure 2, with the left subfigure depicting the actual number of questions answered by the users when interacting with our question-based system, and the right subfigure depicting the number of questions the users declared they are willing to answer in the exit questionnaire. In the left subfigure of Figure 2, the red line represents the accumulated percentage of users willing to answer a certain number of questions; the light blue histogram reflects the percentage per number of questions.

Figure 2: The number of questions the users actually answered in the system (left) and declared in the exit questionnaire (right). In the system, the average number of answered questions per product is 11.4, and 70.3% of users answered 4-12 questions per product. In the questionnaire, 50% of users are willing to answer 6-10 questions.

The results in Figure 2 show that users answer a minimum of 2 and a maximum of 48 questions. The average number of questions answered per target product is 11.4, the median number is 7, and 70.3% of users answered 4-12 questions per product, while in the exit questionnaire about 50% of the users declare that they are willing to answer 6-10 questions.

We also compare the afore-described statistics with those under the condition that the system updates its beliefs, and hence chooses the next question and ranks items, according to the actual user answers, however imperfect they might be. In this latter case we observe that the average number of questions answered per target product is 21, and the median number is 14, almost double that of the oracle condition. We do not provide these plots due to space limitations. We hypothesize (and later confirm through the exit questionnaire) that this is because our users really try to locate the target product and go as far as it takes to make that happen or get frustrated; however, their noisy answers confuse the algorithm, and it takes longer to bring the target product to the top of the recommendation list.

Further, we explore why users stop answering questions. Users could select one out of six answers in the exit questionnaire: "The target product was found", "A similar product was found", "I got tired of answering questions", "I could not answer the questions", "The questions asked were irrelevant", and "Other". The results under the oracle condition in Figure 3 show that while a small percentage of users stop due to fatigue (14%) or due to irrelevant questions being asked (7%), the large majority of users (77%) stop because they located the target product. Under the second condition of imperfect answers most users also stop answering questions because they found the target products (38%), but other reasons are more prominent, such as fatigue (34%) or receiving irrelevant questions (22%).

Figure 3: The reasons for stopping answering questions. Most users stop answering questions after they find their target products.

We then analyze how quick the users are in answering questions. Figure 4a shows a box plot of the time spent per answer, while Figure 4b better demonstrates the distribution.

Figure 4: Time spent per question by a user in order to provide an answer: (a) box plot, (b) histogram of time spent. The average time for answering one question is 7.1 seconds.
From the results, we observe that the minimum time for answering one question is 1.75 seconds, the average time is 7.1 seconds, and the median time is 4.98 seconds; 86.5% of the users spent between 1.75s and 11.59s. Despite a median time of about 5 seconds per question, in the exit questionnaire 98% of the users indicate that the system's questions are easy to answer. When using the users' imperfect answers, the average time for answering a question is 6.22 seconds and the median time is 4.1 seconds, similar to the oracle condition.

For RQ2, we first explore to what extent users can provide correct answers. As one can observe in Table 1, users provide correct answers 73.1% of the time, are not sure 9.6% of the time, and are wrong 17.3% of the time. Under the imperfect-user-answer setup, the afore-described percentages are 78.3%, 9.5%, and 12.2%, respectively.

Table 1: The % of correct, "not sure", and incorrect answers.

                  Correct   Not sure   Incorrect
  Oracle           73.1%     9.6%       17.3%
  Imperfect user   78.3%     9.5%       12.2%

We then explore which factors affect the percentage of incorrect answers. We do that under both setups, but we only report the oracle setup, given that the numbers are very close across the two.

In particular, we first investigate whether the percentage of incorrect answers differs across users. The results in Figure 5a show that the percentages of correct, "not sure", and incorrect answers vary across users. This might be due to the varied knowledge of different users. A couple of users provide a high percentage of incorrect answers; this might be due to the crowdsourcing nature of the experiment.

Further, we explore whether the percentage of correct answers differs across target products. The results are shown in Figure 5b, from which we conclude that the percentages of incorrect answers also vary across target products, but not as much as they vary across users.

The percentages of correct, "not sure", and incorrect answers for the different questions asked by the system are shown in Figure 5d. Here we observe some dramatic differences across questions, with a small subset of questions receiving almost always incorrect answers. This might be because some questions are more ambiguous than others. This finding suggests improvements of question-based systems in multiple directions: for instance, one can try to improve the question pool by considering different question characteristics, or one could develop question selection strategies that also account for the chance of the user providing a wrong answer.

Further, we explore whether the percentage of incorrect answers is correlated with the question index, or whether it remains stable throughout the conversation. The results are shown in Figure 5c, where the lines show the average percentages of correct, "not sure", and incorrect answers as a function of the question index within the conversation, and the histogram shows the average incorrect-answer percentage over a sliding window. From Figure 5c, it can be observed that the percentages of correct, "not sure", and incorrect answers fluctuate, but in principle remain at similar levels throughout the conversation.

Last, we explore whether the percentage of incorrect answers is correlated with the time spent giving the answers. The results within different time intervals are shown in Figure 5e. We divide the time spent per question (1.75s - 50.96s) into 5 equal non-overlapping buckets (or frames): Frame 1 through Frame 5 are 1.75-11.59s, 11.59-21.43s, 21.43-31.28s, 31.28-41.12s, and 41.12-50.96s, respectively. From Figure 5e, we see that the percentage of incorrect answers decreases with more time spent. Also, we calculate the time spent when users give a correct answer, a "not sure" answer, and an incorrect answer: the averages are 6.59s, 10.81s, and 7.12s respectively, and the medians 4.65s, 8.20s, and 5.06s respectively. This suggests that users usually spend more time when they are not sure about the answer, but roughly the same time whether they are right or wrong about a question.
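The bucketing itself is straightforward; a minimal sketch, assuming hypothetical `times` (seconds per answer) and `incorrect` (boolean per answer) arrays extracted from the interaction logs:

```python
import numpy as np

def incorrect_rate_per_frame(times, incorrect, n_frames=5):
    """Split [min(times), max(times)] into equal-width frames and
    return the fraction of incorrect answers falling in each frame."""
    edges = np.linspace(times.min(), times.max(), n_frames + 1)
    # np.digitize maps each time to a bin index in 1..n_frames
    # (the maximum value lands in n_frames + 1, hence the clip).
    frame = np.clip(np.digitize(times, edges), 1, n_frames) - 1
    return [float(incorrect[frame == f].mean()) if np.any(frame == f) else 0.0
            for f in range(n_frames)]
```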
Regarding RQ3, we explore how useful users find interacting with such a question-based system. We ask the users (a) whether they think the question-based system is helpful, (b) whether they would use such a system in the future, and (c) what their rating of the system is, ranging from 1 (very negative) to 5 (very positive). The results using oracle answers are shown in Figure 6, in the three plots respectively. From the results we collected, most users think the question-based system is helpful and would use it in the future. Specifically, 83.9% of users are positive about the helpfulness, 5.4% are neutral, and 10.7% are negative. Further, 60.7% of users are positive about using such a system in the future, 30.4% are neutral, and 8.9% are negative. Regarding user ratings, the results show 46.5% of 5-star ratings, 37.5% of 4-star ratings, 7.1% of 3-star ratings, 7.1% of 2-star ratings, and 1.8% of 1-star ratings; 84% of the users gave a rating of at least 4.

In the case of imperfect answers, 66% of users are positive, 6% are neutral, and 28% are negative about being helped by the system. Regarding using such a system in the future, 40% of users are positive, 20% are neutral, and 40% are negative. Regarding ratings, 76% of users gave a rating greater than or equal to 3; specifically, there are 22% of 5's, 32% of 4's, 22% of 3's, 16% of 2's, and 8% of 1's. Still, most users are positive towards the conversational recommender. But we also observe that, when updating with the user's answers, users are less positive than under oracle-answer updating. It is therefore clear that the quality of the user answers affects the quality of the system's questions and the overall user experience with the system.
Figure 5: The percentage of correct answers, "not sure" answers (which cannot be classified as correct or incorrect), and incorrect answers. (a) The % varies per user; (b) the % varies across different target products; (c) the % remains stable throughout the conversation; (d) the % varies per question, with only a few questions receiving most of the incorrect answers; (e) the % of incorrect answers decreases with more time spent.

Figure 6: User-perceived helpfulness. (a) Is the question-based system helpful; (b) will you use the question-based system in the future; (c) ratings. Most users are positive towards question-based systems.

CONCLUSIONS AND FUTURE WORK

In this paper, we conduct an empirical study using a question-based product search system, to better understand users and gain insight into user behavior and interaction with such systems. We deploy a state-of-the-art question-based system online and collect interaction log data and questionnaire data for analysis. We find that users are willing to answer a certain number of system-generated questions and stop answering questions when they find the target product, but only if the questions are relevant and well selected. While users are able to answer these questions effectively, we also observe that users provide incorrect answers at a rate of about 17%; this rate is affected mostly by some, still to be identified, question characteristics, while it also varies across users and products. Last, most users are positive towards question-based systems and think that these systems help them towards achieving their goals, although this feeling is weaker with systems that are not robust to imperfect answers. The take-home message, if there is one, is that current research should drop the assumption that users are happy to answer as many questions as the system generates and that all questions are answered correctly.

One limitation of this work is the isolated clarifying-question environment of the study. A more realistic experiment would require clarifying questions to be embedded in an existing environment, where the user is enabled not only to answer questions, but also to reformulate her query, filter results by selecting pre-defined item attributes, and browse the results to the preferred depth. Also, a mixed-initiative approach under which a system switches between asking questions and understanding user searches, combining the two, is worth studying. Such a study is in our future plans. A further limitation of this work is that this was not an in-situ experiment but a simulation of a use case of a question-based system involving crowd workers. Hence, the findings are only as good as our simulation of a user looking for a target product. Furthermore, we cannot know whether by running the study for a long period of time the results would have been the same, or whether we are observing some novelty effects [4]. Other factors, such as question quality, question format (e.g. yes/no or open questions), and noisy answers, may affect the results, and studying these factors in an A/B testing experiment would therefore be beneficial. We also leave that for future work.
REFERENCES

[1] Keping Bi, Qingyao Ai, Yongfeng Zhang, and W. Bruce Croft. 2019. Conversational Product Search Based on Negative Feedback. In Proceedings of the 28th ACM International Conference on Information and Knowledge Management (CIKM '19). Association for Computing Machinery, New York, NY, USA, 359-368.
[2] Qibin Chen, Junyang Lin, Yichang Zhang, Ming Ding, Yukuo Cen, Hongxia Yang, and Jie Tang. 2019. Towards Knowledge-Based Recommender Dialog System. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP). Association for Computational Linguistics, Hong Kong, China, 1803-1813.
[3] Konstantina Christakopoulou, Filip Radlinski, and Katja Hofmann. 2016. Towards Conversational Recommender Systems. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD '16). Association for Computing Machinery, New York, NY, USA, 815-824.
[4] Ron Kohavi and Roger Longbotham. 2017. Online Controlled Experiments and A/B Testing. Encyclopedia of Machine Learning and Data Mining 7, 8 (2017), 922-929.
[5] Raymond Li, Samira Ebrahimi Kahou, Hannes Schulz, Vincent Michalski, Laurent Charlin, and Chris Pal. 2018. Towards Deep Conversational Recommendations. In Advances in Neural Information Processing Systems 31. Curran Associates, Inc., 9725-9735.
[6] Mao Nakanishi, Tetsunori Kobayashi, and Yoshihiko Hayashi. 2019. Towards Answer-unaware Conversational Question Generation. In Proceedings of the 2nd Workshop on Machine Reading for Question Answering. Association for Computational Linguistics, Hong Kong, China, 63-71.
[7] Peng Qi, Yuhao Zhang, and Christopher D. Manning. 2020. Stay Hungry, Stay Focused: Generating Informative and Specific Questions in Information-Seeking Conversations. arXiv preprint arXiv:2004.14530 (2020).
[8] Yueming Sun and Yi Zhang. 2018. Conversational Recommender System. In The 41st International ACM SIGIR Conference on Research & Development in Information Retrieval (SIGIR '18). ACM, New York, NY, USA, 235-244.
[9] Yongfeng Zhang, Xu Chen, Qingyao Ai, Liu Yang, and W. Bruce Croft. 2018. Towards Conversational Search and Recommendation: System Ask, User Respond. In Proceedings of the 27th ACM International Conference on Information and Knowledge Management (CIKM '18). ACM, New York, NY, USA, 177-186.
[10] Jie Zou, Yifan Chen, and Evangelos Kanoulas. 2020. Towards Question-Based Recommender Systems. In Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR '20). Association for Computing Machinery, New York, NY, USA, 881-890.
[11] Jie Zou and Evangelos Kanoulas. 2019. Learning to Ask: Question-Based Sequential Bayesian Product Search. In Proceedings of the 28th ACM International Conference on Information and Knowledge Management (CIKM '19). Association for Computing Machinery, New York, NY, USA, 369-378.
[12] Jie Zou and Evangelos Kanoulas. 2020. Towards Question-Based High-Recall Information Retrieval: Locating the Last Few Relevant Documents for Technology-Assisted Reviews. ACM Trans. Inf. Syst. 38, 3, Article 27 (May 2020), 35 pages.
[13] Jie Zou, Dan Li, and Evangelos Kanoulas. 2018. Technology Assisted Reviews: Finding the Last Few Relevant Documents by Asking Yes/No Questions to Reviewers. In The 41st International ACM SIGIR Conference on Research & Development in Information Retrieval (SIGIR '18). ACM, New York, NY, USA.