Publication


Featured research published by Suzan Verberne.


International ACM SIGIR Conference on Research and Development in Information Retrieval | 2007

Evaluating discourse-based answer extraction for why-question answering

Suzan Verberne; Lou Boves; Nelleke Oostdijk; P.A.J.M. Coppen

30th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2007)


Computational Linguistics | 2010

What Is Not in the Bag of Words for Why-QA?

Suzan Verberne; Lou Boves; Nelleke Oostdijk; P.A.J.M. Coppen

While developing an approach to why-QA, we extended a passage retrieval system that uses off-the-shelf retrieval technology with a re-ranking step incorporating structural information. We get significantly higher scores in terms of MRR@150 (from 0.25 to 0.34) and success@10. The 23% improvement that we reach in terms of MRR is comparable to the improvement reached on different QA tasks by other researchers in the field, although our re-ranking approach is based on relatively lightweight overlap measures incorporating syntactic constituents, cue words, and document structure.
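The two evaluation measures named in this abstract can be illustrated with a minimal sketch. It assumes the standard definitions: the reciprocal rank of the first relevant answer within the top k results, averaged over questions, and the fraction of questions with at least one relevant answer in the top k. The function names and the toy rankings are hypothetical and are not taken from the paper.

```python
from typing import Sequence


def reciprocal_rank_at_k(relevance: Sequence[bool], k: int) -> float:
    """Return 1/rank of the first relevant item within the top k, else 0.0."""
    for rank, is_relevant in enumerate(relevance[:k], start=1):
        if is_relevant:
            return 1.0 / rank
    return 0.0


def mrr_at_k(rankings: Sequence[Sequence[bool]], k: int) -> float:
    """Mean reciprocal rank over all questions, cut off at rank k."""
    if not rankings:
        return 0.0
    return sum(reciprocal_rank_at_k(r, k) for r in rankings) / len(rankings)


def success_at_k(rankings: Sequence[Sequence[bool]], k: int) -> float:
    """Fraction of questions with at least one relevant item in the top k."""
    if not rankings:
        return 0.0
    return sum(1 for r in rankings if any(r[:k])) / len(rankings)


# Hypothetical toy data: one relevance list per question, in ranked order.
toy_rankings = [
    [False, True, False],         # first relevant answer at rank 2
    [True, False, False],         # first relevant answer at rank 1
    [False, False, False],        # no relevant answer retrieved
    [False, False, False, True],  # first relevant answer at rank 4
]

print(mrr_at_k(toy_rankings, k=150))
print(success_at_k(toy_rankings, k=3))
```

With a cutoff of 3, the fourth question no longer contributes to either measure, which is why the cutoff is reported alongside the score.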


Information Retrieval | 2011

Learning to rank for why-question answering

Suzan Verberne; Hans van Halteren; D.L. Theijssen; Stephan Raaijmakers; Lou Boves

In this paper, we evaluate a number of machine learning techniques for the task of ranking answers to why-questions. We use TF-IDF together with a set of 36 linguistically motivated features that characterize questions and answers. We experiment with a number of machine learning techniques (among which several classifiers and regression techniques, Ranking SVM and SVMmap) in various settings. The purpose of the experiments is to assess how the different machine learning approaches can cope with our highly imbalanced binary relevance data, with and without hyperparameter tuning. We find that with all machine learning techniques, we can obtain an MRR score that is significantly above the TF-IDF baseline of 0.25 and not significantly lower than the best score of 0.35. We provide an in-depth analysis of the effect of data imbalance and hyperparameter tuning, and we relate our findings to previous research on learning to rank for Information Retrieval.


Conference of the European Chapter of the Association for Computational Linguistics | 2006

Developing an approach for why-question answering

Suzan Verberne

In the current project, we aim at developing an approach for automatically answering why-questions. We created a data collection for research, development and evaluation of a method for automatically answering why-questions (why-QA). The resulting collection comprises 395 why-questions. For each question, the source document and one or two user-formulated answers are available in the data set. The resulting data set is of importance for our research and it will contribute to and stimulate other research in the field of why-QA. We developed a question analysis method for why-questions, based on syntactic categorization and answer type determination. The quality of the output of this module is promising for future development of our method for why-QA.


Computational Linguistics | 2013

Text Representations for Patent Classification

Eva D'hondt; Suzan Verberne; Cornelis H. A. Koster; Lou Boves

With the increasing rate of patent application filings, automated patent classification is of rising economic importance. This article investigates how patent classification can be improved by using different representations of the patent documents. Using the Linguistic Classification System (LCS), we compare the impact of adding statistical phrases (in the form of bigrams) and linguistic phrases (in two different dependency formats) to the standard bag-of-words text representation on a subset of 532,264 English abstracts from the CLEF-IP 2010 corpus. In contrast to previous findings on classification with phrases in the Reuters-21578 data set, for patent classification the addition of phrases results in significant improvements over the unigram baseline. The best results were achieved by combining all four representations, and the second best by combining unigrams and lemmatized bigrams. This article includes extensive analyses of the class models (a.k.a. class profiles) created by the classifiers in the LCS framework, to examine which types of phrases are most informative for patent classification. It appears that bigrams contribute most to improvements in classification accuracy. Similar experiments were performed on subsets of French and German abstracts to investigate the generalizability of these findings.
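As a rough illustration of adding statistical phrases to a bag-of-words representation, the sketch below counts unigrams and adjacent-word bigrams in one feature bag. It is a simplification under stated assumptions: tokenization is plain whitespace splitting, there is no lemmatization or dependency parsing, and it does not use the Linguistic Classification System. The function names and the example sentence are hypothetical.

```python
from collections import Counter
from typing import List


def tokenize(text: str) -> List[str]:
    """Lowercase and split on whitespace (an assumption; no lemmatization)."""
    return text.lower().split()


def unigram_bigram_features(text: str) -> Counter:
    """Count unigrams and adjacent-word bigrams in a single feature bag."""
    tokens = tokenize(text)
    features = Counter(tokens)
    # Bigrams are joined with an underscore so they cannot collide with unigrams.
    features.update(f"{first}_{second}" for first, second in zip(tokens, tokens[1:]))
    return features


# Hypothetical example text, not taken from the CLEF-IP corpus.
example = unigram_bigram_features("optical fiber cable with optical fiber core")
print(example.most_common(4))
```

A classifier would then be trained on these combined counts instead of on the unigram counts alone.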


International ACM SIGIR Conference on Research and Development in Information Retrieval | 2007

Paragraph retrieval for why-question answering

Suzan Verberne

In the current research project, we aim at developing a system for answering why-questions (why-QA). In previous research, we investigated the possibilities of answer extraction for why-QA exploiting discourse structure in the source text. One conclusion was that many why-questions require a complete paragraph as an answer. In the present paper, we will first discuss the results and the main conclusions that we obtained from these experiments. Then, we will present our research plans concerning paragraph retrieval for why-QA.


Patent Information Retrieval | 2011

Phrase-Based Document Categorization

Cornelis H. A. Koster; Jean Beney; Suzan Verberne; Merijn Vogel

This chapter takes a fresh look at an old idea in Information Retrieval: the use of linguistically extracted phrases as terms in the automatic categorization of documents, and in particular the pre-classification of patent applications. In Information Retrieval, little or no evidence has been found until now that document categorization benefits from the application of linguistic techniques. Classification algorithms using the most cleverly designed linguistic representations typically did not perform better than those simply using the bag-of-words representation. We have investigated the use of dependency triples as terms in document categorization, according to a dependency model based on the notion of aboutness and using normalizing transformations to enhance recall. We describe a number of large-scale experiments with different document representations, test collections and even languages, presenting evidence that adding such triples to the words in a bag-of-terms document representation may lead to a statistically significant increase in the accuracy of document categorization.


European Conference on Information Retrieval | 2015

User Simulations for Interactive Search: Evaluating Personalized Query Suggestion

Suzan Verberne; Maya Sappelli; Kalervo Järvelin; Wessel Kraaij

In this paper, we address the question “what is the influence of user search behaviour on the effectiveness of personalized query suggestion?”. We implemented a method for query suggestion that generates candidate follow-up queries from the documents clicked by the user. This is a potentially effective method for query suggestion, but it heavily depends on user behaviour. We set up a series of experiments in which we simulate a large range of user session behaviour to investigate its influence. We found that query suggestion is not profitable for all user types. We identified a number of significant effects of user behaviour on session effectiveness. In general, it appears that there is extensive interplay between the examination behaviour, the term selection behaviour, the clicking behaviour and the query modification strategy. The results suggest that query suggestion strategies need to be adapted to specific user behaviours.


Information Sciences | 2016

Assessing e-mail intent and tasks in e-mail messages

Maya Sappelli; Gabriella Pasi; Suzan Verberne; M.H.T. de Boer; Wessel Kraaij

In this paper, we analyze corporate e-mail messages as a medium to convey work tasks. Research indicates that categorization of e-mail could alleviate the common problem of information overload. Although e-mail clients provide possibilities of e-mail categorization, not many users spend effort on proper e-mail management. Since e-mail clients are often used for task management, we argue that intent- and task-based categorizations might be what is missing from current systems. We propose a taxonomy of tasks that are expressed through e-mail messages. With this taxonomy, we manually annotated two e-mail datasets (Enron and Avocado), and evaluated the validity of the dimensions in the taxonomy. Furthermore, we investigated the potential for automatic e-mail classification in a machine learning experiment. We found that approximately half of the corporate e-mail messages contain at least one task, mostly informational or procedural in nature. We show that automatic detection of the number of tasks in an e-mail message is possible with 71% accuracy. One important finding is that it is possible to use the e-mails from one company to train a classifier to classify e-mails from another company. Detecting how many tasks a message contains, whether a reply is expected, or what the spatial and time sensitivity of such a task is, can help in providing a more detailed priority estimation of the message for the recipient. Such a priority-based categorization can support knowledge workers in their battle against e-mail overload.


European Conference on Information Retrieval | 2014

Query Term Suggestion in Academic Search

Suzan Verberne; Maya Sappelli; Wessel Kraaij

In this paper, we evaluate query term suggestion in the context of academic professional search. Our overall goal is to support scientists in their information seeking tasks. We set up an interactive search system in which terms are extracted from clicked documents and suggested to the user before every query specification step. We evaluated our method with the iSearch collection of academic information seeking behaviour and crowdsourced term relevance judgements. We found that query term suggestion can significantly improve recall in academic search.
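The idea of extracting candidate terms from clicked documents can be sketched as follows. The abstract does not say how terms are scored, so this sketch assumes a plain frequency count with a small stopword list and excludes terms already in the query. The function name, the stopword list, and the example documents are all hypothetical.

```python
from collections import Counter
from typing import Iterable, List

# A tiny illustrative stopword list (an assumption, not from the paper).
STOPWORDS = frozenset({"the", "a", "of", "in", "and", "for", "to", "is"})


def suggest_terms(clicked_docs: Iterable[str], query: str, n: int = 3) -> List[str]:
    """Suggest the n most frequent non-query, non-stopword terms from clicked documents."""
    query_terms = set(query.lower().split())
    counts: Counter = Counter()
    for doc in clicked_docs:
        for token in doc.lower().split():
            if token not in STOPWORDS and token not in query_terms:
                counts[token] += 1
    # Sort by descending frequency, then alphabetically for a stable order.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [term for term, _ in ranked[:n]]


# Hypothetical clicked documents for the query "neutrino".
clicked = [
    "neutrino oscillation in the detector",
    "detector calibration for neutrino experiments",
]
print(suggest_terms(clicked, "neutrino", n=2))
```

In an interactive system, the suggested terms would be recomputed after each click and shown before the next query is formulated.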

Collaboration

Top Co-Authors

Wessel Kraaij, Radboud University Nijmegen
Lou Boves, Radboud University Nijmegen
Maya Sappelli, Radboud University Nijmegen
Nelleke Oostdijk, Radboud University Nijmegen
P.A.J.M. Coppen, Radboud University Nijmegen
Eva D'hondt, Radboud University Nijmegen
Max Hinne, Radboud University Nijmegen
D.L. Theijssen, Radboud University Nijmegen