Wauter Bosma
VU University Amsterdam
Publications
Featured research published by Wauter Bosma.
agent-directed simulation | 2004
W. Lewis Johnson; Paola Rizzo; Wauter Bosma; Sander Kole; Mattijs Ghijsen; Herwin van Welbergen
Analysis of student-tutor coaching dialogs suggests that good human tutors attend to and attempt to influence the motivational state of learners. Moreover, they are sensitive to the social face of the learner, and seek to mitigate the potential face threat of their comments. This paper describes a dialog generator for pedagogical agents that takes motivation and face threat factors into account. This enables the agent to interact with learners in a socially appropriate fashion, and foster intrinsic motivation on the part of the learner, which in turn may lead to more positive learner affective states.
cross language evaluation forum | 2006
Wauter Bosma; Chris Callison-Burch
We describe a method for recognizing textual entailment that uses the length of the longest common subsequence (LCS) between two texts as its decision criterion. Rather than requiring strict word matching in the common subsequences, we perform a flexible match using automatically generated paraphrases. We find that using paraphrases instead of strict word matches improves the average F-measure from 0.22 to 0.36 on the CLEF 2006 Answer Validation Exercise for 7 languages.
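The LCS-based decision criterion described in this abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the word-level `PARAPHRASES` table, the normalization by hypothesis length, and the `threshold` value are all assumptions made for illustration, and the paper's automatically generated paraphrases may well operate on phrases rather than single words.

```python
from typing import Callable, Sequence

# Hypothetical, hand-written paraphrase pairs (an assumption for this sketch;
# the paper generates its paraphrases automatically).
PARAPHRASES = {
    frozenset(("bought", "purchased")),
    frozenset(("firm", "company")),
}


def flexible_match(x: str, y: str) -> bool:
    """Two words match if they are identical or listed as paraphrases."""
    return x == y or frozenset((x, y)) in PARAPHRASES


def lcs_length(a: Sequence, b: Sequence, match: Callable) -> int:
    """Length of the longest common subsequence of a and b under `match`."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if match(x, y):
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def entails(text: str, hypothesis: str, threshold: float = 0.8) -> bool:
    """Assumed decision rule: LCS length relative to hypothesis length."""
    t = text.lower().split()
    h = hypothesis.lower().split()
    if not h:
        return False
    return lcs_length(t, h, flexible_match) / len(h) >= threshold
```

With a strict matcher such as `lambda x, y: x == y`, the same `lcs_length` function gives the strict-word-match baseline that the abstract compares against.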
Archive | 2008
Wauter Bosma
The meaning of text appears to be tightly related to intentions and circumstances. Context sensitivity of meaning is addressed by theories of discourse structure. Few attempts have been made to exploit text organization in summarization. This thesis is an exploration of what knowledge of discourse structure can do for content selection as a subtask of automatic summarization, and query-based summarization in particular. Query-based summarization is the task of answering an arbitrary user query or question by using content from potentially relevant sources. This thesis presents a general framework for discourse oriented summarization, relying on graphs to represent semantic relations in discourse, and redundancy as a special type of semantic relation. Semantic relations occur on several levels of text analysis (query-relevance, coherence, layout, etc.), and a broad range of textual features may be required to detect them. The graph-based framework facilitates combining multiple features into an integrated semantic model of the documents to summarize. Recognizing redundancy and entailment relations between text passages is particularly important when a summary is generated of multiple documents, e.g. to avoid including redundant content in a summary. For this reason, I pay particular attention to recognizing textual entailment. Within this framework, a three-fold evaluation is performed to evaluate different aspects of discourse oriented summarization. The first is a user study, measuring the effect on user appreciation of using a particular type of knowledge for query-based summarization. In this study, three presentation strategies are compared: summarization using the rhetorical structure of the source, a baseline summarization method which uses the layout of the source, and a baseline presentation method which uses no summarization but just a concise answer to the query. 
Results show that knowledge of the rhetorical structure not only helps to provide the necessary context for the user to verify that the summary addresses the query adequately, but also to increase the amount of relevant content. The second evaluation is a comparison of implementations of the graph-based framework which are capable of fully automatic summarization. The two variables in the experiment are the set of textual features used to model the source and the algorithm used to search a graph for relevant content. The features are based on cosine similarity, and are realized as graph representations of the source. The graph search algorithms are inspired by existing algorithms in summarization. The quality of summaries is measured using the Rouge evaluation toolkit. The best performer would have ranked first (Rouge-2) or second (Rouge-SU4) if it had participated in the DUC 2005 query-based summarization challenge. The third study is an evaluation in the context of the DUC 2006 summarization challenge, which includes readability measurements as well as various content-based evaluation metrics. The evaluated automatic discourse oriented summarization system is similar to the one described above, but uses additional features, i.e. layout and textual entailment. The system performed well on readability at the cost of content-based scores which were well below the scores of the highest ranking DUC 2006 participant. This indicates a trade-off between readable, coherent content and useful content, an issue yet to be explored. Previous research implies that theories of text organization generalize well to multimedia. This suggests that the discourse oriented summarization framework applies to summarizing multimedia as well, provided sufficient knowledge of the organization of the (multimedia) source documents is available. 
The last study in this thesis investigates the applicability of structural relations in multimedia for generating picture-illustrated summaries, by relating summary content to picture-associated text (i.e. captions or surrounding paragraphs). Results suggest that captions are the more suitable annotation for selecting appropriate pictures. Automatic picture selection yields results similar to manual illustration when the manually selected picture is mainly decorative.
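The idea of representing a source as a graph of cosine-similarity relations and searching it for query-relevant content, as mentioned in the thesis abstract, can be sketched roughly as follows. This is only an illustration under stated assumptions, not the thesis's framework: the bag-of-words vectors, the one-step relevance propagation, and the names `select`, `k`, and `damping` are all hypothetical choices made for this example.

```python
import math
from collections import Counter


def bag(text: str) -> Counter:
    """Bag-of-words vector (assumed feature representation)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def build_graph(sentences: list[str]) -> list[list[float]]:
    """Adjacency matrix: edge weight = cosine similarity between sentences."""
    vecs = [bag(s) for s in sentences]
    n = len(vecs)
    return [
        [cosine(vecs[i], vecs[j]) if i != j else 0.0 for j in range(n)]
        for i in range(n)
    ]


def select(sentences: list[str], query: str, k: int = 2, damping: float = 0.5) -> list[str]:
    """Pick k sentences by query relevance plus one step of propagation
    along graph edges (a hypothetical search strategy)."""
    q = bag(query)
    direct = [cosine(bag(s), q) for s in sentences]
    graph = build_graph(sentences)
    n = len(sentences)
    scores = [
        direct[i] + damping * sum(graph[i][j] * direct[j] for j in range(n))
        for i in range(n)
    ]
    ranked = sorted(range(n), key=lambda i: scores[i], reverse=True)
    # Return the selected sentences in their original document order.
    return [sentences[i] for i in sorted(ranked[:k])]
```

Additional relation types (for example layout or entailment) could in principle be added as further weighted edges in the same adjacency structure, which is one way to read the abstract's point about combining multiple features in one graph.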
Theory and Applications of Natural Language Processing | 2011
Wauter Bosma; Erwin Marsi; Emiel Krahmer; Mariët Theune
In this chapter, we describe our efforts in text-to-text generation within the IMOGEN project. In particular, we describe two focus areas of research to improve the quality of the answer: (a) graph-based content selection to improve the answer in terms of usefulness, and (b) sentence fusion to improve the answer in terms of formulation. We use sentence fusion to join together multiple sentences in order to eliminate overlapping parts, thereby reducing redundancy. The results of this work have been applied in the IMIX system. This system uses a question answering system to pinpoint fragments of text which are relevant to the information need expressed by the user. A content selection system then uses these fragments as entry points in the text to formulate a more complete answer. Sentence fusion is applied to manipulate the result in order to increase the fluency of the text.
Theory and Applications of Natural Language Processing | 2011
Charlotte van Hooijdonk; Wauter Bosma; Emiel Krahmer; A. Maes; Mariët Theune
In this chapter we describe three experiments investigating multimodal information presentation in the context of a medical QA system. In Experiment 1, we wanted to know how non-experts design (multimodal) answers to medical questions, distinguishing between what questions and how questions. In Experiment 2, we concentrated on how people evaluate multimodal (text+picture) answer presentations on their informativeness and attractiveness. In Experiment 3, we evaluated two versions of an automatic picture selection method, and compared answer presentations with automatically selected pictures to answer presentations with manually selected pictures.
The 5th International Conference on Generative Approaches to the Lexicon (GL2009) | 2009
Wauter Bosma; Piek Vossen; Aitor Soroa; German Rigau; Maurizio Tesconi; Andrea Marchetti; Monica Monachini; C. Aliprandi
Journal of Network and Systems Management | 2007
Erwin Marsi; Emiel Krahmer; Wauter Bosma
Optics Express | 2006
Erwin Marsi; Emiel Krahmer; Wauter Bosma; Mariët Theune
Computational Linguistics | 2005
Wauter Bosma
Computational Linguistics | 2005
Wauter Bosma; Ton van der Wouden; Michaela Poss; Hilke Reckman; Crit Cremers