Raphael Hoffmann | Researchain

Publication

Featured research published by Raphael Hoffmann.

knowledge discovery and data mining | 2008

Information extraction from Wikipedia: moving down the long tail

Fei Wu; Raphael Hoffmann; Daniel S. Weld

Not only is Wikipedia a comprehensive source of quality information, it has several kinds of internal structure (e.g., relational summaries known as infoboxes), which enable self-supervised information extraction. While previous efforts at extraction from Wikipedia achieve high precision and recall on well-populated classes of articles, they fail in a larger number of cases, largely because incomplete articles and infrequent use of infoboxes lead to insufficient training data. This paper presents three novel techniques for increasing recall from Wikipedia's long tail of sparse classes: (1) shrinkage over an automatically learned subsumption taxonomy, (2) a retraining technique for improving the training data, and (3) supplementing results by extracting from the broader Web. Our experiments compare design variations and show that, used in concert, these techniques increase recall by a factor of 1.76 to 8.71 while maintaining or increasing precision.

user interface software and technology | 2007

Assieme: finding and leveraging implicit references in a web search interface for programmers

Raphael Hoffmann; James Fogarty; Daniel S. Weld

Programmers regularly use search as part of the development process, attempting to identify an appropriate API for a problem, seeking more information about an API, and seeking samples that show how to use an API. However, neither general-purpose search engines nor existing code search engines currently fit their needs, in large part because the information programmers need is distributed across many pages. We present Assieme, a Web search interface that effectively supports common programming search tasks by combining information from Web-accessible Java Archive (JAR) files, API documentation, and pages that include explanatory text and sample code. Assieme uses a novel approach to finding and resolving implicit references to Java packages, types, and members within sample code on the Web. In a study of programmers performing searches related to common programming tasks, we show that programmers obtain better solutions, using fewer queries, in the same amount of time spent using a general Web search interface.

ubiquitous computing | 2005

Fast and robust interface generation for ubiquitous applications

Krzysztof Z. Gajos; David William Christianson; Raphael Hoffmann; Tal Shaked; Kiera Henning; Jing Jing Long; Daniel S. Weld

We present Supple, a novel toolkit which automatically generates interfaces for ubiquitous applications. Designers need only specify declarative models of the interface and the desired hardware device, and Supple uses decision-theoretic optimization to automatically generate a concrete rendering for that device. This paper provides an overview of our system and describes key extensions that overcome limitations that barred the previous version (reported in [3]) from practical application. Specifically, we describe a functional modeling language capable of representing complex applications. We propose a new adaptation strategy, split interfaces, which speeds access to common interface features without disorienting the user. We present a customization facility that allows designers and end users to override Supple's automatic rendering decisions. We describe a distributed architecture which enables computationally impoverished devices to benefit from Supple interfaces. Finally, we present experiments and a preliminary user study that demonstrate the practicality of our approach.

human factors in computing systems | 2009

Amplifying community content creation with mixed initiative information extraction

Raphael Hoffmann; Saleema Amershi; Kayur Patel; Fei Wu; James Fogarty; Daniel S. Weld

Although existing work has explored both information extraction and community content creation, most research has focused on them in isolation. In contrast, we see the greatest leverage in the synergistic pairing of these methods as two interlocking feedback cycles. This paper explores the potential synergy promised if these cycles can be made to accelerate each other by exploiting the same edits to advance both community content creation and learning-based information extraction. We examine our proposed synergy in the context of Wikipedia infoboxes and the Kylin information extraction system. After developing and refining a set of interfaces to present the verification of Kylin extractions as a non-primary task in the context of Wikipedia articles, we develop an innovative use of Web search advertising services to study people engaged in some other primary task. We demonstrate our proposed synergy by analyzing our deployment from two complementary perspectives: (1) we show we accelerate community content creation by using Kylin's information extraction to significantly increase the likelihood that a person visiting a Wikipedia article as a part of some other primary task will spontaneously choose to help improve the article's infobox, and (2) we show we accelerate information extraction by using contributions collected from people interacting with our designs to significantly improve Kylin's extraction performance.

human factors in computing systems | 2008

Evaluating visual cues for window switching on large screens

Raphael Hoffmann; Patrick Baudisch; Daniel S. Weld

An increasing number of users are adopting large, multi-monitor displays. The resulting setups cover such a broad viewing angle that users can no longer simultaneously perceive all parts of the screen. Changes outside the user's visual field often go unnoticed. As a result, users sometimes have trouble locating the active window, for example after switching focus. This paper surveys graphical cues designed to direct visual attention and adapts them to window switching. Visual cues include five types of frames and masks around the target window and four trails leading to the window. We report the results of two user studies. The first evaluates each cue in isolation. The second evaluates hybrid techniques created by combining the most successful candidates from the first study. The best cues were visually sparse: combinations of curved frames that use color to pop out and tapered trails with a predictable origin.

conference on information and knowledge management | 2009

Semi-supervised learning of semantic classes for query understanding: from the web and for the web

Ye-Yi Wang; Raphael Hoffmann; Xiao Li; Jakub Szymanski

Understanding intents from search queries can improve a user's search experience and boost a site's advertising profits. Query tagging via statistical sequential labeling models has been shown to perform well, but annotating the training set for supervised learning requires substantial human effort. Domain-specific knowledge, such as semantic class lexicons, reduces the amount of manual annotation needed, but much human effort is still required to maintain these lexicons as search topics evolve over time. This paper investigates semi-supervised learning algorithms that leverage structured data (HTML lists) from the Web to automatically generate semantic-class lexicons, which are used to improve query tagging performance, even with far less training data. We focus our study on understanding the correct objectives for the semi-supervised lexicon learning algorithms that are crucial for the success of query tagging. Prior work on lexicon acquisition has largely focused on the precision of the lexicons, but we show that precision is not important if the lexicons are used for query tagging. A more adequate criterion should emphasize a trade-off between maximizing the recall of semantic class instances in the data and minimizing the confusability. This ensures that similar levels of precision and recall are observed on both the training and test sets, and hence prevents over-fitting the lexicon features. Experimental results on retail product queries show that enhancing a query tagger with lexicons learned with this objective reduces word-level tagging errors by up to 25% compared to the baseline tagger that does not use any lexicon features. In contrast, lexicons obtained through a precision-centric learning algorithm even degrade the performance of a tagger compared to the baseline. Furthermore, the proposed method outperforms one in which semantic class lexicons have been extracted from a database.

meeting of the association for computational linguistics | 2011