Charles Jochim
University of Stuttgart
Publication
Featured research published by Charles Jochim.
patent information retrieval | 2010
Charles Jochim; Christina Lioma; Hinrich Schütze; Steffen Koch; Thomas Ertl
Patent retrieval is a branch of Information Retrieval (IR) aiming to support patent professionals in retrieving patents that satisfy their information needs. Often, patent granting bodies require patents to be partially translated into one or more major foreign languages, so that language boundaries do not hinder their accessibility. This multilinguality of patent collections offers opportunities for improving patent retrieval. In this work we exploit these opportunities by applying query translation to patent retrieval. We expand monolingual patent queries with their translations, using both a domain-specific patent dictionary that we extract from the patent collection, and a general domain-free dictionary. Experimental evaluation on a standard CLEF-IP dataset shows that using either translation dictionary fetches similar results: query translation can help patent retrieval, but not always, and without great improvement compared to standard statistical monolingual query expansion (Rocchio). The improvement is greater when the source language is English, as opposed to French or German, a finding partly due to the effect of the complex French and German morphology upon translation accuracy, but also partly due to the prevalence of English in the collection. A thorough per-query analysis reveals that cases where standard query expansion fails (e.g. zero recall) can benefit from query translation.
information retrieval facility conference | 2011
Charles Jochim; Christina Lioma; Hinrich Schütze
Patent retrieval is a branch of Information Retrieval (IR) that aims to enable the challenging task of retrieving highly technical and often complicated patents. Typically, patent granting bodies translate patents into several major foreign languages, so that language boundaries do not hinder their accessibility. Given such multilingual patent collections, we posit that the patent translations can be exploited for facilitating patent retrieval. Specifically, we focus on the translation of patent queries from German and French, the morphology of which poses an extra challenge to retrieval. We compare two translation approaches that expand the query with (i) translated terms and (ii) translated phrases. Experimental evaluation on a standard CLEF-IP European Patent Office dataset reveals a novel finding: phrase translation may be more suited to French, and term translation may be more suited to German. We trace this finding to language morphology, and we conclude that tailoring the query translation per language can lead to improved results in patent retrieval.
Informatik Spektrum | 2010
Christian Rohrdantz; Steffen Koch; Charles Jochim; Gerhard Heyer; Gerik Scheuermann; Thomas Ertl; Hinrich Schütze; Daniel A. Keim
Methods and techniques for automatically processing large volumes of text documents and capturing their content have gained enormously in importance in recent years. While the availability of and access to digitized text documents have risen to a previously unimagined degree, capturing the semantic content of such document collections remains problematic. The expanding research field of visual text analysis and text visualization plays a key role in solving the practical problems that arise here. Using current application examples and an overview of the state of research, this article explains the diverse possibilities that visual text analysis opens up.
conference of the european chapter of the association for computational linguistics | 2009
Markus Dickinson; Charles Jochim
Building on the use of local contexts, or frames, for human category acquisition, we explore the treatment of contexts as categories. This allows us to examine and evaluate the categorical properties that local unsupervised methods can distinguish and their relationship to corpus POS tags. From there, we use lexical information to combine contexts in a way which preserves the intended category, providing a platform for grammatical category induction.
international conference on computational linguistics | 2012
Charles Jochim; Hinrich Schütze
international conference on computational linguistics | 2012
Florian Heimerl; Charles Jochim; Steffen Koch; Thomas Ertl
language resources and evaluation | 2010
Markus Dickinson; Charles Jochim
conference of the european chapter of the association for computational linguistics | 2017
Charles Jochim; Léa Amandine Deleris
north american chapter of the association for computational linguistics | 2018
Léa Amandine Deleris; Francesca Bonin; Elizabeth M. Daly; Stéphane Deparis; Yufang Hou; Charles Jochim; Yassine Lassoued; Killian Levacher
language resources and evaluation | 2018
Charles Jochim; Francesca Bonin; Roy Bar-Haim; Noam Slonim