Publication


Featured research published by Philipp Sorg.


applications of natural language to data bases | 2009

An experimental comparison of explicit semantic analysis implementations for cross-language retrieval

Philipp Sorg; Philipp Cimiano

Explicit Semantic Analysis (ESA) has recently been proposed as an approach to computing semantic relatedness between words (and indirectly also between texts). It thus has a natural application in information retrieval, showing the potential to alleviate the vocabulary mismatch problem inherent in standard bag-of-words models. The ESA model has also recently been extended to cross-lingual retrieval settings, which can be considered an extreme case of the vocabulary mismatch problem. The ESA approach actually represents a class of approaches and allows for various instantiations. As our first contribution, we generalize ESA in order to clearly show the degrees of freedom it provides. Second, we propose some variants of ESA along different dimensions, testing their impact on performance on a cross-lingual mate retrieval task on two datasets (JRC-ACQUIS and Multext). Our results are of interest because a systematic investigation has been missing so far and the differences between basic design choices are significant. We also show that the settings adopted in the original ESA implementation are reasonably good, which to our knowledge has not been demonstrated so far, but that they can still be significantly improved by tuning the right parameters, yielding a relative improvement on a cross-lingual mate retrieval task of between 62% (Multext) and 237% (JRC-ACQUIS) with respect to the original ESA model.


international conference on knowledge capture | 2011

Language resources extracted from Wikipedia

Denny Vrandecic; Philipp Sorg; Rudi Studer

Wikipedia provides an interesting amount of text for more than a hundred languages. This also includes languages for which no reference corpora or other linguistic resources are easily available. We have extracted background language models built from the content of Wikipedia in various languages. The models generated from Simple and English Wikipedia are compared to language models derived from other established corpora. The differences between the models with regard to term coverage, term distribution and correlation are described and discussed. We provide access to the full dataset and create visualizations of the language models that can be used for exploration. The paper describes the newly released dataset for 33 languages and the services that we provide on top of it.


conference on information and knowledge management | 2011

Detect'11: international workshop on DETecting and Exploiting Cultural diversiTy on the social web

Sergej Sizov; Stefan Siersdorfer; Thomas Gottron; Philipp Sorg

Sergej Sizov, WeST – Institute for Web Science and Technologies, University of Koblenz-Landau, 56070 Koblenz, Germany

Stefan Siersdorfer, Intelligent Access to Information, L3S Research Center, 30167 Hannover, Germany

Thomas Gottron, WeST – Institute for Web Science and Technologies, University of Koblenz-Landau, 56070 Koblenz, Germany

Philipp Sorg, Institut für Angewandte Informatik und Formale Beschreibungsverfahren, Karlsruhe Institute of Technology, 76133 Karlsruhe, Germany


Foundations for the Web of Information and Services. Ed.: D. Fensel | 2011

Combining Data-Driven and Semantic Approaches for Text Mining

Stephan Bloehdorn; Sebastian Blohm; Philipp Cimiano; Eugenie Giesbrecht; Andreas Hotho; Uta Lösch; Alexander Mädche; Eddie Mönch; Philipp Sorg; Steffen Staab; Johanna Völker

While the amount of structured data published on the Web keeps growing (fostered in particular by the Linked Open Data initiative), the Web still consists mainly of unstructured, in particular textual, content and is therefore a Web for human consumption. Thus, an important question is which techniques are most suitable to enable people to effectively access the large body of unstructured information available on the Web, whether it is semantic or not. While the hope is that semantic technologies can be combined with standard Information Retrieval approaches to enable more accurate retrieval, some researchers have argued against this view. They claim that only data-driven or inductive approaches are applicable to tasks requiring the organization of unstructured (mainly textual) data for retrieval purposes. We argue that the dichotomy between data-driven/inductive and semantic approaches is in fact a false one. We further argue that bottom-up or inductive approaches can be successfully combined with top-down or semantic approaches and illustrate this for a number of tasks such as Ontology Learning, Information Retrieval, Information Extraction and Text Mining.


international semantic web conference | 2008

Learning Methods in Multi-grained Query Answering

Philipp Sorg

This PhD proposal is about the development of new methods for information access. Two new approaches are proposed: Multi-Grained Query Answering, which bridges the gap between Information Retrieval and Question Answering, and Learning-Enhanced Query Answering, which enables the improvement of retrieval performance based on the experience of previous queries and answers.


international joint conference on artificial intelligence | 2009

Explicit versus latent concept models for cross-language information retrieval

Philipp Cimiano; Antje Schultz; Sergej Sizov; Philipp Sorg; Steffen Staab


cross-language evaluation forum | 2008

Cross-lingual Information Retrieval with Explicit Semantic Analysis

Philipp Sorg; Philipp Cimiano


data and knowledge engineering | 2012

Exploiting Wikipedia for cross-lingual and multilingual information retrieval

Philipp Sorg; Philipp Cimiano


national conference on artificial intelligence | 2008

Enriching the crosslingual link structure of Wikipedia - A classification-based approach

Philipp Sorg; Philipp Cimiano


cross-language evaluation forum | 2010

Overview of the Cross-lingual Expert Search (CriES) Pilot Challenge.

Philipp Sorg; Philipp Cimiano; Antje Schultz; Sergej Sizov

Collaboration


Explore Philipp Sorg's collaborations.

Top Co-Authors

Sergej Sizov

University of Düsseldorf

Antje Schultz

University of Koblenz and Landau

Steffen Staab

University of Koblenz and Landau

Thomas Gottron

University of Koblenz and Landau

Denny Vrandecic

Karlsruhe Institute of Technology

Eugenie Giesbrecht

Forschungszentrum Informatik