Publication


Featured research published by Georg Rehm.


European Semantic Web Conference | 2016

Towards a Platform for Curation Technologies: Enriching Text Collections with a Semantic-Web Layer

Peter Bourgonje; Julián Moreno-Schneider; Jan Nehring; Georg Rehm; Felix Sasaki; Ankit Srivastava

In an attempt to put a Semantic Web-layer that provides linguistic analysis and discourse information on top of digital content, we develop a platform for digital curation technologies. The platform offers language-, knowledge- and data-aware services as a flexible set of workflows and pipelines for the efficient processing of various types of digital content. The platform is intended to enable human experts (knowledge workers) to get a grasp and understand the contents of large document collections in an efficient way so that they can curate, process and further analyse the collection according to their sector-specific needs.


Meeting of the Association for Computational Linguistics | 2017

Event Detection and Semantic Storytelling: Generating a Travelogue from a large Collection of Personal Letters.

Georg Rehm; Julian Moreno Schneider; Peter Bourgonje; Ankit Srivastava; Jan Nehring; Armin Berger; Luca König; Sören Räuchle; Jens Gerth

We present an approach to identifying a specific class of events, movement action events (MAEs), in a data set that consists of ca. 2,800 personal letters exchanged by the German architect Erich Mendelsohn and his wife, Luise. A backend system uses these and other semantic analysis results as input for an authoring environment that digital curators can use to produce new pieces of digital content. In our example case, the human expert will receive recommendations from the system with the goal of putting together a travelogue, i.e., a description of the trips and journeys undertaken by the couple. We describe the components and architecture and also apply the system to news data.


International Conference on Human Interface and Management of Information | 2017

Designing User Interfaces for Curation Technologies.

Georg Rehm; Jing He; Julián Moreno-Schneider; Jan Nehring; Joachim Quantz

Digital content and online media have reached an unprecedented level of relevance and importance. In the context of a research and technology transfer project on Digital Curation Technologies for online content we develop a platform that provides curation services that can be integrated into concrete curation or content management systems. In this project, the German Research Center for Artificial Intelligence (DFKI) collaborates with four Berlin-based SMEs that work with and on digital content in four different sectors. The curation services comprise several semantic text and document analytics processes as well as knowledge technologies that can be applied to document collections. The key objective of this set of curation services is to support knowledge workers and digital curators in their daily work, i.e., to automate or to semi-automate processes that the human experts are normally required to do intellectually and without tool support. The goal is to help this group of information and knowledge workers to become more efficient and more effective as well as to enable them to produce high-quality content in their respective sectors. In this article we concentrate on the current state of a user interface that is currently under development at ART+COM, one of the SME partners in the project. A second, more generic, i.e., not domain-specific user interface is under development at DFKI. In this article we describe the technology platform and the two different interfaces. We also take a look at the different requirements for ART+COM’s domain-specific and DFKI’s generic user interface.


International Conference on Human Interface and Management of Information | 2017

Towards User Interfaces for Semantic Storytelling

Julián Moreno-Schneider; Peter Bourgonje; Georg Rehm

Digital content and online media have reached an unprecedented level of relevance and importance. In the context of a research and technology transfer project on Digital Curation Technologies for online content we develop a Semantic Storytelling prototype. The approach is based on the semantic analysis of document collections, in which, among others, individual analysis results are, if possible, mapped to external knowledge bases. We interlink key information contained in the documents of the collection, which can be essentially conceptualised as automatic hypertext generation. With this semantic layer on top of the set of documents in place, we attempt to identify interesting, surprising, eye-opening relationships between different concepts or entities mentioned in the document collection. In this article we concentrate on the current state of the user interfaces of our Semantic Storytelling prototype.


International Conference of the German Society for Computational Linguistics and Language Technology | 2017

Automatic Classification of Abusive Language and Personal Attacks in Various Forms of Online Communication

Peter Bourgonje; Julián Moreno-Schneider; Ankit Srivastava; Georg Rehm

The sheer ease with which abusive and hateful utterances can be made online using today’s digital communication technologies (especially social media), typically from the comfort of one’s home and without any immediate negative repercussions, is responsible for their significant increase and global ubiquity. Natural Language Processing technologies can help in addressing the negative effects of this development. In this contribution we evaluate a set of classification algorithms on two types of user-generated online content (tweets and Wikipedia Talk comments) in two languages (English and German). The different sets of data we work on were annotated for aspects such as racism, sexism, hate speech, aggression and personal attacks. While acknowledging issues with inter-annotator agreement for classification tasks using these labels, the focus of this paper is on classifying the data according to the annotated characteristics using several text classification algorithms. For some classification tasks we are able to reach F-scores of up to 81.58.


Archive | 2015

Semantische Technologien und Standards für das mehrsprachige Europa

Georg Rehm; Felix Sasaki

In the context of multilingual semantic applications, this article examines the role of selected technologies and standards. Standardised semantic resources, and standardised procedures for using them in language technology applications and workflows, have the potential to decisively improve the quality of applications and to considerably simplify the application development process. One focus is the META-SHARE infrastructure, which was developed within the META-NET initiative and includes an XML metadata schema for cataloguing language resources. The other is the use of Linked Data for representing metadata and language resources; relevant standards here include DCAT, NIF and ITS. Following these practice-oriented considerations, the article concludes by placing them in a larger context: the multilingual European information society.


The Prague Bulletin of Mathematical Linguistics | 2017

Improving Machine Translation through Linked Data

Ankit Srivastava; Georg Rehm; Felix Sasaki

With the ever increasing availability of linked multilingual lexical resources, there is a renewed interest in extending Natural Language Processing (NLP) applications so that they can make use of the vast set of lexical knowledge bases available in the Semantic Web. In the case of Machine Translation, MT systems can potentially benefit from such a resource. Unknown words and ambiguous translations are among the most common sources of error. In this paper, we attempt to minimise these types of errors by interfacing Statistical Machine Translation (SMT) models with Linked Open Data (LOD) resources such as DBpedia and BabelNet. We perform several experiments based on the SMT system Moses and evaluate multiple strategies for exploiting knowledge from multilingual linked data in automatically translating named entities. We conclude with an analysis of best practices for multilingual linked data sets in order to optimise their benefit to multilingual and cross-lingual applications.


International Conference of the German Society for Computational Linguistics and Language Technology | 2017

Different German and English Coreference Resolution Models for Multi-domain Content Curation Scenarios

Ankit Srivastava; Sabine Weber; Peter Bourgonje; Georg Rehm

Coreference Resolution is the process of identifying all words and phrases in a text that refer to the same entity. It has proven to be a useful intermediary step for a number of natural language processing applications. In this paper, we describe three implementations for performing coreference resolution: rule-based, statistical, and projection-based (from English to German). After a comparative evaluation on benchmark datasets, we conclude with an application of these systems on German and English texts from different scenarios in digital curation such as an archive of personal letters, excerpts from a museum exhibition, and regional news articles.


International Conference of the German Society for Computational Linguistics and Language Technology | 2017

Different Types of Automated and Semi-automated Semantic Storytelling: Curation Technologies for Different Sectors

Georg Rehm; Julián Moreno-Schneider; Peter Bourgonje; Ankit Srivastava; Rolf Fricke; Jan Thomsen; Jing He; Joachim Quantz; Armin Berger; Luca König; Sören Räuchle; Jens Gerth; David Wabnitz

Many industries face an increasing need for smart systems that support the processing and generation of digital content. This is due both to an ever increasing amount of incoming content that needs to be processed faster and more efficiently, and to an ever increasing pressure to publish new content in cycles that are getting shorter and shorter. In a research and technology transfer project we develop a platform that provides content curation services that can be integrated into Content Management Systems, among others. In the project we develop curation services, which comprise semantic text and document analytics processes as well as knowledge technologies that can be applied to document collections. The key objective is to support digital curators in their daily work, i.e., to (semi-)automate processes that the human experts are normally required to carry out intellectually and, typically, without tool support. The goal is to enable knowledge workers to become more efficient and more effective as well as to produce high-quality content. In this article we focus on the current state of development with regard to semantic storytelling in our four use cases.


Meeting of the Association for Computational Linguistics | 2017

DFKI-DKT at SemEval-2017 Task 8: Rumour Detection and Classification using Cascading Heuristics.

Ankit Srivastava; Georg Rehm; Julian Moreno Schneider

Collaboration


Top Co-Authors

Jan Hajic

Charles University in Prague
