Felix Hamborg
University of Konstanz
Network
Latest external collaboration on country level. Dive into details by clicking on the dots.
Publication
Featured researches published by Felix Hamborg.
acm ieee joint conference on digital libraries | 2017
Felix Hamborg; Norman Meuschke; Bela Gipp
News aggregators capably handle the large amount of news that is published nowadays. However, these systems focus on the presentation of important, common information in news, but do not reveal different perspectives on the same topic. Thus, current news aggregators suffer from media bias, i.e. differences in the content or presentation of news. Finding such differences is crucial to reduce the effects of media bias. This paper presents matrix-based news analysis (MNA), a novel design for news exploration. MNA helps users gain a broad and diverse news understanding by presenting various news perspectives on the same topic. Furthermore, we present NewsBird, a news aggregator that implements MNA to find different perspectives on international news topics. The results of a case study demonstrate that NewsBird broadens the users news understanding while it also provides similar news aggregation functionalities as established systems.
conference on information and knowledge management | 2017
Norman Meuschke; Moritz Schubotz; Felix Hamborg; Tomáš Skopal; Bela Gipp
This paper presents, to our knowledge, the first study on analyzing mathematical expressions to detect academic plagiarism. We make the following contributions. First, we investigate confirmed cases of plagiarism to categorize the similarities of mathematical content commonly found in plagiarized publications. From this investigation, we derive possible feature selection and feature comparison strategies for developing math-based detection approaches and a ground truth for our experiments. Second, we create a test collection by embedding confirmed cases of plagiarism into the NTCIR-11 MathIR Task dataset, which contains approx. 60 million mathematical expressions in 105,120 documents from arXiv.org. Third, we develop a first math-based detection approach by implementing and evaluating different feature comparison approaches using an open source parallel data processing pipeline built using the Apache Flink framework. The best performing approach identifies all but two of our real-world test cases at the top rank and achieves a mean reciprocal rank of 0.86. The results show that mathematical expressions are promising text-independent features to identify academic plagiarism in large collections. To facilitate future research on math-based plagiarism detection, we make our source code and data available.
13th International Conference : iConference 2018 | 2018
Felix Hamborg; Soeren Lachnit; Moritz Schubotz; Thomas Hepp; Bela Gipp
Extraction of event descriptors from news articles is a commonly required task for various tasks, such as clustering related articles, summarization, and news aggregation. Due to the lack of generally usable and publicly available methods optimized for news, many researchers must redundantly implement such methods for their project. Answers to the five journalistic W questions (5Ws) describe the main event of a news article, i.e., who did what, when, where, and why. The main contribution of this paper is Giveme5W, the first open-source, syntax-based 5W extraction system for news articles. The system retrieves an article’s main event by extracting phrases that answer the journalistic 5Ws. In an evaluation with three assessors and 60 articles, we find that the extraction precision of 5W phrases is ( p = 0.7 ).
cross language evaluation forum | 2017
Moritz Schubotz; Leonard Krämer; Norman Meuschke; Felix Hamborg; Bela Gipp
Mathematical formulae in academic texts significantly contribute to the overall semantic content of such texts, especially in the fields of Science, Technology, Engineering and Mathematics. Knowing the definitions of the identifiers in mathematical formulae is essential to understand the semantics of the formulae. Similar to the sense-making process of human readers, mathematical information retrieval systems can analyze the text that surrounds formulae to extract the definitions of identifiers occurring in the formulae. Several approaches for extracting the definitions of mathematical identifiers from documents have been proposed in recent years. So far, these approaches have been evaluated using different collections and gold standard datasets, which prevented comparative performance assessments. To facilitate future research on the task of identifier definition extraction, we make three contributions. First, we provide an automated evaluation framework, which uses the dataset and gold standard of the NTCIR-11 Math Retrieval Wikipedia task. Second, we compare existing identifier extraction approaches using the developed evaluation framework. Third, we present a new identifier extraction approach that uses machine learning to combine the well-performing features of previous approaches. The new approach increases the precision of extracting identifier definitions from 17.85% to 48.60%, and increases the recall from 22.58% to 28.06%. The evaluation framework, the dataset and our source code are openly available at: https://ident.formulasearchengine.com.
acm ieee joint conference on digital libraries | 2018
Felix Hamborg; Corinna Breitinger; Moritz Schubotz; Soeren Lachnit; Bela Gipp
The identification and extraction of the events that news articles report on is a commonly performed task in the analysis workflow of various projects that analyze news articles. However, due to the lack of universally usable and publicly available methods for news articles, many researchers must redundantly implement methods for event extraction to be used within their projects. Answers to the journalistic five W and one H questions (5W1H) describe the main event of a news story, i.e., who did what, when, where, why, and how. We propose Giveme5W1H, an open-source system that uses syntactic and domain-specific rules to extract phrases answering the 5W1H. In our evaluation, we find that the extraction precision of 5W1H phrases is p=0.64, and p=0.79 for the first four W questions, which discretely describe an event.
International Journal on Digital Libraries | 2018
Felix Hamborg; Norman Meuschke; Bela Gipp
Media bias describes differences in the content or presentation of news. It is an ubiquitous phenomenon in news coverage that can have severely negative effects on individuals and society. Identifying media bias is a challenging problem, for which current information systems offer little support. News aggregators are the most important class of systems to support users in coping with the large amount of news that is published nowadays. These systems focus on identifying and presenting important, common information in news articles, but do not reveal different perspectives on the same topic. Due to this analysis approach, current news aggregators cannot effectively reveal media bias. To address this problem, we present matrix-based news aggregation, a novel approach for news exploration that helps users gain a broad and diverse news understanding by presenting various perspectives on the same news topic. Additionally, we present NewsBird, an open-source news aggregator that implements matrix-based news aggregation for international news topics. The results of a user study showed that NewsBird more effectively broadens the user’s news understanding than the list-based visualization approach employed by established news aggregators, while achieving comparable effectiveness and efficiency for the two main use cases of news consumption: getting an overview of and finding details on current news topics.
Archive | 2017
Felix Hamborg; Norman Meuschke; Akiko Aizawa; Bela Gipp
Depending on the news source, a reader can be exposed to a different narrative and conflicting perceptions for the same event. Today, news aggregators help users cope with the large volume of news published daily. However, aggregators focus on presenting shared information, but do not expose the different perspectives from articles on same topics. Thus, users of such aggregators suffer from media bias, which is often implemented intentionally to influence public opinion. In this paper, we present NewsBird, an aggregator that presents shared and different information on topics. Currently, NewsBird reveals different perspectives on international news. Our system has led to insights about media bias and news analysis, which we use to propose approaches to be investigated in future research. Our vision is to provide a system that reveals media bias, and thus ultimately allows users to make their own judgement on the potential bias inherent in news.
15th International Symposium of Information Science (ISI 2017) | 2017
Felix Hamborg; Norman Meuschke; Corinna Breitinger; Bela Gipp
Proceedings of the 20th Nordic Conference of Computational Linguistics (NODALIDA 2015) | 2015
Dominik Sacha; Yuki Asano; Christian Rohrdantz; Felix Hamborg; Daniel A. Keim; Bettina Braun; Miriam Butt
international acm sigir conference on research and development in information retrieval | 2017
Felix Hamborg; Moustafa Elmaghraby; Corinna Breitinger; Bela Gipp