Sebastian Ebert
Ludwig Maximilian University of Munich
Publication
Featured research published by Sebastian Ebert.
arXiv: Computation and Language | 2016
Wenpeng Yin; Sebastian Ebert; Hinrich Schütze
Understanding open-domain text is one of the primary challenges in natural language processing (NLP). Machine comprehension benchmarks evaluate a system's ability to understand text based on the text content only. In this work, we investigate machine comprehension on MCTest, a question answering (QA) benchmark. Prior work is mainly based on feature engineering approaches. We propose a neural network framework, named hierarchical attention-based convolutional neural network (HABCNN), to address this task without any manually designed features. Specifically, we explore HABCNN for this task by two routes: one through traditional joint modeling of passage, question and answer, the other through textual entailment. HABCNN employs an attention mechanism to detect key phrases, key sentences and key snippets that are relevant to answering the question. Experiments show that HABCNN outperforms prior deep learning approaches by a large margin.
North American Chapter of the Association for Computational Linguistics | 2016
Sascha Rothe; Sebastian Ebert; Hinrich Schütze
Embeddings are generic representations that are useful for many NLP tasks. In this paper, we introduce DENSIFIER, a method that learns an orthogonal transformation of the embedding space that focuses the information relevant for a task in an ultradense subspace of a dimensionality that is smaller by a factor of 100 than the original space. We show that ultradense embeddings generated by DENSIFIER reach state of the art on a lexicon creation task in which words are annotated with three types of lexical information - sentiment, concreteness and frequency. On the SemEval2015 10B sentiment analysis task we show that no information is lost when the ultradense subspace is used, but training is an order of magnitude more efficient due to the compactness of the ultradense space.
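The idea of an orthogonal transformation that concentrates task information in a small subspace can be illustrated with a toy sketch. This is not the DENSIFIER training procedure: the abstract says the transformation is learned, whereas here the matrix `Q` is a hand-picked 2-D rotation, and the names `rotation`, `apply` and `norm` are hypothetical helpers chosen for illustration.

```python
import math


def rotation(theta):
    """Return a 2x2 rotation matrix, which is orthogonal by construction."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]


def apply(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * x for m, x in zip(row, vector)) for row in matrix]


def norm(vector):
    """Euclidean length of a vector."""
    return math.sqrt(sum(x * x for x in vector))


# Assumption: a fixed rotation stands in for the learned transformation.
Q = rotation(-math.pi / 4)

# A toy 2-D "embedding" whose information is spread over both dimensions.
embedding = [1.0, 1.0]

# After the rotation, all of the vector's mass sits in the first dimension.
rotated = apply(Q, embedding)

# The "ultradense subspace" here is simply the first coordinate.
ultradense = rotated[:1]

# Orthogonal transformations preserve vector length.
length_before = norm(embedding)
length_after = norm(rotated)
```

Because `Q` is orthogonal, the rotated space carries the same information as the original one; the sketch only shows how a well-chosen rotation lets a low-dimensional slice stand in for the full vector.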
International Conference on Frontiers in Handwriting Recognition | 2010
Sebastian Ebert; Marcus Liwicki; Andreas Dengel
In this paper we introduce a new layer for the task of handwriting recognition. We add semantic information by means of ontologies. The task of our recognizer therefore is not only to recognize the ASCII transcription of the handwritten document, but also to identify the semantic concepts which appear in the text. This task is called ontology-based information extraction (OBIE), which has been applied to electronic documents recently. OBIE methods first segment the text into tokens, then identify their values and their corresponding instances of the ontology, and finally try to generate new facts based on the text. To the authors’ knowledge, in this paper OBIE is proposed for the first time in handwriting literature. In our experiments we have evaluated the process up to the instantiation. We have found that using not only the top alternative, but also the k-best alternatives increases the performance of information extraction. Furthermore, the use of an ontology-based lexicon results in another performance increase.
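The benefit of using k-best alternatives together with an ontology-based lexicon can be sketched as follows. This is a minimal illustration, not the authors' system: the lexicon entries, the recognizer output and the function name `find_instances` are all invented for the example, and the matching rule (first alternative found in the lexicon wins) is an assumption.

```python
def find_instances(k_best_per_word, lexicon, k):
    """Match each word position against an ontology lexicon.

    k_best_per_word: list of candidate lists, best candidate first.
    lexicon: maps a lower-cased surface form to an ontology concept.
    k: how many alternatives to consider per word position.
    """
    hits = []
    for position, alternatives in enumerate(k_best_per_word):
        for candidate in alternatives[:k]:
            concept = lexicon.get(candidate.lower())
            if concept is not None:
                hits.append((position, candidate, concept))
                break  # keep the highest-ranked matching alternative
    return hits


# Hypothetical lexicon derived from an ontology.
lexicon = {"berlin": "City", "meeting": "Event"}

# Hypothetical recognizer output: the top alternative for the third
# word is a misrecognition, the correct reading is ranked second.
k_best = [
    ["Meeting", "Meeling"],
    ["in"],
    ["Berlln", "Berlin", "Bertin"],
]

top1_hits = find_instances(k_best, lexicon, k=1)
top3_hits = find_instances(k_best, lexicon, k=3)
```

With only the top alternative, the misrecognized city name is missed; with three alternatives per word, the lexicon recovers it.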
Empirical Methods in Natural Language Processing | 2015
Sebastian Ebert; Ngoc Thang Vu; Hinrich Schütze
Sentiment lexicons and other linguistic knowledge have proved beneficial in polarity classification. This paper introduces a linguistically informed Convolutional Neural Network (lingCNN), which incorporates this valuable kind of information into the model. We present two intuitive and simple methods: the first integrates word-level features, the second sentence-level features. By combining both types of features, our model achieves results that are comparable to state-of-the-art systems.
Empirical Methods in Natural Language Processing | 2016
Sebastian Ebert; Thomas Müller; Hinrich Schütze
This paper introduces STEM and LAMB, embeddings trained for stems and lemmata instead of for surface forms. For morphologically rich languages, they perform significantly better than standard embeddings on word similarity and polarity evaluations. On a new WordNet-based evaluation, STEM and LAMB are up to 50% better than standard embeddings. We show that both embeddings have high quality even for small dimensionality and training corpora.
North American Chapter of the Association for Computational Linguistics | 2015
Sebastian Ebert; Ngoc Thang Vu; Hinrich Schütze
This paper describes our automatic sentiment analysis system – CIS-positive – for SemEval 2015 Task 10 “Sentiment Analysis in Twitter”, subtask B “Message Polarity Classification”. In this system, we propose to normalize the Twitter data in a way that maximizes the coverage of sentiment lexicons and minimizes distracting elements. Furthermore, we integrate the output of Convolutional Neural Networks into Support Vector Machines for the polarity classification. Our system achieves a macro F1 score of the positive and negative class of 59.57 on the SemEval 2015 test data.
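The reported metric, a macro F1 score over the positive and negative classes, can be computed as in the sketch below. The label strings, the presence of a `neutral` label in the data and the toy predictions are assumptions made for illustration; the function names are hypothetical.

```python
def f1_for_class(gold, pred, label):
    """F1 score of a single class from parallel gold/predicted label lists."""
    pairs = list(zip(gold, pred))
    tp = sum(1 for g, p in pairs if g == label and p == label)
    fp = sum(1 for g, p in pairs if g != label and p == label)
    fn = sum(1 for g, p in pairs if g == label and p != label)
    if tp == 0:
        return 0.0  # avoids division by zero when nothing is correct
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def macro_f1_pos_neg(gold, pred):
    """Average the per-class F1 of the positive and negative classes only."""
    return (
        f1_for_class(gold, pred, "positive")
        + f1_for_class(gold, pred, "negative")
    ) / 2


# Toy data; other labels (here "neutral") still count as errors for
# precision and recall but get no F1 term of their own.
gold = ["positive", "negative", "neutral", "positive", "negative", "neutral"]
pred = ["positive", "neutral", "neutral", "negative", "negative", "positive"]

score = macro_f1_pos_neg(gold, pred)
```

On this toy data both classes have precision and recall of 0.5, so `score` is 0.5; the abstract reports its result on a 0 to 100 scale.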
Empirical Methods in Natural Language Processing | 2014
Sebastian Ebert; Hinrich Schütze
We put forward the hypothesis that high-accuracy sentiment analysis is only possible if word senses with different polarity are accurately recognized. We provide evidence for this hypothesis in a case study for the adjective “hard” and propose contextually enhanced sentiment lexicons that contain the information necessary for sentiment-relevant sense disambiguation. An experimental evaluation demonstrates that senses with different polarity can be distinguished well using a combination of standard and novel features.
Pattern Recognition Letters | 2014
Marcus Liwicki; Sebastian Ebert; Andreas Dengel
In this paper we introduce a new layer for the task of handwriting recognition (HWR), i.e., the use of semantic information in the form of Resource Description Framework (RDF) knowledge bases. In particular, two novel processing stages are proposed for the first time in the literature. The first stage is the inclusion of RDF knowledge bases into the HWR process, where we make use of a person's mental model. This process can be extended to use other ontological resources. The second stage is the transition from pure handwriting recognition to understanding the handwritten notes, i.e., the system extracts knowledge employing RDF knowledge bases. This is also called ontology-based information extraction (OBIE). The task of our recognizer therefore is not only to recognize the ASCII transcription of the handwritten document, but also to identify the semantic concepts which appear in the text. For both novel approaches we performed a set of experiments on various data. First, the recognition rate of the HWR system is increased on several documents. Second, the performance of information extraction is also remarkable. By using the k-best word recognition alternatives in the form of a lattice as an input for the OBIE system, the performance reaches a level which is very close to OBIE applied on pure ASCII text.
International Conference on Knowledge Based and Intelligent Information and Engineering Systems | 2011
Marcus Liwicki; Sebastian Ebert; Andreas Dengel
In this paper we discuss recent developments in handwriting recognition (HWR). In particular, the transition from pure handwriting recognition to understanding of the handwritten notes is described. We first summarize the state of the art in HWR. Next, two recent approaches for improving HWR and extracting knowledge are described. Experimental results on various data are reported and an outlook on future directions is given.
Archive | 2015
Sebastian Ebert; Ngoc Thang Vu; Hinrich Schütze