Publication


Featured research published by Mijail A. Kabadjov.


decision support systems | 2011

Creating Sentiment Dictionaries via Triangulation

Josef Steinberger; Polina Lenkova; Mohamed Ebrahim; Maud Ehrman; Ali Hürriyetoğlu; Mijail A. Kabadjov; Ralf Steinberger; Hristo Tanev; Vanni Zavarella; Silvia Vázquez

The paper presents a semi-automatic approach to creating sentiment dictionaries in many languages. We first produced high-level gold-standard sentiment dictionaries for two languages and then translated them automatically into third languages. Those words that can be found in both target language word lists are likely to be useful because their word senses are likely to be similar to those in the two source languages. These dictionaries can be further corrected, extended and improved. In this paper, we present results that verify our triangulation hypothesis, by evaluating triangulated lists and comparing them to non-triangulated machine-translated word lists.
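The triangulation idea in this abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: the toy word lists, the Italian target language, the lookup tables standing in for machine translation, and the function names `translate` and `triangulate`. It is not the authors' implementation or data.

```python
def translate(words, table):
    """Collect the target-language translations of each source word.

    `table` is a hypothetical stand-in for a machine translation system:
    a dict mapping a source word to a list of target-language words.
    """
    translated = set()
    for word in words:
        translated.update(table.get(word, ()))
    return translated


def triangulate(source_a, table_a, source_b, table_b):
    """Keep only target words reached from BOTH source dictionaries."""
    return translate(source_a, table_a) & translate(source_b, table_b)


# Toy sentiment word lists for two source languages (assumed data).
english = {"good", "terrible", "fine"}
spanish = {"bueno", "terrible", "bien"}

# Toy translation tables into Italian. "fine" is deliberately mistranslated
# as "multa" (a monetary penalty) to mimic a wrong word sense.
en_to_it = {"good": ["buono"], "terrible": ["terribile"], "fine": ["multa"]}
es_to_it = {"bueno": ["buono"], "terrible": ["terribile"], "bien": ["bene"]}

triangulated = triangulate(english, en_to_it, spanish, es_to_it)
print(sorted(triangulated))
```

In this toy setup the wrong-sense translation "multa" is filtered out because only one source language produces it. The same intersection also drops "bene", a valid sentiment word, which is consistent with the abstract's note that the resulting dictionaries can be further corrected and extended.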


web intelligence | 2009

Opinion Mining on Newspaper Quotations

Alexandra Balahur; Ralf Steinberger; Erik Van der Goot; Bruno Pouliquen; Mijail A. Kabadjov

Opinion mining is the task of extracting from a set of documents opinions expressed by a source on a specified target. This article presents a comparative study on the methods and resources that can be employed for mining opinions from quotations (reported speech) in newspaper articles. We show the difficulty of this task, motivated by the presence of different possible targets and the large variety of affect phenomena that quotes contain. We evaluate our approaches using annotated quotations extracted from news provided by the EMM news gathering engine. We conclude that a generic opinion mining system requires both large lexicons and specialised training and testing data.


empirical methods in natural language processing | 2005

Improving LSA-based Summarization with Anaphora Resolution

Josef Steinberger; Mijail A. Kabadjov; Massimo Poesio; Olivia Sanchez-Graillet

We propose an approach to summarization exploiting both lexical information and the output of an automatic anaphoric resolver, and using Singular Value Decomposition (SVD) to identify the main terms. We demonstrate that adding anaphoric information results in significant performance improvements over a previously developed system, in which only lexical terms are used as the input to SVD. However, we also show that how anaphoric information is used is crucial: whereas using this information to add new terms does result in improved performance, simple substitution makes the performance worse.


cross language evaluation forum | 2010

Using parallel corpora for multilingual (multi-document) summarisation evaluation

Marco Turchi; Josef Steinberger; Mijail A. Kabadjov; Ralf Steinberger

We present a method for the evaluation of multilingual multi-document summarisation that saves precious annotation time and makes the evaluation results directly comparable across languages. The approach is based on manually selecting the most important sentences in a cluster of documents from a sentence-aligned parallel corpus, and on projecting the sentence selection to various target languages. We also present two ways of exploiting inter-annotator agreement levels, apply them both to a baseline sentence extraction summariser in seven languages, and discuss the result differences between the two evaluation versions, as well as a preliminary analysis between languages. The same method can in principle be used to evaluate single-document summarisers or information extraction tools.
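The projection step mentioned in this abstract can be sketched as follows, assuming the parallel corpus is stored as one list of sentences per language with aligned sentences sharing the same index. The data layout, the sample sentences, and the function name `project_selection` are hypothetical and only illustrate the idea.

```python
def project_selection(selected_ids, parallel_corpus, target_lang):
    """Return the target-language sentences aligned with the selected indices.

    `parallel_corpus` maps a language code to a list of sentences; position i
    in every list is assumed to be a translation of the same sentence.
    """
    sentences = parallel_corpus[target_lang]
    return [sentences[i] for i in sorted(selected_ids)]


# Toy sentence-aligned corpus (assumed data).
corpus = {
    "en": ["The summit opened today.", "Delegates arrived early.", "A deal was signed."],
    "fr": ["Le sommet a ouvert aujourd'hui.", "Les delegues sont arrives tot.", "Un accord a ete signe."],
}

# An annotator marks sentences 0 and 2 as important in one language only.
important = {2, 0}
gold_fr = project_selection(important, corpus, "fr")
print(gold_fr)
```

Because the selection is made once and reused through the alignment, every language is evaluated against the same set of gold sentences, which is what makes the scores comparable.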


annual meeting of the special interest group on discourse and dialogue | 2015

MultiLing 2015: Multilingual Summarization of Single and Multi-Documents, On-line Fora, and Call-center Conversations

George Giannakopoulos; Jeff Kubina; John M. Conroy; Josef Steinberger; Benoit Favre; Mijail A. Kabadjov; Udo Kruschwitz; Massimo Poesio

In this paper we present an overview of MultiLing 2015, a special session at SIGdial 2015. MultiLing is a community-driven initiative that pushes the state-of-the-art in Automatic Summarization by providing data sets and fostering further research and development of summarization systems. There were in total 23 participants this year submitting their system outputs to one or more of the four tasks of MultiLing: MSS, MMS, OnForumS and CCCS. We provide a brief overview of each task and its participation and evaluation.


european conference on machine learning | 2010

NewsGist: a multilingual statistical news summarizer

Mijail A. Kabadjov; Martin Atkinson; Josef Steinberger; Ralf Steinberger; Erik Van der Goot

In this paper we present NewsGist, a multilingual, multi-document news summarization system underpinned by the Singular Value Decomposition (SVD) paradigm for document summarization and purpose-built for the Europe Media Monitor (EMM). The summarization method employed yielded state-of-the-art performance for English at the Update Summarization task of the last Text Analysis Conference (TAC) 2009 and integrated with EMM represents the first online summarization system able to produce summaries for so many languages. We discuss the context and motivation for developing the system and provide an overview of its architecture. The paper is intended to serve as accompaniment of a live demo of the system, which can be of interest to researchers and engineers working on multilingual open-source news analysis and mining.


intelligent information systems | 2012

Challenges and solutions in the opinion summarization of user-generated content

Alexandra Balahur; Mijail A. Kabadjov; Josef Steinberger; Ralf Steinberger; Andrés Montoyo

The present is marked by the influence of the Social Web on societies and people worldwide. In this context, users generate large amounts of data, especially containing opinion, which has been proven useful for many real-world applications. In order to extract knowledge from user-generated content, automatic methods must be developed. In this paper, we present different approaches to multi-document summarization of opinion from blogs and reviews. We apply these approaches to: (a) identify positive and negative opinions in blog threads in order to produce a list of arguments in favor and against a given topic and (b) summarize the opinion expressed in reviews. Subsequently, we evaluate the proposed methods on two distinct datasets and analyze the quality of the obtained results, as well as discuss the errors produced. Although much remains to be done, the approaches we propose obtain encouraging results and point to clear directions in which further improvements can be made.


web intelligence | 2009

Multilingual Statistical News Summarisation: Preliminary Experiments with English

Mijail A. Kabadjov; Josef Steinberger; Bruno Pouliquen; Ralf Steinberger; Massimo Poesio

In this paper we present a generic approach for summarising multilingual news clusters such as the ones produced by the Europe Media Monitor (EMM) system. It is generic because it uses robust statistical techniques to perform the summarisation step and its multilinguality is inherited from the multilingual entity disambiguation system used to build the source representation. We ran preliminary experiments with the TAC 2008 data, an English corpus for summarisation research, and we obtained promising improvements over a summarisation system ranked in the top 20% at the TAC 2008 competition.


Multi-source, Multilingual Information Extraction and Summarization | 2013

Multilingual Statistical News Summarization

Mijail A. Kabadjov; Josef Steinberger; Ralf Steinberger

In this chapter we present a generic approach for summarizing clusters of multilingual news articles such as the ones produced by the Europe Media Monitor (EMM) system. Our approach uses robust statistical techniques as well as multilingual tools for named entity recognition and disambiguation to produce entity-centered summaries. We ran experiments with the TAC 2008 and 2009 data sets (English corpora for summarization research) and obtained very promising results; at TAC 2009 our runs attained top rank for linguistic quality and second best for overall responsiveness. We also ran a small-scale evaluation on languages other than English, thereby demonstrating the multilinguality of our approach, but also providing interesting evidence that contradicts the pervasive assumption “if it works for English, it works for any language”. Finally, we present an online system currently under development which will eventually incorporate all the elements of the summarization approach discussed here, and we show sample output summaries in various languages.


language resources and evaluation | 2011

Expanding a multilingual media monitoring and information extraction tool to a new language: Swahili

Ralf Steinberger; Sylvia Ombuya; Mijail A. Kabadjov; Bruno Pouliquen; Leonida Della Rocca; Jenya Belyaeva; Monica de Paola; Camelia Ignat; Erik Van der Goot

The Europe Media Monitor (EMM) family of applications is a set of multilingual tools that gather, cluster and classify news in currently fifty languages and that extract named entities and quotations (reported speech) from twenty languages. In this paper, we describe the recent effort of adding the African Bantu language Swahili to EMM. EMM is designed in an entirely modular way, allowing plugging in a new language by providing the language-specific resources for that language. We thus describe the type of language-specific resources needed, the effort involved, and ways of boot-strapping the generation of these resources in order to keep the effort of adding a new language to a minimum. The text analysis applications pursued in our efforts include clustering, classification, recognition and disambiguation of named entities (persons, organisations and locations), recognition and normalisation of date expressions, as well as the identification of reported speech quotations by and about people.

Collaboration


Dive into Mijail A. Kabadjov's collaborations.

Top Co-Authors

Josef Steinberger

University of West Bohemia

Bruno Pouliquen

University of West Bohemia

Jenya Belyaeva

European Food Safety Authority

Ander Soraluze

University of the Basque Country
