Kateryna Tymoshenko
Fondazione Bruno Kessler
Publications
Featured research published by Kateryna Tymoshenko.
International Semantic Web Conference | 2010
Volha Bryl; Claudio Giuliano; Luciano Serafini; Kateryna Tymoshenko
Systems based on statistical and machine learning methods have been shown to be extremely effective and scalable for the analysis of large amounts of textual data. However, in recent years it has become evident that one of the most important directions of improvement in natural language processing (NLP) tasks, such as word sense disambiguation, coreference resolution, relation extraction, and other tasks related to knowledge extraction, is the exploitation of semantics. While in the past the unavailability of rich and complete semantic descriptions constituted a serious limitation on their applicability, the Semantic Web has now made available a large amount of logically encoded information (e.g., ontologies, RDF(S) data, linked data), which constitutes a valuable source of semantics. However, web semantics cannot be easily plugged into machine learning systems. The objective of this paper is therefore to define a reference methodology for combining semantic information available on the web in the form of logical theories with statistical methods for NLP. The major problems that must be solved to implement this methodology concern (i) the selection of the correct and minimal knowledge among the large amount available on the web, (ii) the representation of uncertain knowledge, and (iii) the resolution and encoding of the rules that combine knowledge retrieved from Semantic Web sources with semantics in the text. To evaluate the appropriateness of the approach, we present an application of the methodology to the problem of intra-document coreference resolution, and we show, through experiments on a standard dataset, how the injection of knowledge improves performance on this task.
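To illustrate the kind of knowledge injection the abstract describes, the following is a minimal sketch, not the paper's actual system: a type-compatibility check derived from a toy ontology is added as a feature for mention-pair coreference classification. The mini subsumption hierarchy and the mentions below are hypothetical.

```python
# Toy subsumption hierarchy: child type -> parent type
ONTOLOGY = {
    "Politician": "Person",
    "Scientist": "Person",
    "City": "Location",
}

def supertypes(t):
    """Return the set containing t and all its ancestors."""
    seen = {t}
    while t in ONTOLOGY:
        t = ONTOLOGY[t]
        seen.add(t)
    return seen

def type_compatible(t1, t2):
    """Two mentions may corefer only if one type subsumes the other."""
    return t1 in supertypes(t2) or t2 in supertypes(t1)

def pair_features(m1, m2):
    """Combine a surface feature with the knowledge-based one."""
    return {
        "same_head": m1["head"] == m2["head"],
        "type_compatible": type_compatible(m1["type"], m2["type"]),
    }

m1 = {"head": "Merkel", "type": "Politician"}
m2 = {"head": "she", "type": "Person"}
m3 = {"head": "Berlin", "type": "City"}

print(pair_features(m1, m2))  # type_compatible: True
print(pair_features(m1, m3))  # type_compatible: False
```

In a real pipeline, such features would come from ontologies or linked data retrieved from the Semantic Web and would feed a statistical classifier alongside the usual lexical and syntactic features.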
Artificial Intelligence | 2013
Sara Tonelli; Claudio Giuliano; Kateryna Tymoshenko
Many applications in natural language processing have been shown to achieve significant performance gains when exploiting semantic information extracted from high-quality annotated resources. However, the practical use of such resources is often limited by their restricted coverage. Furthermore, they are generally available only for English and a few other languages. We propose a novel methodology that, starting from a mapping between FrameNet lexical units and Wikipedia pages, automatically harvests new lexical units and example sentences from Wikipedia. The goal is to build a reference data set for the semi-automatic development of new FrameNets. In addition, this methodology can be adapted to perform frame identification in any language available in Wikipedia. Our approach relies on a state-of-the-art word sense disambiguation system that is first trained on the English Wikipedia to assign a page to the lexical units in a frame. This mapping is then further exploited to perform frame identification in English or in any other language available in Wikipedia. The approach shows high potential in multilingual settings, because it can be applied to languages for which other lexical resources, such as WordNet or thesauri, are not available.
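The core idea above can be sketched as follows; this is an illustrative assumption, not the paper's implementation. Once lexical units are mapped to Wikipedia pages, frame identification reduces to inverting that mapping and looking up the page that a disambiguation system assigns to a target word. The mapping entries and the `disambiguate` stub are hypothetical.

```python
# Hypothetical mapping: (lexical unit, frame) -> Wikipedia page
LU_TO_PAGE = {
    ("buy", "Commerce_buy"): "Purchasing",
    ("purchase", "Commerce_buy"): "Purchasing",
    ("acquire", "Getting"): "Mergers_and_acquisitions",
}

# Invert the mapping: Wikipedia page -> frame
PAGE_TO_FRAME = {page: frame for (_, frame), page in LU_TO_PAGE.items()}

def disambiguate(word, context):
    """Stand-in for a trained WSD system that returns a Wikipedia page.
    A trivial keyword heuristic replaces the real model for this demo."""
    return "Purchasing" if word in ("buy", "purchase") else "Mergers_and_acquisitions"

def identify_frame(word, context):
    """Frame identification via Wikipedia page lookup."""
    page = disambiguate(word, context)
    return PAGE_TO_FRAME.get(page)

print(identify_frame("buy", "She will buy a house"))  # Commerce_buy
```

Because Wikipedia pages are interlinked across languages, the same page-to-frame table could serve non-English text, which is what makes the method attractive in multilingual settings.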
International Semantic Web Conference | 2010
Claudio Giuliano; Alfio Massimiliano Gliozzo; Aldo Gangemi; Kateryna Tymoshenko
Acquiring structured data from wikis is a problem of increasing interest in knowledge engineering and the Semantic Web: collaboratively developed resources grow over time, have high quality, and are constantly updated. One area of particular interest is extracting thesauri from wikis. A thesaurus is a resource that lists words grouped together according to similarity of meaning, generally organized into sets of synonyms. Thesauri are useful for a large variety of applications, including information retrieval and knowledge engineering. Most information in wikis is expressed by means of natural language texts and internal links among Web pages, the so-called wikilinks. In this paper, an innovative method for inducing thesauri from Wikipedia is presented. It leverages the structure of Wikipedia to extract concepts and the terms denoting them, yielding a thesaurus that can be profitably used in applications. Applied to re-rank a state-of-the-art baseline approach, the method substantially boosts both precision and recall. Finally, we discuss how to represent the extracted results in RDF/OWL, following existing good practices.
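A minimal sketch of the wikilink intuition, under the assumption (not taken from the paper) that different anchor texts pointing to the same article denote the same concept: grouping anchors by link target yields candidate synonym sets. The link pairs below are made up for illustration.

```python
from collections import defaultdict

# Hypothetical (anchor_text, target_page) pairs harvested from wikilinks
wikilinks = [
    ("car", "Automobile"),
    ("automobile", "Automobile"),
    ("motor car", "Automobile"),
    ("USA", "United_States"),
    ("United States", "United_States"),
]

def induce_synsets(links):
    """Map each target concept to the set of terms that denote it."""
    synsets = defaultdict(set)
    for anchor, target in links:
        synsets[target].add(anchor.lower())
    return dict(synsets)

thesaurus = induce_synsets(wikilinks)
print(thesaurus["Automobile"])  # {'car', 'automobile', 'motor car'}
```

A production method would additionally filter noisy anchors and rank candidates, which is where the re-ranking discussed in the abstract comes in.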
Exploiting Linked Data and Knowledge Graphs in Large Organisations | 2017
Alessandro Moschitti; Kateryna Tymoshenko; Panos Alexopoulos; Andrew D. Walker; Massimo Nicosia; Guido Vetere; Alessandro Faraotti; Marco Monti; Jeff Z. Pan; Honghan Wu; Yuting Zhao
In the Digital and Information Age, companies and government agencies are highly digitalized, as are the information exchanges that take place in their processes. They store information both as natural language text and as structured data, e.g., relational databases or knowledge graphs. In this scenario, methods for organizing, finding, and selecting relevant information beyond the capabilities of classic Information Retrieval are always active topics of research and development.
Meeting of the Association for Computational Linguistics | 2017
Kateryna Tymoshenko; Alessandro Moschitti; Massimo Nicosia; Aliaksei Severyn
We present a highly flexible UIMA-based pipeline for developing structural kernel-based systems for relational learning from text, i.e., for generating training and test data for ranking, classifying short text pairs, or measuring similarity between pieces of text. For example, the proposed pipeline can represent input question and answer sentence pairs as syntactic-semantic structures, enrich them with relational information, e.g., links between the question class, the focus, and named entities, and serialize them as training and test files for a tree kernel-based reranking framework. The pipeline generates a number of dependency and shallow chunk-based representations shown to achieve competitive results in previous work. It also enables easy evaluation of the models thanks to built-in cross-validation facilities.
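As a rough illustration of the serialization step, the sketch below renders a question/answer pair in the parenthesized tree notation commonly fed to tree-kernel toolkits. The chunk labels are hard-coded and the format is an assumption; a real pipeline would obtain them from UIMA annotators (tokenizer, POS tagger, chunker, NER).

```python
def to_tree(label, children):
    """Render (LABEL child1 child2 ...) recursively; a string child is a leaf."""
    if isinstance(children, str):
        return f"({label} {children})"
    return "(" + label + " " + " ".join(to_tree(l, c) for l, c in children) + ")"

# Hypothetical shallow chunk-based representation of a QA pair
question = to_tree("Q", [
    ("NP", [("WP", "who"), ("VBD", "wrote"), ("NNP", "Hamlet")]),
])
answer = to_tree("A", [
    ("NP", [("NNP", "Shakespeare")]),
    ("VP", [("VBD", "wrote"), ("NNP", "Hamlet")]),
])

pair = f"(PAIR {question} {answer})"
print(pair)
```

One line of this form per candidate pair, prefixed with a rank label, is the kind of training file a tree kernel-based reranker consumes.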
European Conference on Artificial Intelligence | 2010
Volha Bryl; Claudio Giuliano; Luciano Serafini; Kateryna Tymoshenko
International Conference on Computational Linguistics | 2010
Luisa Bentivogli; Pamela Forner; Claudio Giuliano; Alessandro Marchetti; Emanuele Pianta; Kateryna Tymoshenko
The Florida AI Research Society | 2011
Olga Uryupina; Massimo Poesio; Claudio Giuliano; Kateryna Tymoshenko
Meeting of the Association for Computational Linguistics | 2010
Kateryna Tymoshenko; Claudio Giuliano
International Conference on Computational Linguistics | 2010
M. Atif Qureshi; Arjumand Younus; Muhammad Saeed; Nasir Touheed; Emanuele Pianta; Kateryna Tymoshenko