Monica Monachini | Researchain

Publication

Featured research published by Monica Monachini.

Language Resources and Evaluation | 2009

Multilingual resources for NLP in the lexical markup framework (LMF)

Gil Francopoulo; Núria Bel; Monte George; Nicoletta Calzolari; Monica Monachini; Mandy Pet; Claudia Soria

Optimizing the production, maintenance and extension of lexical resources is one of the crucial aspects impacting natural language processing (NLP). A second aspect involves optimizing the process leading to their integration in applications. In this respect, we believe that a consensual specification on monolingual, bilingual and multilingual lexicons can be a useful aid for the various NLP actors. Within ISO, one purpose of the Lexical Markup Framework (LMF, ISO-24613) is to define a standard for lexicons that covers multilingual lexical data.

BMC Bioinformatics | 2011

The BioLexicon: A large-scale terminological resource for biomedical text mining

Paul Thompson; John McNaught; Simonetta Montemagni; Nicoletta Calzolari; Riccardo Del Gratta; Vivian Lee; Simone Marchi; Monica Monachini; Piotr Pęzik; Valeria Quochi; Christopher Rupp; Yutaka Sasaki; Giulia Venturi; Dietrich Rebholz-Schuhmann; Sophia Ananiadou

Background: Due to the rapidly expanding body of biomedical literature, biologists require increasingly sophisticated and efficient systems to help them to search for relevant information. Such systems should account for the multiple written variants used to represent biomedical concepts, and allow the user to search for specific pieces of knowledge (or events) involving these concepts, e.g., protein-protein interactions. Such functionality requires access to detailed information about words used in the biomedical literature. Existing databases and ontologies often have a specific focus and are oriented towards human use. Consequently, biological knowledge is dispersed amongst many resources, which often do not attempt to account for the large and frequently changing set of variants that appear in the literature. Additionally, such resources typically do not provide information about how terms relate to each other in texts to describe events.
Results: This article provides an overview of the design, construction and evaluation of a large-scale lexical and conceptual resource for the biomedical domain, the BioLexicon. The resource can be exploited by text mining tools at several levels, e.g., part-of-speech tagging, recognition of biomedical entities, and the extraction of events in which they are involved. As such, the BioLexicon must account for real usage of words in biomedical texts. In particular, the BioLexicon gathers together different types of terms from several existing data resources into a single, unified repository, and augments them with new term variants automatically extracted from biomedical literature. Extraction of events is facilitated through the inclusion of biologically pertinent verbs (around which events are typically organized) together with information about typical patterns of grammatical and semantic behaviour, which are acquired from domain-specific texts. In order to foster interoperability, the BioLexicon is modelled using the Lexical Markup Framework, an ISO standard.
Conclusions: The BioLexicon contains over 2.2 M lexical entries and over 1.8 M terminological variants, as well as over 3.3 M semantic relations, including over 2 M synonymy relations. Its exploitation can benefit both application developers and users. We demonstrate some such benefits by describing integration of the resource into a number of different tools, and evaluating improvements in performance that this can bring.

Proceedings of the 2009 International Workshop on Intercultural Collaboration | 2009

Wordnet-LMF: fleshing out a standardized format for wordnet interoperability

Claudia Soria; Monica Monachini; Piek Vossen

In this paper we present Wordnet-LMF, a dialect of the ISO Lexical Markup Framework that instantiates LMF for representing wordnets. Wordnet-LMF was developed in the framework of the EU KYOTO project for the specific purpose of endowing a set of wordnets with a standardized interoperability format allowing the interchange of lexico-semantic information encoded in each of them. The aim of this format is twofold: (a) to give a preliminary assessment of LMF, by large-scale application to real lexical resources; (b) to endow WordNet with a format representation that will allow easier integration among resources sharing the same structure (i.e., other wordnets) and, more importantly, across resources with different theoretical and implementation approaches.

Meeting of the Association for Computational Linguistics | 2006

Infrastructure for Standardization of Asian Language Resources

Takenobu Tokunaga; Virach Sornlertlamvanich; Thatsanee Charoenporn; Nicoletta Calzolari; Monica Monachini; Claudia Soria; Chu-Ren Huang; Yingju Xia; Hao Yu; Laurent Prévot; Kiyoaki Shirai

As an area of great linguistic and cultural diversity, Asian language resources have received much less attention than their western counterparts. Creating a common standard for Asian language resources that is compatible with an international standard has at least three strong advantages: to increase the competitive edge of Asian countries, to bring Asian countries closer to their western counterparts, and to bring more cohesion among Asian countries. To achieve this goal, we have launched a two-year project to create a common standard for Asian language resources. The project comprises four research items: (1) building a description framework of lexical entries, (2) building sample lexicons, (3) building an upper-layer ontology, and (4) evaluating the proposed framework through an application. This paper outlines the project in terms of its aim and approach.

Proceedings of the Workshop on Multilingual Language Resources and Interoperability | 2006

Lexical Markup Framework (LMF) for NLP Multilingual Resources

Gil Francopoulo; Núria Bel; Monte George; Nicoletta Calzolari; Monica Monachini; Mandy Pet; Claudia Soria

Optimizing the production, maintenance and extension of lexical resources is one of the crucial aspects impacting Natural Language Processing (NLP). A second aspect involves optimizing the process leading to their integration in applications. In this respect, we believe that the production of a consensual specification on multilingual lexicons can be a useful aid for the various NLP actors. Within ISO, one purpose of LMF (ISO-24613) is to define a standard for lexicons that covers multilingual data.

Sprachwissenschaft | 2015

Converting the PAROLE SIMPLE CLIPS lexicon into RDF with lemon

Riccardo Del Gratta; Francesca Frontini; Fahad Khan; Monica Monachini

This paper describes the publication and linking of (parts of) PAROLE SIMPLE CLIPS (PSC), a large-scale Italian lexicon, to the Semantic Web and the Linked Data cloud using the lemon model. The main challenge of the conversion is discussed, namely the reconciliation between the PSC semantic structure, which contains richly encoded semantic information following the qualia structure of the Generative Lexicon theory, and the lemon view of lexical sense as a reified pairing of a lexical item and a concept in an ontology. The result is two datasets: one consists of a list of lemon lexical entries with their lexical properties, relations and senses; the other consists of a list of OWL individuals representing the referents for the lexical senses. These OWL individuals are linked to each other by a set of semantic relations and mapped onto the SIMPLE OWL ontology of higher-level semantic types.

Language Resources and Evaluation | 2009

Exploring Interoperability of Language Resources: the Case of Cross-lingual Semi-automatic Enrichment of Wordnets

Claudia Soria; Monica Monachini; Francesca Bertagna; Nicoletta Calzolari; Chu-Ren Huang; Shu-Kai Hsieh; Andrea Marchetti; Maurizio Tesconi

In this paper we present an application fostering the integration and interoperability of computational lexicons, focusing on the particular case of mutual linking and cross-lingual enrichment of two wordnets, the ItalWordNet and Sinica BOW lexicons. This is intended as a case study investigating the needs and requirements of semi-automatic integration and interoperability of lexical resources, with a view to developing a prototype web application to support the GlobalWordNet Grid initiative.

The Language Grid | 2011

Language Service Ontology

Yoshihiko Hayashi; Thierry Declerck; Nicoletta Calzolari; Monica Monachini; Claudia Soria; Paul Buitelaar

The Language Grid is a distinctive language service infrastructure in the sense that it accommodates a wide variety of user needs, ranging from technical novices to experts and from language resource consumers to language resource providers. As these language services are various in type and each of them can be idiosyncratic in many aspects, the service infrastructure has to address the issue of interoperability. A key to solving this issue is not only to build the services around standardized resources and interfaces, but also to establish a knowledge structure that copes effectively with a range of language services. Given this knowledge structure, referred to as a service ontology, each language service can be systematically classified and its usage specified by a corresponding API. This not only enables the utilization of existing language resources but also facilitates the dissemination of newly created language resources as services.

Archive | 1999

Standardization in the Lexicon

Monica Monachini; Nicoletta Calzolari

Lexicons, as described in the previous chapter, are a valuable resource, not only for wordclass tagging but also for many other applications in the broad area of language engineering (LE), which encompasses fields such as computational linguistics and Natural Language Processing (NLP). Furthermore, the last decade in particular has seen an increasing use of corpora for computational lexicography, other corpus-based research and development of applications, all of which has led to the general recognition of the value of ‘authentic’ data.

Proceedings of the Workshop on Multilingual Language Resources and Interoperability | 2006