Francesco Corcoglioniti

Publication

Featured research published by Francesco Corcoglioniti.

ACM Symposium on Applied Computing | 2015

Processing billions of RDF triples on a single machine using streaming and sorting

Francesco Corcoglioniti; Marco Rospocher; Michele Mostarda; Marco Amadori

We consider the feasibility of processing billions of RDF triples on a single commodity machine using streaming and sorting techniques, focusing on RDF processing tasks relevant for Linked Data consumption: data filtering and transformation, RDFS inference, owl:sameAs smushing and statistics extraction. To investigate this research question we built RDFpro (RDF Processor), an open source tool that provides streaming and sorting-based processors for the considered tasks and allows their sequential and parallel composition in complex pipelines. An empirical evaluation of RDFpro in four application scenarios (dataset analysis, filtering, merging and massaging) shows the effectiveness of the tool and allows us to answer our research question positively.
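To make the owl:sameAs smushing task concrete, the following is a minimal Python sketch, not RDFpro's actual implementation (which the abstract does not detail). It assumes triples are plain string tuples, that the alias map fits in memory, and that the lexicographically smallest URI of each equivalence class serves as the canonical one; the function names and the `ex:` identifiers are hypothetical. It makes two passes over the data and uses a final sort to remove duplicates.

```python
from typing import Dict, Iterable, List, Tuple

Triple = Tuple[str, str, str]
SAME_AS = "owl:sameAs"  # assumed compact form of the owl:sameAs predicate


def find(parent: Dict[str, str], node: str) -> str:
    """Return the representative of node's class, compressing the path."""
    parent.setdefault(node, node)
    root = node
    while parent[root] != root:
        root = parent[root]
    # Point every node on the path directly at the root.
    while parent[node] != root:
        parent[node], node = root, parent[node]
    return root


def smush(triples: Iterable[Triple]) -> List[Triple]:
    """Rewrite every URI to the canonical member of its owl:sameAs class."""
    triples = list(triples)
    parent: Dict[str, str] = {}

    # Pass 1: build equivalence classes from the owl:sameAs statements.
    for s, p, o in triples:
        if p == SAME_AS:
            a, b = find(parent, s), find(parent, o)
            if a != b:
                canonical, alias = sorted((a, b))
                parent[alias] = canonical

    # Pass 2: rewrite subjects and objects, dropping the sameAs statements.
    rewritten = set()
    for s, p, o in triples:
        if p == SAME_AS:
            continue
        s2 = find(parent, s) if s in parent else s
        o2 = find(parent, o) if o in parent else o
        rewritten.add((s2, p, o2))

    # Sorting yields a deterministic, duplicate-free output.
    return sorted(rewritten)
```

Because each union keeps the smaller of the two roots, the root of a class is always its smallest member, so the choice of canonical URI does not depend on the order in which the owl:sameAs statements arrive.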

Semantic Web Evaluation Challenges | 2015

Supervised Opinion Frames Detection with RAID

Alessio Palmero Aprosio; Francesco Corcoglioniti; Mauro Dragoni; Marco Rospocher

Most systems for opinion analysis focus on the classification of opinion polarities and rarely consider the task of identifying the different elements and relations forming an opinion frame. In this paper, we present RAID, a tool featuring a processing pipeline for the extraction of opinion frames from text with their opinion expressions, holders, targets and polarities. RAID leverages a lexical, syntactic and semantic analysis of text, using several NLP tools such as dependency parsing, semantic role labelling, named entity recognition and word sense disambiguation. In addition, linguistic resources such as SenticNet and the MPQA Subjectivity Lexicon are used both to locate opinions in the text and to classify their polarities according to a fuzzy model that combines the sentiment values of different opinion words. RAID was evaluated on three different datasets and is released as open source software under the GPLv3 license.

ACM Symposium on Applied Computing | 2016

A 2-phase frame-based knowledge extraction framework

Francesco Corcoglioniti; Marco Rospocher; Alessio Palmero Aprosio

We present an approach for extracting knowledge from natural language English texts where processing is decoupled in two phases. The first phase comprises several standard NLP tasks whose results are integrated in a single RDF graph of mentions. The second phase processes the mention graph with SPARQL-like mapping rules to produce a knowledge graph organized around semantic frames (i.e., prototypical descriptions of events and situations). The decoupling allows: (i) choosing different tools for the NLP tasks without affecting the remaining computation; (ii) combining the outputs of different NLP tasks in non-trivial ways, leveraging their integrated and coherent representation in a mention graph; and (iii) relating each piece of extracted knowledge to the mention(s) it comes from, leveraging the single RDF representation. We evaluate precision and recall of our approach on a gold standard, showing its competitiveness w.r.t. the state of the art. We also evaluate execution times and (sampled) accuracy on a corpus of 110K Wikipedia pages, showing the applicability of the approach on large corpora.

IEEE International Conference on Semantic Computing | 2013

Interlinking Unstructured and Structured Knowledge in an Integrated Framework

Francesco Corcoglioniti; Marco Rospocher; Roldano Cattoni; Bernardo Magnini; Luciano Serafini

Despite the widespread diffusion of structured data sources and the public acclaim of the Linked Open Data initiative, a preponderant amount of information remains available only in unstructured form, both on the Web and within organizations. While different in form, structured and unstructured contents speak about the very same entities of the world, their properties and relations; still, frameworks for their seamless integration are lacking. In this paper we describe the design of the KnowledgeStore, a scalable, fault-tolerant, and Semantic Web grounded storage system for interlinking structured and unstructured data. We discuss its capability to adapt to new types of content and application scenarios, and to provide reasoning and semantic query services on top of stored contents. We also comment on its envisaged use in the NewsReader EU project to manage large amounts of economic and financial data.

International Semantic Web Conference | 2016

Knowledge Extraction for Information Retrieval

Francesco Corcoglioniti; Mauro Dragoni; Marco Rospocher; Alessio Palmero Aprosio

Document retrieval is the task of returning relevant textual resources for a given user query. In this paper, we investigate whether the semantic analysis of the query and the documents, obtained by exploiting state-of-the-art Natural Language Processing techniques (e.g., Entity Linking, Frame Detection) and Semantic Web resources (e.g., YAGO, DBpedia), can improve the performance of the traditional term-based similarity approach. Our experiments, conducted on a recently released document collection, show that Mean Average Precision (MAP) increases by 3.5 percentage points when combining textual and semantic analysis, thus suggesting that semantic content can effectively improve the performance of Information Retrieval systems.
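For readers unfamiliar with the metric reported above, here is a small Python sketch of the standard Mean Average Precision definition. It is not the authors' evaluation code; the function names and the document and query identifiers used with it are illustrative assumptions.

```python
from typing import Dict, List, Set


def average_precision(ranking: List[str], relevant: Set[str]) -> float:
    """Average of precision@k taken at each rank k holding a relevant document."""
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, doc_id in enumerate(ranking, start=1):
        if doc_id in relevant:
            hits += 1
            precision_sum += hits / rank
    # Relevant documents that were never retrieved contribute zero.
    return precision_sum / len(relevant)


def mean_average_precision(
    rankings: Dict[str, List[str]], judgments: Dict[str, Set[str]]
) -> float:
    """Mean of the per-query average precision over all judged queries."""
    if not judgments:
        return 0.0
    total = sum(
        average_precision(rankings.get(query, []), relevant)
        for query, relevant in judgments.items()
    )
    return total / len(judgments)
```

As a worked case, a query whose ranking is `["d1", "d2", "d3"]` with relevant set `{"d1", "d3"}` has average precision (1/1 + 2/3) / 2, and MAP is simply the mean of such values across queries.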

International Journal on Semantic Web and Information Systems | 2015

The KnowledgeStore: A Storage Framework for Interlinking Unstructured and Structured Knowledge

Francesco Corcoglioniti; Marco Rospocher; Roldano Cattoni; Bernardo Magnini; Luciano Serafini

Although the quantity of structured information on the Web and within organizations is increasing, the majority of information remains available only in unstructured form. While different in form, both unstructured and structured information sources provide information about entities in the world and their properties and relations; still, frameworks for their seamless integration have not been deeply investigated. In this paper the authors describe the KnowledgeStore, a scalable, fault-tolerant, and Semantic Web grounded open-source storage system for interlinking structured and unstructured data. They present the concept, design, function and implementation of the system, and report on its concrete usage in three application scenarios within the NewsReader EU project, where it stores and supports the querying of millions of news articles interlinked with millions of RDF triples extracted from text and imported from Linked Open Data sources. The authors report on data population and data retrieval performances of the system measured through a number of experiments, and they also discuss the practical issues and lessons learned from these experiences.

IEEE Transactions on Knowledge and Data Engineering | 2016

Frame-Based Ontology Population with PIKES

Francesco Corcoglioniti; Marco Rospocher; Alessio Palmero Aprosio

We present an approach for ontology population from natural language English texts that extracts RDF triples according to FrameBase, a Semantic Web ontology derived from FrameNet. Processing is decoupled in two independently-tunable phases. First, text is processed by several NLP tasks, including Semantic Role Labeling (SRL), whose results are integrated in an RDF graph of mentions, i.e., snippets of text denoting some entity/fact. Then, the mention graph is processed with SPARQL-like rules using a specifically created mapping resource from NomBank/PropBank/FrameNet annotations to FrameBase concepts, producing a knowledge graph whose content is linked to DBpedia and organized around semantic frames, i.e., prototypical descriptions of events and situations. A single RDF/OWL representation is used where each triple is related to the mentions/tools it comes from. We implemented the approach in PIKES, an open source tool that combines two complementary SRL systems and provides a working online demo. We evaluated PIKES on a manually annotated gold standard, assessing precision/recall in (i) populating FrameBase ontology, and (ii) extracting semantic frames modeled after standard predicate models, for comparison with state-of-the-art tools for the Semantic Web. We also evaluated (iii) sampled precision and execution times on a large corpus of 110 K Wikipedia-like pages.

International Workshop on Evaluation of Natural Language and Speech Tools for Italian | 2013

Exploiting Background Knowledge for Clustering Person Names

Roberto Zanoli; Francesco Corcoglioniti; Christian Girardi

Nowadays, surfing the Web and looking for persons seems to be one of the most common activities of Internet users. However, person names can be highly ambiguous, and consequently search results are often a collection of documents about different people sharing the same name. In this paper we present a cross-document coreference system able to identify person names in different documents that refer to the same person entity. The system exploits background knowledge through two mechanisms: (1) the use of a dynamic similarity threshold for clustering person names, which depends on the ambiguity of the name estimated using a phonebook; and (2) the disambiguation of names against a knowledge base containing person descriptions, using an entity linking system and including its output as an additional feature for computing similarity. The paper describes the system and reports its performance as measured in the News People Search (NePS) task at Evalita 2011. A version of the system is being used in a real-world application, which requires coreferring millions of names from multimedia sources.
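The first mechanism, a similarity threshold that varies with name ambiguity, can be illustrated with the Python sketch below. It is only an illustration under stated assumptions, not the system described in the paper: the linear threshold function, its parameter values, the Jaccard similarity over context words, the greedy clustering strategy, and the assumption that more common names call for a stricter threshold are all hypothetical choices.

```python
from typing import List, Set


def ambiguity_threshold(
    phonebook_count: int, low: float = 0.2, high: float = 0.6, cap: int = 100
) -> float:
    """Map a name's phonebook frequency to a threshold between low and high.

    Assumption: common (ambiguous) names require more evidence to be merged.
    """
    ratio = min(max(phonebook_count, 0), cap) / cap
    return low + (high - low) * ratio


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard similarity between two sets of context words."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_mentions(docs: List[Set[str]], threshold: float) -> List[List[int]]:
    """Greedy clustering of documents that mention the same name.

    Each document joins the first cluster whose accumulated context is
    similar enough; otherwise it starts a new cluster.
    """
    clusters: List[List[int]] = []
    contexts: List[Set[str]] = []
    for index, words in enumerate(docs):
        for k, context in enumerate(contexts):
            if jaccard(words, context) >= threshold:
                clusters[k].append(index)
                contexts[k] = context | words
                break
        else:
            clusters.append([index])
            contexts.append(set(words))
    return clusters
```

With this setup, the same three documents can end up in two clusters for a rare name and in three clusters for a very common one, because the common name raises the evidence required before two mentions are merged.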

Symposium on Applied Computing | 2017

Linking knowledge bases to social media profiles

Yaroslav Nechaev; Francesco Corcoglioniti; Claudio Giuliano

Social media have become an invaluable source of data for a wide variety of tasks. Unfortunately, this data is hard to gather and process due to the low number of machine-readable attributes, API limitations and noisiness. In this paper we propose a system that aligns knowledge base entries of people and organisations to the corresponding social media profiles. The motivation is twofold: (i) on the one hand, we facilitate the processing of social media data by allowing the import of rich entity descriptions from knowledge bases; (ii) on the other hand, we enable an automatic enrichment of a knowledge base with additional data from social media. We used this system to create a resource of 893,446 alignments between DBpedia entities and Twitter profiles. This resource effectively connects Twitter to the Linked Open Data cloud.

International Semantic Web Conference | 2014

Semantic-Based Process Analysis

Chiara Di Francescomarino; Francesco Corcoglioniti; Mauro Dragoni; Piergiorgio Bertoli; Roberto Tiella; Chiara Ghidini; Michele Nori; Marco Pistore

The widespread adoption of Information Technology systems and their capability to trace data about process executions have made Information Technology data available for the analysis of process executions. Meanwhile, at the business level, static and procedural knowledge, which can be exploited to analyze and reason on data, is often available. In this paper we aim at providing an approach that, by combining static and procedural aspects as well as business and data levels, and by exploiting semantic-based techniques, allows business analysts to infer knowledge and use it to analyze system executions. The proposed solution has been implemented using current scalable Semantic Web technologies, which make it possible to keep the advantages of semantic-based reasoning with non-trivial quantities of data.
