Publication


Featured research published by Paolo Papotti.


Extending Database Technology | 2008

Flint: Google-basing the Web

Lorenzo Blanco; Valter Crescenzi; Paolo Merialdo; Paolo Papotti

Several Web sites deliver a large number of pages, each publishing data about one instance of some real world entity, such as an athlete, a stock quote, a book. Even though it is easy for a human reader to recognize these instances, current search engines are unaware of them. Technologies for the Semantic Web aim at achieving this goal; however, so far they have been of little help in this respect, as semantic publishing is very limited. We have developed a system, called Flint, for automatically searching, collecting and indexing Web pages that publish data representing an instance of a certain conceptual entity. Flint takes as input a small set of labeled sample pages: it automatically infers a description of the underlying conceptual entity and then searches the Web for other pages containing data representing the same entity. Flint automatically extracts data from the collected pages and stores them into a semi-structured self-describing database, such as Google Base. Also, the collected pages can be used to populate a custom search engine; to this end we rely on the facilities provided by Google Co-op.


Data-Centric Systems and Applications | 2012

Flint: From Web Pages to Probabilistic Semantic Data

Paolo Merialdo; Paolo Papotti

A large and increasing number of web sites publish structured data about recognizable concepts (such as stock quotes, movies, restaurants). The great chance to create applications that rely on the huge amount of data taken from these sites has been discussed for more than a decade now, but in practice, only a small fraction of such information is currently used. The main reason is that extracting and integrating web data of good quality is an expensive task, which often requires human intervention. In this chapter, we present the main results of the Flint project, which aims at developing automatic and domain-independent tools to perform all the steps required to benefit from Web data: discovering data-intensive web sites containing information about entities of interest, extracting and integrating the published data, and performing a probabilistic analysis to characterize the impreciseness of the data and the accuracy of the sources. The results of the processing are semantically annotated data that can be used to populate a probabilistic database and to develop novel applications.


Conference on Advanced Information Systems Engineering | 2004

An Approach to Heterogeneous Data Translation based on XML Conversion.

Paolo Papotti; Riccardo Torlone


Unknown Journal | 2008

Data exchange with data-metadata translations

Mauricio A. Hernández; Paolo Papotti; Wang Chiew Tan


Unknown Journal | 2011

++Spicy: An open-source tool for second-generation schema mapping and data exchange

Bruno Marnette; Giansalvatore Mecca; Paolo Papotti; Salvatore Raunich; Donatello Santoro


18th Italian Symposium on Advanced Database Systems, SEBD 2010 | 2010

Probabilistic reconciliation of records from inaccurate web sources (extended abstract)

Lorenzo Blanco; Valter Crescenzi; Paolo Merialdo; Paolo Papotti


Seventeenth Italian Symposium on Advanced Database Systems, SEBD 2009 | 2009

Data Extraction and Integration from Imprecise Web Sources

Lorenzo Blanco; Mirko Bronzi; Valter Crescenzi; Paolo Merialdo; Paolo Papotti


17th Italian Symposium on Advanced Database Systems, SEBD 2009 | 2009

Data extraction and integration from imprecise web sources (Extended abstract)

Lorenzo Blanco; Mirko Bronzi; Valter Crescenzi; Paolo Merialdo; Paolo Papotti


15th Italian Symposium on Advanced Database Systems, SEBD 2007 | 2007

On the schema exchange problem

Paolo Papotti; Riccardo Torlone


Archive | 2005

Automatic Techniques for Data Model Translation

Paolo Papotti; Riccardo Torlone

Collaboration


Explore Paolo Papotti's collaborations.

Top Co-Authors

Wang Chiew Tan

University of California
