Publication


Featured research published by Michele Catasta.


Journal of Web Semantics | 2010

Invited paper: Sig.ma: Live views on the Web of Data

Giovanni Tummarello; Richard Cyganiak; Michele Catasta; Szymon Danielczyk; Renaud Delbru; Stefan Decker

We present Sig.ma, both a service and an end user application to access the Web of Data as an integrated information space. Sig.ma uses a holistic approach in which large scale semantic Web indexing, logic reasoning, data aggregation heuristics, ad-hoc ontology consolidation, external services and responsive user interaction all play together to create rich entity descriptions. These consolidated entity descriptions then form the base for embeddable data mashups, machine oriented services as well as data browsing services. Finally, we discuss Sig.ma's peculiar characteristics and report on lessons learned and ideas it inspires.


international world wide web conferences | 2010

Sig.ma: live views on the web of data

Giovanni Tummarello; Richard Cyganiak; Michele Catasta; Szymon Danielczyk; Renaud Delbru; Stefan Decker

We demonstrate Sig.ma, both a service and an end user application to access the Web of Data as an integrated information space. Sig.ma uses a holistic approach in which large scale semantic web indexing, logic reasoning, data aggregation heuristics, ad hoc ontology consolidation, external services and responsive user interaction all play together to create rich entity descriptions. These consolidated entity descriptions then form the base for embeddable data mashups, machine oriented services as well as data browsing services. Finally, we discuss Sig.ma's peculiar characteristics and report on lessons learned and ideas it inspires.


international semantic web conference | 2010

Hierarchical link analysis for ranking web data

Renaud Delbru; Nickolai Toupikov; Michele Catasta; Giovanni Tummarello; Stefan Decker

On the Web of Data, entities are often interconnected in a way similar to web documents. Previous work has shown how PageRank can be adapted to achieve entity ranking. In this paper, we propose to exploit locality on the Web of Data by taking a layered approach, similar to hierarchical PageRank approaches. We provide justifications for a two-layer model of the Web of Data, and introduce DING (Dataset Ranking), a novel ranking methodology based on this two-layer model. DING uses links between datasets to compute dataset ranks and combines the resulting values with semantic-dependent entity ranking strategies. We quantify the effectiveness of the approach against other link-based algorithms on large datasets coming from the Sindice search engine. The evaluation, which includes a user study, indicates that the resulting rank is better than the other approaches. Also, the resulting algorithm is shown to have desirable computational properties such as parallelisation.


international semantic web conference | 2010

A node indexing scheme for web entity retrieval

Renaud Delbru; Nickolai Toupikov; Michele Catasta; Giovanni Tummarello

Now motivated also by the partial support of major search engines, hundreds of millions of documents are being published on the web embedding semi-structured data in RDF, RDFa and Microformats. This scenario calls for novel information search systems which provide effective means of retrieving relevant semi-structured information. In this paper, we present an “entity retrieval system” designed to provide entity search capabilities over datasets as large as the entire Web of Data. Our system supports full-text search, semi-structural queries and top-k query results while exhibiting a concise index and efficient incremental updates. We advocate the use of a node indexing scheme and show that it offers a good compromise between query expressiveness, query processing time and update complexity in comparison to three other indexing techniques. We then demonstrate how such a system can effectively answer queries over 10 billion triples on a single commodity machine.


international semantic web conference | 2013

TRank: Ranking Entity Types Using the Web of Data

Alberto Tonon; Michele Catasta; Gianluca Demartini; Philippe Cudré-Mauroux; Karl Aberer

Much of Web search and browsing activity is today centered around entities. For this reason, Search Engine Result Pages (SERPs) increasingly contain information about the searched entities such as pictures, short summaries, related entities, and factual information. A key facet that is often displayed on the SERPs and that is instrumental for many applications is the entity type. However, an entity is usually not associated with a single generic type in the background knowledge bases but rather with a set of more specific types, which may be relevant or not given the document context. For example, one can find on the Linked Open Data cloud the fact that Tom Hanks is a person, an actor, and a person from Concord, California. All those types are correct but some may be too general to be interesting (e.g., person), while others may be interesting but already known to the user (e.g., actor), or may be irrelevant given the current browsing context (e.g., person from Concord, California). In this paper, we define the new task of ranking entity types given an entity and its context. We propose and evaluate new methods to find the most relevant entity type based on collection statistics and on the graph structure interconnecting entities and types. An extensive experimental evaluation over several document collections at different levels of granularity (e.g., sentences, paragraphs) and different type hierarchies (including DBpedia, Freebase, and schema.org) shows that hierarchy-based approaches provide more accurate results when picking entity types to be displayed to the end-user while still being highly scalable.


Journal of Web Semantics | 2014

B-hist

Michele Catasta; Alberto Tonon; Gianluca Demartini; Jean-Eudes Ranvier; Karl Aberer; Philippe Cudré-Mauroux

Web search is increasingly entity-centric: as a large fraction of common queries target specific entities, search results get progressively augmented with semi-structured and multimedia information about those entities. However, search over personal web browsing history still mostly revolves around keyword search. In this paper, we present a novel approach to answer queries over web browsing logs that takes into account entities appearing in the web pages, user activities, as well as temporal information. Our system, B-hist, aims at providing web users with an effective tool for searching and accessing information they previously looked up on the web by supporting multiple ways of filtering results using clustering and entity-centric search. In the following, we present our system and motivate our User Interface (UI) design choices by detailing the results of a survey on web browsing and history search. In addition, we present an empirical evaluation of our entity-based approach used to cluster web pages.


conference on advanced information systems engineering | 2010

From web data to entities and back

Zoltán Miklós; Nicolas Bonvin; Paolo Bouquet; Michele Catasta; Daniele Cordioli; Peter Fankhauser; Julien Gaugaz; Ekaterini Ioannou; Hristo Koshutanski; Antonio Maña; Claudia Niederée; Themis Palpanas; Heiko Stoermer

We present the Entity Name System (ENS), an enabling infrastructure that can host descriptions of named entities and provide unique identifiers at large scale. In this way, it opens new perspectives for realizing entity-oriented, rather than keyword-oriented, Web information systems. We describe the architecture and the functionality of the ENS, along with tools that all contribute to realizing the Web of entities.


Journal of Web Semantics | 2016

Contextualized ranking of entity types based on knowledge graphs

Alberto Tonon; Michele Catasta; Roman Prokofyev; Gianluca Demartini; Karl Aberer; Philippe Cudré-Mauroux

A large fraction of online queries targets entities. For this reason, Search Engine Result Pages (SERPs) increasingly contain information about the searched entities such as pictures, short summaries, related entities, and factual information. A key facet that is often displayed on the SERPs and that is instrumental for many applications is the entity type. However, an entity is usually not associated with a single generic type in the background knowledge graph but rather with a set of more specific types, which may be relevant or not given the document context. For example, one can find on the Linked Open Data cloud the fact that Tom Hanks is a person, an actor, and a person from Concord, California. All these types are correct but some may be too general to be interesting (e.g., person), while others may be interesting but already known to the user (e.g., actor), or may be irrelevant given the current browsing context (e.g., person from Concord, California). In this paper, we define the new task of ranking entity types given an entity and its context. We propose and evaluate new methods to find the most relevant entity type based on collection statistics and on the knowledge graph structure interconnecting entities and types. An extensive experimental evaluation over several document collections at different levels of granularity (e.g., sentences, paragraphs) and different type hierarchies (including DBpedia, Freebase, and schema.org) shows that hierarchy-based approaches provide more accurate results when picking entity types to be displayed to the end-user.


information reuse and integration | 2013

Entity disambiguation in tweets leveraging user social profiles

Surender Reddy Yerva; Michele Catasta; Gianluca Demartini; Karl Aberer

Pervasive web and social networks are becoming part of everyone's life. Through their activities on these networks, users leave traces of their expertise, interests and personalities. With the advances in Web mining and user modeling techniques, it is possible to leverage the user social network activity history to extract the semantics of user-generated content. In this work we explore various techniques for constructing user profiles based on the content they publish on social networks. We further show that one of the advantages of maintaining social network user profiles is to provide the context for better understanding of microposts. We propose and experimentally evaluate different approaches for entity disambiguation in social networks based on syntactic and semantic features on top of two different social networks: a general-interest network (i.e., Twitter) and a domain-specific network (i.e., StackOverflow). We demonstrate how disambiguation accuracy increases when considering enriched user profiles integrating content from both social networks.


international conference on computational science and its applications | 2011

Building a front end for a sensor data cloud

Ian Rolewicz; Michele Catasta; Ho Young Jeung; Zoltán Miklós; Karl Aberer

This document introduces the TimeCloud Front End, a web-based interface for the TimeCloud platform that manages large-scale time series in the cloud. While the Back End is built upon scalable, fault-tolerant distributed systems such as Hadoop and HBase and takes novel approaches for facilitating data analysis over massive time series, the Front End was built as a simple and intuitive interface for viewing the data present in the cloud, both with simple tabular display and with the help of various visualizations. In addition, the Front End implements model-based views and on-demand data fetching to reduce the amount of work performed at the Back End.

Collaboration


Top Co-Authors

Karl Aberer

École Polytechnique Fédérale de Lausanne

Giovanni Tummarello

National University of Ireland

Renaud Delbru

National University of Ireland

Jean-Eudes Ranvier

École Polytechnique Fédérale de Lausanne

Horia Radu

École Polytechnique Fédérale de Lausanne

Matteo Vasirani

École Polytechnique Fédérale de Lausanne