Network


Latest external collaboration on country level. Dive into details by clicking on the dots.

Hotspot


Dive into the research topics where Ahmad Assaf is active.

Publication


Featured researches published by Ahmad Assaf.


international world wide web conferences | 2015

Roomba: Automatic Validation, Correction and Generation of Dataset Metadata

Ahmad Assaf; Aline Senart; Raphaël Troncy

Data is being published by both the public and private sectors and covers a diverse set of domains ranging from life sciences to media or government data. An example is the Linked Open Data (LOD) cloud which is potentially a gold mine for organizations and individuals who are trying to leverage external data sources in order to produce more informed business decisions. Considering the significant variation in size, the languages used and the freshness of the data, one realizes that spotting spam datasets or simply finding useful datasets without prior knowledge is increasingly complicated. In this paper, we propose Roomba, a scalable automatic approach for extracting, validating, correcting and generating descriptive linked dataset profiles. While Roomba is generic, we target CKAN-based data portals and we validate our approach against a set of open data portals including the Linked Open Data (LOD) cloud as viewed on the DataHub. The results demonstrate that the general state of various datasets and groups, including the LOD cloud group, needs more attention as most of the datasets suffer from bad quality metadata and lack some informative metrics that are required to facilitate dataset search.


international conference on knowledge capture | 2015

The 3cixty Knowledge Base for Expo Milano 2015: Enabling Visitors to Explore the City

Giuseppe Rizzo; Oscar Corcho; Raphaël Troncy; Julien Plu; Juan Carlos Ballesteros Hermida; Ahmad Assaf

In this paper, we present the 3cixty Knowledge Base, which collects and harmonizes descriptions of events, places, transportation facilities and user-generated data such as reviews of the city and Expo site of Milan. This knowledge base is used by a set of web and mobile applications to guide Expo Milano 2015 visitors in the city and in the exhibit, allowing them to find places, satellite events and transportation facilities around Milan. As of July 24th, 2015 the knowledge base contains 18665 unique events, 225821 unique places, 94789 reviews, and 9343 transportation facilities, collected from several static, near- and real time local and global data providers, including Expo Milano 2015 official services and numerous social media platforms. The ontologies used as a backbone for structuring the knowledge base follow a rigorous development method where the design principle has generally been to re-use existing ontologies when they exist. We think that the lessons learned from this development will be useful for similar endeavors in other cities or large events around the world with a similar ecosystem of data provisioning services.


european semantic web conference | 2015

Roomba: An Extensible Framework to Validate and Build Dataset Profiles

Ahmad Assaf; Raphaël Troncy; Aline Senart

Linked Open Data LOD has emerged as one of the largest collections of interlinked datasets on the web. In order to benefit from this mine of data, one needs to access to descriptive information about each dataset or metadata. This information can be used to delay data entropy, enhance dataset discovery, exploration and reuse as well as helping data portal administrators in detecting and eliminating spam. However, such metadata information is currently very limited to a few data portals where they are usually provided manually, thus being often incomplete and inconsistent in terms of quality. To address these issues, we propose a scalable automatic approach for extracting, validating, correcting and generating descriptive linked dataset profiles. This approach applies several techniques in order to check the validity of the metadata provided and to generate descriptive and statistical information for a particular dataset or for an entire data portal.


ieee international conference semantic computing | 2012

Data Quality Principles in the Semantic Web

Ahmad Assaf; Aline Senart

The increasing size and availability of web data make data quality a core challenge in many applications. Principles of data quality are recognized as essential to ensure that data fit for their intended use in operations, decision-making, and planning. However, with the rise of the Semantic Web, new data quality issues appear and require deeper consideration. In this paper, we propose to extend the data quality principles to the context of Semantic Web. Based on our extensive industrial experience in data integration, we identify five main classes suited for data quality in Semantic Web. For each class, we list the principles that are involved at all stages of the data management process. Following these principles will provide a sound basis for better decision-making within organizations and will maximize long-term data integration and interoperability.


International Journal on Semantic Web and Information Systems | 2016

Towards an objective assessment framework for linked data quality: Enriching dataset profiles with quality indicators

Ahmad Assaf; Aline Senart; Raphaël Troncy

Ensuring data quality in Linked Open Data is a complex process as it consists of structured information supported by models, ontologies and vocabularies and contains queryable endpoints and links. In this paper, the authors first propose an objective assessment framework for Linked Data quality. The authors build upon previous efforts that have identified potential quality issues but focus only on objective quality indicators that can measured regardless on the underlying use case. Secondly, the authors present an extensible quality measurement tool that helps on one hand data owners to rate the quality of their datasets, and on the other hand data consumers to choose their data sources from a ranked set. The authors evaluate this tool by measuring the quality of the LOD cloud. The results demonstrate that the general state of the datasets needs attention as they mostly have low completeness, provenance, licensing and comprehensibility quality scores.


Proceedings of the First International Workshop on Open Data | 2012

RUBIX: a framework for improving data integration with linked data

Ahmad Assaf; Eldad Louw; Aline Senart; Corentin Follenfant; Raphaël Troncy; David Trastour

With todays public data sets containing billions of data items, more and more companies are looking to integrate external data with their traditional enterprise data to improve business intelligence analysis. These distributed data sources however exhibit heterogeneous data formats and terminologies and may contain noisy data. In this paper, we present RUBIX, a novel framework that enables business users to semi-automatically perform data integration on potentially noisy tabular data. This framework offers an extension to Google Refine with novel schema matching algorithms leveraging Freebase rich types. First experiments show that using Linked Data to map cell values with instances and column headers with types improves significantly the quality of the matching results and therefore should lead to more informed decisions.


extended semantic web conference | 2013

SNARC - An approach for aggregating and recommending contextualized social content

Ahmad Assaf; Aline Senart; Raphaël Troncy

The Internet has created a paradigm shift in how we consume and disseminate information. Data nowadays is spread over heterogeneous silos of archived and live data. People willingly share data on social media by posting news, views, presentations, pictures and videos. SNARC is a service that uses semantic web technology and combines services available on the web to aggregate social news. SNARC brings live and archived information to the user that is directly related to his active page. The key advantage is an instantaneous access to complementary information without the need to dig for it. Information appears when it is relevant enabling the user to focus on what is really important.


european semantic web conference | 2015

What's up LOD Cloud?

Ahmad Assaf; Raphaël Troncy; Aline Senart

Linked Open Data LOD has emerged as one of the largest collections of interlinked datasets on the web. In order to benefit from this mine of data, one needs to access descriptive information about each dataset or metadata. However, the heterogeneous nature of data sources reflects directly on the data quality as these sources often contain inconsistent as well as misinterpreted and incomplete metadata information. Considering the significant variation in size, the languages used and the freshness of the data, one realizes that finding useful datasets without prior knowledge is increasingly complicated. We have developed Roomba, a tool that enables to validate, correct and generate dataset metadata. In this paper, we present the results of running this tool on parts of the LOD cloud accessible via the datahub.io API. The results demonstrate that the general state of the datasets needs more attention as most of them suffers from bad quality metadata and lacking some informative metrics that are needed to facilitate dataset search. We also show that the automatic corrections done by Roomba increase the overall quality of the datasets metadata and we highlight the need for manual efforts to correct some important missing information.


european semantic web conference | 2014

What Are the Important Properties of an Entity

Ahmad Assaf; Ghislain Auguste Atemezing; Raphaël Troncy; Elena Cabrio

Entities play a key role in knowledge bases in general and in the Web of Data in particular. Entities are generally described with a lot of properties, this is the case for DBpedia. It is, however, difficult to assess which ones are more “important” than others for particular tasks such as visualizing the key facts of an entity or filtering out the ones which will yield better instance matching. In this paper, we perform a reverse engineering of the Google Knowledge graph panel to find out what are the most “important” properties for an entity according to Google. We compare these results with a survey we conducted on 152 users. We finally show how we can represent and explicit this knowledge using the Fresnel vocabulary.


Archive | 2015

3cixty@Expo Milano 2015 enabling visitors to explore a smart city

Giuseppe Rizzo; Raphaël Troncy; Oscar Corcho; Anthony Jameson; Julien Plu; Juan Carlos Ballesteros Hermida; Ahmad Assaf; Catalin Barbu; Adrian Spirescu; Kai-Dominik Kuhn; Irene Celino; Rachit Agarwal; Cong Kinh Nguyen; Animesh Pathak; Christian Scanu; Massimo Valla; Timber Haaker; Emiliano Sergio Verga; Matteo Rossi; José Luis Redondo García

Collaboration


Dive into the Ahmad Assaf's collaboration.

Top Co-Authors

Avatar
Top Co-Authors

Avatar
Top Co-Authors

Avatar
Top Co-Authors

Avatar
Top Co-Authors

Avatar

Oscar Corcho

Technical University of Madrid

View shared research outputs
Top Co-Authors

Avatar
Top Co-Authors

Avatar
Top Co-Authors

Avatar
Top Co-Authors

Avatar
Top Co-Authors

Avatar
Researchain Logo
Decentralizing Knowledge