Publication


Featured research published by José M. Giménez-García.


European Semantic Web Conference | 2015

HDT-MR: A Scalable Solution for RDF Compression with HDT and MapReduce

José M. Giménez-García; Javier D. Fernández; Miguel A. Martínez-Prieto

HDT is a binary RDF serialization that aims to minimize the space overheads of traditional RDF formats while providing retrieval features in compressed space. Several HDT-based applications, such as the recent Linked Data Fragments proposal, leverage these features for diverse publication, interchange and consumption purposes. However, scalability issues emerge in HDT construction because the whole RDF dataset must be processed in a memory-consuming task. This is hindering the evolution of novel applications and techniques at Web scale. This paper introduces HDT-MR, a MapReduce-based technique to process huge RDF datasets and build the HDT serialization. HDT-MR performs in linear time with the dataset size and has proven able to serialize datasets of up to several billion triples, preserving HDT compression and retrieval features.


European Semantic Web Conference | 2017

NdFluents: An Ontology for Annotated Statements with Inference Preservation

José M. Giménez-García; Antoine Zimmermann; Pierre Maret

RDF provides the means to publish, link, and consume heterogeneous information on the Web of Data, whereas OWL allows the construction of ontologies and inference of new information that is implicit in the data. Annotating RDF data with additional information, such as provenance, trustworthiness, or temporal validity, is becoming increasingly important; however, only binary (or dyadic) relations between entities can be represented natively in RDF and OWL. While there are some approaches to represent metadata on RDF, they lose most of the reasoning power of OWL. In this paper we present an extension of Welty and Fikes’ 4dFluents ontology (on associating temporal validity to statements) to any number of dimensions, provide guidelines and design patterns to implement it on actual data, and compare its reasoning power with alternative representations.


Web Intelligence, Mining and Semantics | 2016

Are Linked Datasets fit for Open-domain Question Answering? A Quality Assessment

Harsh Thakkar; Kemele M. Endris; José M. Giménez-García; Jeremy Debattista; Christoph Lange; Sören Auer

The current decade is a witness to an enormous explosion of data being published on the Web as Linked Data to maximise its reusability. Answering questions that users speak or write in natural language is an increasingly popular application scenario for Web Data, especially when the domain of the questions is not limited to a domain where dedicated curated datasets exist, like in medicine. The increasing use of Web Data in this and other settings has highlighted the importance of assessing its quality. While considerable work has been done on assessing the quality of Linked Data, only a few efforts have been dedicated to quality assessment of Linked Data from the perspective of the question answering (QA) domain. From the Linked Data quality metrics that have so far been well documented in the literature, we have identified those that are most relevant for QA. We apply these quality metrics, implemented in the Luzzu framework, to subsets of two datasets of crucial importance to open-domain QA -- DBpedia and Wikidata -- and thus present the first assessment of the quality of these datasets for QA. From these datasets, we assess slices covering the specific domains of restaurants, politicians, films and soccer players. The results of our experiments suggest that for most of these domains, the quality of Wikidata with regard to the majority of relevant metrics is higher than that of DBpedia.


International Semantic Web Conference | 2016

Assessing Trust with PageRank in the Web of Data

José M. Giménez-García; Harsh Thakkar; Antoine Zimmermann

While a number of quality metrics have been successfully proposed for datasets in the Web of Data, there is a lack of trust metrics that can be computed for any given dataset. We argue that reuse of data can be seen as an act of trust. In the Semantic Web environment, datasets regularly include terms from other sources, and each of these connections expresses a degree of trust in that source. However, determining what a dataset is in this context is not straightforward. We study the concepts of dataset and dataset link, to finally use the concept of Pay-Level Domain to differentiate datasets, and consider usage of external terms as connections among them. Using these connections we compute the PageRank value for each dataset, and examine the influence of ignoring predicates for computation. This process has been performed for more than 300 datasets, extracted from the LOD Laundromat. The results show that reuse of a dataset is not correlated with its size, and provide some insight on the limitations of the approach and ways to improve its efficacy.
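
The process described in this abstract (group terms by Pay-Level Domain, treat use of an external term as a link between datasets, then run PageRank) might be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: `naive_pld`, `dataset_links`, `pagerank`, the `include_predicates` flag and the toy triples are all hypothetical, and the PLD is approximated by the last two host labels rather than by a proper Public Suffix List lookup.

```python
from collections import defaultdict
from urllib.parse import urlparse


def naive_pld(iri):
    """Crude pay-level-domain guess: the last two labels of the host.

    Assumption for illustration only; a real PLD needs the Public Suffix List.
    Returns "" for anything that is not an IRI with a host (e.g. a literal).
    """
    host = urlparse(iri).hostname or ""
    return ".".join(host.split(".")[-2:])


def dataset_links(triples_by_dataset, include_predicates=True):
    """Map each dataset (keyed by PLD) to the other PLDs whose terms it uses."""
    links = defaultdict(set)
    for source, triples in triples_by_dataset.items():
        links[source]  # register the dataset as a node even without links
        for s, p, o in triples:
            terms = (s, p, o) if include_predicates else (s, o)
            for term in terms:
                target = naive_pld(term)
                if target and target != source:
                    links[source].add(target)
                    links[target]  # register the target as a node too
    return links


def pagerank(links, damping=0.85, iterations=50):
    """Plain power-iteration PageRank over a dict of node -> set of targets."""
    nodes = sorted(links)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iterations):
        # Rank held by nodes without outgoing links is spread evenly.
        dangling = sum(rank[v] for v in nodes if not links[v])
        new_rank = {v: (1 - damping) / n + damping * dangling / n for v in nodes}
        for v in nodes:
            for target in links[v]:
                new_rank[target] += damping * rank[v] / len(links[v])
        rank = new_rank
    return rank


# Toy data (made up): a.org reuses terms from b.org and c.org, b.org from c.org.
example = {
    "a.org": [("http://data.a.org/s", "http://xmlns.c.org/p", "http://www.b.org/o")],
    "b.org": [("http://www.b.org/s", "http://xmlns.c.org/p", "a literal value")],
}
ranks = pagerank(dataset_links(example))
ranks_no_predicates = pagerank(dataset_links(example, include_predicates=False))
```

In this toy graph, c.org is referenced only through predicates, so passing `include_predicates=False` removes it from the graph entirely, which gives a small-scale feel for why the abstract examines the influence of ignoring predicates.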


International Conference on Knowledge Capture | 2017

Dataset Reuse: An Analysis of References in Community Discussions, Publications and Data

Kemele M. Endris; José M. Giménez-García; Harsh Thakkar; Elena Demidova; Antoine Zimmermann; Christoph Lange; Elena Simperl

Following the Linked Data principles means maximising the reusability of data over the Web. Reuse of datasets can become apparent when datasets are linked to from other datasets, and referred to in scientific articles or community discussions. It can thus be measured, similarly to citations of papers. In this paper we propose dataset reuse metrics and use these metrics to analyse indications of dataset reuse in different communication channels within a scientific community. In particular we consider mailing lists and publications in the Semantic Web community and their correlation with data interlinking. Our results demonstrate that indications of dataset reuse across different communication channels and reuse in terms of data interlinking are positively correlated.


20th International Conference on Knowledge Engineering and Knowledge Management | 2016

Representing Contextual Information as Fluents

José M. Giménez-García; Antoine Zimmermann; Pierre Maret

Annotating semantic data with metadata is becoming more and more important to provide information about the statements. While there are solutions to represent temporal information about a statement, a general annotation framework which allows representing more contextual information is needed. In this paper, we extend the 4dFluents ontology by Welty and Fikes to any dimension of context.


Open Journal of Semantic Web (OJSW) | 2014

MapReduce-based Solutions for Scalable SPARQL Querying

José M. Giménez-García; Javier D. Fernández; Miguel A. Martínez-Prieto


arXiv: Logic in Computer Science | 2017

Integrating Context of Statements within Description Logics

Antoine Zimmermann; José M. Giménez-García


arXiv: Artificial Intelligence | 2016

NdFluents: A Multi-dimensional Contexts Ontology

José M. Giménez-García; Antoine Zimmermann; Pierre Maret


arXiv: Databases | 2018

NELL2RDF: Reading the Web, and Publishing it as Linked Data.

José M. Giménez-García; Maisa C. Duarte; Antoine Zimmermann; Christophe Gravier; Estevam R. Hruschka; Pierre Maret

Collaboration


Dive into José M. Giménez-García's collaborations.

Top Co-Authors

Antoine Zimmermann

Centre national de la recherche scientifique

Javier D. Fernández

Vienna University of Economics and Business

Elena Simperl

University of Southampton
