Javier D. Fernández
Vienna University of Economics and Business
Publication
Featured research published by Javier D. Fernández.
Journal of Web Semantics | 2013
Javier D. Fernández; Miguel A. Martínez-Prieto; Claudio Gutierrez; Axel Polleres; Mario Arias
The current Web of Data is producing increasingly large RDF datasets. Massive publication efforts of RDF data driven by initiatives like the Linked Open Data movement, and the need to exchange large datasets, have unveiled the drawbacks of traditional RDF representations, inspired and designed by a document-centric and human-readable Web. Among the main problems are high levels of verbosity/redundancy and weak machine-processable capabilities in the description of these datasets. This scenario calls for efficient formats for publication and exchange. This article presents a binary RDF representation addressing these issues. Based on a set of metrics that characterizes the skewed structure of real-world RDF data, we develop a proposal of an RDF representation that modularly partitions and efficiently represents three components of RDF datasets: Header information, a Dictionary, and the actual Triples structure (thus called HDT). Our experimental evaluation shows that datasets in HDT format can be compacted by more than fifteen times compared with current naive representations, improving both parsing and processing while keeping a consistent publication scheme. Specific compression techniques over HDT further improve these compression rates and prove to outperform existing compression solutions for efficient RDF exchange.
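As a rough illustration of the Dictionary/Triples split described above, the following Python sketch replaces RDF terms with integer IDs and groups the resulting ID triples by subject. The toy data and helper names are invented for the example; the actual HDT binary layout (bit-sequence adjacency lists, etc.) is considerably more compact.

```python
# Illustrative sketch of the HDT idea: replace RDF terms with integer IDs
# via a Dictionary, so the Triples component stores only compact ID tuples.
triples = [
    ("ex:alice", "foaf:knows", "ex:bob"),
    ("ex:alice", "foaf:name", '"Alice"'),
    ("ex:bob",   "foaf:name", '"Bob"'),
]

# Dictionary: map each distinct term to an integer ID (and back).
term_to_id = {}
id_to_term = []
def encode(term):
    if term not in term_to_id:
        term_to_id[term] = len(id_to_term)
        id_to_term.append(term)
    return term_to_id[term]

# Triples: the graph reduced to integer tuples, grouped by subject so a
# repeated subject is stored once (adjacency-list style).
encoded = {}
for s, p, o in triples:
    encoded.setdefault(encode(s), []).append((encode(p), encode(o)))

print(encoded)        # {0: [(1, 2), (3, 4)], 2: [(3, 5)]}
print(id_to_term[3])  # 'foaf:name'
```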
International Semantic Web Conference | 2010
Javier D. Fernández; Miguel A. Martínez-Prieto; Claudio Gutierrez
Increasingly huge RDF datasets are being published on the Web. Currently, they use different syntaxes of RDF, contain high levels of redundancy, and have a plain, indivisible structure. All this leads to fuzzy publication, inefficient management, complex processing, and a lack of scalability. This paper presents a novel RDF representation (HDT) which takes advantage of the structural properties of RDF graphs to split and represent, efficiently, three components of RDF data: Header, Dictionary, and Triples structure. On-demand management operations can be implemented on top of the HDT representation. Experiments show that datasets can be compacted in HDT by more than fifteen times compared with the current naive representation, improving parsing and processing while keeping a consistent publication scheme. For exchange, specific compression techniques over HDT improve on current compression solutions.
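The on-demand operations mentioned above can be tried out with the community pyHDT bindings; the sketch below assumes that package (pip install hdt) and a local file named dataset.hdt, both of which are assumptions for the example rather than artifacts of the paper.

```python
# On-demand access over a compressed HDT file, using the community pyHDT
# bindings. The file name "dataset.hdt" is an assumption for this sketch.
from hdt import HDTDocument

doc = HDTDocument("dataset.hdt")  # maps the file; no full decompression

# Resolve a triple pattern directly on the compressed representation;
# empty strings act as wildcards.
triples, cardinality = doc.search_triples(
    "", "http://xmlns.com/foaf/0.1/name", "")
print(cardinality)      # number of matches, known without iterating
for s, p, o in triples:
    print(s, o)
```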
International World Wide Web Conference | 2010
Javier D. Fernández; Claudio Gutierrez; Miguel A. Martínez-Prieto
This paper studies the compressibility of RDF datasets. We show that big RDF datasets are highly compressible due to the power-law structure of RDF graphs, the organization of URIs, and the verbosity of RDF syntax. We present basic approaches to compressing RDF data and test them on three well-known, real-world RDF datasets.
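A toy experiment along these lines, assuming nothing more than Python's built-in zlib: generate N-Triples with skewed reuse of long URIs and compare raw versus compressed size. The data and the resulting ratio are illustrative only, not the paper's measurements.

```python
# Why serialized RDF compresses so well: long shared URI prefixes and a
# skewed (power-law-like) reuse of terms create heavy redundancy that even
# a generic compressor exploits.
import zlib

prefix = "http://example.org/resource/"
lines = []
for i in range(10_000):
    s = f"<{prefix}node{i % 100}>"        # few popular subjects
    p = f"<{prefix}ontology/link>"        # one dominant predicate
    o = f"<{prefix}node{i % 500}>"
    lines.append(f"{s} {p} {o} .")
ntriples = "\n".join(lines).encode()

compressed = zlib.compress(ntriples, 9)
print(len(ntriples), len(compressed),
      f"ratio ~{len(ntriples) / len(compressed):.0f}x")
```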
International Semantic Web Conference | 2014
Javier D. Fernández; Alejandro Llaves; Oscar Corcho
RDF streams are sequences of timestamped RDF statements or graphs, which can be generated by several types of data sources (sensors, social networks, etc.). They may provide data at high volumes and rates, and be consumed by applications that require real-time responses. Hence it is important to publish and interchange them efficiently. In this paper, we exploit a key feature of RDF data streams, which is the regularity of their structure and data values, proposing a compressed, efficient RDF interchange (ERI) format, which can reduce the amount of data transmitted when processing RDF streams. Our experimental evaluation shows that our format produces state-of-the-art streaming compression, remaining efficient in performance.
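The regularity that ERI exploits can be sketched as follows: stream items sharing a structural template are reduced to per-property value columns. This is a conceptual illustration in Python, not the actual ERI wire format; all names and data are invented.

```python
# Sensor-style RDF streams repeat the same structure, so the structure can
# be sent once as a template and each item reduced to its values.
from collections import defaultdict

stream = [
    {"obs:sensor": "s1", "obs:temp": 21.3, "obs:time": 1},
    {"obs:sensor": "s1", "obs:temp": 21.4, "obs:time": 2},
    {"obs:sensor": "s2", "obs:temp": 19.8, "obs:time": 2},
]

templates = {}                   # structure -> template id
channels = defaultdict(list)     # (template id, property) -> value column
for item in stream:
    shape = tuple(sorted(item))
    tid = templates.setdefault(shape, len(templates))
    for prop in shape:
        channels[(tid, prop)].append(item[prop])

# One template covers the whole stream; only value columns remain, and
# homogeneous columns (all floats, monotone timestamps) compress well.
print(templates)                  # {('obs:sensor', 'obs:temp', 'obs:time'): 0}
print(channels[(0, "obs:temp")])  # [21.3, 21.4, 19.8]
```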
Future Generation Computer Systems | 2015
Miguel A. Martínez-Prieto; Carlos E. Cuesta; Mario Arias; Javier D. Fernández
Big Data management has become a critical task in many application systems, which usually rely on heavyweight batch processes to manage such large amounts of data. However, batch architectures are not an adequate choice for designing real-time systems in which data updates and reads must be satisfied with very low latency. Thus, gathering and consuming high volumes of data at high velocities is an emerging challenge which we specifically address in the scope of innovative scenarios based on semantic data (RDF) management. The Linked Open Data initiative and emergent projects in the Internet of Things are examples of such scenarios. This paper describes a new architecture (referred to as Solid) which separates the complexities of Big Semantic Data storage and indexing from real-time data acquisition and consumption. This decision relies on the use of two optimized datastores which respectively store historical (big) data and run-time data. It ensures efficient volume management and high processing velocity, but adds the need to coordinate both datastores. Solid proposes a 3-tiered architecture in which each responsibility is specifically addressed. Besides its theoretical description, we also propose and evaluate a Solid prototype built on top of binary RDF and state-of-the-art triplestores. Our experimental numbers report that Solid achieves large savings in data storage (it uses up to 5 times less space than the compared triplestores), while providing efficient SPARQL resolution over the Big Semantic Data (in the order of 10-20 ms for the studied queries). These experiments also show that Solid ensures low-latency operations because data effectively managed in real time remain small and so do not suffer Big Data issues. We propose an architecture (Solid) for managing big semantic data in real time. Specific big data and real-time responsibilities are isolated in dedicated layers. A dynamic pipe-filter solution is introduced for addressing query responsibilities. Solid leverages RDF/HDT features to obtain the most compressed representations. The Solid prototype performs competitively with respect to the most prominent triplestores.
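A minimal sketch of the two-datastore split, with invented class names standing in for the historical (HDT-style, read-only) store and the small run-time store; the actual Solid prototype coordinates real triplestores rather than in-memory sets.

```python
# Solid's core idea: an immutable, compressed historical store absorbs the
# volume, a small write-optimized store absorbs the velocity, and a merge
# tier answers queries over both. All names here are illustrative.

def _matches(triple, pattern):
    # None acts as a wildcard in a (s, p, o) pattern.
    return all(p is None or t == p for t, p in zip(triple, pattern))

class HistoricalStore:
    """Stands in for the big, read-only compressed store (e.g. HDT)."""
    def __init__(self, triples):
        self._triples = frozenset(triples)   # immutable after consolidation
    def match(self, pattern):
        return {t for t in self._triples if _matches(t, pattern)}

class RealtimeStore:
    """Small store for freshly arrived triples; stays low latency."""
    def __init__(self):
        self._triples = set()
    def insert(self, triple):
        self._triples.add(triple)
    def match(self, pattern):
        return {t for t in self._triples if _matches(t, pattern)}

def query(pattern, historical, realtime):
    # Merge tier: union of both stores; run-time data stays small, so
    # low-latency reads never touch the Big Data machinery.
    return historical.match(pattern) | realtime.match(pattern)

hist = HistoricalStore([("ex:a", "ex:p", "ex:b")])
live = RealtimeStore()
live.insert(("ex:c", "ex:p", "ex:d"))
print(query((None, "ex:p", None), hist, live))   # both triples
```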
Knowledge and Information Systems | 2015
Sandra Álvarez-García; Nieves R. Brisaboa; Javier D. Fernández; Miguel A. Martínez-Prieto; Gonzalo Navarro
The Web of Data has been gaining momentum in recent years, leading to the publication of more and more semi-structured datasets following, in many cases, the RDF (Resource Description Framework) data model, which is based on atomic triple units of subject, predicate, and object. Although it is a very simple model, specific compression methods become necessary because datasets are increasingly large and various scalability issues arise around their organization and storage. This requirement is even more restrictive in RDF stores, where efficient SPARQL resolution over the compressed RDF datasets is also required. This article introduces a novel RDF indexing technique that supports efficient SPARQL resolution in compressed space. Our technique, called k²-triples, uses the predicate to vertically partition the dataset into disjoint subsets of pairs (subject, object), one per predicate. These subsets are represented as binary matrices of subjects × objects.
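The vertical partitioning can be sketched in a few lines, with plain sets of ID pairs standing in for the compressed subject-by-object matrices; the IDs and predicate names are invented for the example, and the actual technique compresses these very sparse matrices with k²-trees to support SPARQL in compressed space.

```python
# k²-triples intuition: one boolean subject-by-object matrix per predicate,
# modeled here as a set of (subject_id, object_id) pairs.
from collections import defaultdict

triples = [(0, "foaf:knows", 2), (0, "foaf:name", 4), (2, "foaf:name", 5)]

matrices = defaultdict(set)      # predicate -> {(subject_id, object_id)}
for s, p, o in triples:
    matrices[p].add((s, o))

# With a bound predicate, a triple-pattern check becomes a cell lookup:
print((0, 2) in matrices["foaf:knows"])                  # True
# (s, ?, o) patterns scan the per-predicate matrices instead:
print([p for p, m in matrices.items() if (2, 5) in m])   # ['foaf:name']
```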
ACM SIGAPP Applied Computing Review | 2012
Miguel A. Martínez-Prieto; Javier D. Fernández; Rodrigo Cánovas
European Conference on Software Architecture | 2013
Carlos E. Cuesta; Miguel A. Martínez-Prieto; Javier D. Fernández
International Conference on Semantic Systems | 2016
Javier D. Fernández; Jürgen Umbrich; Axel Polleres; Magnus Knuth
ACM Symposium on Applied Computing | 2012
Miguel A. Martínez-Prieto; Javier D. Fernández; Rodrigo Cánovas