Emir Muñoz
National University of Ireland, Galway
Publication
Featured research published by Emir Muñoz.
web search and data mining | 2014
Emir Muñoz; Aidan Hogan; Alessandra Mileo
The tables embedded in Wikipedia articles contain rich, semi-structured encyclopaedic content. However, the cumulative content of these tables cannot be queried against. We thus propose methods to recover the semantics of Wikipedia tables and, in particular, to extract facts from them in the form of RDF triples. Our core method uses an existing Linked Data knowledge-base to find pre-existing relations between entities in Wikipedia tables, suggesting the same relations as holding for other entities in analogous columns on different rows. We find that such an approach extracts RDF triples from Wikipedia's tables at a raw precision of 40%. To improve the raw precision, we define a set of features for extracted triples that are tracked during the extraction phase. Using a manually labelled gold standard, we then test a variety of machine learning methods for classifying correct/incorrect triples. One such method extracts 7.9 million unique and novel RDF triples from over one million Wikipedia tables at an estimated precision of 81.5%.
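The core idea in this abstract (reuse relations already known for some rows to suggest them for other rows in the same column pair) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function name `suggest_triples`, the `min_support` threshold, and the toy entity and predicate names are all assumptions made for illustration, and the feature tracking and classification steps are omitted.

```python
from collections import Counter

def suggest_triples(rows, kb, col_a, col_b, min_support=2):
    """Suggest new (subject, predicate, object) triples for one column pair.

    rows: list of tuples of entity identifiers, one tuple per table row.
    kb:   set of (subject, predicate, object) triples already known.
    min_support is a hypothetical threshold, not taken from the paper.
    """
    # Index the known predicates by (subject, object) pair.
    by_pair = {}
    for s, p, o in kb:
        by_pair.setdefault((s, o), set()).add(p)

    # Count how many rows already exhibit each predicate between the columns.
    support = Counter()
    for row in rows:
        for p in by_pair.get((row[col_a], row[col_b]), ()):
            support[p] += 1

    # Propose well-supported predicates for rows where the triple is missing.
    suggestions = set()
    for p, count in support.items():
        if count < min_support:
            continue
        for row in rows:
            triple = (row[col_a], p, row[col_b])
            if triple not in kb:
                suggestions.add(triple)
    return suggestions

# Toy table (city, country) and toy knowledge base.
rows = [
    ("Dublin", "Ireland"),
    ("Santiago", "Chile"),
    ("Wellington", "New_Zealand"),
]
kb = {
    ("Dublin", "capitalOf", "Ireland"),
    ("Santiago", "capitalOf", "Chile"),
    ("Santiago", "locatedIn", "Chile"),
}
new_triples = suggest_triples(rows, kb, 0, 1)
print(new_triples)
```

In this toy run only `capitalOf` reaches the support threshold, so the single suggestion is the missing triple for the third row. Raw suggestions of this kind are what the abstract reports as needing a later classification step to raise precision.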
database and expert systems applications | 2012
Flavio Ferrarotti; Sven Hartmann; Sebastian Link; Mauricio Marin; Emir Muñoz
Keys are fundamental for database management, independently of the particular data model used. In particular, several notions of XML keys have been proposed over the last decade, and their expressiveness and computational properties have been analyzed in theory. In practice, however, expressive notions of XML keys with good reasoning capabilities have been widely ignored. In this paper we present an efficient implementation of an algorithm that decides the implication problem for a tractable and expressive class of XML keys. We also evaluate the performance of the proposed algorithm, demonstrating that reasoning about expressive notions of XML keys can be done efficiently in practice and scales well. Our work indicates that XML keys such as those studied here have great potential for diverse areas such as schema design, query optimization, storage and updates, data exchange and integration. To exemplify this potential, we use the algorithm to calculate non-redundant covers for sets of XML keys, and show that these covers can significantly reduce the number of XML keys against which XML documents must be validated. This can result in enormous time savings.
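Computing a non-redundant cover only needs an implication test, so the general procedure can be sketched independently of the XML-specific algorithm. The Python below is a generic sketch under stated assumptions: `non_redundant_cover` and `superkey_implies` are hypothetical names, and the oracle shown is a simple stand-in for relational-style keys (a key is implied when some other key uses a subset of its attributes), not the XML key implication algorithm the paper implements.

```python
def non_redundant_cover(constraints, implies):
    """Drop every constraint that is implied by the remaining ones.

    implies(rest, c) is an implication oracle supplied by the caller.
    """
    cover = list(constraints)
    i = 0
    while i < len(cover):
        rest = cover[:i] + cover[i + 1:]
        if implies(rest, cover[i]):
            cover = rest      # cover[i] is redundant; re-test the same index
        else:
            i += 1            # cover[i] is needed; move on
    return cover

def superkey_implies(rest, key):
    """Stand-in oracle: a key is implied if a subset of it is already a key."""
    return any(other <= key for other in rest)

keys = [
    frozenset({"isbn"}),
    frozenset({"isbn", "title"}),     # redundant: {"isbn"} is already a key
    frozenset({"author", "title"}),
]
cover = non_redundant_cover(keys, superkey_implies)
print(cover)
```

The loop calls the oracle once per constraint, so the cost of computing the cover is dominated by the implication test. This is why an efficient decision algorithm matters for the validation savings the abstract describes.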
Trans. Large-Scale Data- and Knowledge-Centered Systems | 2013
Flavio Ferrarotti; Sven Hartmann; Sebastian Link; Mauricio Marin; Emir Muñoz
The increasing popularity of XML for persistent data storage, processing and exchange has triggered the demand for efficient algorithms to manage XML data. Both industry and academia have long since recognized the importance of keys in XML data management. In this paper we make a theoretical as well as a practical contribution to this area. This endeavour is ambitious given the multitude of intractability results that have been established. Our theoretical contribution is based on the definition of a new fragment of XML keys that keeps the right balance between expressiveness and efficiency of maintenance. More precisely, we characterize the associated implication problem axiomatically and develop a low-degree polynomial time decision algorithm. In comparison to previous work, this new fragment of XML keys provides designers with an enhanced ability to capture properties of XML data that are significant for the application at hand. Our practical contribution includes an efficient implementation of this decision algorithm and a thorough evaluation of its performance, demonstrating that reasoning about expressive notions of XML keys can be done efficiently in practice, and scales well. Our results promote the use of XML keys in real-world XML practice, where a little more semantics makes applications a lot more effective. To exemplify this potential, we use the decision algorithm to calculate non-redundant covers for sets of XML keys. In turn, this allows us to significantly reduce the time required to validate large XML documents against keys from the proposed fragment.
international semantic web conference | 2016
Emir Muñoz
RDF is structured, dynamic, and schemaless data, which enables a great deal of flexibility for Linked Data to be available in an open environment such as the Web. However, for RDF data, flexibility turns out to be the source of many data quality and knowledge representation issues. Tasks such as assessing data quality in RDF require a different set of techniques and tools compared to other data models. Furthermore, since the use of existing schema, ontology and constraint languages is not mandatory, there is always room for misunderstanding the structure of the data. Neglecting this problem can represent a threat to the widespread use and adoption of RDF and Linked Data. Users should be able to learn the characteristics of RDF data in order to determine its fitness for a given use case, for example. For that purpose, in this doctoral research, we propose the use of constraints to inform users about characteristics that RDF data naturally exhibits, in cases where ontologies or any other form of explicitly given constraints or schemata are not present or not expressive enough. We aim to address the problems of defining and discovering classes of constraints to help users in data analysis and assessment of RDF and Linked Data quality.
web information systems engineering | 2013
Flavio Ferrarotti; Sven Hartmann; Sebastian Link; Mauricio Marin; Emir Muñoz
We introduce soft cardinality constraints which need to be satisfied on average only, and thus permit violations in a controlled manner. Starting from a highly expressive but intractable class, we establish a fragment that is maximal with respect to both expressivity and efficiency. More precisely, we characterise the associated implication problem axiomatically and develop a low-degree polynomial time decision algorithm. Any increase in expressivity of our fragment results in coNP-hardness of the implication problem. Finally, we extensively test the performance of our algorithm. The performance evaluation provides first-hand evidence that reasoning about expressive notions of soft cardinality constraints on XML data is practically efficient and scales well. Our results unleash soft cardinality constraints on real-world XML practice, where a little more semantics makes applications a lot more effective in contexts where exceptions to common rules may occur.
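The abstract does not spell out the formal definition, so the following Python sketch shows only one plausible reading of "satisfied on average": the mean number of target children per context node must fall within the bounds, even if individual nodes violate them. The function names, the element names, and the bounds are assumptions for illustration, and the paper's constraints are defined over path expressions that this toy check does not model.

```python
import xml.etree.ElementTree as ET

def child_counts(root, context_tag, target_tag):
    """Number of direct target_tag children for every context_tag element."""
    return [len(node.findall(target_tag)) for node in root.iter(context_tag)]

def satisfies_strictly(root, context_tag, target_tag, lower, upper):
    """Classical reading: every context node must respect the bounds."""
    return all(lower <= n <= upper
               for n in child_counts(root, context_tag, target_tag))

def satisfies_on_average(root, context_tag, target_tag, lower, upper):
    """Assumed soft reading: only the mean over context nodes must fit."""
    counts = child_counts(root, context_tag, target_tag)
    if not counts:
        return True  # no context nodes, so nothing can be violated
    return lower <= sum(counts) / len(counts) <= upper

doc = ET.fromstring(
    "<library>"
    "<book><author/><author/><author/></book>"
    "<book><author/></book>"
    "<book><author/><author/></book>"
    "</library>"
)
strict_ok = satisfies_strictly(doc, "book", "author", 1, 2)
soft_ok = satisfies_on_average(doc, "book", "author", 1, 2)
print(strict_ok, soft_ok)
```

Here one book has three authors, so the strict check fails, while the mean of two authors per book stays within the bounds and the soft check passes. This is the kind of controlled violation the abstract refers to.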
language data and knowledge | 2017
Sameh K. Mohamed; Emir Muñoz; Vít Nováček; Pierre-Yves Vandenbussche
Relation paths are sequences of relations with inverse that allow for complete exploration of knowledge graphs in a two-way unconstrained manner. They are powerful enough to encode complex relationships between entities and are crucial in several contexts, such as knowledge base verification, rule mining, and link prediction. However, fundamental forms of reasoning such as containment and equivalence of relation paths have hitherto been ignored. Intuitively, two relation paths are equivalent if they share the same extension, i.e., set of source and target entity pairs. In this paper, we study the problem of containment as a means to find equivalent relation paths and show that it is very expensive in practice to enumerate paths between entities. We characterize the complexity of containment and equivalence of relation paths and propose a domain-independent and unsupervised method to obtain approximate equivalences ranked by a tri-criteria ranking function. We evaluate our algorithm using test cases over real-world data and show that we are able to find semantically meaningful equivalences efficiently.
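The notion of extension used above can be made concrete with a small Python sketch that evaluates a relation path exhaustively over a toy graph and compares extensions as sets. The `^` prefix for inverse relations, the function name `extension`, and the example triples are assumptions for illustration. This brute-force evaluation is exactly the kind of enumeration the abstract describes as expensive, not the approximate ranking method the paper proposes.

```python
def extension(path, triples):
    """Set of (source, target) pairs connected by a relation path.

    path:    list of relation names; a leading '^' marks an inverse step
             (a notation assumed here for illustration).
    triples: iterable of (subject, predicate, object).
    """
    triples = list(triples)
    entities = {s for s, _, _ in triples} | {o for _, _, o in triples}
    pairs = {(e, e) for e in entities}  # the empty path relates each entity to itself
    for step in path:
        inverse = step.startswith("^")
        rel = step[1:] if inverse else step
        edges = {(o, s) if inverse else (s, o)
                 for s, p, o in triples if p == rel}
        # Join the pairs reached so far with the edges of the current step.
        pairs = {(a, c) for a, b in pairs for b2, c in edges if b == b2}
    return pairs

triples = [
    ("alice", "childOf", "bob"),
    ("bob", "parentOf", "alice"),
    ("carol", "childOf", "dave"),
    ("dave", "parentOf", "carol"),
    ("dave", "parentOf", "erin"),
]
ext_parent = extension(["parentOf"], triples)
ext_inv_child = extension(["^childOf"], triples)

contained = ext_inv_child <= ext_parent    # containment of extensions
equivalent = ext_inv_child == ext_parent   # equivalence of extensions
print(contained, equivalent)
```

In this toy graph the inverse of `childOf` is contained in `parentOf` but not equivalent to it, because one `parentOf` edge has no matching `childOf` edge. Incomplete data of this kind is one reason approximate equivalences can be more useful than exact ones.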
LD4IE'13 Proceedings of the First International Conference on Linked Data for Information Extraction - Volume 1057 | 2013
Emir Muñoz; Aidan Hogan; Alessandra Mileo
international semantic web conference | 2013
Emir Muñoz; Aidan Hogan; Alessandra Mileo
LD4IE'14 Proceedings of the Second International Conference on Linked Data for Information Extraction - Volume 1267 | 2014
Emir Muñoz
asia-pacific conference on conceptual modelling | 2015
Bo Liu; Sebastian Link; Emir Muñoz