André Freitas | Researchain

Publication

Featured research published by André Freitas.

international conference natural language processing | 2011

Querying linked data using semantic relatedness: a vocabulary independent approach

André Freitas; João Gabriel Oliveira; Sean O'Riain; Edward Curry; João Carlos Pereira da Silva

Linked Data brings the promise of incorporating a new dimension to the Web where the availability of Web-scale data can determine a paradigmatic transformation of the Web and its applications. However, together with its opportunities, Linked Data brings inherent challenges in the way users and applications consume the available data. Users consuming Linked Data on the Web, or on corporate intranets, should be able to search and query data spread over a potentially large number of heterogeneous, complex and distributed datasets. Ideally, a query mechanism for Linked Data should abstract users from the representation of data. This work investigates a vocabulary-independent natural language query mechanism for Linked Data, using an approach based on the combination of entity search, a Wikipedia-based semantic relatedness measure and spreading activation. The combination of these three elements in a query mechanism for Linked Data is a new contribution in this space. Wikipedia-based relatedness measures address limitations of existing work that relies on WordNet-based similarity measures and term expansion. Experimental results using the query mechanism to answer 50 natural language queries over DBpedia achieved a mean reciprocal rank of 61.4%, an average precision of 48.7% and average recall of 57.2%, answering 70% of the queries.
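The abstract reports mean reciprocal rank, average precision and average recall but does not say how they were computed. As a minimal sketch, assuming the standard textbook definitions (the function names and the toy data below are hypothetical and not taken from the paper), the three figures could be obtained per query and then averaged:

```python
from typing import Hashable, List, Sequence, Set, Tuple


def reciprocal_rank(ranked: Sequence[Hashable], relevant: Set[Hashable]) -> float:
    """Return 1/position of the first relevant answer, or 0.0 if none is returned."""
    for position, item in enumerate(ranked, start=1):
        if item in relevant:
            return 1.0 / position
    return 0.0


def precision_recall(returned: Sequence[Hashable], relevant: Set[Hashable]) -> Tuple[float, float]:
    """Set-based precision and recall for a single query."""
    returned_set = set(returned)
    hits = len(returned_set & relevant)
    precision = hits / len(returned_set) if returned_set else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


def evaluate(runs: List[Tuple[Sequence[Hashable], Set[Hashable]]]) -> Tuple[float, float, float]:
    """Average the per-query scores: (mean reciprocal rank, avg. precision, avg. recall)."""
    if not runs:
        return 0.0, 0.0, 0.0
    n = len(runs)
    mrr = sum(reciprocal_rank(ranked, gold) for ranked, gold in runs) / n
    pairs = [precision_recall(ranked, gold) for ranked, gold in runs]
    avg_precision = sum(p for p, _ in pairs) / n
    avg_recall = sum(r for _, r in pairs) / n
    return mrr, avg_precision, avg_recall


# Hypothetical toy data: (ranked system answers, gold-standard answers) per query.
toy_runs = [
    (["a", "b"], {"b"}),   # first correct answer at rank 2
    (["c"], {"c", "d"}),   # correct at rank 1, but one gold answer is missed
    (["x"], {"y"}),        # query not answered correctly
]
mrr, avg_p, avg_r = evaluate(toy_runs)
```

Here "average precision" is read as the mean of per-query precision values, which is one plausible interpretation of the abstract; a rank-sensitive definition would give different numbers.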

Archive | 2010

The Role of Community-Driven Data Curation for Enterprises

Edward Curry; André Freitas; Sean O'Riain

With increased utilization of data within their operational and strategic processes, enterprises need to ensure data quality and accuracy. Data curation is a process that can ensure the quality of data and its fitness for use. Traditional approaches to curation are struggling with increased data volumes and near real-time demands for curated data. In response, curation teams have turned to community crowd-sourcing and semi-automated metadata tools for assistance. This chapter provides an overview of data curation, discusses the business motivations for curating data and investigates the role of community-based data curation, focusing on internal communities and pre-competitive data collaborations. The chapter is supported by case studies from Wikipedia, The New York Times, Thomson Reuters, Protein Data Bank and ChemSpider, upon which best practices for both social and technical aspects of community-driven data curation are described.

Future Generation Computer Systems | 2011

W3P: Building an OPM based provenance model for the Web

André Freitas; Tomas Knap; Sean O'Riain; Edward Curry

The Web is evolving into a complex information space where the unprecedented volume of documents and data will offer to the information consumer a level of information integration and aggregation that has up until now not been possible. Indiscriminate addition of information can, however, come with inherent problems such as the provision of poor quality or fraudulent information. Provenance represents the cornerstone element which will enable information consumers to assess information quality, which will play a fundamental role in the continued evolution of the Web. This paper investigates the characteristics and requirements of provenance on the Web, describing how the Open Provenance Model (OPM) can be used as a foundation for the creation of W3P, a provenance model and ontology designed to meet the core requirements for the Web.

intelligent user interfaces | 2014

Natural language queries over heterogeneous linked data graphs: a distributional-compositional semantics approach

André Freitas; Edward Curry

The demand to access large amounts of heterogeneous structured data is emerging as a trend for many users and applications. However, the effort involved in querying heterogeneous and distributed third-party databases can create major barriers for data consumers. At the core of this problem is the semantic gap between the way users express their information needs and the representation of the data. This work aims to provide a natural language interface and an associated semantic index to support an increased level of vocabulary independence for queries over Linked Data/Semantic Web datasets, using a distributional-compositional semantics approach. Distributional semantics focuses on the automatic construction of a semantic model based on the statistical distribution of co-occurring words in large-scale texts. The proposed query model targets the following features: (i) a principled semantic approximation approach with low adaptation effort (independent of manually created resources such as ontologies, thesauri or dictionaries), (ii) comprehensive semantic matching supported by the inclusion of large volumes of distributional (unstructured) commonsense knowledge into the semantic approximation process and (iii) expressive natural language queries. The approach is evaluated using natural language queries on an open domain dataset and achieved avg. recall=0.81, mean avg. precision=0.62 and mean reciprocal rank=0.49.
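To illustrate the idea of a semantic model built from the statistical distribution of co-occurring words, the following is a minimal sketch and not the model used in the paper: it assumes raw co-occurrence counts within a fixed window and cosine similarity as the relatedness score. The function names, the window size and the three-sentence corpus are all hypothetical.

```python
import math
from collections import Counter, defaultdict
from typing import Dict, Iterable


def cooccurrence_vectors(sentences: Iterable[str], window: int = 2) -> Dict[str, Counter]:
    """Map each word to a Counter of the words seen within `window` tokens of it."""
    vectors: Dict[str, Counter] = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i, word in enumerate(tokens):
            start = max(0, i - window)
            end = min(len(tokens), i + window + 1)
            for j in range(start, end):
                if j != i:
                    vectors[word][tokens[j]] += 1
    return vectors


def cosine(u: Counter, v: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical toy corpus; a real model would be built from large-scale text.
corpus = [
    "the wife of the president",
    "the spouse of the president",
    "the river flows north",
]
vectors = cooccurrence_vectors(corpus)
related = cosine(vectors["wife"], vectors["spouse"])
unrelated = cosine(vectors["wife"], vectors["river"])
```

In this toy corpus "wife" and "spouse" occur in identical contexts, so they score higher than "wife" and "river". That is the kind of signal that would let a query term be matched to a differently named dataset property without a hand-built thesaurus.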

Reasoning Web. Reasoning on the Web in the Big Data Era: 10th International Summer School 2014, Athens, Greece, September 8-13, 2014. Proceedings | 2014

An Introduction to Question Answering over Linked Data

Christina Unger; André Freitas; Philipp Cimiano

While the amount of knowledge available as linked data grows, so does the need for providing end users with access to this knowledge. Question answering systems in particular are receiving much interest, as they provide intuitive access to data via natural language and shield end users from technical aspects related to data modelling, vocabularies and query languages. This tutorial gives an introduction to the rapidly developing field of question answering over linked data. It gives an overview of the main challenges involved in the interpretation of a user’s information need expressed in natural language with respect to the data that is queried. The paper summarizes the main existing approaches and systems, including available tools and resources, benchmarks and evaluation campaigns. Finally, it lists the open topics that will keep question answering over linked data an exciting area of research in the years to come.

International Journal of Semantic Computing | 2011

A Distributional Structured Semantic Space for Querying RDF Graph Data

André Freitas; Edward Curry; João Gabriel Oliveira; Sean O'Riain

The vision of creating a Linked Data Web brings with it the challenge of allowing queries across highly heterogeneous and distributed datasets. In order to query Linked Data on the Web today, end users need to be aware of which datasets potentially contain the data and also which data model describes these datasets. The process of allowing users to expressively query relationships in RDF while abstracting them from the underlying data model represents a fundamental problem for Web-scale Linked Data consumption. This article introduces a distributional structured semantic space which enables data-model-independent natural language queries over RDF data. The core of the approach relies on a distributional semantic model to provide the level of semantic interpretation needed for data model independence. The article analyzes the geometric aspects of the proposed space, providing its description as a distributional structured vector space, which is built upon the Generalized Vector Space Model (GVSM). The final semantic space proved to be flexible and precise under real-world query conditions, achieving mean reciprocal rank = 0.516, avg. precision = 0.482 and avg. recall = 0.491.

meeting of the association for computational linguistics | 2017

SemEval-2017 Task 5: Fine-grained sentiment analysis on financial microblogs and news

Siegfried Handschuh; Manuela Huerlimann; Keith Cortis; André Freitas; Manel Zarrouk; Brian Davis; Tobias Daudert

Horizon 2020 ICT Program Project SSIX: Social Sentiment analysis financial IndeXes, has received funding from the European Union’s Horizon 2020 Research and Innovation Program ICT 2014 - Information and Communications Technologies under grant agreement No. 645425

data and knowledge engineering | 2013

Editorial: Querying linked data graphs using semantic relatedness: A vocabulary independent approach

André Freitas; João Gabriel Oliveira; Sean O'Riain; João Carlos Pereira da Silva; Edward Curry

Linked Data brings inherent challenges in the way users and applications consume the available data. Users consuming Linked Data on the Web should be able to search and query data spread over potentially large numbers of heterogeneous, complex and distributed datasets. Ideally, a query mechanism for Linked Data should abstract users from the representation of data. This work investigates a vocabulary-independent natural language query mechanism for Linked Data, using an approach based on the combination of entity search, a Wikipedia-based semantic relatedness measure and spreading activation. Wikipedia-based semantic relatedness measures address limitations of existing work that relies on WordNet-based similarity measures and term expansion. Experimental results using the query mechanism to answer 50 natural language queries over DBpedia achieved a mean reciprocal rank of 61.4%, an average precision of 48.7% and average recall of 57.2%.

applications of natural language to data bases | 2014

A Distributional Semantics Approach for Selective Reasoning on Commonsense Graph Knowledge Bases

André Freitas; João Carlos Pereira da Silva; Edward Curry; Paul Buitelaar

Tasks such as question answering and semantic search depend on the ability to query and reason over large-scale commonsense knowledge bases (KBs). However, dealing with commonsense data demands coping with problems such as the increase in schema complexity, semantic inconsistency, incompleteness and scalability. This paper proposes a selective graph navigation mechanism based on a distributional relational semantic model which can be applied to querying and reasoning over heterogeneous KBs. The approach can be used for approximative reasoning, querying and associational knowledge discovery. In this paper we focus on commonsense reasoning as the main motivational scenario for the approach. The approach addresses the following problems: (i) providing a semantic selection mechanism for facts which are relevant and meaningful in a specific reasoning and querying context and (ii) coping with information incompleteness in large KBs. The approach is evaluated using ConceptNet as a commonsense KB, and achieved high selectivity, high scalability and high accuracy in the selection of meaningful navigational paths. Distributional semantics is also used as a principled mechanism to cope with information incompleteness.

New Horizons for a Data-Driven Economy | 2016

Big Data Curation

André Freitas; Edward Curry

With the emergence of data environments with growing data variety and volume, organizations need to be supported by processes and technologies that allow them to produce and maintain high-quality data facilitating data reuse, accessibility, and analysis. In contemporary data management environments, data curation infrastructures have a key role in addressing the common challenges found across many different data production and consumption environments. Recent changes in the scale of the data landscape bring major changes and new demands to data curation processes and technologies. This chapter investigates how the emerging big data landscape is defining new requirements for data curation infrastructures and how curation infrastructures are evolving to meet these challenges. Different dimensions of scaling-up data curation for big data are described, including emerging technologies, economic models, incentive models, social aspects, and supporting standards. This analysis is grounded by literature research, interviews with domain experts, surveys, and case studies and provides an overview of the state-of-the-art, future requirements and emerging trends in the field.
