José João Almeida | Researchain

Publication

Featured research published by José João Almeida.

processing of the portuguese language | 2012

Dicionário-Aberto: a source of resources for the portuguese language processing

Alberto Simões; Álvaro Iriarte Sanromán; José João Almeida

In this paper we describe how Dicionário-Aberto, an online dictionary for the Portuguese language, is being used as the base for constructing diverse resources relevant to the processing of the Portuguese language. We briefly present its history, explaining how we got here. We then describe the resources already available to download and use, followed by a discussion of the resources currently being developed.

symposium on languages applications and technologies | 2012

Probabilistic synSet based concept location

Nuno Ramos Carvalho; José João Almeida; Maria João Varanda Pereira; Pedro Rangel Henriques

Concept location is a common task in program comprehension techniques, essential in many approaches used for software care and software evolution. An important goal of this process is to discover a mapping between source code and human-oriented concepts. Although programs are written in a strict and formal language, natural language terms and sentences, such as identifiers (variable or function names), constant strings, or comments, can still be found embedded in programs. Using terminology concepts and natural language processing techniques, these terms can be exploited to discover clues about which real-world concepts the source code is addressing. This work extends the symbol tables built by compilers with ontology-driven constructs, and extends the synonym sets defined by linguistics with Probabilistic SynSets created automatically from software-domain parallel corpora. Using a relational algebra, it then creates semantic bridges between program elements and human-oriented concepts to enhance concept location tasks. 1998 ACM Subject Classification: D.2.5 Testing and Debugging: code inspections and walkthroughs.

Markup Languages | 1999

SGML documents: where does quality go?

Jorge Gustavo Rocha; José João Almeida; Pedro Rangel Henriques

Quality control in electronic publications should be one of the major concerns of everyone who is managing a project. Big projects, like digital libraries, try to gather information from a series of different sources: libraries, museums, universities, and other scientific or cultural organizations. Collecting and treating information from several different sources raises very interesting problems, one being the assurance of quality. Quality in electronic publications can be reflected in several forms, from the visual aspects of the interface, to linguistic and literary aspects, to the correctness of data. With SGML we can solve part of the problem: structural/syntactic correctness. SGML provides a clean way to specify the structure of documents while keeping a complete separation between structure (syntax) and typesetting. Today there are many editors and environments that can assist the user in producing well-formed and valid SGML documents (validating their structure). However, current software still gives the user too much freedom. The user has full control of the data being introduced, which creates a margin for errors. In this context there are situations where pre-conditions over the information being introduced should be enforced in order to prevent the user from introducing erroneous data; we shall call this process data semantics validation. The idea is to constrain the values of some structural elements of a document according to its final purpose. This way the user (who writes the documents according to that DTD) will not have full control of the data; the user will be forced to obey certain domain range limitations or certain information relationships. SGML does not have the necessary constructs to implement this extra validation task. In this paper we present and discuss ways of associating a constraint language with the SGML model, and we present the steps towards the implementation of that language. In the end, we present a new SGML authoring and processing model which has an extra validation task: semantic validation. Throughout the paper we show some case studies whose quality could be improved with this new working scheme.
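The idea of data semantics validation described above can be illustrated with a small sketch. It is not the constraint language proposed in the paper: it assumes XML input (parsed with Python's standard library) instead of SGML, and the element names `person`, `birth`, and `death`, the year range, and the function names are all hypothetical. It shows one domain range limitation and one information relationship being checked after structural parsing succeeds.

```python
import xml.etree.ElementTree as ET


def in_range(low, high):
    """Build a check that accepts only integer text within [low, high]."""
    def check(text):
        return text.isdigit() and low <= int(text) <= high
    return check


# Hypothetical domain range limitations, keyed by element name.
VALUE_CONSTRAINTS = {
    "birth": in_range(1000, 2100),
    "death": in_range(1000, 2100),
}


def semantic_errors(xml_text):
    """Return a list of semantic violations for a structurally valid document."""
    root = ET.fromstring(xml_text)
    errors = []

    # Domain range limitations: each constrained element is checked on its own.
    for elem in root.iter():
        check = VALUE_CONSTRAINTS.get(elem.tag)
        value = (elem.text or "").strip()
        if check is not None and not check(value):
            errors.append(f"{elem.tag}: value {value!r} out of range")

    # Information relationship: a birth year must not come after a death year.
    for person in root.iter("person"):
        birth = person.findtext("birth", "").strip()
        death = person.findtext("death", "").strip()
        if birth.isdigit() and death.isdigit() and int(birth) > int(death):
            errors.append("person: birth is after death")

    return errors
```

Both example records below would pass a purely structural check, yet each one breaks a semantic constraint: `semantic_errors("<people><person><birth>1950</birth><death>1900</death></person><person><birth>95</birth></person></people>")` reports one range violation and one relationship violation.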

algebraic methodology and software technology | 1997

CAMILA: Prototyping and Refinement of Constructive Specifications

José João Almeida; Luís Soares Barbosa; F. L. Neves; José Nuno Fonseca Oliveira

This paper accompanies the demonstration of Camila, an experimental platform for formal software development, rooted in the tradition of constructive specification methods. The Camila approach is an attempt to make available at software development level the basic problem solving strategy one got used to from school physics - create, experiment and reason on a mathematical model. Based on a notion of formal software component, it encompasses a set-theoretic language and an inequational calculus for classification and refinement. Its kernel is a functional prototyping environment, fully connectable to external applications, equipped with a classified component repository and distribution facilities.

international conference on computational science and its applications | 2014

Conclave: Ontology-Driven Measurement of Semantic Relatedness between Source Code Elements and Problem Domain Concepts

Nuno Carvalho; José João Almeida; Pedro Rangel Henriques; Maria João Varanda Pereira

Software maintainers are often challenged with source code changes to improve software systems, or eliminate defects, in unfamiliar programs. To undertake these tasks a sufficient understanding of the system (or at least a small part of it) is required. One of the most time consuming tasks of this process is locating which parts of the code are responsible for some key functionality or feature. Feature (or concept) location techniques address this problem. This paper introduces Conclave, an environment for software analysis, and in particular the Conclave-Mapper tool that provides a feature location facility. This tool explores natural language terms used in programs (e.g. function and variable names), and using textual analysis and a collection of Natural Language Processing techniques, computes synonymous sets of terms. These sets are used to score relatedness between program elements, and search queries or problem domain concepts, producing sorted ranks of program elements that address the search criteria, or concepts. An empirical study is also discussed to evaluate the underlying feature location technique.
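The ranking step described above can be sketched roughly as follows. This is not the Conclave-Mapper implementation: the synonym table is a hypothetical hand-written stand-in for the synonym sets the tool computes, and Jaccard overlap is a swapped-in scoring function chosen only for illustration. The function names are likewise assumptions.

```python
import re

# Hypothetical synonym sets; the paper computes such sets automatically.
SYNONYMS = {
    "remove": {"remove", "delete", "erase"},
    "user": {"user", "account"},
}


def terms(identifier):
    """Split a snake_case or camelCase identifier into lowercase terms."""
    return {part.lower() for part in re.findall(r"[A-Za-z][a-z]*", identifier)}


def expand(words):
    """Add every synonym set that contains one of the given words."""
    expanded = set(words)
    for word in words:
        for synset in SYNONYMS.values():
            if word in synset:
                expanded |= synset
    return expanded


def relatedness(identifier, query):
    """Jaccard overlap between the expanded identifier terms and query terms."""
    a = expand(terms(identifier))
    b = expand({word.lower() for word in query.split()})
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def rank(identifiers, query):
    """Sort program elements from most to least related to the query."""
    return sorted(identifiers, key=lambda name: relatedness(name, query), reverse=True)
```

With this table, `rank(["parse_config", "deleteAccount", "remove_file"], "remove user")` places `deleteAccount` first even though it shares no literal term with the query, which is the effect synonym sets are meant to provide.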

Archive | 2018

Annotated Documents and Expanded CIDOC-CRM Ontology in the Automatic Construction of a Virtual Museum

Cristiana Araújo; Ricardo Giuliani Martini; Pedro Rangel Henriques; José João Almeida

The Museum of the Person (Museu da Pessoa, MP) is a virtual museum whose purpose is to exhibit life stories of common people. Its assets are composed of several interviews involving people whose stories we want to perpetuate. The museum therefore holds a heterogeneous collection of XML (eXtensible Markup Language) documents that constitute the working repository. The main idea is to automatically extract the information included in the repository in order to build the virtual museum’s exhibition rooms. The goal of this paper is to describe an architectural approach to building a system that will create the virtual rooms from the XML repository, enabling visitors to look up individual life stories and also to cross-reference information among them. We adopted the standard for museum ontologies, CIDOC-CRM (CIDOC Conceptual Reference Model), refined with the FOAF (Friend of a Friend) and DBpedia ontologies, to represent OntoMP. That ontology is intended to allow conceptual navigation over the available information. The approach discussed here is based on a TripleStore and uses SPARQL (SPARQL Protocol and RDF Query Language) to extract the information. To extract meaningful information, we built a text filter that converts the interviews into an RDF triples file that reflects the assets described by the ontology.

iberian conference on information systems and technologies | 2016

Architectural approaches to build the museum of the person

Cristiana Araújo; Pedro Rangel Henriques; Ricardo Giuliani Martini; José João Almeida

The Museum of the Person (Museu da Pessoa, MP) is a virtual museum aimed at exhibiting life stories of common people. Its assets are composed of several interviews involving people whose stories we want to perpetuate. The museum therefore holds a heterogeneous collection of XML (eXtensible Markup Language) documents that constitute the working repository. The main idea is to automatically extract the information included in the repository in order to build the web pages that realize the museum's exhibition rooms. This project started by creating a specific ontology (OntoMP) for the knowledge repository of MP. That ontology is intended to allow conceptual navigation over the available information. We will adopt the standard for museum ontologies, CIDOC-CRM (CIDOC Conceptual Reference Model), refined with FOAF, to represent OntoMP. The objective of this paper is to discuss different architectural approaches to building a system that will create the virtual rooms from the XML repository, enabling visitors to look up individual life stories and also to cross-reference information among them. The first architecture is based on a TripleStore and uses SPARQL (SPARQL Protocol and RDF Query Language) technology to extract the information, while the second proposal is based on a Relational Database and uses CaVa Generator to query the repository and build the exhibition spaces.

world conference on information systems and technologies | 2013

Open source software documentation mining for quality assessment

Nuno Ramos Carvalho; Alberto Simões; José João Almeida

Besides source code, the fundamental source of information about Open Source Software lies in documentation and other non-source-code files, such as README, INSTALL, or HowTo files, commonly available in the software ecosystem. These documents, written in natural language, provide valuable information during the software development stage, and also in later maintenance and evolution tasks.

international conference on computational science and its applications | 2013

A framework for modular and customizable software analysis

Pedro Martins; Nuno Ramos Carvalho; João Paulo Fernandes; José João Almeida; João Saraiva

This paper presents a framework for the analysis of software artifacts. We revise and propose techniques that aid in the manipulation and combination of target-language specific tools, and in handling and controlling the results of such tools. We also propose to integrate under our framework techniques that are capable of performing language independent analyses.

world conference on information systems and technologies | 2016

OntoMP, an ontology to build the museum of the person

Ricardo Giuliani Martini; Cristiana Araújo; José João Almeida; Pedro Rangel Henriques

This paper is concerned with the creation of a specific ontology for the knowledge repository of the Museum of the Person (Museu da Pessoa). The assets of the Museum of the Person are composed of several interviews (collected previously for a large cultural project) involving common people, to perpetuate their life stories. The museum holds a heterogeneous collection of XML documents. In such a format, the collection items are often neither recognizable nor understandable by the visitors who wish to explore them. Therefore, we intend to use an ontology that allows conceptual navigation over the available information, enabling visitors to extract knowledge during their visit to these life stories. This paper thus presents the ontology we have developed using the CIDOC Conceptual Reference Model (CIDOC-CRM) [1] to enable visitors to look up individual life stories, read them, and also cross-reference information among a cluster of life stories to build up the story of a company or institution or to study social behaviors and customs.
