Gabriel David | Researchain

Publication

Featured research published by Gabriel David.

international symposium on wikis and open collaboration | 2005

WikiWiki weaving heterogeneous software artifacts

Ademar Aguiar; Gabriel David

Good documentation benefits every software development project, especially large ones, but it can be hard, costly, and tiresome to produce when not supported by appropriate tools and methods. The documentation of a software system uses different artifacts, namely source code, for low-level internal documentation, and specific-purpose models and documents, for higher-level external documentation (e.g. requirements documents, use-case specifications, design notebooks, and reference manuals). All these artifacts require continual review and modification throughout the life-cycle to preserve their consistency and value. Good software documents are often heterogeneous, i.e., they combine different kinds of contents (text, code, models, images) gathered from separate software artifacts, a combination usually difficult to maintain as the system evolves over time, considering that source code, models and documents are typically produced and maintained separately in multiple sources using different environments and editors. This paper presents a wiki that helps to quickly weave different kinds of contents into a single heterogeneous document, whilst preserving its semantic consistency. The fundamental goal of this wiki (XSDoc Wiki) is to reduce the development-documentation gap by making documentation more convenient and attractive to developers. An example taken from the JUnit framework documentation helps to illustrate the features most relevant to such weaving.

international symposium on wikis and open collaboration | 2008

WikiChanges: exposing Wikipedia revision activity

Sérgio Nunes; Cristina Ribeiro; Gabriel David

Wikis are popular tools commonly used to support distributed collaborative work. Wikis can be seen as virtual scrap-books that anyone can edit without having any specific technical know-how. Wikipedia is a flagship example of a real-world application of wikis. Due to the large scale of Wikipedia, it is difficult to grasp much of the information that is stored in this wiki. We address one particular aspect of this issue by looking at the revision history of each article. By plotting the revision activity in a timeline, we expose the complete article's history in an easily understandable format. We present WikiChanges, a web-based application designed to plot an article's revision timeline in real time. WikiChanges also includes a web browser extension that incorporates activity sparklines in the real Wikipedia. Finally, we introduce a revisions summarization task that addresses the need to understand what occurred during a given set of revisions. We present a first approach to this task using tag clouds to present the revisions made.

web information and data management | 2007

Using neighbors to date web documents

Sérgio Nunes; Cristina Ribeiro; Gabriel David

Time has been successfully used as a feature in web information retrieval tasks. In this context, estimating a document's inception date or last update date is a necessary task. Classic approaches have used HTTP header fields to estimate a document's last update time. The main problem with this approach is that it is applicable only to a small part of web documents. In this work, we evaluate an alternative strategy based on a document's neighborhood. Using a random sample containing 10,000 URLs from the Yahoo! Directory, we study each document's links and media assets to determine its age. If we only consider isolated documents, we are able to date 52% of them. Including the document's neighborhood, we are able to estimate the date of more than 86% of the same sample. Also, we find that estimates differ significantly according to the type of neighbors used. The most reliable estimates are based on the document's media assets, while the worst estimates are based on incoming links. These results are experimentally evaluated with a real-world application using different datasets.
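The fallback idea in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the neighbor categories, the priority order (media assets first, incoming links last, loosely following the reliability ranking reported above), and the use of a lower median within each category are all assumptions made for illustration.

```python
from email.utils import parsedate_to_datetime

# Assumed preference order, loosely following the abstract's finding that
# media assets are the most reliable neighbors and incoming links the least.
NEIGHBOR_PRIORITY = ("media_assets", "outgoing_links", "incoming_links")


def parse_http_date(value):
    """Parse an HTTP Last-Modified value; return None if missing or invalid."""
    if not value:
        return None
    try:
        return parsedate_to_datetime(value)
    except (TypeError, ValueError):
        # Older Python versions raise TypeError, newer ones ValueError.
        return None


def estimate_date(own_last_modified, neighbors):
    """Return (estimated_date, source) for a document.

    own_last_modified: the document's own Last-Modified header, or None.
    neighbors: dict mapping a neighbor kind to a list of Last-Modified values
               (hypothetical input format).
    """
    own = parse_http_date(own_last_modified)
    if own is not None:
        return own, "document"
    for kind in NEIGHBOR_PRIORITY:
        parsed = (parse_http_date(v) for v in neighbors.get(kind, []))
        dates = sorted(d for d in parsed if d is not None)
        if dates:
            # Lower median of the usable neighbor dates (an assumption).
            return dates[(len(dates) - 1) // 2], kind
    return None, None


neighbors = {
    "media_assets": [
        "Mon, 01 Jan 2007 00:00:00 GMT",
        "Wed, 03 Jan 2007 00:00:00 GMT",
        "not a date",
    ],
    "incoming_links": ["Fri, 05 Jan 2007 00:00:00 GMT"],
}
date, source = estimate_date(None, neighbors)
print(date, source)
```

A document that exposes its own usable header is dated directly; the neighborhood is consulted only when that header is missing or unparseable.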

conference on image and video retrieval | 2006

Multidimensional descriptor indexing: exploring the bitmatrix

Catalin Calistru; Cristina Ribeiro; Gabriel David

Multimedia retrieval brings new challenges, mainly derived from the mismatch between the level of the user interaction (high-level concepts) and that of the automatically processed descriptors (low-level features). The effective use of the low-level descriptors is therefore mandatory. Many data structures have been proposed for managing the representation of multidimensional descriptors, each geared toward efficiency in some set of basic operations. The paper introduces a highly parametrizable structure called the BitMatrix, along with its search algorithms. The BitMatrix is compared with existing methods, all implemented in a common framework. The tests have been performed on two datasets, with parameters covering significant ranges of values. The BitMatrix has proved to be a robust and flexible structure that can compete with other methods for multidimensional descriptor indexing.

multimedia signal processing | 2004

A multimedia database workbench for content and context retrieval

Cristina Ribeiro; Gabriel David; Catalin Calistru

Multimedia databases are extending the scope of traditional databases to handle the complex structure of multimedia objects. Models for multimedia information must include representations for the structure and content of several media in a form that allows flexibility in retrieval. Content-based retrieval is the main motivation behind recent research in multimedia databases. The task of searching in video and audio content is made hard by the nature of audiovisual data where, unlike text, there is no direct syntactic channel between the object and its meaning [(R. Zhao and W. I. Grosky, 2002), (J. Chen et al., 2004), (M. Bertini et al., 2003)]. We propose a model for multimedia content storage and retrieval accounting for both context and content information and taking advantage of their dependencies for effective retrieval. We then describe a prototype multimedia database with a retrieval interface. It has been used as a workbench for testing the representation model and integrating tools for feature extraction, information interchange and retrieval. The workbench allows an easy inclusion of new tools for content analysis and new methods for context- and content-based retrieval while offering storage and access for both the actual digital content and its metadata.

Journal of the Association for Information Science and Technology | 2011

Term weighting based on document revision history

Sérgio Nunes; Cristina Ribeiro; Gabriel David

In real-world information retrieval systems, the underlying document collection is rarely stable or definitive. This work is focused on the study of signals extracted from the content of documents at different points in time for the purpose of weighting individual terms in a document. The basic idea behind our proposals is that terms that have existed for a longer time in a document should have a greater weight. We propose 4 term weighting functions that use each document's history to estimate a current term score. To evaluate this thesis, we conduct 3 independent experiments using a collection of documents sampled from Wikipedia. In the first experiment, we use data from Wikipedia to judge each set of terms. In a second experiment, we use an external collection of tags from a popular social bookmarking service as a gold standard. In the third experiment, we crowdsource user judgments to collect feedback on term preference. Across all experiments, results consistently support our thesis. We show that temporally aware measures, specifically the proposed revision term frequency and revision term frequency span, outperform a term-weighting measure based on raw term frequency alone.
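The basic idea, that terms which have persisted longer in a document deserve more weight, can be sketched as follows. The abstract does not define its four weighting functions, so the measure below (the fraction of revisions in which a term appears) is a hypothetical stand-in and not the paper's revision term frequency; the whitespace tokenizer and function names are likewise assumptions.

```python
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenizer (a simplifying assumption)."""
    return text.lower().split()


def revision_presence_weights(revisions):
    """Weight each term of the latest revision by the fraction of
    revisions in which it appears (hypothetical illustrative measure).

    revisions: list of revision texts, oldest first.
    """
    if not revisions:
        return {}
    presence = Counter()
    for text in revisions:
        # Count each term at most once per revision.
        presence.update(set(tokenize(text)))
    current_terms = set(tokenize(revisions[-1]))
    total = len(revisions)
    return {term: presence[term] / total for term in current_terms}


def raw_term_frequency(text):
    """Baseline: term counts in a single version of the document."""
    return Counter(tokenize(text))


revisions = [
    "wiki software",
    "wiki software documentation",
    "wiki software documentation tools tools",
]
weights = revision_presence_weights(revisions)
baseline = raw_term_frequency(revisions[-1])
for term in sorted(weights, key=weights.get, reverse=True):
    print(term, round(weights[term], 2), baseline[term])
```

In this toy history, "tools" has the highest raw term frequency in the latest revision but the lowest history-based weight, because it was added only in the final revision.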

Transactions on pattern languages of programming II | 2011

Patterns for effectively documenting frameworks

Ademar Aguiar; Gabriel David

Good design and implementation are necessary but not sufficient prerequisites for successfully reusing object-oriented frameworks. Although not always recognized, good documentation is crucial for effective framework reuse, and often hard, costly, and tiresome, coming with many issues, especially when we are not aware of the key problems and respective ways of addressing them. Based on existing literature, case studies and lessons learned, the authors have been mining proven solutions to recurrent problems of documenting object-oriented frameworks, and writing them in pattern form, as patterns are a very effective way of communicating expertise and best practices. This paper presents a small set of patterns addressing problems related to the framework documentation itself, here seen as an autonomous and tangible product independent of the process used to create it. The patterns aim at helping nonexperts document object-oriented frameworks cost-effectively. Concretely, these patterns provide guidance on choosing the kinds of documents to produce, how to relate them, and which contents to include. Although the focus is more on the documents themselves than on the process and tools to produce them, some guidelines are also presented in the paper to help apply the patterns to a specific framework.

international conference on data engineering | 2007

An Evaluation Framework for Multidimensional Multimedia Descriptor Indexing

B. Gonçalves; Cristina Ribeiro; Catalin Calistru; Gabriel David

Automatic multimedia retrieval requires the use of complex features, which are typically captured by multidimensional descriptors. A basic operation in a multimedia retrieval system is similarity computation, making use of descriptor-dependent metrics. Many data structures have been proposed for managing the representation of multidimensional descriptors, each geared towards efficiency in some set of basic operations. The paper describes a framework for evaluating multidimensional descriptor indexing structures and reports a set of experiments with selected descriptor indexing methods. The extensibility of the framework is illustrated by incorporating a recently proposed structure, the BitMatrix. Data sets and experiment conditions can be set up so as to provide results that can be used in the choice of appropriate indexing structures for a class of multimedia retrieval applications.

international conference on asian digital libraries | 2007

Multimedia in cultural heritage collections: a model and applications

Cristina Ribeiro; Gabriel David; Catalin Calistru

The paper presents a multimedia database model accounting for the representation of documents, collections and the associated metadata. Appropriate structures are provided for descriptive metadata and for metadata resulting from automatic content analysis. The model is based on the identification and unification of the main concepts in the archival standards and the audiovisual area. The main features of the model, designed to support multimedia database applications, are the integration of descriptive and content analysis metadata, the association of metadata to collections as well as to items, the extensibility with respect to the inclusion of new descriptors and the support for several retrieval modes. The MetaMedia application development platform, based on the model, has been used to support the construction of a historic documentation collection where a common web interface provides collection administrators, metadata creators and visitors with a multi-faceted view of the repository.

international conference on computational science and its applications | 2006

Higher education web information system usage analysis with a data webhouse

Carla Lopes; Gabriel David

Usage analysis of a Web Information System is a valuable help to predict user needs, to assess the system's impact and to guide its improvement. This is usually done by analysing clickstreams, a low-level approach involving huge amounts of data that calls for data warehouse techniques. This paper presents a dimensional model to monitor user behaviour in Higher Education Web Information Systems and an architecture for the extraction, transformation and load process. These have been applied in the development of a data warehouse to monitor the use of SIGARRA, the University of Porto's Higher Education Web Information System. The efficiency and effectiveness of this monitoring method were confirmed by the knowledge extracted from a 3-month analysis period. The main results and recommendations are also briefly described.