Inês de Castro Dutra

Publication

Featured research published by Inês de Castro Dutra.

european conference on machine learning | 2005

An integrated approach to learning bayesian networks of rules

Jesse Davis; Elizabeth S. Burnside; Inês de Castro Dutra; David C. Page; Vítor Santos Costa

Inductive Logic Programming (ILP) is a popular approach for learning rules for classification tasks. An important question is how to combine the individual rules to obtain a useful classifier. In some instances, converting each learned rule into a binary feature for a Bayes net learner improves the accuracy compared to the standard decision list approach [3,4,14]. This results in a two-step process, where rules are generated in the first phase, and the classifier is learned in the second phase. We propose an algorithm that interleaves the two steps, by incrementally building a Bayes net during rule learning. Each candidate rule is introduced into the network, and scored by whether it improves the performance of the classifier. We call the algorithm SAYU for Score As You Use. We evaluate two structure learning algorithms: Naive Bayes and Tree Augmented Naive Bayes. We test SAYU on four different datasets and see a significant improvement in two out of the four applications. Furthermore, the theories that SAYU learns tend to consist of far fewer rules than the theories in the two-step approach.
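The interleaving idea in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: candidate rules are assumed to be plain Python predicates over dictionary examples (no ILP search), the classifier is a hand-written Naive Bayes with Laplace smoothing, and accuracy on a tuning set stands in for "the performance of the classifier". The names `train_nb`, `predict_nb`, `accuracy` and `sayu_like` are hypothetical.

```python
import math

def train_nb(rules, data):
    """Count how often each rule fires per class (labels are 0 or 1)."""
    counts = {0: 0, 1: 0}
    fires = {0: [0] * len(rules), 1: [0] * len(rules)}
    for x, y in data:
        counts[y] += 1
        for i, rule in enumerate(rules):
            if rule(x):
                fires[y][i] += 1
    return counts, fires

def predict_nb(model, rules, x):
    """Return the class with the highest Laplace-smoothed log-posterior."""
    counts, fires = model
    total = counts[0] + counts[1]
    best, best_lp = 0, float("-inf")
    for y in (0, 1):
        lp = math.log((counts[y] + 1) / (total + 2))
        for i, rule in enumerate(rules):
            p = (fires[y][i] + 1) / (counts[y] + 2)
            lp += math.log(p if rule(x) else 1 - p)
        if lp > best_lp:
            best, best_lp = y, lp
    return best

def accuracy(rules, train, tune):
    """Assumed scoring function: accuracy on a held-out tuning set."""
    model = train_nb(rules, train)
    return sum(predict_nb(model, rules, x) == y for x, y in tune) / len(tune)

def sayu_like(candidates, train, tune):
    """Keep a candidate rule only if adding it improves the classifier score."""
    kept = []
    best = accuracy(kept, train, tune)
    for rule in candidates:
        score = accuracy(kept + [rule], train, tune)
        if score > best:
            kept.append(rule)
            best = score
    return kept, best
```

Because a rule is retained only when it raises the score of the classifier that already uses the previously kept rules, redundant or uninformative rules are discarded as they are proposed, which is consistent with the smaller theories the abstract reports.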

european conference on machine learning | 2005

Mode directed path finding

Irene M. Ong; Inês de Castro Dutra; David C. Page; Vítor Santos Costa

Learning from multi-relational domains has gained increasing attention over the past few years. Inductive logic programming (ILP) systems, which often rely on hill-climbing heuristics in learning first-order concepts, have been a dominating force in the area of multi-relational concept learning. However, hill-climbing heuristics are susceptible to local maxima and plateaus. In this paper, we show how we can exploit the links between objects in multi-relational data to help a first-order rule learning system direct the search by explicitly traversing these links to find paths between variables of interest. Our contributions are twofold: (i) we extend the pathfinding algorithm by Richards and Mooney [12] to make use of mode declarations, which specify the mode of call (input or output) for predicate variables, and (ii) we apply our extended path finding algorithm to saturated bottom clauses, which anchor one end of the search space, allowing us to make use of background knowledge used to build the saturated clause to further direct search. Experimental results on a medium-sized dataset show that path finding allows one to consider interesting clauses that would not easily be found by Aleph.
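As a rough illustration of mode-directed traversal, the sketch below treats each literal of a clause as a directed link from its input variables to its output variables and runs a breadth-first search between two variables of interest. The representation of literals as `(name, input_vars, output_vars)` tuples and the function name `find_path` are assumptions made for this example; the extended algorithm described in the abstract works on saturated bottom clauses and is considerably richer.

```python
from collections import deque

def find_path(literals, start, goal):
    """Return the literal names on a shortest mode-respecting path, or None.

    literals: iterable of (name, input_vars, output_vars) tuples.
    A literal can be traversed only from one of its input variables
    to one of its output variables, mirroring mode declarations.
    """
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        var, path = queue.popleft()
        if var == goal:
            return path
        for name, ins, outs in literals:
            if var in ins:
                for out in outs:
                    if out not in seen:
                        seen.add(out)
                        queue.append((out, path + [name]))
    return None
```

Respecting input and output modes prunes links that could never be called in that direction, so fewer paths need to be explored than in an undirected search over shared variables.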

international conference of the ieee engineering in medicine and biology society | 2011

DigiScope — Unobtrusive collection and annotating of auscultations in real hospital environments

Daniel Pereira; Fábio de Lima Hedayioglu; Ricardo Correia; Tiago H. Silva; Inês de Castro Dutra; Fernando Gomes de Almeida; Sandra da Silva Mattos; Miguel Tavares Coimbra

Digital stethoscopes are medical devices that can collect, store and sometimes transmit acoustic auscultation signals in a digital format. These can then be replayed, sent to a colleague for a second opinion, studied in detail after an auscultation, used for training or, as we envision it, used as a cheap, powerful tool for screening cardiac pathologies. In this work, we present the design, development and deployment of a prototype for collecting and annotating auscultation signals within real hospital environments. Our main objective is not only to pave the way for future unobtrusive systems for cardiac pathology screening but, more immediately, to create a repository of annotated auscultation signals for biomedical signal processing and machine learning research. The presented prototype revolves around a digital stethoscope that can stream the collected audio signal to a nearby tablet PC. Interaction with this system is based on two models: a data collection model adequate for the uncontrolled hospital environments of both emergency room and primary care, and a data annotation model for offline metadata input. A specific data model was created for the repository. The prototype has been deployed and is currently being tested in two hospitals, one in Portugal and one in Brazil.

european conference on parallel processing | 2003

Toward Automatic Management of Embarrassingly Parallel Applications

Inês de Castro Dutra; David C. Page; Vítor Santos Costa; Jude W. Shavlik; Michael Waddell

Large-scale applications that require executing very large numbers of tasks are only feasible through parallelism. In this work we present a system that automatically handles large numbers of experiments and data in the context of machine learning. Our system controls all experiments, including re-submission of failed jobs, and relies on available resource managers to spawn jobs through pools of machines. Our results show that we can manage a very large number of experiments, using a reasonable amount of idle CPU cycles, with very little user intervention.
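The re-submission behaviour mentioned above can be sketched as follows. This is only an illustration under stated assumptions: a local thread pool from Python's standard library stands in for the external resource managers, jobs are zero-argument callables, and the function name `run_with_resubmission` and the `max_attempts` limit are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

def run_with_resubmission(jobs, workers=4, max_attempts=3):
    """Run independent jobs, resubmitting any that raise an exception.

    jobs: dict mapping a job name to a zero-argument callable.
    Returns (results, failed_names, attempts_per_job).
    """
    results = {}
    attempts = {name: 0 for name in jobs}
    pending = dict(jobs)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        while pending:
            # Submit the whole current round, then collect outcomes.
            futures = {name: pool.submit(fn) for name, fn in pending.items()}
            next_round = {}
            for name, future in futures.items():
                attempts[name] += 1
                try:
                    results[name] = future.result()
                except Exception:
                    # Queue the job again unless it has used all attempts.
                    if attempts[name] < max_attempts:
                        next_round[name] = pending[name]
            pending = next_round
    failed = [name for name in jobs if name not in results]
    return results, failed, attempts
```

Since the tasks are independent, a failed job can be retried without affecting the others, and the caller only needs to inspect the list of jobs that exhausted their attempts.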

bioinformatics and biomedicine | 2012

Extracting BI-RADS features from Portuguese clinical texts

Houssam Nassif; Filipe Cunha; Inês Moreira; Ricardo Cruz-Correia; Eliana Sousa; David C. Page; Elizabeth S. Burnside; Inês de Castro Dutra

In this work we build the first BI-RADS parser for Portuguese free texts, modeled after existing approaches to extract BI-RADS features from English medical records. Our concept finder uses a semantic grammar based on the BI-RADS lexicon and on iterative transferred expert knowledge. We compare the performance of our algorithm to manual annotation by a specialist in mammography. Our results show that our parser's performance is comparable to the manual method.

Theory and Practice of Logic Programming | 2010

Threads and or-parallelism unified

Vítor Santos Costa; Inês de Castro Dutra; Ricardo Rocha

One of the main advantages of Logic Programming (LP) is that it provides an excellent framework for the parallel execution of programs. In this work we investigate novel techniques to efficiently exploit parallelism from real-world applications in low-cost multi-core architectures. To achieve these goals, we revive and redesign the YapOr system to exploit or-parallelism based on a multi-threaded implementation. Our new approach takes full advantage of the state-of-the-art fast and optimized YAP Prolog engine and shares the underlying execution environment, scheduler and most of the data structures used to support YapOr's model. Initial experiments with our new approach consistently achieve almost linear speedups for most of the applications, proving itself as a good alternative for exploiting implicit parallelism in the currently available low-cost multi-core architectures.

Proceedings of the 1st international doctoral symposium on Middleware | 2004

Application partitioning and hierarchical management in grid environments

Patrícia Kayser Vargas; Inês de Castro Dutra; Cláudio Fernando Resin Geyer

Several works on grid computing have been proposed in recent years. However, most of them, including available software, cannot deal properly with some issues related to the control of applications that spread a very large number of tasks across the grid network. This work presents a step toward solving the problem of controlling such applications. We propose and discuss an architectural model called GRAND (Grid Robust ApplicatioN Deployment) based on partitioning and hierarchical submission and control of such applications. The main contribution of our model is to be able to control the execution of a huge number of distributed tasks while preserving data locality and reducing the load of the submit machines. We propose a taxonomy to classify application models to run in grid environments and partitioning methods. We also present our application description language GRID-ADL.
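One simple way to picture partitioning that preserves data locality is to group tasks by the input file they read and cap the size of each group, so that the submit machine hands over groups rather than individual tasks. The sketch below is an assumption-laden illustration, not the GRAND model or GRID-ADL: the `(task_id, input_file)` task format, the `max_group_size` parameter and the function name `partition_by_input` are all hypothetical.

```python
from collections import defaultdict

def partition_by_input(tasks, max_group_size):
    """Group tasks that share an input file, splitting oversized groups.

    tasks: iterable of (task_id, input_file) pairs.
    Returns a list of (input_file, [task_ids]) groups, ordered by file name.
    """
    by_input = defaultdict(list)
    for task_id, input_file in tasks:
        by_input[input_file].append(task_id)

    groups = []
    for input_file in sorted(by_input):
        ids = by_input[input_file]
        # Split each locality group into chunks of at most max_group_size.
        for start in range(0, len(ids), max_group_size):
            groups.append((input_file, ids[start:start + max_group_size]))
    return groups
```

Each group could then be delegated to a lower-level manager, which is one way to keep the number of jobs tracked by the submit machine proportional to the number of groups instead of the number of tasks.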

data mining in bioinformatics | 2015

Predicting malignancy from mammography findings and image-guided core biopsies

Pedro Ferreira; Nuno A. Fonseca; Inês de Castro Dutra; Ryan W. Woods; Elizabeth S. Burnside

The main goal of this work is to produce machine learning models that predict the outcome of a mammography from a reduced set of annotated mammography findings. In the study we used a dataset consisting of 348 consecutive breast masses that underwent image-guided core biopsy performed between October 2005 and December 2007 on 328 female subjects. We applied various algorithms with parameter variation to learn from the data. The tasks were to predict mass density and to predict malignancy. The best classifier that predicts mass density is based on a support vector machine and has an accuracy of 81.3%. The expert correctly annotated 70% of the mass densities. The best classifier that predicts malignancy is also based on a support vector machine and has an accuracy of 85.6%, with a positive predictive value of 85%. One important contribution of this work is that our model can predict malignancy in the absence of the mass density attribute, since we can fill in this attribute using our mass density predictor.
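For readers less familiar with the two metrics quoted above, the following sketch shows how accuracy and positive predictive value are computed from confusion-matrix counts. The function names are hypothetical, and any counts passed in are illustrative only; the abstract does not report the underlying confusion matrix.

```python
def accuracy(tp, fp, tn, fn):
    """Fraction of all cases classified correctly: (TP + TN) / N."""
    return (tp + tn) / (tp + fp + tn + fn)

def positive_predictive_value(tp, fp):
    """Fraction of predicted positives that are truly positive: TP / (TP + FP)."""
    return tp / (tp + fp)
```

With made-up counts of 8 true positives, 2 false positives, 7 true negatives and 3 false negatives, the accuracy is 15/20 and the positive predictive value is 8/10.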

Workshop on Logic Programming | 2013

A Datalog Engine for GPUs

Carlos Alberto Martinez-Angeles; Inês de Castro Dutra; Vítor Santos Costa; Jorge Buenabad-Chávez

We present the design and evaluation of a Datalog engine for execution in Graphics Processing Units (GPUs). The engine evaluates recursive and non-recursive Datalog queries using a bottom-up approach based on typical relational operators. It includes a memory management scheme that automatically swaps data between memory in the host platform (a multicore) and memory in the GPU in order to reduce the number of memory transfers. To evaluate the performance of the engine, four Datalog queries were run on the engine and on a single CPU in the multicore host. One query runs up to 200 times faster on the (GPU) engine than on the CPU.
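Bottom-up evaluation with relational operators can be illustrated on the classic recursive query `path(X,Y) :- edge(X,Y). path(X,Z) :- path(X,Y), edge(Y,Z).` The sketch below runs on the CPU in plain Python and uses semi-naive iteration (only newly derived tuples are joined in each round); it makes no claim about how the GPU engine implements its operators or memory management. The function names `join` and `transitive_closure` are hypothetical.

```python
def join(r, s):
    """Join r(x, y) with s(y, z) on y and project the result to (x, z)."""
    index = {}
    for y, z in s:
        index.setdefault(y, set()).add(z)
    return {(x, z) for x, y in r for z in index.get(y, ())}

def transitive_closure(edge):
    """Evaluate the recursive path query bottom-up until a fixpoint is reached."""
    path = set(edge)    # facts derived by the non-recursive rule
    delta = set(edge)   # facts that are new in the current round
    while delta:
        new = join(delta, edge) - path
        path |= new
        delta = new
    return path
```

The loop stops when a round derives no new tuples, which is the fixpoint condition for bottom-up evaluation; it also terminates on cyclic graphs because already known tuples are subtracted before the next round.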

Concurrency and Computation: Practice and Experience | 2007

GRAND: toward scalability in a Grid environment

Patrícia Kayser Vargas; Inês de Castro Dutra; Vinícius Dalto do Nascimento; Lucas A. S. Santos; Luciano Cavalheiro da Silva; Cláudio Fernando Resin Geyer; Bruno Schulze

One of the challenges in Grid computing research is to provide a means to automatically submit, manage, and monitor applications whose main characteristic is to be composed of a large number of tasks. The large number of explicit tasks, generally placed on a centralized job queue, can cause several problems: (1) they can quickly exhaust the memory of the submission machine; (2) they can degrade the response time of the submission machine because they demand too many open ports to manage the remote execution of each task; (3) they may cause network traffic congestion if all tasks try to transfer input and/or output files across the network at the same time; (4) they make it impossible for the user to follow execution progress without an automatic tool or interface; (5) they may depend on fault‐tolerance mechanisms implemented at application level to ensure that all tasks terminate successfully. In this work we present and validate a novel architectural model, GRAND (Grid Robust ApplicatioN Deployment), whose main objective is to deal with the submission of a large number of tasks.
