Pedro Henriques Abreu
University of Coimbra
Publication
Featured research published by Pedro Henriques Abreu.
Computers in Biology and Medicine | 2015
Pedro J. García-Laencina; Pedro Henriques Abreu; Miguel Henriques Abreu; Noémia Afonso
Breast cancer is the most frequently diagnosed cancer in women. Using historical patient information stored in clinical datasets, data mining and machine learning approaches can be applied to predict the survival of breast cancer patients. A common drawback is the absence of information, i.e., missing data, in certain clinical trials. Most standard prediction methods cannot handle incomplete samples, so missing data imputation is widely applied to work around this limitation. Taking into account the characteristics of each breast cancer dataset, a detailed analysis is therefore required to determine the most appropriate imputation and prediction methods in each clinical environment. This research work analyzes a real breast cancer dataset from the Portuguese Institute of Oncology of Porto with a high percentage of unknown categorical information (most clinical data of the patients are incomplete), which is a challenge in terms of complexity. Four scenarios are evaluated: (I) 5-year survival prediction without imputation and 5-year survival prediction from a cleaned dataset with (II) Mode imputation, (III) Expectation-Maximization imputation and (IV) K-Nearest Neighbors imputation. Prediction models for breast cancer survivability are constructed using four different methods: K-Nearest Neighbors, Classification Trees, Logistic Regression and Support Vector Machines. Experiments are performed in a nested ten-fold cross-validation procedure. The K-Nearest Neighbors algorithm provides the best results: an accuracy above 81% and an area under the Receiver Operating Characteristic curve above 0.78, which are very good results in this complex scenario.
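As a rough illustration of scenario (II) followed by a K-Nearest Neighbors prediction, the sketch below fills missing categorical values with each column's mode and then classifies a new record by majority vote. It is a minimal sketch, not the study's pipeline: the toy records, the feature names, the use of `None` for missing values, and the Hamming distance are all assumptions made for the example.

```python
from collections import Counter

def mode_impute(rows):
    """Replace None in each column with that column's most frequent observed value."""
    n_cols = len(rows[0])
    modes = []
    for j in range(n_cols):
        observed = [r[j] for r in rows if r[j] is not None]
        modes.append(Counter(observed).most_common(1)[0][0])
    return [[modes[j] if r[j] is None else r[j] for j in range(n_cols)] for r in rows]

def hamming(a, b):
    """Number of features on which two categorical records disagree (assumed metric)."""
    return sum(x != y for x, y in zip(a, b))

def knn_predict(train_X, train_y, query, k=3):
    """Majority vote among the k training records closest to the query."""
    ranked = sorted(range(len(train_X)), key=lambda i: hamming(train_X[i], query))
    votes = Counter(train_y[i] for i in ranked[:k])
    return votes.most_common(1)[0][0]

# Made-up records (not from the study): [tumour grade, receptor status]
X = [["G2", "pos"], ["G2", None], [None, "pos"],
     ["G3", "neg"], ["G3", "neg"], ["G2", "pos"]]
y = [1, 1, 1, 0, 0, 1]  # hypothetical 5-year survival labels

X_imputed = mode_impute(X)
prediction = knn_predict(X_imputed, y, ["G3", "neg"], k=3)
```

Mode imputation is the simplest of the three imputation strategies named in the abstract; Expectation-Maximization and K-Nearest Neighbors imputation would replace only the `mode_impute` step.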
soft computing | 2012
Pedro Henriques Abreu; José Moura; Daniel Castro Silva; Luís Paulo Reis; Júlio Garganta
In soccer, as in business, results are often the best indicator of a team’s performance in a given competition, but they are insufficient for a coach to assess the team's performance. As a consequence, measurement tools play an important role in this field. In this research work, a performance tool for soccer based only on Cartesian coordinates is presented. Capable of calculating final game statistics, such as the number of shots, the calculation methodology analyzes the game in a sequential manner, starting with the identification of the kick event (the basis for detecting all events), which is related to a positive variation in the ball’s velocity vector. The achieved results were quite satisfactory, mainly due to the number of successfully detected events in the validation process (based on manual annotation). For the majority of the statistics, detection rates are above 92%, and only in the case of shots do they drop to between 74 and 85%. In the future, this methodology could be improved, especially regarding the shot statistics, integrated with a real-time localization system, or expanded to other collective sports, such as hockey or basketball.
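The kick-detection idea (a positive variation in the ball's velocity) can be sketched from Cartesian coordinates alone. The example below is illustrative only: the sampling interval `dt`, the threshold `min_speed_gain`, and the ball track are assumed values, and the published method may use a different criterion on the velocity vector.

```python
import math

def detect_kicks(positions, dt=0.1, min_speed_gain=0.5):
    """Return sample indices where the ball's speed rises by more than min_speed_gain."""
    # Speed of each segment between consecutive (x, y) samples.
    speeds = [
        math.dist(positions[i], positions[i + 1]) / dt
        for i in range(len(positions) - 1)
    ]
    kicks = []
    for i in range(1, len(speeds)):
        if speeds[i] - speeds[i - 1] > min_speed_gain:
            kicks.append(i)  # sample at which the faster segment starts
    return kicks

# Made-up ball track: at rest, kicked at sample 2, then slowing down.
track = [(0.0, 0.0), (0.0, 0.0), (0.0, 0.0), (1.0, 0.0), (1.8, 0.0), (2.4, 0.0)]
kicks = detect_kicks(track)
```

Higher-level events such as passes or shots would then be derived from the detected kicks and the players' positions, which this sketch does not attempt.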
Engineering Applications of Artificial Intelligence | 2014
José J.C. Teixeira Dias; Penousal Machado; Daniel Castro Silva; Pedro Henriques Abreu
With an ever increasing number of vehicles traveling the roads, traffic problems such as congestion and increased travel times have become a hot topic in the research community, and several approaches have been proposed to improve the performance of traffic networks. This paper introduces the Inverted Ant Colony Optimization (IACO) algorithm, a variation of the classic Ant Colony algorithm that inverts its logic by converting the attraction of ants towards pheromones into a repulsion effect. IACO is then used in a decentralized traffic management system, where drivers become ants that deposit pheromones on the followed paths; they are then repelled by the pheromone scent, thus avoiding congested roads and distributing the traffic through the network. Using SUMO (Simulation of Urban MObility), several experiments were conducted to compare the effects of using IACO with a shortest time algorithm in artificial and real world scenarios - using the map of a real city, and corresponding traffic data. The effect of the behavior caused by this algorithm is a decrease in traffic density on widely used roads, leading to improvements in the traffic network at a local and global level, decreasing trip time for drivers that adhere to the suggestions made by IACO as well as for those who do not. Considering different degrees of adhesion to the algorithm, IACO has significant advantages over the shortest time algorithm, improving overall network performance by decreasing trip times for both IACO-compliant vehicles (up to 84%) and remaining vehicles (up to 71%). Thus, it benefits individual drivers, promoting the adoption of IACO, and also the global road network. Furthermore, fuel consumption and CO2 emissions from both vehicle types decrease significantly when using IACO (up to 49%).
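The inversion at the heart of IACO can be sketched as a route-choice rule in which roads carrying more pheromone become less likely to be chosen. The weighting formula, the parameters `alpha`, `deposit` and `evaporation`, and the road names below are assumptions for illustration; the abstract does not give the paper's actual equations.

```python
def repulsion_probabilities(pheromone, alpha=1.0):
    """Map pheromone levels to choice probabilities that fall as pheromone rises."""
    # Assumed form: weight = 1 / (1 + pheromone)^alpha; the paper's formula may differ.
    weights = {road: 1.0 / (1.0 + level) ** alpha for road, level in pheromone.items()}
    total = sum(weights.values())
    return {road: w / total for road, w in weights.items()}

def update_pheromone(pheromone, chosen_road, deposit=1.0, evaporation=0.1):
    """Evaporate all trails, then deposit pheromone on the road just taken."""
    updated = {road: level * (1.0 - evaporation) for road, level in pheromone.items()}
    updated[chosen_road] += deposit
    return updated

# Hypothetical junction with one busy road and one empty road.
pheromone = {"main_avenue": 3.0, "side_street": 0.0}
probs = repulsion_probabilities(pheromone)
after = update_pheromone(pheromone, "side_street")
```

In a classic ant colony the weight would grow with the pheromone level; here it shrinks, so traffic spreads away from roads that many vehicles have recently used.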
The Breast | 2016
Miguel Henriques Abreu; Noemia Afonso; Pedro Henriques Abreu; Francisco Menezes; Paula A. Lopes; Rui Henrique; Deolinda Pereira; Carlos Lopes
PURPOSE: Male Breast Cancer (MBC) remains a poorly understood disease. Prognostic factors are not well established and specific prognostic subgroups are warranted. PATIENTS/METHODS: Retrospective review of 111 cases treated in the same Cancer Center. Blinded central pathological review with immunohistochemical (IHQ) analysis for estrogen (ER), progesterone (PR) and androgen (AR) receptors, HER2, ki67 and p53 was done. A Cox regression model was used for uni/multivariate survival analysis. Two classifications of Female Breast Cancer (FBC) subgroups (one based on ER, PR and HER2, the 2000 classification, and one based on ER, PR, HER2 and ki67, the 2013 classification) were used to assess their prognostic value in MBC patients. Hierarchical clustering was performed to define subgroups based on the six-IHQ panel. RESULTS: According to the FBC classifications, the majority of tumors were luminal: A (89.2%; 60.0%) and B (7.2%; 35.8%). Triple negative phenotype was infrequent (2.7%; 3.2%) and HER2 enriched, non-luminal, was rare (≤1% in both). In multivariate analysis the poor prognostic factors were: size >2 cm (HR:1.8; 95%CI:1.0-3.4 years, p = 0.049), absence of ER (HR:4.9; 95%CI:1.7-14.3 years, p = 0.004) and presence of distant metastasis (HR:5.3; 95%CI:2.2-3.1 years, p < 0.001). FBC subtypes were independent prognostic factors (p = 0.009, p = 0.046), but when only luminal groups were analyzed, prognosis did not differ regardless of the classification used (p > 0.20). Clustering defined different subgroups that have prognostic value in multivariate analysis (p = 0.005), with better survival in the ER/PR+, AR-, HER2- and ki67/p53 low group (median: 11.5 years; 95%CI: 6.2-16.8 years) and worse survival in the PR- group (median: 4.5 years; 95%CI: 1.6-7.8 years). CONCLUSION: FBC subtypes do not give the same prognostic information in MBC, even in luminal groups. Two subgroups with distinct prognosis were identified in a common six-IHQ panel. Future studies must assess their real prognostic value in these patients.
International Journal of Systematic and Evolutionary Microbiology | 2014
Gabriel Paiva; Pedro Henriques Abreu; Diogo Neves Proença; Susana Santos; M. F. Nobre; Paula V. Morais
Bacterial strain M47C3B(T) was isolated from the endophytic microbial community of a Pinus pinaster tree branch from a mixed grove of pines. Phylogenetic analysis of 16S rRNA gene sequences showed that this organism represented one distinct branch within the family Sphingobacteriaceae, most closely related to the genus Mucilaginibacter. Strain M47C3B(T) formed a distinct lineage, closely related to Mucilaginibacter dorajii KACC 14556(T), with which it shared 97.2% 16S rRNA gene sequence similarity. The other members of the genus Mucilaginibacter included in the same clade were Mucilaginibacter lappiensis ATCC BAA-1855(T) sharing 97.0% similarity and Mucilaginibacter composti TR6-03(T) that had a lower similarity (95.7%). The novel strain was Gram-staining-negative, formed rod-shaped cells, grew optimally at 26 °C and at pH 7, and was able to grow with up to 0.3% (w/v) NaCl. The respiratory quinone was menaquinone 7 (MK-7) and the major fatty acids of the strain were summed feature 3 (C16 : 1ω7c/iso-C15 : 0 2-OH), iso-C15 : 0 and iso-C17 : 0 3-OH, representing 73.5% of the total fatty acids. The major components of the polar lipid profile of strain M47C3B(T) consisted of phosphatidylethanolamine, three unidentified aminophospholipids, one unidentified aminolipid and three unidentified polar lipids. The G+C content of the DNA was 40.6 mol%. On the basis of the phylogenetic analysis and physiological and biochemical characteristics we propose the name Mucilaginibacter pineti sp. nov. for the novel species represented by strain M47C3B(T) ( = CIP 110632(T) = LMG 28160(T)).
Journal of Biomedical Informatics | 2015
Miriam Seoane Santos; Pedro Henriques Abreu; Pedro J. García-Laencina; Adélia Simão; Armando Carvalho
Liver cancer is the sixth most frequently diagnosed cancer and, particularly, Hepatocellular Carcinoma (HCC) represents more than 90% of primary liver cancers. Clinicians assess each patient's treatment on the basis of evidence-based medicine, which may not always apply to a specific patient, given the biological variability among individuals. Over the years, and for the particular case of Hepatocellular Carcinoma, some research studies have been developing strategies for assisting clinicians in decision making, using computational methods (e.g. machine learning techniques) to extract knowledge from the clinical data. However, these studies have some limitations that have not yet been addressed: some do not focus entirely on Hepatocellular Carcinoma patients, others have strict application boundaries, and none considers the heterogeneity between patients or the presence of missing data, a common drawback in healthcare contexts. In this work, a real complex Hepatocellular Carcinoma database composed of heterogeneous clinical features is studied. We propose a new cluster-based oversampling approach robust to small and imbalanced datasets, which accounts for the heterogeneity of patients with Hepatocellular Carcinoma. The preprocessing procedures of this work are based on data imputation considering appropriate distance metrics for both heterogeneous and missing data (HEOM) and clustering studies to assess the underlying patient groups in the studied dataset (K-means). The final approach is applied in order to diminish the impact of underlying patient profiles with reduced sizes on survival prediction. It is based on K-means clustering and the SMOTE algorithm to build a representative dataset and use it as a training set for different machine learning procedures (logistic regression and neural networks).
The results are evaluated in terms of survival prediction and compared across baseline approaches that do not consider clustering and/or oversampling using the Friedman rank test. Our proposed methodology coupled with neural networks outperformed all others, suggesting an improvement over the classical approaches currently used in Hepatocellular Carcinoma prediction models.
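The HEOM distance mentioned above (Heterogeneous Euclidean-Overlap Metric) combines range-normalised differences for numeric features, an overlap distance for categorical features, and a maximal distance of 1 whenever a value is missing. The sketch below follows that standard definition; the record layout, the use of `None` for missing values, and the feature ranges are assumptions for the example, and ranges are assumed to be non-zero.

```python
import math

def heom(x, y, numeric_ranges):
    """HEOM distance between two records; None marks a missing value.

    numeric_ranges maps the index of each numeric feature to (max - min) over
    the dataset; features not listed are treated as categorical.
    """
    total = 0.0
    for j, (a, b) in enumerate(zip(x, y)):
        if a is None or b is None:
            d = 1.0                             # missing values count as maximally distant
        elif j in numeric_ranges:
            d = abs(a - b) / numeric_ranges[j]  # range-normalised difference
        else:
            d = 0.0 if a == b else 1.0          # overlap distance for categorical features
        total += d * d
    return math.sqrt(total)

# Made-up records: [age, sex, lab marker]; ranges are hypothetical.
p1 = [50, "M", None]
p2 = [70, "M", 3.2]
distance = heom(p1, p2, numeric_ranges={0: 40.0, 2: 10.0})
```

Because the metric is defined for incomplete, mixed-type records, it can drive a nearest-neighbour imputation step before clustering and oversampling are applied.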
ACM Computing Surveys | 2016
Pedro Henriques Abreu; Miriam Seoane Santos; Miguel Henriques Abreu; Bruno Andrade; Daniel Castro Silva
Background: Recurrence is an important cornerstone in breast cancer behavior, intrinsically related to mortality. In spite of its relevance, it is rarely recorded in the majority of breast cancer datasets, which makes research into its prediction more difficult. Objectives: To evaluate the performance of machine learning techniques applied to the prediction of breast cancer recurrence. Material and Methods: Review of published works that used machine learning techniques in local and open source databases between 1997 and 2014. Results: The review showed that it is difficult to obtain a representative dataset for breast cancer recurrence and there is no consensus on the best set of predictors for this disease. High accuracy is often achieved, but at the expense of sensitivity. The missing data and class imbalance problems are rarely addressed, and most often the chosen performance metrics are inappropriate for the context. Discussion and Conclusions: Although different techniques have been used, prediction of breast cancer recurrence is still an open problem. The combination of different machine learning techniques, along with the definition of standard predictors for breast cancer recurrence, seems to be the main future direction for obtaining better results.
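The point about accuracy hiding poor sensitivity under class imbalance can be seen in a small worked example. The cohort below is invented for illustration (10 recurrences among 100 patients) and does not come from any reviewed study.

```python
def accuracy_and_sensitivity(y_true, y_pred, positive=1):
    """Return (accuracy, sensitivity) for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    correct = sum(t == p for t, p in pairs)
    accuracy = correct / len(pairs)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, sensitivity

# A classifier that always predicts "no recurrence" on an imbalanced cohort.
y_true = [1] * 10 + [0] * 90
y_pred = [0] * 100
acc, sens = accuracy_and_sensitivity(y_true, y_pred)
```

Here the accuracy is 90% although no recurrence is detected at all, which is why metrics such as sensitivity should be reported alongside accuracy for this kind of problem.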
Archive | 2014
Pedro Henriques Abreu; Hugo Amaro; Daniel Castro Silva; Penousal Machado; Miguel Henriques Abreu; Noemia Afonso; António Dourado
Breast Cancer is the most common type of cancer in women worldwide. In spite of this fact, there are insufficient studies that, using data mining techniques, are capable of helping medical doctors in their daily practice.
The Scientific World Journal | 2014
Pedro Henriques Abreu; J. Xavier; Daniel Castro Silva; Luís Paulo Reis; Marcelo Petry
Nowadays, there are many technologies that support location systems, involving intrusive and nonintrusive equipment and varying in terms of precision, range, and cost. However, developers sometimes neglect the noise introduced by these systems, which prevents them from reaching their full potential. Focused on this problem, this research work compares three different filters for reducing the noise introduced by a location system based on RFID UWB technology with an associated error of approximately 18 cm. To achieve this goal, a set of experiments was devised and executed using a miniature train moving at constant velocity in a scenario with two distinct shapes: linear and oval. The train was equipped with a varying number of active tags. The obtained results showed that the Kalman Filter achieved better results than the other two filters. This filter increases the performance of the location system by 15% and 12% for the linear and oval paths respectively, when using one tag. For multiple tags and the oval shape, similar results were obtained (11–13% improvement).
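To show how a Kalman filter reduces measurement noise, the sketch below applies a scalar filter to noisy position readings. It is a deliberately simplified illustration: it tracks a stationary tag rather than a train moving at constant velocity, the readings are made up, and treating the 18 cm error as the measurement standard deviation is an assumption.

```python
def kalman_1d(measurements, meas_var, process_var=0.0, x0=0.0, p0=1e6):
    """Scalar Kalman filter for a slowly varying quantity; returns the filtered estimates."""
    x, p = x0, p0                      # state estimate and its variance
    estimates = []
    for z in measurements:
        p = p + process_var            # predict: state assumed unchanged, uncertainty grows
        k = p / (p + meas_var)         # Kalman gain
        x = x + k * (z - x)            # update with the measurement residual
        p = (1.0 - k) * p              # posterior variance
        estimates.append(x)
    return estimates

# Made-up readings (metres) of a tag that is actually at 2.00 m.
readings = [2.15, 1.82, 2.10, 1.95, 1.98]
filtered = kalman_1d(readings, meas_var=0.18 ** 2)
```

With no process noise and a very uncertain initial state, the estimate converges towards the running mean of the readings. A moving target would need a state vector holding both position and velocity, with a non-zero `process_var`.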
soft computing | 2013
Fernando Almeida; Pedro Henriques Abreu; Nuno Lau; Luís Paulo Reis
Soccer is a competitive and collective sport in which teammates try to combine the execution of basic actions (cooperative behavior) to lead their team to more advantageous situations. The ability to recognize, extract and reproduce such behaviors can prove useful for improving the performance of a team in future matches. This work describes a methodology for achieving just that, which makes use of a plan definition language to abstract the representation of relevant behaviors in order to promote their reuse. Experiments were conducted based on a set of game log files generated by the Soccer Server simulator, which supports the RoboCup 2D simulated robotic soccer league. The effectiveness of the proposed approach was verified by focusing primarily on the analysis of behaviors which started from set-pieces and led to the scoring of goals while ball possession was kept. One of the results obtained showed that a significant part of the total goals scored was based on this type of behavior, demonstrating the potential of conducting this analysis. Other results allowed us to assess the complexity of these behaviors and infer meaningful guidelines to consider when defining plans from scratch. Some possible extensions to this work include assessing which plans have the ability to maximize the creation of goal opportunities by countering the opposing team’s strategy and how the effectiveness of plans can be improved using optimization techniques.