José Augusto Baranauskas
University of São Paulo
Publication
Featured research published by José Augusto Baranauskas.
Biochemistry and Molecular Biology Education | 2005
Francisco A. Leone; José Augusto Baranauskas; Rosa Prazeres Melo Furriel; Ivana Aparecida Borin
SigrafW is Windows‐compatible software developed using the Microsoft® Visual Basic Studio program that uses the simplified Hill equation for fitting kinetic data from allosteric and Michaelian enzymes. SigrafW uses a modified Fibonacci search to calculate maximal velocity (V), the Hill coefficient (n), and the enzyme‐substrate apparent dissociation constant (K). The estimation of V, K, and the sum of the squares of residuals is performed using a Wilkinson nonlinear regression at any Hill coefficient (n). In contrast to many currently available kinetic analysis programs, SigrafW shows several advantages for the determination of kinetic parameters of both hyperbolic and nonhyperbolic saturation curves. No initial estimates of the kinetic parameters are required, a measure of the goodness‐of‐the‐fit for each calculation performed is provided, the nonlinear regression used for calculations eliminates the statistical bias inherent in linear transformations, and the software can be used for enzyme kinetic simulations either for educational or research purposes.
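As a rough illustration of fitting V, K and n without initial estimates, the sketch below fits a Hill curve to synthetic data. It rests on several assumptions that the abstract does not confirm: the rate law is taken as v = V·S^n / (K^n + S^n), a plain grid search over n and K stands in for SigrafW's modified Fibonacci search and Wilkinson regression, and the names `hill` and `fit_hill` are hypothetical.

```python
def hill(s, v_max, k, n):
    """Hill rate law in the assumed form v = V * s^n / (K^n + s^n)."""
    return v_max * s**n / (k**n + s**n)


def fit_hill(substrate, velocity, n_grid, k_grid):
    """Return (ssr, v_max, k, n) with the smallest sum of squared residuals.

    Assumption: an exhaustive grid over n and K replaces the search
    strategy used by SigrafW. Substrate values must be positive.
    """
    best = None
    for n in n_grid:
        for k in k_grid:
            # Saturation fraction for each substrate concentration.
            f = [s**n / (k**n + s**n) for s in substrate]
            # For fixed n and K the model is linear in V, so the
            # least-squares V has a closed form.
            v_max = sum(v * fi for v, fi in zip(velocity, f)) / sum(fi * fi for fi in f)
            ssr = sum((v - v_max * fi) ** 2 for v, fi in zip(velocity, f))
            if best is None or ssr < best[0]:
                best = (ssr, v_max, k, n)
    return best


# Synthetic, noise-free data generated from known parameters.
substrate = [0.5, 1.0, 2.0, 4.0, 8.0, 16.0]
velocity = [hill(s, 10.0, 3.0, 2.0) for s in substrate]

n_grid = [i / 10 for i in range(5, 41)]    # n from 0.5 to 4.0
k_grid = [i / 10 for i in range(1, 101)]   # K from 0.1 to 10.0

ssr, v_max, k, n = fit_hill(substrate, velocity, n_grid, k_grid)
print(f"V = {v_max:.3f}, K = {k:.1f}, n = {n:.1f}, SSR = {ssr:.2e}")
```

The sum of squared residuals returned with the parameters plays the role of the goodness-of-fit measure mentioned above. Because the fit works on the untransformed velocities, it avoids the bias of linearized plots, although a grid search is far slower than a dedicated one-dimensional search.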
Applied and Environmental Microbiology | 2013
Felipe Lira; Pedro Santoro Perez; José Augusto Baranauskas; Sérgio Ricardo Nozawa
Antimicrobial resistance is a persistent problem in the public health sphere. However, recent attempts to find effective substitutes to combat infections have been directed at identifying natural antimicrobial peptides in order to circumvent resistance to commercial antibiotics. This study describes the development of synthetic peptides with antimicrobial activity, created in silico by site-directed mutation modeling using wild-type peptides as scaffolds for these mutations. Fragments of antimicrobial peptides were used for modeling with molecular modeling computational tools. To analyze these peptides, a decision tree model was created that indicates the range of microorganism types on which each peptide can exercise biological activity. The decision tree model was built using physicochemical properties of known antimicrobial peptides available at the Antimicrobial Peptide Database (APD). The two most promising peptides were synthesized, and antimicrobial assays showed inhibitory activity against Gram-positive and Gram-negative bacteria. Colossomin C and colossomin D were the most inhibitory peptides at 5 μg/ml against Staphylococcus aureus and Escherichia coli. The methods described in this work and the results obtained are useful for the identification and development of new compounds with antimicrobial activity through the use of computational tools.
Journal of Medical Systems | 2012
Juliana Tarossi Pollettini; Sylvia R. G. Panico; Julio Cesar Daneluzzi; Renato Tinós; José Augusto Baranauskas; Alessandra Alaniz Macedo
Surveillance Levels (SLs) are categories for medical patients (used in Brazil) that represent different types of medical recommendations. SLs are defined according to risk factors and the medical and developmental history of patients. Each SL is associated with specific educational and clinical measures. The objective of the present paper was to verify computer-aided, automatic assignment of SLs. The present paper proposes a computer-aided approach for automatic recommendation of SLs. The approach is based on the classification of information from patient electronic records. For this purpose, a software architecture composed of three layers was developed. The architecture is formed by a classification layer that includes a linguistic module and machine learning classification modules. The classification layer allows for the use of different classification methods, including the use of preprocessed, normalized language data drawn from the linguistic module. We report the verification and validation of the software architecture in a Brazilian pediatric healthcare institution. The results indicate that selection of attributes can have a great effect on the performance of the system. Nonetheless, our automatic recommendation of surveillance level can still benefit from improvements in processing procedures when the linguistic module is applied prior to classification. Results from our efforts can be applied to different types of medical systems. The results of systems supported by the framework presented in this paper may be used by healthcare and governmental institutions to improve healthcare services in terms of establishing preventive measures and alerting authorities about the possibility of an epidemic.
Knowledge-Based Systems | 2003
José Augusto Baranauskas; Maria Carolina Monard
Classification algorithms for large databases have many practical applications in data mining. Whenever a dataset is too large for a particular learning algorithm to be applied, sampling can be used to scale up classifiers to massive datasets. One general approach associated with sampling is the construction of ensembles. Although benefits in accuracy can be obtained from the use of ensembles, one problem is their interpretability. This has motivated our work on trying to use the benefits of combining symbolic classifiers, while still keeping the symbolic component in the learning system. This idea has been implemented in the XRULER system. We describe the XRULER system, as well as experiments performed to evaluate it on 10 datasets. The results show that it is possible to combine symbolic classifiers into a final symbolic classifier with an increase in accuracy and a decrease in the number of final rules.
Journal of Biomedical Informatics | 2015
Erica Akemi Tanaka; Sérgio Ricardo Nozawa; Alessandra Alaniz Macedo; José Augusto Baranauskas
Many classification problems, especially in the field of bioinformatics, associate each instance with more than one class; these are known as multi-label classification problems. In this study, we propose a new adaptation of the Binary Relevance algorithm that takes into account possible relations among labels and focuses on the interpretability of the model, not only on its performance. Experiments were conducted to compare the performance of our approach against others commonly found in the literature, applied to functional genomic datasets. The experimental results show that our proposal has a performance comparable to that of other methods and that, at the same time, it provides an interpretable model of the multi-label problem.
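For readers unfamiliar with Binary Relevance, the sketch below shows the plain baseline only: one independent binary classifier per label. It does not implement the label-relation adaptation proposed in the paper. The base learner is a single-feature threshold rule chosen here as an easily readable stand-in, and all function names and the toy data are hypothetical.

```python
def train_stump(X, y):
    """Fit a one-feature threshold rule that minimizes training errors.

    Returns (feature_index, threshold, polarity): the rule predicts
    `polarity` when row[feature_index] >= threshold, else 1 - polarity.
    """
    best = None
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            for polarity in (1, 0):
                errors = sum(
                    (polarity if row[j] >= t else 1 - polarity) != label
                    for row, label in zip(X, y)
                )
                if best is None or errors < best[0]:
                    best = (errors, j, t, polarity)
    return best[1:]


def predict_stump(stump, row):
    j, t, polarity = stump
    return polarity if row[j] >= t else 1 - polarity


def train_binary_relevance(X, Y):
    """Train one independent binary model per label column of Y."""
    n_labels = len(Y[0])
    return [train_stump(X, [labels[k] for labels in Y]) for k in range(n_labels)]


def predict_binary_relevance(models, row):
    """Predict every label separately and collect the results."""
    return [predict_stump(m, row) for m in models]


# Toy data (hypothetical): two features, two labels per instance.
X = [[1.0, 5.0], [2.0, 4.0], [3.0, 1.0], [4.0, 0.5]]
Y = [[0, 1], [0, 1], [1, 0], [1, 0]]

models = train_binary_relevance(X, Y)
pred_a = predict_binary_relevance(models, [3.5, 0.8])
pred_b = predict_binary_relevance(models, [1.5, 4.5])
print(models, pred_a, pred_b)
```

Each per-label model is a single readable rule, which hints at why interpretable base learners matter in this setting. The baseline ignores any dependence between labels, which is the gap the adaptation described above is meant to address.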
Knowledge-Based Systems | 2016
Pedro Santoro Perez; Sérgio Ricardo Nozawa; Alessandra Alaniz Macedo; José Augusto Baranauskas
We propose several improvements for the windowing algorithm. We evaluated model performance, interpretability, and stability. Our methodology focuses on the interpretability of the model. Our approach shows differences in terms of interpretability, without harming performance. Our approach may yield better classification models. Decision tree induction searches for relevant characteristics in the data that allow it to precisely model a certain concept, but it also aims for a comprehensible model that helps human specialists discover new knowledge, something very important in the medical and biological areas. On the other hand, such inducers present some instability. The main problem handled here is the behavior of those inducers on high-dimensional data, more specifically gene expression data: irrelevant attributes may harm the learning process and many models with similar performance may be generated. To address those problems, we have explored and revised windowing with the following changes: pruning of the trees generated during intermediate steps of the algorithm; the use of the estimated error instead of the training error; the use of the error weighted according to the size of the current window; and the use of the classification confidence as the window update criterion. The results show that the proposed algorithm outperforms the classical one, especially considering measures of complexity and comprehensibility of the induced models.
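The classical windowing loop that these revisions start from can be sketched as follows: train on a small window, test on the remaining examples, add the misclassified ones to the window, and repeat. This is a minimal sketch of the classical scheme only; none of the four proposed changes is implemented. A scalar threshold rule stands in for the decision tree inducer, and every name and the toy data are hypothetical.

```python
import random


def train_threshold(examples):
    """Stand-in learner: best single threshold on a scalar feature.

    Returns (threshold, polarity); predicts `polarity` when x >= threshold.
    """
    best = None
    for t in sorted({x for x, _ in examples}):
        for polarity in (1, 0):
            errors = sum(
                (polarity if x >= t else 1 - polarity) != y for x, y in examples
            )
            if best is None or errors < best[0]:
                best = (errors, t, polarity)
    return best[1], best[2]


def predict_threshold(model, x):
    t, polarity = model
    return polarity if x >= t else 1 - polarity


def windowing(examples, initial_size, seed=0):
    """Classical windowing: grow the window with misclassified examples."""
    rng = random.Random(seed)
    indices = list(range(len(examples)))
    window = set(rng.sample(indices, initial_size))
    # The window grows by at least one example per round, so
    # len(examples) + 1 rounds are always enough to terminate.
    for _ in range(len(examples) + 1):
        model = train_threshold([examples[i] for i in sorted(window)])
        missed = [
            i for i in indices
            if i not in window
            and predict_threshold(model, examples[i][0]) != examples[i][1]
        ]
        if not missed:
            break
        window.update(missed)
    return model, len(window)


# Toy data (hypothetical): label is 1 when the feature is at least 3.7.
examples = [(x / 10, 1 if x >= 37 else 0) for x in range(100)]
model, window_size = windowing(examples, initial_size=5)
print(model, window_size)
```

In this classical form the update criterion is the training error on examples outside the window. The revisions listed above replace that criterion with estimated, size-weighted error or classification confidence, and prune the intermediate trees.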
BMC Medical Genomics | 2014
Juliana Tarossi Pollettini; José Augusto Baranauskas; Evandro Eduardo Seron Ruiz; Maria da Graça Campos Pimentel; Alessandra Alaniz Macedo
Background: Research on genomic medicine has suggested that the exposure of patients to early life risk factors may induce the development of chronic diseases in adulthood, as the presence of premature risk factors can influence gene expression. The large number of scientific papers published in this research area makes it difficult for the healthcare professional to keep up with individual results and to establish associations between them. Therefore, in our work we aim at building a computational system that will offer an innovative approach that alerts health professionals about human development problems such as cardiovascular disease, obesity and type 2 diabetes. Methods: We built a computational system called Chronic Illness Surveillance System (CISS), which retrieves scientific studies that establish associations (conceptual relationships) between chronic diseases (cardiovascular diseases, diabetes and obesity) and the risk factors described in clinical records. To evaluate our approach, we submitted ten queries to CISS as well as to three other search engines (Google™, Google Scholar™ and PubMed®). The queries were composed of terms and expressions from a list of risk factors provided by specialists. Results: CISS retrieved a higher number of closely related (+) and somewhat related (+/-) documents, and a smaller number of unrelated (-) and almost unrelated (-/+) documents, in comparison with the three other systems. The results of Friedman's test carried out with the post-hoc Holm procedure (95% confidence) for our system (control) versus the results for the three other engines indicate that our system had the best performance in three of the categories: (+), (-) and (+/-). This is an important result, since these are the most relevant categories for our users. Conclusion: Our system should be able to assist researchers and health professionals in finding relationships between potential risk factors and chronic diseases in scientific papers.
Computational Science and Engineering | 2009
Juliana Tarossi Pollettini; Flávia P. Nicolas; Sylvia R. G. Panico; Julio Cesar Daneluzzi; Renato Tinós; José Augusto Baranauskas; Alessandra Alaniz Macedo
Surveillance Level (SL) is an approach to monitoring human development and promoting health in primary care. SL supports patient classification through analysis of his/her risk and protective factors. Considering SLs, health care professionals are able to adopt specific educative, therapeutic or specialized therapeutic actions. This paper proposes a software architecture for the manipulation of patient information that aims to suggest his/her SL. The architecture contains a classification layer composed of linguistic and classification software modules. The classification modules implement three different classification methods and the linguistic module exploits medical thesauri. The instantiated architecture was evaluated in experiments in pediatric health care. These experiments showed the feasibility of our proposal and indicated better performance of automatic definition of SL when linguistic processing occurs before classification.
Applied Intelligence | 2018
José Augusto Baranauskas; Oscar Picchi Netto; Sérgio Ricardo Nozawa; Alessandra Alaniz Macedo
This paper presents an improved version of a decision tree-based filter algorithm for attribute selection. This algorithm can be seen as a pre-processing step for the induction algorithms used in machine learning and data mining tasks. The filter was evaluated on thirty medical datasets considering its execution time, data compression ability and AUC (Area Under the ROC Curve) performance. On average, our filter was faster than Relief-F but slower than both CFS and Gain Ratio. However, for low-density (high-dimensional) datasets, our approach selected fewer than 2% of all attributes without degrading performance in a subsequent evaluation based on five different machine learning algorithms.
Methods of Information in Medicine | 2016
Alessandra Alaniz Macedo; E. E. S. Ruiz; José Augusto Baranauskas
BACKGROUND: In 2003, the University of São Paulo established the first Biomedical Informatics (BMI) undergraduate course in Brazil. Our mission is to provide undergraduate students with formal education on the fundamentals of BMI and its applied methods. This undergraduate course offers theoretical aspects, practical knowledge and scientifically oriented skills in the area of BMI, enabling students to contribute to research and methodical development in BMI. Course coordinators, professors and students frequently evaluate the BMI course and the curriculum to ensure that alumni receive quality higher education. OBJECTIVES: This study investigates (i) the main job activities undertaken by USP BMI graduates, (ii) the subjects that are fundamentally important for graduates to pursue a career in BMI, and (iii) the course quality perceived by the alumni. METHODS: A structured questionnaire was used to survey all the BMI graduates who received their Bachelor's degree before July 2015 (attempted n = 205). RESULTS: One hundred and forty-five graduates (71%) answered the questionnaire. Nine out of ten of our former students currently work as informaticians. Seventy-six graduates (52%) work within the biomedical informatics field. Fifty-five graduates (38%) work outside the biomedical informatics field, but in other IT areas. Ten graduates (7%) do not work with BMI or any other informatics activities, and four (3%) are presently unemployed. Among the 145 surveyed BMI graduates, 46 (32%) and seven (5%) hold a Master's degree and a PhD degree, respectively. Database Systems, Software Engineering, Introduction to Computer Science, Object-Oriented Programming, and Data Structures are regarded as the most important subjects of the course. The majority of the graduates (105 or 72%) are satisfied with the BMI education and training they received during the undergraduate course. CONCLUSIONS: More than half of the graduates from our BMI course work in their primary education area. Besides technical adequacy, the diverse job profiles and the high level of satisfaction of our graduates indicate the importance of undergraduate courses specialized in the BMI domain.