Piotr A. Habas | Researchain

Publication

Featured research published by Piotr A. Habas.

Neural Networks | 2008

Training neural network classifiers for medical decision making: the effects of imbalanced datasets on classification performance.

Maciej A. Mazurowski; Piotr A. Habas; Jacek M. Zurada; Joseph Y. Lo; Jay A. Baker; Georgia D. Tourassi

This study investigates the effect of class imbalance in training data when developing neural network classifiers for computer-aided medical diagnosis. The investigation is performed in the presence of other characteristics that are typical among medical data, namely small training sample size, large number of features, and correlations between features. Two methods of neural network training are explored: classical backpropagation (BP) and particle swarm optimization (PSO) with clinically relevant training criteria. An experimental study is performed using simulated data and the conclusions are further validated on real clinical data for breast cancer diagnosis. The results show that classifier performance deteriorates with even modest class imbalance in the training data. Further, it is shown that BP is generally preferable to PSO for imbalanced training data, especially with a small data sample and a large number of features. Finally, it is shown that there is no clear preference between oversampling and the no-compensation approach, and some guidance is provided regarding a proper selection.
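The oversampling compensation mentioned in this abstract can be illustrated with a minimal sketch. This is not the authors' code: the function name `oversample_minority`, the 0/1 label encoding, and the choice to duplicate minority samples at random until the classes are equal in size are all assumptions made for illustration.

```python
import random


def oversample_minority(samples, labels, seed=0):
    """Randomly duplicate minority-class samples until both classes are equal in size.

    Assumes binary labels encoded as 0 and 1 (an assumption of this sketch).
    """
    rng = random.Random(seed)
    pos = [s for s, y in zip(samples, labels) if y == 1]
    neg = [s for s, y in zip(samples, labels) if y == 0]

    # Pick the smaller class as the minority (ties leave the data unchanged).
    if len(pos) < len(neg):
        minority, minority_label, majority, majority_label = pos, 1, neg, 0
    else:
        minority, minority_label, majority, majority_label = neg, 0, pos, 1
    if not minority:
        raise ValueError("both classes must be present")

    # Draw extra minority samples with replacement to close the size gap.
    extra = [rng.choice(minority) for _ in range(len(majority) - len(minority))]

    balanced = [(s, majority_label) for s in majority]
    balanced += [(s, minority_label) for s in minority + extra]
    rng.shuffle(balanced)
    return [s for s, _ in balanced], [y for _, y in balanced]


xs, ys = oversample_minority([[0.1], [0.2], [0.3], [0.4], [0.9]], [0, 0, 0, 0, 1])
print(ys.count(0), ys.count(1))
```

The "no compensation" alternative in the abstract corresponds to training on the original, imbalanced `samples` and `labels` unchanged.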

Physics in Medicine and Biology | 2008

Decision optimization of case-based computer-aided decision systems using genetic algorithms with application to mammography

Maciej A. Mazurowski; Piotr A. Habas; Jacek M. Zurada; Georgia D. Tourassi

This paper presents an optimization framework for improving case-based computer-aided decision (CB-CAD) systems. The underlying hypothesis of the study is that each example in the knowledge database of a medical decision support system has different importance in the decision making process. A new decision algorithm incorporating an importance weight for each example is proposed to account for these differences. The search for the best set of importance weights is defined as an optimization problem and a genetic algorithm is employed to solve it. The optimization process is tailored to maximize the system's performance according to clinically relevant evaluation criteria. The study was performed using a CAD system developed for the classification of regions of interest (ROIs) in mammograms as depicting masses or normal tissue. The system was constructed and evaluated using a dataset of ROIs extracted from the Digital Database for Screening Mammography (DDSM). Experimental results show that, according to receiver operating characteristic (ROC) analysis, the proposed method significantly improves the overall performance of the CAD system as well as its average specificity for high breast mass detection rates.
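To make the idea of per-example importance weights concrete, here is a hedged sketch of one possible weighted decision rule. The abstract does not give the actual formula, so the scoring rule below (the weighted share of similarity contributed by positive cases) and the name `weighted_case_score` are assumptions; the genetic algorithm that searches for the weights is not shown.

```python
def weighted_case_score(similarities, labels, weights):
    """Decision score in [0, 1] for a query case.

    similarities[i]: similarity between the query and stored case i (non-negative)
    labels[i]:       1 if stored case i depicts a mass, 0 if normal (assumed encoding)
    weights[i]:      importance weight of stored case i (non-negative)
    """
    if not (len(similarities) == len(labels) == len(weights)):
        raise ValueError("inputs must have the same length")

    total = sum(w * s for w, s in zip(weights, similarities))
    if total == 0:
        return 0.5  # no evidence either way

    positive = sum(
        w * s for w, s, y in zip(weights, similarities, labels) if y == 1
    )
    return positive / total


# Uniform weights versus weights that ignore the second stored case.
print(weighted_case_score([0.8, 0.6, 0.4], [1, 0, 1], [1.0, 1.0, 1.0]))
print(weighted_case_score([0.8, 0.6, 0.4], [1, 0, 1], [1.0, 0.0, 1.0]))
```

Under this rule, setting a weight to zero removes a stored case from the decision, so an optimizer can both rescale and effectively prune the knowledge database.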

Medical Physics | 2007

Reliability analysis framework for computer-assisted medical decision systems.

Piotr A. Habas; Jacek M. Zurada; Adel Said Elmaghraby; Georgia D. Tourassi

We present a technique that enhances computer-assisted decision (CAD) systems with the ability to assess the reliability of each individual decision they make. Reliability assessment is achieved by measuring the accuracy of a CAD system with known cases similar to the one in question. The proposed technique analyzes the feature space neighborhood of the query case to dynamically select an input-dependent set of known cases relevant to the query. This set is used to assess the local (query-specific) accuracy of the CAD system. The estimated local accuracy is utilized as a reliability measure of the CAD response to the query case. The underlying hypothesis of the study is that CAD decisions with higher reliability are more accurate. The above hypothesis was tested using a mammographic database of 1337 regions of interest (ROIs) with biopsy-proven ground truth (681 with masses, 656 with normal parenchyma). Three types of decision models, (i) a back-propagation neural network (BPNN), (ii) a generalized regression neural network (GRNN), and (iii) a support vector machine (SVM), were developed to detect masses based on eight morphological features automatically extracted from each ROI. The performance of all decision models was evaluated using Receiver Operating Characteristic (ROC) analysis. The study showed that the proposed reliability measure is a strong predictor of the CAD system's case-specific accuracy. Specifically, the ROC area index for CAD predictions with high reliability was significantly better than for those with low reliability values. This result was consistent across all decision models investigated in the study. The proposed case-specific reliability analysis technique could be used to alert the CAD user when an opinion that is unlikely to be reliable is offered. The technique can be easily deployed in the clinical environment because it is applicable to a wide range of classifiers regardless of their structure and it requires neither additional training nor building multiple decision models to assess the case-specific CAD accuracy.
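The local-accuracy idea described in this abstract can be sketched as follows. This is an illustrative simplification, not the published method: it assumes a fixed number `k` of nearest neighbors under Euclidean distance and binary (already thresholded) CAD predictions, whereas the abstract only says that the neighborhood is selected dynamically. The name `local_reliability` is hypothetical.

```python
import math


def local_reliability(query, known_features, known_labels, known_predictions, k=3):
    """Fraction of the k nearest known cases that the CAD system classified correctly.

    known_predictions holds the CAD system's binary decisions on the known cases,
    so the result is an estimate of the system's accuracy near the query.
    """
    if not 1 <= k <= len(known_features):
        raise ValueError("k must be between 1 and the number of known cases")

    # Rank known cases by Euclidean distance to the query in feature space.
    order = sorted(
        range(len(known_features)),
        key=lambda i: math.dist(query, known_features[i]),
    )
    neighbors = order[:k]

    correct = sum(1 for i in neighbors if known_predictions[i] == known_labels[i])
    return correct / k


features = [[0, 0], [0, 1], [1, 0], [5, 5], [5, 6], [6, 5]]
truth = [0, 0, 0, 1, 1, 1]
cad_output = [0, 0, 0, 0, 1, 0]  # the system is often wrong near (5, 5)

print(local_reliability([0.1, 0.1], features, truth, cad_output))
print(local_reliability([5.2, 5.2], features, truth, cad_output))
```

Because the estimate only needs the classifier's outputs on known cases, the same function works for any decision model, which matches the abstract's point that no retraining is required.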

International Symposium on Neural Networks | 2007

Impact of Low Class Prevalence on the Performance Evaluation of Neural Network Based Classifiers: Experimental Study in the Context of Computer-Assisted Medical Diagnosis

Maciej A. Mazurowski; Piotr A. Habas; Georgia D. Tourassi; Jacek M. Zurada

This paper presents an experimental study on the impact of low class prevalence on the performance of neural network based classifiers as measured using receiver operating characteristic (ROC) analysis. Two methods of dealing with the problem are investigated: oversampling and undersampling in the context of varying the class prevalence and the size of training datasets with uncorrelated and correlated features. The results show that class imbalance can significantly decrease classifier performance, especially in the case of small training datasets. Furthermore, the oversampling method is shown to be more effective than the undersampling method in compensating for class imbalance. Statistically significant differences, however, are observed only in cases with a large total number of samples and very low prevalence.

Congress on Evolutionary Computation | 2007

Case-base reduction for a computer assisted breast cancer detection system using genetic algorithms

Maciej A. Mazurowski; Piotr A. Habas; Georgia D. Tourassi; Jacek M. Zurada

A knowledge-based computer assisted decision (KB-CAD) system is a case-based reasoning system previously proposed for breast cancer detection. Although it was demonstrated to be very effective for the diagnostic problem, it was also shown to be computationally expensive due to the use of mutual information between images as a similarity measure. Here, the authors propose to alleviate this drawback by reducing the case-base size. The problem is formalized and a genetic algorithm is utilized as an optimization tool. A representation and operators appropriate for the problem are presented and discussed. A clinically relevant index, the area under the receiver operating characteristic curve, is used as a measure of the system performance during the optimization and testing stages. Experimental results show that application of the proposed method can significantly reduce the case-base size while the classification performance of the KB-CAD, in fact, increases.

Medical Imaging 2007: Computer-Aided Diagnosis | 2007

Particle swarm optimization of neural network CAD systems with clinically relevant objectives

Piotr A. Habas; Jacek M. Zurada; Adel Said Elmaghraby; Georgia D. Tourassi

Neural networks (NN) are typically developed to minimize the squared difference between the network's output and the target value for a set of training patterns, namely the mean squared error (MSE). However, lower MSE does not necessarily translate into a clinically more useful decision model. The purpose of this study was to investigate the particle swarm optimization (PSO) algorithm as an alternative way of NN optimization with clinically relevant objective functions (e.g., ROC and partial ROC area indices). The PSO algorithm was evaluated with respect to an NN-based CAD system developed to discriminate mammographic regions of interest (ROIs) that contained masses from normal regions based on 8 computer-extracted morphology-oriented features. Neural networks were represented as points (particle locations) in a D-dimensional search/optimization space where each dimension corresponded to one adaptable NN parameter. The study database of 1,337 ROIs (681 with masses, 656 normal) was split into two subsets to implement a two-fold cross-validation sampling scheme. Neural networks were optimized with the PSO algorithm and the following objective functions: (1) MSE, (2) ROC area index AUC, and (3) partial ROC area indices TPFAUC with TPF=0.90 and TPF=0.98. For comparison, performance of neural networks of the same architecture trained with the traditional backpropagation algorithm was also evaluated. Overall, the study showed that when the PSO algorithm optimized network parameters using a particular training objective, the NN test performance was superior with respect to the corresponding performance index. This was particularly true for the partial ROC area indices where statistically significant improvements were observed.
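An objective such as the ROC area index can be computed directly from network outputs, which is what a PSO fitness function would need since no gradient is required. The sketch below is an assumption-laden illustration rather than the authors' implementation: it computes the empirical (nonparametric) AUC as the fraction of positive/negative pairs ranked correctly, with ties counted as one half, and the name `roc_auc` is hypothetical. The PSO loop and the partial area indices are not shown.

```python
def roc_auc(scores, labels):
    """Empirical area under the ROC curve.

    Equals the probability that a randomly chosen positive case receives a
    higher score than a randomly chosen negative case (ties count as 0.5).
    Assumes binary labels encoded as 0 and 1.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# One positive case (score 0.3) is ranked below one negative case (score 0.8).
print(roc_auc([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0]))
```

A particle's fitness would then be `roc_auc` evaluated on the outputs of the network encoded by that particle's position; the pairwise loop is quadratic in the number of cases, which is acceptable for a dataset of this size.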

International Conference of the IEEE Engineering in Medicine and Biology Society | 2006

Probabilistic Framework for Reliability Analysis of Information-Theoretic CAD Systems in Mammography

Piotr A. Habas; Jacek M. Zurada; Adel Said Elmaghraby; Georgia D. Tourassi

The purpose of this study is to develop and evaluate a probabilistic framework for reliability analysis of information-theoretic computer-assisted detection (IT-CAD) systems in mammography. The study builds upon our previous work on a feature-based reliability analysis technique tailored to traditional CAD systems developed with a supervised learning scheme. The present study proposes a probabilistic framework to facilitate application of the reliability analysis technique for knowledge-based CAD systems that are not feature-based. The study was based on an information-theoretic CAD system developed for detection of masses in screening mammograms from the Digital Database for Screening Mammography (DDSM). The experimental results reveal that the query-specific reliability estimate provided by the proposed probabilistic framework is an accurate predictor of CAD performance for the query case. It can also be successfully applied as a basis for stratification of CAD predictions into clinically meaningful reliability groups (i.e., HIGH, MEDIUM, and LOW). Based on a leave-one-out sampling scheme and ROC analysis, the study demonstrated that the diagnostic performance of the IT-CAD is significantly higher for cases with HIGH reliability (Az=0.92±0.03) than for those stratified as MEDIUM (Az=0.84±0.02) or LOW reliability predictions (Az=0.78±0.02).

Medical Imaging 2005: Image Processing | 2005

DNA: directional neighborhood analysis for detection of breast masses in screening mammograms

Nevine H. Eltonsy; Georgia D. Tourassi; Piotr A. Habas; Adel Said Elmaghraby

We introduce a computer-assisted detection (CAD) system for the automated detection of breast masses in screening mammograms. The system targets the directional behavior of the neighborhood pixels surrounding a reference image pixel. The underlying hypothesis is that in the presence of a mass the directional properties of the breast tissue surrounding the mass should be altered. The hypothesis was tested using a database of 1,337 mammographic regions of interest (ROIs) extracted from DDSM mammograms. There were 681 ROIs containing a biopsy-proven mass centered in the ROI (340 malignant, 341 benign) and 656 ROIs depicting normal breast parenchyma. Initially, eight main directional propagations were identified and modeled given the center of the ROI as the reference pixel. Subsequently, eight novel morphological features were extracted for each direction. The features were designed to characterize the disturbance occurring in normal breast parenchyma due to the presence of a mass. Finally, the extracted features were merged using a backpropagation neural network (BPANN). The network served as a nonlinear classifier trained to determine the presence of a mass centered at the reference image pixel. The BPANN was trained and tested using a leave-one-out sampling scheme. Its performance was evaluated with Receiver Operating Characteristic (ROC) analysis. Our CAD system showed an ROC area index of Az=0.88±0.01 for discriminating mass vs. normal ROIs. Detection performance was robust for both malignant (Az=0.88±0.01) and benign masses (Az=0.87±0.01). Thus, the proposed directional neighborhood analysis (DNA) can be applied effectively to identify suspicious masses in screening mammograms.

International Symposium on Neural Networks | 2007

Stacked Generalization in Computer-Assisted Decision Systems: Empirical Comparison of Data Handling Schemes

Georgia D. Tourassi; Jonathan L. Jesneck; Maciej A. Mazurowski; Piotr A. Habas

Computer-assisted decision (CAD) systems are becoming increasingly popular for the diagnostic interpretation of radiologic images. These CAD systems often involve the stacked generalization of several different decision models. Combining decision models is a common meta-analysis strategy to improve upon the diagnostic performance of each individual model. This study investigates how different data handling schemes may affect the performance evaluation of CAD systems that rely on stacked generalization. The study is based on a multistage CAD system for the detection of masses in screening mammograms. The CAD system consists of a series of knowledge-based modules that operate at Level 0 capturing morphological as well as multiscale textural information. Then, the knowledge-based predictions are combined with a Level 1 classifier. The study shows that a leave-one-out sampling scheme appears to be an effective and relatively unbiased strategy for the estimation of the overall performance of a CAD system that is based on stacked generalization. However, extra caution should be placed on the complexity of the Level 1 combiner. When the available dataset is relatively small, a relatively simple learning system such as a backpropagation neural network with very few hidden nodes is preferable to avoid optimistically biased estimates of diagnostic performance.
