Publication


Featured research published by Christina Parpoula.


International Journal of Biomedical Engineering and Technology | 2012

Classification methods and ROC analysis for outcome prediction of patients following injuries

Christos Koukouvinos; Christina Parpoula; E.-M. Theodoraki

Receiver Operating Characteristics (ROC) analysis is commonly used in medical decision making, and in recent years has been used increasingly in machine learning and data-mining research. In this study, it is used for assessing the performance of classification algorithms in predicting trauma patients’ outcome. The data set comprised 8544 severely injured patients who had been admitted to Hellenic hospitals from the year 2005 to 2006. We analysed the demographic data and the factors that may have influenced the outcome in the group of patients with trauma, and several combinations of significant factors were determined for that purpose.
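
As an illustration of the ROC summary mentioned in this abstract, the following is a minimal sketch of the area under the ROC curve, computed by pair counting (the Mann-Whitney identity). The function name `roc_auc` and the toy labels and scores are assumptions made for illustration; they are not taken from the paper or its trauma data.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pair counting.

    labels: iterable of 0/1 outcomes (1 = positive case).
    scores: classifier scores, higher meaning "more likely positive".
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive case ranked above negative case
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))


# Hypothetical toy data, not from the study.
auc = roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation of the two outcome classes.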


Journal of data science | 2014

A New Variable Selection Approach Inspired by Supersaturated Designs Given a Large-Dimensional Dataset

Christina Parpoula; K. Drosou; Christos Koukouvinos; Kalliopi Mylona

The problem of variable selection is fundamental to statistical modelling in diverse fields of science. In this paper, we study in particular the problem of selecting important variables in regression problems in the case where observations and labels of a real-world dataset are available. At first, we examine the performance of several existing statistical methods for analyzing a real large trauma dataset which consists of 7000 observations and 70 factors that include demographic, transport and intrahospital data. The statistical methods employed in this work are the nonconcave penalized likelihood methods (SCAD, LASSO, and Hard), the generalized linear logistic regression, and the best subset variable selection (with AIC and BIC), used to detect possible risk factors of death. Supersaturated designs (SSDs) are a large class of factorial designs which can be used for screening out the important factors from a large set of potentially active variables. This paper presents a new variable selection approach inspired by supersaturated designs given a dataset of observations. The merits and the effectiveness of this approach for identifying important variables in observational studies are evaluated by considering several two-level supersaturated designs, and a variety of different statistical models with respect to the combinations of factors and the number of observations. The derived results are encouraging since the alternative approach using supersaturated designs provided specific information that is logical and consistent with medical experience, and which may also serve as guidelines for trauma management.


Communications in Statistics - Simulation and Computation | 2012

Analyzing Supersaturated Designs by Means of an Information Based Criterion

Christos Koukouvinos; Christina Parpoula

The cost and time consumption of many industrial experiments can be reduced using the class of supersaturated designs, since these can be used for screening out the important factors from a large set of potentially active variables. A supersaturated design is a design for which there are fewer runs than effects to be estimated. Although construction methods for supersaturated designs have been studied widely, their analysis methods are still at an early research stage. In this article, we propose a method for analyzing data using a correlation-based measure named symmetrical uncertainty. This method combines measures from the information theory field and is used as the main idea of variable selection algorithms developed in data mining. In this work, the symmetrical uncertainty is used from another viewpoint in order to determine more directly the important factors. The specific method enables us to use supersaturated designs for analyzing data of generalized linear models for a Bernoulli response. We evaluate our method by using some of the existing supersaturated designs, obtained according to methods proposed by Tang and Wu (1997) as well as by Koukouvinos et al. (2008). The comparison is performed by means of simulation experiments, and the Type I and Type II error rates are calculated. Additionally, Receiver Operating Characteristics (ROC) curves methodology is applied as an additional statistical tool for performance evaluation.
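
For readers unfamiliar with the measure, here is a minimal sketch of symmetrical uncertainty in its usual textbook form, SU(X, Y) = 2 I(X; Y) / (H(X) + H(Y)), for discrete variables. The function names and the toy two-level factor columns are assumptions for illustration only; how the paper applies the measure to rank factors is not reproduced here.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())


def symmetrical_uncertainty(x, y):
    """SU(X, Y) = 2 * I(X; Y) / (H(X) + H(Y)), a value in [0, 1]."""
    hx, hy = entropy(x), entropy(y)
    if hx + hy == 0:
        return 0.0  # both variables are constant; no information to share
    hxy = entropy(list(zip(x, y)))   # joint entropy H(X, Y)
    mutual_info = hx + hy - hxy      # I(X; Y)
    return 2.0 * mutual_info / (hx + hy)


# Hypothetical two-level factor column and binary response.
factor = [-1, -1, 1, 1]
response = [0, 0, 1, 1]
su = symmetrical_uncertainty(factor, response)
```

A value near 1 indicates that the factor and the binary response carry almost the same information, while a value near 0 indicates near independence.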


Journal of Statistical Computation and Simulation | 2016

Computer-aided unbalanced supersaturated designs involving interactions

Kashinath Chatterjee; Christos Koukouvinos; Christina Parpoula

Supersaturated designs (SSDs) are defined as fractional factorial designs whose experimental run size is smaller than the number of main effects to be estimated. While most of the literature on SSDs has focused only on main effects designs, the construction and analysis of such designs involving interactions has not been developed to a great extent. In this paper, we propose a backward elimination design-driven optimization (BEDDO) method, with one main goal in mind: to eliminate the factors which are identified to be fully aliased or highly partially aliased with each other in the design. Under the proposed BEDDO method, we implement and combine correlation-based statistical measures taken from classical test theory and the design of experiments field, and we also present an optimality criterion which is a modified form of Cronbach's alpha coefficient. In this way, we provide a new class of computer-aided unbalanced SSDs involving interactions that derive directly from BEDDO optimization.
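
The abstract refers to a modified form of Cronbach's alpha, which is not specified here. As background only, the sketch below computes the standard (unmodified) coefficient, alpha = k/(k-1) * (1 - sum of column variances / variance of row totals). The function name and the toy columns are assumptions; this is not the BEDDO criterion itself.

```python
from statistics import variance


def cronbach_alpha(columns):
    """Standard Cronbach's alpha for k columns of equal length.

    columns: list of k lists, one per item (here, one per design column).
    """
    k = len(columns)
    if k < 2:
        raise ValueError("need at least two columns")
    item_var = sum(variance(col) for col in columns)   # sample variances
    totals = [sum(row) for row in zip(*columns)]        # row-wise totals
    total_var = variance(totals)
    if total_var == 0:
        raise ValueError("row totals are constant; alpha is undefined")
    return k / (k - 1) * (1 - item_var / total_var)


# Hypothetical toy columns: two identical columns give alpha = 1.
alpha = cronbach_alpha([[1, 2, 3, 4], [1, 2, 3, 4]])
```

Identical (fully aliased) columns drive the standard coefficient to 1, which suggests why a correlation-based measure of this kind can flag redundant columns.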


Quality and Reliability Engineering International | 2015

A Penalized Wrapper Method for Screening Main Effects and Interactions in Supersaturated Designs

Christos Koukouvinos; Christina Parpoula

Supersaturated designs (SSDs) are defined as fractional factorial designs whose experimental run size is smaller than the number of main effects to be estimated. The main goal of using the class of SSDs is to identify the important effects efficiently, that is, at a minimal computational cost and time. Several methods for analyzing SSDs have been proposed in recent literature. While most of the literature on SSDs has focused on main effects models, the analysis of such designs involving models with interactions has not been developed to a great extent. In this paper, we attempt to relate several penalty and loss functions with support vector machines, with one main goal in mind: screening active effects in SSDs. In this spirit, we propose a penalized wrapper screening method for identifying in one stage the important main effects and two-factor interactions of two-level SSDs, by assuming generalized linear models. We also carry out simulation studies and a real data analysis to assess the performance of the proposed screening procedure, showing that the proposed method works satisfactorily.


Journal of statistical theory and practice | 2015

On the Computation of Entropy Prior Complexity and Marginal Prior Distribution for the Bernoulli Model

N. Balakrishnan; Christos Koukouvinos; Christina Parpoula

As the size and complexity of models grow, the choice of the best model becomes a difficult and challenging task. Once the best model is specified, the goodness of fit of the model needs to be examined first. A highly complex model may provide a good fit, but giving no consideration to model complexity could result in incorrect estimates of parameter values and predictions. In order to improve the model selection process, model complexity needs to be defined clearly. This article studies different aspects of model complexity and discusses the extent to which they can be measured. The most common attribute that is usually ignored by many complexity measures is the parameter prior, which is an inherent part of the model and could impact the complexity significantly. The concept of parameter prior and its connection to model complexity are therefore discussed here, and some relationships to the entropy measure elements are also addressed.


Communications in Statistics - Simulation and Computation | 2015

On the Analysis of Unbalanced Two-level Supersaturated Designs via Generalized Linear Models

Kashinath Chatterjee; Christos Koukouvinos; Christina Parpoula

Supersaturated designs (SSDs) are factorial designs in which the number of experimental runs is smaller than the number of parameters to be estimated in the model. While most of the literature on SSDs has focused on balanced designs, the construction and analysis of unbalanced designs have not been developed to a great extent. Recent studies discuss the possible advantages of relaxing the balance requirement in construction or data analysis of SSDs, and show that unbalanced designs compare favorably to balanced designs for several optimality criteria and for the way in which the data are analyzed. Moreover, the effect analysis framework of unbalanced SSDs has until now been restricted to the central assumption that experimental data come from a linear model. In this article, we consider unbalanced SSDs for data analysis under the assumption of generalized linear models (GLMs), revealing that unbalanced SSDs perform well despite the unbalance property. The examination of Type I and Type II error rates through an extensive simulation study indicates that the proposed method works satisfactorily.
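
To make the evaluation metric concrete, the sketch below computes Type I and Type II error rates for a single screening run, under the common convention that Type I is the share of inactive factors declared active and Type II is the share of active factors missed. The function name, the factor indexing and the example sets are assumptions for illustration; the paper's simulation settings are not reproduced.

```python
def screening_error_rates(true_active, declared_active, n_factors):
    """Return (type_I, type_II) error rates for one screening run.

    true_active:     indices of factors that are truly active.
    declared_active: indices of factors the method declared active.
    n_factors:       total number of factors, indexed 0..n_factors-1.
    """
    true_active = set(true_active)
    declared = set(declared_active)
    inactive = set(range(n_factors)) - true_active

    # Type I: inactive factors wrongly declared active.
    type_one = len(declared & inactive) / len(inactive) if inactive else 0.0
    # Type II: active factors the method failed to declare.
    type_two = len(true_active - declared) / len(true_active) if true_active else 0.0
    return type_one, type_two


# Hypothetical run: 10 factors, 3 truly active, one miss and one false alarm.
t1, t2 = screening_error_rates({0, 1, 2}, {0, 1, 5}, n_factors=10)
```

In a simulation study, these two rates would typically be averaged over many generated response vectors for each design.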


International Journal of Information and Decision Sciences | 2013

A combination of variable selection and data mining techniques for high-dimensional statistical modelling

Christos Koukouvinos; Kalliopi Mylona; Christina Parpoula

Variable selection is fundamental to statistical modelling in diverse fields of science. This paper deals with the problem of high-dimensional statistical modelling through the analysis of seismological data in Greece acquired during the years 1962-2003. The dataset consists of 10,333 observations and 11 factors, used to detect possible risk factors of large earthquakes. In our study, different statistical variable selection techniques are applied, while data mining techniques enable us to discover associations, meaningful patterns and rules. The statistical methods employed in this work were the non-concave penalised likelihood methods (SCAD, LASSO and Hard), the generalised linear logistic regression and the best subset variable selection. The applied data mining methods were three decision tree algorithms: the classification and regression tree (C&RT), the chi-square automatic interaction detection (CHAID) and the C5.0 algorithm. The way of identifying the significant variables in large datasets, along with the performance of the techniques used, is also discussed.


Journal of Biomedical Science and Engineering | 2010

Innovative data mining approaches for outcome prediction of trauma patients

E.-M. Theodoraki; Stylianos Katsaragakis; Christos Koukouvinos; Christina Parpoula


Metrika | 2013

An information theoretical algorithm for analyzing supersaturated designs for a binary response

N. Balakrishnan; Christos Koukouvinos; Christina Parpoula

Collaboration


Dive into Christina Parpoula's collaborations.

Top Co-Authors

Christos Koukouvinos

National Technical University of Athens

E.-M. Theodoraki

National Technical University of Athens

Kalliopi Mylona

National Technical University of Athens

E. Massou

National Technical University of Athens

K. Drosou

National Technical University of Athens

Stylianos Katsaragakis

National and Kapodistrian University of Athens
