Publications


Featured research published by Jean-Benoit Hardouin.


Health and Quality of Life Outcomes | 2014

Sample size used to validate a scale: a review of publications on newly-developed patient reported outcomes measures

Emmanuelle Anthoine; Leïla Moret; Antoine Regnault; Véronique Sébille; Jean-Benoit Hardouin

Purpose: New patient-reported outcome (PRO) measures are regularly developed to assess various aspects of the patients' perspective on their disease and treatment. For these instruments to be useful in clinical research, they must undergo a proper psychometric validation, including demonstration of cross-sectional and longitudinal measurement properties. This quantitative evaluation requires a study to be conducted on an appropriate sample size. The aim of this research was to list and describe practices in PRO and proxy-PRO primary psychometric validation studies, focusing primarily on the practices used to determine sample size.

Methods: A literature review of articles published in PubMed between January 2009 and September 2011 was conducted in three steps: a search strategy, an article selection strategy, and data extraction. Agreement between authors was assessed, and validation practices were described.

Results: Data were extracted from 114 relevant articles. Within these, the rate of a priori sample size determination was low (9.6%, 11/114); the justification given was either an arbitrary minimum sample size (n = 2) or a subject-to-item ratio (n = 4), or the method was not explicitly stated (n = 5). Very few articles (4%, 5/114) compared their sample size a posteriori to a subject-to-item ratio. Content validity, construct validity, criterion validity, and internal consistency were the measurement properties most frequently assessed in the validation studies. Approximately 92% of the articles reported a subject-to-item ratio of at least 2, whereas 25% had a ratio of at least 20. About 90% of articles had a sample size of at least 100, whereas 7% had a sample size of at least 1,000.

Conclusions: The sample size of psychometric validation studies is rarely justified a priori. This emphasizes the lack of clear, scientifically sound recommendations on this topic. Existing methods to determine the sample size needed to assess the various measurement properties of interest should be made more easily available.
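The subject-to-item ratio discussed above is a common rule of thumb for planning validation samples. A minimal illustrative sketch follows; the default ratio of 10 and the floor of 100 are conventions seen in the reviewed literature, not recommendations from this paper:

```python
def sample_size_from_ratio(n_items: int, ratio: float = 10.0, minimum: int = 100) -> int:
    """Rule-of-thumb sample size: `ratio` respondents per item,
    floored at an arbitrary minimum; both defaults are conventions."""
    return max(int(n_items * ratio), minimum)

# A 20-item questionnaire with a 10:1 subject-to-item ratio
print(sample_size_from_ratio(20))  # 200
print(sample_size_from_ratio(5))   # 100 (the minimum applies)
```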


BMC Medical Research Methodology | 2010

Methodological issues regarding power of classical test theory (CTT) and item response theory (IRT)-based approaches for the comparison of patient-reported outcomes in two groups of patients - a simulation study

Véronique Sébille; Jean-Benoit Hardouin; Tanguy Le Neel; Gildas Kubis; F.-C. Boyer; Francis Guillemin; Bruno Falissard

Background: Patient-reported outcomes (PRO) are increasingly used in clinical and epidemiological research. Two main types of analytical strategies exist for these data: classical test theory (CTT), based on the observed scores, and models coming from item response theory (IRT). However, whether IRT or CTT is the more appropriate method to analyse PRO data remains unknown. The statistical properties of CTT and IRT, regarding power and corresponding effect sizes, were compared.

Methods: Two-group cross-sectional studies were simulated for the comparison of PRO data using IRT- or CTT-based analysis. For IRT, different scenarios were investigated according to whether item or person parameters were assumed to be known, known to a certain extent (from good to poor precision, for item parameters), or unknown and therefore estimated. The powers obtained with IRT and CTT were compared, and the parameters having the strongest impact on them were identified.

Results: When person parameters were assumed to be unknown and item parameters to be either known or not, the power achieved using IRT or CTT was similar and always lower than the expected power from the well-known sample size formula for normally distributed endpoints. The number of items had a substantial impact on power for both methods.

Conclusion: Without any missing data, IRT and CTT seem to provide comparable power. The classical sample size formula for CTT seems adequate under some conditions but is not appropriate for IRT. In IRT, it seems important to take the number of items into account to obtain an accurate formula.
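The "well-known sample size formula for normally distributed endpoints" mentioned above is the classical two-sample formula. A minimal sketch of it (the parameter values in the example are illustrative):

```python
from math import ceil
from scipy.stats import norm

def n_per_group(delta: float, sigma: float, alpha: float = 0.05, power: float = 0.80) -> int:
    """Classical two-group sample size for a normally distributed endpoint:
    n = 2 * (z_{1-alpha/2} + z_{power})^2 * sigma^2 / delta^2 per group."""
    z_a = norm.ppf(1 - alpha / 2)
    z_b = norm.ppf(power)
    return ceil(2 * (z_a + z_b) ** 2 * sigma ** 2 / delta ** 2)

# Detecting a 0.5 SD group difference with 80% power at alpha = 0.05
print(n_per_group(delta=0.5, sigma=1.0))  # 63 per group
```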


Statistics in Medicine | 2011

Comparison of CTT and Rasch-based approaches for the analysis of longitudinal Patient Reported Outcomes

Myriam Blanchin; Jean-Benoit Hardouin; Tanguy Le Neel; Gildas Kubis; Claire Blanchard; E. Mirallié; Véronique Sébille

Health sciences frequently deal with patient-reported outcome (PRO) data for the evaluation of concepts, in particular health-related quality of life, that cannot be directly measured and are often called latent variables. Two approaches are commonly used for the analysis of such data: classical test theory (CTT) and item response theory (IRT). Longitudinal data are often collected to analyze the evolution of an outcome over time. The most adequate strategy to analyze longitudinal latent variables, which can be based on either CTT or IRT models, remains to be identified. This strategy must take into account the latent character of what PROs are intended to measure as well as the specificity of longitudinal designs. A simple and widely used IRT model is the Rasch model. The purpose of our study was to compare CTT- and Rasch-based approaches for analyzing longitudinal PRO data regarding type I error, power, and bias of the time effect estimate. Four methods were compared: the Score and Mixed models (SM) method, based on the CTT approach, and the Rasch and Mixed models (RM), Plausible Values (PV), and Longitudinal Rasch model (LRM) methods, all based on the Rasch model. All methods showed comparable type I error, close to 5%. LRM and SM presented comparable power and unbiased time effect estimates, whereas RM and PV showed low power and biased time effect estimates. This suggests that the RM and PV methods should be avoided when analyzing longitudinal latent variables.
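Of the four methods, SM is the simplest to picture: sum the items into a score at each time point and fit a linear mixed model with a time effect. A minimal sketch on simulated data; the item difficulties, effect size, and column names are all illustrative, not taken from the paper:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n, n_items = 200, 5
diff = np.linspace(-1, 1, n_items)              # assumed item difficulties

def rasch_scores(theta):
    """Total scores from simulated Rasch responses for latent levels theta."""
    p = 1 / (1 + np.exp(-(theta[:, None] - diff[None, :])))
    return (rng.random((len(theta), n_items)) < p).sum(axis=1)

theta0 = rng.normal(0, 1, n)                    # latent trait at time 0
theta1 = theta0 + 0.3                           # true time effect of 0.3

df = pd.DataFrame({
    "id": np.tile(np.arange(n), 2),
    "time": np.repeat([0, 1], n),
    "score": np.concatenate([rasch_scores(theta0), rasch_scores(theta1)]),
})

# Linear mixed model on observed scores with a random intercept per patient
fit = smf.mixedlm("score ~ time", df, groups=df["id"]).fit()
print(fit.params["time"])  # time effect, on the score scale (not the latent scale)
```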


Quality of Life Research | 2015

RespOnse Shift ALgorithm in Item response theory (ROSALI) for response shift detection with missing data in longitudinal patient-reported outcome studies

Alice Guilleux; Myriam Blanchin; Antoine Vanier; Francis Guillemin; Bruno Falissard; Carolyn E. Schwartz; Jean-Benoit Hardouin; Véronique Sébille

Purpose: Some IRT models have the advantage of being robust to missing data and thus can be used with complete data as well as with different patterns of missing data (informative or not). The purpose of this paper was to develop an algorithm for response shift (RS) detection using IRT models, allowing for non-uniform and uniform recalibration, recognition of reprioritization RS, and estimation of true change with these forms of RS taken into consideration where appropriate.

Methods: The algorithm is described, and its implementation is shown and compared to Oort's structural equation modeling (SEM) procedure using data from a clinical study assessing health-related quality of life in 669 hospitalized patients with chronic conditions.

Results: The results were quite different for the two methods. Both showed that some items of the SF-36 General Health subscale were affected by response shift, but those items usually differed between IRT and SEM. The IRT algorithm found evidence of small recalibration and reprioritization effects, whereas SEM mostly found evidence of small recalibration effects.

Conclusion: An algorithm has been developed for response shift analyses using IRT models; it allows the investigation of non-uniform and uniform recalibration as well as reprioritization. Differences in RS detection between IRT and SEM may be due to differences in how the two methods handle missing data. However, one cannot draw conclusions about the differences between IRT and SEM from a single application to one dataset, since the underlying truth is unknown. A next step would be a simulation study to investigate those differences.
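In IRT terms, uniform recalibration corresponds to a shift in an item's difficulty between measurement occasions: the same latent level yields a different response probability. This is not the ROSALI algorithm itself, just a toy numerical illustration under a Rasch model with made-up parameter values:

```python
import numpy as np

def rasch_prob(theta: float, difficulty: float) -> float:
    """P(X = 1) under the Rasch model."""
    return 1 / (1 + np.exp(-(theta - difficulty)))

theta = 0.0                    # same true latent level at both times
b_time1, b_time2 = 0.0, -0.5   # the item becomes 'easier': uniform recalibration

# With no real change in theta, the response probability still rises,
# which a naive score comparison would misread as true improvement.
print(rasch_prob(theta, b_time1))  # 0.50
print(rasch_prob(theta, b_time2))  # ~0.62
```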


Communications in Statistics - Simulation and Computation | 2007

The SAS Macro-Program %AnaQol to Estimate the Parameters of Item Responses Theory Models

Jean-Benoit Hardouin; Mounir Mesbah

The analysis of quality-of-life questionnaires has taken on great importance in clinical research. General-purpose statistical packages like SAS do not allow users to perform classical item analyses or to estimate the parameters of the models most used in this specific field, so practitioners must resort to various specialized programs to analyze a quality-of-life scale. In this article, we present an easy-to-use SAS macro program that enables SAS users to obtain classical indices, the usual graphical representations, and parameter estimates for five common item response models. We illustrate the capabilities of our macro program with practical, real quality-of-life examples.
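The exact %AnaQol syntax is not reproduced here. As an illustration of one of the classical indices such a tool reports, a minimal Cronbach's alpha computation on simulated item data (the data-generating choices are arbitrary):

```python
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """Cronbach's alpha for an (n_subjects, n_items) matrix:
    alpha = k/(k-1) * (1 - sum of item variances / variance of total score)."""
    k = items.shape[1]
    item_var_sum = items.var(axis=0, ddof=1).sum()
    total_var = items.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_var_sum / total_var)

rng = np.random.default_rng(1)
common = rng.normal(size=(300, 1))        # shared latent component
X = common + rng.normal(size=(300, 4))    # four correlated items
print(round(cronbach_alpha(X), 2))
```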


Communications in Statistics - Theory and Methods | 2004

Clustering Binary Variables in Subscales Using an Extended Rasch Model and Akaike Information Criterion

Jean-Benoit Hardouin; Mounir Mesbah

In this work, we handle the problem of grouping the dichotomous items (questions with two possible answers) of a Quality of Life (QoL) questionnaire into sub-scales (subgroups of items producing a unidimensional score). A procedure for clustering binary variables (items) into sub-scales with good measurement properties is proposed. It is based on a new multidimensional Rasch model, chosen in order to guarantee specific measurement properties for the produced scores. The proposed process is presented, discussed, and compared by simulation with the Mokken scale procedure (MSP). These simulations show that the new procedure is promising, especially when the structure of the set of binary variables is multidimensional, even though several drawbacks persist, notably the procedure's computing time.
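Competing clusterings fitted to the same items can be ranked with the Akaike Information Criterion named in the title. A minimal sketch; the log-likelihoods and parameter counts below are placeholders, not values from the paper:

```python
def aic(log_likelihood: float, n_params: int) -> float:
    """Akaike Information Criterion: AIC = 2k - 2 ln L (lower is better)."""
    return 2 * n_params - 2 * log_likelihood

# Hypothetical fits of two candidate clusterings of the same item set
one_scale  = aic(log_likelihood=-1520.3, n_params=12)  # unidimensional model
two_scales = aic(log_likelihood=-1498.7, n_params=15)  # two-subscale model
print(one_scale, two_scales)  # keep the clustering with the smaller AIC
```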


Journal of Clinical Epidemiology | 2014

The minimal clinically important difference determined using item response theory models: an attempt to solve the issue of the association with baseline score

Alexandra Rouquette; Myriam Blanchin; Véronique Sébille; Francis Guillemin; Sylvana M. Côté; Bruno Falissard; Jean-Benoit Hardouin

Objectives: Determining the minimal clinically important difference (MCID) of questionnaires on an interval scale, the trait level (TL) scale, using item response theory (IRT) models could overcome its association with baseline severity. The aim of this study was to compare the sensitivity (Se), specificity (Sp), and predictive values (PVs) of the MCID determined on the score scale (MCID-Sc) or the TL scale (MCID-TL).

Study design and setting: The MCID-Sc and MCID-TL of the MOS SF-36 general health subscale were determined for deterioration and improvement in a cohort of 1,170 patients using an anchor-based method and a partial credit model. The Se, Sp, and PVs were calculated using the global rating of change (the anchor) as the gold standard.

Results: The MCID-Sc magnitude was smaller for improvement (1.58 points) than for deterioration (-7.91 points). The Se, Sp, and PVs were similar for MCID-Sc and MCID-TL in both cases. However, when the MCID was defined on the score scale as a function of a range of baseline scores, its Se, Sp, and PVs were consistently higher.

Conclusion: This study reinforces the recommendations concerning the use of an MCID-Sc defined as a function of a range of baseline scores.
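Given a chosen MCID threshold and the anchor as gold standard, the Se, Sp, and predictive values reduce to a 2x2 table. A minimal sketch with hypothetical counts (not values from the study):

```python
def diagnostic_indices(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Sensitivity, specificity, and predictive values of an MCID threshold,
    with the global rating of change (the anchor) as the gold standard."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),   # positive predictive value
        "npv": tn / (tn + fn),   # negative predictive value
    }

# Hypothetical counts: test = score change beyond the MCID, truth = anchor rating
print(diagnostic_indices(tp=180, fp=60, fn=40, tn=320))
```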


Journal of Gambling Studies | 2013

A Shorter and Multidimensional Version of the Gambling Attitudes and Beliefs Survey (GABS-23)

Gaëlle Bouju; Jean-Benoit Hardouin; Claude Boutin; Philip Gorwood; Jean-Damien Le Bourvellec; Fanny Feuillet; Jean-Luc Venisse; Marie Grall-Bronnec

The Gambling Attitudes and Beliefs Survey (GABS) is a questionnaire that explores gambling-related dysfunctional beliefs in a unidimensional way. The present research aimed to investigate the dimensionality of the scale. A sample of 343 undergraduate student gamblers and 75 pathological gamblers seeking treatment completed the GABS and the South Oaks Gambling Screen. Exploratory and confirmatory factor analyses revealed that the original one-factor structure of the GABS did not fit the data well. We therefore proposed a shorter version of the GABS (GABS-23) with a new five-factor structure, which fitted the data better. Comparisons between students (problem vs. non-problem gamblers) and pathological gamblers seeking treatment indicated that the GABS-23 discriminates between problem and non-problem gamblers as efficiently as the original GABS. To ensure the validity and stability of the new structure of the GABS-23, the analyses were replicated in an independent sample of 628 gamblers (256 non-problem gamblers, 169 problem gamblers not seeking treatment, and 203 problem gamblers seeking treatment). These analyses showed satisfactory results, and the multidimensional structure of the GABS-23 was confirmed. The GABS-23 seems to be a valid and useful tool for screening gambling-related beliefs, emotions, and attitudes among problem and non-problem gamblers. Moreover, it has the advantage of being shorter than the original GABS and of screening irrational beliefs and attitudes about gambling in a multidimensional way. The five-factor model of the GABS-23 is discussed in the light of the theory of locus of control.
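The exploratory step of such an analysis can be sketched with a generic factor-analysis routine. The sketch below uses scikit-learn's FactorAnalysis on simulated stand-in data; real GABS responses are not available here, and the paper's confirmatory step is not covered by this class:

```python
import numpy as np
from sklearn.decomposition import FactorAnalysis

rng = np.random.default_rng(2)
# Stand-in for item responses: 400 respondents x 23 items
X = rng.normal(size=(400, 23))

fa = FactorAnalysis(n_components=5)  # five factors, as in the GABS-23 structure
fa.fit(X)
loadings = fa.components_.T          # (23 items x 5 factors) loading matrix
print(loadings.shape)
```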


BMC Medical Research Methodology | 2011

Imputation by the mean score should be avoided when validating a Patient Reported Outcomes questionnaire by a Rasch model in presence of informative missing data.

Jean-Benoit Hardouin; Ronan Conroy; Véronique Sébille

Background: Nowadays, more and more clinical scales consisting of responses given by patients to a set of items (patient-reported outcomes, PRO) are validated with models based on item response theory, and more specifically with a Rasch model. Missing data are frequent in validation samples. The aim of this paper is to compare sixteen methods for handling missing data (mainly based on simple imputation) in the context of psychometric validation of PRO by a Rasch model. The main indexes used for validation by a Rasch model are compared.

Methods: A simulation study was performed, covering several scenarios, notably whether or not the missing values were informative, and varying the rate of missing data.

Results: Several imputation methods bias the psychometric indexes (generally, imputation artificially improves the apparent psychometric qualities of the scale). In particular, this is the case with the method based on the Personal Mean Score (PMS), which is the imputation method most commonly used in practice.

Conclusions: Several imputation methods should be avoided, in particular PMS imputation. Generally speaking, it is important to use an imputation method that considers both the ability of the patient (measured, for example, by his or her score) and the difficulty of the item (measured, for example, by its rate of favourable responses). Another recommendation is to always include a random process in the imputation method, because such a process reduces the bias. Finally, analysis without imputation of the missing data (available-case analysis) is an interesting alternative to simple imputation in this context.
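The contrast between the discouraged PMS imputation and an imputation in the spirit of the paper's recommendation (person ability plus item difficulty plus a random draw) can be sketched as follows. Everything here is illustrative: the tiny response matrix, the ability estimates, and the difficulties are made up:

```python
import numpy as np

rng = np.random.default_rng(3)

def impute_pms(resp: np.ndarray) -> np.ndarray:
    """Personal Mean Score imputation (discouraged by the paper): replace each
    missing entry with the person's mean over observed items (rounded to 0/1)."""
    out = resp.copy()
    person_mean = np.nanmean(out, axis=1)
    rows, cols = np.where(np.isnan(out))
    out[rows, cols] = np.round(person_mean[rows])
    return out

def impute_ability_difficulty(resp, theta_hat, difficulty):
    """Imputation combining the person's estimated ability and the item's
    difficulty through a Rasch-type probability, then drawing at random."""
    out = resp.copy()
    rows, cols = np.where(np.isnan(out))
    p = 1 / (1 + np.exp(-(theta_hat[rows] - difficulty[cols])))
    out[rows, cols] = (rng.random(len(rows)) < p).astype(float)
    return out

resp = np.array([[1, np.nan, 0], [1, 1, np.nan]], dtype=float)
theta_hat = np.array([0.2, 0.8])         # e.g. derived from observed scores
difficulty = np.array([-0.5, 0.0, 0.5])  # e.g. from rates of favourable responses
print(impute_pms(resp))
print(impute_ability_difficulty(resp, theta_hat, difficulty))
```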


PLOS ONE | 2014

Power and Sample Size Determination in the Rasch Model: Evaluation of the Robustness of a Numerical Method to Non-Normality of the Latent Trait

Alice Guilleux; Myriam Blanchin; Jean-Benoit Hardouin; Véronique Sébille

Patient-reported outcomes (PRO) have gained importance in clinical and epidemiological research and aim to assess, for instance, quality of life, anxiety, or fatigue. Item response theory (IRT) models are increasingly used to validate and analyse PRO. Such models relate the observed variables to a latent (unobservable) variable, which is commonly assumed to be normally distributed. A priori sample size determination is important to obtain adequately powered studies to detect clinically important changes in PRO. In previous developments, the Raschpower method was proposed to determine the power of the test of group effect for the comparison of PRO in cross-sectional studies analysed with an IRT model, the Rasch model. The objective of this work was to evaluate the robustness of this method (which assumes a normal distribution for the latent variable) to violations of the distributional assumption. The statistical power of the test of group effect was estimated by the empirical rejection rate in data sets simulated using a non-normally distributed latent variable, and compared to the power obtained with the Raschpower method. In both cases, the data were analyzed using a latent regression Rasch model including a binary covariate for the group effect. In all situations, both methods gave comparable results whatever the deviation from the model assumptions. Given these results, the Raschpower method seems to be robust to non-normality of the latent trait when determining the power of the test of group effect.
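The empirical-rejection-rate idea can be sketched compactly. The stub below simulates Rasch responses for two groups (with an optionally skewed latent trait) and estimates power as the fraction of significant results; for brevity it uses a t-test on total scores in place of the latent regression Rasch model used in the paper, so it illustrates the simulation logic, not Raschpower itself:

```python
import numpy as np
from scipy.stats import ttest_ind

rng = np.random.default_rng(4)
diff = np.linspace(-1.5, 1.5, 10)  # assumed item difficulties

def group_scores(n: int, mean: float, skewed: bool) -> np.ndarray:
    """Total Rasch scores for one group; `skewed` makes the latent trait
    non-normal (centred gamma) while keeping the requested mean."""
    if skewed:
        theta = rng.gamma(2.0, 1.0, n) - 2.0 + mean
    else:
        theta = rng.normal(mean, 1.0, n)
    p = 1 / (1 + np.exp(-(theta[:, None] - diff[None, :])))
    return (rng.random((n, len(diff))) < p).sum(axis=1)

def empirical_power(n=100, effect=0.5, n_sim=500, skewed=False) -> float:
    """Power estimated as the empirical rejection rate at alpha = 0.05."""
    rejections = sum(
        ttest_ind(group_scores(n, 0.0, skewed),
                  group_scores(n, effect, skewed)).pvalue < 0.05
        for _ in range(n_sim)
    )
    return rejections / n_sim

print(empirical_power(skewed=False), empirical_power(skewed=True))
```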

Collaboration


Dive into Jean-Benoit Hardouin's collaboration.

Top Co-Authors

F.-C. Boyer, University of Reims Champagne-Ardenne

Damien Jolly, University of Reims Champagne-Ardenne