Publication


Featured research published by Hui-Fang Chen.


Research in Developmental Disabilities | 2012

Validity, Responsiveness, Minimal Detectable Change, and Minimal Clinically Important Change of the Pediatric Motor Activity Log in Children with Cerebral Palsy.

Keh-chung Lin; Hui-Fang Chen; Chia-Ling Chen; Tien Ni Wang; Ching-yi Wu; Yu-wei Hsieh; Li-ling Wu

This study examined criterion-related validity and clinimetric properties of the Pediatric Motor Activity Log (PMAL) in children with cerebral palsy. Study participants were 41 children (age range: 28-113 months) and their parents. Criterion-related validity was evaluated by the associations between the PMAL and criterion measures at baseline and posttreatment, including the self-care, mobility, and cognition subscales and the total performance score of the Functional Independence Measure in children (WeeFIM), and the grasping and visual-motor integration subtests of the Peabody Developmental Motor Scales. Pearson correlation coefficients were calculated. Responsiveness was examined using the paired t test and the standardized response mean, the minimal detectable change was captured at the 90% confidence level, and the minimal clinically important change was estimated using anchor-based and distribution-based approaches. The PMAL-QOM showed fair concurrent validity at pretreatment and posttreatment and fair predictive validity, whereas the PMAL-AOU had fair concurrent validity at posttreatment only. The PMAL-AOU and PMAL-QOM were both markedly responsive to change after treatment. Improvement of at least 0.67 points on the PMAL-AOU and 0.66 points on the PMAL-QOM can be considered a true change rather than measurement error. A mean change has to exceed the range of 0.39-0.94 on the PMAL-AOU and the range of 0.38-0.74 on the PMAL-QOM to be regarded as a clinically important change.
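The two responsiveness quantities named in this abstract can be illustrated with a minimal sketch. It assumes the conventional definitions (standardized response mean as mean change divided by the standard deviation of change, and minimal detectable change as z × √2 × SEM with SEM = SD × √(1 − reliability)); the abstract does not state which formulas the authors used, and the scores and reliability value below are hypothetical, not data from the study.

```python
import math
from statistics import mean, stdev

def standardized_response_mean(pre, post):
    """SRM = mean of change scores / sample SD of change scores."""
    change = [b - a for a, b in zip(pre, post)]
    return mean(change) / stdev(change)

def minimal_detectable_change(sd_baseline, reliability, z=1.645):
    """MDC = z * sqrt(2) * SEM, with SEM = SD * sqrt(1 - reliability).
    z = 1.645 corresponds to the 90% confidence level (assumed convention)."""
    sem = sd_baseline * math.sqrt(1.0 - reliability)
    return z * math.sqrt(2.0) * sem

# Hypothetical pre/post scores, NOT data from the study
pre = [1.0, 2.0, 3.0, 4.0]
post = [2.0, 2.5, 4.5, 5.0]
srm = standardized_response_mean(pre, post)

# Hypothetical baseline SD and test-retest reliability
mdc90 = minimal_detectable_change(sd_baseline=1.0, reliability=0.91)
print(f"SRM = {srm:.2f}, MDC90 = {mdc90:.2f}")
```

Under these definitions, an observed individual change smaller than the MDC90 could not be distinguished from measurement error at the 90% confidence level.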


Educational and Psychological Measurement | 2015

Item Response Theory Models for Wording Effects in Mixed-Format Scales

Wen-Chung Wang; Hui-Fang Chen; Kuan-Yu Jin

Many scales contain both positively and negatively worded items. Reverse recoding of negatively worded items might not be enough for them to function as positively worded items do. In this study, we commented on the drawbacks of existing approaches to wording effect in mixed-format scales and used bi-factor item response theory (IRT) models to test the assumption of reverse coding and evaluate the magnitude of the wording effect. The parameters of the bi-factor IRT models can be estimated with existing computer programs. Two empirical examples from the Program for International Student Assessment and the Trends in International Mathematics and Science Study were given to demonstrate the advantages of the bi-factor approach over traditional ones. It was found that the wording effect in these two data sets was substantial and that ignoring the wording effect resulted in overestimated test reliability and biased person measures.


Journal of Rehabilitation Medicine | 2012

Multidimensional Rasch validation of the Frenchay Activities Index in stroke patients receiving rehabilitation.

Keh-chung Lin; Hui-Fang Chen; Ching-yi Wu; Yu Ty; Pei Ouyang

OBJECTIVE To validate the dimensionality, hierarchical properties, and reliability of the Frenchay Activities Index. DESIGN Self-report survey of patients with stroke. PATIENTS A total of 127 patients provided 254 observations before and after treatments. METHODS A multidimensional Rasch analysis was conducted. RESULTS The 2-factor model showed the significantly smallest deviance and fitted the data best among 6 possible models. The 2-factor structure was stable before and after treatments, after the rating scale was revised from 4 points to 3 points. Differential item functioning relevant to the time since stroke was detected for 2 tasks. The item difficulty hierarchy of the 2 domains was determined. The correlation between the 2 domains was 0.58. The scale demonstrated acceptable ceiling and floor effects. The overall person (separation) reliability was 0.99. The reliabilities for the 2 domains were 0.81 and 0.73. CONCLUSION The Frenchay Activities Index is a useful 2-dimensional scale for evaluating daily functions in stroke patients. The item difficulty hierarchy and significant differential item functioning related to the time since stroke might reflect the changes in the recovery course after stroke. The Frenchay Activities Index could be improved by adding items to capture patients with high and low levels of daily activities in domestic chores.


Journal of Rehabilitation Medicine | 2012

Validity, reliability and responsiveness of a short version of the Stroke-Specific Quality of Life Scale in patients receiving rehabilitation.

Hui-Fang Chen; Ching-yi Wu; Keh-chung Lin; Ming-wei Li; Hung-wen Yu

OBJECTIVE To examine the measurement properties of a short version of the Stroke-Specific Quality of Life Scale (SS-QoL-12). DESIGN Self-report survey of patients with mild to moderate upper extremity dysfunction. PATIENTS A total of 126 patients provided 252 observations before and after treatment. METHODS The construct validity and reliability were examined using the Rasch model; the concurrent and predictive validity were estimated using Spearman's rank correlation coefficients. The paired t-test and the standardized response mean (SRM) were used to estimate the responsiveness of the SS-QoL-12. RESULTS The 2-factor model (psychosocial and physical domains) fit the data better with smaller deviances. All but 1 item showed acceptable fit, and no item biases were detected. The reliability of the subscales and the whole scale ranged from 0.67 to 0.99. The total score showed fair correlations with the criterion measures at pretreatment (ρ = 0.28-0.40) and fair to good correlations at post-treatment (ρ = 0.39-0.54). The subscales had low to fair correlations at pretreatment (ρ = 0.19-0.49) and fair to good correlations at post-treatment (ρ = 0.31-0.56). The total and the subscales had low to good predictions at baseline (ρ = 0.22-0.52). The whole scale and the psychosocial subscale were mildly responsive to change (SRM = 0.22), but the physical subscale was not responsive to change (SRM = 0.08). CONCLUSION The SS-QoL-12 has acceptable to good measurement properties, with an advantage of requiring less time to administer than other scales. The use of the subscale and total scores depends on the purpose of research. Future studies should recruit stroke patients with a broad range of dysfunction and use a large sample size to validate the findings.


Neurorehabilitation and Neural Repair | 2013

Rasch validation of a combined measure of basic and extended daily life functioning after stroke.

Hui-Fang Chen; Ching-yi Wu; Keh-chung Lin; Chia-Ling Chen; Pai Chuan Huang; Ching ju Hsieh; Jung sen Liu

Background. Tools used to measure poststroke functional status must include basic and instrumental activities of daily living and reflect the patient’s and the clinician’s perspective of the disease and its effect on daily living performance. Objective. The authors combined the Functional Independence Measure (FIM) and the Nottingham Extended Activities of Daily Living (NEADL) to create a scale providing a comprehensive evaluation of ADLs functional status in patients with stroke. Methods. The study participants were 188 patients completing the FIM and the NEADL. The psychometric properties of the combined measure were examined with Rasch analysis. Results. A 3-point scale and a dichotomous scale were suggested for use in the FIM and the NEADL, respectively. The combined 40 items worked consistently to reflect a single construct, and “bladder management” and “bowel management” were highly related. After “bowel management” was removed from the combined scale, all but 3 items fit the model’s expectations, and the 39-item scale showed reasonable item difficulty hierarchy, with high reliability. The 3 misfit items were removed, and no differences in unidimensionality, differential item functioning, and reliability were found between the 36-item and 39-item scales. Conclusions. The combined measure of the FIM and the NEADL provides a comprehensive picture of ADLs. It extends the utility of the FIM and the NEADL and is recommended for use to measure the independence of patients after discharge home.


Neurorehabilitation and Neural Repair | 2014

Measurement Properties of Streamlined Wolf Motor Function Test in Patients at Subacute to Chronic Stages After Stroke

Hui-Fang Chen; Ching-yi Wu; Keh-chung Lin; Yuh Jang; Shih-chieh Lin; Ju-wen Cheng; Chia-Ying Chung; Yanning Yan

Background. Previous research using the streamlined Wolf Motor Function Test (SWMFT) has focused either on the 3- to 9-month period or on the >12-month period after stroke and lacked information on patients at 9 to 12 months. Whether SWMFT scores reflect motor deficit and recovery from early to late stages after stroke remains unclear. Objective. A retrospective study using the Functional Ability Scale (FAS) was conducted to evaluate whether all SWMFT items measure the poststroke recovery of upper extremity (UE) motor function and whether they could be used for patients within 9 to 12 months after a stroke. Methods. Rasch analysis was conducted, and data were drawn from patients 3 months to years after a stroke. Results. The continuum of UE motor function in SWMFT-FAS was supported. Subacute patients had the best motor function, followed by the 9- to 12-month group, and then chronic patients. Variation in UE motor function was large (2.35-2.72 logits), and motor abilities of these 3 groups overlapped. The 8 SWMFT items could target a broad range of UE motor function, from −8.28 to 7.80 logits. The average difficulty of these 8 items also matched the UE motor ability of the subgroup at 9 to 12 months after stroke, and individual versions of the SWMFT performed well to assess the motor ability of this group. Conclusions. The SWMFTs had sound hierarchical properties. The SWMFT-Chronic or the SWMFT-Subacute could be used to evaluate UE function of this subgroup at 9 to 12 months after stroke.


Organizational Research Methods | 2018

Mixture Item Response Models for Inattentive Responding Behavior

Kuan-Yu Jin; Hui-Fang Chen; Wen-Chung Wang

Inattentive responses can threaten measurement quality, yet they are common in rating- or Likert-scale data. In this study, we proposed a new mixture item response theory model to distinguish inattentive responses from normal responses so that test validity can be ascertained. Simulation studies demonstrated that the parameters of the new model were recovered fairly well using the Bayesian methods implemented in the freeware WinBUGS, and fitting the new model to data that lacked inattentive responses did not result in severely biased parameter estimates. In contrast, ignoring inattentive responses by fitting standard item response theory models to data containing inattentive responses yielded seriously biased parameter estimates and a failure to distinguish inattentive participants from normal participants; the person-fit statistic lz was also unsatisfactory in identifying inattentive responses. Two empirical examples demonstrate the applications of the new model.


Frontiers in Psychology | 2018

Corrigendum: Modified Logistic Regression Approaches to Eliminating the Impact of Response Styles on Differential Item Functioning Detection in Likert-Type Scale

Hui-Fang Chen; Kuan-Yu Jin; Wen-Chung Wang

[This corrects the article on p. 1143 in vol. 8, PMID: 28736542.]


Frontiers in Psychology | 2018

Applying Logistic Regression to Detect Differential Item Functioning in Multidimensional Data

Hui-Fang Chen; Kuan-Yu Jin

Conventional differential item functioning (DIF) approaches such as logistic regression (LR) often assume unidimensionality of a scale and match participants in the reference and focal groups based on total scores. However, many educational and psychological assessments are multidimensional by design, and a matching variable based on total scores that does not reflect the test structure may not be good practice for DIF detection in multidimensional items. We propose the use of all subscores of a scale in LR and compare its performance with alternative matching methods, including the use of the total score and individual subscores. We focused on a uniform DIF situation in which 250, 500, or 1,000 participants in each group answered 21 items reflecting two dimensions, and the 21st item was the studied item. Five factors were manipulated in the study: (a) the test structure, (b) numbers of cross-loaded items, (c) group differences in latent abilities, (d) the magnitude of DIF, and (e) group sample size. The results showed that, when the studied item measured a single domain, the conventional LR incorporating total scores as a matching variable yielded inflated false positive rates (FPRs) when two groups differed in one latent ability. The situation worsened when one group had a higher ability in one domain and lower ability in another. The LR using a single subscore as the matching variable performed well in terms of FPRs and true positive rates (TPRs) when two groups either did not differ in latent ability or differed in one latent ability. However, this approach yielded inflated FPRs when two groups differed in two latent abilities. The proposed LR using two subscores yielded well-controlled FPRs across all conditions and yielded the highest TPRs. When the studied item measured two domains, the use of either the total score or two subscores worked well in the control of FPRs and yielded similar TPRs across conditions, whereas the use of a single subscore resulted in inflated FPRs when two groups differed in one or two latent abilities. In conclusion, we recommend the use of multiple subscores to match subjects in DIF detection for multidimensional data.
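The matching strategies compared in this abstract can be written out as logistic regression models. The notation below is an assumption for illustration (a dichotomous studied item $Y$, subscores $S_1$ and $S_2$, total score $T = S_1 + S_2$, and a group indicator $G$); the abstract itself does not give the model equations.

```latex
% Conventional LR: match on the total score T (assumed notation)
\operatorname{logit} P(Y_j = 1) = \beta_0 + \beta_1 T_j + \beta_2 G_j

% Proposed LR: match on both subscores S_1 and S_2 (assumed notation)
\operatorname{logit} P(Y_j = 1) = \beta_0 + \beta_1 S_{1j} + \beta_2 S_{2j} + \beta_3 G_j
```

In this formulation, uniform DIF would be flagged when the coefficient on $G_j$ ($\beta_2$ in the first model, $\beta_3$ in the second) differs significantly from zero after conditioning on the matching variable or variables.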


Applied Psychological Measurement | 2018

Using Odds Ratios to Detect Differential Item Functioning

Kuan-Yu Jin; Hui-Fang Chen; Wen-Chung Wang

Differential item functioning (DIF) makes test scores incomparable and substantially threatens test validity. Although conventional approaches, such as the logistic regression (LR) and the Mantel–Haenszel (MH) methods, have worked well, they are vulnerable to high percentages of DIF items in a test and missing data. This study developed a simple but effective method to detect DIF using the odds ratio (OR) of two groups’ responses to a studied item. The OR method uses all available information from examinees’ responses, and it can eliminate the potential influence of bias in the total scores. Through a series of simulation studies in which the DIF pattern, impact, sample size (equal/unequal), purification procedure (with/without), percentages of DIF items, and proportions of missing data were manipulated, the performance of the OR method was evaluated and compared with the LR and MH methods. The results showed that the OR method without a purification procedure outperformed the LR and MH methods in controlling false positive rates and yielding high true positive rates when tests had a high percentage of DIF items favoring the same group. In addition, only the OR method was feasible when tests adopted the item matrix sampling design. The effectiveness of the OR method was illustrated with an empirical example.
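The general idea of screening items by a between-group odds ratio can be sketched as follows. This is a simplified illustration and not the authors' procedure: the 0.5 continuity correction, the use of the median log odds ratio as the reference point, the 1.96 cutoff, and the function names are all assumptions made for this example. Missing responses are marked with `None` and skipped item by item, which reflects the abstract's point that each item uses whatever responses are available.

```python
import math
from statistics import median

def log_odds_ratio(ref, foc):
    """Log odds ratio of a correct (1) response, reference vs focal group.
    None marks a missing response and is ignored."""
    r1 = sum(1 for x in ref if x == 1)
    r0 = sum(1 for x in ref if x == 0)
    f1 = sum(1 for x in foc if x == 1)
    f0 = sum(1 for x in foc if x == 0)
    # Assumption: add 0.5 to every cell so empty cells do not break the log
    r1, r0, f1, f0 = (c + 0.5 for c in (r1, r0, f1, f0))
    lor = math.log((r1 / r0) / (f1 / f0))
    se = math.sqrt(1 / r1 + 1 / r0 + 1 / f1 + 1 / f0)
    return lor, se

def flag_dif(ref_items, foc_items, z_crit=1.96):
    """Flag items whose log odds ratio departs from the median across items.
    Centering on the median is an assumption of this sketch."""
    stats = [log_odds_ratio(r, f) for r, f in zip(ref_items, foc_items)]
    center = median(lor for lor, _ in stats)
    return [abs(lor - center) / se > z_crit for lor, se in stats]

# Hypothetical responses: items 1 and 2 behave alike in both groups,
# item 3 strongly favors the reference group
balanced = [1] * 50 + [0] * 50
ref_items = [balanced, balanced, [1] * 80 + [0] * 20]
foc_items = [balanced, balanced, [1] * 20 + [0] * 80]
flags = flag_dif(ref_items, foc_items)
print(flags)
```

Because each item is compared with the other items' odds ratios rather than matched on a total score, the sketch needs no complete response vector per examinee.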

Collaboration

Top Co-Authors

Kuan-Yu Jin
University of Hong Kong

Keh-chung Lin
National Taiwan University

Herman H. M. Lo
Hong Kong Polytechnic University

Jerf W. K. Yeung
City University of Hong Kong

Gloria Hongyee Chan
Caritas Institute of Higher Education

Jinxin Zhu
University of Hong Kong