Katja Ocepek-Welikson

Publication

Featured research published by Katja Ocepek-Welikson.

Statistics in Medicine | 2000

Modern psychometric methods for detection of differential item functioning: application to cognitive assessment measures

Jeanne A. Teresi; Marjorie Kleinman; Katja Ocepek-Welikson

Cognitive screening tests and items have been found to perform differently across groups that differ in terms of education, ethnicity and race. Despite the profound implications that such bias holds for studies in the epidemiology of dementia, little research has been conducted in this area. Using the methods of modern psychometric theory (in addition to those of classical test theory), we examined the performance of the Attention subscale of the Mattis Dementia Rating Scale. Several item response theory models, including the two- and three-parameter dichotomous response logistic model, as well as a polytomous response model were compared. (Log-likelihood ratio tests showed that the three-parameter model was not an improvement over the two-parameter model.) Data were collected as part of the ten-study National Institute on Aging Collaborative investigation of special dementia care in institutional settings. The subscale KR-20 estimate for this sample was 0.92. IRT model-based reliability estimates, provided at several points along the latent attribute, ranged from 0.65 to 0.97; the measure was least precise at the less disabled tail of the distribution. Most items performed in similar fashion across education groups; the item characteristic curves were almost identical, indicating little or no differential item functioning (DIF). However, four items were problematic. One item (digit span backwards) demonstrated a large error term in the confirmatory factor analysis; item-fit chi-square statistics developed using BIMAIN confirm this result for the IRT models. Further, the discrimination parameter for that item was low for all education subgroups. Generally, persons with the highest education had a greater probability of passing the item for most levels of theta. Model-based tests of DIF using MULTILOG identified three other items with significant, albeit small, DIF. One item, for example, showed non-uniform DIF in that at the impaired tail of the latent distribution, persons with higher education had a higher probability of correctly responding to the item than did lower education groups, but at less impaired levels, they had a lower probability of a correct response than did lower education groups. Another method of detection identified this item as having DIF (unsigned area statistic=3.05, p<0.01, and 2.96, p<0.01). On average, across the entire score range, the lower education groups' probability of answering the item correctly was 0.11 higher than the higher education groups' probability. A cross-validation with larger subgroups confirmed the overall result of little DIF for this measure. The methods used for detecting differential item functioning (which may, in turn, be indicative of bias) were applied to a neuropsychological subtest. These methods have been used previously to examine bias in screening measures across education and ethnic and racial subgroups. In addition to the important epidemiological applications of ensuring that screening measures and neuropsychological tests used in diagnoses are free of bias so that more culture-fair classifications will result, these methods are also useful for the examination of site differences in large multi-site clinical trials. It is recommended that these methods receive wider attention in the medical statistical literature.
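
To make the idea of non-uniform DIF in this abstract more concrete, the following is a minimal sketch, not the authors' analysis. It assumes a two-parameter logistic item characteristic curve and uses hypothetical item parameters for two education groups. The function names, the parameter values, and the integration range are all assumptions, and the raw areas it prints are not the unsigned area test statistics reported in the abstract.

```python
import math


def icc_2pl(theta, a, b):
    """Probability of a correct response under a two-parameter logistic model."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def area_between_iccs(ref, focal, lo=-4.0, hi=4.0, steps=800, signed=False):
    """Trapezoid-rule area between two ICCs over [lo, hi].

    ref and focal are (a, b) tuples; all values used below are hypothetical.
    """
    width = (hi - lo) / steps
    total = 0.0
    for i in range(steps + 1):
        theta = lo + i * width
        gap = icc_2pl(theta, *ref) - icc_2pl(theta, *focal)
        if not signed:
            gap = abs(gap)
        # Endpoints get half weight in the trapezoid rule.
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * gap * width
    return total


# Hypothetical parameters: same difficulty, different discrimination,
# so the two curves cross at theta = 0 (non-uniform DIF).
higher_education = (1.5, 0.0)
lower_education = (0.7, 0.0)

unsigned = area_between_iccs(higher_education, lower_education)
signed = area_between_iccs(higher_education, lower_education, signed=True)
print(f"unsigned area: {unsigned:.3f}, signed area: {signed:.3f}")
```

With crossing curves the signed differences cancel while the unsigned area does not, which is one reason an unsigned measure is informative when DIF changes direction along the latent trait.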

Medical Care | 2006

Identification of differential item functioning using item response theory and the likelihood-based model comparison approach. Application to the Mini-Mental State Examination.

Maria Orlando Edelen; David Thissen; Jeanne A. Teresi; Marjorie Kleinman; Katja Ocepek-Welikson

Background: An important part of examining the adequacy of measures for use in ethnically diverse populations is the evaluation of differential item functioning (DIF) among subpopulations such as those administered the measure in different languages. A number of methods exist for this purpose. Objective: The objective of this study was to introduce and demonstrate the identification of DIF using item response theory (IRT) and the likelihood-based model comparison approach. Methods: Data come from a sample of community-residing elderly who were part of a dementia case registry. A total of 1578 participants were administered either an English (n = 913) or Spanish (n = 665) version of the 21-item Mini-Mental State Examination. IRT was used to identify language DIF in these items with the likelihood-based model comparison approach. Results: Fourteen of the 21 items exhibited significant DIF according to language of administration. However, because the direction of the identified DIF was not consistent for one language version over the other, the impact at the scale level was negligible. Conclusions: IRT and the likelihood-based model comparison approach comprise a powerful tool for DIF detection that can aid in the development, refinement, and evaluation of measures for use in ethnically diverse populations.
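
The likelihood-based model comparison named in this abstract can be illustrated with a short sketch, under stated assumptions. It compares a model that constrains an item's parameters to be equal across groups with one that frees them, using a likelihood ratio statistic referred to a chi-square distribution. The log-likelihood values, the function names, and the choice of 2 degrees of freedom (one discrimination and one location parameter freed) are hypothetical and are not taken from the study.

```python
import math


def lr_statistic(loglik_constrained, loglik_free):
    """Likelihood ratio statistic: -2 * (constrained - free log-likelihood)."""
    return -2.0 * (loglik_constrained - loglik_free)


def chi2_sf_even_df(x, df):
    """Upper-tail chi-square probability, closed form valid for even df only."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form requires a positive, even df")
    half = x / 2.0
    term = 1.0
    total = 1.0
    # Sum of (x/2)^k / k! for k = 0 .. df/2 - 1.
    for k in range(1, df // 2):
        term *= half / k
        total += term
    return math.exp(-half) * total


# Hypothetical log-likelihoods for one studied item.
g2 = lr_statistic(loglik_constrained=-1005.0, loglik_free=-1000.0)
p_value = chi2_sf_even_df(g2, df=2)
print(f"G2 = {g2:.2f}, p = {p_value:.4f}")
```

A small p-value would flag the item for DIF. In practice the degrees of freedom depend on the IRT model and on how many parameters are freed, and results are usually adjusted for multiple comparisons.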

Drug and Alcohol Dependence | 1995

Imipramine treatment of cocaine abuse: possible boundaries of efficacy

Edward V. Nunes; Frederic M. Quitkin; Katja Ocepek-Welikson; Jonathan W. Stewart; Teresa Koenig; Steven Wager; Donald F. Klein

A 12-week placebo-controlled, randomized clinical trial was undertaken to evaluate imipramine as a treatment for cocaine abuse, and to examine whether its effect may be limited to subgroups defined by route of use or by diagnosis of depression. One hundred thirteen patients were randomized, stratified by route of use and depression. All patients received weekly individual counseling. Compared to placebo, the imipramine group showed greater reductions in cocaine craving, cocaine euphoria, and depression, but the effect of imipramine on cocaine use was less clear. A favorable response, defined as at least 3 consecutive, urine-confirmed, cocaine-free weeks, was achieved by 19% (11/59) of patients on imipramine compared to 7% (4/54) on placebo (P < 0.09). The imipramine effect was greater among nasal users: 33% (9/27) response on imipramine vs. 5% (1/22) on placebo (P < 0.02). Response was also more frequent, but not significantly so, among depressed users on imipramine (26%, 10/38) than on placebo (13%, 4/31) (P < 0.19). Response rates were low in intravenous and freebase users and those without depression. Considered together with the literature on desipramine, these data suggest tricyclic antidepressants are not promising as a mainstay of treatment for unselected cocaine abusers. However, tricyclics may be useful for selected cocaine abusers with comorbid depression or intranasal use, or in conjunction with a more potent psychosocial intervention.
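
The response rates quoted in this abstract follow directly from the reported counts. The short sketch below recomputes them along with the absolute difference between arms. The dictionary layout and function name are illustrative choices, and no significance test is attempted here.

```python
def response_rate(responders, total):
    """Percentage of patients meeting the response definition."""
    return 100.0 * responders / total


# (responders, total) per arm, taken from the counts in the abstract.
counts = {
    "all patients": {"imipramine": (11, 59), "placebo": (4, 54)},
    "nasal users": {"imipramine": (9, 27), "placebo": (1, 22)},
    "depressed users": {"imipramine": (10, 38), "placebo": (4, 31)},
}

for group, arms in counts.items():
    drug = response_rate(*arms["imipramine"])
    placebo = response_rate(*arms["placebo"])
    print(f"{group}: imipramine {drug:.0f}%, placebo {placebo:.0f}%, "
          f"difference {drug - placebo:.1f} points")
```

The two overall denominators, 59 and 54, sum to the 113 randomized patients.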

Quality of Life Research | 2007

A comparison of three sets of criteria for determining the presence of differential item functioning using ordinal logistic regression

Paul K. Crane; Laura E. Gibbons; Katja Ocepek-Welikson; Karon F. Cook; David Cella; Kaavya Narasimhalu; Ron D. Hays; Jeanne A. Teresi

Background: Several techniques have been developed to detect differential item functioning (DIF), including ordinal logistic regression (OLR). This study compared different criteria for determining whether items have DIF using OLR. Objectives: To compare and contrast findings from three different sets of criteria for detecting DIF using OLR. General distress and physical functioning items were evaluated for DIF related to five covariates: age, marital status, gender, race, and Hispanic origin. Research design: Cross-sectional study. Subjects: 1,714 patients with cancer or HIV/AIDS. Measures: A total of 23 items addressing physical functioning and 15 items addressing general distress were selected from a pool of 154 items from four different health-related quality of life questionnaires. Results: The three sets of criteria produced qualitatively and quantitatively different results. Criteria based on statistical significance alone detected DIF in almost all the items, while alternative criteria based on magnitude detected DIF in far fewer items. Accounting for DIF by using demographic-group specific item parameters had negligible effects on individual scores, except for race. Conclusions: Specific criteria chosen to determine whether items have DIF have an impact on the findings. Criteria based entirely on statistical significance may detect small differences that are clinically negligible.

Journal of Clinical Psychopharmacology | 1992

Predictive Value of Symptoms of Atypical Depression for Differential Drug Treatment Outcome

Patrick J. McGrath; Jonathan W. Stewart; Wilma Harrison; Katja Ocepek-Welikson; Judith G. Rabkin; Edward N. Nunes; Steven Wager; Elaine Tricamo; Frederic M. Quitkin; Donald F. Klein

Data for 401 depressed outpatients with mood reactivity who participated in a randomized trial comparing placebo, imipramine, and phenelzine were analyzed for predictors of differential response by stepwise multiple regression techniques. Features of the Columbia criteria for atypical depression including oversleeping, overeating, severe anergy, and pathologic rejection sensitivity were each predictive of a poorer response to imipramine than to phenelzine only when compared to those patients with none of the features. These features were not additive in their contribution to differential outcome. Lack of endogenous features was not predictive of a differential drug treatment response. Compared with patients who have no symptoms of atypical depression, patients with any of the four features had an inferior imipramine response rather than a superior phenelzine response. These analyses indicate that the clear differential responsivity to medication treatment in atypical depression is not simply related to any one defining symptom and that further correlates of this apparent biological heterogeneity need to be explored.

Quality of Life Research | 2007

IRT health outcomes data analysis project: an overview and summary

Karon F. Cook; Cayla R. Teal; Jakob B. Bjorner; David Cella; Chih Hung Chang; Paul K. Crane; Laura E. Gibbons; Ron D. Hays; Colleen A. McHorney; Katja Ocepek-Welikson; Anastasia E. Raczek; Jeanne A. Teresi; Bryce B. Reeve

Background: In June 2004, the National Cancer Institute and the Drug Information Association co-sponsored the conference, “Improving the Measurement of Health Outcomes through the Applications of Item Response Theory (IRT) Modeling: Exploration of Item Banks and Computer-Adaptive Assessment.” A component of the conference was presentation of a psychometric and content analysis of a secondary dataset. Objectives: A thorough psychometric and content analysis was conducted of two primary domains within a cancer health-related quality of life (HRQOL) dataset. Research design: HRQOL scales were evaluated using factor analysis for categorical data, IRT modeling, and differential item functioning analyses. In addition, computerized adaptive administration of HRQOL item banks was simulated, and various IRT models were applied and compared. Subjects: The original data were collected as part of the NCI-funded Quality of Life Evaluation in Oncology (Q-Score) Project. A total of 1,714 patients with cancer or HIV/AIDS were recruited from 5 clinical sites. Measures: Items from 4 HRQOL instruments were evaluated: Cancer Rehabilitation Evaluation System–Short Form, European Organization for Research and Treatment of Cancer Quality of Life Questionnaire, Functional Assessment of Cancer Therapy and Medical Outcomes Study Short-Form Health Survey. Results and conclusions: Four lessons learned from the project are discussed: the importance of good developmental item banks, the ambiguity of model fit results, the limits of our knowledge regarding the practical implications of model misfit, and the importance in the measurement of HRQOL of construct definition. With respect to these lessons, areas for future research are suggested. The feasibility of developing item banks for broad definitions of health is discussed.

Quality of Life Research | 2007

Evaluating measurement equivalence using the item response theory log-likelihood ratio (IRTLR) method to assess differential item functioning (DIF): applications (with illustrations) to measures of physical functioning ability and general distress

Jeanne A. Teresi; Katja Ocepek-Welikson; Marjorie Kleinman; Karon F. Cook; Paul K. Crane; Laura E. Gibbons; Leo S. Morales; Maria Orlando-Edelen; David Cella

Background: Methods based on item response theory (IRT) that can be used to examine differential item functioning (DIF) are illustrated. An IRT-based approach to the detection of DIF was applied to physical function and general distress item sets. DIF was examined with respect to gender, age and race. The method used for DIF detection was the item response theory log-likelihood ratio (IRTLR) approach. DIF magnitude was measured using the differences in the expected item scores, expressed as the unsigned probability differences, and calculated using the non-compensatory DIF index (NCDIF). Finally, impact was assessed using expected scale scores, expressed as group differences in the total test (measure) response functions. Methods: The example for the illustration of the methods came from a study of 1,714 patients with cancer or HIV/AIDS. The measure contained 23 items measuring physical functioning ability and 15 items addressing general distress, scored in the positive direction. Results: The substantive findings were of relatively small magnitude DIF. In total, six items showed relatively larger magnitude (expected item score differences greater than the cutoff) of DIF with respect to physical function across the three comparisons: “trouble with a long walk” (race), “vigorous activities” (race, age), “bending, kneeling stooping” (age), “lifting or carrying groceries” (race), “limited in hobbies, leisure” (age), “lack of energy” (race). None of the general distress items evidenced high magnitude DIF, although “worrying about dying” showed some DIF with respect to both age and race, after adjustment. Conclusions: The fact that many physical function items showed DIF with respect to age, even after adjustment for multiple comparisons, indicates that the instrument may be performing differently for these groups. While the magnitude and impact of DIF at the item and scale level were minimal, caution should be exercised in the use of subsets of these items, as might occur with selection for clinical decisions or computerized adaptive testing. The selection of anchor items and the criteria for DIF detection, including the integration of significance and magnitude measures, remain issues requiring investigation. Further research is needed regarding the criteria and guidelines appropriate for DIF detection in the context of health-related items.

International Journal of Eating Disorders | 1994

A double-blind placebo-controlled comparison of phenelzine and imipramine in the treatment of bulimia in atypical depressives.

Rachel Rothschild; H. Matthew Quitkin; Frederic M. Quitkin; Jonathan W. Stewart; Katja Ocepek-Welikson; Elaine Tricamo

Although antidepressants have been found to be superior to placebo in 12 of 14 studies, the relationship between improvement in the depressive diathesis and bulimia is unclear. In this study, the efficacy of placebo, imipramine, and phenelzine is examined in patients comorbid for atypical depression and bulimia. Greater improvement was observed for both depressive and bulimic symptoms with phenelzine than with either imipramine or placebo. Consistent with its poor antidepressant effects in atypical depression, imipramine seemed to have minimal efficacy for the bulimic symptoms of atypical depressives. These data suggest that the presence of bulimia does not alter the treatment response of atypically depressed patients. Furthermore, the data may suggest a link between depression and bulimia in atypical depressives. Demonstrating a statistical difference with a small sample suggests the effect size is robust; however, conclusions are limited by the small sample size.

Journal of Affective Disorders | 1997

The efficacy of imipramine and psychotherapy in early-onset chronic depression: a reanalysis of the National Institute of Mental Health Treatment of Depression Collaborative Research Program

Vito Agosti; Katja Ocepek-Welikson

The authors compared the effectiveness of Cognitive Behavioral Therapy (CBT), Interpersonal Psychotherapy (IPT), and Imipramine Clinical Management (ICM) to Placebo Clinical Management (PCM) for outpatients with early-onset chronic depression (N = 65) in the National Institute of Mental Health (NIMH) Treatment of Depression Collaborative Research Program (TDRP). The post-treatment depression scores of the CBT, IPT, and ICM groups were not significantly different from those of the PCM group. We did not find a relationship between the duration of Major Depression and response to a specific treatment. Studies are needed to determine if combining psychotherapy with medication improves social functioning and enhances the quality of life for patients with chronic depression.

Medical Care | 2006

Effect of Inpatient Quality of Care on Functional Outcomes in Patients With Hip Fracture

Albert L. Siu; Kenneth S. Boockvar; Joan D. Penrod; R. Sean Morrison; Ethan A. Halm; Ann Litke; Stacey B. Silberzweig; Jeanne A. Teresi; Katja Ocepek-Welikson; Jay Magaziner

Objectives: We sought to examine the relationship between functional outcome and process of care for patients with hip fracture. Research Design and Participants: We undertook a prospective cohort study in 4 hospitals of 554 patients treated with surgery for hip fracture. Measurements: Information on patient characteristics and processes of hospital care collected from the medical record, interviews, and bedside observations. Follow-up information obtained at 6 months on function (using the Functional Independence Measure [FIM]), survival, and readmission. Results: Individual processes of care were generally not associated with adjusted outcomes. A scale of 9 processes related to mobilization was associated with improved adjusted locomotion (P = 0.006), self care (P = 0.022), and transferring (P = 0.007) at 2 months, but the benefits were smaller and not significant by 6 months. These processes were not associated with mortality. The predicted value for the FIM locomotion measure (range, 2–14) at 2 months was 5.9 (95% confidence interval 5.4–6.4) for patients at the 10th percentile of performance on these processes compared with 7.1 (95% confidence interval 6.6–7.6) at the 90th percentile. Patients who experienced no hospital complications and no readmissions retained the benefits in locomotion at 6 months. Anticoagulation processes were associated with improved transferring at 2 months (P = 0.046) but anticoagulation and other processes of care were not otherwise associated with improved function. Discussion: Our findings indicate the need to attend to all steps in the care of patients with hip fracture. Additionally, functional outcomes were more sensitive markers of improved process of care, compared with 6-month mortality, in the case of hip fracture.