Identifying latent classes with ordered categorical indicators
R. Noah Padgett
Department of Educational Psychology
Baylor University
Waco, TX 76706
[email protected]

Rebecca J. Tipton
Department of Educational Psychology
Baylor University
Waco, TX 76706
Abstract
A Monte Carlo simulation was used to determine which assumption about ordered categorical data, continuity versus discrete categories, most frequently identifies the underlying factor structure when a response variable has five ordered categories. The impact of infrequently endorsed response categories was also examined, a condition that has not been fully explored. The typical method for overcoming infrequently endorsed categories in applied research is to collapse response options with adjacent categories, resulting in fewer response categories that are endorsed more frequently, but this approach may not necessarily provide useful information. Response category endorsement issues have been studied in Item Response Theory, but this issue has not been addressed in classification analyses, nor has fit measure performance been examined under these conditions. We found that the performance of commonly used fit statistics in identifying the true number of latent classes depends on whether continuity is assumed, sample size, and convergence. Fit statistics performed best when the five response options were assumed to be categorical. However, in situations with lower sample sizes and when convergence is an issue, assuming continuity and using the adjusted Lo-Mendell-Rubin likelihood ratio test may be useful.
Classification is an integral part of modern society. The presence of classification systems is ubiquitous, ranging from how groceries are organized in a store to grouping children by ability levels in a classroom. Classification, or clustering, of individuals in a population is oftentimes required so that treatments or programs can be implemented for identified subgroups. Classification analyses aim to bring a more objective stance to such groupings. The utility of interventions depends on accurate group identification through appropriate use of classification analysis, so choosing appropriate models and discarding those that are not a good fit or representation of the sample is important [16, 5]. Many procedures and frameworks exist to assess group belonging and how well a generated model, or theoretical mathematical representation of relationships in a population, fits the observed sample. One such framework is mixture modeling. Mixture modeling works under the premise that observed data come from a heterogeneous population, and the modeling framework aims to sort cases that are similar on observed characteristics into more homogeneous subgroups. The resulting homogeneous populations are the mixtures that comprise the observed data [13].

Two such analyses that aim to find homogeneous subgroups from data assumed to come from a heterogeneous population are latent class analysis (LCA) and latent profile analysis (LPA). These two analyses have the same goal: to classify cases (individuals) into a small number of groups. However, each analysis makes different assumptions about the observed data; LCA assumes the indicators are categorical while LPA assumes the indicators are continuous. Prior methodological investigations have examined the effect nonnormal indicators have on model selection, particularly on deciding the number of latent classes [13, 15].

One type of nonnormally distributed indicator arises when a response category is infrequently selected, a condition that has not been fully explored. Consider an item where all respondents answer, so that no missing data exist for the item, but the responses are skewed and one of the categories is unendorsed or infrequently endorsed, whether by chance or by the nature of the sample. For example, the resulting frequencies of item category responses would contain a category without sufficient endorsement in the sample (e.g., sample size = 500: SD = 25, D = 0, N = 50, A = 200, SA = 225). The typical method for overcoming this occurrence in applied research is to collapse the response option with an adjacent category, resulting in a more parsimonious solution but not necessarily one that is as representative or useful. Collapsing categories is a simple way to ease parameter estimation by reducing the number of categories by one. Infrequency of response category endorsement is discussed in Item Response Theory, where the latent variable of interest is continuous: "If a response category has zero or few responses, a program to estimate parameters will not be able to estimate [graded response model parameters]" [8]. However, the issue has not been fully explored in classification analyses, nor has fit measure performance been examined under these conditions.

In this study, we investigate what happens when ordered response categories are assumed continuous and mixture modeling is employed.
The assumption of continuity is expected to aid in model selection when a response category is unendorsed or infrequently endorsed. Furthermore, we investigate the performance of fit statistics used in model selection under these conditions. The remainder of our discussion is structured as follows. We discuss the LCA/LPA model in more detail, followed by a brief introduction to the included statistical fit measures for model selection. We then describe the Monte Carlo simulation study conditions. The results are given in the third section, followed by a discussion that includes recommendations for the use of statistical fit indices in model selection.
LCA and LPA

Classification analyses such as latent class analysis (LCA) and latent profile analysis (LPA) are two types of mixture models used for classification. These models are receiving growing attention in methodological research due to the expanding application of such models [26, 14, 18, 22]. The goal of LCA and LPA alike is to classify similar objects into one of K groups or classes of unknown form and frequency, with the form of a group referring to cluster-specific centroids, variances, and covariances, and the frequency referring to the number of underlying groups present.

The model is briefly introduced below; a more in-depth discussion of the LCA/LPA models can be found in [13]. Let $y_i$ represent the vector of responses of individual $i$ across the $J$ items, where $i = 1, \ldots, N$ indexes the individuals sampled, $j = 1, \ldots, J$ indexes the items, and $K$ is the number of latent classes specified. LCA is a probabilistic model for each unique response pattern observed in a sample and is defined as

$$P(y_i) = \sum_{k=1}^{K} P(X = k) \prod_{j=1}^{J} P(y_{ij} \mid X = k) \qquad (1)$$

where $P(y_i)$ is the probability of the response pattern of individual $i$; $P(X = k)$ is the probability of membership in class $k$, also known as the class prevalence, which is the proportional size of class $k$; and $P(y_{ij} \mid X = k)$ is the probability of the response of individual $i$ on the $j$th item conditional on class membership. The difference between LCA and LPA is in the form of the item response probability. In LCA, each class has specific item response/endorsement probabilities that differentiate the classes, whereas in LPA each class has specific response means and variances.
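To make Equation (1) concrete, the following sketch (ours, not from the original study) evaluates the probability of a single response pattern under an LCA model; the class prevalences are illustrative, and the item-response probabilities in `rho` are randomly generated placeholder values.

```r
# Sketch: evaluating Equation (1) for one response pattern.
# pi_k holds the class prevalences P(X = k); rho[[k]] is a J x C matrix of
# P(y_j = c | X = k) for class k. All values here are illustrative only.
pi_k <- c(0.45, 0.40, 0.15)                 # three latent classes
J <- 3; C <- 5                              # items and response categories
set.seed(1)
rho <- lapply(1:3, function(k) {
  m <- matrix(runif(J * C), J, C)
  m / rowSums(m)                            # each row sums to 1
})

# P(y_i) = sum_k P(X = k) * prod_j P(y_ij | X = k)
lca_pattern_prob <- function(y, pi_k, rho) {
  sum(sapply(seq_along(pi_k), function(k) {
    pi_k[k] * prod(rho[[k]][cbind(seq_along(y), y)])
  }))
}

lca_pattern_prob(c(5, 4, 5), pi_k, rho)     # probability of pattern (5, 4, 5)
```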
Model Selection Assumptions

Models, by definition, are only approximations to unknown reality or truth. George Box made the famous statement, "All models are wrong but some are useful." Models that perfectly reflect reality do not exist, but selecting models that best represent relationships in the population can still be useful in decision-making. Three general guiding principles that should be considered during model selection are (a) parsimony, (b) multiplicity or alternative model hypotheses, and (c) strength of evidence. Inference under models with too few parameters can be biased, while models having too many parameters result in poor precision and spurious conclusions. A balance must exist between under- and over-specification during the model selection process. As such, several working hypotheses or alternative models must be considered as potential representations of the true population based on theory and previous research, and assessed accordingly, but their number should be kept small and should reflect scientific context. Judgment on the part of the researcher is critical during the hypothesis testing stage, and considering multiple sources and strengths of evidence in concert with previous research findings is important in model selection.
Fit Indices for Model Selection
Once models have been estimated, the researcher is tasked with selecting the optimal fitting model. The model with the correct number of latent classes is selected using a mix of statistical evidence and guiding theory. There are no established guidelines for determining adequate fit of a theoretical model to data, but ensuring that parameter estimates are within a reasonable range and that standard errors for the estimates are not too large is an important component of assessing model fit [12]. The fit indices included in a study should be chosen based on the study being conducted, but they cannot be evaluated independently of one another [24]. In this article, we examine two major procedures for determining fit in model selection: information criteria and likelihood ratio based tests.
Information Criteria Procedures
Measures based on information criteria are of particular importance because they provide a framework for comparing models with differing numbers of parameters and different class enumerations. In general, the model with the lowest information criterion value is the model that best approximates the relationships observed in the population, and models that do not fit better than the baseline model can be dismissed. For this study, we assess Akaike's Information Criterion (AIC), the corrected AIC (AICc), the Bayesian Information Criterion (BIC), and the sample size-adjusted Bayesian Information Criterion (ssBIC). These criteria are sensitive to features such as sample size and parsimony, and model selection should be informed by the biases innate in the criteria selected. To date, there is no commonly accepted best criterion for determining the number of classes in mixture modeling, despite various suggestions.

To see how well these models fit collected data, the log-likelihood value is typically used. Because the likelihood function takes on small values and the resulting log-likelihood is negative, a value close to 0 indicates optimal fit. A log-likelihood closer to 0 corresponds to the likelihood function approaching 1, which would indicate that the model predicts these data well. Prior examination of correct class enumeration based on log-likelihood values, however, has shown poor results [18, 14].

The Akaike Information Criterion (AIC) is a goodness-of-fit measure that reflects the extent to which the observed covariance matrix deviates from the model-predicted covariance matrix [1, 2], with a lower AIC value between two competing models indicating better model fit (Kaplan, 2000; Takane & Bozdogan, 1987). The AIC is defined as
$$\mathrm{AIC} = -2L + 2p \qquad (2)$$

where $L$ is the value of the log-likelihood function and $p$ is the number of free model parameters. The AIC penalizes the complexity of the proposed model, meaning use of the AIC is an attempt to minimize the overall error caused by added parameters. The corrected AIC (AICc) is a bias-corrected version of the AIC for small sample sizes [23, 6]. The AICc is considered more stringent than the AIC, having a greater penalty for models with larger numbers of parameters:

$$\mathrm{AICc} = -2L + 2p + \frac{2p(p + 1)}{n - p - 1} \qquad (3)$$

The Bayesian Information Criterion (BIC) is similar to the AIC in that it is also a goodness-of-fit measure reflecting the extent to which the observed covariance matrix deviates from the predicted covariance matrix [21]. The BIC is defined as

$$\mathrm{BIC} = -2L + p \log(n) \qquad (4)$$

A lower BIC value between two competing models indicates better model fit; the penalty of the BIC increases with sample size. To temper this penalty, the sample size-adjusted Bayesian Information Criterion (ssBIC) replaces $n$ in the above equation with $n^* = (n + 2)/24$:

$$\mathrm{ssBIC} = -2L + p \log(n^*) \qquad (5)$$

For each of the information criteria, the optimal fitting model is the compared model with the lowest value.
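For reference, the four criteria can be computed directly from a model's log-likelihood. A minimal R sketch, assuming only the definitions above (the example values of L, p, and n are hypothetical):

```r
# Sketch: information criteria from log-likelihood L, free parameters p,
# and sample size n, following Equations (2) through (5).
info_criteria <- function(L, p, n) {
  n_star <- (n + 2) / 24                  # sample size adjustment for ssBIC
  c(AIC   = -2 * L + 2 * p,
    AICc  = -2 * L + 2 * p + (2 * p * (p + 1)) / (n - p - 1),
    BIC   = -2 * L + p * log(n),
    ssBIC = -2 * L + p * log(n_star))
}

# Example: comparing two candidate solutions; lower values are preferred.
info_criteria(L = -3405.2, p = 33, n = 500)   # hypothetical 2-class fit
info_criteria(L = -3312.8, p = 50, n = 500)   # hypothetical 3-class fit
```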
Likelihood Ratio Tests

Other measures of fit include likelihood ratio tests. We focus our attention on the adjusted Lo-Mendell-Rubin likelihood ratio test (aLMR) [11, 25]. The likelihood ratio test compares models that differ in the number of classes by indicating whether the model with K - 1 classes should be rejected in favor of the model with K classes (Lo, Mendell, & Rubin, 2001; Muthén, 2004). The aLMR is sample-size dependent, meaning that a larger sample size inflates the test statistic [11].

Bootstrapping procedures can also be used to supplement the likelihood ratio test family by generating and using empirical distributions of likelihoods [13]. Comparison of models using a likelihood-based technique can be done with a parametric bootstrapping method, where bootstrap samples are used to estimate the distribution of likelihood differences, and a p-value allows for comparison of the K - 1 and K class models. The bootstrap likelihood ratio test (BLRT) can then be used to determine whether two competing models are significantly different, as described in detail by [18, 4]. It has been suggested that the BLRT may be more appropriate than LMR statistics when comparing LCA models due to the BLRT's accuracy and tendency to produce nonsignificant findings for models with increased K, whereas LMR findings tend to alternate between significance and nonsignificance as K increases.

Both the aLMR and BLRT procedures were generated in this study to assess whether the proposed K - 1 class model should be rejected in favor of the K class model. The K - 1 class solution is selected as optimal if the p-value for the test of K - 1 versus K classes is greater than .05, meaning that the K class model does not provide a better fit than the K - 1 class model.
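The parametric bootstrap behind the BLRT can be sketched generically. In the sketch below, `fit_model` and `simulate_data` are hypothetical wrapper functions (standing in for, e.g., an Mplus run or an R mixture package); the logic follows the procedure described above.

```r
# Sketch of the parametric bootstrap LRT (BLRT) for K-1 vs. K classes.
# fit_model(data, k) and simulate_data(fit, n) are hypothetical wrappers:
# the first returns a fitted k-class model with a $logLik element, the
# second draws a new data set from a fitted model.
blrt <- function(data, k, B = 100, fit_model, simulate_data) {
  fit0 <- fit_model(data, k - 1)               # null model: K-1 classes
  fit1 <- fit_model(data, k)                   # alternative: K classes
  lr_obs <- 2 * (fit1$logLik - fit0$logLik)    # observed LR statistic

  lr_boot <- replicate(B, {
    d  <- simulate_data(fit0, n = nrow(data))  # generate under the K-1 model
    f0 <- fit_model(d, k - 1)
    f1 <- fit_model(d, k)
    2 * (f1$logLik - f0$logLik)                # LR statistic under the null
  })

  # p-value: share of bootstrap LRs at least as large as the observed one
  mean(lr_boot >= lr_obs)
}
```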
Methods
Population Model Structure
Conditions were selected to mirror empirical research situations as closely as possible. The population structures for this study were replicated from [14], where Morgan reviewed educational databases for studies using LCA or similar mixture modeling techniques. The three factors varied were sample size, class prevalence, and category response distribution. The included sample sizes are 500, 1000, and 1500.
Class Prevalence
The class prevalence or class size is representative of various sizes of the latent populations. The underlyingpopulations may be approximately equal in size or a class may be more prevalent than the others. Theexistence of a rare class might also be speculated based on guiding theory; for example, in psychiatricdisorders the class exhibiting the disorder is likely a small portion of the general population. Varying classprevalence changes the probability of accurately defining a class, and classes that are rare are harder todefine or statistically justify. Three underlying classes are assumed in our study and the class prevalencesconsidered are 0.45-0.40-0.15, 0.59-0.26-0.15, and 0.89-0.08-0.03.
Category Response Distribution
The main scope of our investigation is to assess the category response distribution when the obtained data are ordered. We assumed the collection of data from ten items, each with five ordered categories. Response probability is the primary basis for classifying people based on like responses, so if a category is unendorsed or infrequently endorsed, classification becomes more difficult because less variability exists in response patterns. In our study, three different conditions are compared. The first condition has a distribution that contains an unendorsed response in the second lowest category. The second condition contains a category that is infrequently endorsed, with only 1% of subjects in each class endorsing that category. The last condition is a control condition in which each of the categories is endorsed under an approximately normal distribution. The distributions of class category response probabilities for the indicators are given in Table 1.

Table 1: Distribution of class response probabilities per condition

| Response Frequency | Class | SD   | D    | N    | A    | SA   |
|--------------------|-------|------|------|------|------|------|
| Unendorsed         | 1     | .05  | .00  | .05  | .10  | .80  |
|                    | 2     | .05  | .00  | .75  | .15  | .05  |
|                    | 3     | .90  | .00  | .05  | .025 | .025 |
| Infrequent         | 1     | .049 | .01  | .05  | .10  | .80  |
|                    | 2     | .05  | .01  | .75  | .149 | .05  |
|                    | 3     | .90  | .01  | .049 | .025 | .025 |
| Control            | 1     | .025 | .025 | .05  | .20  | .70  |
|                    | 2     | .05  | .10  | .70  | .10  | .05  |
|                    | 3     | .70  | .20  | .05  | .025 | .025 |
Note. The response frequency condition designates how endorsed each item response category is across conditions: 1) a response category is not endorsed, 2) a response category is infrequently endorsed, and 3) a control in which all response options are selected.
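As an illustration of how such data could be generated, here is a minimal R sketch using the unendorsed-category probabilities from Table 1 with the .45-.40-.15 prevalences. The study's actual data generation code is not reproduced here; this simplified version gives all ten items the same class-conditional probabilities.

```r
# Sketch: generating ordinal item responses under the unendorsed-category
# condition of Table 1 with class prevalences .45/.40/.15.
set.seed(2019)
n <- 500; J <- 10
prev  <- c(.45, .40, .15)
probs <- rbind(c(.05, .00, .05, .10,  .80),    # class 1: SD, D, N, A, SA
               c(.05, .00, .75, .15,  .05),    # class 2
               c(.90, .00, .05, .025, .025))   # class 3

cls <- sample(1:3, n, replace = TRUE, prob = prev)   # latent class labels
y <- sapply(1:J, function(j)
  sapply(cls, function(k) sample(1:5, 1, prob = probs[k, ])))

# Category 2 ("D") is never endorsed, by construction:
table(factor(y[, 1], levels = 1:5), cls)
```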
Summarizing Results

Mirroring real-life applied research problems of identifying an unknown number of classes, we fit one- through five-class solutions for each type of analysis. Model estimation was conducted in Mplus [17].
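The Mplus input files are not reproduced here, but an analogous workflow in R (an assumption on our part, using the poLCA package and the simulated matrix `y` from the sketch above) would fit the one- through five-class solutions and collect fit statistics:

```r
# Sketch: fitting one- through five-class LCA solutions with poLCA.
# This mirrors the Mplus workflow described in the text; it is not the
# authors' code. Responses must be integer-coded starting at 1.
library(poLCA)
df <- as.data.frame(y)                       # simulated responses from above
f  <- as.formula(paste("cbind(", paste(names(df), collapse = ","), ") ~ 1"))

fits <- lapply(1:5, function(k)
  poLCA(f, df, nclass = k, nrep = 20, verbose = FALSE))  # 20 random starts

# Collect fit statistics; lower values indicate the preferred solution.
data.frame(classes = 1:5,
           logLik  = sapply(fits, `[[`, "llik"),
           AIC     = sapply(fits, `[[`, "aic"),
           BIC     = sapply(fits, `[[`, "bic"))
```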
Results

The entirety of the results of this simulation study are available online [19]. A total of 27,000 data sets were successfully generated: 27 (3 × 3 × 3) conditions by 1,000 replications. For each data set, one- through five-class solutions were tested under both the categorical and continuous indicator assumptions, i.e., LCA and LPA. Testing these models for each of the 1,000 replications in each of the 27 conditions yields 270,000 estimated models. Convergence was found to be problematic when assuming categorical indicators, especially in the infrequently endorsed category conditions. Models that did not converge were flagged and the replication was deemed unusable, an approach taken in similar simulation studies; nonconvergence is typically due to an ill-defined likelihood function and/or an insufficient number of random starts [18, 14]. The convergence rates for each condition are provided in Table 2.

Table 2: Convergence rate across replications

| Response Frequency | Class Prevalence | LCA (N=500) | LPA (N=500) | LCA (N=1000) | LPA (N=1000) | LCA (N=1500) | LPA (N=1500) |
|--------------------|------------------|-------------|-------------|--------------|--------------|--------------|--------------|
| Unendorsed         | .45-.40-.15      | 1.000       | 1.000       | .990         | 1.000        | .967         | 1.000        |
|                    | .59-.26-.15      | .998        | 1.000       | .994         | 1.000        | .970         | 1.000        |
|                    | .89-.08-.03      | .999        | 1.000       | .979         | 1.000        | .898         | 1.000        |
| Infrequent         | .45-.40-.15      | .996        | 1.000       | .935         | 1.000        | .826         | 1.000        |
|                    | .59-.26-.15      | .998        | 1.000       | .967         | 1.000        | .875         | 1.000        |
|                    | .89-.08-.03      | .996        | 1.000       | .938         | 1.000        | .736         | 1.000        |
| Control            | .45-.40-.15      | .999        | 1.000       | .981         | 1.000        | .931         | 1.000        |
|                    | .59-.26-.15      | .999        | 1.000       | .985         | 1.000        | .949         | 1.000        |
|                    | .89-.08-.03      | .999        | 1.000       | .958         | 1.000        | .801         | 1.000        |
Information Criteria
In this study, four information criteria were examined for model selection: the Akaike information criterion (AIC), the corrected Akaike information criterion (AICc), the Bayesian information criterion (BIC), and the sample size-adjusted BIC (ssBIC). The proportion of valid replications in which each index correctly identified the number of latent classes is reported in Table 3. Results for the log-likelihood statistic are not reported because it failed to identify the correct number of latent classes under all conditions.

The difference in correct enumeration between the analyses is obvious for all information criteria considered. When the indicators are assumed continuous, the four information criteria are typically unable to find the correct number of latent classes under these conditions. The only exception is the BIC under the control condition, when the response distribution is fairly normally distributed; even then, the identification rate is still low (see Table 3). When the assumption of categorical indicators is used, the rate of correct specification increases remarkably. For example, the ssBIC perfectly identified the number of latent classes in all conditions under the assumption of categorical indicators, but under the assumption of continuity the ssBIC performed poorly (see Table 3). These results lend evidence to how sensitive the information criteria are to the assumption of continuity.
Table 3: Proportion of data sets for which each fit measure correctly identified a three-class solution as optimal, by analysis type

| RF | CP | N | AIC LCA | AIC LPA | AICc LCA | AICc LPA | BIC LCA | BIC LPA | ssBIC LCA | ssBIC LPA | aLMR LCA | aLMR LPA | BLRT LCA | BLRT LPA |
|----|----|------|-----|-----|------|-----|------|-----|------|-----|-----|---------|-----|-----|
| U  | 1  | 500  | .75 | .00 | 1.00 | .00 | 1.00 | .00 | 1.00 | .00 | .58 | **.69** | .69 | .00 |
| U  | 1  | 1000 | .74 | .00 | 1.00 | .00 | 1.00 | .00 | 1.00 | .00 | .43 | .27     | .61 | .00 |
| U  | 1  | 1500 | .72 | .00 | .98  | .00 | 1.00 | .00 | 1.00 | .00 | .35 | .12     | .61 | .00 |
| U  | 2  | 500  | .72 | .00 | 1.00 | .00 | 1.00 | .00 | 1.00 | .00 | .70 | .64     | .69 | .00 |
| U  | 2  | 1000 | .70 | .00 | .99  | .00 | 1.00 | .00 | 1.00 | .00 | .47 | .15     | .62 | .00 |
| U  | 2  | 1500 | .67 | .00 | .96  | .00 | 1.00 | .00 | 1.00 | .00 | .37 | .03     | .58 | .00 |
| U  | 3  | 500  | .75 | .00 | 1.00 | .00 | .98  | .00 | 1.00 | .00 | .66 | .45     | .83 | .00 |
| U  | 3  | 1000 | .70 | .00 | .99  | .00 | 1.00 | .00 | 1.00 | .00 | .49 | .07     | .76 | .00 |
| U  | 3  | 1500 | .72 | .00 | .97  | .00 | 1.00 | .00 | 1.00 | .00 | .34 | .01     | .73 | .00 |
| I  | 1  | 500  | .86 | .00 | 1.00 | .00 | 1.00 | .00 | 1.00 | .00 | .01 | **.69** | .67 | .00 |
| I  | 1  | 1000 | .83 | .00 | 1.00 | .00 | 1.00 | .00 | 1.00 | .00 | .25 | **.31** | .69 | .00 |
| I  | 1  | 1500 | .81 | .00 | 1.00 | .00 | 1.00 | .00 | 1.00 | .00 | .56 | .12     | .70 | .00 |
| I  | 2  | 500  | .90 | .00 | 1.00 | .00 | 1.00 | .00 | 1.00 | .00 | .07 | **.66** | .67 | .00 |
| I  | 2  | 1000 | .82 | .00 | 1.00 | .00 | 1.00 | .00 | 1.00 | .00 | .48 | .20     | .66 | .00 |
| I  | 2  | 1500 | .76 | .00 | .99  | .00 | 1.00 | .00 | 1.00 | .00 | .77 | .04     | .60 | .00 |
| I  | 3  | 500  | .89 | .00 | 1.00 | .00 | .96  | .00 | 1.00 | .00 | .09 | **.46** | .78 | .00 |
| I  | 3  | 1000 | .83 | .00 | 1.00 | .00 | 1.00 | .00 | 1.00 | .00 | .25 | .13     | .79 | .00 |
| I  | 3  | 1500 | .80 | .00 | 1.00 | .00 | 1.00 | .00 | 1.00 | .00 | .38 | .02     | .76 | .00 |
| C  | 1  | 500  | .63 | .00 | 1.00 | .00 | 1.00 | .45 | 1.00 | .00 | .58 | **.71** | .87 | .00 |
| C  | 1  | 1000 | .65 | .00 | 1.00 | .00 | 1.00 | .04 | 1.00 | .00 | .78 | .43     | .83 | .00 |
| C  | 1  | 1500 | .66 | .00 | .99  | .00 | 1.00 | .00 | 1.00 | .00 | .92 | .25     | .80 | .00 |
| C  | 2  | 500  | .64 | .00 | 1.00 | .00 | 1.00 | .14 | 1.00 | .00 | .72 | .67     | .84 | .00 |
| C  | 2  | 1000 | .62 | .00 | 1.00 | .00 | 1.00 | .00 | 1.00 | .00 | .89 | .21     | .81 | .00 |
| C  | 2  | 1500 | .62 | .00 | .99  | .00 | 1.00 | .00 | 1.00 | .00 | .96 | .07     | .78 | .00 |
| C  | 3  | 500  | .65 | .00 | 1.00 | .00 | .88  | .11 | 1.00 | .00 | .43 | **.57** | .91 | .00 |
| C  | 3  | 1000 | .64 | .00 | 1.00 | .00 | 1.00 | .00 | 1.00 | .00 | .48 | .16     | .90 | .00 |
| C  | 3  | 1500 | .66 | .00 | .99  | .00 | 1.00 | .00 | 1.00 | .00 | .57 | .03     | .86 | .00 |
Note. Cells in boldface indicate that LPA correctly specified the number of classes more often than LCA. LCA = latent class analysis; LPA = latent profile analysis; AIC = Akaike information criterion; AICc = corrected Akaike information criterion; BIC = Bayesian information criterion; ssBIC = sample size-adjusted Bayesian information criterion; aLMR = adjusted Lo-Mendell-Rubin likelihood ratio test; BLRT = bootstrap likelihood ratio test. RF is the response frequency condition described in Table 1, where U = unendorsed category, I = infrequently endorsed category, and C = control condition; CP is the class prevalence condition: 1) .45, .40, .15; 2) .59, .26, .15; 3) .89, .08, .03.
Likelihood Ratio Tests
Two likelihood ratio tests for model selection were examined: the adjusted Lo-Mendell-Rubin likelihood ratio test (aLMR) and the bootstrap likelihood ratio test (BLRT). Both likelihood ratio tests performed worse than the information criteria. The assumption of continuity had the strongest effect on the performance of the BLRT, which failed to identify the correct number of latent classes under all conditions (see Table 3). In contrast, the BLRT performed consistently when the indicators were assumed to be categorical. This stark difference in BLRT performance between LCA and LPA is not found with the aLMR.

The performance of the aLMR is more nuanced in the studied conditions compared to the other fit measures. We examined the performance of the aLMR with logistic regression to help identify which factors and levels in our study influenced correct specification. Notable results are evidence of an interaction between analysis type (LPA) and response distribution (infrequently endorsed category) (OR = 124.64), and an interaction among analysis type (LPA), response distribution (infrequently endorsed category), class prevalence (.89-.08-.03), and sample size (1000) (OR = 8.17).

These complex interactions lend evidence that the aLMR performs better under some conditions and under different assumptions about the continuity of the indicators. The interactions are difficult to make sense of from the values reported in Table 3, so the proportion of replications in which the aLMR identified a three-class solution as optimal is plotted by condition and analysis type in Figure 1 to aid interpretation. The assumption of continuity does appear to help when a response category is infrequently endorsed and sample size is small, but as sample size increases the assumption of continuity does not appear to help with model selection.
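For readers wishing to run this kind of follow-up analysis, a minimal sketch of the logistic regression in R, assuming a hypothetical replication-level data frame `res` with one row per converged replication (the column names are ours, not the authors'):

```r
# Sketch: logistic regression on whether the aLMR selected the 3-class model.
# `res` is a hypothetical data frame with a binary outcome almr_correct and
# factors for analysis type (LCA/LPA), response frequency condition, class
# prevalence condition, and sample size.
fit <- glm(almr_correct ~ analysis * resp_freq * prevalence * n_size,
           family = binomial, data = res)
exp(coef(fit))   # coefficients expressed as odds ratios
```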
Discussion
The use of classification analyses is multidisciplinary, so determining proper uses of available techniques and how to evaluate obtained results is crucial. The aim of our investigation was to find evidence of what occurs in model selection when ordered response categories are assumed to be continuous. We studied which statistical measures of model fit are useful when data are ordered response categories and when a response category is unendorsed or infrequently endorsed, under different assumptions about the continuity of the indicators.

The use of ordered response categories can lead to possible extraction issues. When ordered categories are assumed to be continuous, extraction of an incorrect number of latent classes is more likely. Such ordered categorical data are highly prevalent in applied research in education and psychology, as found in a review by [10]. An accepted guideline in methodological research is that these data can be treated as continuous if there are at least five categories. However, under the conditions studied here for mixture modeling, we found evidence that five ordered response categories should be treated as ordered categorical and not continuous. The assumption of continuity only seems beneficial when convergence is problematic, such as when a response category is infrequently endorsed. When convergence is not an issue, our results are similar to previous simulation studies of these fit indices. The BIC and ssBIC were found to consistently select the correct number of latent classes, a result similar to other studies [14, 18, 6, 22]. The BLRT was also found to perform poorly, which is counter to what [18] found but is similar to the results of [7].

These statistical measures of model selection appear to be sensitive to the assumption of continuity and the frequency of response category endorsement. This implies that there are no strict rules or guidelines for the use of these fit indices, and researchers should be aware of the possible issues of relying on them too heavily. Although these measures can be useful, their performance changes based on the distribution of the indicators and the assumptions of the researcher.

Figure 1: Complex interactions of the aLMR results for estimating the correct number of latent classes. [Figure omitted: line plots of the proportion of correct model selection (y-axis) against sample size (500, 1000, 1500; x-axis), with panels for the response frequency conditions (Unendorsed Category, Infrequently Endorsed, Control) and the three class prevalence conditions.]

Note. The solid line represents LCA while the dashed line represents LPA results. The vertical panels are for each condition of class prevalence and the horizontal panels are for the response frequency conditions.
Limitations and Delimitations
As with any simulation study, these results generalize only to the conditions included in the study. We used the default estimation settings in Mplus; modifying these settings could help with issues that arise in estimation. For example, Bayesian estimation procedures are a promising avenue [3]. The included statistical fit indices are only a subset of the possible choices available to practitioners; therefore, the generalizations of our investigation for model selection are limited to the included measures.
Recommendations for Use of Fit Indices in LCA & LPA
We recommend that when the data collected are five ordered categories, such as Likert-type data, and are being used for identification of homogeneous subpopulations, these data should be treated as categorical and not continuous. Despite the advantage in convergence, the commonly available fit statistics for model selection performed poorly under the assumption of continuity with these data. One exception is the adjusted Lo-Mendell-Rubin likelihood ratio statistic. The aLMR is the only fit measure examined in this study to identify the correct number of latent classes when a response category is infrequently or not endorsed and the indicators are assumed to be continuous. The use of the aLMR could help practitioners when model convergence is an issue and sample size is small. However, under most conditions we recommend that the corrected AIC (AICc), the Bayesian information criterion (BIC), or the sample size-adjusted BIC (ssBIC) be used to aid model selection when the indicators are five ordered categories, along with giving substantial weight to substantive theory and the interpretability of results.
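Tying this recommendation back to the earlier sketches, selection by information criteria amounts to picking the class solution that minimizes each criterion. A short continuation of the hypothetical workflow, reusing the `fits` list and `info_criteria` function defined above:

```r
# Sketch: choosing the class solution with the lowest value of each criterion,
# continuing the hypothetical poLCA workflow from the Methods section.
ic <- sapply(fits, function(f)
  info_criteria(L = f$llik, p = f$npar, n = f$Nobs))
# ic is a 4 x 5 matrix: rows AIC/AICc/BIC/ssBIC, columns 1- to 5-class models.
apply(ic, 1, which.min)   # preferred number of classes per criterion
```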
Author Contributions
R. N. Padgett and R. J. Tipton jointly generated the idea for the study. R. N. Padgett wrote the code for generating the data, estimating the models, extracting the relevant results, and analyzing the model summaries. R. J. Tipton verified the accuracy of those analyses. R. N. Padgett and R. J. Tipton collaboratively wrote the first draft of the manuscript, and both authors critically edited it. Both authors approved the final submitted version of the manuscript.

References

[1] Hirotugu Akaike. "A New Look at the Statistical Model Identification". In: IEEE Transactions on Automatic Control 19.6 (1974), pp. 716-723.
[2] Hirotugu Akaike. "Factor analysis and AIC". In: Psychometrika 52.3 (1987), pp. 317-332.
[3] Tihomir Asparouhov, Ellen L. Hamaker, and Bengt Muthén. "Dynamic Latent Class Analysis". In: Structural Equation Modeling 24.2 (2017), pp. 257-269.
[4] Tihomir Asparouhov and Bengt O. Muthén. "Using Mplus TECH11 and TECH14 to test the number of latent classes". In: Mplus Web Notes 14 (2012), pp. 1-17.
[5] Leo Breiman. "Random forests". In: Machine Learning 45.1 (2001), pp. 5-32.
[6] Kenneth P. Burnham and David R. Anderson. "Multimodel inference: understanding AIC and BIC in model selection". In: Sociological Methods & Research 33.2 (2004), pp. 261-304.
[7] [Author and title lost in extraction.] In: Structural Equation Modeling: A Multidisciplinary Journal.
[8] Susan E. Embretson and Steven P. Reise. Item Response Theory for Psychologists. Psychology Press, 2000, pp. 83-85.
[9] Gregory R. Hancock and Ralph O. Mueller. Structural Equation Modeling: A Second Course. 2nd ed. Quantitative Methods in Education and the Behavioral Sciences. Charlotte: Information Age Publishing, 2013, pp. 625-663.
[10] Michael R. Harwell and Guido G. Gatti. "Rescaling Ordinal Data to Interval Data in Educational Research". In: Review of Educational Research 71.1 (2001), pp. 105-131.
[11] Yungtai Lo, Nancy R. Mendell, and Donald B. Rubin. "Testing the number of components in a normal mixture". In: Biometrika 88.3 (2001), pp. 767-778.
[12] Herbert W. Marsh and David Grayson. "Latent variable models of multitrait-multimethod data". In: Structural Equation Modeling: Concepts, Issues and Applications. Ed. by R. Hoyle. Thousand Oaks, CA: Sage Publications, 1995, pp. 177-198.
[13] Geoffrey McLachlan and David Peel. Finite Mixture Models. New York: John Wiley & Sons, 2000.
[14] Grant B. Morgan. "Mixed Mode Latent Class Analysis: An Examination of Fit Index Performance for Classification". In: Structural Equation Modeling: A Multidisciplinary Journal 22.1 (2015), pp. 76-86.
[15] Grant B. Morgan, Kari J. Hodge, and Aaron R. Baggett. "Latent profile analysis with nonnormal mixtures: A Monte Carlo examination of model selection using fit indices". In: Computational Statistics and Data Analysis 93 (2016), pp. 146-161.
[16] Frederick Mosteller and John W. Tukey. Data Analysis and Regression: A Second Course in Statistics. Reading, MA: Addison-Wesley, 1977.
[17] L. K. Muthén and B. O. Muthén. Mplus User's Guide. 8th ed. Los Angeles, CA: Muthén & Muthén, 2017.
[18] Karen L. Nylund, Tihomir Asparouhov, and Bengt O. Muthén. "Deciding on the Number of Classes in Latent Class Analysis and Growth Mixture Modeling: A Monte Carlo Simulation Study". In: Structural Equation Modeling: A Multidisciplinary Journal 14.4 (2007), pp. 535-569.
[19] Monte Carlo Simulation Results Categorical Latent Class Analysis. Texas Data Repository Dataverse, 2019.
[20] R Core Team. R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing. Vienna, Austria, 2017. URL: https://www.R-project.org/.
[21] Gideon Schwarz. "Estimating the Dimension of a Model". In: The Annals of Statistics 6.2 (1978), pp. 461-464.
[22] G. Soromenho. "Comparing approaches for testing the number of components in a finite mixture model". In: Computational Statistics 9 (1994), pp. 65-78.
[23] Nariaki Sugiura. "Further analysis of the data by Akaike's information criterion and the finite corrections". In: Communications in Statistics - Theory and Methods 7.1 (1978), pp. 13-26.
[24] J. S. Tanaka. "Multifaceted conceptions of fit in structural equation models". In: Testing Structural Equation Models. Ed. by K. A. Bollen and J. S. Long. Newbury Park, CA: Sage, 1993, pp. 10-40.
[25] Quang H. Vuong. "Likelihood ratio tests for model selection and non-nested hypotheses". In: Econometrica 57.2 (1989), pp. 307-333.