Source: UNIVERSITY OF CALIFORNIA, DAVIS submitted to NRP
BEST PRACTICES FOR GENERAL GROWTH MIXTURE MODELING
Sponsoring Institution
National Institute of Food and Agriculture
Project Status
COMPLETE
Funding Source
Reporting Frequency
Annual
Accession No.
0211526
Grant No.
(N/A)
Cumulative Award Amt.
(N/A)
Proposal No.
(N/A)
Multistate No.
(N/A)
Project Start Date
Oct 1, 2006
Project End Date
Dec 31, 2009
Grant Year
(N/A)
Program Code
[(N/A)]- (N/A)
Recipient Organization
UNIVERSITY OF CALIFORNIA, DAVIS
410 MRAK HALL
DAVIS,CA 95616-8671
Performing Department
HUMAN AND COMMUNITY DEVELOPMENT
Non Technical Summary
Despite the great promise of growth mixture models for testing taxonomic theories and evaluating differential intervention effects on longitudinal outcomes, the popularity of these models with applied researchers has outstripped the development of statistical "best practices" for building and testing these growth mixture models on real data sets. The purpose of this project is to begin to resolve the need for clear recommendations about growth mixture model specification and estimation so that these models can be applied in a consistent way to effectively evaluate important developmental research questions.
Animal Health Component
(N/A)
Research Effort Categories
Basic
25%
Applied
(N/A)
Developmental
75%
Classification

Knowledge Area (KA) | Subject of Investigation (SOI) | Field of Science (FOS) | Percent
901 | 7310 | 2090 | 100%
Goals / Objectives
1. Clearly characterize the regularities (or lack thereof) in past applications of growth mixture modeling with regard to model specification, class-enumeration, and inclusion of covariates and distal outcomes. Highlight the shortcomings and limitations in existing practices related to inconsistencies in the application of growth mixture modeling.
2. Develop recommendations for model specification in the exploratory class-enumeration step of the growth mixture modeling process.
3. Develop recommendations for the combined use of existing fit indices and model tests for model comparisons in the exploratory class-enumeration step of the growth mixture modeling process.
4. Develop recommendations for the addition of covariates to the growth mixture model with emphasis on the distinction between direct and indirect effects.
5. Develop recommendations for the addition of distal outcomes, i.e., consequences, to the growth mixture model.
Project Methods
Year 1 (in progress)
1) Conduct a content analysis of published applications of growth mixture modeling in APA journals from 2000 to 2006. Summarize the following information: i) nature of longitudinal outcome modeled, e.g., alcohol use, aggressive behavior, or criminal arrests; ii) software used for analysis; iii) mean and variance structure of growth model used for class-enumeration; iv) fit indices used for model selection in class-enumeration process; v) model specification used with inclusion of covariates (if applicable); vi) model specification used for inclusion of distal outcome (if applicable); and vii) interpretation of model results, particularly in terms of the latent growth classes.
Years 2-3
1) Conduct a simulation study to evaluate the impact of various specifications and misspecifications of the mean and variance structure in a cross-sectional latent class cluster analysis on exploratory class-enumeration. Simultaneously evaluate the sensitivity of existing fit indices and tests to model misspecification when utilizing the indices for exploratory class-enumeration.
2) Conduct a similar simulation study for growth mixture models.
Year 4
1) Conduct a simulation study to evaluate the impact of including covariates as predictors of latent class membership on the process of class-enumeration in a cross-sectional latent class cluster analysis. Compare the differences in model results when allowing covariates to influence observed outcomes directly and when allowing those direct effects to vary across latent classes.
2) Conduct a similar simulation study for growth mixture models.
Year 5
1) Conduct a simulation study to compare the differences in model results using the following specifications for a distal outcome of a cross-sectional latent class cluster analysis: i) distal outcome as a latent class indicator; ii) distal outcome regressed on the latent class variable; iii) distal outcome regressed on the latent class variable with the latent class measurement model parameters fixed to the estimated values from the model without the distal outcome; and iv) distal outcome regressed on predicted latent class membership.
2) Conduct a simulation study to compare the differences in model results using the following specifications for a distal outcome of a growth mixture model: i) distal outcome as a latent class indicator; ii) distal outcome regressed on the latent class variable; iii) distal outcome regressed on the latent class variable with the latent class measurement model parameters fixed to the estimated values from the model without the distal outcome; and iv) distal outcome regressed on predicted latent class membership. For each scenario, also compare the differences in model results when allowing the growth factors to influence the distal outcome directly.

Progress 10/01/06 to 12/31/09

Outputs
OUTPUTS: PI has transferred to new university; not available to report. PARTICIPANTS: Nothing significant to report during this reporting period. TARGET AUDIENCES: Nothing significant to report during this reporting period. PROJECT MODIFICATIONS: Nothing significant to report during this reporting period.

Impacts
PI has transferred to new university; not available to report.

Publications

  • No publications reported this period


Progress 01/01/08 to 12/31/08

Outputs
OUTPUTS: Activities: During this reporting period, I taught a 4-day summer institute course at Johns Hopkins University, June 2008, for postdoctoral fellows and applied researchers in Public Health on growth and growth mixture modeling for categorical longitudinal outcomes. I also presented a 1-day pre-conference workshop on mixture modeling applied to event histories for the annual meeting of the Society for Prevention Research in May 2008. Events: During this reporting period, I organized and chaired a paper symposium, "Indicators, covariates, and distal outcomes in mixture modeling," for the annual meeting of the Society for Prevention Research in May 2008. Services: I provided consulting services on four NIH R01 grants applying growth mixture modeling to studies of: the etiology and course of depressive symptoms in African American adolescents; the co-morbidity of conduct problems and depressive symptoms in grade school; women's adult drinking patterns from age 21-80 years; and the longitudinal associations between high-risk parenting and self-regulation in pre-Kindergarten children. Dissemination: Dissemination of the results described in the outcome section was accomplished through training workshops, conference presentations, publications, and consultation/collaboration with applied researchers utilizing these methods to address important substantive empirical questions. All of the above outputs are absolutely critical to the dissemination goals of this project. PARTICIPANTS: Individuals: As the principal investigator, I took the lead on all outputs for this project and hold primary responsibility for the resultant outcomes. There is no other individual who worked more than one person month per year or received salary, wages, stipend, etc. Collaborators: As the PI for this project, I made contact with and involved others outside my UC Davis organization in the project. 
I enlisted a close colleague, Karen Nylund, at UC Santa Barbara, to assist with the vast simulation study. My colleague, Hanno Petras, at University of Maryland, College Park, served as a co-instructor on training courses and as a co-first-author on a book chapter regarding growth mixture modeling. Training and professional development: Training workshops are offered as part of the project outputs (see corresponding section above). In addition, select graduate students have had the opportunity to participate in various parts of the project, e.g., literature reviews, simulations, etc. during this reporting period. TARGET AUDIENCES: Nothing significant to report during this reporting period. PROJECT MODIFICATIONS: Nothing significant to report during this reporting period.

Impacts
Changes in knowledge: For some time, methodologists have advocated the inclusion of covariates in the mixture modeling process in order to inform the formation and identification of the latent classes (mixtures), to facilitate the interpretation of the resultant classes, and to minimize bias that could result from excluding potential confounders. However, despite the clear utility of their inclusion, little research has focused on when covariates should be included during the mixture modeling process and how their effects should be specified. Issues surrounding when and how to include covariates in mixture models become especially relevant in education research when one of the covariates of interest is related to an intervention or a risk factor targeted by a preventive intervention. During this reporting period, I began an extensive Monte Carlo simulation study in collaboration with another UC faculty member. We began by focusing on cross-sectional latent class analysis (LCA) models and simulated data under different modeling conditions, sample sizes, and class distributions. We were interested in studying the impact of covariate exclusion and misspecification of covariate effects on class enumeration (i.e., determining the number of classes) and, where applicable, on the estimation of covariate effects on class membership. Since a model may be specified that allows a covariate to be directly related to the observed items or to be indirectly related to the items via the latent class variable, we considered population and analysis models with both direct and indirect paths. We focused on the fit indices commonly used for deciding on the number of classes in mixture modeling, specifically information criteria (BIC, ABIC, and AIC) and two likelihood-based tests that evaluate the improvement in fit from adding one more latent class (the Lo-Mendell-Rubin test and the bootstrapped likelihood ratio test). 
Preliminary results suggest that a misspecified covariate effect most commonly leads to the overextraction of classes. Further, when the exact relationships between the covariates and the latent class (mixture) indicators are unknown, results indicate that deciding on the number of classes without covariates (i.e., using the unconditional model) will most often lead to the most accurate class enumeration, contrary to conventional wisdom. This result has important implications for the application of all mixture models, including growth mixture models. We are currently developing a method for the systematic inclusion of covariates following the class enumeration process that will allow frequent evaluation of potential misspecifications of covariate effects.
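As a rough illustration of how the information criteria named above enter class enumeration, the sketch below computes AIC, BIC, and ABIC from a fitted model's log-likelihood and picks the class count with the lowest value. It assumes ABIC refers to the sample-size-adjusted BIC, which replaces ln(n) with ln((n + 2) / 24). The function names and the log-likelihood values are hypothetical and are not results from this project.

```python
import math


def information_criteria(loglik, n_params, n_obs):
    """Return AIC, BIC, and sample-size-adjusted BIC (assumed meaning of ABIC)."""
    deviance = -2.0 * loglik
    return {
        "AIC": deviance + 2.0 * n_params,
        "BIC": deviance + n_params * math.log(n_obs),
        "ABIC": deviance + n_params * math.log((n_obs + 2.0) / 24.0),
    }


def pick_classes(fits, n_obs, criterion="BIC"):
    """fits maps number of classes -> (log-likelihood, number of free parameters).

    Returns the class count with the smallest criterion value and all scores.
    """
    scores = {
        k: information_criteria(ll, p, n_obs)[criterion]
        for k, (ll, p) in fits.items()
    }
    return min(scores, key=scores.get), scores


# Hypothetical fits for 1-, 2-, and 3-class models on 500 cases (made-up numbers).
fits = {1: (-5200.0, 5), 2: (-5050.0, 11), 3: (-5040.0, 17)}
best_k, scores = pick_classes(fits, n_obs=500)
```

With these made-up numbers, BIC favors two classes while AIC favors three, which is the kind of disagreement among indices that a simulation study can quantify. The Lo-Mendell-Rubin and bootstrapped likelihood ratio tests require refitting models and are not sketched here.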

Publications

  • Feldman, B., Masyn, K., and Conger, R. (Accepted in 2008). New approaches to studying problem behaviors: A comparison of methods for modeling longitudinal, categorical data. Developmental Psychology.
  • Petras, H. and Masyn, K. (2009). General growth mixture analysis with antecedents and consequences of change. Forthcoming in Piquero, A. and Weisburd, D. (Eds.) Handbook of Quantitative Criminology. Order of authors determined by randomization.
  • Witkiewitz, K. and Masyn, K.E. (2008). Drinking trajectories following an initial lapse. Psychology of Addictive Behaviors, 22(2), 157-167.


Progress 01/01/07 to 12/31/07

Outputs
Activities: 1) Conducted a simulation study to evaluate the impact of including covariates as predictors of latent class membership on the process of class-enumeration in a cross-sectional latent class cluster analysis. Compared the differences in model results when allowing covariates to influence observed outcomes directly and when allowing those direct effects to vary across latent classes. 2) Conducted a simulation study to evaluate the impact of including covariates as predictors of latent class membership on the process of class-enumeration in growth mixture modeling. Compared the differences in model results when allowing covariates to influence the growth factors directly and when allowing those direct effects to vary across latent classes. 3) Compilation and illustration of findings from (1) and (2) in manuscript form is currently in progress. Workshops: Masyn, K. and Petras, H. (July 2007). Longitudinal Analysis with Latent Variables. 4-day short course presented as part of the Summer Institute in Mental Health, Department of Mental Health, Bloomberg School of Public Health, Johns Hopkins University, Baltimore, MD. Conferences: Masyn, K. - Chair (August 2007). Analysis of categorical and latent observed variables. Organized paper symposium for the meeting of the American Psychological Association, San Francisco, CA. Masyn, K. - Chair (May 2007). Prediction to and from repeated measures. Organized paper symposium for the meeting of the Society for Prevention Research, Washington, DC. Dissemination: During this period, collaborative partnerships were established with researchers at the National Center on Addiction and Substance Abuse (CASA) in New York, the Social Development Research Group in Seattle, and the Prevention Research Center in Baltimore, who are interested in utilizing growth mixture modeling methodology in their applied work. 
These partnerships will provide active routes of dissemination for the methodology developed in subsequent years of this project.

Impacts
Change in knowledge: Covariates like gender and ethnicity, along with risk and protective factors, play an important role in predicting and interpreting differences among classes in mixture models. For many years now, researchers have advocated the use of covariates in mixture modeling because they help in the formation of the classes, help to describe and explain heterogeneity in the observed data, and help to minimize modeling error that occurs by leaving out important relationships in the model. However, despite their known usefulness for mixture models, little is known about when covariates should be included during the modeling process and how their effects should be specified. Issues surrounding when and how to include covariates in mixture models become especially relevant in prevention research when one of the covariates of interest is related to an intervention or a risk factor targeted by a preventive intervention. During this period, simulation work has begun to yield answers to these important and practical concerns regarding the use of covariates in growth mixture modeling. Using Monte Carlo simulation techniques and beginning with a simple cross-sectional LCA model, we simulated data under different modeling conditions to study the impact of leaving out covariates or misspecifying covariate relationships. We explored the impact of covariates on class enumeration (i.e., determining the number of classes) as well as the difference between allowing covariates to influence the observed items only indirectly via the latent class variable and allowing the covariate to also have a direct effect on the items. Preliminary results suggest that covariates should be excluded from the initial class enumeration process. Extensions are currently being made to the growth mixture model (GMM) setting. 
In addition to the impact on the class enumeration process in GMM, we are investigating the consequences for the formation and interpretation of the latent classes of allowing the covariates to influence the within-class growth factors and of allowing the covariate effects on the growth factors to vary across latent classes.
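The distinction between indirect and direct covariate effects in the cross-sectional LCA setting can be made concrete with a small data-generating sketch. The code below is a minimal illustration under stated assumptions, not the simulation design used in this project: it assumes two latent classes, four binary items, and one binary covariate, and every parameter value and name (simulate_lca, gamma1, direct) is hypothetical.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def simulate_lca(n, gamma0=0.0, gamma1=1.0, direct=0.0, seed=0):
    """Simulate a two-class LCA with four binary items and a binary covariate x.

    Indirect path: x shifts the log-odds of membership in class 1 by gamma1.
    Direct path: x shifts the log-odds of endorsing item 0 by `direct`,
    over and above class membership. All values are hypothetical.
    """
    rng = random.Random(seed)
    # Class-specific endorsement log-odds for the four items (assumed values).
    item_logits = {0: [-1.5, -1.5, -1.5, -1.5], 1: [1.5, 1.5, 1.5, 1.5]}
    rows = []
    for _ in range(n):
        x = 1 if rng.random() < 0.5 else 0
        c = 1 if rng.random() < sigmoid(gamma0 + gamma1 * x) else 0
        items = []
        for j, logit in enumerate(item_logits[c]):
            if j == 0:
                logit += direct * x  # direct covariate effect on item 0 only
            items.append(1 if rng.random() < sigmoid(logit) else 0)
        rows.append((x, c, items))
    return rows


# Same seed, with and without a direct effect of x on item 0.
indirect_only = simulate_lca(2000, direct=0.0, seed=1)
with_direct = simulate_lca(2000, direct=2.0, seed=1)
```

Fitting an analysis model that omits the direct path to data such as with_direct is one way to set up the kind of misspecification described above; the model fitting itself would be done in mixture modeling software and is outside this sketch.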

Publications

  • Abstracts: Nylund, K. and Masyn, K. (June 2007). Covariates and growth mixture modeling: Early simulation results into the mystery of when and how to include covariates. Conference proceedings (abstract), Society for Prevention Research, Washington, DC.