Doctoral Dissertation Announcement
Candidate: Xiaofan Cai
Doctor of Philosophy
Department: Educational Leadership, Research and Technology
Title: Missing Data Treatment of a Level-2 Variable in a 3-level Hierarchical Linear Model
Dr. Brooks Applegate, Chair
Dr. Susan Carlson
Dr. Warren Lacefield
Date: Thursday, March 20, 2008 2:00 p.m. – 4:00 p.m.
211 W. Walwood Hall, Emeriti Lounge
Data used in educational research often have a hierarchical structure, such as students nested in classrooms and classrooms nested in schools. Hierarchical linear modeling (HLM) allows applied researchers to incorporate this structure into the analysis and to examine the effects of variables at each level. However, missing data pose an analytical challenge because they can bias estimation. When data are missing on level-2 variables in a 3-level HLM analysis, the choice of missing data treatment may affect parameter estimation at all levels.
This Monte Carlo simulation study was designed to compare how well six missing data treatment (MDT) methods generate unbiased estimates in a 3-level HLM: listwise deletion, mean substitution, restrictive Expectation-Maximization (EM), inclusive EM, restrictive multiple imputation (MI), and inclusive MI. A “random intercept” (or “intercept-only”) 3-level HLM was adopted. Missingness was generated as missing at random (MAR) for a level-2 predictor variable. The six MDTs were applied, and the treated datasets were analyzed using the same HLM. Parameter estimates from the treated datasets were compared to those obtained from the complete datasets. The comparisons focused on the accuracy and precision of the parameter estimates for fixed and random effects in the HLM.
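The design described above can be illustrated with a minimal Python sketch. It is not the simulation code used in the dissertation: the sample sizes, coefficients, variance components, the auxiliary variable `w`, and the rank-based MAR mechanism are all assumptions chosen for illustration, and only the two simplest MDTs (listwise deletion and mean substitution) are shown. Fitting the HLM itself is left out.

```python
import numpy as np

def simulate_three_level(n_schools=30, classes_per_school=5,
                         students_per_class=20, seed=0):
    """Simulate a random-intercept 3-level dataset (all values are assumed)."""
    rng = np.random.default_rng(seed)
    n_classes = n_schools * classes_per_school
    school = np.repeat(np.arange(n_schools), classes_per_school)
    # Fully observed level-2 auxiliary variable w and level-2 predictor x.
    w = rng.normal(size=n_classes)
    x = 0.5 * w + rng.normal(scale=0.866, size=n_classes)
    # Random intercepts for schools (level 3) and classrooms (level 2).
    u_school = rng.normal(scale=0.5, size=n_schools)
    r_class = rng.normal(scale=0.5, size=n_classes)
    class_intercept = 1.0 + 0.4 * x + u_school[school] + r_class
    # Student outcomes (level 1): classroom intercept plus residual.
    classroom = np.repeat(np.arange(n_classes), students_per_class)
    y = class_intercept[classroom] + rng.normal(size=classroom.size)
    return school, classroom, w, x, y

def impose_mar(x, w, rate, rng):
    """Set x to NaN with a probability that depends only on observed w (MAR).

    The probability rises linearly with the rank of w and averages `rate`
    (valid for rate <= 0.5).
    """
    n = len(w)
    ranks = np.argsort(np.argsort(w))  # 0 .. n-1
    prob_missing = np.clip(2.0 * rate * ranks / (n - 1), 0.0, 1.0)
    x_miss = np.asarray(x, dtype=float).copy()
    x_miss[rng.random(n) < prob_missing] = np.nan
    return x_miss

def listwise_delete(x_miss):
    """Return a boolean mask of classrooms kept after listwise deletion."""
    return ~np.isnan(x_miss)

def mean_substitute(x_miss):
    """Replace each missing value with the mean of the observed values."""
    filled = x_miss.copy()
    filled[np.isnan(filled)] = np.nanmean(x_miss)
    return filled

def bias_and_rmse(estimates, reference):
    """Summarize accuracy (mean error) and precision (RMSE) over replications."""
    err = np.asarray(estimates, dtype=float) - reference
    return err.mean(), np.sqrt((err ** 2).mean())
```

In a full study, each replication would call `simulate_three_level`, apply `impose_mar` at a chosen missingness rate, apply an MDT, fit the same 3-level model to the complete and treated data, and pass the collected estimates to `bias_and_rmse`.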
Not unexpectedly, every MDT method produced more bias in the estimates when the proportion of missingness was high, and performance improved as the level-2 sample size increased. Listwise deletion was a viable choice when the level-2 sample size was small, where it generated the most accurate, though less precise, estimates. Under medium and large sample sizes, the restrictive EM method was effective in producing accurate and precise estimates for fixed effect parameters at all levels. The inclusive EM method outperformed all other methods in producing accurate and precise estimates for random effects. The two MI methods did not produce satisfactory estimates for level-2 fixed effects. However, inclusive MI outperformed restrictive MI on level-2 estimates of both fixed and random effects across the study conditions.
This study provides statistical evidence and practical recommendations for researchers who must choose among MDT methods when they encounter missing data in hierarchically structured data.