TY - JOUR
T1 - The advantage of imputation of missing income data to evaluate the association between income and self-reported health status (SRH) in a Mexican American cohort study
AU - Ryder, Anthony B.
AU - Wilkinson, Anna V.
AU - McHugh, Michelle K.
AU - Saunders, Katherine
AU - Kachroo, Sumesh
AU - D'Amelio, Anthony
AU - Bondy, Melissa
AU - Etzel, Carol J.
PY - 2011/12
Y1 - 2011/12
AB - Missing data often occur in cross-sectional surveys and in longitudinal and experimental studies. The purpose of this study was to compare the prediction of self-rated health (SRH), a robust predictor of morbidity and mortality among diverse populations, before and after imputation of the missing variable "yearly household income." We reviewed data from 4,162 participants of Mexican origin who were recruited from July 1, 2002, through December 31, 2005, and enrolled in a population-based cohort study. Missing yearly income data were imputed using three different single imputation methods and one multiple imputation method under a Bayesian approach. Of the 4,162 participants, 3,121 were randomly assigned to a training set (to derive the yearly income imputation methods and develop the health-outcome prediction models) and 1,041 to a testing set (to compare the areas under the curve (AUC) of the receiver-operating characteristic of the resulting health-outcome prediction models). The discriminatory power of the SRH prediction models was good (range, 69-72%). Compared with the prediction model obtained without imputation of missing yearly income, all imputation methods improved the prediction of SRH (P < 0.05 for all comparisons), and the model after multiple imputation had the highest AUC (AUC = 0.731). Furthermore, when yearly income was imputed using multiple imputation, the odds of reporting SRH as good or better increased by 11% for each $5,000 increment in yearly income. This study showed that although imputation of missing data for a key predictor variable can improve a health-outcome risk prediction model, further work is needed to illuminate the risk factors associated with SRH.
KW - Data imputation techniques
KW - Mean substitution
KW - Minority health
KW - Missing income data
KW - Multiple imputation
KW - Self-rated health
UR - http://www.scopus.com/inward/record.url?scp=80755143369&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=80755143369&partnerID=8YFLogxK
U2 - 10.1007/s10903-010-9415-8
DO - 10.1007/s10903-010-9415-8
M3 - Review article
C2 - 21103931
AN - SCOPUS:80755143369
SN - 1557-1912
VL - 13
SP - 1099
EP - 1109
JO - Journal of Immigrant and Minority Health
JF - Journal of Immigrant and Minority Health
IS - 6
ER -