The advantage of imputation of missing income data to evaluate the association between income and self-reported health status (SRH) in a Mexican American cohort study

Anthony B. Ryder, Anna V. Wilkinson, Michelle K. McHugh, Katherine Saunders, Sumesh Kachroo, Anthony D'Amelio, Melissa Bondy, Carol J. Etzel

Research output: Contribution to journal › Review article › peer-review

15 Scopus citations

Abstract

Missing data often occur in cross-sectional surveys and in longitudinal and experimental studies. The purpose of this study was to compare the prediction of self-rated health (SRH), a robust predictor of morbidity and mortality among diverse populations, before and after imputation of the missing variable "yearly household income." We reviewed data from 4,162 participants of Mexican origin who were recruited from July 1, 2002, through December 31, 2005, and enrolled in a population-based cohort study. Missing yearly income data were imputed using three different single imputation methods and one multiple imputation method under a Bayesian approach. Of the 4,162 participants, 3,121 were randomly assigned to a training set (to derive the yearly income imputation methods and develop the health-outcome prediction models) and 1,041 to a testing set (to compare the areas under the receiver-operating characteristic curve (AUC) of the resulting health-outcome prediction models). The discriminatory power of the SRH prediction models was good (range, 69-72%). Compared with the prediction model obtained without imputation of missing yearly income, all imputation methods improved the prediction of SRH (P < 0.05 for all comparisons), and the AUC for the model after multiple imputation was the highest (AUC = 0.731). Furthermore, when yearly income was imputed using multiple imputation, the odds of reporting SRH as good or better increased by 11% for each $5,000 increment in yearly income. This study showed that although imputation of missing data for a key predictor variable can improve a health-outcome prediction model, further work is needed to illuminate the risk factors associated with SRH.
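
The quantities named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code. The function names (`mean_substitute`, `auc`, `odds_multiplier`) and all toy values are hypothetical. The sketch assumes that mean substitution (listed among the keywords) is one of the single imputation methods, and that income enters the prediction model linearly on the log-odds scale, so that the reported 11% increase in odds per $5,000 compounds multiplicatively; the abstract does not state the model form.

```python
from statistics import mean


def mean_substitute(values):
    """Single imputation: replace missing entries (None) with the observed mean."""
    observed = [v for v in values if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in values]


def auc(scores, labels):
    """Area under the ROC curve, computed as the probability that a randomly
    chosen positive case scores higher than a randomly chosen negative case
    (ties count as one half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def odds_multiplier(income_difference, odds_ratio=1.11, increment=5000):
    """Factor by which the odds change for a given income difference,
    ASSUMING a constant odds ratio per increment (log-linear in income)."""
    return odds_ratio ** (income_difference / increment)


# Toy values invented purely for illustration; they are not study data.
toy_incomes = [10000, None, 30000, 20000, None]
toy_scores = [0.9, 0.8, 0.4, 0.3]   # hypothetical predicted probabilities
toy_labels = [1, 0, 1, 0]           # 1 = SRH good or better (hypothetical)

imputed_incomes = mean_substitute(toy_incomes)
toy_auc = auc(toy_scores, toy_labels)
factor_for_10k = odds_multiplier(10000)  # two $5,000 increments: 1.11 ** 2
```

Under the stated assumption, a $10,000 difference in yearly income corresponds to the odds being multiplied by 1.11 squared, roughly 1.23, rather than by 1.22; the effect compounds instead of adding.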

Original language: English (US)
Pages (from-to): 1099-1109
Number of pages: 11
Journal: Journal of Immigrant and Minority Health
Volume: 13
Issue number: 6
State: Published - Dec 2011

Keywords

  • Data imputation techniques
  • Mean substitution
  • Minority health
  • Missing income data
  • Multiple imputation
  • Self-rated health

ASJC Scopus subject areas

  • Epidemiology
  • Public Health, Environmental and Occupational Health
