Model-based estimates of the finite population mean for two-stage cluster samples with unit non-response

Yuan Ying; Roderick J.A. Little

doi:10.1111/j.1467-9876.2007.00566.x

Biostatistics

Research output: Contribution to journal › Article › peer-review

20 Scopus citations

Abstract

We propose new model-based methods for unit non-response in two-stage survey samples. A commonly used design-based adjustment weights respondents by the inverse of the estimated response rate in each cluster (method WT). This approach is consistent if the response probabilities are constant within clusters but is potentially inefficient when the estimated cluster response rates are very variable. Clusters can be collapsed to increase precision, but this may introduce bias. We consider here the model-based approach to survey inference that treats the clusters as random effects. We note that, from a model-based perspective, a missing data mechanism that assumes that the response rate varies across clusters is non-ignorable, and we propose the term cluster-specific non-ignorable (CSNI) non-response to describe this mechanism. We show that the standard random-effects model estimator RE of the population mean is biased under CSNI non-response, and we propose two modifications of RE to correct this bias. One approach includes the observed response rate as a cluster level covariate (method RERR), and the other is based on a probit model for response (method NI1). The RERR approach is simpler than NI1 but approximate, in that uncertainty in estimating the response rates is not taken into account. In addition, a simple method that corrects the bias of RE by reweighting (method RWRE) is also discussed. We show by simulations that estimators from RERR and NI1 can correct the bias of RE under CSNI non-response and have comparable or lower root-mean-squared error than WT in a variety of simulation settings, and RWRE has similar performance to WT. We also consider another non-ignorable response model estimate of the population mean (NI2) that removes the bias of WT, RWRE, RERR and NI1 under an outcome-specific non-ignorable response mechanism where non-response depends directly on the individual level survey outcomes. However, that estimate is not robust to model misspecification. The various methods are compared on a data set from the Detroit Dental Health Project.
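
As a rough illustration of the WT adjustment described in the abstract, the following Python sketch weights each respondent by the inverse of the estimated response rate in its cluster. It is not taken from the paper: the function name `wt_estimate`, the record layout, and the assumptions of a self-weighting (equal-probability) sample with at least one respondent in every sampled cluster are all choices made here for illustration.

```python
from collections import defaultdict


def wt_estimate(records):
    """Response-rate-weighted estimate of the population mean.

    records: iterable of (cluster_id, responded, y) tuples, one per sampled
    unit; y is ignored for non-respondents. Hypothetical layout, assuming a
    self-weighting two-stage sample.
    """
    sampled = defaultdict(int)      # units sampled per cluster
    responded = defaultdict(int)    # respondents per cluster
    total_y = defaultdict(float)    # sum of outcomes over respondents

    for cluster, resp, y in records:
        sampled[cluster] += 1
        if resp:
            responded[cluster] += 1
            total_y[cluster] += y

    numerator = 0.0
    denominator = 0.0
    for cluster, n in sampled.items():
        r = responded[cluster]
        if r == 0:
            # The inverse response rate is undefined; in practice such
            # clusters would have to be collapsed with others.
            raise ValueError(f"cluster {cluster!r} has no respondents")
        weight = n / r  # inverse of the estimated response rate r / n
        numerator += weight * total_y[cluster]
        denominator += weight * r
    return numerator / denominator


if __name__ == "__main__":
    toy = [
        ("A", True, 1.0), ("A", True, 3.0), ("A", False, None), ("A", False, None),
        ("B", True, 10.0), ("B", True, 20.0),
    ]
    print(wt_estimate(toy))
```

In this toy data the unweighted respondent mean is 8.5, while the weighted estimate is 38/6 (about 6.33), because the two respondents in cluster A, where only half of the sampled units responded, each receive weight 2.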

Original language	English (US)
Pages (from-to)	79-97
Number of pages	19
Journal	Journal of the Royal Statistical Society. Series C: Applied Statistics
Volume	56
Issue number	1
DOIs	https://doi.org/10.1111/j.1467-9876.2007.00566.x
State	Published - Jan 2007

Keywords

Cluster sampling
Non-ignorable non-response
Random-effects model
Unit non-response

ASJC Scopus subject areas

Statistics and Probability
Statistics, Probability and Uncertainty

Access to Document

10.1111/j.1467-9876.2007.00566.x

Cite this

@article{62ae5ac414204ed0aa27e6c69868e3cc,

title = "Model-based estimates of the finite population mean for two-stage cluster samples with unit non-response",

abstract = "We propose new model-based methods for unit non-response in two-stage survey samples. A commonly used design-based adjustment weights respondents by the inverse of the estimated response rate in each cluster (method WT). This approach is consistent if the response probabilities are constant within clusters but is potentially inefficient when the estimated cluster response rates are very variable. Clusters can be collapsed to increase precision, but this may introduce bias. We consider here the model-based approach to survey inference that treats the clusters as random effects. We note that, from a model-based perspective, a missing data mechanism that assumes that the response rate varies across clusters is non-ignorable, and we propose the term cluster-specific non-ignorable (CSNI) non-response to describe this mechanism. We show that the standard random-effects model estimator RE of the population mean is biased under CSNI non-response, and we propose two modifications of RE to correct this bias. One approach includes the observed response rate as a cluster level covariate (method RERR), and the other is based on a probit model for response (method NI1). The RERR approach is simpler than NI1 but approximate, in that uncertainty in estimating the response rates is not taken into account. In addition, a simple method that corrects the bias of RE by reweighting (method RWRE) is also discussed. We show by simulations that estimators from RERR and NI1 can correct the bias of RE under CSNI non-response and have comparable or lower root-mean-squared error than WT in a variety of simulation settings, and RWRE has similar performance to WT. We also consider another non-ignorable response model estimate of the population mean (NI2) that removes the bias of WT, RWRE, RERR and NI1 under an outcome-specific non-ignorable response mechanism where non-response depends directly on the individual level survey outcomes. However, that estimate is not robust to model misspecification. The various methods are compared on a data set from the Detroit Dental Health Project.",

keywords = "Cluster sampling, Non-ignorable non-response, Random-effects model, Unit non-response",

author = "Yuan Ying and Little, {Roderick J.A.}",

year = "2007",

month = jan,

doi = "10.1111/j.1467-9876.2007.00566.x",

language = "English (US)",

volume = "56",

pages = "79--97",

journal = "Journal of the Royal Statistical Society. Series C: Applied Statistics",

issn = "0035-9254",

publisher = "Wiley-Blackwell",

number = "1",

}

TY - JOUR

T1 - Model-based estimates of the finite population mean for two-stage cluster samples with unit non-response

AU - Ying, Yuan

AU - Little, Roderick J.A.

PY - 2007/1

Y1 - 2007/1

N2 - We propose new model-based methods for unit non-response in two-stage survey samples. A commonly used design-based adjustment weights respondents by the inverse of the estimated response rate in each cluster (method WT). This approach is consistent if the response probabilities are constant within clusters but is potentially inefficient when the estimated cluster response rates are very variable. Clusters can be collapsed to increase precision, but this may introduce bias. We consider here the model-based approach to survey inference that treats the clusters as random effects. We note that, from a model-based perspective, a missing data mechanism that assumes that the response rate varies across clusters is non-ignorable, and we propose the term cluster-specific non-ignorable (CSNI) non-response to describe this mechanism. We show that the standard random-effects model estimator RE of the population mean is biased under CSNI non-response, and we propose two modifications of RE to correct this bias. One approach includes the observed response rate as a cluster level covariate (method RERR), and the other is based on a probit model for response (method NI1). The RERR approach is simpler than NI1 but approximate, in that uncertainty in estimating the response rates is not taken into account. In addition, a simple method that corrects the bias of RE by reweighting (method RWRE) is also discussed. We show by simulations that estimators from RERR and NI1 can correct the bias of RE under CSNI non-response and have comparable or lower root-mean-squared error than WT in a variety of simulation settings, and RWRE has similar performance to WT. We also consider another non-ignorable response model estimate of the population mean (NI2) that removes the bias of WT, RWRE, RERR and NI1 under an outcome-specific non-ignorable response mechanism where non-response depends directly on the individual level survey outcomes. However, that estimate is not robust to model misspecification. The various methods are compared on a data set from the Detroit Dental Health Project.

KW - Cluster sampling

KW - Non-ignorable non-response

KW - Random-effects model

KW - Unit non-response

UR - http://www.scopus.com/inward/record.url?scp=33846348259&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=33846348259&partnerID=8YFLogxK

U2 - 10.1111/j.1467-9876.2007.00566.x

DO - 10.1111/j.1467-9876.2007.00566.x

M3 - Article

AN - SCOPUS:33846348259

SN - 0035-9254

VL - 56

SP - 79

EP - 97

JO - Journal of the Royal Statistical Society. Series C: Applied Statistics

JF - Journal of the Royal Statistical Society. Series C: Applied Statistics

IS - 1

ER -
