Leveraging a surrogate outcome to improve inference on a partially missing target outcome

Zachary R. McCaw, Sheila M. Gaynor, Ryan Sun, Xihong Lin

Research output: Contribution to journal › Article › peer-review

Abstract

Sample sizes vary substantially across tissues in the Genotype-Tissue Expression (GTEx) project, where considerably fewer samples are available from certain inaccessible tissues, such as the substantia nigra (SSN), than from accessible tissues, such as blood. This severely limits power for identifying tissue-specific expression quantitative trait loci (eQTL) in undersampled tissues. Here we propose Surrogate Phenotype Regression Analysis (Spray) for leveraging information from a correlated surrogate outcome (eg, expression in blood) to improve inference on a partially missing target outcome (eg, expression in SSN). Rather than regarding the surrogate outcome as a proxy for the target outcome, Spray jointly models the target and surrogate outcomes within a bivariate regression framework. Unobserved values of either outcome are treated as missing data. We describe and implement an expectation conditional maximization algorithm for performing estimation in the presence of bilateral outcome missingness. Spray estimates the same association parameter estimated by standard eQTL mapping and controls the type I error even when the target and surrogate outcomes are truly uncorrelated. We demonstrate analytically and empirically, using simulations and GTEx data, that in comparison with marginally modeling the target outcome, jointly modeling the target and surrogate outcomes increases estimation precision and improves power.
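
The bivariate regression framework described in the abstract can be written out concretely. The following is a sketch under assumed notation: the symbols $t_i$ (target), $s_i$ (surrogate), $g_i$ (genotype), $x_i$ (covariates), $\beta$, $\alpha$, and $\Sigma$ are illustrative and do not appear in the abstract, and bivariate normal residuals are taken as a working assumption.

```latex
% Assumed notation (not from the abstract): t_i target outcome, s_i surrogate outcome,
% g_i genotype, x_i covariate vector, Sigma the 2x2 residual covariance matrix.
\begin{align*}
\begin{pmatrix} t_i \\ s_i \end{pmatrix} \,\Big|\, g_i, x_i
  &\sim N\!\left(
    \begin{pmatrix}
      g_i \beta_G + x_i^{\top} \beta_X \\
      g_i \alpha_G + x_i^{\top} \alpha_X
    \end{pmatrix},\;
    \Sigma
  \right),
  \qquad
  \Sigma =
    \begin{pmatrix}
      \Sigma_{TT} & \Sigma_{TS} \\
      \Sigma_{TS} & \Sigma_{SS}
    \end{pmatrix}, \\[4pt]
% Marginalizing over s_i leaves the usual single-outcome regression for the target.
t_i \mid g_i, x_i
  &\sim N\!\left( g_i \beta_G + x_i^{\top} \beta_X,\; \Sigma_{TT} \right).
\end{align*}
```

Under this sketch the marginal model for the target is the ordinary single-outcome regression, so $\beta_G$ keeps the interpretation it has in standard eQTL mapping, and the hypothesis of interest is $H_0\colon \beta_G = 0$. The case of truly uncorrelated outcomes corresponds to $\Sigma_{TS} = 0$.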

Original language: English (US)
Pages (from-to): 1472-1484
Number of pages: 13
Journal: Biometrics
Volume: 79
Issue number: 2
State: Published - Jun 2023

Keywords

  • EM algorithm
  • genetic association analysis
  • missing data
  • multivariate analysis
  • surrogate outcomes

ASJC Scopus subject areas

  • Statistics and Probability
  • General Biochemistry, Genetics and Molecular Biology
  • General Immunology and Microbiology
  • General Agricultural and Biological Sciences
  • Applied Mathematics

MD Anderson CCSG core facilities

  • Biostatistics Resource Group
