TY - JOUR
T1 - The Big Data Paradox in Clinical Practice
AU - Msaouel, Pavlos
N1 - Funding Information:
Pavlos Msaouel is supported by a Career Development Award from the American Society of Clinical Oncology, a Research Award from KCCure, the MD Anderson Khalifa Scholar Award, the MD Anderson Physician-Scientist Award, philanthropic donations by Mike and Mary Allen, and the Andrew Sabin Family Foundation Fellowship. The author thanks Drs Bora Lim (Associate Professor, Baylor College of Medicine, Houston, TX, USA) and Christopher Logothetis (Professor, The University of Texas MD Anderson Cancer Center, Houston, TX, USA) for helpful conversations, as well as Sarah Townsend (Senior Technical Writer; Department of Genitourinary Medical Oncology, The University of Texas MD Anderson Cancer Center, Houston, TX, USA) for editorial assistance.
Funding Information:
Pavlos Msaouel reports honoraria for scientific advisory board membership for Mirati Therapeutics, Bristol Myers Squibb, and Exelixis; consulting fees from Axiom Healthcare; non-branded educational programs supported by Exelixis and Pfizer; leadership or fiduciary roles as a Medical Steering Committee member for the Kidney Cancer Association and a Kidney Cancer Scientific Advisory Board member for KCCure; and research funding from Takeda, Bristol Myers Squibb, Mirati Therapeutics, and Gateway for Cancer Research.
Publisher Copyright:
© 2022 The Author(s). Published by Informa UK Limited, trading as Taylor & Francis Group.
PY - 2022
Y1 - 2022
N2 - The big data paradox is a real-world phenomenon whereby as the number of patients enrolled in a study increases, the probability that the confidence intervals from that study will include the truth decreases. This occurs in both observational and experimental studies, including randomized clinical trials, and should always be considered when clinicians are interpreting research data. Furthermore, as data quantity continues to increase in today’s era of big data, the paradox is becoming more pernicious. Herein, I consider three mechanisms that underlie this paradox, as well as three potential strategies to mitigate it: (1) improving data quality; (2) anticipating and modeling patient heterogeneity; (3) including the systematic error, not just the variance, in the estimation of error intervals.
AB - The big data paradox is a real-world phenomenon whereby as the number of patients enrolled in a study increases, the probability that the confidence intervals from that study will include the truth decreases. This occurs in both observational and experimental studies, including randomized clinical trials, and should always be considered when clinicians are interpreting research data. Furthermore, as data quantity continues to increase in today’s era of big data, the paradox is becoming more pernicious. Herein, I consider three mechanisms that underlie this paradox, as well as three potential strategies to mitigate it: (1) improving data quality; (2) anticipating and modeling patient heterogeneity; (3) including the systematic error, not just the variance, in the estimation of error intervals.
KW - Bias-variance trade-off
KW - big data
KW - patient relevance
KW - relevance-robustness trade-off
KW - robustness
UR - http://www.scopus.com/inward/record.url?scp=85131699989&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85131699989&partnerID=8YFLogxK
U2 - 10.1080/07357907.2022.2084621
DO - 10.1080/07357907.2022.2084621
M3 - Article
C2 - 35671042
AN - SCOPUS:85131699989
SN - 0735-7907
VL - 40
SP - 567
EP - 576
JO - Cancer Investigation
JF - Cancer Investigation
IS - 7
ER -