TY - JOUR
T1 - IPDfromKM
T2 - reconstruct individual patient data from published Kaplan-Meier survival curves
AU - Liu, Na
AU - Zhou, Yanhong
AU - Lee, J. Jack
N1 - Funding Information:
JJL’s research was supported in part by the grants CA016672 and CA221703 from the National Cancer Institute, RP150519 and RP160668 from the Cancer Prevention and Research Institute of Texas, and The University of Texas MD Anderson Cancer Center-Oropharynx Cancer Program, generously supported by Mr. and Mrs. Charles W. Stiefel.
Publisher Copyright:
© 2021, The Author(s).
PY - 2021/12
Y1 - 2021/12
AB - Background: When applying secondary analysis on published survival data, it is critical to obtain each patient’s raw data, because the individual patient data (IPD) approach has been considered as the gold standard of data analysis. However, researchers often lack access to IPD. We aim to propose a straightforward and robust approach to obtain IPD from published survival curves with a user-friendly software platform. Results: Improving upon existing methods, we propose an easy-to-use, two-stage approach to reconstruct IPD from published Kaplan-Meier (K-M) curves. Stage 1 extracts raw data coordinates and Stage 2 reconstructs IPD using the proposed method. To facilitate the use of the proposed method, we developed the R package IPDfromKM and an accompanying web-based Shiny application. Both the R package and Shiny application have an “all-in-one” feature such that users can use them to extract raw data coordinates from published K-M curves, reconstruct IPD from the extracted data coordinates, visualize the reconstructed IPD, assess the accuracy of the reconstruction, and perform secondary analysis on the basis of the reconstructed IPD. We illustrate the use of the R package and the Shiny application with K-M curves from published studies. Extensive simulations and real-world data applications demonstrate that the proposed method has high accuracy and great reliability in estimating the number of events, number of patients at risk, survival probabilities, median survival times, and hazard ratios. Conclusions: IPDfromKM has great flexibility and accuracy to reconstruct IPD from published K-M curves with different shapes. We believe that the R package and the Shiny application will greatly facilitate the potential use of quality IPD and advance the use of secondary data to facilitate informed decision making in medical research.
KW - Individual patient data (IPD)
KW - Kaplan-Meier curve
KW - Meta-analysis
KW - R package
KW - Shiny application
KW - Survival analysis
UR - http://www.scopus.com/inward/record.url?scp=85107431877&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85107431877&partnerID=8YFLogxK
U2 - 10.1186/s12874-021-01308-8
DO - 10.1186/s12874-021-01308-8
M3 - Article
C2 - 34074267
AN - SCOPUS:85107431877
SN - 1471-2288
VL - 21
JO - BMC Medical Research Methodology
JF - BMC Medical Research Methodology
IS - 1
M1 - 111
ER -