TY - JOUR
T1 - Bayesian Dose-Finding in Two Treatment Cycles Based on the Joint Utility of Efficacy and Toxicity
AU - Lee, Juhee
AU - Thall, Peter F.
AU - Ji, Yuan
AU - Müller, Peter
N1 - Publisher Copyright:
© 2015 American Statistical Association.
PY - 2015/4/3
Y1 - 2015/4/3
N2 - This article proposes a phase I/II clinical trial design for adaptively and dynamically optimizing each patient’s dose in each of two cycles of therapy based on the joint binary efficacy and toxicity outcomes in each cycle. A dose-outcome model is assumed that includes a Bayesian hierarchical latent variable structure to induce association among the outcomes and also facilitate posterior computation. Doses are chosen in each cycle based on posteriors of a model-based objective function, similar to a reinforcement learning or Q-learning function, defined in terms of numerical utilities of the joint outcomes in each cycle. For each patient, the procedure outputs a sequence of two actions, one for each cycle, with each action being the decision to either treat the patient at a chosen dose or not to treat. The cycle 2 action depends on the individual patient’s cycle 1 dose and outcomes. In addition, decisions are based on posterior inference using other patients’ data, and therefore, the proposed method is adaptive both within and between patients. A simulation study of the method is presented, including comparison to two-cycle extensions of the conventional 3 + 3 algorithm, continual reassessment method, and a Bayesian model-based design, and evaluation of robustness. Supplementary materials for this article are available online.
KW - Adaptive design
KW - Bayesian design
KW - Dynamic treatment regime
KW - Latent probit model
KW - Phase I-II clinical trial
KW - Q-learning
UR - http://www.scopus.com/inward/record.url?scp=84936744053&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84936744053&partnerID=8YFLogxK
U2 - 10.1080/01621459.2014.926815
DO - 10.1080/01621459.2014.926815
M3 - Article
C2 - 26366026
AN - SCOPUS:84936744053
SN - 0162-1459
VL - 110
SP - 711
EP - 722
JO - Journal of the American Statistical Association
JF - Journal of the American Statistical Association
IS - 510
ER -