A simulation study for comparing testing statistics in response-adaptive randomization

Xuemin Gu; J. Jack Lee

doi:10.1186/1471-2288-10-48

Research output: Contribution to journal › Article › peer-review

11 Scopus citations

Abstract

Background. Response-adaptive randomization can assign more patients in a comparative clinical trial to the tentatively better treatment. However, because patient allocation adapts to earlier responses, the samples to be compared are no longer independent. At large sample sizes, many asymptotic properties of test statistics derived for independent-sample comparisons still apply under adaptive randomization, provided that the patient allocation ratio converges asymptotically to an appropriate target. The small-sample properties of commonly used test statistics in response-adaptive randomization, however, have not been fully studied.

Methods. Simulations are systematically conducted to characterize the statistical properties of eight test statistics in six response-adaptive randomization methods at six allocation targets, with sample sizes ranging from 20 to 200. Since adaptive randomization is usually not recommended for sample sizes below 30, the present paper focuses on the case with a sample size of 30 to give general recommendations on test statistics for contingency tables in response-adaptive randomization at small sample sizes.

Results. Among all asymptotic test statistics, Cook's correction to the chi-square test (T_MC) is the best at attaining the nominal size of the hypothesis test. Williams' correction to the log-likelihood-ratio test (T_ML) gives a slightly inflated type I error and higher power than T_MC, but it is more robust against imbalance in patient allocation. T_MC and T_ML are usually the two test statistics with the highest power across the simulation scenarios. When focusing on T_MC and T_ML, the generalized drop-the-loser urn (GDL) and the sequential estimation-adjusted urn (SEU), respectively, are best at attaining the correct size of the hypothesis test. Among all sequential methods that can target different allocation ratios, GDL has the lowest variation and the highest overall power at all allocation ratios.

The performance of different adaptive randomization methods and test statistics also depends on the allocation target. At the limiting allocation ratio of the drop-the-loser (DL) and randomized play-the-winner (RPW) urns, DL outperforms all other methods, including GDL. When comparing the power of test statistics in the same randomization method but at different allocation targets, the powers of the log-likelihood-ratio, log-relative-risk, log-odds-ratio, Wald-type Z, and chi-square test statistics are maximized at their corresponding optimal allocation ratios for power. Except for the optimal allocation target for log-relative-risk, the other four optimal targets could assign more patients to the worse arm in some simulation scenarios. Another optimal allocation target, R_RSIHR, proposed by Rosenberger and Sriram (Journal of Statistical Planning and Inference, 1997), aims to minimize the number of failures at fixed power using the Wald-type Z test statistic. Among allocation ratios that always assign more patients to the better treatment, R_RSIHR usually has less variation in patient allocation, and the values of variation are consistent across all simulation scenarios. Additionally, the patient allocation at R_RSIHR is not too extreme. Therefore, R_RSIHR provides a good balance between assigning more patients to the better treatment and maintaining the overall power.

Conclusion. Cook's correction to the chi-square test and Williams' correction to the log-likelihood-ratio test are generally recommended for hypothesis testing in response-adaptive randomization, especially when sample sizes are small. The generalized drop-the-loser urn design is the recommended method for its good overall properties. Also recommended is the use of the R_RSIHR allocation target.
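
The kind of simulation described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code, and it makes several assumptions that are not stated in this record: the RPW urn starts with one ball per arm and adds one ball per response (one common parameterization); the statistic is the uncorrected Pearson chi-square for a 2x2 table, not Cook's or Williams' corrections; and R_RSIHR is taken to be sqrt(p_A) / (sqrt(p_A) + sqrt(p_B)), the form usually given in the literature. The function names `rsihr_target`, `rpw_trial`, `pearson_chi2`, and `rejection_rate` are hypothetical.

```python
import math
import random


def rsihr_target(p_a, p_b):
    """Assumed R_RSIHR target: proportion of patients allocated to arm A."""
    return math.sqrt(p_a) / (math.sqrt(p_a) + math.sqrt(p_b))


def rpw_trial(p_a, p_b, n, rng):
    """Simulate one randomized play-the-winner trial with n patients.

    Assumed urn rule: start with one ball per arm; a success adds a ball
    for the same arm, a failure adds a ball for the opposite arm.
    Returns table[arm] = [successes, failures].
    """
    balls = [1, 1]
    probs = (p_a, p_b)
    table = [[0, 0], [0, 0]]
    for _ in range(n):
        # Draw an arm with probability proportional to its ball count.
        arm = 0 if rng.random() < balls[0] / (balls[0] + balls[1]) else 1
        success = rng.random() < probs[arm]
        table[arm][0 if success else 1] += 1
        balls[arm if success else 1 - arm] += 1
    return table


def pearson_chi2(table):
    """Uncorrected Pearson chi-square statistic for a 2x2 table."""
    (a, b), (c, d) = table
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0  # degenerate table (an empty margin): never reject
    return n * (a * d - b * c) ** 2 / denom


def rejection_rate(p_a, p_b, n=30, reps=5000, seed=1):
    """Fraction of simulated trials rejecting H0: p_a == p_b at the 5% level."""
    rng = random.Random(seed)
    crit = 3.841458820694124  # 0.95 quantile of chi-square with 1 df
    hits = sum(
        pearson_chi2(rpw_trial(p_a, p_b, n, rng)) > crit for _ in range(reps)
    )
    return hits / reps
```

Calling `rejection_rate(p, p)` with equal response rates estimates the empirical type I error at n = 30, and calling it with unequal rates estimates power. Comparing the first figure with the nominal 0.05 is the kind of check the abstract reports for each statistic and randomization method.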

Original language	English (US)
Article number	48
Journal	BMC Medical Research Methodology
Volume	10
DOIs	https://doi.org/10.1186/1471-2288-10-48
State	Published - 2010
Externally published	Yes

ASJC Scopus subject areas

Epidemiology
Health Informatics

MD Anderson CCSG core facilities

Biostatistics Resource Group

Cite this

@article{833f656afca44143b564f309276bb948,

title = "A simulation study for comparing testing statistics in response-adaptive randomization",

author = "Xuemin Gu and Lee, {J. Jack}",

note = "Funding Information: This work was supported in part by grants CA16672 from the National Cancer Institute and W81XWH-06-1-0303 and W81XWH-07-1-0306 from the Department of Defense. The authors thank Dr. Lunagomez for helpful discussions. The authors also thank Ms. Lee Ann Chastain for her help, which greatly improved the presentation of our study.",

year = "2010",

doi = "10.1186/1471-2288-10-48",

language = "English (US)",

volume = "10",

journal = "BMC Medical Research Methodology",

issn = "1471-2288",

publisher = "BioMed Central",

}

TY - JOUR

T1 - A simulation study for comparing testing statistics in response-adaptive randomization

AU - Gu, Xuemin

AU - Lee, J. Jack

PY - 2010

Y1 - 2010

UR - http://www.scopus.com/inward/record.url?scp=77953053344&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=77953053344&partnerID=8YFLogxK

U2 - 10.1186/1471-2288-10-48

DO - 10.1186/1471-2288-10-48

M3 - Article

C2 - 20525382

AN - SCOPUS:77953053344

SN - 1471-2288

VL - 10

JO - BMC Medical Research Methodology

JF - BMC Medical Research Methodology

M1 - 48

ER -
