TY - JOUR
T1 - eClock
T2 - An ensemble-based method to accurately predict ages with a biased distribution from DNA methylation data
AU - Liu, Yu
N1 - Publisher Copyright:
This is an open access article, free of all copyright, and may be freely reproduced, distributed, transmitted, modified, built upon, or otherwise used by anyone for any lawful purpose. The work is made available under the Creative Commons CC0 public domain dedication.
PY - 2022/5
Y1 - 2022/5
AB - DNA methylation is closely related to senescence, so it has been used to develop statistical models, called clock models, that predict chronological age accurately. However, because the training data always have a biased age distribution, model performance is weak for samples from age ranges with low distribution density. To solve this problem, we developed the R package eClock, which uses a bagging-SMOTE method to adjust the biased distribution and predicts age with an ensemble model. It also provides a bootstrapped model based on bagging only and a traditional clock model. Performance on three datasets showed that the bagging-SMOTE model significantly improved age prediction for rare samples. In addition to model construction, the package provides other functions, such as data visualization and methylation feature conversion, to facilitate research in relevant areas.
UR - http://www.scopus.com/inward/record.url?scp=85129714330&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85129714330&partnerID=8YFLogxK
U2 - 10.1371/journal.pone.0267349
DO - 10.1371/journal.pone.0267349
M3 - Article
C2 - 35522643
AN - SCOPUS:85129714330
SN - 1932-6203
VL - 17
JO - PLoS ONE
JF - PLoS ONE
IS - 5
M1 - e0267349
ER -