TY - JOUR
T1 - scDeconv
T2 - an R package to deconvolve bulk DNA methylation data with scRNA-seq data and paired bulk RNA–DNA methylation data
AU - Liu, Yu
N1 - Publisher Copyright:
© The Author(s) 2022. Published by Oxford University Press. All rights reserved.
PY - 2022/5/1
Y1 - 2022/5/1
N2 - Many DNA methylation (DNAm) data are from tissues composed of various cell types, and hence cell deconvolution methods are needed to infer their cell compositions accurately. However, a bottleneck for DNAm data is the lack of cell-type-specific DNAm references. On the other hand, scRNA-seq data are being accumulated rapidly with various cell-type transcriptomic signatures characterized, and also, many paired bulk RNA-DNAm data are publicly available currently. Hence, we developed the R package scDeconv to use these resources to solve the reference deficiency problem of DNAm data and deconvolve them from scRNA-seq data in a trans-omics manner. It assumes that paired samples have similar cell compositions. So the cell content information deconvolved from the scRNA-seq and paired RNA data can be transferred to the paired DNAm samples. Then an ensemble model is trained to fit these cell contents with DNAm features and adjust the paired RNA deconvolution in a co-training manner. Finally, the model can be used on other bulk DNAm data to predict their relative cell-type abundances. The effectiveness of this method is proved by its accurate deconvolution on the three testing datasets here, and if given an appropriate paired dataset, scDeconv can also deconvolve other omics, such as ATAC-seq data. Furthermore, the package also contains other functions, such as identifying cell-type-specific inter-group differential features from bulk DNAm data. scDeconv is available at: https://github.com/yuabrahamliu/scDeconv.
AB - Many DNA methylation (DNAm) data are from tissues composed of various cell types, and hence cell deconvolution methods are needed to infer their cell compositions accurately. However, a bottleneck for DNAm data is the lack of cell-type-specific DNAm references. On the other hand, scRNA-seq data are being accumulated rapidly with various cell-type transcriptomic signatures characterized, and also, many paired bulk RNA-DNAm data are publicly available currently. Hence, we developed the R package scDeconv to use these resources to solve the reference deficiency problem of DNAm data and deconvolve them from scRNA-seq data in a trans-omics manner. It assumes that paired samples have similar cell compositions. So the cell content information deconvolved from the scRNA-seq and paired RNA data can be transferred to the paired DNAm samples. Then an ensemble model is trained to fit these cell contents with DNAm features and adjust the paired RNA deconvolution in a co-training manner. Finally, the model can be used on other bulk DNAm data to predict their relative cell-type abundances. The effectiveness of this method is proved by its accurate deconvolution on the three testing datasets here, and if given an appropriate paired dataset, scDeconv can also deconvolve other omics, such as ATAC-seq data. Furthermore, the package also contains other functions, such as identifying cell-type-specific inter-group differential features from bulk DNAm data. scDeconv is available at: https://github.com/yuabrahamliu/scDeconv.
KW - cell deconvolution
KW - cell-type-specific inter-group differential features
KW - co-training
KW - DNA methylation
KW - ensemble
KW - scRNA-seq
UR - http://www.scopus.com/inward/record.url?scp=85130766827&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85130766827&partnerID=8YFLogxK
U2 - 10.1093/bib/bbac150
DO - 10.1093/bib/bbac150
M3 - Article
C2 - 35453146
AN - SCOPUS:85130766827
SN - 1467-5463
VL - 23
JO - Briefings in bioinformatics
JF - Briefings in bioinformatics
IS - 3
M1 - bbac150
ER -