A transformer-based hierarchical registration framework for multimodality deformable image registration

Yao Zhao; Xinru Chen; Brigid McDonald; Cenji Yu; Abdalah S.R. Mohamed; Clifton D. Fuller; Laurence E. Court; Tinsu Pan; He Wang; Xin Wang; Jack Phan; Jinzhong Yang

doi:10.1016/j.compmedimag.2023.102286

Research output: Contribution to journal › Article › peer-review

Abstract

Deformable image registration (DIR) between daily and reference images is fundamentally important for adaptive radiotherapy. In the last decade, deep learning-based image registration methods have been developed that offer faster computation and improved robustness compared with traditional methods. However, registration performance often degrades in extra-cranial sites with large volumes containing multiple anatomic regions, such as the Computed Tomography (CT)/Magnetic Resonance (MR) images used in head and neck (HN) radiotherapy. In this study, we developed a hierarchical DIR framework, Patch-based Registration Network (Patch-RegNet), to improve the accuracy and speed of CT-MR and MR-MR registration for head-and-neck MR-Linac treatments. Patch-RegNet includes three steps: a whole-volume global registration, a patch-based local registration, and a patch-based deformable registration. Following a whole-volume rigid registration, the input images were divided into overlapping patches. A patch-based rigid registration was then applied to achieve accurate local alignment for the subsequent DIR. We developed a ViT-Morph model, a combination of a convolutional neural network (CNN) and the Vision Transformer (ViT), for the patch-based DIR. A modality-independent neighborhood descriptor was adopted in our model as the similarity metric to account for both inter-modality and intra-modality registration. The CT-MR and MR-MR DIR models were trained with 242 CT-MR and 213 MR-MR image pairs from 36 patients, respectively, and both were tested with 24 image pairs (CT-MR and MR-MR) from 6 other patients. Registration performance was evaluated on 7 manually contoured organs (brainstem, spinal cord, mandible, left/right parotids, left/right submandibular glands) and compared with the traditional registration methods in the Monaco treatment planning system and with the popular deep learning-based DIR framework VoxelMorph. Evaluation results show that our method outperformed VoxelMorph by 6% for CT-MR registration and 4% for MR-MR registration based on Dice similarity coefficient (DSC) measurements. Our hierarchical registration framework achieved significantly improved DIR accuracy for both CT-MR and MR-MR registration in head-and-neck MR-guided adaptive radiotherapy.
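
Two of the steps named in the abstract, dividing a volume into overlapping patches and scoring a registration with the DSC, can be illustrated with a minimal Python sketch. This is not the authors' implementation. The function names (patch_starts, extract_patches, dice), the patch size, and the stride are hypothetical choices made for illustration only; the abstract does not state the values used in Patch-RegNet.

```python
import numpy as np


def patch_starts(length, patch, stride):
    """Start indices of overlapping patches covering [0, length) on one axis.

    The patch size and stride are illustrative assumptions, not values
    taken from the paper.
    """
    if patch >= length:
        return [0]
    starts = list(range(0, length - patch + 1, stride))
    if starts[-1] != length - patch:
        # Add a final patch flush with the volume edge so nothing is missed.
        starts.append(length - patch)
    return starts


def extract_patches(volume, patch_shape, stride):
    """Yield (origin, patch) pairs of overlapping 3D patches from a volume."""
    pz, py, px = patch_shape
    zs, ys, xs = (
        patch_starts(n, p, s)
        for n, p, s in zip(volume.shape, patch_shape, stride)
    )
    for z in zs:
        for y in ys:
            for x in xs:
                yield (z, y, x), volume[z:z + pz, y:y + py, x:x + px]


def dice(mask_a, mask_b):
    """Dice similarity coefficient: 2|A and B| / (|A| + |B|) for binary masks."""
    a = np.asarray(mask_a, dtype=bool)
    b = np.asarray(mask_b, dtype=bool)
    denom = a.sum() + b.sum()
    if denom == 0:
        # Convention assumed here: two empty masks count as a perfect match.
        return 1.0
    return 2.0 * np.logical_and(a, b).sum() / denom
```

In a setting like the one described, dice would be applied per organ to the warped contour of the moving image and the manual contour on the fixed image, and the per-organ scores then compared across registration methods.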

Original language	English (US)
Article number	102286
Journal	Computerized Medical Imaging and Graphics
Volume	108
DOIs	https://doi.org/10.1016/j.compmedimag.2023.102286
State	Published - Sep 2023

Keywords

CT/MR deformable registration
Multi-modality registration
Patch-based registration
Vision transformer

ASJC Scopus subject areas

Radiological and Ultrasound Technology
Radiology, Nuclear Medicine and Imaging
Computer Vision and Pattern Recognition
Health Informatics
Computer Graphics and Computer-Aided Design

Access to Document

10.1016/j.compmedimag.2023.102286

Cite this

@article{12e848819bc24dbaaa21fff9c2de544e,

title = "A transformer-based hierarchical registration framework for multimodality deformable image registration",

abstract = "Deformable image registration (DIR) between daily and reference images is fundamentally important for adaptive radiotherapy. In the last decade, deep learning-based image registration methods have been developed that offer faster computation and improved robustness compared with traditional methods. However, registration performance often degrades in extra-cranial sites with large volumes containing multiple anatomic regions, such as the Computed Tomography (CT)/Magnetic Resonance (MR) images used in head and neck (HN) radiotherapy. In this study, we developed a hierarchical DIR framework, Patch-based Registration Network (Patch-RegNet), to improve the accuracy and speed of CT-MR and MR-MR registration for head-and-neck MR-Linac treatments. Patch-RegNet includes three steps: a whole-volume global registration, a patch-based local registration, and a patch-based deformable registration. Following a whole-volume rigid registration, the input images were divided into overlapping patches. A patch-based rigid registration was then applied to achieve accurate local alignment for the subsequent DIR. We developed a ViT-Morph model, a combination of a convolutional neural network (CNN) and the Vision Transformer (ViT), for the patch-based DIR. A modality-independent neighborhood descriptor was adopted in our model as the similarity metric to account for both inter-modality and intra-modality registration. The CT-MR and MR-MR DIR models were trained with 242 CT-MR and 213 MR-MR image pairs from 36 patients, respectively, and both were tested with 24 image pairs (CT-MR and MR-MR) from 6 other patients. Registration performance was evaluated on 7 manually contoured organs (brainstem, spinal cord, mandible, left/right parotids, left/right submandibular glands) and compared with the traditional registration methods in the Monaco treatment planning system and with the popular deep learning-based DIR framework VoxelMorph. Evaluation results show that our method outperformed VoxelMorph by 6% for CT-MR registration and 4% for MR-MR registration based on Dice similarity coefficient (DSC) measurements. Our hierarchical registration framework achieved significantly improved DIR accuracy for both CT-MR and MR-MR registration in head-and-neck MR-guided adaptive radiotherapy.",

keywords = "CT/MR deformable registration, Multi-modality registration, Patch-based registration, Vision transformer",

author = "Yao Zhao and Xinru Chen and Brigid McDonald and Cenji Yu and Mohamed, {Abdalah S.R.} and Fuller, {Clifton D.} and Court, {Laurence E.} and Tinsu Pan and He Wang and Xin Wang and Jack Phan and Jinzhong Yang",

note = "Publisher Copyright: {\textcopyright} 2023 Elsevier Ltd",

year = "2023",

month = sep,

doi = "10.1016/j.compmedimag.2023.102286",

language = "English (US)",

volume = "108",

journal = "Computerized Medical Imaging and Graphics",

issn = "0895-6111",

publisher = "Elsevier Limited",

}

TY - JOUR

T1 - A transformer-based hierarchical registration framework for multimodality deformable image registration

AU - Zhao, Yao

AU - Chen, Xinru

AU - McDonald, Brigid

AU - Yu, Cenji

AU - Mohamed, Abdalah S.R.

AU - Fuller, Clifton D.

AU - Court, Laurence E.

AU - Pan, Tinsu

AU - Wang, He

AU - Wang, Xin

AU - Phan, Jack

AU - Yang, Jinzhong

PY - 2023/9

Y1 - 2023/9

N2 - Deformable image registration (DIR) between daily and reference images is fundamentally important for adaptive radiotherapy. In the last decade, deep learning-based image registration methods have been developed that offer faster computation and improved robustness compared with traditional methods. However, registration performance often degrades in extra-cranial sites with large volumes containing multiple anatomic regions, such as the Computed Tomography (CT)/Magnetic Resonance (MR) images used in head and neck (HN) radiotherapy. In this study, we developed a hierarchical DIR framework, Patch-based Registration Network (Patch-RegNet), to improve the accuracy and speed of CT-MR and MR-MR registration for head-and-neck MR-Linac treatments. Patch-RegNet includes three steps: a whole-volume global registration, a patch-based local registration, and a patch-based deformable registration. Following a whole-volume rigid registration, the input images were divided into overlapping patches. A patch-based rigid registration was then applied to achieve accurate local alignment for the subsequent DIR. We developed a ViT-Morph model, a combination of a convolutional neural network (CNN) and the Vision Transformer (ViT), for the patch-based DIR. A modality-independent neighborhood descriptor was adopted in our model as the similarity metric to account for both inter-modality and intra-modality registration. The CT-MR and MR-MR DIR models were trained with 242 CT-MR and 213 MR-MR image pairs from 36 patients, respectively, and both were tested with 24 image pairs (CT-MR and MR-MR) from 6 other patients. Registration performance was evaluated on 7 manually contoured organs (brainstem, spinal cord, mandible, left/right parotids, left/right submandibular glands) and compared with the traditional registration methods in the Monaco treatment planning system and with the popular deep learning-based DIR framework VoxelMorph. Evaluation results show that our method outperformed VoxelMorph by 6% for CT-MR registration and 4% for MR-MR registration based on Dice similarity coefficient (DSC) measurements. Our hierarchical registration framework achieved significantly improved DIR accuracy for both CT-MR and MR-MR registration in head-and-neck MR-guided adaptive radiotherapy.

KW - CT/MR deformable registration

KW - Multi-modality registration

KW - Patch-based registration

KW - Vision transformer

UR - http://www.scopus.com/inward/record.url?scp=85169503588&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85169503588&partnerID=8YFLogxK

U2 - 10.1016/j.compmedimag.2023.102286

DO - 10.1016/j.compmedimag.2023.102286

M3 - Article

C2 - 37625307

AN - SCOPUS:85169503588

SN - 0895-6111

VL - 108

JO - Computerized Medical Imaging and Graphics

JF - Computerized Medical Imaging and Graphics

M1 - 102286

ER -
