TY - JOUR
T1 - Comparison of deep learning schemes in grading non-alcoholic fatty liver disease using B-mode ultrasound hepatorenal window images with liver biopsy as the gold standard
AU - Drazinos, Petros
AU - Gatos, Ilias
AU - Katsakiori, Paraskevi F.
AU - Tsantis, Stavros
AU - Syrmas, Efstratios
AU - Spiliopoulos, Stavros
AU - Karnabatidis, Dimitris
AU - Theotokas, Ioannis
AU - Zoumpoulis, Pavlos
AU - Hazle, John D.
AU - Kagadis, George C.
N1 - Publisher Copyright: © 2024 Associazione Italiana di Fisica Medica e Sanitaria
PY - 2025/1
Y1 - 2025/1
N2 - Background/Introduction: To evaluate the performance of pre-trained deep learning schemes (DLS) in hepatic steatosis (HS) grading of Non-Alcoholic Fatty Liver Disease (NAFLD) patients, using as input B-mode US images containing right kidney (RK) cortex and liver parenchyma (LP) areas indicated by an expert radiologist. Methods: A total of 112 consecutively enrolled, biopsy-validated NAFLD patients underwent a regular abdominal B-mode US examination. For each patient, a radiologist obtained a B-mode US image containing RK cortex and LP and marked a point between the RK and LP, around which a window was automatically cropped. The cropped image dataset was augmented using up-sampling, and the augmented and non-augmented datasets were sorted by HS grade. Each dataset was split into training (70 %) and testing (30 %), and fed separately as input to InceptionV3, MobileNetV2, ResNet50, DenseNet201, and NASNetMobile pre-trained DLS. A receiver operating characteristic (ROC) analysis of hepatorenal index (HRI) measurements by the radiologist from the same cropped images was used for comparison with the performance of the DLS. Results: With the test data, the DLS reached 89.15 %–93.75 % accuracy when comparing HS grades S0–S1 vs. S2–S3 and 79.69 %–91.21 % accuracy for S0 vs. S1 vs. S2 vs. S3 with augmentation, and 80.45 %–82.73 % accuracy when comparing S0–S1 vs. S2–S3 and 59.54 %–63.64 % accuracy for S0 vs. S1 vs. S2 vs. S3 without augmentation. The performance of radiologists’ HRI measurement after ROC analysis was 82 %, 91.56 %, and 96.19 % for thresholds of S ≥ S1, S ≥ S2, and S = S3, respectively. Conclusion: All networks achieved high performance in HS assessment. DenseNet201 with the use of augmented data seems to be the most efficient supplementary tool for NAFLD diagnosis and grading.
AB - Background/Introduction: To evaluate the performance of pre-trained deep learning schemes (DLS) in hepatic steatosis (HS) grading of Non-Alcoholic Fatty Liver Disease (NAFLD) patients, using as input B-mode US images containing right kidney (RK) cortex and liver parenchyma (LP) areas indicated by an expert radiologist. Methods: A total of 112 consecutively enrolled, biopsy-validated NAFLD patients underwent a regular abdominal B-mode US examination. For each patient, a radiologist obtained a B-mode US image containing RK cortex and LP and marked a point between the RK and LP, around which a window was automatically cropped. The cropped image dataset was augmented using up-sampling, and the augmented and non-augmented datasets were sorted by HS grade. Each dataset was split into training (70 %) and testing (30 %), and fed separately as input to InceptionV3, MobileNetV2, ResNet50, DenseNet201, and NASNetMobile pre-trained DLS. A receiver operating characteristic (ROC) analysis of hepatorenal index (HRI) measurements by the radiologist from the same cropped images was used for comparison with the performance of the DLS. Results: With the test data, the DLS reached 89.15 %–93.75 % accuracy when comparing HS grades S0–S1 vs. S2–S3 and 79.69 %–91.21 % accuracy for S0 vs. S1 vs. S2 vs. S3 with augmentation, and 80.45 %–82.73 % accuracy when comparing S0–S1 vs. S2–S3 and 59.54 %–63.64 % accuracy for S0 vs. S1 vs. S2 vs. S3 without augmentation. The performance of radiologists’ HRI measurement after ROC analysis was 82 %, 91.56 %, and 96.19 % for thresholds of S ≥ S1, S ≥ S2, and S = S3, respectively. Conclusion: All networks achieved high performance in HS assessment. DenseNet201 with the use of augmented data seems to be the most efficient supplementary tool for NAFLD diagnosis and grading.
KW - B-mode ultrasound
KW - Chronic liver disease
KW - Hepatic steatosis
KW - Pre-trained deep learning schemes
UR - http://www.scopus.com/inward/record.url?scp=85210546617&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85210546617&partnerID=8YFLogxK
U2 - 10.1016/j.ejmp.2024.104862
DO - 10.1016/j.ejmp.2024.104862
M3 - Article
C2 - 39626614
AN - SCOPUS:85210546617
SN - 1120-1797
VL - 129
JO - Physica Medica
JF - Physica Medica
M1 - 104862
ER -