Comparison of deep learning schemes in grading non-alcoholic fatty liver disease using B-mode ultrasound hepatorenal window images with liver biopsy as the gold standard

Petros Drazinos, Ilias Gatos, Paraskevi F. Katsakiori, Stavros Tsantis, Efstratios Syrmas, Stavros Spiliopoulos, Dimitris Karnabatidis, Ioannis Theotokas, Pavlos Zoumpoulis, John D. Hazle, George C. Kagadis

Research output: Contribution to journal, Article, peer-review

Abstract

Background/Introduction: To evaluate the performance of pre-trained deep learning schemes (DLS) in hepatic steatosis (HS) grading of Non-Alcoholic Fatty Liver Disease (NAFLD) patients, using as input B-mode ultrasound (US) images containing right kidney (RK) cortex and liver parenchyma (LP) areas indicated by an expert radiologist.

Methods: A total of 112 consecutively enrolled, biopsy-validated NAFLD patients underwent a regular abdominal B-mode US examination. For each patient, a radiologist obtained a B-mode US image containing RK cortex and LP and marked a point between the RK and LP, around which a window was automatically cropped. The cropped image dataset was augmented using up-sampling, and the augmented and non-augmented datasets were sorted by HS grade. Each dataset was split into training (70%) and testing (30%) subsets and fed separately as input to the pre-trained InceptionV3, MobileNetV2, ResNet50, DenseNet201, and NASNetMobile DLS. A receiver operating characteristic (ROC) analysis of hepatorenal index (HRI) measurements made by the radiologist on the same cropped images was used for comparison with the performance of the DLS.

Results: On the test data with augmentation, the DLS reached 89.15%–93.75% accuracy for HS grades S0–S1 vs. S2–S3 and 79.69%–91.21% accuracy for S0 vs. S1 vs. S2 vs. S3. Without augmentation, they reached 80.45%–82.73% accuracy for S0–S1 vs. S2–S3 and 59.54%–63.64% accuracy for S0 vs. S1 vs. S2 vs. S3. The performance of the radiologist's HRI measurement after ROC analysis was 82%, 91.56%, and 96.19% for thresholds of S ≥ S1, S ≥ S2, and S = S3, respectively.

Conclusion: All networks achieved high performance in HS assessment. DenseNet201 with the use of augmented data seems to be the most efficient supplementary tool for NAFLD diagnosis and grading.
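The data-preparation steps described in the Methods (cropping a window around the marked point, grouping grades, and splitting 70%/30%) can be sketched roughly as follows. This is an illustrative sketch only, not the authors' pipeline: the function names, the `half_size` window parameter, the clamping at image borders, the per-grade (stratified) split, and the fixed random seed are all assumptions, since the abstract does not specify them.

```python
import random
from collections import defaultdict


def crop_window(image, center_row, center_col, half_size):
    """Crop a square window around a marked point, clamped to the image bounds.

    `image` is a list of rows (lists of pixel values). `half_size` is a
    hypothetical parameter; the abstract does not state the window size.
    """
    n_rows, n_cols = len(image), len(image[0])
    top = max(0, center_row - half_size)
    bottom = min(n_rows, center_row + half_size + 1)
    left = max(0, center_col - half_size)
    right = min(n_cols, center_col + half_size + 1)
    return [row[left:right] for row in image[top:bottom]]


def binarize_grade(grade):
    """Map grades S0-S1 to class 0 and S2-S3 to class 1 (grade given as 0..3)."""
    return 0 if grade <= 1 else 1


def stratified_split(samples, train_fraction=0.7, seed=0):
    """Split (item, grade) pairs into train/test lists, grade by grade.

    Splitting within each grade is an assumption made here so that both
    subsets contain every grade; the abstract only states a 70%/30% split.
    """
    by_grade = defaultdict(list)
    for item, grade in samples:
        by_grade[grade].append((item, grade))

    rng = random.Random(seed)  # fixed seed so the split is reproducible
    train, test = [], []
    for grade in sorted(by_grade):
        group = by_grade[grade]
        rng.shuffle(group)
        cut = round(len(group) * train_fraction)
        train.extend(group[:cut])
        test.extend(group[cut:])
    return train, test
```

For the two-class comparison (S0–S1 vs. S2–S3), `binarize_grade` would be applied to the labels before training; for the four-class comparison the grades would be used unchanged.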

Original language: English (US)
Article number: 104862
Journal: Physica Medica
Volume: 129
State: Published - Jan 2025

Keywords

  • B-mode ultrasound
  • Chronic liver disease
  • Hepatic steatosis
  • Pre-trained deep learning schemes

ASJC Scopus subject areas

  • Biophysics
  • Radiology, Nuclear Medicine and Imaging
  • General Physics and Astronomy
