High quality machine-robust image features: Identification in nonsmall cell lung cancer computed tomography images

Luke A. Hunter, Shane Krafft, Francesco Stingo, Haesun Choi, Mary K. Martel, Stephen F. Kry, Laurence E. Court

Research output: Contribution to journal › Article › peer-review

85 Scopus citations

Abstract

Purpose: For nonsmall cell lung cancer (NSCLC) patients, quantitative image features extracted from computed tomography (CT) images can be used to improve tumor diagnosis, staging, and response assessment. For these findings to be clinically applied, image features need to have high intra- and intermachine reproducibility. The objective of this study is to identify CT image features that are reproducible, nonredundant, and informative across multiple machines.

Methods: Noncontrast-enhanced, test-retest CT image pairs were obtained from 56 NSCLC patients imaged on three CT machines from two institutions. Two machines ("M1" and "M2") used cine 4D-CT and one machine ("M3") used breath-hold helical 3D-CT. Gross tumor volumes (GTVs) were semiautomatically segmented, then pruned by removing voxels with CT numbers less than a prescribed Hounsfield unit (HU) cutoff. Three hundred and twenty-eight quantitative image features were extracted from each pruned GTV based on its geometry, intensity histogram, absolute gradient image, co-occurrence matrix, and run-length matrix. For each machine, features with concordance correlation coefficient values greater than 0.90 were considered reproducible. The Dice similarity coefficient (DSC) and the Jaccard index (JI) were used to quantify reproducible feature set agreement between machines. Multimachine reproducible feature sets were created by taking the intersection of individual machine reproducible feature sets. Redundant features were removed through hierarchical clustering based on the average correlation between features across multiple machines.

Results: For all image types, GTV pruning was found to negatively affect reproducibility (reported results use no HU cutoff). The reproducible feature percentage was highest for average images (M1 = 90.5%, M2 = 94.5%, M1∩M2 = 86.3%), intermediate for end-exhale images (M1 = 75.0%, M2 = 71.0%, M1∩M2 = 52.1%), and lowest for breath-hold images (M3 = 61.0%). Between M1 and M2, the reproducible feature sets generated from end-exhale images were relatively machine-sensitive (DSC = 0.71, JI = 0.55), and the reproducible feature sets generated from average images were relatively machine-insensitive (DSC = 0.93, JI = 0.87). Histograms of feature pair correlation distances indicated that feature redundancy was machine-sensitive and image type sensitive. After hierarchical clustering, 38 features, 28 features, and 33 features were found to be reproducible and nonredundant for M1∩M2 (average images), M1∩M2 (end-exhale images), and M3, respectively. When blinded to the presence of test-retest images, hierarchical clustering showed that the selected features were informative by correctly pairing 55 out of 56 test-retest images using only their reproducible, nonredundant feature set values.

Conclusions: Image feature reproducibility and redundancy depended on both the CT machine and the CT image type. For each image type, the authors found a set of cross-machine reproducible, nonredundant, and informative image features that would be useful for future image-based models. Compared to end-exhale 4D-CT and breath-hold 3D-CT, average 4D-CT derived image features showed superior multimachine reproducibility and are the best candidates for clinical correlation.
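
The reproducibility screen and the set-agreement metrics named in the Methods can be illustrated with a short Python sketch. This is not the authors' code: the function names, the array layout (patients by features), and the use of population (biased) variances in the concordance correlation coefficient are assumptions made for illustration. The last function recomputes the DSC and JI from the reported end-exhale percentages as an arithmetic check.

```python
import numpy as np


def concordance_cc(x, y):
    """Lin's concordance correlation coefficient for paired measurements.

    Assumption: population (biased) variances and covariance are used;
    the abstract does not say which estimator the authors chose.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    mx, my = x.mean(), y.mean()
    cov = ((x - mx) * (y - my)).mean()
    return 2.0 * cov / (x.var() + y.var() + (mx - my) ** 2)


def reproducible_features(test, retest, names, threshold=0.90):
    """Return the names of features whose test-retest CCC exceeds threshold.

    test, retest: arrays of shape (n_patients, n_features), hypothetical layout.
    """
    test = np.asarray(test, dtype=float)
    retest = np.asarray(retest, dtype=float)
    return {
        name
        for j, name in enumerate(names)
        if concordance_cc(test[:, j], retest[:, j]) > threshold
    }


def dice(a, b):
    """Dice similarity coefficient between two sets."""
    return 2.0 * len(a & b) / (len(a) + len(b))


def jaccard(a, b):
    """Jaccard index between two sets."""
    return len(a & b) / len(a | b)


def overlap_from_fractions(frac_a, frac_b, frac_both):
    """DSC and JI from set sizes given as fractions of a common feature pool."""
    dsc = 2.0 * frac_both / (frac_a + frac_b)
    ji = frac_both / (frac_a + frac_b - frac_both)
    return dsc, ji


# Reported end-exhale percentages: M1 = 75.0%, M2 = 71.0%, M1∩M2 = 52.1%.
dsc, ji = overlap_from_fractions(0.750, 0.710, 0.521)
print(round(dsc, 2), round(ji, 2))  # 0.71 0.55, matching the reported values
```

A multimachine reproducible set is then the plain set intersection of the per-machine results, for example `reproducible_features(...) & reproducible_features(...)` evaluated on each machine's test-retest arrays.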

Original language: English (US)
Article number: 121916
Journal: Medical Physics
Volume: 40
Issue number: 12
DOIs
State: Published - Dec 2013

Keywords

  • Lung cancer
  • Quantitative image features
  • Reproducibility

ASJC Scopus subject areas

  • Biophysics
  • Radiology, Nuclear Medicine and Imaging

MD Anderson CCSG core facilities

  • Biostatistics Resource Group
