The University of Southampton
University of Southampton Institutional Repository

Wasserstein-based texture analysis in radiomic studies

Belkhatir, Zehor, Estépar, Raúl San José and Tannenbaum, Allen R. (2022) Wasserstein-based texture analysis in radiomic studies. Computerized Medical Imaging and Graphics, 102, [102129]. (doi:10.1016/j.compmedimag.2022.102129).

Record type: Article

Abstract

The emerging field of radiomics, which transforms standard-of-care images into quantifiable scalar statistics, endeavors to reveal the information hidden in these macroscopic images. The concept of texture is widely used and essential in many radiomic-based studies. Common practice reduces spatial multidimensional texture matrices, e.g., gray-level co-occurrence matrices (GLCMs), to summary scalar features. These statistical features have been shown to be strongly correlated, tend to contribute redundant information, and do not account for the spatial information hidden in the multivariate texture matrices. This study proposes a novel pipeline for handling spatial texture features in radiomic studies. A new set of textural features that preserve the spatial information inherent in GLCMs is proposed and used for classification purposes. The new feature set uses the Wasserstein metric from optimal mass transport (OMT) theory to quantify the spatial similarity between samples within a given label class. In particular, based on a selected subset of texture GLCMs from the training cohort, we propose new representative spatial texture features, which we incorporate into a supervised image classification pipeline. The pipeline relies on the support vector machine (SVM) algorithm along with Bayesian optimization and the Wasserstein metric. The best GLCM references are selected for each classification label during the training phase of the SVM classifier using a Bayesian optimizer. We assume that sample fitness is defined by closeness (in the sense of the Wasserstein metric) and high correlation (in the sense of Spearman’s rank correlation) with other samples in the same class. The newly defined spatial texture features then consist of the Wasserstein distances between the optimally selected references and the remaining samples.
We assessed the performance of the proposed classification pipeline in diagnosing coronavirus disease 2019 (COVID-19) from computed tomography (CT) images. To evaluate the added value of the proposed spatial features, we compared the performance of the proposed classification pipeline with other SVM-based classifiers that account for different texture features, namely statistical features only, optimized spatial features using the Euclidean metric, and non-optimized spatial features using the Wasserstein metric. The proposed technique, which uses the optimized spatial texture features with the Wasserstein metric, shows great potential for classifying new COVID-19 CT images that the algorithm has not seen during training. The MATLAB code of the proposed classification pipeline is made available. It can be used to find the best reference samples in other data cohorts, which can then be employed to build different prediction models.
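
As a rough illustration of the feature construction described in the abstract, the following Python sketch builds a normalized GLCM for a small image and uses transport distances to a set of reference GLCMs as the feature vector. It is not the authors' MATLAB code. Several things here are assumptions made for illustration only: the function names (`glcm`, `ground_cost`, `sinkhorn_distance`, `wasserstein_features`), the Euclidean ground cost between GLCM cells, the single horizontal pixel offset, and the use of an entropy-regularized (Sinkhorn) approximation in place of an exact Wasserstein solver. Reference selection by Bayesian optimization and the SVM step are not shown.

```python
import math


def glcm(image, levels, offset=(0, 1)):
    """Normalized, symmetric gray-level co-occurrence matrix, flattened row-major."""
    dr, dc = offset
    counts = [[0.0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dr, c + dc
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = image[r][c], image[r2][c2]
                counts[i][j] += 1.0
                counts[j][i] += 1.0  # count each pair in both directions
    total = sum(sum(row) for row in counts)
    return [value / total for row in counts for value in row]


def ground_cost(levels):
    """Euclidean distance between GLCM cells (i, j); an assumed ground cost."""
    cells = [(i, j) for i in range(levels) for j in range(levels)]
    return [[math.hypot(p[0] - q[0], p[1] - q[1]) for q in cells] for p in cells]


def sinkhorn_distance(a, b, cost, eps=0.1, iters=500):
    """Approximate transport cost between histograms a and b (Sinkhorn iterations)."""
    n = len(a)
    kernel = [[math.exp(-cost[i][j] / eps) for j in range(n)] for i in range(n)]
    u = [1.0] * n
    v = [1.0] * n
    for _ in range(iters):
        # Alternately rescale rows and columns to match the two marginals.
        u = [a[i] / sum(kernel[i][j] * v[j] for j in range(n)) for i in range(n)]
        v = [b[j] / sum(kernel[i][j] * u[i] for i in range(n)) for j in range(n)]
    # Cost of the resulting transport plan P[i][j] = u[i] * kernel[i][j] * v[j].
    return sum(u[i] * kernel[i][j] * v[j] * cost[i][j]
               for i in range(n) for j in range(n))


def wasserstein_features(sample, references, cost):
    """Feature vector: one transport distance per reference GLCM."""
    return [sinkhorn_distance(sample, ref, cost) for ref in references]


if __name__ == "__main__":
    levels = 4
    flat = [[0] * 4 for _ in range(4)]                                  # uniform texture
    checker = [[3 * ((r + c) % 2) for c in range(4)] for r in range(4)]  # alternating texture
    cost = ground_cost(levels)
    references = [glcm(flat, levels), glcm(checker, levels)]
    print(wasserstein_features(glcm(checker, levels), references, cost))
```

For these two toy textures, the first feature (distance from the checkerboard GLCM to the flat reference) comes out close to 3 and the second (distance to itself) close to 0. Two scalar summaries could not distinguish where in the matrix the mass sits, whereas the transport distance depends on how far mass must move between GLCM cells.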

Text
1-s2.0-S0895611122000994-main - Version of Record
Available under License Creative Commons Attribution.

More information

Accepted/In Press date: 3 October 2022
e-pub ahead of print date: 19 October 2022
Published date: 26 October 2022

Identifiers

Local EPrints ID: 502052
URI: http://eprints.soton.ac.uk/id/eprint/502052
ISSN: 0895-6111
PURE UUID: a7ca3106-cf85-45b5-aa48-bb6d10c03d97
ORCID for Zehor Belkhatir: orcid.org/0000-0001-7277-3895

Catalogue record

Date deposited: 16 Jun 2025 16:30
Last modified: 22 Aug 2025 02:38

Contributors

Author: Zehor Belkhatir
Author: Raúl San José Estépar
Author: Allen R. Tannenbaum
