The University of Southampton
University of Southampton Institutional Repository

Learning high-order features for fine-grained visual categorization with causal inference

Zhang, Yuhang, Wan, Yuan, Hao, Jiahui, Yang, Zaili and Li, Huanhuan (2025) Learning high-order features for fine-grained visual categorization with causal inference. Mathematics, 13 (8), [1340]. (doi:10.3390/math13081340).

Record type: Article

Abstract

Causal models have recently gained significant attention in natural language processing (NLP) and computer vision (CV) for their ability to capture features with causal relationships. This study addresses Fine-Grained Visual Categorization (FGVC) by incorporating high-order feature fusion to improve the representation of feature interactions while mitigating the influence of confounding factors through causal inference. A novel high-order feature learning framework with causal inference is developed to enhance FGVC. A causal graph tailored to FGVC is constructed, and the causal assumptions of baseline models are analyzed to identify confounding factors. A reconstructed causal structure establishes meaningful interactions between individual images and image pairs. Causal interventions are applied by severing specific causal links, which reduces confounding effects and improves model robustness. The framework combines high-order feature fusion with interventional fine-grained learning by performing causal interventions on both classifiers and categories. Experimental results show that the proposed method achieves accuracies of 90.7% on CUB-200, 92.0% on FGVC-Aircraft, and 94.8% on Stanford Cars, demonstrating its effectiveness and robustness across these three widely used fine-grained recognition datasets.
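
This record does not describe the paper's actual fusion operator, so the following is only a generic illustration of what "high-order feature fusion" between the features of an image pair can look like. It is a minimal sketch of second-order (bilinear) pooling, assuming each image has already been reduced to a small feature vector. The function name `second_order_fusion` and the example vectors are hypothetical and are not taken from the article.

```python
import math


def second_order_fusion(f_a, f_b):
    """Fuse two feature vectors through their flattened outer product.

    Generic bilinear pooling, used here as an assumed stand-in for
    high-order fusion; it is not the article's specific method.
    """
    # Outer product: every pairwise interaction f_a[i] * f_b[j], flattened.
    fused = [a * b for a in f_a for b in f_b]
    # Signed square root dampens very large interaction values.
    fused = [math.copysign(math.sqrt(abs(v)), v) for v in fused]
    # L2 normalisation keeps the fused descriptor at unit length.
    norm = math.sqrt(sum(v * v for v in fused))
    return [v / norm for v in fused] if norm > 0 else fused


# Hypothetical two-dimensional features for the two images of a pair.
feat_image_1 = [1.0, 4.0]
feat_image_2 = [4.0, -1.0]
fused = second_order_fusion(feat_image_1, feat_image_2)
print(fused)
```

The fused vector has one entry per pair of input dimensions, which is why second-order descriptors can express interactions that a simple concatenation of the two feature vectors cannot.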

Text
mathematics-13-01340-v2 - Version of Record
Available under License Creative Commons Attribution.

More information

Accepted/In Press date: 17 April 2025
Published date: 19 April 2025
Keywords: causal inference, causal intervention, causal models, feature fusion, fine-grained visual categorization

Identifiers

Local EPrints ID: 503705
URI: http://eprints.soton.ac.uk/id/eprint/503705
PURE UUID: e5883558-e602-421d-b279-a9338d7cbdd3
ORCID for Huanhuan Li: orcid.org/0000-0002-4293-4763

Catalogue record

Date deposited: 11 Aug 2025 16:36
Last modified: 22 Aug 2025 02:49

Contributors

Author: Yuhang Zhang
Author: Yuan Wan
Author: Jiahui Hao
Author: Zaili Yang
Author: Huanhuan Li
