Enhancing material features using dynamic backward attention on cross-resolution patches
Heng, Yuwen, Wu, Yihong, Dasmahapatra, Srinandan and Kim, Hansung (2022) Enhancing material features using dynamic backward attention on cross-resolution patches. The 33rd British Machine Vision Conference, London, United Kingdom, 21-24 Nov 2022. 15 pp.
Record type: Conference or Workshop Item (Paper)
Abstract
Recent studies in material segmentation crop images into patches to force the network to learn material features from local visual cues. This design rests on the expectation that contextually invariant features help the network generalise to unseen images regardless of the object or scene in which a material appears. However, most approaches fix a single patch resolution for every image in a dataset, ignoring the varying areas that materials cover within and across images due to scene scale; this fixed resolution can limit network performance. To address this problem, this paper proposes the Dynamic Backward Attention Transformer (DBAT), which extracts features from cross-resolution patches and dynamically aggregates them with per-pixel attention masks. Experiments show that DBAT achieves the best performance among state-of-the-art models capable of real-time inference (86.85% average pixel accuracy, 2.15% higher than the second-best model). Moreover, we illustrate the network's behaviour through visualisation methods and descriptive statistics. The project code is available at https://github.com/heng-yuwen/Dynamic-Backward-Attention-Transformer.
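To make the aggregation idea concrete, below is a minimal PyTorch-style sketch of the per-pixel dynamic weighting described in the abstract: feature maps extracted at several patch resolutions are merged per pixel with learned attention masks. This is an illustration under stated assumptions, not the authors' implementation (see the linked repository for that): the class name DynamicBackwardAttention, the 1x1-convolution mask head, and all tensor shapes are hypothetical.

import torch
import torch.nn as nn
import torch.nn.functional as F

class DynamicBackwardAttention(nn.Module):
    """Hypothetical sketch: aggregate per-resolution features with per-pixel attention masks."""

    def __init__(self, channels: int, num_resolutions: int):
        super().__init__()
        # Predict one attention mask per resolution from the concatenated features.
        self.mask_head = nn.Conv2d(channels * num_resolutions, num_resolutions, kernel_size=1)

    def forward(self, feats: list) -> torch.Tensor:
        # feats: list of (B, C, h, w) feature maps, one per patch resolution.
        # Upsample every map to a common spatial size before aggregation.
        target = feats[0].shape[-2:]
        feats = [F.interpolate(f, size=target, mode="bilinear", align_corners=False) for f in feats]
        stacked = torch.stack(feats, dim=1)                 # (B, R, C, H, W)
        masks = self.mask_head(torch.cat(feats, dim=1))     # (B, R, H, W)
        masks = masks.softmax(dim=1).unsqueeze(2)           # normalise over resolutions: (B, R, 1, H, W)
        return (stacked * masks).sum(dim=1)                 # (B, C, H, W) aggregated features

# Usage: merge three feature maps, e.g. from coarse-to-fine patch resolutions.
if __name__ == "__main__":
    agg = DynamicBackwardAttention(channels=64, num_resolutions=3)
    feats = [torch.randn(2, 64, 32, 32), torch.randn(2, 64, 16, 16), torch.randn(2, 64, 8, 8)]
    out = agg(feats)
    print(out.shape)  # torch.Size([2, 64, 32, 32])

The softmax over the resolution axis gives each pixel a convex combination of the per-resolution features, so regions dominated by small-scale material detail can lean on fine patches while large uniform regions lean on coarse ones; how DBAT actually parameterises these masks is specified in the paper and repository.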
119_22BMVC_Yuwen - Version of Record (restricted to Repository staff only)
More information
e-pub ahead of print date: 21 November 2022
Venue - Dates: The 33rd British Machine Vision Conference, London, United Kingdom, 2022-11-21 - 2022-11-24
Identifiers
Local EPrints ID: 479337
URI: http://eprints.soton.ac.uk/id/eprint/479337
PURE UUID: 55d7a785-d0c3-4950-9e43-768939585d90
Catalogue record
Date deposited: 20 Jul 2023 17:29
Last modified: 01 Oct 2024 02:03