Graph-based visual-semantic entanglement network for zero-shot image recognition
Hu, Yang
3a9d668f-8b65-4a93-b15f-1363e07d44fa
Wen, Guihua
411fd94f-89bd-4ad7-908d-9c876afd7564
Chapman, Adriane
721b7321-8904-4be2-9b01-876c430743f1
Pei, Yang
933fc229-1c3f-4225-8646-d47ce0c684f3
Luo, Mingnan
43faccbb-eead-4787-af0f-d3fbe7f2538b
Xu, Yingxue
d79d4331-b39f-4b6d-9dd5-574926fe7fa4
Dai, Dan
85b7cbb9-cd58-46e1-b7ff-c264e9f46908
Hall, Wendy
11f7f8db-854c-4481-b1ae-721a51d8790c
Hu, Yang, Wen, Guihua, Chapman, Adriane, Pei, Yang, Luo, Mingnan, Xu, Yingxue, Dai, Dan and Hall, Wendy
(2021)
Graph-based visual-semantic entanglement network for zero-shot image recognition.
IEEE Transactions on Multimedia.
(In Press)
Abstract
Zero-shot learning (ZSL) uses semantic attributes to connect the search space of unseen objects. In recent years, deep convolutional networks have brought powerful visual modeling capabilities to the ZSL task, but their visual features suffer from severe pattern inertia and lack a representation of semantic relationships, which leads to severe bias and ambiguity. In response, we propose the Graph-based Visual-Semantic Entanglement Network, which conducts graph modeling of visual features that are mapped to semantic attributes by using a knowledge graph. It contains several novel designs: 1. it establishes a multi-path entangled network with the convolutional neural network (CNN) and the graph convolutional network (GCN), in which the visual features from the CNN are input to the GCN to model the implicit semantic relations, and the GCN then feeds the graph-modeled information back to the CNN features; 2. it uses attribute word vectors as the target for the graph semantic modeling of the GCN, which forms a self-consistent regression for graph modeling and supervises the GCN to learn more personalized attribute relations; 3. it fuses and supplements the hierarchical visual-semantic features refined by graph modeling into the visual embedding. By promoting the semantic linkage modeling of visual features, our method outperforms state-of-the-art approaches on multiple representative ZSL datasets: AwA2, CUB, and SUN.
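The CNN-to-GCN path, the feedback step, and the word-vector regression target described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the toy attribute graph, the identity weight matrix, the additive feedback, and the mean-squared-error loss are all assumptions chosen for illustration, and the only standard component is the normalized graph convolution ReLU(Â H W).

```python
import math


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def normalize_adjacency(adj):
    """Return D^{-1/2} (A + I) D^{-1/2}, the usual GCN propagation matrix."""
    n = len(adj)
    with_loops = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
                  for i in range(n)]
    inv_sqrt_deg = [1.0 / math.sqrt(sum(row)) for row in with_loops]
    return [[inv_sqrt_deg[i] * with_loops[i][j] * inv_sqrt_deg[j] for j in range(n)]
            for i in range(n)]


def gcn_layer(a_hat, node_feats, weight):
    """One graph convolution: ReLU(A_hat @ H @ W)."""
    out = matmul(matmul(a_hat, node_feats), weight)
    return [[max(0.0, v) for v in row] for row in out]


def feed_back(cnn_feats, graph_feats):
    """Assumed feedback step: add graph-modeled features to the CNN features."""
    return [[c + g for c, g in zip(c_row, g_row)]
            for c_row, g_row in zip(cnn_feats, graph_feats)]


def mse(pred, target):
    """Mean squared error, standing in for the word-vector regression target."""
    diffs = [(p - t) ** 2
             for p_row, t_row in zip(pred, target)
             for p, t in zip(p_row, t_row)]
    return sum(diffs) / len(diffs)


# Toy setup (all values hypothetical): three attribute nodes in a chain,
# one 2-d visual feature per node, and 2-d attribute word vectors.
adjacency = [[0.0, 1.0, 0.0],
             [1.0, 0.0, 1.0],
             [0.0, 1.0, 0.0]]
cnn_feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
weight = [[1.0, 0.0], [0.0, 1.0]]  # identity, in place of a learned matrix
word_vectors = [[0.5, 0.5], [0.5, 0.5], [0.5, 0.5]]

a_hat = normalize_adjacency(adjacency)
graph_feats = gcn_layer(a_hat, cnn_feats, weight)  # CNN features enter the GCN
fused = feed_back(cnn_feats, graph_feats)          # GCN output returns to the CNN path
loss = mse(graph_feats, word_vectors)              # regression toward word vectors
```

In a trained model the weight matrix would be learned and the loss would be minimized jointly with the recognition objective; here the values are fixed only so that the data flow is easy to follow.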
Text
Graph-based Visual-Semantic Entanglement Network for Zero-shot Image Recognition
- Accepted Manuscript
More information
Accepted/In Press date: 12 June 2021
Identifiers
Local EPrints ID: 450317
URI: http://eprints.soton.ac.uk/id/eprint/450317
ISSN: 1520-9210
PURE UUID: a4881702-63c3-4769-adf8-e190df47912d
Catalogue record
Date deposited: 22 Jul 2021 16:31
Last modified: 17 Mar 2024 03:46
Contributors
Author: Yang Hu
Author: Guihua Wen
Author: Yang Pei
Author: Mingnan Luo
Author: Yingxue Xu
Author: Dan Dai