One-for-all: towards universal domain translation with a single StyleGAN
Du, Yong, Zhan, Jiahui, Li, Xinzhe, Dong, Junyu, Chen, Sheng, Yang, Ming-Hsuan and He, Shengfeng (2025) One-for-all: towards universal domain translation with a single StyleGAN. IEEE Transactions on Pattern Analysis and Machine Intelligence, 47 (4), 2865-2881. (doi:10.1109/TPAMI.2025.3530099).
Abstract
In this paper, we propose a novel translation model, UniTranslator, for transforming representations between visually distinct domains under conditions of limited training data and significant visual differences. The main idea behind our approach is to leverage the domain-neutral capabilities of CLIP as a bridging mechanism, while utilizing a separate module to extract abstract, domain-agnostic semantics from the embeddings of both the source and target domains. Fusing these abstract semantics with target-specific semantics results in a transformed embedding within the CLIP space. To bridge the gap between the disparate spaces of CLIP and StyleGAN, we introduce a new non-linear mapper, the CLIP2P mapper. Utilizing CLIP embeddings, this module is tailored to approximate the latent distribution in StyleGAN's latent space, effectively acting as a connector between these two spaces. The proposed UniTranslator is versatile and capable of performing various tasks, including style mixing, stylization, and translations, even in visually challenging scenarios across different visual domains. Notably, UniTranslator generates high-quality translations that showcase domain relevance, diversity, and improved image quality. UniTranslator surpasses the performance of existing general-purpose models and performs well against specialized models in representative tasks. The source code and trained models will be released to the public.
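The abstract does not give implementation details, but the CLIP2P mapper it describes (a non-linear mapper from a CLIP embedding to a code compatible with StyleGAN's latent space) can be sketched as a small MLP. The following is a hypothetical illustration in PyTorch; the layer sizes, depth, activation, and choice of latent space are assumptions rather than the paper's actual design.

```python
# Hypothetical sketch only: the paper's actual CLIP2P architecture, layer sizes,
# and target latent space are not specified in this record.
import torch
import torch.nn as nn

class CLIP2PMapper(nn.Module):
    """Non-linear mapper from a CLIP image embedding to a StyleGAN-style latent code."""
    def __init__(self, clip_dim=512, latent_dim=512, hidden_dim=1024, num_layers=4):
        super().__init__()
        layers = []
        in_dim = clip_dim
        for _ in range(num_layers - 1):
            layers += [nn.Linear(in_dim, hidden_dim), nn.LeakyReLU(0.2)]
            in_dim = hidden_dim
        layers.append(nn.Linear(in_dim, latent_dim))  # project to the latent code
        self.net = nn.Sequential(*layers)

    def forward(self, clip_embedding):
        # clip_embedding: (batch, clip_dim) CLIP image embedding
        return self.net(clip_embedding)

# Usage: map a placeholder CLIP embedding to a candidate latent code that a
# pretrained StyleGAN generator would then decode into an image.
mapper = CLIP2PMapper()
clip_embedding = torch.randn(2, 512)   # stand-in for CLIP image features
latent_code = mapper(clip_embedding)   # shape: (2, 512)
```

In the paper's framing, such a mapper would be trained so that its outputs approximate the distribution of StyleGAN's native latent space; the training objective is not detailed in this record.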
Text: TPAMI-main-final - Accepted Manuscript
Text: TPRMI2025-Apr - Version of Record
More information
Accepted/In Press date: 8 January 2025
e-pub ahead of print date: 21 January 2025
Published date: 6 March 2025
Additional Information:
Publisher Copyright:
© 1979-2012 IEEE.
Keywords:
GAN Embedding, Generative Adversarial Networks, Image-to-Image Translation
Identifiers
Local EPrints ID: 498190
URI: http://eprints.soton.ac.uk/id/eprint/498190
ISSN: 1939-3539
PURE UUID: 510eff5d-fd4d-4fea-88c9-93cf5ec30a87
Catalogue record
Date deposited: 12 Feb 2025 17:37
Last modified: 21 Aug 2025 03:27
Contributors
Author: Yong Du
Author: Jiahui Zhan
Author: Xinzhe Li
Author: Junyu Dong
Author: Sheng Chen
Author: Ming-Hsuan Yang
Author: Shengfeng He