University of Southampton Institutional Repository

OpenSDI: spotting diffusion-generated images in the open world

IEEE
Wang, Yabin
Huang, Zhiwu
Hong, Xiaopeng

Wang, Yabin, Huang, Zhiwu and Hong, Xiaopeng (2025) OpenSDI: spotting diffusion-generated images in the open world. In 2025 IEEE/CVF Conference on Computer Vision and Pattern Recognition. IEEE. 33 pp. (In Press)

Record type: Conference or Workshop Item (Paper)

Abstract

This paper identifies OpenSDI, a challenge for spotting diffusion-generated images in open-world settings. In response to this challenge, we define a new benchmark, the OpenSDI dataset (OpenSDID), which stands out from existing datasets due to its diverse use of large vision-language models that simulate open-world diffusion-based manipulations. Another outstanding feature of OpenSDID is its inclusion of both detection and localization tasks for images manipulated globally and locally by diffusion models. To address the OpenSDI challenge, we propose a Synergizing Pretrained Models (SPM) scheme to build up a mixture of foundation models. This approach exploits a collaboration mechanism with multiple pretrained foundation models to enhance generalization in the OpenSDI context, moving beyond traditional training by synergizing multiple pretrained models through prompting and attending strategies. Building on this scheme, we introduce MaskCLIP, an SPM-based model that aligns Contrastive Language-Image Pre-Training (CLIP) with Masked Autoencoder (MAE). Extensive evaluations on OpenSDID show that MaskCLIP significantly outperforms current state-of-the-art methods for the OpenSDI challenge, achieving remarkable relative improvements of 14.23% in IoU (14.11% in F1) and 2.05% in accuracy (2.38% in F1) compared to the second-best model in localization and detection tasks, respectively. Our dataset and code are available at https://github.com/iamwangyabin/OpenSDI.
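
The abstract reports localization results as IoU and F1 and states the gains as relative improvements over the second-best model. The sketch below illustrates how these quantities are commonly computed for binary manipulation masks. It is not the authors' evaluation code: the function names, the toy masks, and the example scores are assumptions made for illustration, and the paper's actual protocol (thresholding, per-image versus dataset-level averaging) is not described on this page.

```python
import numpy as np


def mask_iou(pred, target):
    """Intersection over union of two binary masks (hypothetical helper)."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    union = np.logical_or(pred, target).sum()
    if union == 0:
        # Both masks are empty; treating this as perfect agreement is an assumption.
        return 1.0
    return float(np.logical_and(pred, target).sum() / union)


def mask_f1(pred, target):
    """Pixel-level F1 (Dice) score of two binary masks (hypothetical helper)."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    denom = pred.sum() + target.sum()
    if denom == 0:
        # Same empty-mask convention as in mask_iou.
        return 1.0
    true_pos = np.logical_and(pred, target).sum()
    return float(2 * true_pos / denom)


def relative_improvement(new, baseline):
    """Relative improvement of `new` over `baseline`, in percent."""
    return 100.0 * (new - baseline) / baseline


# Toy 2x3 masks, invented for illustration only.
pred = [[1, 1, 0],
        [0, 1, 0]]
target = [[1, 0, 0],
          [0, 1, 1]]

iou = mask_iou(pred, target)  # 2 overlapping pixels / 4 pixels in the union = 0.5
f1 = mask_f1(pred, target)    # 2 * 2 / (3 + 3)

# Made-up scores: a model at 0.55 IoU versus a baseline at 0.50 IoU.
gain = relative_improvement(0.55, 0.50)
print(f"IoU={iou:.3f}  F1={f1:.3f}  relative gain={gain:.2f}%")
```

A relative improvement is measured against the baseline score, not in absolute points: moving from 0.50 to 0.55 IoU is a gain of 5 points but a relative improvement of roughly 10%.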

Text
2503.19653v3 - Accepted Manuscript
Available under License Creative Commons Attribution.

More information

Accepted/In Press date: 26 February 2025
Venue - Dates: The IEEE/CVF Conference on Computer Vision and Pattern Recognition, Nashville, United States, 2025-06-11 - 2025-06-15

Identifiers

Local EPrints ID: 502825
URI: http://eprints.soton.ac.uk/id/eprint/502825
PURE UUID: 0965b6f6-94da-4077-993b-3db1f6ed8c76
ORCID for Zhiwu Huang: orcid.org/0000-0002-7385-079X

Catalogue record

Date deposited: 09 Jul 2025 16:30
Last modified: 22 Aug 2025 02:38

Contributors

Author: Yabin Wang
Author: Zhiwu Huang
Author: Xiaopeng Hong
