The University of Southampton
University of Southampton Institutional Repository

CPIQA: Climate Paper Image Question Answering

Zenodo
Mutalik, Rudra
e90a6cf2-5c56-4d2a-b500-c96541bf288a
Panchalingam, Abiram
8d629d6c-05ce-4786-ab9b-8b02aafcfd26
Loitongbam, Gyanendro
c1d8ea4f-7a54-4c78-8830-3c3064e26ae6
Osborn, Timothy J.
82ca5e6f-f1bd-48d8-a5e4-fd3d7cf0f250
Hawkins, Ed.
39ffe578-ea2d-4a4f-bbbf-6a74251c44bb
Middleton, Stuart
404b62ba-d77e-476b-9775-32645b04473f

Mutalik, Rudra, Panchalingam, Abiram, Loitongbam, Gyanendro, Osborn, Timothy J., Hawkins, Ed. and Middleton, Stuart (2025) CPIQA: Climate Paper Image Question Answering. Zenodo doi:10.5281/zenodo.15374870 [Dataset]

Record type: Dataset

Abstract

CPIQA is a large-scale QA dataset focused on figures extracted from scientific research papers published in various peer-reviewed venues in the climate science domain. The extracted figures include tables, graphs and diagrams, which inform the generation of questions using large language models (LLMs). Notably, this dataset includes questions for three audiences: general public, climate skeptic and climate expert. Four types of questions are generated, each with a different focus: figures, numerical, text-only and general. This results in 12 questions per scientific paper (three audiences × four question types). Alongside the figures, descriptions of the figures generated using multimodal LLMs are included and used.
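
The figure of 12 questions per paper follows from crossing the three audiences with the four question types. A minimal Python sketch of that enumeration, assuming exactly one question per (audience, type) pair; the variable and function names are illustrative and are not taken from the dataset's actual schema, which this record does not describe:

```python
from itertools import product

# Labels are taken from the abstract. The identifiers below are hypothetical
# and are NOT the dataset's real field names.
AUDIENCES = ["general public", "climate skeptic", "climate expert"]
QUESTION_TYPES = ["figures", "numerical", "text-only", "general"]


def question_slots(audiences=AUDIENCES, question_types=QUESTION_TYPES):
    """Return every (audience, question type) pair for one paper."""
    return list(product(audiences, question_types))


slots = question_slots()
print(len(slots))  # 3 audiences x 4 question types
for audience, question_type in slots:
    print(f"{audience}: {question_type}")
```

Under this assumption, each paper contributes one question slot per pair, which matches the count stated in the abstract.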

This record has no associated files available for download.

More information

Published date: 9 May 2025

Identifiers

Local EPrints ID: 501364
URI: http://eprints.soton.ac.uk/id/eprint/501364
PURE UUID: e08b7955-6366-4d6c-9220-728ddfd0b23d
ORCID for Rudra Mutalik: orcid.org/0000-0003-0548-780X
ORCID for Stuart Middleton: orcid.org/0000-0001-8305-8176

Catalogue record

Date deposited: 29 May 2025 17:03
Last modified: 30 May 2025 02:05

Contributors

Creator: Rudra Mutalik
Creator: Abiram Panchalingam
Creator: Gyanendro Loitongbam
Creator: Timothy J. Osborn
Creator: Ed. Hawkins
Creator: Stuart Middleton

