CPIQA: Climate Paper Image Question Answering
Mutalik, Rudra
e90a6cf2-5c56-4d2a-b500-c96541bf288a
Panchalingam, Abiram
8d629d6c-05ce-4786-ab9b-8b02aafcfd26
Loitongbam, Gyanendro
c1d8ea4f-7a54-4c78-8830-3c3064e26ae6
Osborn, Timothy J.
82ca5e6f-f1bd-48d8-a5e4-fd3d7cf0f250
Hawkins, Ed.
39ffe578-ea2d-4a4f-bbbf-6a74251c44bb
Middleton, Stuart
404b62ba-d77e-476b-9775-32645b04473f
Mutalik, Rudra, Panchalingam, Abiram, Loitongbam, Gyanendro, Osborn, Timothy J., Hawkins, Ed. and Middleton, Stuart (2025) CPIQA: Climate Paper Image Question Answering. Zenodo. doi:10.5281/zenodo.15374870 [Dataset]
Abstract
CPIQA is a large-scale question answering (QA) dataset built on figures extracted from scientific research papers published in peer-reviewed venues in the climate science domain. The extracted figures include tables, graphs and diagrams, and they inform the generation of questions by large language models (LLMs). The dataset includes questions for three audiences: general public, climate skeptic and climate expert. Four types of question are generated, each with a different focus: figures, numerical, text-only and general. This results in 12 questions per scientific paper. Descriptions of the figures, generated using multimodal LLMs, are included alongside the figures and used.
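The count of 12 questions per paper is consistent with pairing each of the three audiences with each of the four question types. The sketch below illustrates that arithmetic only. The label strings and the function name question_slots are hypothetical and chosen for illustration, the one-question-per-pair assumption is inferred from the totals in the abstract, and none of this reflects the dataset's actual field names or file layout.

```python
from itertools import product

# Labels taken from the abstract; the exact strings used in the
# dataset files are an assumption.
AUDIENCES = ["general public", "climate skeptic", "climate expert"]
QUESTION_TYPES = ["figure", "numerical", "text-only", "general"]


def question_slots():
    """Return every (audience, question type) pair for one paper.

    Assumes one question per pair, which matches the stated total
    of 12 questions per paper (3 audiences x 4 types).
    """
    return list(product(AUDIENCES, QUESTION_TYPES))


if __name__ == "__main__":
    for audience, question_type in question_slots():
        print(f"{audience}: {question_type}")
```

Each paper would then carry one question for every pair returned by question_slots().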
This record has no associated files available for download.
More information
Published date: 9 May 2025
Identifiers
Local EPrints ID: 501364
URI: http://eprints.soton.ac.uk/id/eprint/501364
PURE UUID: e08b7955-6366-4d6c-9220-728ddfd0b23d
Catalogue record
Date deposited: 29 May 2025 17:03
Last modified: 30 May 2025 02:05
Contributors
Creator: Rudra Mutalik
Creator: Abiram Panchalingam
Creator: Gyanendro Loitongbam
Creator: Timothy J. Osborn
Creator: Ed. Hawkins