University of Southampton Institutional Repository

CPIQA: climate paper image question answering dataset for retrieval-augmented generation with context-based query expansion


Mutalik, Rudra, Panchalingam, Abiram, Loitongbam, Gyanendro, Osborn, Timothy J., Hawkins, Ed. and Middleton, Stuart E (2025) CPIQA: climate paper image question answering dataset for retrieval-augmented generation with context-based query expansion. The 2nd Workshop of Natural Language Processing meets Climate Change: ACL 2025 Workshop, Austria Center Vienna, Vienna, Austria. 31 Jul 2025. 13 pp.

Record type: Conference or Workshop Item (Paper)

Abstract

Misinformation about climate science is a serious challenge for our society. This paper introduces CPIQA (Climate Paper Image Question-Answering), a new question-answer dataset featuring 4,551 full-text open-source academic papers in the area of climate science with 54,612 GPT-4o-generated question-answer pairs. CPIQA contains four question types (numeric, figure-based, non-figure-based, reasoning), each generated using three user roles (expert, non-expert, climate sceptic). CPIQA is multimodal, incorporating information from figures and graphs with GPT-4o descriptive annotations. We describe Context-RAG, a novel method for RAG prompt decomposition and augmentation that extracts distinct contexts for the question. On the benchmark SPIQA dataset, Context-RAG outperforms the previous state-of-the-art model in two out of three test cases. On our CPIQA dataset, Context-RAG outperforms our standard RAG baseline on all five base LLMs we tested, suggesting that our contextual decomposition method generalizes across LLM architectures. Evaluation of our best-performing model (GPT-4o with Context-RAG) by climate science experts highlights strengths in precision and provenance tracking, particularly for figure-based and reasoning questions.
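
The dataset figures in the abstract can be cross-checked with simple arithmetic. The sketch below assumes one question-answer pair per (question type, user role) combination per paper; the abstract does not state this explicitly, but the reported totals are consistent with it. The names `expected_pairs` and `pairs_per_combination` are illustrative only and do not come from the paper.

```python
from itertools import product

# Figures and category names as reported in the abstract.
QUESTION_TYPES = ["numeric", "figure-based", "non-figure-based", "reasoning"]
USER_ROLES = ["expert", "non-expert", "climate sceptic"]
NUM_PAPERS = 4551
REPORTED_QA_PAIRS = 54612


def expected_pairs(num_papers, pairs_per_combination=1):
    """Count QA pairs under the ASSUMPTION that every paper receives the
    same number of pairs for each (question type, user role) combination."""
    combinations = list(product(QUESTION_TYPES, USER_ROLES))
    return num_papers * len(combinations) * pairs_per_combination


if __name__ == "__main__":
    total = expected_pairs(NUM_PAPERS)
    # Compare the assumed breakdown with the total reported in the abstract.
    print(total, total == REPORTED_QA_PAIRS)
```

Under this assumption each paper contributes 4 x 3 = 12 pairs, and 4,551 x 12 = 54,612, which matches the reported total.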

Text
23_CPIQA_Climate_Paper_Image_Q - Accepted Manuscript
Available under License Creative Commons Attribution.

More information

Accepted/In Press date: 22 April 2025
Published date: 31 July 2025
Venue - Dates: The 2nd Workshop of Natural Language Processing meets Climate Change: ACL 2025 Workshop, Austria Center Vienna, Vienna, Austria, 2025-07-31 - 2025-07-31
Keywords: NLP, Machine Learning, Climate Science, AI

Identifiers

Local EPrints ID: 502572
URI: http://eprints.soton.ac.uk/id/eprint/502572
PURE UUID: 1a447218-bf22-4fb7-9e38-12c3cf771b1d
ORCID for Rudra Mutalik: orcid.org/0000-0003-0548-780X
ORCID for Stuart E Middleton: orcid.org/0000-0001-8305-8176

Catalogue record

Date deposited: 01 Jul 2025 16:34
Last modified: 22 Aug 2025 02:36

Contributors

Author: Rudra Mutalik
Author: Abiram Panchalingam
Author: Gyanendro Loitongbam
Author: Timothy J. Osborn
Author: Ed. Hawkins
Author: Stuart E Middleton
