A RAG-based question-answering solution for cyber-attack investigation and attribution
Rajapaksha, Sampath, Rani, Ruby and Karafili, Erisa (2024) A RAG-based question-answering solution for cyber-attack investigation and attribution. In Workshop on Security and Artificial Intelligence 2024, 29th European Symposium on Research in Computer Security. Springer. (In Press)
Record type: Conference or Workshop Item (Paper)
Abstract
In the constantly evolving field of cybersecurity, it is imperative for analysts to stay abreast of the latest attack trends and pertinent information that aids in the investigation and attribution of cyber-attacks. In this work, we introduce the first question-answering (QA) model and its application that provides cybersecurity experts with information about cyber-attack investigation and attribution. Our QA model is based on Retrieval Augmented Generation (RAG) techniques together with a Large Language Model (LLM). It answers users' queries based either on our knowledge base (KB), which contains curated information about cyber-attack investigation and attribution, or on outside resources provided by the users. We have tested and evaluated our QA model with various types of questions, including KB-based, metadata-based, specific-document (from the KB), and external-source-based questions. We compared the answers for KB-based questions with those from OpenAI's GPT-3.5 and the latest GPT-4o LLMs. Our proposed QA model outperforms OpenAI's GPT models by providing the source of the answers and overcoming the hallucination limitations of the GPT models, which is critical for cyber-attack investigation and attribution. Additionally, our analysis showed that the RAG QA model generates better answers when it is given few-shot examples than when it is given zero-shot instructions, that is, when no examples are supplied in addition to the query.
Text
A RAG-Based Question-Answering Solution for Cyber-Attack Investigation and Attribution
Restricted to Repository staff only
More information
Accepted/In Press date: 20 July 2024
Keywords:
cyber-attack attribution, LLMs, RAG, QA
Identifiers
Local EPrints ID: 493520
URI: http://eprints.soton.ac.uk/id/eprint/493520
PURE UUID: caafd614-74d9-4aca-a35c-93f0f80252f6
Catalogue record
Date deposited: 05 Sep 2024 16:31
Last modified: 06 Sep 2024 01:59
Contributors
Author: Sampath Rajapaksha
Author: Ruby Rani
Author: Erisa Karafili