The University of Southampton
University of Southampton Institutional Repository

The anatomy of cyber threats: a natural language approach to asset-based threat modelling


Bell, Tom James (2026) The anatomy of cyber threats: a natural language approach to asset-based threat modelling. University of Southampton, Doctoral Thesis, 185pp.

Record type: Thesis (Doctoral)

Abstract

The pervasiveness and criticality of emerging technologies throughout personal life, business, government and defence are generating a growing demand for more advanced cyber threat detection, analysis and response capabilities. Threat modelling is a core component of these activities, and asset-based threat modelling in particular is a primary methodological approach for characterising and understanding cyber threats across a broad range of technological and industrial domains. However, existing asset-based threat modelling processes typically exhibit a range of methodological limitations which can significantly constrain the validity of any resultant threat models.

Accordingly, this work develops and presents a generalised Reference Framework for Asset-based Threat Modelling (ReFAThM) intended to guide the development of such threat models, evaluate their completeness, and thus improve their robustness. We also present a novel automated method for characterising the threat landscape using topic modelling. Here, the CWE (Common Weakness Enumeration) dataset is used as ground truth and is pre-processed, using an LLM, to synthesise 12 text-based threat attributes for each entry. These combined attributes constitute the input to topic modelling. Following a cluster merging process, this yields a concise set of 19 threat types without undermining the breadth and depth of the originating vulnerability dataset. In a subsequent research activity, we use these 19 clusters in a classification paradigm to conduct feature importance analysis over the original 12 threat attributes. We quantitatively establish that ‘vulnerability’, ‘countermeasures’, ‘detection method’, ‘technical impact’ and ‘relevant assets’ are the most important threat attributes when characterising a cyber threat. These findings are then employed within an ontology engineering process to develop a ‘core threat model’ suitable as the basis of more specialised asset-based threat models in a broad range of domains.

Text
Southampton_PhD_Thesis__Tom_Bell___ongoing_ (25) - Version of Record
Available under License University of Southampton Thesis Licence.
Download (7MB)
Text
Final-thesis-submission-Examination-Mr-Tom-Bell
Restricted to Repository staff only

More information

Published date: 2026
Keywords: cyber threat modelling, NLP, asset-based threat modelling, machine learning, LLM

Identifiers

Local EPrints ID: 510393
URI: http://eprints.soton.ac.uk/id/eprint/510393
PURE UUID: f54457fa-3abc-42f3-abcd-e5069f182d89
ORCID for Tom James Bell: orcid.org/0000-0001-6027-2065
ORCID for Vladi Sassone: orcid.org/0000-0002-6432-1482

Catalogue record

Date deposited: 30 Mar 2026 16:42
Last modified: 31 Mar 2026 01:58

Contributors

Author: Tom James Bell
Thesis advisor: Vladi Sassone
Thesis advisor: Mike Surridge
Thesis advisor: Ahmad Atamli
