An NLP-driven framework for Business Email Compromise detection and authorship verification

Business Email Compromise (BEC) represents a significant cybersecurity threat that exploits linguistic impersonation and social engineering, rather than relying on traditional malware or malicious attachments. These attacks often bypass conventional detection systems by mimicking the language, tone, and identity of trusted individuals within an organization.

This thesis investigates content-based approaches to BEC detection using a suite of natural language processing (NLP) models. It first introduces a transformer-based classifier to identify semantic indicators of deception within email body text. It then presents a Siamese authorship verification (AV) model designed to detect stylistic inconsistencies, even under adversarial mimicry. These components are integrated into a unified multi-task learning (MTL) framework that jointly optimizes for BEC detection and AV, leveraging shared representations while preserving task-specific objectives.
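As a rough illustration of what jointly optimizing the two tasks can mean, the sketch below combines a BEC classification loss with a Siamese-style authorship loss into one weighted objective. It is a minimal sketch, not the thesis's implementation: the function names, the cosine-similarity scoring of the embedding pair, and the `av_weight` parameter are all assumptions made for this example.

```python
import math


def binary_cross_entropy(p, y):
    """Cross-entropy for one predicted probability p and a label y in {0, 1}."""
    eps = 1e-12
    p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def joint_loss(bec_prob, bec_label, emb_a, emb_b, same_author, av_weight=0.5):
    """Weighted sum of a BEC loss and an authorship verification (AV) loss.

    Hypothetical setup: bec_prob comes from a classification head, while
    emb_a and emb_b are embeddings of two emails from a shared encoder.
    """
    # Map cosine similarity from [-1, 1] to a pseudo-probability in [0, 1].
    av_prob = (cosine_similarity(emb_a, emb_b) + 1.0) / 2.0
    bec_term = binary_cross_entropy(bec_prob, bec_label)
    av_term = binary_cross_entropy(av_prob, same_author)
    return (1.0 - av_weight) * bec_term + av_weight * av_term
```

In this sketch, a pair of emails claimed to share an author but with dissimilar embeddings raises the AV term, so the shared encoder is penalized on both tasks at once while each task keeps its own loss.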

To support empirical evaluation, the thesis proposes a structured taxonomy of BEC fraud and constructs a synthetic dataset through prompt-engineered language model fine-tuning and human validation. Experiments conducted on a combination of real and synthetic emails demonstrate that the MTL framework achieves up to 97% F1-score for BEC detection and 93% for AV, outperforming transfer learning baselines while reducing false positives and computational cost.
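For readers unfamiliar with the metric, the F1-score is the harmonic mean of precision and recall, so it falls when either false positives or false negatives rise. The sketch below computes it from confusion counts; the counts in the usage line are invented for illustration and are not results from the thesis.

```python
def f1_score(true_pos, false_pos, false_neg):
    """F1-score from confusion counts; returns 0.0 when it is undefined."""
    predicted_pos = true_pos + false_pos
    actual_pos = true_pos + false_neg
    precision = true_pos / predicted_pos if predicted_pos else 0.0
    recall = true_pos / actual_pos if actual_pos else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts: 97 BEC emails caught, 3 false alarms, 3 missed.
example_f1 = f1_score(true_pos=97, false_pos=3, false_neg=3)
```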

This work contributes a principled, modular, and extensible framework for enhancing email security through joint semantic and stylistic analysis, addressing critical gaps in current defenses against sophisticated impersonation-based attacks.

University of Southampton

Almutairi, Amirah

2025

Al Hashimy, Nawfal

Kang, Boojoong

Almutairi, Amirah (2025) An NLP-driven framework for Business Email Compromise detection and authorship verification. University of Southampton, Doctoral Thesis, 127pp.

Record type: Thesis (Doctoral)

More information

Published date: 2025

Identifiers

Local EPrints ID: 505289

URI: http://eprints.soton.ac.uk/id/eprint/505289

PURE UUID: 0db58b7a-3f88-4760-92f8-575286f61c1f

ORCID for Amirah Almutairi: orcid.org/0000-0002-2194-7936

ORCID for Nawfal Al Hashimy: orcid.org/0000-0002-1129-5217

ORCID for Boojoong Kang: orcid.org/0000-0001-5984-9867

Catalogue record

Date deposited: 06 Oct 2025 16:43

Last modified: 08 Jan 2026 03:08

Contributors

Author: Amirah Almutairi

Thesis advisor: Nawfal Al Hashimy

Thesis advisor: Boojoong Kang
