BiBERT-AV: enhancing authorship verification through Siamese networks with pre-trained BERT and Bi-LSTM
Almutairi, Amirah, Kang, BooJoong and Hashimy, Nawfal Al
(2024)
BiBERT-AV: enhancing authorship verification through Siamese networks with pre-trained BERT and Bi-LSTM.
Wang, Guojun, Wang, Haozhe, Min, Geyong, Georgalas, Nektarios and Meng, Weizhi
(eds.)
In Ubiquitous Security: Third International Conference, UbiSec 2023, Exeter, UK, November 1–3, 2023, Revised Selected Papers.
vol. 2034, pp. 17-30. Springer Singapore.
(doi:10.1007/978-981-97-1274-8_2).
Record type: Conference or Workshop Item (Paper)
Abstract
Authorship verification is a challenging problem in natural language processing (NLP). It is crucial in security and forensics, helping identify authors and combat fake news. Recent advances in neural network models have shown promising results in improving the accuracy of authorship verification. This paper addresses the authorship verification problem, which entails determining whether two texts were written by the same author, by introducing a novel approach that employs Siamese networks with pre-trained BERT and Bi-LSTM layers, and it evaluates the advantages of transformer-based models over existing methods that rely on domain knowledge and feature engineering. The performance of the proposed model, BiBERT-AV, is compared against existing methods for authorship verification. The results demonstrate that BiBERT-AV offers an effective solution based solely on the author's writing style and outperforms the baselines and state-of-the-art methods. It also provides a viable alternative to existing methods that rely heavily on domain knowledge and laborious feature engineering, which demand significant time and expertise. Notably, BiBERT-AV maintains a high level of accuracy even when the number of authors is expanded to a larger group, in contrast to the limitations exhibited by the baseline models used in existing research. Overall, this study provides valuable insights into the application of Siamese networks with pre-trained BERT and Bi-LSTM layers for authorship verification and establishes the superiority of the proposed model over existing methods. The study contributes to the advancement of NLP research and has implications for several real-world applications.
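To make the described architecture concrete, the sketch below shows how a Siamese verifier with a shared pre-trained BERT encoder followed by a Bi-LSTM layer could be wired up in PyTorch with Hugging Face Transformers. It is an illustrative assumption of the general design, not the authors' implementation: the class name BiBertSiamese, the mean-pooling step, the hidden sizes, and the concatenation-based comparison head are placeholders chosen for brevity.

# Minimal PyTorch sketch of a Siamese BERT + Bi-LSTM verifier (illustrative only;
# layer sizes, pooling and the comparison head are assumptions, not the paper's exact setup).
import torch
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer


class BiBertSiamese(nn.Module):
    def __init__(self, model_name="bert-base-uncased", lstm_hidden=128):
        super().__init__()
        self.encoder = AutoModel.from_pretrained(model_name)      # shared pre-trained BERT
        self.bilstm = nn.LSTM(self.encoder.config.hidden_size,
                              lstm_hidden, batch_first=True,
                              bidirectional=True)                 # Bi-LSTM over token embeddings
        # Comparison head over the two branch embeddings and their absolute difference
        self.classifier = nn.Sequential(
            nn.Linear(6 * lstm_hidden, 128), nn.ReLU(),
            nn.Linear(128, 1))                                    # same-author logit

    def encode(self, input_ids, attention_mask):
        hidden = self.encoder(input_ids=input_ids,
                              attention_mask=attention_mask).last_hidden_state
        out, _ = self.bilstm(hidden)
        return out.mean(dim=1)                                    # mean-pool the Bi-LSTM states

    def forward(self, a, b):
        u = self.encode(**a)                                      # both texts pass through the
        v = self.encode(**b)                                      # same (shared) weights
        return self.classifier(torch.cat([u, v, (u - v).abs()], dim=-1))


tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = BiBertSiamese()
pair = [tokenizer(t, return_tensors="pt", truncation=True)
        for t in ("First candidate text.", "Second candidate text.")]
inputs = [{"input_ids": p["input_ids"], "attention_mask": p["attention_mask"]} for p in pair]
logit = model(*inputs)  # after training on labelled pairs, a positive logit suggests the same author

In a setup like this, the pair classifier would typically be trained with a binary cross-entropy loss on same-author and different-author text pairs, so that verification depends only on writing style rather than hand-crafted features.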
Text
BiBERT_AV__Enhancing_Authorship_Verification_through_Siamese_Networks_with_Pre_trained_BERT_and_Bi_LSTM
- Version of Record
Restricted to Repository staff only
More information
e-pub ahead of print date: 13 March 2024
Published date: 13 March 2024
Keywords: Authorship verification, BERT, Forensics, Security, Siamese networks, Transformer, Bi-LSTM
Identifiers
Local EPrints ID: 490641
URI: http://eprints.soton.ac.uk/id/eprint/490641
ISSN: 1865-0929
PURE UUID: d3543cfb-4aa2-4ac6-b633-714137fe1031
Catalogue record
Date deposited: 31 May 2024 16:51
Last modified: 06 Jun 2024 02:10
Contributors
Author:
Amirah Almutairi
Author:
BooJoong Kang
Author:
Nawfal Al Hashimy
Editor:
Guojun Wang
Editor:
Haozhe Wang
Editor:
Geyong Min
Editor:
Nektarios Georgalas
Editor:
Weizhi Meng