BiBERT-AV: enhancing authorship verification through Siamese networks with pre-trained BERT and Bi-LSTM
Almutairi, Amirah, Kang, BooJoong and Hashimy, Nawfal Al
(2024)
BiBERT-AV: enhancing authorship verification through Siamese networks with pre-trained BERT and Bi-LSTM.
Wang, Guojun, Wang, Haozhe, Min, Geyong, Georgalas, Nektarios and Meng, Weizhi
(eds.)
In Ubiquitous Security: Third International Conference, UbiSec 2023, Exeter, UK, November 1–3, 2023, Revised Selected Papers.
vol. 2034, pp. 17-30. Springer Singapore.
(doi:10.1007/978-981-97-1274-8_2).
Record type: Conference or Workshop Item (Paper)
Abstract
Authorship verification is a challenging problem in natural language processing (NLP). It is crucial in security and forensics, helping identify authors and combat fake news. Recent advances in neural network models have shown promising results in improving the accuracy of authorship verification. This paper addresses the authorship verification problem, which entails determining whether two texts were written by the same author, by introducing a novel approach that employs Siamese networks with pre-trained BERT and Bi-LSTM layers, and it evaluates the advantages of transformer-based models over existing methods that rely on domain knowledge and feature engineering. The performance of the proposed model, BiBERT-AV, is compared against existing methods for authorship verification. The results demonstrate that BiBERT-AV offers an effective solution based solely on the author's writing style and outperforms the baselines and state-of-the-art methods. It also provides a viable alternative to existing methods that rely heavily on domain knowledge and laborious feature engineering, which demand significant time and expertise. Notably, BiBERT-AV maintains a high level of accuracy even when the number of authors is expanded to a larger group, in contrast to the limitations exhibited by the baseline models used in existing research. Overall, this study provides valuable insights into the application of Siamese networks with pre-trained BERT and Bi-LSTM layers for authorship verification and establishes the superiority of the proposed model over existing methods. The study contributes to the advancement of NLP research and has implications for several real-world applications.
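To make the described architecture concrete, the sketch below shows how a Siamese verifier with a shared pre-trained BERT encoder followed by a Bi-LSTM layer could be wired up in PyTorch with Hugging Face Transformers. It is an illustrative assumption of the general design, not the authors' implementation: the class name BiBertSiamese, the mean-pooling step, the hidden sizes, and the concatenation-based comparison head are placeholders chosen for brevity.

# Minimal PyTorch sketch of a Siamese BERT + Bi-LSTM verifier (illustrative only;
# layer sizes, pooling and the comparison head are assumptions, not the paper's exact setup).
import torch
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer


class BiBertSiamese(nn.Module):
    def __init__(self, model_name="bert-base-uncased", lstm_hidden=128):
        super().__init__()
        self.encoder = AutoModel.from_pretrained(model_name)      # shared pre-trained BERT
        self.bilstm = nn.LSTM(self.encoder.config.hidden_size,
                              lstm_hidden, batch_first=True,
                              bidirectional=True)                 # Bi-LSTM over token embeddings
        # Comparison head over the two branch embeddings and their absolute difference
        self.classifier = nn.Sequential(
            nn.Linear(6 * lstm_hidden, 128), nn.ReLU(),
            nn.Linear(128, 1))                                    # same-author logit

    def encode(self, input_ids, attention_mask):
        hidden = self.encoder(input_ids=input_ids,
                              attention_mask=attention_mask).last_hidden_state
        out, _ = self.bilstm(hidden)
        return out.mean(dim=1)                                    # mean-pool the Bi-LSTM states

    def forward(self, a, b):
        u = self.encode(**a)                                      # both texts pass through the
        v = self.encode(**b)                                      # same (shared) weights
        return self.classifier(torch.cat([u, v, (u - v).abs()], dim=-1))


tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = BiBertSiamese()
pair = [tokenizer(t, return_tensors="pt", truncation=True)
        for t in ("First candidate text.", "Second candidate text.")]
inputs = [{"input_ids": p["input_ids"], "attention_mask": p["attention_mask"]} for p in pair]
logit = model(*inputs)  # after training on labelled pairs, a positive logit suggests the same author

In a setup like this, the pair classifier would typically be trained with a binary cross-entropy loss on same-author and different-author text pairs, so that verification depends only on writing style rather than hand-crafted features.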
Text
BiBERT_AV__Enhancing_Authorship_Verification_through_Siamese_Networks_with_Pre_trained_BERT_and_Bi_LSTM
- Version of Record
Restricted to Repository staff only
More information
e-pub ahead of print date: 13 March 2024
Published date: 13 March 2024
Keywords: Authorship verification, BERT, Forensics, Security, Siamese networks, Transformer, Bi-LSTM
Identifiers
Local EPrints ID: 490641
URI: http://eprints.soton.ac.uk/id/eprint/490641
ISSN: 1865-0929
PURE UUID: d3543cfb-4aa2-4ac6-b633-714137fe1031
Catalogue record
Date deposited: 31 May 2024 16:51
Last modified: 06 Jun 2024 02:10
Contributors
Author:
Amirah Almutairi
Author:
BooJoong Kang
Author:
Nawfal Al Hashimy
Editor:
Guojun Wang
Editor:
Haozhe Wang
Editor:
Geyong Min
Editor:
Nektarios Georgalas
Editor:
Weizhi Meng