Text Simplification with Deep Neural Network Using Knowledge Transfer
University of Southampton
He, Wei
8ca42b4c-b746-42ff-b9ec-43c92d89581a
2023
Farrahi, Kate
bc848b9c-fc32-475c-b241-f6ade8babacb
Prugel-Bennett, Adam
b107a151-1751-4d8b-b8db-2c395ac4e14e
He, Wei (2023) Text Simplification with Deep Neural Network Using Knowledge Transfer. University of Southampton, Doctoral Thesis, 143pp.
Record type: Thesis (Doctoral)
Abstract
Text simplification aims to rephrase complex text into simpler text; this thesis mainly considers English sentences. Transfer learning from pre-trained text embeddings and models has recently shown great success on a range of natural language processing tasks and is therefore the central method of this work. The first focus of this thesis is to avoid relying on a parallel corpus of sentence pairs. We propose an unsupervised method that removes the need for parallel data, together with a similarity constraint loss that preserves the original meaning. Moreover, an asymmetric denoising technique is adopted to better learn various features from sentences of different complexity. The results demonstrate that the denoising method improves performance and that the content similarity constraint helps preserve content in our unsupervised method. The second focus of this thesis is a novel approach to refining the existing noisy parallel datasets available for text simplification. After refining the dataset, our approach fine-tunes a pre-trained language model with a newly proposed tuning strategy and decodes with a task-specific strategy. Our data refining method generates a better dataset for the text simplification task, and the proposed fine-tuning strategy accelerates model convergence. Moreover, the decoding strategy greatly improves the model’s performance. The third focus of this thesis is a prompting-based method that requires no model fine-tuning. The proposed method recasts the text simplification task as a text denoising task with adaptive prompts. Our decoding vocabulary constraint technique also makes the simplicity of the output sentence controllable. Extensive experiments show that the proposed methodology achieves state-of-the-art results on many of the automatic evaluation metrics.
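The abstract does not say how the decoding vocabulary constraint is implemented. As a rough illustration only, one common way to constrain a decoder's vocabulary is to mask the scores of disallowed tokens before choosing the next token. The sketch below assumes greedy decoding over a toy five-word vocabulary; the function name `constrained_greedy_step`, the vocabulary, and the scores are all hypothetical and are not taken from the thesis.

```python
import math


def constrained_greedy_step(logits, allowed_ids):
    """Pick the highest-scoring token id among allowed_ids.

    Hypothetical helper for illustration; not the thesis's method.
    logits: list of floats, one score per vocabulary entry.
    allowed_ids: set of token ids the decoder may emit.
    """
    if not allowed_ids:
        raise ValueError("allowed_ids must contain at least one token id")
    # Disallowed tokens get a score of -inf so they can never win.
    masked = [
        score if token_id in allowed_ids else -math.inf
        for token_id, score in enumerate(logits)
    ]
    return max(range(len(masked)), key=masked.__getitem__)


# Toy vocabulary and made-up scores (assumptions for the example).
vocab = ["the", "cat", "feline", "sat", "reposed"]
logits = [0.1, 1.2, 2.0, 0.3, 0.9]

# A "simple word" list, expressed as token ids: the, cat, sat.
simple_ids = {0, 1, 3}

unconstrained = max(range(len(logits)), key=logits.__getitem__)
constrained = constrained_greedy_step(logits, simple_ids)

print(vocab[unconstrained])  # feline
print(vocab[constrained])    # cat
```

Enlarging or shrinking the allowed set changes which words can appear in the output, which is one way an output's simplicity could be made adjustable at decoding time.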
Text: Wei_he_phd_a-3b - Version of Record
Text: Final-thesis-submission-Examination-Mr-Wei-He - Restricted to Repository staff only
More information
Published date: 2023
Identifiers
Local EPrints ID: 482361
URI: http://eprints.soton.ac.uk/id/eprint/482361
PURE UUID: d275a661-e46a-4290-8f21-0878d9a09d63
Catalogue record
Date deposited: 27 Sep 2023 17:05
Last modified: 17 Mar 2024 03:47
Contributors
Author: Wei He
Thesis advisor: Kate Farrahi
Thesis advisor: Adam Prugel-Bennett