The University of Southampton
University of Southampton Institutional Repository

Text Simplification with Deep Neural Network Using Knowledge Transfer

University of Southampton
He, Wei
8ca42b4c-b746-42ff-b9ec-43c92d89581a
Farrahi, Kate
bc848b9c-fc32-475c-b241-f6ade8babacb
Prugel-Bennett, Adam
b107a151-1751-4d8b-b8db-2c395ac4e14e

He, Wei (2023) Text Simplification with Deep Neural Network Using Knowledge Transfer. University of Southampton, Doctoral Thesis, 143pp.

Record type: Thesis (Doctoral)

Abstract

Text simplification aims to rephrase complex text into simpler text; in this thesis, the text we mainly consider is English sentences. Transfer learning from pre-trained text embeddings and models has recently shown great success on a range of natural language processing tasks and is therefore the central method in our work. The first focus of this thesis is to avoid using a parallel corpus of sentence pairs. We propose an unsupervised method that overcomes the need for parallel data and uses a similarity constraint loss to preserve the original meaning. Moreover, an asymmetric denoising technique is adopted to better learn the various features of sentences with different complexity. The results demonstrate that the denoising method can improve performance and that the content similarity constraint can help preserve content in our unsupervised method. The second focus of this thesis is to define a novel approach to refining the existing noisy parallel datasets available for text simplification. After refining the dataset, our approach fine-tunes a pre-trained language model with a newly proposed tuning strategy and decodes with a task-specific strategy. Our data refining method can generate a better dataset for the text simplification task, and the proposed fine-tuning strategy accelerates model convergence. Moreover, the decoding strategy can greatly improve the model's performance. The third focus of this thesis is to propose a prompting-based method that requires no model fine-tuning. The proposed method recasts the text simplification task as a text denoising task with adaptive prompts. Our decoding vocabulary constraint technique also makes the simplicity of the output sentence controllable. Extensive experiments show that the proposed methodology can achieve state-of-the-art results on many of the automatic evaluation metrics.

Text
Wei_he_phd_a-3b - Version of Record
Available under License University of Southampton Thesis Licence.
Download (1MB)
Text
results
Available under License University of Southampton Thesis Licence.
Download (42kB)
Text
Final-thesis-submission-Examination-Mr-Wei-He
Restricted to Repository staff only

More information

Published date: 2023

Identifiers

Local EPrints ID: 482361
URI: http://eprints.soton.ac.uk/id/eprint/482361
PURE UUID: d275a661-e46a-4290-8f21-0878d9a09d63
ORCID for Kate Farrahi: orcid.org/0000-0001-6775-127X

Catalogue record

Date deposited: 27 Sep 2023 17:05
Last modified: 17 Mar 2024 03:47

Contributors

Author: Wei He
Thesis advisor: Kate Farrahi
Thesis advisor: Adam Prugel-Bennett
