README file for "Dataset in support of the thesis 'Speech enhancement by using deep learning algorithms'"

Dataset DOI:/SOTON/D3161

README Author: Jianqiao Cui, University of Southampton, ORCID ID: 0000-0002-6016-5574

This dataset supports the thesis entitled 'Speech enhancement by using deep learning algorithms'
AWARDED BY: University of Southampton
DATE OF AWARD: 2024

DESCRIPTION OF THE DATA
1. https://www.openslr.org/12
LibriSpeech is a corpus of approximately 1000 hours of 16 kHz read English speech, prepared by Vassil Panayotov with the assistance of Daniel Povey. The data is derived from read audiobooks from the LibriVox project and has been carefully segmented and aligned.
Acoustic models trained on this data set are available at kaldi-asr.org, and language models suitable for evaluation can be found at http://www.openslr.org/11/.

For more information, see the paper "LibriSpeech: an ASR corpus based on public domain audio books", Vassil Panayotov, Guoguo Chen, Daniel Povey and Sanjeev Khudanpur, ICASSP 2015

2. https://www.openslr.org/17
MUSAN is a corpus of music, speech, and noise recordings.
The MUSAN corpus was supported by the National Science Foundation Graduate Research Fellowship under Grant No. 1232825 and by Spoken Communications.

MUSAN can be cited using the following BibTeX entry:

@misc{musan2015,
  author = {David Snyder and Guoguo Chen and Daniel Povey},
  title = {{MUSAN}: {A} {M}usic, {S}peech, and {N}oise {C}orpus},
  year = {2015},
  eprint = {1510.08484},
  note = {arXiv:1510.08484v1}
}

Licence for the LibriSpeech and MUSAN datasets: MIT Licence


3. source_code.zip
The source code for parts of the PhD project.

4. SJ_EXP.zip
The program for the subjective experiment described in the last chapter of the thesis.

Copyright (c) 2024 Jianqiao Cui
Licence for source_code.zip and SJ_EXP.zip: All rights reserved


This dataset contains:
1. Audio files: (1) https://www.openslr.org/12
                (2) https://www.openslr.org/17
2. Source code files: (1) source_code.zip
                      (2) SJ_EXP.zip


Date of data collection: July 2024

Information about geographic location of data collection: University of Southampton 


Related publications:
Cui, J., & Bleeck, S. (2024, August). From simulation to reality: tackling data mismatches in speech enhancement with unsupervised pre-training. In 53rd International Congress & Exposition on Noise Control Engineering.

Cui, J., & Bleeck, S. (2023, July). Improved Speech Enhancement by Using Both Clean Speech and ‘Clean’ Noise. In 2023 IEEE 6th International Conference on Big Data and Artificial Intelligence (BDAI) (pp. 192-196). IEEE.

Cui, J., & Bleeck, S. (2023, July). Parallel Gated Neural Network with Attention Mechanism for Speech Enhancement. In 2023 IEEE 6th International Conference on Big Data and Artificial Intelligence (BDAI) (pp. 197-201). IEEE.

Date that the file was created: July 2024