READ ME File For Dataset supporting 'Evolutionary algorithms multi-objective benchmarking'

Dataset DOI: 10.5258/SOTON/D2855

ReadMe Author: Sizhe Yuen, University of Southampton
orcid.org/0000-0001-5552-2074

This dataset supports the thesis entitled 'Evolutionary algorithms multi-objective benchmarking'
AWARDED BY: University of Southampton
DATE OF AWARD: 2025

Licence:
This dataset is licensed under a Creative Commons Attribution 4.0 International License (CC BY 4.0). You are free to share and adapt the material for any purpose, provided that appropriate credit is given. For details, see https://creativecommons.org/licenses/by/4.0/

--------------------
DATA & FILE OVERVIEW
--------------------

This dataset contains:

- data: top-level directory of the raw function values
  - normal: data for the genetic algorithm and particle swarm optimisation algorithms
    - static: data for the static benchmark problems
    - dynamic: data for the dynamic benchmark problems
  - epigenetic: data for the epigenetic algorithm
    - static: epigenetic algorithm data for the static benchmark problems
    - dynamic: epigenetic algorithm data for the dynamic benchmark problems
- summaries: top-level directory of the summary files
- tvos: data for the final chapter on voyage optimisation; at the top level are the averaged csv files for the entire dataset
  - [Route]: data for that specific route, containing the averaged csv data for that route
    - Data: the Pareto fronts and best solution for each run; files are labelled "Epi" or "No_Epi"

Raw data directories in "data" are organised as [Algorithm]/[Problem]. Within each [Problem] directory are FUN tsv files, which are the function outputs of that algorithm on that problem, named in the format:

    FUN.[N].tsv.[FE]

where N is the run number and FE is the number of function evaluations. Outputs are saved every 1000 function evaluations.

Summary files are single csv files which collate all the raw data together with the calculated performance metrics (IGD and HV).
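As a minimal sketch of how the raw files could be indexed and read (illustrative helper names, not part of the dataset or its code repository; assumes a local copy following the layout above):

```python
import re
from pathlib import Path

# Raw output files are named FUN.[N].tsv.[FE], where
#   N  = run number
#   FE = number of function evaluations (saved every 1000 evaluations)
FUN_PATTERN = re.compile(r"^FUN\.(\d+)\.tsv\.(\d+)$")

def index_fun_files(problem_dir):
    """Map (run, evaluations) -> file path for one [Algorithm]/[Problem] directory."""
    index = {}
    for path in Path(problem_dir).iterdir():
        match = FUN_PATTERN.match(path.name)
        if match:
            run, evaluations = int(match.group(1)), int(match.group(2))
            index[(run, evaluations)] = path
    return index

def load_front(path):
    """Read one FUN tsv file into a list of objective-value vectors."""
    with open(path) as f:
        return [[float(v) for v in line.split()] for line in f if line.strip()]
```

For example, `index_fun_files("data/normal/static/NSGAII/ZDT1")` would return a dictionary keyed by (run, evaluations), from which the front at any checkpoint can be loaded with `load_front`.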
Relationship between files, if important for context:

Additional related data collected that was not included in the current data package:

If data was derived from another source, list source:

If there are multiple versions of the dataset, list the file updated, when and why the update was made:

--------------------------
METHODOLOGICAL INFORMATION
--------------------------

Description of methods used for collection/generation of data:
Data was collected by running benchmark problems on a number of algorithms, including the variants of the epigenetic algorithm described in the thesis. The benchmark problems used are described in the thesis. Data is recorded every 1000 function evaluations to show the progression of each algorithm throughout the optimisation. Function evaluations are used as the termination condition, to ensure fairness between algorithms and to be agnostic to the computational hardware. An external archive is used to store the best solutions found by the algorithms.

Benchmarking and external archive reference: https://ryojitanabe.github.io/pdf/to-gecco2017.pdf

Methods for processing the data:
The IGD and HV performance metrics are calculated to assess the performance of each algorithm. For the HV calculation, the reference point is the point 10% worse than the nadir point of the benchmark problem. The processing in this dataset only collates the performance metrics into the summary csv files.

Software- or Instrument-specific information needed to interpret the data, including software and hardware version numbers:
Python was used to run the benchmarks and calculate the metrics. The code and software requirements are available at the repository: https://github.com/12yuens2/cmlsga-jmetalpy

Standards and calibration information, if appropriate: N/A

Environmental/experimental conditions: N/A

Describe any quality-assurance procedures performed on the data:

People involved with sample collection, processing, analysis and/or submission: N/A

Date that the file was created: October 2025
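As a minimal sketch of the reference-point convention stated above (illustrative code, not taken from the cmlsga-jmetalpy repository; it assumes minimisation with non-negative objective values, so "10% worse" means 10% larger in every objective), together with a simple two-objective hypervolume calculation:

```python
def hv_reference_point(nadir, factor=0.1):
    """Reference point 10% worse than the nadir point.

    Assumes minimisation with non-negative objectives, so 'worse'
    means larger; each nadir coordinate is scaled by (1 + factor).
    """
    return [v * (1.0 + factor) for v in nadir]

def hv_2d(front, ref):
    """Hypervolume of a two-objective minimisation front against a reference point.

    Slab sweep: sort the mutually non-dominated points by the first
    objective and sum the rectangles each point dominates exclusively.
    All points are assumed to dominate the reference point.
    """
    pts = sorted(front)
    total = 0.0
    for i, (f1, f2) in enumerate(pts):
        next_f1 = pts[i + 1][0] if i + 1 < len(pts) else ref[0]
        total += (next_f1 - f1) * (ref[1] - f2)
    return total
```

For instance, the front [(1, 3), (2, 2), (3, 1)] measured against the reference point (4, 4) covers a hypervolume of 6. For fronts with three or more objectives, a dedicated indicator implementation (such as the one in jMetalPy) is needed rather than this two-objective sweep.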