READ ME File For 'Dataset for Doctoral Thesis'

Dataset DOI: 10.5258/SOTON/D2934

ReadMe Author: Jonathan Anns, University of Southampton, ORCID: 0009-0005-8349-4986

This dataset supports the thesis entitled 'Physiology of rest/activity cycles in Drosophila melanogaster'
AWARDED BY: University of Southampton
DATE OF AWARD: 2024

DESCRIPTION OF THE DATA

Data was collected in two parts: Chapters 3-4 and Chapter 5.

Chapter 3-4 data was collected from a custom-made video tracking assay, whose outputs are binary files that are too large to deposit. These were converted to .csv files, which I analysed to extract the .csv files used in this thesis (deposited here). The images deposited here were also extracted from the binary files.

Chapter 5 data was collected from TopCount raw data files and uploaded to biodare2.com. The .csv files were extracted for analysis (deposited here). Rhythmic analysis files were also created with biodare2.com and the .csv files were extracted for analysis (deposited here).

The code for all parts of the thesis was written in Jupyter Notebooks using VSCode. Reading the code requires JupyterLab, Jupyter Notebook, or another Python interface that supports notebooks, such as VSCode.

This dataset contains:

Zipped Folder 'CaLexA_LUC_Experiments' - Each file represents neuronal/cell population specific expression of CaLexA-LUC. Each row is a timepoint (Hours) and each numbered column is an individual fly's bioluminescence. Some files also include the mean and CI for each timepoint. The 'elav_cales_DAM_sleep' file includes averaged DAM data for elav>CaLexA_LUC female flies. Relevant to Chapter 5.

Zipped Folder 'cre_or_tim_LUC_Experiments' - Each file represents neuronal/cell population specific expression of cre-LUC or tim-LUC. Each row is a timepoint (Hours) and each numbered column is an individual fly's bioluminescence. Some files also include the mean and CI for each timepoint. Relevant to Chapter 5.

Zipped Folder 'TRIC_LUC_Experiments' - Each file represents neuronal/cell population specific expression of TRIC-LUC. Each row is a timepoint (Hours) and each numbered column is an individual fly's bioluminescence. Some files also include the mean and CI for each timepoint. Relevant to Chapter 5.

Zipped Folder 'NanoLuc_Constructs_and_Tests' - Each SnapGene file contains the raw DNA sequence for one of the NanoLuc constructs created. The data files (.csv) represent experiments with panneuronal expression of specific NanoLuc constructs or of CaLexA-LUC. Each row is a timepoint (Hours) and each numbered column is an individual fly's bioluminescence. Some files also include the mean and CI for each timepoint. Relevant to Chapter 5.

Zipped Folder 'Images_for_Posture_and_Tracking' - Contains subfolders of start and end posture images of three wild-type Berlin-K male flies. It also contains tracking photos, photos of the Trumelan experimental design, and images of fly posture used to validate the body angle postural metric. An additional .csv file, 'BodyAngle_Validation', within the 'Validation_BA' subfolder contains the hand-drawn versus tracking comparison of body angle used for validation. Relevant to Chapters 2 and 4.

Zipped Folder 'Rhythmic_Analysis' - Each file represents a genotype (e.g. bkm = Berlin-K males), a behavioural state (e.g. SSL), and whether the rhythmic analysis was performed on behaviour or on ceiling occupancy. Each file is the output of a rhythmic analysis experiment from biodare2: the first few rows hold the metadata, then each row is an individual fly and each column is one of the metrics recorded by the rhythmic analysis. Relevant to Chapter 4.5.

Zipped Folder 'Posture' - Each file contains a specific genotype and its averaged behavioural changes during a rest bout.
The first row is the start of the averaged rest bout and the subsequent rows show how various metrics change as the rest bout progresses. Relevant to Chapter 4.1.

Zipped Folder 'XYH' - Contains all the important raw data for most of the experiments in Chapters 3 and 4. Each file covers one genotype and contains rows of behavioural bouts for each fly tested within that genotype. Each row is an individual bout of a behaviour and each column is a metric (such as the mean speed of the fly) associated with that bout.

Zipped Folder 'Timeseries' - Contains files of individual genotypes. Files named genotype_timeseries (e.g., 'per_timeseries') contain rows of timeseries data for each fly, with columns for the various behavioural states and their durations at that timepoint. Files named genotype_DAM_comparison_timeseries contain rows of DAM-classified timepoints, with columns for the behavioural states and their durations at that timepoint. Relevant to Chapter 3.4.

Zipped Folder 'behavioural_tracker_validation' - Contains the machine learning models and training/test data (behavioural tracking models), alongside the raw annotation files (e.g. '80flies_addedmetrics' for 45fps and '10fps_6flies_addedmetrics' for 10fps), in which each frame is annotated with a given behaviour and the associated metrics.

Zipped Folder 'Raw_Fly_Data_for_Chapter3.1' - Contains two files, each representing an individual fly, where the rows are timepoints and the columns are the associated behavioural metrics. These files are used in Chapter 3.1 to illustrate the assay.

Zipped Folder 'Code_for_Analysis' - Contains a Jupyter notebook (code) file for each chapter/subchapter in which data analysis was performed. These can be read with JupyterLab, Jupyter Notebook, or another Python interface that supports notebooks, such as VSCode. The file 'TML' contains source code which I use in the notebooks to analyse data.
The subfolder 'DABEST-python-Horizontal' also contains source code used to analyse the data.

Date of data collection: 2019 - 2023

Information about geographic location of data collection: Singapore and Southampton, UK.

Licence: CC-BY

Date that the file was created: 01, 2024
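
Example of reading a bioluminescence .csv file: the LUC and NanoLuc data files described above share one layout, in which each row is a timepoint (Hours) and each numbered column is an individual fly. The following is a minimal sketch of reading such a file in Python using only the standard library. It is not the analysis code used in the thesis. The column name 'Hours', the sample values, the function name summarise_timepoints, and the normal-approximation 95% CI are all assumptions made for illustration; check the header row of each deposited file before relying on them.

```python
import csv
import io
import math
import statistics

# Hypothetical sample in the layout described in this README: an "Hours"
# column followed by one numbered column per fly. The values are made up.
SAMPLE_CSV = """Hours,1,2,3
0.0,100,110,120
1.0,90,100,110
2.0,80,95,95
"""


def summarise_timepoints(handle, time_column="Hours"):
    """Return a list of (hours, mean, ci_half_width), one per timepoint row."""
    summary = []
    for row in csv.DictReader(handle):
        hours = float(row[time_column])
        # Treat only purely numeric column names as individual flies, so any
        # precomputed mean/CI columns are ignored. Empty cells are skipped.
        values = [
            float(value)
            for name, value in row.items()
            if name is not None and name.strip().isdigit() and value not in ("", None)
        ]
        mean = statistics.mean(values)
        # Assumption: 95% CI from a normal approximation (1.96 * SEM).
        # The README does not state how the deposited CI columns were computed.
        if len(values) > 1:
            sem = statistics.stdev(values) / math.sqrt(len(values))
        else:
            sem = float("nan")
        summary.append((hours, mean, 1.96 * sem))
    return summary


for hours, mean, ci in summarise_timepoints(io.StringIO(SAMPLE_CSV)):
    print(f"{hours:5.1f} h  mean={mean:.1f}  95% CI=+/-{ci:.1f}")
```

To use this on a deposited file, replace io.StringIO(SAMPLE_CSV) with an open file handle, for example open(path, newline=""), where path points to one of the extracted .csv files.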