README: Chapter 4 – Seasonal Variation in the Vitamin C Content of UK-Marketed Strawberries
============================================================================================

Thesis Title:
An Investigation of Overlooked Complexities Affecting UK Vitamin C Security and the Potential for Local Crops to Address Insecurities: A Case of UK Strawberries

Author: David Fisher  
Supervisors: Eleftheria Stavridou | Jenny Baverstock | Guy Poppy  
Date: 25/06/2025  
Contact:  
- D.Fisher@soton.ac.uk  
- eleftheria.stavridou@niab.com  
- J.Baverstock@soton.ac.uk  

--------------------------------------------------------------------------------------------

Description
-----------
Chapter 4 investigates seasonal variation in the vitamin C content of UK-marketed fresh strawberries. The study addresses the absence of seasonal data in CoFID (discussed in Chapter 2), a gap that persists despite seasonal patterns in consumer purchasing. The analysis identifies patterns in origin, cultivar type, and nutritional composition across seasons.

--------------------------------------------------------------------------------------------

Date Ranges
-----------
- **Survey Period**: 23 March 2023 – 01 March 2024  
- See thesis Section 4.2 for full methodological details.

Geographic Information
----------------------
- All samples were purchased within **Maidstone and Medway**, Kent, United Kingdom.

--------------------------------------------------------------------------------------------

File Structure
--------------
chapter-04/
│
├── code/
│   ├── acids_processing_func_v0.0.8.R
│   ├── chap-04_analysis-figures.R
│   ├── datacompilation_SS_v0.0.4.R
│   └── StrawbRetailers_v1.0.2.R
│
└── data/
    ├── primary/
    │   ├── hplc/
    │   │   ├── extraction_weights/
    │   │   │   └── 20240730_SS_extraction_weights.csv
    │   │   └── results/
    │   │       ├── 20240702_SS_batch1_recal_results.csv
    │   │       ├── 20240702_SS_batch2_recal_results.csv
    │   │       ├── 20240702_SS_batch3_recal_results.csv
    │   │       ├── 20240702_SS_batch4_recal_results.csv
    │   │       ├── 20240702_SS_batch5_recal_results.csv
    │   │       └── 20240702_SS_batch6_recal_results.csv
    │   └── 20240513_SS_SampleInformation.csv
    │
    ├── processed/
    │   ├── 20241002_SS_compiled_dataset.csv
    │   ├── 20250625_SS_kantar_retailers.RData
    │   └── 20250625_SS_kantar_retailers_strawberries.csv
    │
    └── dictionaries/
        ├── SS_compiled_dataset_dictionary.txt
        └── SS_kantar_retailer_strawberries_dictionary.txt

--------------------------------------------------------------------------------------------

Code Overview: ./code/
----------------------
- **acids_processing_func_v0.0.8.R**  
  Custom function for automating HPLC data processing (see thesis Section 4.2.2.1).

- **chap-04_analysis-figures.R**  
  Annotated R script to analyse processed data and generate publication-ready figures.

- **datacompilation_SS_v0.0.4.R**  
  Integrates primary HPLC data and sample metadata into a single dataset. Uses the function defined in `acids_processing_func_v0.0.8.R`.

- **StrawbRetailers_v1.0.2.R**  
  Extracts UK grocery retailer data from the Kantar dataset (referenced in Chapter 2).

--------------------------------------------------------------------------------------------

Data Overview
-------------

./data/primary/
---------------
| Filename                                                   | Description                                                               |
|------------------------------------------------------------|---------------------------------------------------------------------------|
| hplc/extraction_weights/20240730_SS_extraction_weights.csv | Mass of dried strawberry material used for organic acid extraction        |
| hplc/results/20240702_SS_batch*_recal_results.csv          | Organic acid HPLC data (batches 1–6); post-run manual calibration applied |
| 20240513_SS_SampleInformation.csv                          | Metadata collected from packaging and sample processing                   |
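
The `batch*` wildcard above stands for six files. As a minimal sketch, assuming the directory layout shown under File Structure, the batch files could be located and ordered by batch number as follows. `list_batch_files()` is a hypothetical helper written for this README; it is not part of the repository's scripts and does not reproduce the processing done by `acids_processing_func_v0.0.8.R`.

```r
# Hypothetical helper (not from the repository): list the recalibrated HPLC
# batch files in a directory and extract the batch number from each file name.
list_batch_files <- function(results_dir) {
  files <- list.files(
    results_dir,
    pattern = "_SS_batch[0-9]+_recal_results\\.csv$",
    full.names = TRUE
  )
  # Pull the digits that follow "batch" out of each file name
  batch <- as.integer(
    sub(".*_batch([0-9]+)_recal_results\\.csv$", "\\1", basename(files))
  )
  out <- data.frame(batch = batch, path = files, stringsAsFactors = FALSE)
  out[order(out$batch), ]
}

# Path assumed from the File Structure section, relative to chapter-04/
results_dir <- file.path("data", "primary", "hplc", "results")
batches <- list_batch_files(results_dir)
print(batches)
```

If the directory is not found, the function returns an empty data frame rather than raising an error, so it is worth checking `nrow(batches)` before reading any files.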

./data/processed/
------------------
| Filename                                        | Description                                                                                     | Dictionary File                                                  |
|------------------------------------------------|--------------------------------------------------------------------------------------------------|------------------------------------------------------------------|
| 20241002_SS_compiled_dataset.csv               | Integrated dataset from `data/primary/` (see datacompilation script)                            | SS_compiled_dataset_dictionary.txt                              |
| 20250625_SS_kantar_retailers.RData             | RData version of the compiled dataset                                                           | SS_compiled_dataset_dictionary.txt                              |
| 20250625_SS_kantar_retailers_strawberries.csv  | Fresh strawberry purchase data from major UK retailers                                           | SS_kantar_retailer_strawberries_dictionary.txt                  |
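
To inspect the `.RData` file without overwriting objects in the current workspace, one option is to load it into a separate environment. The sketch below assumes the paths shown under File Structure; `load_rdata()` is a hypothetical helper, not a function from the repository, and the names of the stored objects are not assumed.

```r
# Hypothetical helper (not from the repository): load an .RData file into its
# own environment and return the restored objects as a named list.
load_rdata <- function(path) {
  env <- new.env()
  loaded <- load(path, envir = env)  # load() returns the names of the restored objects
  mget(loaded, envir = env)
}

# Path assumed from the File Structure section, relative to chapter-04/
rdata_path <- file.path("data", "processed", "20250625_SS_kantar_retailers.RData")
if (file.exists(rdata_path)) {
  objs <- load_rdata(rdata_path)
  print(names(objs))  # shows which objects the file contains
}
```

The CSV files in the same folder can be read with `read.csv()`; the data dictionaries describe their columns.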

Identifier Abbreviations
-------------------------
| Identifier | Description                                                                 |
|------------|-----------------------------------------------------------------------------|
| SS         | Supermarket Survey (refers to the full 2023–2024 seasonal analytical study) |

--------------------------------------------------------------------------------------------

Secondary Data Sources
----------------------
- Roberts, J.S. (2023). *Kantar World Panel: Homescan Panel Dataset – United Kingdom fruits and vegetables.*  
  See usage notes for access information.

- Public Health England (2015). *McCance and Widdowson’s The Composition of Foods Integrated Dataset (CoFID).*  
  https://www.gov.uk/government/publications/composition-of-foods-integrated-dataset-cofid  
  Accessed 24/06/2025

- HMRC (2022). *UK Trade Info – HMRC Trade Statistics Database.*  
  https://www.uktradeinfo.com/  
  Accessed 27/09/2024

- Food Standards Agency (2024). *UK Trade Data Visualisation.*  
  https://foodstandards.shinyapps.io/TradeDataVis/  
  Accessed 27/09/2024

--------------------------------------------------------------------------------------------

Usage Notes
-----------
- Only one secondary dataset was directly analysed in Chapter 4 (Kantar). See thesis Section 4.2.3 for discussion of HMRC usage.
- The FSA’s TradeDataVis app is no longer supported. For HMRC data access:
  - Use the original HMRC site (https://www.uktradeinfo.com/)
  - Or recreate the app from: https://github.com/LouisTsiattalou/TradeDataVis
- File naming follows the format: `YYYYMMDD_StudyID_Descriptor.Extension`
  - Example: `20241002_SS_compiled_dataset.csv`
- Each HPLC batch (~58 samples) was processed and saved separately.
  - Files with `_recal_` indicate post-run manual calibration was applied.
- The `.RData` file is equivalent to the main compiled CSV and can be loaded directly in R.
- For details of variables in each dataset, refer to the corresponding dictionary in `./data/dictionaries/`.
  - Data dictionaries are included only for the processed data files used in the analysis.
  - Please contact the author for more information on the primary data files.
- All figures used in Chapter 4 are included in the main thesis. Cross-reference `chap-04_analysis-figures.R` for reproduction or revision.
- Recommended tools: R ≥ 4.0. See `chap-04_analysis-figures.R` for the required packages.
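
The file naming format described above can be checked or split programmatically. The following is a minimal sketch, assuming that the study identifier contains no underscore and that the extension is everything after the final dot. `parse_data_filename()` is a hypothetical helper and is not part of the repository's scripts.

```r
# Hypothetical helper (not from the repository): split a file name of the form
# YYYYMMDD_StudyID_Descriptor.Extension into its four parts.
parse_data_filename <- function(filename) {
  base <- basename(filename)
  pattern <- "^([0-9]{8})_([^_]+)_(.+)\\.([^.]+)$"
  if (!grepl(pattern, base)) {
    stop("File name does not follow YYYYMMDD_StudyID_Descriptor.Extension: ", base)
  }
  list(
    date       = as.Date(sub(pattern, "\\1", base), format = "%Y%m%d"),
    study_id   = sub(pattern, "\\2", base),
    descriptor = sub(pattern, "\\3", base),
    extension  = sub(pattern, "\\4", base)
  )
}

parts <- parse_data_filename("20241002_SS_compiled_dataset.csv")
str(parts)
```

File names that do not match the format, such as the scripts in `./code/`, raise an error instead of returning partial results.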

