The University of Southampton
University of Southampton Institutional Repository

Data from: The architecture of an empirical genotype-phenotype map

DRYAD
Aguilar-Rodriguez, Jose
120e28d4-bb3a-4a22-9e74-a13750b802e9
Peel, Leto
502a7ee9-369e-4b4e-8a75-d1e8d97896e1
Stella, Massimo
37822c93-2522-4bc0-b840-ca32c75efbd7
Wagner, Andreas
768ea5d7-8131-4ef6-9da7-a6d8dedd2857
Payne, Joshua L.
4d990a3c-504b-4a15-936a-9fddcd105467

Peel, Leto, Wagner, Andreas and Payne, Joshua L. (2018) Data from: The architecture of an empirical genotype-phenotype map. DRYAD doi:10.5061/dryad.5fb633t [Dataset]

Record type: Dataset

Abstract

Recent advances in high-throughput technologies are bringing the study of empirical genotype-phenotype (GP) maps to the fore. Here, we use data from protein binding microarrays to study an empirical GP map of transcription factor (TF) binding preferences. In this map, each genotype is a DNA sequence. The phenotype of this DNA sequence is its ability to bind one or more TFs. We study this GP map using genotype networks, in which nodes represent genotypes with the same phenotype, and edges connect nodes if their genotypes differ by a single small mutation. We describe the structure and arrangement of genotype networks within the space of all possible binding sites for 525 TFs from three eukaryotic species encompassing three kingdoms of life (animals, plants, and fungi). We thus provide a high-resolution depiction of the architecture of an empirical GP map. Among a number of findings, we show that these genotype networks are “small-world” and assortative, and that they ubiquitously overlap and interface with one another. We also use polymorphism data from Arabidopsis thaliana to show how genotype network structure influences the evolution of TF binding sites in vivo. We discuss our findings in the context of regulatory evolution.

This DRYAD package contains files from: Aguilar-Rodríguez, J., Peel, L., Stella, M., Wagner, A., and Payne, J. L. The architecture of an empirical genotype-phenotype map.

The package contains the network files in GML format for the genotype space of TF binding sites ('genotype_space.gml'), 525 genotype networks of TF binding sites, and 66 genotype networks of DNA binding domains. The genotype networks of TF binding sites are organized into three directories by species of origin ('Arabidopsis_thaliana', 'Mus_musculus', and 'Neurospora_crassa'). Each network file is named after its TF. More information about these networks can be found in Table S1. The genotype networks of DNA binding domains are in a 'domains' sub-folder inside each of the three species folders. Each of these files is named after its DNA binding domain class.

Each network file has the following vertex attributes:
- id: vertex identification number.
- sequence: the nucleotide sequence of the binding site.
- reversecomplement: the reverse complement of 'sequence'.

Genotype networks of TF binding sites have the following additional vertex attributes:
- Escore: the enrichment score of the sequence in protein binding microarrays.
- PartitionSBM: the stochastic block model partition group in which the vertex is found: '0', '1', or 'None'. 'None' marks vertices not found in the dominant genotype network.
- PartitionBA: the binding affinity partition group in which the vertex is found: '0', '1', or 'None'. 'None' marks vertices not found in the dominant genotype network.

For questions regarding these data, contact Joshua Payne at joshua.payne@env.ethz.ch or Andreas Wagner at andreas.wagner@ieu.uzh.ch.

dryad.zip
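The vertex attributes and the edge rule described above can be illustrated with a minimal Python sketch. It is not the authors' code and does not read the package's GML files. The toy sequences and the helper names (`reverse_complement`, `differ_by_one_substitution`, `build_toy_network`) are hypothetical, and the sketch assumes that a "single small mutation" means one point substitution between sequences of equal length, which may be narrower than the definition used in the paper.

```python
from itertools import combinations

# Translation table for complementing DNA bases (A<->T, C<->G).
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(sequence: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    return sequence.translate(_COMPLEMENT)[::-1]


def differ_by_one_substitution(a: str, b: str) -> bool:
    """True if a and b have equal length and differ at exactly one position.

    Assumption: only point substitutions count as a 'single small mutation'.
    """
    return len(a) == len(b) and sum(x != y for x, y in zip(a, b)) == 1


def build_toy_network(sequences):
    """Build vertices (with the attributes named in the record) and edges."""
    vertices = [
        {"id": i, "sequence": s, "reversecomplement": reverse_complement(s)}
        for i, s in enumerate(sequences)
    ]
    edges = [
        (u["id"], v["id"])
        for u, v in combinations(vertices, 2)
        if differ_by_one_substitution(u["sequence"], v["sequence"])
    ]
    return vertices, edges


# Hypothetical sequences, not taken from the dataset.
toy_sequences = ["AAACCCGG", "AAACCCGT", "AAACCCTT", "TTTTTTTT"]
vertices, edges = build_toy_network(toy_sequences)
```

In this toy example the first three sequences form a chain of single substitutions, while the last one is isolated. For the real files, a GML reader from a graph library would supply the vertices and edges directly, along with the 'Escore', 'PartitionSBM', and 'PartitionBA' attributes for TF binding site networks.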

This record has no associated files available for download.

More information

Published date: 1 January 2018

Identifiers

Local EPrints ID: 448824
URI: http://eprints.soton.ac.uk/id/eprint/448824
PURE UUID: f892022f-8e4f-4487-98f1-ee86fd01fb2f

Catalogue record

Date deposited: 06 May 2021 16:31
Last modified: 06 May 2021 16:31

Contributors

Contributor: Jose Aguilar-Rodriguez
Creator: Leto Peel
Contributor: Massimo Stella
Creator: Andreas Wagner
Creator: Joshua L. Payne
