The University of Southampton
University of Southampton Institutional Repository

AI3SD Video: Machine Learning for biological sequence design

Colwell, Lucy (2021) AI3SD Video: Machine Learning for biological sequence design. Frey, Jeremy G., Kanza, Samantha and Niranjan, Mahesan (eds.) AI 4 Proteins Seminar Series 2021. 14 Apr - 17 Jun 2021. (doi:10.5258/SOTON/P0090).

Record type: Conference or Workshop Item (Other)

Abstract

Predicting the functional properties of a protein from its sequence is a central challenge; solving it would allow us to discover new proteins with specific functionality. Experimental breakthroughs now allow data on the relationship between sequence and function to be acquired rapidly, and these data can be used to train and validate machine learning models that predict protein function directly from sequence. However, the cost and latency of wet-lab experiments require methods that find good sequences in only a few experimental rounds, where each round contains large batches of sequence designs. In this setting, I will discuss model-based optimization approaches that allow us to take advantage of sample-inefficient methods and find diverse optimal sequence candidates for experimental evaluation. The potential of this approach is illustrated through the design and experimental validation of viable AAV capsid protein variants for gene therapy applications.

Video
AI4Proteins-Seminar-Series-LucyColwell-140421 (1) - Version of Record
Available under License Creative Commons Attribution.
Download (578MB)

More information

Published date: 14 April 2021
Additional Information: Lucy Colwell is a faculty member in chemistry at the University of Cambridge. Her primary interests are in the application of machine learning approaches to better understand the relationship between the sequence and function of biological macromolecules. With collaborators, Lucy showed that graphical models built from aligned protein sequences can be used to predict protein tertiary structure and functional attributes. Before moving to Cambridge, Lucy received her PhD from Harvard University and was a member at the Institute for Advanced Study in Princeton, NJ. In 2018, Lucy was appointed a Simons Investigator in Mathematical Modeling of Living Systems.
Venue - Dates: AI 4 Proteins Seminar Series 2021, 2021-04-14 - 2021-06-17
Keywords: AI, AI3SD Event, Artificial Intelligence, Machine Intelligence, Machine Learning, ML, Proteins

Identifiers

Local EPrints ID: 450084
URI: http://eprints.soton.ac.uk/id/eprint/450084
PURE UUID: bab24737-8bef-46dc-8b22-a18310b65bd1
ORCID for Jeremy G. Frey: orcid.org/0000-0003-0842-4302
ORCID for Samantha Kanza: orcid.org/0000-0002-4831-9489
ORCID for Mahesan Niranjan: orcid.org/0000-0001-7021-140X

Catalogue record

Date deposited: 09 Jul 2021 16:30
Last modified: 17 Mar 2024 03:51

Contributors

Author: Lucy Colwell
Editor: Jeremy G. Frey
Editor: Samantha Kanza
Editor: Mahesan Niranjan
