The University of Southampton
University of Southampton Institutional Repository

Identifying 124 new anti-HIV drug candidates in a 37 billion-compound database: an integrated approach of machine learning (QSAR), molecular docking, and molecular dynamics simulation



Cobre, Alexandre de Fátima, Ara, Anderson, Alves, Alexessander Couto, Neto, Moisés Maia, Fachi, Mariana Millan, Beca, Laize Sílvia dos Anjos Botas, Tonin, Fernanda Stumpf and Pontarolo, Roberto (2024) Identifying 124 new anti-HIV drug candidates in a 37 billion-compound database: an integrated approach of machine learning (QSAR), molecular docking, and molecular dynamics simulation. Chemometrics and Intelligent Laboratory Systems, 250, [105145]. (doi:10.1016/j.chemolab.2024.105145).

Record type: Article

Abstract

Recent data from the World Health Organization reveal that in 2023, 38.8 million people were living with HIV. Within this population, there were 1.5 million new cases and 650,000 deaths attributed to the disease. This study employs an integrated approach involving QSAR-based machine learning models, molecular docking, and molecular dynamics simulations to identify potential compounds for inhibiting the bioactivity of the CC chemokine receptor type 5 (CCR5) protein, a key entry point for HIV. Using non-redundant experimental data from the ChEMBL database, 40 different machine learning algorithms were trained, and the top four models (XGBoost, Histogram-based Gradient Boosting, Light Gradient Boosted Machine, and Extra Trees Regression) were used to predict anti-HIV bioactivity for 37 billion compounds in the ZINC-22 database. The screening identified 124 new anti-HIV drug candidates, confirmed through molecular docking and molecular dynamics simulations. The study underscores the therapeutic potential of these compounds, paving the way for further in vitro and in vivo investigations. The convergence of machine learning and experimental findings presents a promising avenue for significant advancements in pharmaceutical research, particularly in the treatment of viral diseases such as HIV. To support the reproducibility of our study, we have made the Python code (Google Colab) and the associated database available on GitHub: https://github.com/AlexandreCOBRE/code.
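The model-selection step described in the abstract (train many regressors, keep the best few, then score a large library with them) can be illustrated with a minimal sketch. This is not the authors' code, which is available at the GitHub link above. It assumes synthetic descriptor data, three toy regressors standing in for the 40 algorithms, and consensus scoring by averaging the selected models; the abstract does not say how the four models' predictions were combined. All function names, the activity threshold, and the data are hypothetical.

```python
import random
from statistics import fmean

def make_toy_data(n, seed):
    """Synthetic 'compounds': 3 descriptors and a pIC50-like target (not study data)."""
    rng = random.Random(seed)
    X = [[rng.uniform(0, 1) for _ in range(3)] for _ in range(n)]
    y = [5.0 + 2.0 * x[0] - 1.0 * x[1] + rng.gauss(0, 0.1) for x in X]
    return X, y

# Each candidate is a fit function that returns a predict function.
def fit_mean(X, y):
    """Baseline: always predict the training mean."""
    m = fmean(y)
    return lambda x: m

def fit_line0(X, y):
    """Least-squares line on the first descriptor only."""
    xs = [x[0] for x in X]
    mx, my = fmean(xs), fmean(y)
    slope = sum((a - mx) * (b - my) for a, b in zip(xs, y)) / sum((a - mx) ** 2 for a in xs)
    return lambda x: my + slope * (x[0] - mx)

def fit_knn(k):
    """k-nearest-neighbour regressor using squared Euclidean distance."""
    def fit(X, y):
        def predict(x):
            dists = sorted((sum((a - b) ** 2 for a, b in zip(x, xi)), yi) for xi, yi in zip(X, y))
            return fmean([yi for _, yi in dists[:k]])
        return predict
    return fit

def cv_rmse(fit, X, y, folds=5):
    """Root-mean-square error over interleaved cross-validation folds."""
    sq_errors = []
    for f in range(folds):
        train = [i for i in range(len(X)) if i % folds != f]
        test = [i for i in range(len(X)) if i % folds == f]
        predict = fit([X[i] for i in train], [y[i] for i in train])
        sq_errors += [(predict(X[i]) - y[i]) ** 2 for i in test]
    return (sum(sq_errors) / len(sq_errors)) ** 0.5

def select_top_models(candidates, X, y, k):
    """Rank candidate names by cross-validated RMSE (lower is better), keep the best k."""
    return sorted(candidates, key=lambda name: cv_rmse(candidates[name], X, y))[:k]

def screen(library, fitted, threshold):
    """Keep library entries whose mean predicted activity reaches the threshold."""
    hits = []
    for idx, x in enumerate(library):
        score = fmean([predict(x) for predict in fitted])
        if score >= threshold:
            hits.append((idx, score))
    return sorted(hits, key=lambda h: -h[1])

X, y = make_toy_data(120, seed=0)
candidates = {"mean": fit_mean, "line0": fit_line0, "knn3": fit_knn(3), "knn7": fit_knn(7)}
top = select_top_models(candidates, X, y, k=2)
fitted = [candidates[name](X, y) for name in top]   # refit the selected models on all data
library, _ = make_toy_data(500, seed=1)              # stand-in for the screening library
hits = screen(library, fitted, threshold=6.0)
print(top, len(hits))
```

In a real workflow the toy regressors would be replaced by library implementations, the descriptors would be computed from molecular structures, and the ranked hits would then be passed on to docking and molecular dynamics.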

This record has no associated files available for download.

More information

Accepted/In Press date: 12 May 2024
e-pub ahead of print date: 15 July 2024
Keywords: CCR5, Drug discovery, HIV, Machine learning, Molecular docking, Molecular dynamics

Identifiers

Local EPrints ID: 509032
URI: http://eprints.soton.ac.uk/id/eprint/509032
ISSN: 0169-7439
PURE UUID: 1fc8e9c6-0689-48e1-aa30-0ac1b9ed323c
ORCID for Alexessander Couto Alves: orcid.org/0000-0001-8519-7356

Catalogue record

Date deposited: 10 Feb 2026 17:43
Last modified: 07 Mar 2026 04:23


Contributors

Author: Alexandre de Fátima Cobre
Author: Anderson Ara
Author: Alexessander Couto Alves
Author: Moisés Maia Neto
Author: Mariana Millan Fachi
Author: Laize Sílvia dos Anjos Botas Beca
Author: Fernanda Stumpf Tonin
Author: Roberto Pontarolo
