Poster Presentation The 48th Lorne Conference on Protein Structure and Function 2023

CSM-peptides: a computational approach to rapid identification of therapeutic peptides (#148)

Xiaotong Gu 1, Carlos Rodrigues 2, Douglas Pires 3, David Ascher 2
  1. University of Queensland, St Lucia, Queensland, Australia
  2. University of Queensland and Baker Institute, Melbourne, VIC, Australia
  3. School of Computing and Information Systems, University of Melbourne, Melbourne, VIC, Australia

Peptides are versatile molecules that play essential roles in signalling processes, acting, for example, as growth factors, neurotransmitters and anti-infectives. Because they are less complex to synthesise and cheaper to produce than traditional protein-based drugs, peptides are attractive candidates for developing new therapeutics and diagnostics. Increasing interest in these molecules has led to the creation of large collections of experimentally characterised therapeutic peptides, which has greatly contributed to the development of data-driven computational approaches. A growing number of peptides have been identified with a wide variety of therapeutic applications, including treatments for cancer and inflammatory diseases and use as drug delivery mechanisms. Despite these efforts, experimental screening of novel peptides remains a time-consuming and expensive endeavour.

Several computational methods have been proposed to help identify and characterise the functional mechanisms of peptides more efficiently. However, available approaches show variable performance and lack easy-to-use interfaces, which limits their use to those with specialist knowledge, and they provide no mechanisms to facilitate integration within bioinformatics pipelines.

Here we propose CSM-peptides, a novel machine learning method for rapid identification of eight different types of therapeutic peptides: Anti-Angiogenic, Anti-Bacterial, Anti-Cancer, Anti-Inflammatory, Anti-Viral, Cell-Penetrating, Quorum Sensing and Surface Binding. Our approach integrates a diverse range of physicochemical and sequence-based properties, tailored into an individual predictive model for each peptide class via supervised learning. CSM-peptides outperforms existing approaches, achieving an AUC of up to 0.92 on independent blind tests, with consistent performance under cross-validation. We anticipate that CSM-peptides will be of great value in screening large libraries to identify novel peptides with therapeutic potential, and we have made it freely available as a user-friendly web server and Application Programming Interface at https://biosig.uq.lab.edu.au/csm_peptides.
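
To make the idea of sequence-derived features and AUC-based evaluation concrete, the following is a minimal illustrative sketch in Python. It is not the CSM-peptides implementation: the abstract does not list the specific descriptors or learning algorithm used, so the features here (amino acid composition, mean Kyte-Doolittle hydropathy and a crude net charge), the function names peptide_features and roc_auc, and the example sequences and scores are all assumptions chosen for illustration only.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Kyte-Doolittle hydropathy scale (one common choice; assumed here,
# not necessarily a descriptor used by CSM-peptides).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def peptide_features(sequence):
    """Return simple sequence-based and physicochemical features (hypothetical set)."""
    seq = sequence.strip().upper()
    if not seq or any(res not in KYTE_DOOLITTLE for res in seq):
        raise ValueError("sequence must contain only the 20 standard residues")
    counts = Counter(seq)
    n = len(seq)
    features = {"length": n}
    # Sequence-based: fraction of each of the 20 standard amino acids.
    for aa in AMINO_ACIDS:
        features["frac_" + aa] = counts[aa] / n
    # Physicochemical: mean hydropathy over all residues.
    features["gravy"] = sum(KYTE_DOOLITTLE[res] for res in seq) / n
    # Crude net charge: +1 per K/R, -1 per D/E (ignores His and termini).
    features["net_charge"] = counts["K"] + counts["R"] - counts["D"] - counts["E"]
    return features


def roc_auc(labels, scores):
    """AUC as the probability that a positive outscores a negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Made-up sequence and scores, for demonstration only.
    feats = peptide_features("KKLLKKAA")
    print(feats["length"], feats["net_charge"], round(feats["gravy"], 3))
    print(roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.6, 0.1]))
```

In a per-class setup such as the one described above, feature vectors of this kind would be computed for every peptide and passed to a separate binary classifier for each of the eight classes, with the AUC reported per class on held-out data.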