LSPpred Suite: Tools for Leaderless Secretory Protein Prediction in Plants

Andrew Lonsdale; Laura Ceballos-Laita; Daisuke Takahashi; Matsuo Uemura; Javier Abadía; Melissa J Davis; Antony Bacic; Monika S Doblin

doi:10.3390/plants12071428

Plants (Basel). 2023 Mar 23;12(7):1428. doi: 10.3390/plants12071428.

Authors

Andrew Lonsdale¹, Laura Ceballos-Laita², Daisuke Takahashi³, Matsuo Uemura⁴, Javier Abadía², Melissa J Davis⁵, Antony Bacic¹, Monika S Doblin¹

Affiliations

¹ ARC Centre of Excellence in Plant Cell Walls, School of BioSciences, The University of Melbourne, Melbourne, VIC 3010, Australia.
² Plant Stress Physiology Group, Plant Nutrition Department, Aula Dei Experimental Station, CSIC, P.O. Box 13034, 50080 Zaragoza, Spain.
³ United Graduate School of Agricultural Sciences, Iwate University, Morioka 020-8550, Japan.
⁴ Faculty of Agriculture, Iwate University, Morioka 020-8550, Japan.
⁵ Bioinformatics, Walter and Eliza Hall Institute for Medical Research, Melbourne, VIC 3052, Australia.

Abstract

Plant proteins that are secreted without a classical signal peptide leader sequence are termed leaderless secretory proteins (LSPs) and are implicated in both plant development and (a)biotic stress responses. In plant proteomics experimental workflows, identification of LSPs is hindered by the possibility of contamination from other subcellular compartments upon purification of the secretome. Applying machine learning algorithms to predict LSPs in plants is also challenging due to the rarity of experimentally validated examples for training purposes. This work attempts to address this issue by establishing criteria for identifying potential plant LSPs based on experimental observations and training random forest classifiers on the putative datasets. The resultant plant protein database LSPDB and bioinformatic prediction tools LSPpred and SPLpred are available at lsppred.lspdb.org. The LSPpred and SPLpred modules are internally validated on the training dataset, with false positives controlled at 5%, and are also able to classify the limited number of established plant LSPs (SPLpred 3/4, LSPpred 4/4). Until such time as a larger set of bona fide (independently experimentally validated) LSPs is established using imaging technologies (light/fluorescence/electron microscopy) to confirm subcellular location, these tools represent a bridging method for predicting and identifying putative plant LSPs for subsequent experimental validation.

Keywords: leaderless secretory proteins; subcellular localisation prediction; unconventional protein secretion.
