Detecting sequence signals in targeting peptides using deep learning

Research output: Contribution to journal › Journal article › Research › peer-review

Standard

Detecting sequence signals in targeting peptides using deep learning. / Almagro Armenteros, Jose Juan; Salvatore, Marco; Emanuelsson, Olof; Winther, Ole; von Heijne, Gunnar; Elofsson, Arne; Nielsen, Henrik.

In: Life Science Alliance, Vol. 2, No. 5, 201900429, 2019.

Harvard

Almagro Armenteros, JJ, Salvatore, M, Emanuelsson, O, Winther, O, von Heijne, G, Elofsson, A & Nielsen, H 2019, 'Detecting sequence signals in targeting peptides using deep learning', Life Science Alliance, vol. 2, no. 5, 201900429. https://doi.org/10.26508/lsa.201900429

APA

Almagro Armenteros, J. J., Salvatore, M., Emanuelsson, O., Winther, O., von Heijne, G., Elofsson, A., & Nielsen, H. (2019). Detecting sequence signals in targeting peptides using deep learning. Life Science Alliance, 2(5), [201900429]. https://doi.org/10.26508/lsa.201900429

Vancouver

Almagro Armenteros JJ, Salvatore M, Emanuelsson O, Winther O, von Heijne G, Elofsson A et al. Detecting sequence signals in targeting peptides using deep learning. Life Science Alliance. 2019;2(5). 201900429. https://doi.org/10.26508/lsa.201900429

Author

Almagro Armenteros, Jose Juan ; Salvatore, Marco ; Emanuelsson, Olof ; Winther, Ole ; von Heijne, Gunnar ; Elofsson, Arne ; Nielsen, Henrik. / Detecting sequence signals in targeting peptides using deep learning. In: Life Science Alliance. 2019 ; Vol. 2, No. 5.

Bibtex

@article{6594f1026a5142fa9d2fb92021655dc0,
title = "Detecting sequence signals in targeting peptides using deep learning",
abstract = "In bioinformatics, machine learning methods have been used to predict features embedded in the sequences. In contrast to what is generally assumed, machine learning approaches can also provide new insights into the underlying biology. Here, we demonstrate this by presenting TargetP 2.0, a novel state-of-the-art method to identify N-terminal sorting signals, which direct proteins to the secretory pathway, mitochondria, and chloroplasts or other plastids. By examining the strongest signals from the attention layer in the network, we find that the second residue in the protein, that is, the one following the initial methionine, has a strong influence on the classification. We observe that two-thirds of chloroplast and thylakoid transit peptides have an alanine in position 2, compared with 20{\%} in other plant proteins. We also note that in fungi and single-celled eukaryotes, less than 30{\%} of the targeting peptides have an amino acid that allows the removal of the N-terminal methionine compared with 60{\%} for the proteins without targeting peptide. The importance of this feature for predictions has not been highlighted before.",
author = "{Almagro Armenteros}, {Jose Juan} and Marco Salvatore and Olof Emanuelsson and Ole Winther and {von Heijne}, Gunnar and Arne Elofsson and Henrik Nielsen",
year = "2019",
doi = "10.26508/lsa.201900429",
language = "English",
volume = "2",
journal = "Life Science Alliance",
issn = "2575-1077",
publisher = "Life Science Alliance",
number = "5",
}

RIS

TY  - JOUR
T1  - Detecting sequence signals in targeting peptides using deep learning
AU  - Almagro Armenteros, Jose Juan
AU  - Salvatore, Marco
AU  - Emanuelsson, Olof
AU  - Winther, Ole
AU  - von Heijne, Gunnar
AU  - Elofsson, Arne
AU  - Nielsen, Henrik
PY  - 2019
Y1  - 2019
N2  - In bioinformatics, machine learning methods have been used to predict features embedded in the sequences. In contrast to what is generally assumed, machine learning approaches can also provide new insights into the underlying biology. Here, we demonstrate this by presenting TargetP 2.0, a novel state-of-the-art method to identify N-terminal sorting signals, which direct proteins to the secretory pathway, mitochondria, and chloroplasts or other plastids. By examining the strongest signals from the attention layer in the network, we find that the second residue in the protein, that is, the one following the initial methionine, has a strong influence on the classification. We observe that two-thirds of chloroplast and thylakoid transit peptides have an alanine in position 2, compared with 20% in other plant proteins. We also note that in fungi and single-celled eukaryotes, less than 30% of the targeting peptides have an amino acid that allows the removal of the N-terminal methionine compared with 60% for the proteins without targeting peptide. The importance of this feature for predictions has not been highlighted before.
AB  - In bioinformatics, machine learning methods have been used to predict features embedded in the sequences. In contrast to what is generally assumed, machine learning approaches can also provide new insights into the underlying biology. Here, we demonstrate this by presenting TargetP 2.0, a novel state-of-the-art method to identify N-terminal sorting signals, which direct proteins to the secretory pathway, mitochondria, and chloroplasts or other plastids. By examining the strongest signals from the attention layer in the network, we find that the second residue in the protein, that is, the one following the initial methionine, has a strong influence on the classification. We observe that two-thirds of chloroplast and thylakoid transit peptides have an alanine in position 2, compared with 20% in other plant proteins. We also note that in fungi and single-celled eukaryotes, less than 30% of the targeting peptides have an amino acid that allows the removal of the N-terminal methionine compared with 60% for the proteins without targeting peptide. The importance of this feature for predictions has not been highlighted before.
U2  - 10.26508/lsa.201900429
DO  - 10.26508/lsa.201900429
M3  - Journal article
C2  - 31570514
AN  - SCOPUS:85072779066
VL  - 2
JO  - Life Science Alliance
JF  - Life Science Alliance
SN  - 2575-1077
IS  - 5
M1  - 201900429
ER  - 

ID: 230743395