Convolutional LSTM networks for subcellular localization of proteins
Research output: Chapter in Book/Report/Conference proceeding › Article in proceedings › Research › peer-review
Standard
Convolutional LSTM networks for subcellular localization of proteins. / Sønderby, Søren Kaae; Sønderby, Casper Kaae; Nielsen, Henrik; Winther, Ole.
Algorithms for Computational Biology. ed. / Adrian-Horia Dediu; Francisco Hernández-Quiroz; Carlos Martín-Vide; David A. Rosenblueth. Springer, 2015. p. 68-80 (Lecture notes in computer science, Vol. 9199).
RIS
TY - GEN
T1 - Convolutional LSTM networks for subcellular localization of proteins
AU - Sønderby, Søren Kaae
AU - Sønderby, Casper Kaae
AU - Nielsen, Henrik
AU - Winther, Ole
PY - 2015
Y1 - 2015
N2 - Machine learning is widely used to analyze biological sequence data. Non-sequential models such as SVMs or feed-forward neural networks are often used although they have no natural way of handling sequences of varying length. Recurrent neural networks such as the long short term memory (LSTM) model on the other hand are designed to handle sequences. In this study we demonstrate that LSTM networks predict the subcellular location of proteins given only the protein sequence with high accuracy (0.902) outperforming current state of the art algorithms. We further improve the performance by introducing convolutional filters and experiment with an attention mechanism which lets the LSTM focus on specific parts of the protein. Lastly we introduce new visualizations of both the convolutional filters and the attention mechanisms and show how they can be used to extract biologically relevant knowledge from the LSTM networks.
KW - Convolutional networks
KW - Deep learning
KW - LSTM
KW - Machine learning
KW - Neural networks
KW - RNN
KW - Subcellular location
U2 - 10.1007/978-3-319-21233-3_6
DO - 10.1007/978-3-319-21233-3_6
M3 - Article in proceedings
AN - SCOPUS:84951119143
SN - 978-3-319-21232-6
T3 - Lecture notes in computer science
SP - 68
EP - 80
BT - Algorithms for Computational Biology
A2 - Dediu, Adrian-Horia
A2 - Hernández-Quiroz, Francisco
A2 - Martín-Vide, Carlos
A2 - Rosenblueth, David A.
PB - Springer
T2 - 2nd International Conference on Algorithms for Computational Biology, AlCoB 2015
Y2 - 4 August 2015 through 5 August 2015
ER -
ID: 153446256