In Silico screening for functional candidates amongst hypothetical proteins

Research output: Contribution to journal, Journal article, Research, peer-review

Standard

In Silico screening for functional candidates amongst hypothetical proteins. / Desler, Claus; Suravajhala, Prashanth; Sanderhoff, May; Rasmussen, Merete; Rasmussen, Lene Juel.

In: BMC Bioinformatics, Vol. 10, 2009, p. 289.

Harvard

Desler, C, Suravajhala, P, Sanderhoff, M, Rasmussen, M & Rasmussen, LJ 2009, 'In Silico screening for functional candidates amongst hypothetical proteins', BMC Bioinformatics, vol. 10, p. 289. https://doi.org/10.1186/1471-2105-10-289

APA

Desler, C., Suravajhala, P., Sanderhoff, M., Rasmussen, M., & Rasmussen, L. J. (2009). In Silico screening for functional candidates amongst hypothetical proteins. BMC Bioinformatics, 10, 289. https://doi.org/10.1186/1471-2105-10-289

Vancouver

Desler C, Suravajhala P, Sanderhoff M, Rasmussen M, Rasmussen LJ. In Silico screening for functional candidates amongst hypothetical proteins. BMC Bioinformatics. 2009;10:289. https://doi.org/10.1186/1471-2105-10-289

Author

Desler, Claus ; Suravajhala, Prashanth ; Sanderhoff, May ; Rasmussen, Merete ; Rasmussen, Lene Juel. / In Silico screening for functional candidates amongst hypothetical proteins. In: BMC Bioinformatics. 2009 ; Vol. 10. p. 289.

Bibtex

@article{ce10a9a0a05f11df928f000ea68e967b,
title = "In Silico screening for functional candidates amongst hypothetical proteins",
abstract = "BACKGROUND: The definition of a hypothetical protein is a protein that is predicted to be expressed from an open reading frame, but for which there is no experimental evidence of translation. Hypothetical proteins constitute a substantial fraction of proteomes of human as well as of other eukaryotes. With the general belief that the majority of hypothetical proteins are the product of pseudogenes, it is essential to have a tool with the ability of pinpointing the minority of hypothetical proteins with a high probability of being expressed. RESULTS: Here, we present an in silico selection strategy where eukaryotic hypothetical proteins are sorted according to two criteria that can be reliably identified in silico: the presence of subcellular targeting signals and presence of characterized protein domains. To validate the selection strategy we applied it on a database of human hypothetical proteins dating to 2006 and compared the proteins predicted to be expressed by our selecting strategy, with their status in 2008. For the comparison we focused on mitochondrial proteins, since considerable amounts of research have focused on this field in between 2006 and 2008. Therefore, many proteins, defined as hypothetical in 2006, have later been characterized as mitochondrial. CONCLUSION: Among the total amount of human proteins hypothetical in 2006, 21% have later been experimentally characterized and 6% of those have been shown to have a role in a mitochondrial context. In contrast, among the selected hypothetical proteins from the 2006 dataset, predicted by our strategy to have a mitochondrial role, 53-62% have later been experimentally characterized, and 85% of these have actually been assigned a role in mitochondria by 2008. Therefore our in silico selection strategy can be used to select the most promising candidates for subsequent in vitro and in vivo analyses.",
author = "Claus Desler and Prashanth Suravajhala and May Sanderhoff and Merete Rasmussen and Rasmussen, {Lene Juel}",
note = "Keywords: Computational Biology; Databases, Protein; Humans; Open Reading Frames; Proteins; Proteome; Proteomics",
year = "2009",
doi = "10.1186/1471-2105-10-289",
language = "English",
volume = "10",
pages = "289",
journal = "BMC Bioinformatics",
issn = "1471-2105",
publisher = "BioMed Central Ltd.",

}

RIS

TY - JOUR

T1 - In Silico screening for functional candidates amongst hypothetical proteins

AU - Desler, Claus

AU - Suravajhala, Prashanth

AU - Sanderhoff, May

AU - Rasmussen, Merete

AU - Rasmussen, Lene Juel

N1 - Keywords: Computational Biology; Databases, Protein; Humans; Open Reading Frames; Proteins; Proteome; Proteomics

PY - 2009

Y1 - 2009

N2 - BACKGROUND: The definition of a hypothetical protein is a protein that is predicted to be expressed from an open reading frame, but for which there is no experimental evidence of translation. Hypothetical proteins constitute a substantial fraction of proteomes of human as well as of other eukaryotes. With the general belief that the majority of hypothetical proteins are the product of pseudogenes, it is essential to have a tool with the ability of pinpointing the minority of hypothetical proteins with a high probability of being expressed. RESULTS: Here, we present an in silico selection strategy where eukaryotic hypothetical proteins are sorted according to two criteria that can be reliably identified in silico: the presence of subcellular targeting signals and presence of characterized protein domains. To validate the selection strategy we applied it on a database of human hypothetical proteins dating to 2006 and compared the proteins predicted to be expressed by our selecting strategy, with their status in 2008. For the comparison we focused on mitochondrial proteins, since considerable amounts of research have focused on this field in between 2006 and 2008. Therefore, many proteins, defined as hypothetical in 2006, have later been characterized as mitochondrial. CONCLUSION: Among the total amount of human proteins hypothetical in 2006, 21% have later been experimentally characterized and 6% of those have been shown to have a role in a mitochondrial context. In contrast, among the selected hypothetical proteins from the 2006 dataset, predicted by our strategy to have a mitochondrial role, 53-62% have later been experimentally characterized, and 85% of these have actually been assigned a role in mitochondria by 2008. Therefore our in silico selection strategy can be used to select the most promising candidates for subsequent in vitro and in vivo analyses.

AB - BACKGROUND: The definition of a hypothetical protein is a protein that is predicted to be expressed from an open reading frame, but for which there is no experimental evidence of translation. Hypothetical proteins constitute a substantial fraction of proteomes of human as well as of other eukaryotes. With the general belief that the majority of hypothetical proteins are the product of pseudogenes, it is essential to have a tool with the ability of pinpointing the minority of hypothetical proteins with a high probability of being expressed. RESULTS: Here, we present an in silico selection strategy where eukaryotic hypothetical proteins are sorted according to two criteria that can be reliably identified in silico: the presence of subcellular targeting signals and presence of characterized protein domains. To validate the selection strategy we applied it on a database of human hypothetical proteins dating to 2006 and compared the proteins predicted to be expressed by our selecting strategy, with their status in 2008. For the comparison we focused on mitochondrial proteins, since considerable amounts of research have focused on this field in between 2006 and 2008. Therefore, many proteins, defined as hypothetical in 2006, have later been characterized as mitochondrial. CONCLUSION: Among the total amount of human proteins hypothetical in 2006, 21% have later been experimentally characterized and 6% of those have been shown to have a role in a mitochondrial context. In contrast, among the selected hypothetical proteins from the 2006 dataset, predicted by our strategy to have a mitochondrial role, 53-62% have later been experimentally characterized, and 85% of these have actually been assigned a role in mitochondria by 2008. Therefore our in silico selection strategy can be used to select the most promising candidates for subsequent in vitro and in vivo analyses.

U2 - 10.1186/1471-2105-10-289

DO - 10.1186/1471-2105-10-289

M3 - Journal article

C2 - 19754976

VL - 10

SP - 289

JO - BMC Bioinformatics

JF - BMC Bioinformatics

SN - 1471-2105

ER -
