Extraction of transcript diversity from scientific literature

Research output: Contribution to journal › Journal article › Research › peer-review

Standard

Extraction of transcript diversity from scientific literature. / Shah, Parantu K; Jensen, Lars J; Boué, Stéphanie; Bork, Peer.

In: PLoS Computational Biology, Vol. 1, No. 1, 2005, p. e10.

Harvard

Shah, PK, Jensen, LJ, Boué, S & Bork, P 2005, 'Extraction of transcript diversity from scientific literature', PLoS Computational Biology, vol. 1, no. 1, p. e10. https://doi.org/10.1371/journal.pcbi.0010010

APA

Shah, P. K., Jensen, L. J., Boué, S., & Bork, P. (2005). Extraction of transcript diversity from scientific literature. PLoS Computational Biology, 1(1), e10. https://doi.org/10.1371/journal.pcbi.0010010

Vancouver

Shah PK, Jensen LJ, Boué S, Bork P. Extraction of transcript diversity from scientific literature. PLoS Computational Biology. 2005;1(1):e10. https://doi.org/10.1371/journal.pcbi.0010010

Author

Shah, Parantu K ; Jensen, Lars J ; Boué, Stéphanie ; Bork, Peer. / Extraction of transcript diversity from scientific literature. In: PLoS Computational Biology. 2005 ; Vol. 1, No. 1. p. e10.

Bibtex

@article{ae1d06c87174498a9b9405e38eb7c5e9,
title = "Extraction of transcript diversity from scientific literature",
abstract = "Transcript diversity generated by alternative splicing and associated mechanisms contributes heavily to the functional complexity of biological systems. The numerous examples of the mechanisms and functional implications of these events are scattered throughout the scientific literature. Thus, it is crucial to have a tool that can automatically extract the relevant facts and collect them in a knowledge base that can aid the interpretation of data from high-throughput methods. We have developed and applied a composite text-mining method for extracting information on transcript diversity from the entire MEDLINE database in order to create a database of genes with alternative transcripts. It contains information on tissue specificity, number of isoforms, causative mechanisms, functional implications, and experimental methods used for detection. We have mined this resource to identify 959 instances of tissue-specific splicing. Our results in combination with those from EST-based methods suggest that alternative splicing is the preferred mechanism for generating transcript diversity in the nervous system. We provide new annotations for 1,860 genes with the potential for generating transcript diversity. We assign the MeSH term {"}alternative splicing{"} to 1,536 additional abstracts in the MEDLINE database and suggest new MeSH terms for other events. We have successfully extracted information about transcript diversity and semiautomatically generated a database, LSAT, that can provide a quantitative understanding of the mechanisms behind tissue-specific gene expression. LSAT (Literature Support for Alternative Transcripts) is publicly available at http://www.bork.embl.de/LSAT/.",
author = "Shah, {Parantu K} and Jensen, {Lars J} and St{\'e}phanie Bou{\'e} and Peer Bork",
year = "2005",
doi = "10.1371/journal.pcbi.0010010",
language = "English",
volume = "1",
pages = "e10",
journal = "PLoS Computational Biology (Online)",
issn = "1553-734X",
publisher = "Public Library of Science",
number = "1",

}

RIS

TY - JOUR

T1 - Extraction of transcript diversity from scientific literature

AU - Shah, Parantu K

AU - Jensen, Lars J

AU - Boué, Stéphanie

AU - Bork, Peer

PY - 2005

Y1 - 2005

N2 - Transcript diversity generated by alternative splicing and associated mechanisms contributes heavily to the functional complexity of biological systems. The numerous examples of the mechanisms and functional implications of these events are scattered throughout the scientific literature. Thus, it is crucial to have a tool that can automatically extract the relevant facts and collect them in a knowledge base that can aid the interpretation of data from high-throughput methods. We have developed and applied a composite text-mining method for extracting information on transcript diversity from the entire MEDLINE database in order to create a database of genes with alternative transcripts. It contains information on tissue specificity, number of isoforms, causative mechanisms, functional implications, and experimental methods used for detection. We have mined this resource to identify 959 instances of tissue-specific splicing. Our results in combination with those from EST-based methods suggest that alternative splicing is the preferred mechanism for generating transcript diversity in the nervous system. We provide new annotations for 1,860 genes with the potential for generating transcript diversity. We assign the MeSH term "alternative splicing" to 1,536 additional abstracts in the MEDLINE database and suggest new MeSH terms for other events. We have successfully extracted information about transcript diversity and semiautomatically generated a database, LSAT, that can provide a quantitative understanding of the mechanisms behind tissue-specific gene expression. LSAT (Literature Support for Alternative Transcripts) is publicly available at http://www.bork.embl.de/LSAT/.

AB - Transcript diversity generated by alternative splicing and associated mechanisms contributes heavily to the functional complexity of biological systems. The numerous examples of the mechanisms and functional implications of these events are scattered throughout the scientific literature. Thus, it is crucial to have a tool that can automatically extract the relevant facts and collect them in a knowledge base that can aid the interpretation of data from high-throughput methods. We have developed and applied a composite text-mining method for extracting information on transcript diversity from the entire MEDLINE database in order to create a database of genes with alternative transcripts. It contains information on tissue specificity, number of isoforms, causative mechanisms, functional implications, and experimental methods used for detection. We have mined this resource to identify 959 instances of tissue-specific splicing. Our results in combination with those from EST-based methods suggest that alternative splicing is the preferred mechanism for generating transcript diversity in the nervous system. We provide new annotations for 1,860 genes with the potential for generating transcript diversity. We assign the MeSH term "alternative splicing" to 1,536 additional abstracts in the MEDLINE database and suggest new MeSH terms for other events. We have successfully extracted information about transcript diversity and semiautomatically generated a database, LSAT, that can provide a quantitative understanding of the mechanisms behind tissue-specific gene expression. LSAT (Literature Support for Alternative Transcripts) is publicly available at http://www.bork.embl.de/LSAT/.

U2 - 10.1371/journal.pcbi.0010010

DO - 10.1371/journal.pcbi.0010010

M3 - Journal article

C2 - 16103899

VL - 1

SP - e10

JO - PLoS Computational Biology (Online)

JF - PLoS Computational Biology (Online)

SN - 1553-734X

IS - 1

ER -

ID: 40749400