Enhanced function annotations for Drosophila serine proteases: a case study for systematic annotation of multi-member gene families

Research output: Contribution to journal › Journal article › Research › peer-review

Standard

Enhanced function annotations for Drosophila serine proteases: a case study for systematic annotation of multi-member gene families. / Shah, Parantu K; Tripathi, Lokesh P; Jensen, Lars Juhl; Gahnim, Murad; Mason, Christopher; Furlong, Eileen E; Rodrigues, Veronica; White, Kevin P; Bork, Peer; Sowdhamini, R.

In: Gene, Vol. 407, No. 1-2, 2008, p. 199-215.

Harvard

Shah, PK, Tripathi, LP, Jensen, LJ, Gahnim, M, Mason, C, Furlong, EE, Rodrigues, V, White, KP, Bork, P & Sowdhamini, R 2008, 'Enhanced function annotations for Drosophila serine proteases: a case study for systematic annotation of multi-member gene families', Gene, vol. 407, no. 1-2, pp. 199-215. https://doi.org/10.1016/j.gene.2007.10.012

APA

Shah, P. K., Tripathi, L. P., Jensen, L. J., Gahnim, M., Mason, C., Furlong, E. E., Rodrigues, V., White, K. P., Bork, P., & Sowdhamini, R. (2008). Enhanced function annotations for Drosophila serine proteases: a case study for systematic annotation of multi-member gene families. Gene, 407(1-2), 199-215. https://doi.org/10.1016/j.gene.2007.10.012

Vancouver

Shah PK, Tripathi LP, Jensen LJ, Gahnim M, Mason C, Furlong EE et al. Enhanced function annotations for Drosophila serine proteases: a case study for systematic annotation of multi-member gene families. Gene. 2008;407(1-2):199-215. https://doi.org/10.1016/j.gene.2007.10.012

Author

Shah, Parantu K ; Tripathi, Lokesh P ; Jensen, Lars Juhl ; Gahnim, Murad ; Mason, Christopher ; Furlong, Eileen E ; Rodrigues, Veronica ; White, Kevin P ; Bork, Peer ; Sowdhamini, R. / Enhanced function annotations for Drosophila serine proteases : a case study for systematic annotation of multi-member gene families. In: Gene. 2008 ; Vol. 407, No. 1-2. pp. 199-215.

Bibtex

@article{083cff9cc48c4b10b7dfe4f02c577db7,
title = "Enhanced function annotations for Drosophila serine proteases: a case study for systematic annotation of multi-member gene families",
abstract = "Systematically annotating function of enzymes that belong to large protein families encoded in a single eukaryotic genome is a very challenging task. We carried out such an exercise to annotate function for serine-protease family of the trypsin fold in Drosophila melanogaster, with an emphasis on annotating serine-protease homologues (SPHs) that may have lost their catalytic function. Our approach involves data mining and data integration to provide function annotations for 190 Drosophila gene products containing serine-protease-like domains, of which 35 are SPHs. This was accomplished by analysis of structure-function relationships, gene-expression profiles, large-scale protein-protein interaction data, literature mining and bioinformatic tools. We introduce functional residue clustering (FRC), a method that performs hierarchical clustering of sequences using properties of functionally important residues and utilizes correlation co-efficient as a quantitative similarity measure to transfer in vivo substrate specificities to proteases. We show that the efficiency of transfer of substrate-specificity information using this method is generally high. FRC was also applied on Drosophila proteases to assign putative competitive inhibitor relationships (CIRs). Microarray gene-expression data were utilized to uncover a large-scale and dual involvement of proteases in development and in immune response. We found specific recruitment of SPHs and proteases with CLIP domains in immune response, suggesting evolution of a new function for SPHs. We also suggest existence of separate downstream protease cascades for immune response against bacterial/fungal infections and parasite/parasitoid infections. We verify quality of our annotations using information from RNAi screens and other evidence types. Utilization of such multi-fold approaches results in 10-fold increase of function annotation for Drosophila serine proteases and demonstrates value in increasing annotations in multiple genomes.",
author = "Shah, {Parantu K} and Tripathi, {Lokesh P} and Jensen, {Lars Juhl} and Murad Gahnim and Christopher Mason and Furlong, {Eileen E} and Veronica Rodrigues and White, {Kevin P} and Peer Bork and R Sowdhamini",
year = "2008",
doi = "10.1016/j.gene.2007.10.012",
language = "English",
volume = "407",
pages = "199--215",
journal = "Gene",
issn = "0378-1119",
publisher = "Elsevier",
number = "1-2",

}

RIS

TY - JOUR

T1 - Enhanced function annotations for Drosophila serine proteases

T2 - a case study for systematic annotation of multi-member gene families

AU - Shah, Parantu K

AU - Tripathi, Lokesh P

AU - Jensen, Lars Juhl

AU - Gahnim, Murad

AU - Mason, Christopher

AU - Furlong, Eileen E

AU - Rodrigues, Veronica

AU - White, Kevin P

AU - Bork, Peer

AU - Sowdhamini, R

PY - 2008

Y1 - 2008

AB - Systematically annotating function of enzymes that belong to large protein families encoded in a single eukaryotic genome is a very challenging task. We carried out such an exercise to annotate function for serine-protease family of the trypsin fold in Drosophila melanogaster, with an emphasis on annotating serine-protease homologues (SPHs) that may have lost their catalytic function. Our approach involves data mining and data integration to provide function annotations for 190 Drosophila gene products containing serine-protease-like domains, of which 35 are SPHs. This was accomplished by analysis of structure-function relationships, gene-expression profiles, large-scale protein-protein interaction data, literature mining and bioinformatic tools. We introduce functional residue clustering (FRC), a method that performs hierarchical clustering of sequences using properties of functionally important residues and utilizes correlation co-efficient as a quantitative similarity measure to transfer in vivo substrate specificities to proteases. We show that the efficiency of transfer of substrate-specificity information using this method is generally high. FRC was also applied on Drosophila proteases to assign putative competitive inhibitor relationships (CIRs). Microarray gene-expression data were utilized to uncover a large-scale and dual involvement of proteases in development and in immune response. We found specific recruitment of SPHs and proteases with CLIP domains in immune response, suggesting evolution of a new function for SPHs. We also suggest existence of separate downstream protease cascades for immune response against bacterial/fungal infections and parasite/parasitoid infections. We verify quality of our annotations using information from RNAi screens and other evidence types. Utilization of such multi-fold approaches results in 10-fold increase of function annotation for Drosophila serine proteases and demonstrates value in increasing annotations in multiple genomes.

U2 - 10.1016/j.gene.2007.10.012

DO - 10.1016/j.gene.2007.10.012

M3 - Journal article

C2 - 17996400

VL - 407

SP - 199

EP - 215

JO - Gene

JF - Gene

SN - 0378-1119

IS - 1-2

ER -
