TY - GEN
T1 - Automatic annotation of GPI structures using grid computing
AU - Aguilar-Bonavides, Clemente
AU - Cardenas, Gerardo A.
AU - Nakayasu, Ernesto S.
AU - Gazos-Lopes, Felipe
AU - Almeida, Igor C.
AU - Leung, Ming Ying
N1 - Copyright 2013 Elsevier B.V., All rights reserved.
PY - 2013
Y1 - 2013
N2 - Glycosylphosphatidylinositol (GPI)-anchored proteins are involved in many biological processes and are of medical importance. The identification and analysis of the entire collection of free and protein-linked GPIs within an organism (i.e., GPIomics) requires highly sensitive instruments. At present, liquid chromatography-tandem mass spectrometry (LC-MS/MS or -MSn) is the most efficient laboratory technique for these tasks. Because a typical MSn experiment produces hundreds of thousands of spectra, data analysis is a major bottleneck in high-throughput GPIomic projects. Yet no computational tool for characterizing the chemical structures of GPIs is available to date. We propose a library-search algorithm that identifies GPIs by matching fragment peaks in the spectra with molecular masses derived from a collection of theoretical GPI structures, constructed from the properties of all currently known GPIs. Each theoretically possible GPI structure is assessed by a scoring scheme that incorporates its fitness values for individual observed spectra as well as the frequency with which it is considered a good fit. The algorithm has been tested on a set of experimentally confirmed GPIs from the protozoan parasite Trypanosoma cruzi, the causative agent of Chagas disease. The final list of 4686 predicted GPI candidates contains 76 of the 78 known structures in the test set. Over 70% of the known structures have fitness values among the top 647. The first version of this tool runs on a single computer and produces results within 10 days. A second version has been developed using the Condor distributed high-throughput computing environment; for the same amount of data, results are obtained within 3 days, 19 hours, and 38 minutes. This computational tool is expected to accelerate the discovery and characterization of GPI molecules, increasing the number of experimentally confirmed GPI structures.
UR - http://www.scopus.com/inward/record.url?scp=84883619555&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84883619555&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:84883619555
SN - 9781622769711
T3 - 5th International Conference on Bioinformatics and Computational Biology 2013, BICoB 2013
SP - 219
EP - 224
BT - 5th International Conference on Bioinformatics and Computational Biology 2013, BICoB 2013
T2 - 5th International Conference on Bioinformatics and Computational Biology 2013, BICoB 2013
Y2 - 4 March 2013 through 6 March 2013
ER -