Predictability of gene ontology slim-terms from primary structure information in Embryophyta plant proteins

Jorge Alberto Jaramillo-Garzón, Joan Josep Gallardo-Chacón, César Germán Castellanos-Domínguez, Alexandre Perera-Lluna

Research output: Contribution to journal > Article > Research > Peer-reviewed

10 Citations (Scopus)

Abstract

Background: Proteins are the key elements on the path from genetic information to the development of life. The roles played by different proteins are difficult to uncover experimentally, because doing so involves complex procedures such as genetic modification, injection of fluorescent proteins, and gene knock-out methods. The knowledge gained about each protein is usually annotated in databases through different methods, such as that proposed by the Gene Ontology (GO) Consortium. Several methods have been proposed to predict GO terms from primary structure information, but very few are available for large-scale functional annotation of plants, and their reported success rates are much lower than those reported for non-plant predictors. This paper explores the predictability of GO annotations for proteins belonging to the Embryophyta group from a set of features extracted solely from their primary amino acid sequence.

Results: High predictability of several GO terms was found for Molecular Function and Cellular Component. As expected, a lower degree of predictability was found for Biological Process ontology annotations, although a few biological processes were easily predicted. Proteins related to transport and transcription were particularly well predicted from primary structure information. The most discriminant features for prediction were those related to the electric charge of the amino acid sequence and hydropathicity-derived features.

Conclusions: An analysis of the predictability of GO-slim terms in plants was carried out in order to determine the single categories or groups of functions that are most closely related to primary structure information. For each highly predictable GO term, the features responsible for that success were identified and discussed. In contrast to most published studies, which focus on a few categories or single ontologies, the results in this paper comprise a complete landscape of GO predictability from primary structure, encompassing 75 GO terms at the molecular, cellular and phenotypic levels. Thus, the paper provides a valuable guide for researchers interested in further advances in protein function prediction in Embryophyta plants. © 2013 Jaramillo-Garzón et al.; licensee BioMed Central Ltd.

Original language: English
Article number: 68
Journal: BMC Bioinformatics
Volume: 14
Publication status: Published - 26 Feb 2013
