The influence of alignment-free sequence representations on the semi-supervised classification of class C G protein-coupled receptors: Semi-supervised classification of class C GPCRs

Raúl Cruz-Barbosa, Alfredo Vellido, Jesús Giraldo

Research output: Contribution to journal › Article › Research › peer-review

9 Citations (Scopus)

Abstract

© 2014, International Federation for Medical and Biological Engineering. G protein-coupled receptors (GPCRs) are integral cell membrane proteins of relevance for pharmacology. The tertiary structure of the transmembrane domain, a gate to the study of protein functionality, is unknown for almost all members of class C GPCRs, which are the target of the current study. As a result, their investigation must often rely on alignments of their amino acid sequences. Sequence alignment entails the risk of missing relevant information. Various approaches have attempted to circumvent this risk through alignment-free transformations of the sequences on the basis of different amino acid physicochemical properties. In this paper, we use several of these alignment-free methods, as well as a basic amino acid composition representation, to transform the available sequences. Novel semi-supervised statistical machine learning methods are then used to discriminate the different class C GPCR types from the transformed data. This approach is relevant due to the existence of orphan proteins, to which type labels should be assigned in a process of deorphanization or reverse pharmacology. The reported experiments show that the proposed techniques provide accurate classification even in settings of extreme class-label scarcity, and that fair accuracy can be achieved even with very simple transformation strategies that ignore the sequence ordering.
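As an illustration of the simplest transformation mentioned in the abstract, the following is a minimal sketch of an amino acid composition representation: each sequence is mapped to a 20-dimensional vector of relative residue frequencies, which by construction ignores the ordering of the sequence. The function name `amino_acid_composition`, the alphabet ordering, and the choice to skip non-standard symbols are assumptions made for this example, not details taken from the paper.

```python
from collections import Counter

# The 20 standard amino acids, in a fixed (alphabetical one-letter) order.
# The ordering is an arbitrary choice made for this sketch.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(sequence):
    """Return the relative frequency of each standard amino acid.

    Symbols outside the 20-letter alphabet (e.g. 'X' or gaps) are skipped,
    which is an assumption of this example.
    """
    seq = sequence.upper()
    counts = Counter(ch for ch in seq if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    # One frequency per amino acid; residue order in the sequence is ignored.
    return [counts[aa] / total for aa in AMINO_ACIDS]


if __name__ == "__main__":
    # Two toy sequences with the same residues in a different order
    # produce the same feature vector.
    print(amino_acid_composition("AACD") == amino_acid_composition("DCAA"))
```

Vectors of this kind can be passed to any classifier that accepts fixed-length numeric features; the semi-supervised methods used in the paper are not reproduced here.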
Original language: English
Pages (from-to): 137-149
Journal: Medical and Biological Engineering and Computing
Volume: 53
Issue number: 2
Publication status: Published - 1 Jan 2015

Keywords

  • Alignment-free sequence representations
  • Class C G protein-coupled receptors
  • Semi-supervised learning
