TY - GEN
T1 - Discovering subtype specific n-gram motifs in class C GPCR N-termini
AU - König, Caroline
AU - Alquézar, René
AU - Vellido, Alfredo
AU - Giraldo, Jesús
N1 - Publisher Copyright: © 2017 The authors and IOS Press. All rights reserved.
PY - 2017
Y1 - 2017
N2 - G Protein-Coupled Receptors are a family of cell membrane proteins, and their class C is currently a major target for drug development. Their primary sequences are studied as a source of information for characterizing their behaviour. In previous research, different alignment-free sequence transformations were explored as the basis for the supervised discrimination of the various class C subtypes using Support Vector Machines. We also investigated an alignment-free sequence transformation based on n-gram protein motifs, under the hypothesis that the extracellular N-terminus domain of the sequences could suffice to retain most of the subtype-discrimination capability of the complete receptor, and that a parsimonious selection of n-grams would account for this classification success. The current study extends these results by investigating a different classification procedure that employs a one-subtype-vs-rest approach, shifting the focus towards selecting the sequence motifs that distinguish each class C subtype from all the others. The reported results indicate the adequacy of this new approach in terms of both discrimination ability and motif selection parsimony.
KW - Feature selection
KW - G Protein coupled receptors
KW - Pharmaco-proteomics
KW - Support vector machines
UR - http://www.scopus.com/inward/record.url?scp=85034267695&partnerID=8YFLogxK
U2 - 10.3233/978-1-61499-806-8-116
DO - 10.3233/978-1-61499-806-8-116
M3 - Other contribution
AN - SCOPUS:85034267695
T3 - Frontiers in Artificial Intelligence and Applications
ER -