TY - JOUR
T1 - The extracellular N-terminal domain suffices to discriminate class C G Protein-Coupled Receptor subtypes from n-grams of their sequences
AU - König, Caroline
AU - Alquézar, René
AU - Vellido, Alfredo
AU - Giraldo, Jesús
N1 - Publisher Copyright: © 2015 IEEE.
PY - 2015/9/28
Y1 - 2015/9/28
AB - The investigation of protein functionality often relies on knowledge of the crystal 3-D structure. This structure is not always known or easily unravelled, as is the case for eukaryotic cell membrane proteins such as G Protein-Coupled Receptors (GPCRs), and especially for those of class C, which are the target of the current study. In the absence of information about tertiary or quaternary structures, functionality can be investigated from the primary structure, that is, from the amino acid sequence. In previous research, we found that the different subtypes of class C GPCRs could be discriminated with a high level of accuracy from the n-gram transformation of their complete primary sequences, using a method that combined two-stage feature selection with kernel classifiers. This study aims to discover whether subunits of the complete sequence retain such discrimination capabilities. We report experiments showing that the extracellular N-terminal domain of the receptor suffices to retain the classification accuracy of the complete sequence, and that it does so using a reduced selection of n-grams whose length of up to five amino acids opens up an avenue for class C GPCR signature motif discovery.
KW - Accuracy
KW - Proteins
UR - http://www.scopus.com/inward/record.url?scp=84951014098&partnerID=8YFLogxK
U2 - 10.1109/IJCNN.2015.7280613
DO - 10.1109/IJCNN.2015.7280613
M3 - Article
AN - SCOPUS:84951014098
JO - Proceedings of the International Joint Conference on Neural Networks
JF - Proceedings of the International Joint Conference on Neural Networks
ER -