TY - JOUR
T1 - Detecting remotely related proteins by their interactions and sequence similarity
AU - Espadaler, Jordi
AU - Eswar, Narayanan
AU - Avilés, Francesc X.
AU - Sali, Andrej
AU - Aragüés, Ramón
AU - Oliva, Baldomero
AU - Marti-Renom, Marc A.
AU - Querol, Enrique
PY - 2005/5/17
Y1 - 2005/5/17
N2 - The function of an uncharacterized protein is usually inferred either from its homology to, or its interactions with, characterized proteins. Here, we use both sequence similarity and protein interactions to identify relationships between remotely related protein sequences. We rely on the fact that homologous sequences share similar interactions, and, therefore, the set of interacting partners of the partners of a given protein is enriched by its homologs. The approach was benchmarked by assigning the fold and functional family to test sequences of known structure. Specifically, we relied on 1,434 proteins with known folds, as defined in the Structural Classification of Proteins (SCOP) database, and with known interacting partners, as defined in the Database of Interacting Proteins (DIP). For this subset, the specificity of fold assignment was increased from 54% for position-specific iterative BLAST to 75% for our approach, with a concomitant increase in sensitivity for a few percentage points. Similarly, the specificity of family assignment at the e-value threshold of 10⁻⁸ was increased from 70% to 87%. The proposed method would be a useful tool for large-scale automated discovery of remote relationships between protein sequences, given its unique reliance on sequence similarity and protein-protein interactions. © 2005 by The National Academy of Sciences of the USA.
AB - The function of an uncharacterized protein is usually inferred either from its homology to, or its interactions with, characterized proteins. Here, we use both sequence similarity and protein interactions to identify relationships between remotely related protein sequences. We rely on the fact that homologous sequences share similar interactions, and, therefore, the set of interacting partners of the partners of a given protein is enriched by its homologs. The approach was benchmarked by assigning the fold and functional family to test sequences of known structure. Specifically, we relied on 1,434 proteins with known folds, as defined in the Structural Classification of Proteins (SCOP) database, and with known interacting partners, as defined in the Database of Interacting Proteins (DIP). For this subset, the specificity of fold assignment was increased from 54% for position-specific iterative BLAST to 75% for our approach, with a concomitant increase in sensitivity for a few percentage points. Similarly, the specificity of family assignment at the e-value threshold of 10⁻⁸ was increased from 70% to 87%. The proposed method would be a useful tool for large-scale automated discovery of remote relationships between protein sequences, given its unique reliance on sequence similarity and protein-protein interactions. © 2005 by The National Academy of Sciences of the USA.
KW - Family assignment
KW - Remote homology
KW - Fold assignment
KW - Protein function annotation
KW - Protein-protein interactions
UR - https://dialnet.unirioja.es/servlet/articulo?codigo=1179438
UR - https://www.scopus.com/pages/publications/18844431588
U2 - 10.1073/pnas.0500831102
DO - 10.1073/pnas.0500831102
M3 - Article
SN - 0027-8424
VL - 102
SP - 7151
EP - 7156
JO - Proceedings of the National Academy of Sciences of the United States of America
JF - Proceedings of the National Academy of Sciences of the United States of America
ER -