TY - JOUR
T1 - Critical assessment of protein intrinsic disorder prediction
AU - Necci, Marco
AU - Piovesan, Damiano
AU - Hoque, Md Tamjidul
AU - Walsh, Ian
AU - Iqbal, Sumaiya
AU - Vendruscolo, Michele
AU - Sormanni, Pietro
AU - Wang, Chen
AU - Raimondi, Daniele
AU - Sharma, Ronesh
AU - Zhou, Yaoqi
AU - Litfin, Thomas
AU - Galzitskaya, Oxana Valerianovna
AU - Lobanov, Michail Yu
AU - Vranken, Wim
AU - Wallner, Björn
AU - Mirabello, Claudio
AU - Malhis, Nawar
AU - Dosztányi, Zsuzsanna
AU - Erdős, Gábor
AU - Mészáros, Bálint
AU - Gao, Jianzhao
AU - Wang, Kui
AU - Hu, Gang
AU - Wu, Zhonghua
AU - Sharma, Alok
AU - Hanson, Jack
AU - Paliwal, Kuldip
AU - Callebaut, Isabelle
AU - Bitard-Feildel, Tristan
AU - Orlando, Gabriele
AU - Peng, Zhenling
AU - Xu, Jinbo
AU - Wang, Sheng
AU - Jones, David T.
AU - Cozzetto, Domenico
AU - Meng, Fanchi
AU - Yan, Jing
AU - Gsponer, Jörg
AU - Cheng, Jianlin
AU - Wu, Tianqi
AU - Kurgan, Lukasz
AU - Promponas, Vasilis J.
AU - Tamana, Stella
AU - Marino-Buslje, Cristina
AU - Martínez-Pérez, Elizabeth
AU - Chasapi, Anastasia
AU - Ouzounis, Christos
AU - Dunker, A. Keith
AU - Ventura, Salvador
N1 - Funding Information:
Development of the predictors was supported in part by the National Science Foundation (grant no. 1617369), Natural Sciences and Engineering Research Council of Canada (grant no. 298328), Tianjin Municipal Science and Technology Commission (grant no. 13ZCZDGX01099), National Natural Science Foundation of China (grant nos. 31970649 and 11701296), Natural Science Foundation of Tianjin (grant no. 18JCYBJC24900), Japan Agency for Medical Research and Development (grant no. 16cm0106320h0001), Australian Research Council (no. DP180102060), Research Foundation Flanders (project no. G.0328.16N) and Agence Nationale de la Recherche (nos. ANR-14-CE10-0021 and ANR-17-CE12-0016). O.V.T. and M.L. carried out this work as part of the state task “Bioinformatics and proteomics studies of proteins and their complexes” (no. 0115-2019-004). Z.D. acknowledges funding from the ELTE Thematic Excellence Programme (no. ED-18-1-2019-0030), supported by the Hungarian Ministry for Innovation and Technology, and the ‘Lendület’ grant from the Hungarian Academy of Sciences (no. LP2014-18). P.T. acknowledges the Hungarian Scientific Research Fund (grant nos. K124670 and K131702). This project received funding from the European Union’s Horizon 2020 research and innovation program under Marie Skłodowska-Curie grant agreement no. 778247, the Italian Ministry of University and Research (PRIN 2017, grant no. 2017483NH8) and ELIXIR, the European infrastructure for biological data.
Publisher Copyright:
© 2021, The Author(s), under exclusive licence to Springer Nature America, Inc.
PY - 2021/5
Y1 - 2021/5
N2 - Intrinsically disordered proteins, defying the traditional protein structure–function paradigm, are a challenge to study experimentally. Because a large part of our knowledge rests on computational predictions, it is crucial that their accuracy is high. The Critical Assessment of protein Intrinsic Disorder prediction (CAID) experiment was established as a community-based blind test to determine the state of the art in prediction of intrinsically disordered regions and the subset of residues involved in binding. A total of 43 methods were evaluated on a dataset of 646 proteins from DisProt. The best methods use deep learning techniques and notably outperform physicochemical methods. The top disorder predictor has Fmax = 0.483 on the full dataset and Fmax = 0.792 following filtering out of bona fide structured regions. Disordered binding regions remain hard to predict, with Fmax = 0.231. Interestingly, computing times among methods can vary by up to four orders of magnitude.
UR - http://www.scopus.com/inward/record.url?scp=85104988931&partnerID=8YFLogxK
U2 - https://doi.org/10.1038/s41592-021-01117-3
DO - https://doi.org/10.1038/s41592-021-01117-3
M3 - Article
C2 - 33875885
AN - SCOPUS:85104988931
SN - 1548-7091
VL - 18
SP - 472
EP - 481
JO - Nature Methods
JF - Nature Methods
IS - 5
ER -