TY - JOUR
T1 - Explaining the genetic basis of complex quantitative traits through prediction models
AU - Luaces, Oscar
AU - Quevedo, José R.
AU - Pérez-Enciso, Miguel
AU - Díez, Jorge
AU - Del Coz, Juan José
AU - Bahamonde, Antonio
PY - 2010/12/1
Y1 - 2010/12/1
N2 - The functional characterization of genes involved in many complex traits (phenotypes) of plants, animals, or humans can be studied from a computational point of view using different tools. We propose prediction, from the machine learning point of view, to search for the genetic basis of these traits. However, predicting the exact value of a phenotype can be too difficult to yield a reliable model, whereas predicting an approximation, in the form of an interval of values, can be easier. We shall see that trustworthy and useful models can be obtained from this relaxed formulation. These predictors may be built as extensions of conventional classifiers or regressors. Although the prediction performance in both cases is similar, we show that, from the classification field, it is straightforward to obtain a principled and scalable method to select a reduced set of features in these genetic learning tasks. We conclude by comparing the results achieved on a real-world data set of barley plants with those obtained with state-of-the-art methods used in the biological literature. © Mary Ann Liebert, Inc.
AB - The functional characterization of genes involved in many complex traits (phenotypes) of plants, animals, or humans can be studied from a computational point of view using different tools. We propose prediction, from the machine learning point of view, to search for the genetic basis of these traits. However, predicting the exact value of a phenotype can be too difficult to yield a reliable model, whereas predicting an approximation, in the form of an interval of values, can be easier. We shall see that trustworthy and useful models can be obtained from this relaxed formulation. These predictors may be built as extensions of conventional classifiers or regressors. Although the prediction performance in both cases is similar, we show that, from the classification field, it is straightforward to obtain a principled and scalable method to select a reduced set of features in these genetic learning tasks. We conclude by comparing the results achieved on a real-world data set of barley plants with those obtained with state-of-the-art methods used in the biological literature. © Mary Ann Liebert, Inc.
KW - algorithms
KW - machine learning
U2 - 10.1089/cmb.2009.0161
DO - 10.1089/cmb.2009.0161
M3 - Article
SN - 1066-5277
VL - 17
SP - 1711
EP - 1723
JO - Journal of Computational Biology
JF - Journal of Computational Biology
IS - 12
ER -