Discrimination of fish populations using parasites: Random Forests on a predictable host-parasite system

A. Pérez-Del-Olmo, F. E. Montero, M. Fernández, J. Barrett, J. A. Raga, A. Kostadinova

Research output: Contribution to journal › Article › Research › peer-review

10 Citations (Scopus)


We address the effect of spatial scale and temporal variation on model generality when forming predictive models for fish assignment using a new data mining approach, Random Forests (RF), to variable biological markers (parasite community data). Models were implemented for a fish host-parasite system sampled along the Mediterranean and Atlantic coasts of Spain and were validated using independent datasets. We considered 2 basic classification problems in evaluating the importance of variations in parasite infracommunities for assignment of individual fish to their populations of origin: multiclass (2-5 population models, using 2 seasonal replicates from each of the populations) and 2-class task (using 4 seasonal replicates from 1 Atlantic and 1 Mediterranean population each). The main results are that (i) RF are well suited for multiclass population assignment using parasite communities in non-migratory fish; (ii) RF provide an efficient means for model cross-validation on the baseline data and this allows sample size limitations in parasite tag studies to be tackled effectively; (iii) the performance of RF is dependent on the complexity and spatial extent/configuration of the problem; and (iv) the development of predictive models is strongly influenced by seasonal change and this stresses the importance of both temporal replication and model validation in parasite tagging studies. © Cambridge University Press 2010.
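As an illustration only, the workflow summarised in the abstract (fit an ensemble on baseline data, estimate error from out-of-bag samples, then check the model against an independent sample) can be sketched in Python. This is not the authors' implementation: the parasite counts are synthetic, the population means and helper names such as `make_fish` and `fit_stump` are invented for the example, and single-split stumps stand in for the full trees of Random Forests, so the sketch covers only the 2-class task.

```python
import random
from collections import Counter

def make_fish(rng, n, means, label):
    """Synthetic fish: one non-negative count per parasite taxon (assumed data)."""
    return [([max(0, round(rng.gauss(m, 1.0))) for m in means], label)
            for _ in range(n)]

def fit_stump(rows, rng, n_try):
    """Best single split (feature, threshold) among n_try randomly chosen features."""
    best = None
    for f in rng.sample(range(len(rows[0][0])), n_try):
        for t in sorted({x[f] for x, _ in rows}):
            left = Counter(y for x, y in rows if x[f] <= t)
            right = Counter(y for x, y in rows if x[f] > t)
            if not left or not right:
                continue  # this threshold leaves one side empty
            (l_lab, l_n), (r_lab, r_n) = left.most_common(1)[0], right.most_common(1)[0]
            errors = len(rows) - l_n - r_n
            if best is None or errors < best[0]:
                best = (errors, f, t, l_lab, r_lab)
    if best is None:  # no usable split: always predict the majority class
        majority = Counter(y for _, y in rows).most_common(1)[0][0]
        return (0, float("inf"), majority, majority)
    return best[1:]

def predict_stump(stump, x):
    f, t, left_label, right_label = stump
    return left_label if x[f] <= t else right_label

def fit_forest(rows, n_trees, rng, n_try):
    """Bagging: each stump is trained on a bootstrap resample of the baseline."""
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]
        forest.append((fit_stump([rows[i] for i in idx], rng, n_try), set(idx)))
    return forest

def predict_forest(forest, x):
    """Majority vote over all stumps."""
    return Counter(predict_stump(s, x) for s, _ in forest).most_common(1)[0][0]

def oob_accuracy(forest, rows):
    """Score each fish using only the stumps that never saw it during training."""
    correct = total = 0
    for i, (x, y) in enumerate(rows):
        votes = Counter(predict_stump(s, x) for s, used in forest if i not in used)
        if votes:
            total += 1
            correct += votes.most_common(1)[0][0] == y
    return correct / total if total else float("nan")

rng = random.Random(0)
# Assumed mean counts for three hypothetical parasite taxa; the third is uninformative.
baseline = (make_fish(rng, 40, [6, 1, 3], "Atlantic")
            + make_fish(rng, 40, [1, 5, 3], "Mediterranean"))
independent = (make_fish(rng, 20, [6, 1, 3], "Atlantic")
               + make_fish(rng, 20, [1, 5, 3], "Mediterranean"))

forest = fit_forest(baseline, 25, rng, n_try=2)
oob = oob_accuracy(forest, baseline)
holdout = sum(predict_forest(forest, x) == y for x, y in independent) / len(independent)
print(f"out-of-bag accuracy: {oob:.2f}, independent-sample accuracy: {holdout:.2f}")
```

The out-of-bag estimate uses only the baseline sample, which loosely corresponds to result (ii) in the abstract; the separate `independent` sample plays the role of the independent validation datasets.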
Original language: English
Pages (from-to): 1833-1847
Issue number: 12
Publication status: Published - 1 Oct 2010


  • Boops boops
  • fish population discrimination
  • Mediterranean
  • North-East Atlantic
  • parasites as tags
  • predictive models
  • Random Forests

