Motivation: Polymorphism studies are one of the main research areas of this genomic era. To date, however, no comprehensive secondary databases have been designed to provide searchable collections of polymorphic sequences with their associated diversity measures.

Results: We define a data model for the storage, representation and analysis of genotypic and haplotypic data. Under this model we have created DPDB, the 'Drosophila Polymorphism Database', a web site that provides a daily-updated repository of all well-annotated polymorphic sequences in the Drosophila genus. It allows searching for any polymorphic set according to different parameter values of nucleotide diversity, linkage disequilibrium and codon bias. For data collection, analysis and updating we use PDA, a pipeline that automates the process of sequence retrieval, grouping, alignment and estimation of nucleotide diversity from GenBank sequences in different functional regions. The web site also includes analysis tools for sequence comparison and the estimation of genetic diversity, a page with real-time statistics of the database contents, a help section and a collection of selected links.

© The Author 2005. Published by Oxford University Press. All rights reserved.
Publication status: Published, 1 Sep 2005