TY - JOUR
T1 - GCAT|Panel, a comprehensive structural variant haplotype map of the Iberian population from high-coverage whole-genome sequencing
AU - Puig Font, Marta
AU - Cáceres Aguilar, Mario
AU - Valls-Margarit, Jordi
AU - Galván-Femenía, Iván
AU - Matías-Sánchez, Daniel
AU - Blay, Natalia
AU - Puiggròs, Montserrat
AU - Carreras, Anna
AU - Salvoro, Cecilia
AU - Cortés, Beatriz
AU - Amela, Ramon
AU - Farre, Xavier
AU - Sánchez-Herrero, Jose Francisco
AU - Moreno Aguado, Víctor
AU - Perucho, Manuel
AU - Sumoy, Lauro
AU - Armengol, Lluís
AU - Delaneau, Olivier
AU - de Cid, Rafael
AU - Torrents, David
PY - 2022
Y1 - 2022
N2 - The combined analysis of haplotype panels with phenotype clinical cohorts is a common approach to exploring the genetic architecture of human diseases. However, genetic studies are mainly based on single nucleotide variants (SNVs) and small insertions and deletions (indels). Here, we help fill this gap by generating a dense haplotype map focused on the identification, characterization, and phasing of structural variants (SVs). By integrating multiple variant identification methods and Logistic Regression Models (LRMs), we present a catalogue of 35 431 441 variants, including 89 178 SVs (≥50 bp), 30 325 064 SNVs and 5 017 199 indels, across 785 Illumina high-coverage (30x) whole genomes from the Iberian GCAT Cohort, containing a median of 3.52M SNVs, 606 336 indels and 6393 SVs per individual. The haplotype panel is able to impute up to 14 360 728 SNVs/indels and 23 179 SVs, a 2.7-fold increase for SVs compared with available genetic variation panels. The value of this panel for SV analysis is shown through an imputed rare Alu element located in a new locus associated with mononeuritis of the lower limb, a rare neuromuscular disease. This study represents the first deep characterization of genetic variation within the Iberian population and the first operational haplotype panel to systematically incorporate SVs into genome-wide genetic studies.
AB - The combined analysis of haplotype panels with phenotype clinical cohorts is a common approach to exploring the genetic architecture of human diseases. However, genetic studies are mainly based on single nucleotide variants (SNVs) and small insertions and deletions (indels). Here, we help fill this gap by generating a dense haplotype map focused on the identification, characterization, and phasing of structural variants (SVs). By integrating multiple variant identification methods and Logistic Regression Models (LRMs), we present a catalogue of 35 431 441 variants, including 89 178 SVs (≥50 bp), 30 325 064 SNVs and 5 017 199 indels, across 785 Illumina high-coverage (30x) whole genomes from the Iberian GCAT Cohort, containing a median of 3.52M SNVs, 606 336 indels and 6393 SVs per individual. The haplotype panel is able to impute up to 14 360 728 SNVs/indels and 23 179 SVs, a 2.7-fold increase for SVs compared with available genetic variation panels. The value of this panel for SV analysis is shown through an imputed rare Alu element located in a new locus associated with mononeuritis of the lower limb, a rare neuromuscular disease. This study represents the first deep characterization of genetic variation within the Iberian population and the first operational haplotype panel to systematically incorporate SVs into genome-wide genetic studies.
U2 - 10.1093/nar/gkac076
DO - 10.1093/nar/gkac076
M3 - Article
C2 - 35176773
SN - 1362-4962
VL - 50
SP - 2464
EP - 2479
JO - Nucleic Acids Research
JF - Nucleic Acids Research
IS - 5
ER -