The Qatar Genome: A Population-Specific Tool for Precision Medicine in the Middle East
Authors:
Khalid A. Fakhro,
Michelle R. Staudt,
Monica Denise Ramstetter,
Amal Robay,
Joel A. Malek,
Ramin Badii,
Ajayeb Al-Nabet Al-Marri,
Charbel Abi Khalil,
Alya Al-Shakaki,
Omar Chidiac,
Dora Stadler,
Mahmoud Zirie,
Amin Jayyousi,
Jacqueline Salit,
Jason G. Mezey,
Ronald G. Crystal,
Juan L. Rodriguez-Flores
Abstract:
Reaching the full potential of precision medicine depends on the quality of personalized genome interpretation. In order to facilitate precision medicine in regions of the Middle East and North Africa (MENA), a population-specific reference genome for the indigenous Arab population of Qatar (QTRG) was constructed by incorporating allele frequency data from sequencing of 1,161 Qataris, representing 0.4% of the population. A total of 20.9 million SNPs and 3.1 million indels were observed in Qatar, including an average of 1.79% novel variants per individual genome. Replacement of the GRCh37 standard reference with QTRG in a best practices genome analysis workflow resulted in an average of 7X deeper coverage depth (an improvement of 23%), and 756,671 fewer variants on average, a reduction of 16% that is attributed to common Qatari alleles being present in the QTRG reference. The benefit of using QTRG varies across ancestries, a factor that should be taken into consideration when selecting an appropriate reference for analysis.
Submitted 13 May, 2018; v1 submitted 8 May, 2018;
originally announced May 2018.
Reconstructing Native American Migrations from Whole-genome and Whole-exome Data
Authors:
Simon Gravel,
Fouad Zakharia,
Andres Moreno-Estrada,
Jake K Byrnes,
Marina Muzzio,
Juan L. Rodriguez-Flores,
Eimear E. Kenny,
Christopher R. Gignoux,
Brian K. Maples,
Wilfried Guiblet,
Julie Dutil,
Marc Via,
Karla Sandoval,
Gabriel Bedoya,
Taras K Oleksyk,
Andres Ruiz-Linares,
Esteban G Burchard,
Juan Carlos Martinez-Cruzado,
Carlos D. Bustamante,
The 1000 Genomes Project
Abstract:
There is great scientific and popular interest in understanding the genetic history of populations in the Americas. We wish to understand when different regions of the continent were inhabited, where settlers came from, and how current inhabitants relate genetically to earlier populations. Recent studies unraveled parts of the genetic history of the continent using genotyping arrays and uniparental markers. The 1000 Genomes Project provides a unique opportunity for improving our understanding of population genetic history by providing over a hundred sequenced low coverage genomes and exomes from Colombian (CLM), Mexican-American (MXL), and Puerto Rican (PUR) populations. Here, we explore the genomic contributions of African, European, and Native American ancestry to these populations. Estimated Native American ancestry is 48% in MXL, 25% in CLM, and 13% in PUR. Native American ancestry in PUR is most closely related to populations surrounding the Orinoco River basin, confirming the Southern America ancestry of the Taíno people of the Caribbean. We present new methods to estimate the allele frequencies in the Native American fraction of the populations, and model their distribution using a demographic model for three ancestral Native American populations. These ancestral populations likely split in close succession: the most likely scenario, based on a peopling of the Americas 16 thousand years ago (kya), supports that the MXL ancestors split 12.2 kya, with a subsequent split of the ancestors to CLM and PUR 11.7 kya. The model also features effective populations of 62,000 in Mexico, 8,700 in Colombia, and 1,900 in Puerto Rico. Modeling identity-by-descent and ancestry tract length, we show that post-contact populations differ markedly in their effective sizes and migration patterns, with Puerto Rico showing the smallest effective size and the earlier migration from Europe.
Submitted 15 November, 2013; v1 submitted 17 June, 2013;
originally announced June 2013.