-
The Genomic Landscape of Oceania
Authors:
Consuelo D. Quinto-Cortés,
Carmina Barberena Jonas,
Sofía Vieyra-Sánchez,
Stephen Oppenheimer,
Ram González-Buenfil,
Kathryn Auckland,
Kathryn Robson,
Tom Parks,
J. Víctor Moreno-Mayar,
Javier Blanco-Portillo,
Julian R. Homburger,
Genevieve L. Wojcik,
Alissa L. Severson,
Jonathan S. Friedlaender,
Francoise Friedlaender,
Angela Allen,
Stephen Allen,
Mark Stoneking,
Adrian V. S. Hill,
George Aho,
George Koki,
William Pomat,
Carlos D. Bustamante,
Maude Phipps,
Alexander J. Mentzer
, et al. (2 additional authors not shown)
Abstract:
Encompassing regions that were amongst the first inhabited by humans following the out-of-Africa expansion, hosting populations with the highest levels of archaic hominid introgression, and including Pacific islands that are the most isolated inhabited locations on the planet, Oceania has a rich, but understudied, human genomic landscape. Here we describe the first region-wide analysis of genome-wide data from population groups spanning Oceania and its surroundings, from island and peninsular southeast Asia to Papua New Guinea, east across the Pacific through Melanesia, Micronesia, and Polynesia, and west across the Indian Ocean to related island populations in the Andamans and Madagascar. In total we generate and analyze genome-wide data from 981 individuals from 92 different populations, 58 separate islands, and 30 countries, representing the most expansive study of Pacific genetics to date. In each sample we disentangle the Papuan and more recent Austronesian ancestries, which have admixed in various proportions across this region, using ancestry-specific analyses, and characterize the distinct patterns of settlement, migration, and archaic introgression separately in these two ancestries. We also focus on the patterns of clinically relevant genetic variation across Oceania--a landscape rippled with strong founder effects and island-specific genetic drift in allele frequencies--providing an atlas for the development of precision genetic health strategies in this understudied region of the world.
Submitted 15 May, 2024;
originally announced May 2024.
-
HyperFast: Instant Classification for Tabular Data
Authors:
David Bonet,
Daniel Mas Montserrat,
Xavier Giró-i-Nieto,
Alexander G. Ioannidis
Abstract:
Training deep learning models and performing hyperparameter tuning can be computationally demanding and time-consuming. Meanwhile, traditional machine learning methods like gradient-boosting algorithms remain the preferred choice for most tabular data applications, while neural network alternatives require extensive hyperparameter tuning or work only in toy datasets under limited settings. In this paper, we introduce HyperFast, a meta-trained hypernetwork designed for instant classification of tabular data in a single forward pass. HyperFast generates a task-specific neural network tailored to an unseen dataset that can be directly used for classification inference, removing the need for training a model. We report extensive experiments with OpenML and genomic data, comparing HyperFast to competing tabular data neural networks, traditional ML methods, AutoML systems, and boosting machines. HyperFast shows highly competitive results, while being significantly faster. Additionally, our approach demonstrates robust adaptability across a variety of classification tasks with little to no fine-tuning, positioning HyperFast as a strong solution for numerous applications and rapid model deployment. HyperFast introduces a promising paradigm for fast classification, with the potential to substantially decrease the computational burden of deep learning. Our code, which offers a scikit-learn-like interface, along with the trained HyperFast model, can be found at https://github.com/AI-sandbox/HyperFast.
Submitted 22 February, 2024;
originally announced February 2024.
-
Adversarial Learning for Feature Shift Detection and Correction
Authors:
Miriam Barrabes,
Daniel Mas Montserrat,
Margarita Geleta,
Xavier Giro-i-Nieto,
Alexander G. Ioannidis
Abstract:
Data shift is a phenomenon present in many real-world applications, and while there are multiple methods attempting to detect shifts, the task of localizing and correcting the features originating such shifts has not been studied in depth. Feature shifts can occur in many datasets, including in multi-sensor data, where some sensors are malfunctioning, or in tabular and structured data, including biomedical, financial, and survey data, where faulty standardization and data processing pipelines can lead to erroneous features. In this work, we explore using the principles of adversarial learning, where the information from several discriminators trained to distinguish between two distributions is used to both detect the corrupted features and fix them in order to remove the distribution shift between datasets. We show that mainstream supervised classifiers, such as random forest or gradient boosting trees, combined with simple iterative heuristics, can localize and correct feature shifts, outperforming current statistical and neural network-based techniques. The code is available at https://github.com/AI-sandbox/DataFix.
Submitted 7 December, 2023;
originally announced December 2023.
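The abstract above describes using discriminators trained to tell two datasets apart as a way to localize shifted features. As a rough illustration of that idea only, and not the authors' DataFix method, the sketch below fits a one-threshold discriminator (a decision stump) to each feature and flags any feature on which the stump separates the reference and query sets well above chance. The function names, the 0.6 accuracy threshold, and the synthetic Gaussian data are all assumptions made for this example; it does not attempt correction.

```python
import random


def stump_accuracy(ref_vals, qry_vals):
    """Best training accuracy of a single-threshold discriminator on one feature."""
    # Label reference values 0 and query values 1, then sort by value.
    labeled = sorted([(v, 0) for v in ref_vals] + [(v, 1) for v in qry_vals])
    n, n_qry = len(labeled), len(qry_vals)
    n_ref = n - n_qry
    best = max(n_ref, n_qry) / n  # baseline: always predict the majority set
    qry_left = 0
    for i in range(n - 1):
        qry_left += labeled[i][1]
        if labeled[i][0] == labeled[i + 1][0]:
            continue  # cannot place a threshold between equal values
        ref_left = (i + 1) - qry_left
        # Correct count when predicting "reference" left of the threshold
        # and "query" right of it; the mirrored rule gets n - correct.
        correct = ref_left + (n_qry - qry_left)
        best = max(best, correct / n, (n - correct) / n)
    return best


def localize_shifted_features(reference, query, threshold=0.6):
    """Return indices of features whose stump accuracy exceeds the threshold.

    The threshold is an illustrative assumption, not a value from the paper.
    """
    flagged = []
    for j in range(len(reference[0])):
        acc = stump_accuracy([row[j] for row in reference],
                             [row[j] for row in query])
        if acc > threshold:
            flagged.append(j)
    return flagged


# Synthetic data (assumption): four standard-normal features, with feature 2
# of the query set shifted by two standard deviations.
rng = random.Random(0)
reference = [[rng.gauss(0, 1) for _ in range(4)] for _ in range(400)]
query = [[rng.gauss(0, 1) for _ in range(4)] for _ in range(400)]
for row in query:
    row[2] += 2.0

flagged = localize_shifted_features(reference, query)
print(flagged)
```

A per-feature stump keeps the example dependency-free; it scores accuracy on the data it was fitted to, so a held-out split would be the safer choice with small samples. The abstract reports using stronger classifiers such as random forests or gradient boosting trees combined with iterative heuristics.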
-
Ancestry-specific analyses of genome-wide data confirm the settlement sequence of Polynesia
Authors:
Alexander G. Ioannidis,
Javier Blanco-Portillo,
Erika Hagelberg,
Juan Esteban Rodríguez-Rodríguez,
Keolu Fox,
Adrian V. S. Hill,
Carlos D. Bustamante,
Marcus W. Feldman,
Alexander J. Mentzer,
Andrés Moreno-Estrada
Abstract:
By demonstrating the role that historical population replacements and waves of admixture have played around the world, the genetics work of Reich and colleagues has provided a paradigm for understanding human history [Reich et al. 2009; Reich et al. 2012; Patterson et al. 2012]. Although we show in Ioannidis et al. [2021] that the peopling of Polynesia was a range expansion, and not, as suggested by Huang et al. [2022], yet another example of waves of admixture and large-scale gene flow between populations, we believe that our result in this recently settled oceanic expanse is the exception that proves the rule.
Submitted 6 December, 2022;
originally announced December 2022.
-
Ultra-Low-Power Superconductor Logic
Authors:
Quentin P. Herr,
Anna Y. Herr,
Oliver T. Oberg,
Alexander G. Ioannidis
Abstract:
We have developed a new superconducting digital technology, Reciprocal Quantum Logic, that uses AC power carried on a transmission line, which also serves as a clock. Using simple experiments we have demonstrated zero static power dissipation, thermally limited dynamic power dissipation, high clock stability, high operating margins, and a low bit error rate (BER). These features indicate that the technology is scalable to far more complex circuits at a significant level of integration. On the system level, Reciprocal Quantum Logic combines the high speed and low-power signal levels of Single-Flux-Quantum signals with the design methodology of CMOS, including low static power dissipation, low-latency combinational logic, and efficient device count.
Submitted 22 March, 2011;
originally announced March 2011.