-
Butson Hadamard matrices, bent sequences, and spherical codes
Authors:
Minjia Shi,
Danni Lu,
Andrés Armario,
Ronan Egan,
Ferruh Ozbudak,
Patrick Solé
Abstract:
We explore a notion of bent sequence attached to the data consisting of a Hadamard matrix of order $n$ defined over the complex $q^{\rm th}$ roots of unity, an eigenvalue of that matrix, and a Galois automorphism from the cyclotomic field of order $q$. In particular, we construct self-dual bent sequences for various $q\le 60$ and lengths $n\le 21$. Computational construction methods comprise the resolution of polynomial systems by Groebner bases and eigenspace computations. Infinite families can be constructed from regular Hadamard matrices, Bush-type Hadamard matrices, and generalized Boolean bent functions. As an application, we estimate the covering radius of the code attached to that matrix over $\mathbb{Z}_q$. We derive a lower bound on that quantity for the Chinese Euclidean metric when bent sequences exist. We give the Euclidean distance spectrum, and bound above the covering radius of an attached spherical code, depending on its strength as a spherical design.
Submitted 1 November, 2023;
originally announced November 2023.
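For orientation on the objects named in the abstract above: a Butson Hadamard matrix of order $n$ over the $q^{\rm th}$ roots of unity is an $n\times n$ matrix $H$ whose entries are all $q^{\rm th}$ roots of unity and which satisfies $HH^*=nI$. The sketch below checks that definition numerically on the Fourier matrix, a standard example. It is only an illustration, not the authors' construction; the function names `fourier_matrix` and `is_butson_hadamard` and the tolerance `tol` are assumptions introduced here.

```python
import cmath


def fourier_matrix(n):
    """Return the n x n Fourier matrix with entries w^(j*k), w = exp(2*pi*i/n)."""
    w = cmath.exp(2j * cmath.pi / n)
    return [[w ** (j * k) for k in range(n)] for j in range(n)]


def is_butson_hadamard(H, q, tol=1e-9):
    """Check numerically that H has q-th root of unity entries and H H^* = n I.

    Hypothetical helper for illustration; tol is an assumed floating-point tolerance.
    """
    n = len(H)
    # Every entry must be a q-th root of unity.
    for row in H:
        for x in row:
            if abs(x ** q - 1) > tol:
                return False
    # Rows must be pairwise orthogonal under the Hermitian inner product,
    # and each row must have squared norm n.
    for a in range(n):
        for b in range(n):
            inner = sum(H[a][k] * H[b][k].conjugate() for k in range(n))
            expected = n if a == b else 0
            if abs(inner - expected) > tol:
                return False
    return True
```

For example, `is_butson_hadamard(fourier_matrix(3), 3)` checks the order-3 Fourier matrix over the cube roots of unity, while the all-ones 2 x 2 matrix fails the orthogonality test.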
-
Butson full propelinear codes
Authors:
José Andrés Armario,
Ivan Bailera,
Ronan Egan
Abstract:
In this paper we study Butson Hadamard matrices, and codes over finite rings coming from these matrices in logarithmic form, called BH-codes. We introduce a new morphism of Butson Hadamard matrices through a generalized Gray map on the matrices in logarithmic form, which is comparable to the morphism given in a recent note of Ó Catháin and Swartz. That is, we show how, if given a Butson Hadamard matrix over the $k^{\rm th}$ roots of unity, we can construct a larger Butson matrix over the $\ell^{\rm th}$ roots of unity for any $\ell$ dividing $k$, provided that any prime $p$ dividing $k$ also divides $\ell$.
We prove that a $\mathbb{Z}_{p^s}$-additive code with $p$ a prime number is isomorphic as a group to a BH-code over $\mathbb{Z}_{p^s}$ and the image of this BH-code under the Gray map is a BH-code over $\mathbb{Z}_p$ (binary Hadamard code for $p=2$). Further, we investigate the inherent propelinear structure of these codes (and their images) when the Butson matrix is cocyclic. Some structural properties of these codes are studied and examples are provided.
Submitted 27 November, 2020; v1 submitted 13 October, 2020;
originally announced October 2020.
-
The Parallelism Motifs of Genomic Data Analysis
Authors:
Katherine Yelick,
Aydin Buluc,
Muaaz Awan,
Ariful Azad,
Benjamin Brock,
Rob Egan,
Saliya Ekanayake,
Marquita Ellis,
Evangelos Georganas,
Giulia Guidi,
Steven Hofmeyr,
Oguz Selvitopi,
Cristina Teodoropol,
Leonid Oliker
Abstract:
Genomic data sets are growing dramatically as the cost of sequencing continues to decline and small sequencing devices become available. Enormous community databases store and share this data with the research community, but some of these genomic data analysis problems require large-scale computational platforms to meet both the memory and computational requirements. These applications differ from the scientific simulations that dominate the workload on high-end parallel systems today and place different requirements on programming support, software libraries, and parallel architectural design. For example, they involve irregular communication patterns such as asynchronous updates to shared data structures. We consider several problems in high-performance genomics analysis, including alignment, profiling, clustering, and assembly for both single genomes and metagenomes. We identify some of the common computational patterns or motifs that help inform parallelization strategies and compare our motifs to some of the established lists, arguing that at least two key patterns, sorting and hashing, are missing.
Submitted 20 January, 2020;
originally announced January 2020.
-
Extreme Scale De Novo Metagenome Assembly
Authors:
Evangelos Georganas,
Rob Egan,
Steven Hofmeyr,
Eugene Goltsman,
Bill Arndt,
Andrew Tritt,
Aydin Buluc,
Leonid Oliker,
Katherine Yelick
Abstract:
Metagenome assembly is the process of transforming a set of short, overlapping, and potentially erroneous DNA segments from environmental samples into the accurate representation of the underlying microbiomes' genomes. State-of-the-art tools require big shared-memory machines and cannot handle contemporary metagenome datasets that exceed terabytes in size. In this paper, we introduce the MetaHipMer pipeline, a high-quality and high-performance metagenome assembler that employs an iterative de Bruijn graph approach. MetaHipMer leverages a specialized scaffolding algorithm that produces long scaffolds and accommodates the idiosyncrasies of metagenomes. MetaHipMer is end-to-end parallelized using the Unified Parallel C language and therefore can run seamlessly on shared and distributed-memory systems. Experimental results show that MetaHipMer matches or outperforms the state-of-the-art tools in terms of accuracy. Moreover, MetaHipMer scales efficiently to large concurrencies and is able to assemble previously intractable grand challenge metagenomes. We demonstrate the unprecedented capability of MetaHipMer by computing the first full assembly of the Twitchell Wetlands dataset, consisting of 7.5 billion reads (2.6 TBytes in size).
Submitted 19 September, 2018;
originally announced September 2018.
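As background for the de Bruijn graph approach mentioned in the abstract above, the following is a minimal single-process sketch of the textbook construction: each k-mer in a read contributes an edge from its (k-1)-mer prefix to its (k-1)-mer suffix. It is not MetaHipMer's implementation (which is iterative, distributed, and written in Unified Parallel C); the function name `de_bruijn_graph` and the toy reads are assumptions made for illustration.

```python
from collections import defaultdict


def de_bruijn_graph(reads, k):
    """Build a de Bruijn graph as a dict mapping (k-1)-mer -> set of successor (k-1)-mers.

    Hypothetical helper for illustration; ignores sequencing errors,
    reverse complements, and k-mer counts.
    """
    graph = defaultdict(set)
    for read in reads:
        # Slide a window of length k over the read.
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            # Edge from the k-mer's prefix to its suffix.
            graph[kmer[:-1]].add(kmer[1:])
    return dict(graph)


# Two overlapping toy reads share the 3-mer "CGT".
toy_graph = de_bruijn_graph(["ACGT", "CGTA"], k=3)
```

The dictionary used here is a hash table keyed by (k-1)-mers; in this toy example the two reads merge into the single path AC, CG, GT, TA.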
-
Extreme-Scale De Novo Genome Assembly
Authors:
Evangelos Georganas,
Steven Hofmeyr,
Rob Egan,
Aydin Buluc,
Leonid Oliker,
Daniel Rokhsar,
Katherine Yelick
Abstract:
De novo whole genome assembly reconstructs genomic sequence from short, overlapping, and potentially erroneous DNA segments and is one of the most important computations in modern genomics. This work presents HipMER, a high-quality end-to-end de novo assembler designed for extreme scale analysis, via efficient parallelization of the Meraculous code. Genome assembly software has many components, each of which stresses different components of a computer system. This chapter explains the computational challenges involved in each step of the HipMer pipeline, the key distributed data structures, and communication costs in detail. We present performance results of assembling the human genome and the large hexaploid wheat genome on large supercomputers up to tens of thousands of cores.
Submitted 31 May, 2017;
originally announced May 2017.