-
Median of heaps: linear-time selection by recursively constructing binary heaps
Authors:
Oliver Serang
Abstract:
The first worst-case linear-time algorithm for selection was discovered in 1973; however, linear-time binary heap construction was first published in 1964. Here we describe another worst-case linear selection algorithm, which is simply implemented and uses binary heap construction as its principal engine. The algorithm is implemented in place, and shown to perform similarly to in-place median of medians.
Submitted 24 April, 2023;
originally announced April 2023.
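The linear-time heap construction that the abstract above refers to can be sketched as a bottom-up "heapify" pass. This is only the standard building block, not the median-of-heaps selection algorithm itself; the names `sift_down` and `build_min_heap` are illustrative assumptions.

```python
def sift_down(a, i, n):
    """Restore the min-heap property for the subtree rooted at index i."""
    while True:
        left = 2 * i + 1
        right = left + 1
        smallest = i
        if left < n and a[left] < a[smallest]:
            smallest = left
        if right < n and a[right] < a[smallest]:
            smallest = right
        if smallest == i:
            return
        a[i], a[smallest] = a[smallest], a[i]
        i = smallest


def build_min_heap(a):
    """Bottom-up construction: sift down every internal node, last one first.

    Total work is O(n) because most nodes sit near the leaves and
    therefore sift down only a short distance.
    """
    n = len(a)
    for i in range(n // 2 - 1, -1, -1):
        sift_down(a, i, n)
    return a


heap = build_min_heap([5, 3, 8, 1, 9, 2, 7])
print(heap[0])  # the minimum is at the root
```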
-
Adversarial network training using higher-order moments in a modified Wasserstein distance
Authors:
Oliver Serang
Abstract:
Generative-adversarial networks (GANs) have been used to produce data closely resembling example data in a compressed, latent space that is close to sufficient for reconstruction in the original vector space. The Wasserstein metric has been used as an alternative to binary cross-entropy, producing more numerically stable GANs with greater mode covering behavior. Here, a generalization of the Wasserstein distance, using higher-order moments than the mean, is derived. Training a GAN with this higher-order Wasserstein metric is demonstrated to exhibit superior performance, even when adjusted for slightly higher computational cost. This is illustrated by generating synthetic antibody sequences.
Submitted 7 October, 2022;
originally announced October 2022.
-
Selection on $X_1 + X_2 + \cdots + X_m$ via Cartesian product tree
Authors:
Patrick Kreitzberg,
Kyle Lucke,
Jake Pennington,
Oliver Serang
Abstract:
Selection on the Cartesian product is a classic problem in computer science. Recently, an optimal algorithm for selection on $X+Y$, based on soft heaps, was introduced. By combining this approach with layer-ordered heaps (LOHs), an algorithm using a balanced binary tree of $X+Y$ selections was proposed to perform $k$-selection on $X_1+X_2+\cdots+X_m$ in $o(n\cdot m + k\cdot m)$, where $X_i$ have length $n$. Here, that $o(n\cdot m + k\cdot m)$ algorithm is combined with a novel, optimal LOH-based algorithm for selection on $X+Y$ (without a soft heap). The performance of algorithms for selection on $X_1+X_2+\cdots+X_m$ is compared empirically, demonstrating the benefit of the algorithm proposed here.
Submitted 16 August, 2020;
originally announced August 2020.
-
Optimal construction of a layer-ordered heap
Authors:
Jake Pennington,
Patrick Kreitzberg,
Kyle Lucke,
Oliver Serang
Abstract:
The layer-ordered heap (LOH) is a simple, recently proposed data structure used in optimal selection on $X+Y$, the algorithm with the best known runtime for selection on $X_1+X_2+\cdots+X_m$, and the fastest method in practice for computing the most abundant isotope peaks in a chemical compound. Here, we introduce a few algorithms for constructing LOHs, analyze their complexity, and demonstrate that one algorithm is optimal for building a LOH of any rank $α$. These results are shown to correspond with empirical experiments of runtimes when applying the LOH construction algorithms to a common task in machine learning.
Submitted 15 August, 2020; v1 submitted 27 July, 2020;
originally announced July 2020.
-
Fast exact computation of the $k$ most abundant isotope peaks with layer-ordered heaps
Authors:
Patrick Kreitzberg,
Jake Pennington,
Kyle Lucke,
Oliver Serang
Abstract:
The theoretical computation of isotopic distribution of compounds is crucial in many important applications of mass spectrometry, especially as machine precision grows. A considerable number of good tools have been created in the last decade for doing so. In this paper we present a novel algorithm for calculating the top $k$ peaks of a given compound. The algorithm takes advantage of layer-ordered heaps used in an optimal method of selection on $X+Y$ and is able to efficiently calculate the top $k$ peaks on very large molecules. Among its peers, this algorithm shows a significant speedup on molecules whose elements have many isotopes. The algorithm obtains a speedup of more than 31x when compared to $\textsc{IsoSpec}$ on \ch{Au2Ca10Ga10Pd76} when computing 47409787 peaks, which covers 0.999 of the total abundance.
Submitted 15 April, 2020;
originally announced April 2020.
-
Optimally selecting the top $k$ values from $X+Y$ with layer-ordered heaps
Authors:
Oliver Serang
Abstract:
Selection and sorting the Cartesian sum, $X+Y$, are classic and important problems. Here, a new algorithm is presented, which generates the top $k$ values of the form $X_i+Y_j$. The algorithm relies only on median-of-medians and is simple to implement. Furthermore, it uses data structures contiguous in memory, and is fast in practice. The presented algorithm is demonstrated to be theoretically optimal.
Submitted 5 October, 2020; v1 submitted 30 January, 2020;
originally announced January 2020.
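For orientation, the problem in the abstract above (producing the top $k$ values of the form $X_i+Y_j$) can be illustrated with a simple heap-based baseline. This is not the layer-ordered-heap algorithm from the paper and is not optimal: it sorts both inputs and costs roughly $O(n\log n + k\log k)$. It also assumes "top" means largest; the function name `top_k_pairwise_sums` is hypothetical.

```python
import heapq


def top_k_pairwise_sums(x, y, k):
    """Return the k largest values x[i] + y[j], in non-increasing order.

    Baseline sketch only: sort both lists in descending order, then
    explore the implicit sum matrix from its largest corner with a heap.
    """
    xs = sorted(x, reverse=True)
    ys = sorted(y, reverse=True)
    if not xs or not ys or k <= 0:
        return []
    # heapq is a min-heap, so store negated sums to pop the largest first.
    heap = [(-(xs[0] + ys[0]), 0, 0)]
    seen = {(0, 0)}
    out = []
    while heap and len(out) < k:
        neg_sum, i, j = heapq.heappop(heap)
        out.append(-neg_sum)
        # The only candidates that can follow (i, j) are its two neighbors.
        for ni, nj in ((i + 1, j), (i, j + 1)):
            if ni < len(xs) and nj < len(ys) and (ni, nj) not in seen:
                seen.add((ni, nj))
                heapq.heappush(heap, (-(xs[ni] + ys[nj]), ni, nj))
    return out


print(top_k_pairwise_sums([1, 4, 2], [10, 3, 7], 4))
```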
-
Selection on $X_1+X_2+\cdots + X_m$ with layer-ordered heaps
Authors:
Patrick Kreitzberg,
Kyle Lucke,
Oliver Serang
Abstract:
Selection on $X_1+X_2+\cdots + X_m$ is an important problem with many applications in areas such as max-convolution, max-product Bayesian inference, calculating most probable isotopes, and computing non-parametric test statistics, among others. Faster-than-naïve approaches exist for $m=2$: Frederickson (1993) published the optimal algorithm with runtime $O(k)$ and Kaplan \emph{et al.} (2018) have since published a much simpler algorithm which makes use of Chazelle's soft heaps (2003). No fast methods exist for $m>2$. Johnson \& Mizoguchi (1978) introduced a method to compute the single $k^{th}$ value when $m>2$, but that method runs in $O(m\cdot n^{\lceil\frac{m}{2}\rceil} \log(n))$ time and is inefficient when $m \gg 1$ and $k \ll n^{\lceil\frac{m}{2}\rceil}$.
In this paper, we introduce the first efficient methods, both in theory and practice, for problems with $m>2$. We introduce the ``layer-ordered heap,'' a simple special class of heap with which we produce a new, fast selection algorithm on the Cartesian product. Using this new algorithm to perform $k$-selection on the Cartesian product of $m$ arrays of length $n$ has runtime $\in o(k\cdot m)$. We also provide implementations of the algorithms proposed and evaluate their performance in practice.
Submitted 15 August, 2020; v1 submitted 26 October, 2019;
originally announced October 2019.
-
Most abundant isotope peaks and efficient selection on $Y=X_1+X_2+\cdots + X_m$
Authors:
Patrick Kreitzberg,
Kyle Lucke,
Oliver Serang
Abstract:
The isotope masses and relative abundances for each element are fundamental chemical knowledge. Computing the isotope masses of a compound and their relative abundances is an important and difficult analytical chemistry problem. We demonstrate that this problem is equivalent to sorting $Y=X_1+X_2+\cdots+X_m$. We introduce a novel, practically efficient method for computing the top values in $Y$, then demonstrate the applicability of this method by computing the most abundant isotope masses (and their abundances) from compounds of nontrivial size.
Submitted 29 June, 2019;
originally announced July 2019.
-
Practically efficient methods for performing bit-reversed permutation in C++11 on the x86-64 architecture
Authors:
Christian Knauth,
Boran Adas,
Daniel Whitfield,
Xuesong Wang,
Lydia Ickler,
Tim Conrad,
Oliver Serang
Abstract:
The bit-reversed permutation is a famous task in signal processing and is key to efficient implementation of the fast Fourier transform. This paper presents optimized C++11 implementations of five extant methods for computing the bit-reversed permutation: Stockham auto-sort, naive bitwise swapping, swapping via a table of reversed bytes, local pairwise swapping of bits, and swapping via a cache-localized matrix buffer. Three new strategies for performing the bit-reversed permutation in C++11 are proposed: an inductive method using the bitwise XOR operation, a template-recursive closed form, and a cache-oblivious template-recursive approach, which reduces the bit-reversed permutation to smaller bit-reversed permutations and a square matrix transposition. These new methods are compared to the extant approaches in terms of theoretical runtime, empirical compile time, and empirical runtime. The template-recursive cache-oblivious method is shown to be competitive with the fastest known method; however, we demonstrate that the cache-oblivious method can more readily benefit from parallelization on multiple cores and on the GPU.
Submitted 2 August, 2017;
originally announced August 2017.
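As a reference point for the abstract above, the simplest of the listed strategies (naive bitwise swapping) can be sketched in a few lines. The paper's implementations are in C++11; this Python sketch is only an illustration of what the permutation does, and the names `bit_reverse` and `bit_reversed_permutation` are assumptions.

```python
def bit_reverse(i, bits):
    """Reverse the lowest `bits` bits of the integer i."""
    r = 0
    for _ in range(bits):
        r = (r << 1) | (i & 1)
        i >>= 1
    return r


def bit_reversed_permutation(values):
    """Return a copy of `values` with element i moved to bit_reverse(i).

    The length must be a power of two, as in radix-2 FFT inputs.
    """
    n = len(values)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    bits = n.bit_length() - 1
    out = list(values)
    for i in range(n):
        j = bit_reverse(i, bits)
        if i < j:  # swap each pair exactly once
            out[i], out[j] = out[j], out[i]
    return out


print(bit_reversed_permutation(list(range(8))))  # [0, 4, 2, 6, 1, 5, 3, 7]
```

Applying the permutation twice returns the original order, since bit reversal is its own inverse.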
-
An exact, cache-localized algorithm for the sub-quadratic convolution of hypercubes
Authors:
Oliver Serang
Abstract:
Fast multidimensional convolution can be performed naively in quadratic time and can often be performed more efficiently via the Fourier transform; however, when the dimensionality is large, these algorithms become more challenging. A method is proposed for performing exact hypercube convolution in sub-quadratic time. The method outperforms FFTPACK (called via numpy) and FFTW (called via pyfftw) for hypercube convolution. Embeddings in hypercubes can be paired with the sub-quadratic hypercube convolution method to construct sub-quadratic algorithms for variants of vector convolution.
Submitted 31 July, 2016;
originally announced August 2016.
-
TRIOT: Faster tensor manipulation in C++11
Authors:
Florian Heyl,
Oliver Serang
Abstract:
[abridged] Context: Multidimensional arrays are used by many different algorithms. As such, indexing and broadcasting complex operations over multidimensional arrays are ubiquitous tasks and can be performance limiting. Inquiry: Simultaneously indexing two or more multidimensional arrays with different shapes (e.g., copying data from one tensor to another larger, zero padded tensor in anticipation of a convolution) is difficult to do efficiently: Hard-coded nested for loops in C, Fortran, and Go cannot be applied when the dimension of a tensor is unknown at compile time. Likewise, boost::multi_array cannot be used unless the dimensions of the array are known at compile time, and the style of implementation restricts the user from using the index tuple inside a vectorized operation (as would be required to compute an expected value of a multidimensional distribution). On the other hand, iteration methods that do not require the dimensionality or shape to be known at compile time (e.g., incrementing and applying carry operations to index tuples or remapping integer indices in the flat array), can be substantially slower than hard-coded nested for loops. ... Importance: Manipulation of multidimensional arrays is a common task in software, especially in high performance numerical methods. This paper proposes a novel way to leverage template recursion to iterate over and apply operations to multidimensional arrays, and then demonstrates the superior performance and flexibility of operations that can be achieved using this new approach.
Submitted 31 March, 2017; v1 submitted 30 July, 2016;
originally announced August 2016.
-
Fast Computation on Semirings Isomorphic to $(\times, \max)$ on $\mathbb{R}_+$
Authors:
Oliver Serang
Abstract:
Important problems across multiple disciplines involve computations on the semiring $(\times, \max)$ (or its equivalents: the negated version $(\times, \min)$, the log-transformed version $(+, \max)$, or the negated log-transformed version $(+, \min)$): max-convolution, all-pairs shortest paths in a weighted graph, and finding the largest $k$ values in $x_i+y_j$ for two lists $x$ and $y$. However, fast algorithms such as those enabling FFT convolution, sub-cubic matrix multiplication, \emph{etc.}, require inverse operations, and thus cannot be computed on semirings. This manuscript generalizes recent advances on max-convolution: in this approach a small family of $p$-norm rings are used to efficiently approximate results on a nonnegative semiring. The general approach can be used to easily compute sub-cubic estimates of the all-pairs shortest paths in a graph with nonnegative edge weights and sub-quadratic estimates of the top $k$ values in $x_i+y_j$ when $x$ and $y$ are nonnegative. These methods are fast in practice and can benefit from coarse-grained parallelization.
Submitted 17 June, 2016; v1 submitted 18 November, 2015;
originally announced November 2015.
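The $p$-norm idea mentioned in the abstract above can be illustrated on max-convolution: for nonnegative inputs, $\max_i x_i y_{m-i} \approx \left(\sum_i (x_i y_{m-i})^p\right)^{1/p}$ for large $p$, and the sum is an ordinary convolution of $x^p$ and $y^p$. The sketch below is a simplification under stated assumptions: it uses a single fixed $p$ rather than the small family of $p$-norms the abstract describes, it assumes inputs scaled into $[0, 1]$ to avoid overflow, and it uses a naive quadratic `convolve` where a fast FFT-based convolution would be substituted in practice. All function names are hypothetical.

```python
def convolve(a, b):
    """Ordinary (+, x) convolution; naive stand-in for an FFT-based routine."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out


def max_convolve_exact(x, y):
    """Exact (max, x) convolution in quadratic time, for comparison."""
    out = [0.0] * (len(x) + len(y) - 1)
    for i, xi in enumerate(x):
        for j, yj in enumerate(y):
            out[i + j] = max(out[i + j], xi * yj)
    return out


def max_convolve_pnorm(x, y, p=64.0):
    """Approximate max-convolution with a single p-norm.

    Assumes nonnegative inputs scaled into [0, 1]. The estimate is an
    upper bound that overshoots by at most a factor of t ** (1 / p),
    where t is the number of terms contributing to an output index.
    """
    xp = [v ** p for v in x]
    yp = [v ** p for v in y]
    return [s ** (1.0 / p) for s in convolve(xp, yp)]


x = [0.2, 0.9, 0.5]
y = [0.7, 0.1, 0.8]
print(max_convolve_exact(x, y))
print(max_convolve_pnorm(x, y))
```

Larger $p$ tightens the estimate but pushes small values toward underflow, which is one reason to combine several values of $p$ rather than rely on one.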