-
Edge-Disjoint Spanning Trees on Star-Product Networks
Authors:
Aleyah Dawkins,
Kelly Isham,
Ales Kubicek,
Kartik Lakhotia,
Laura Monroe
Abstract:
Star-product graphs are a natural extension of the Cartesian product, but have not been well-studied. We show that many important established and emerging network topologies, including HyperX, SlimFly, BundleFly, PolarStar, mesh, and torus, are in fact star-product graphs. While this connection was known for BundleFly and PolarStar, it was not for the others listed.
We extend a method of constructing maximal and near-maximal sets of edge-disjoint spanning trees on Cartesian products to the star product, and thus obtain maximal or near-maximal sets of edge-disjoint spanning trees on new networks of importance. Such sets can improve the bandwidth of collective operations and therefore accelerate many important workloads in high-performance computing.
Submitted 18 March, 2024;
originally announced March 2024.
-
PolarStar: Expanding the Scalability Horizon of Diameter-3 Networks
Authors:
Kartik Lakhotia,
Laura Monroe,
Kelly Isham,
Maciej Besta,
Nils Blach,
Torsten Hoefler,
Fabrizio Petrini
Abstract:
In this paper, we present PolarStar, a novel family of diameter-3 network topologies derived from the star product of two low-diameter factor graphs. The proposed PolarStar construction gives the largest known diameter-3 network topologies for almost all radixes. When compared to state-of-the-art diameter-3 networks, PolarStar achieves 31% geometric mean increase in scale over Bundlefly, 91% over Dragonfly, and 690% over 3-D HyperX.
PolarStar has many other desirable properties including a modular layout, large bisection, high resilience to link failures and a large number of feasible sizes for every radix. Our evaluation shows that it exhibits comparable or better performance than other diameter-3 networks under various traffic patterns.
Submitted 14 February, 2023;
originally announced February 2023.
-
PolarFly: A Cost-Effective and Flexible Low-Diameter Topology
Authors:
Kartik Lakhotia,
Maciej Besta,
Laura Monroe,
Kelly Isham,
Patrick Iff,
Torsten Hoefler,
Fabrizio Petrini
Abstract:
In this paper we present PolarFly, a diameter-2 network topology based on the Erdős-Rényi family of polarity graphs from finite geometry. This is a highly scalable low-diameter topology that asymptotically reaches the Moore bound on the number of nodes for a given network degree and diameter.
PolarFly achieves high Moore bound efficiency even for the moderate radixes commonly seen in current and near-future routers, reaching more than 96% of the theoretical peak. It also offers more feasible router degrees than the state-of-the-art solutions, greatly adding to the selection of scalable diameter-2 networks. PolarFly enjoys many other topological properties highly relevant in practice, such as a modular design and expandability that allow incremental growth in network size without rewiring the whole network. Our evaluation shows that PolarFly outperforms competitive networks in terms of scalability, cost and performance for various traffic patterns.
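The Moore bound mentioned in the abstract can be illustrated with a short sketch. The function names below are hypothetical, and the node count q^2 + q + 1 and degree q + 1 used for the Erdős-Rényi polarity graph over a field of order q are standard properties assumed here; they are not stated in the abstract.

```python
def moore_bound(degree: int, diameter: int) -> int:
    """Upper bound on the node count of a graph with the given degree and diameter."""
    # 1 root + degree * (1 + (degree-1) + ... + (degree-1)^(diameter-1)) further nodes
    return 1 + degree * sum((degree - 1) ** i for i in range(diameter))


def er_moore_efficiency(q: int) -> float:
    """Fraction of the diameter-2 Moore bound reached by a polarity graph of order q.

    Assumption (not from the abstract): the graph has q^2 + q + 1 nodes
    and degree q + 1, where q is a prime power.
    """
    nodes = q * q + q + 1
    return nodes / moore_bound(q + 1, 2)


if __name__ == "__main__":
    for q in (7, 13, 31):
        print(q, round(er_moore_efficiency(q), 4))
```

Under these assumptions the ratio is (q^2 + q + 1) / (q^2 + 2q + 2), which tends to 1 as q grows; this is one way to read "asymptotically reaches the Moore bound".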
Submitted 2 May, 2023; v1 submitted 2 August, 2022;
originally announced August 2022.
-
SIMD-Optimized Search Over Sorted Data
Authors:
Benjamin Mastripolito,
Nicholas Koskelo,
Dylan Weatherred,
David A. Pimentel,
Daniel Sheppard,
Anna Pietarila Graham,
Laura Monroe,
Robert Robey
Abstract:
Applications often require a fast, single-threaded search algorithm over sorted data, typical in table-lookup operations. We explore various search algorithms for a large number of search candidates over a relatively small array of logarithmically-distributed sorted data. These include an innovative hash-based search that takes advantage of floating point representation to bin data by the exponent. Algorithms that can be optimized to take advantage of SIMD vector instructions are of particular interest. We then conduct a case study applying our results and analyzing algorithmic performance with the EOSPAC package. EOSPAC is a table look-up library for manipulation and interpolation of SESAME equation-of-state data. Our investigation results in a couple of algorithms with better performance with a best case 8x speedup over the original EOSPAC Hunt-and-Locate implementation. Our techniques are generalizable to other instances of search algorithms seeking to get a performance boost from vectorization.
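The idea of binning sorted data by floating-point exponent can be sketched as follows. This is only an illustration of the general idea under stated assumptions (positive, finite, sorted data; scalar Python rather than SIMD); it is not the authors' implementation or the EOSPAC code, and the function names are hypothetical.

```python
import bisect
import math


def build_exponent_bins(sorted_data):
    """Map each binary exponent to the half-open index range of values sharing it.

    Assumes sorted_data is ascending and strictly positive, so values with
    the same exponent occupy one contiguous index range.
    """
    bins = {}
    for i, value in enumerate(sorted_data):
        exponent = math.frexp(value)[1]
        start, _ = bins.get(exponent, (i, i))
        bins[exponent] = (start, i + 1)
    return bins


def search(sorted_data, bins, x):
    """Return the index of the largest element <= x, or -1 if there is none."""
    exponent = math.frexp(x)[1]
    if exponent in bins:
        start, stop = bins[exponent]
        # Only the bin sharing x's exponent needs to be searched.
        return bisect.bisect_right(sorted_data, x, start, stop) - 1
    # No stored value shares this exponent: fall back to a full binary search.
    return bisect.bisect_right(sorted_data, x) - 1


if __name__ == "__main__":
    data = [0.5, 0.75, 1.0, 1.5, 3.0, 10.0, 12.0]
    bins = build_exponent_bins(data)
    print(search(data, bins, 1.2), search(data, bins, 11.0))
```

When the data are roughly logarithmically distributed, each exponent bin holds only a few values, so the lookup inside a bin is short; that is the motivation for choosing the exponent as the hash key in this sketch.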
Submitted 6 December, 2021;
originally announced December 2021.
-
A Few Identities of the Takagi Function on Dyadic Rationals
Authors:
Laura Monroe
Abstract:
The number of unbalanced interior nodes of divide-and-conquer trees on $n$ leaves is known to form a sequence of dilations of the Takagi function on dyadic rationals. We use this fact to derive identities on the Takagi function, and on the Hamming weight of an integer in terms of the Takagi function.
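For readers unfamiliar with the Takagi function, a minimal sketch of its evaluation on dyadic rationals follows. It uses the standard definition T(x) = sum over k >= 0 of 2^(-k) times the distance from 2^k x to the nearest integer; the function name is hypothetical, and the sketch does not reproduce the identities derived in the paper.

```python
from fractions import Fraction


def takagi(x: Fraction) -> Fraction:
    """Exact value of the Takagi function at a dyadic rational x."""
    denominator = x.denominator
    if denominator & (denominator - 1) != 0:
        raise ValueError("x must be a dyadic rational (denominator a power of 2)")
    total = Fraction(0)
    scale = Fraction(1)
    y = x
    # For x = m / 2^k, every term vanishes once 2^j * x becomes an integer,
    # so the infinite series reduces to a finite sum.
    while y.denominator != 1:
        fractional = y - (y.numerator // y.denominator)
        total += scale * min(fractional, 1 - fractional)
        y *= 2
        scale /= 2
    return total


if __name__ == "__main__":
    for numerator in range(9):
        print(Fraction(numerator, 8), takagi(Fraction(numerator, 8)))
```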
Submitted 15 November, 2021; v1 submitted 10 November, 2021;
originally announced November 2021.
-
A Class of Trees Having Near-Best Balance
Authors:
Laura Monroe
Abstract:
Full binary trees naturally represent commutative non-associative products. There are many important examples of these products: finite-precision floating-point addition and NAND gates, among others. Balance in such a tree is highly desirable for efficiency in calculation. The best balance is attained with a divide-and-conquer approach. However, this may not be the optimal solution, since the success of many calculations is dependent on the grouping and ordering of the calculation, for reasons ranging from the avoidance of rounding error, to calculating with varying precision, to the placement of calculation within a heterogeneous system.
We introduce a new class of computational trees having near-best balance in terms of the Colless index from mathematical phylogenetics. These trees are easily constructed from the binary decomposition of the number of terms in the problem. They also permit much more flexibility than the optimally balanced divide-and-conquer trees. This gives needed freedom in the grouping and ordering of calculation, and allows intelligent efficiency trade-offs.
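The Colless index referred to above is the sum, over all interior nodes, of the absolute difference between the leaf counts of the two subtrees. The sketch below computes it for a divide-and-conquer tree and, for contrast, a fully unbalanced left-deep tree. The nested-tuple representation and function names are assumptions made for illustration; the paper's new class of trees is not constructed here.

```python
def leaf_count(tree) -> int:
    """Number of leaves; a leaf is any non-tuple, an interior node is a pair."""
    if not isinstance(tree, tuple):
        return 1
    return leaf_count(tree[0]) + leaf_count(tree[1])


def colless(tree) -> int:
    """Sum over interior nodes of |leaves(left) - leaves(right)|."""
    if not isinstance(tree, tuple):
        return 0
    left, right = tree
    imbalance = abs(leaf_count(left) - leaf_count(right))
    return imbalance + colless(left) + colless(right)


def divide_and_conquer(n: int):
    """Full binary tree on n leaves, split as evenly as possible at every node."""
    if n == 1:
        return "x"
    return (divide_and_conquer((n + 1) // 2), divide_and_conquer(n // 2))


def left_deep(n: int):
    """Fully unbalanced tree on n leaves: ((((x, x), x), x), ...)."""
    tree = "x"
    for _ in range(n - 1):
        tree = (tree, "x")
    return tree


if __name__ == "__main__":
    for n in range(1, 10):
        print(n, colless(divide_and_conquer(n)), colless(left_deep(n)))
```

In a divide-and-conquer tree every interior node has imbalance 0 or 1, so its Colless index equals the number of unbalanced interior nodes.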
Submitted 25 August, 2021;
originally announced August 2021.
-
Binary Signed-Digit Integers, the Stern Diatomic Sequence and Stern Polynomials
Authors:
Laura Monroe
Abstract:
Stern's diatomic sequence is a well-studied and simply defined sequence with many fascinating characteristics. The binary signed-digit (BSD) representation of integers is used widely in efficient computation, coding theory and other applications. We link these two objects, showing that the number of $i$-bit binary signed-digit representations of an integer $n<2^i$ is the $(2^i-n)^\text{th}$ element in Stern's diatomic sequence.
This correspondence makes the vast range of results known about the Stern diatomic sequence available for consideration in the study of binary signed-digit integers, and vice versa. Applications of this relationship discussed in this paper include a weight-distribution theorem for BSD representations, linking these representations to Stern polynomials, a recursion for the number of optimal BSD representations of an integer along with their Hamming weight, stemming from an easy recursion for the leading coefficients and degrees of Stern polynomials, and the identification of all integers having a maximal number of such representations.
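The stated correspondence can be checked by brute force for small sizes. The sketch below assumes the usual indexing s(0) = 0, s(1) = 1, s(2m) = s(m), s(2m+1) = s(m) + s(m+1) for Stern's diatomic sequence, and treats an i-bit BSD representation as a string of i digits from {-1, 0, 1}; both conventions and the function names are assumptions, not taken from the abstract.

```python
from itertools import product


def stern(m: int) -> int:
    """m-th element of Stern's diatomic sequence, with s(0) = 0 and s(1) = 1."""
    if m < 2:
        return m
    if m % 2 == 0:
        return stern(m // 2)
    return stern(m // 2) + stern(m // 2 + 1)


def count_bsd(n: int, i: int) -> int:
    """Count i-digit strings over {-1, 0, 1} whose base-2 value equals n."""
    return sum(
        1
        for digits in product((-1, 0, 1), repeat=i)
        if sum(d * 2**k for k, d in enumerate(digits)) == n
    )


if __name__ == "__main__":
    i = 4
    for n in range(2**i):
        print(n, count_bsd(n, i), stern(2**i - n))
```

The brute-force count enumerates 3^i digit strings, so it is only practical for small i; it is meant as a sanity check of the statement rather than an efficient method.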
Submitted 30 August, 2021; v1 submitted 9 March, 2021;
originally announced March 2021.
-
Computationally Inequivalent Summations and Their Parenthetic Forms
Authors:
Laura Monroe,
Vanessa Job
Abstract:
Floating-point addition on a finite-precision machine is not associative, so not all mathematically equivalent summations are computationally equivalent. Making this assumption can lead to numerical error in computations. Proper ordering and parenthesizing is a low-overhead way of mitigating such error in a floating point summation.
Ordered and parenthesized summations fall into equivalence classes. We describe these classes, and the parenthetic forms summations in these classes take. We provide summation-related interpretations for sequences known in other contexts, and give new recursive and closed formulas for sequences not previously related to summation.
We also introduce a data structure that facilitates understanding of these objects, and use it to consider certain forms of summation used by default in widely used computer languages. Finally, we relate this data structure to other mathematical constructs from the fields of mathematical analysis and algorithmic analysis.
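A short sketch illustrates why mathematically equivalent summations need not be computationally equivalent. It assumes IEEE 754 double precision, which is what Python floats use on common platforms; the helper names are hypothetical and the two summation orders shown are generic examples, not the specific forms analyzed in the paper.

```python
def left_to_right(values):
    """Sum as ((v0 + v1) + v2) + ..., the usual loop order."""
    total = values[0]
    for v in values[1:]:
        total += v
    return total


def pairwise(values):
    """Sum with balanced parenthesization: split in half and recurse."""
    if len(values) == 1:
        return values[0]
    mid = len(values) // 2
    return pairwise(values[:mid]) + pairwise(values[mid:])


if __name__ == "__main__":
    # Same three terms, two parenthesizations, two different results.
    print((0.1 + 0.2) + 0.3)
    print(0.1 + (0.2 + 0.3))

    # Ordering matters as well: the 1.0 is absorbed when added to 1e16 first.
    print(left_to_right([1e16, 1.0, -1e16]))
    print(left_to_right([1e16, -1e16, 1.0]))
```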
Submitted 11 May, 2020;
originally announced May 2020.