-
A Sierpinski Triangle Data Structure for Efficient Array Value Update and Prefix Sum Calculation
Authors:
Brent Harrison,
Jason Necaise,
Andrew Projansky,
James D. Whitfield
Abstract:
The binary indexed tree, or Fenwick tree, is a data structure that can efficiently update values and calculate prefix sums in an array. It allows both of these operations to be performed in $O(\log_2 N)$ time. Here we present a novel data structure resembling the Sierpinski triangle, which accomplishes these operations with the same memory usage in $O(\log_3 N)$ time instead. We show this order to be optimal by making use of a connection to quantum computing.
Submitted 6 March, 2024;
originally announced March 2024.
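For context on the baseline this abstract compares against, here is a minimal sketch of a standard binary indexed (Fenwick) tree. It is not the Sierpinski triangle structure proposed in the paper. The class and method names (`FenwickTree`, `update`, `prefix_sum`) are illustrative assumptions, not taken from the paper.

```python
class FenwickTree:
    """Standard binary indexed tree over n values, all initially zero.

    Illustrative baseline only; names are hypothetical. The public API is
    0-indexed, while the internal array is 1-indexed.
    """

    def __init__(self, n: int) -> None:
        self.n = n
        self.tree = [0] * (n + 1)

    def update(self, i: int, delta: int) -> None:
        """Add delta to the value stored at index i."""
        i += 1
        while i <= self.n:
            self.tree[i] += delta
            i += i & -i  # jump to the next node whose range covers index i

    def prefix_sum(self, i: int) -> int:
        """Return the sum of the values at indices 0..i inclusive."""
        i += 1
        total = 0
        while i > 0:
            total += self.tree[i]
            i -= i & -i  # strip the lowest set bit to reach the next partial sum
        return total


ft = FenwickTree(8)
for index, value in enumerate([3, 1, 4, 1, 5, 9, 2, 6]):
    ft.update(index, value)
```

Both loops visit at most one node per bit of the index, which is where the $O(\log_2 N)$ bound quoted in the abstract comes from.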
-
Using GPT-2 to Create Synthetic Data to Improve the Prediction Performance of NLP Machine Learning Classification Models
Authors:
Dewayne Whitfield
Abstract:
Classification Models use input data to predict the likelihood that the subsequent input data will fall into predetermined categories. To perform effective classifications, these models require large datasets for training. It is becoming common practice to utilize synthetic data to boost the performance of Machine Learning Models. It is reported that Shell is using synthetic data to build models to detect problems that rarely occur; for example Shell created synthetic data to help models to identify deteriorating oil lines. It is common practice for Machine Learning Practitioners to generate synthetic data by rotating, flipping, and cropping images to increase the volume of image data to train Convolutional Neural Networks. The purpose of this paper is to explore creating and utilizing synthetic NLP data to improve the performance of Natural Language Processing Machine Learning Classification Models. In this paper I used a Yelp pizza restaurant reviews dataset and transfer learning to fine-tune a pre-trained GPT-2 Transformer Model to generate synthetic pizza reviews data. I then combined this synthetic data with the original genuine data to create a new joint dataset. The new combined model significantly outperformed the original model in accuracy and precision.
Submitted 2 April, 2021;
originally announced April 2021.
-
Achieving a quantum smart workforce
Authors:
Clarice D. Aiello,
D. D. Awschalom,
Hannes Bernien,
Tina Brower-Thomas,
Kenneth R. Brown,
Todd A. Brun,
Justin R. Caram,
Eric Chitambar,
Rosa Di Felice,
Michael F. J. Fox,
Stephan Haas,
Alexander W. Holleitner,
Eric R. Hudson,
Jeffrey H. Hunt,
Robert Joynt,
Scott Koziol,
H. J. Lewandowski,
Douglas T. McClure,
Jens Palsberg,
Gina Passante,
Kristen L. Pudenz,
Christopher J. K. Richardson,
Jessica L. Rosenberg,
R. S. Ross,
Mark Saffman, et al. (7 additional authors not shown)
Abstract:
Interest in building dedicated Quantum Information Science and Engineering (QISE) education programs has greatly expanded in recent years. These programs are inherently convergent, complex, often resource intensive and likely require collaboration with a broad variety of stakeholders. In order to address this combination of challenges, we have captured ideas from many members in the community. This manuscript not only addresses policy makers and funding agencies (both public and private and from the regional to the international level) but also contains needs identified by industry leaders and discusses the difficulties inherent in creating an inclusive QISE curriculum. We report on the status of eighteen post-secondary education programs in QISE and provide guidance for building new programs. Lastly, we encourage the development of a comprehensive strategic plan for quantum education and workforce development as a means to make the most of the ongoing substantial investments being made in QISE.
Submitted 23 October, 2020;
originally announced October 2020.
-
Practically efficient methods for performing bit-reversed permutation in C++11 on the x86-64 architecture
Authors:
Christian Knauth,
Boran Adas,
Daniel Whitfield,
Xuesong Wang,
Lydia Ickler,
Tim Conrad,
Oliver Serang
Abstract:
The bit-reversed permutation is a famous task in signal processing and is key to efficient implementation of the fast Fourier transform. This paper presents optimized C++11 implementations of five extant methods for computing the bit-reversed permutation: Stockham auto-sort, naive bitwise swapping, swapping via a table of reversed bytes, local pairwise swapping of bits, and swapping via a cache-localized matrix buffer. Three new strategies for performing the bit-reversed permutation in C++11 are proposed: an inductive method using the bitwise XOR operation, a template-recursive closed form, and a cache-oblivious template-recursive approach, which reduces the bit-reversed permutation to smaller bit-reversed permutations and a square matrix transposition. These new methods are compared to the extant approaches in terms of theoretical runtime, empirical compile time, and empirical runtime. The template-recursive cache-oblivious method is shown to be competitive with the fastest known method; however, we demonstrate that the cache-oblivious method can more readily benefit from parallelization on multiple cores and on the GPU.
Submitted 2 August, 2017;
originally announced August 2017.
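To make the task in this abstract concrete, the following is a rough sketch of the simplest approach it names, naive bitwise swapping, written in Python rather than the paper's C++11. It is not one of the paper's optimized implementations. The function names `reverse_bits` and `bit_reversed_permutation` are hypothetical, and the input length is assumed to be a power of two.

```python
def reverse_bits(x: int, num_bits: int) -> int:
    """Reverse the lowest num_bits bits of x."""
    result = 0
    for _ in range(num_bits):
        result = (result << 1) | (x & 1)  # append the lowest bit of x
        x >>= 1
    return result


def bit_reversed_permutation(data: list) -> list:
    """Return a copy of data with element i moved to index reverse_bits(i).

    Assumes len(data) is a power of two (hypothetical helper, not the
    paper's code).
    """
    n = len(data)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    num_bits = n.bit_length() - 1
    out = list(data)
    for i in range(n):
        j = reverse_bits(i, num_bits)
        if i < j:  # swap each pair exactly once
            out[i], out[j] = out[j], out[i]
    return out


permuted = bit_reversed_permutation(list(range(8)))
```

Because reversing bits twice restores the original index, applying the permutation a second time returns the input order.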
-
Computational complexity of time-dependent density functional theory
Authors:
J. D. Whitfield,
M.-H. Yung,
D. G. Tempel,
S. Boixo,
A. Aspuru-Guzik
Abstract:
Time-dependent density functional theory (TDDFT) is rapidly emerging as a premier method for solving dynamical many-body problems in physics and chemistry. The mathematical foundations of TDDFT are established through the formal existence of a fictitious non-interacting system (known as the Kohn-Sham system), which can reproduce the one-electron reduced probability density of the actual system. We build upon these works and show that on the interior of the domain of existence, the Kohn-Sham system can be efficiently obtained given the time-dependent density. Since a quantum computer can efficiently produce such time-dependent densities, we present a polynomial time quantum algorithm to generate the time-dependent Kohn-Sham potential with controllable error bounds. As a consequence, in contrast to the known intractability result for ground state density functional theory (DFT), the computation of the necessary time-dependent potentials given the initial state is in the complexity class described by bounded error quantum computation in polynomial time (BQP).
Submitted 21 August, 2014; v1 submitted 4 October, 2013;
originally announced October 2013.
-
Computational Complexity in Electronic Structure
Authors:
James D. Whitfield,
Peter J. Love,
Alan Aspuru-Guzik
Abstract:
In quantum chemistry, the price paid by all known efficient model chemistries is either the truncation of the Hilbert space or uncontrolled approximations. Theoretical computer science suggests that these restrictions are not mere shortcomings of the algorithm designers and programmers but could stem from the inherent difficulty of simulating quantum systems. Extensions of computer science and information processing exploiting quantum mechanics have led to new ways of understanding the ultimate limitations of computational power. Interestingly, this perspective helps us understand widely used model chemistries in a new light. In this article, the fundamentals of computational complexity will be reviewed and motivated from the vantage point of chemistry. Then recent results from the computational complexity literature regarding common model chemistries including Hartree-Fock and density functional theory are discussed.
Submitted 16 August, 2012;
originally announced August 2012.