-
Graph Neural Networks for Parameterized Quantum Circuits Expressibility Estimation
Authors:
Shamminuj Aktar,
Andreas Bärtschi,
Diane Oyen,
Stephan Eidenbenz,
Abdel-Hameed A. Badawy
Abstract:
Parameterized quantum circuits (PQCs) are fundamental to quantum machine learning (QML), quantum optimization, and variational quantum algorithms (VQAs). The expressibility of PQCs is a measure that determines their capability to harness the full potential of the quantum state space. It is thus a crucial guidepost to know when selecting a particular PQC ansatz. However, the existing technique for expressibility computation through statistical estimation requires a large number of samples, which poses significant challenges due to time and computational resource constraints. This paper introduces a novel approach for expressibility estimation of PQCs using Graph Neural Networks (GNNs). We demonstrate the predictive power of our GNN model with a dataset consisting of 25,000 samples from the noiseless IBM QASM Simulator and 12,000 samples from three distinct noisy quantum backends. The model accurately estimates expressibility, with root mean square errors (RMSE) of 0.05 and 0.06 for the noiseless and noisy backends, respectively. We compare our model's predictions with reference circuits [Sim and others, QuTe'2019] and IBM Qiskit's hardware-efficient ansatz sets to further evaluate our model's performance. Our experimental evaluation in noiseless and noisy scenarios reveals a close alignment with ground truth expressibility values, highlighting the model's efficacy. Moreover, our model exhibits promising extrapolation capabilities: trained solely on circuit sets of up to 5 qubits, it predicts expressibility values with low RMSE for circuits whose qubit counts lie outside the training range. This work thus provides a reliable means of efficiently evaluating the expressibility of diverse PQCs on noiseless simulators and hardware.
Submitted 13 May, 2024;
originally announced May 2024.
-
DeepPatent2: A Large-Scale Benchmarking Corpus for Technical Drawing Understanding
Authors:
Kehinde Ajayi,
Xin Wei,
Martin Gryder,
Winston Shields,
Jian Wu,
Shawn M. Jones,
Michal Kucer,
Diane Oyen
Abstract:
Recent advances in computer vision (CV) and natural language processing have been driven by exploiting big data on practical applications. However, these research fields are still limited by the sheer volume, versatility, and diversity of the available datasets. CV tasks, such as image captioning, which has primarily been carried out on natural images, still struggle to produce accurate and meaningful captions on sketched images often included in scientific and technical documents. The advancement of other tasks such as 3D reconstruction from 2D images requires larger datasets with multiple viewpoints. We introduce DeepPatent2, a large-scale dataset, providing more than 2.7 million technical drawings with 132,890 object names and 22,394 viewpoints extracted from 14 years of US design patent documents. We demonstrate the usefulness of DeepPatent2 with conceptual captioning. We further discuss the potential usefulness of our dataset for facilitating other research areas such as 3D image reconstruction and image retrieval.
Submitted 7 November, 2023;
originally announced November 2023.
-
Discovering Image Usage Online: A Case Study With "Flatten the Curve"
Authors:
Shawn M. Jones,
Diane Oyen
Abstract:
Understanding the spread of images across the web helps us understand the reuse of scientific visualizations and their relationship with the public. The "Flatten the Curve" graphic was heavily used during the COVID-19 pandemic to convey a complex concept in a simple form. It displays two curves comparing the impact on case loads for medical facilities if the populace either adopts or fails to adopt protective measures during a pandemic. We use five variants of the "Flatten the Curve" image as a case study for viewing the spread of an image online. To evaluate its spread, we leverage three information channels: reverse image search engines, social media, and web archives. Reverse image searches give us a current view into image reuse. Social media helps us understand a variant's popularity over time. Web archives help us see when it was preserved, highlighting a view of popularity for future researchers. Our case study shows that document URLs can be used as a proxy for images when studying the spread of images online.
Submitted 12 July, 2023;
originally announced July 2023.
-
Semi-supervised Learning of Pushforwards For Domain Translation & Adaptation
Authors:
Nishant Panda,
Natalie Klein,
Dominic Yang,
Patrick Gasda,
Diane Oyen
Abstract:
Given two probability densities on related data spaces, we seek a map pushing one density to the other while satisfying application-dependent constraints. For maps to have utility in a broad application space (including domain translation, domain adaptation, and generative modeling), the map must be available to apply on out-of-sample data points and should correspond to a probabilistic model over the two spaces. Unfortunately, existing approaches, which are primarily based on optimal transport, do not address these needs. In this paper, we introduce a novel pushforward map learning algorithm that utilizes normalizing flows to parameterize the map. We first re-formulate the classical optimal transport problem to be map-focused and propose a learning algorithm to select from all possible maps under the constraint that the map minimizes a probability distance and application-specific regularizers; thus, our method can be seen as solving a modified optimal transport problem. Once the map is learned, it can be used to map samples from a source domain to a target domain. In addition, because the map is parameterized as a composition of normalizing flows, it models the empirical distributions over the two data spaces and allows both sampling and likelihood evaluation for both data sets. We compare our method (parOT) to related optimal transport approaches in the context of domain adaptation and domain translation on benchmark data sets. Finally, to illustrate the impact of our work on applied problems, we apply parOT to a real scientific application: spectral calibration for high-dimensional measurements from two vastly different environments.
Submitted 17 April, 2023;
originally announced April 2023.
-
Generative structured normalizing flow Gaussian processes applied to spectroscopic data
Authors:
Natalie Klein,
Nishant Panda,
Patrick Gasda,
Diane Oyen
Abstract:
In this work, we propose a novel generative model for mapping inputs to structured, high-dimensional outputs using structured conditional normalizing flows and Gaussian process regression. The model is motivated by the need to characterize uncertainty in the input/output relationship when making inferences on new data. In particular, in the physical sciences, limited training data may not adequately characterize future observed data; it is critical that models adequately indicate uncertainty, particularly when they may be asked to extrapolate. In our proposed model, structured conditional normalizing flows provide parsimonious latent representations that relate to the inputs through a Gaussian process, providing exact likelihood calculations and uncertainty that naturally increases away from the training data inputs. We demonstrate the methodology on laser-induced breakdown spectroscopy data from the ChemCam instrument onboard the Mars rover Curiosity. ChemCam was designed to recover the chemical composition of rock and soil samples by measuring the spectral properties of plasma atomic emissions induced by a laser pulse. We show that our model can generate realistic spectra conditional on a given chemical composition and that we can use the model to perform uncertainty quantification of chemical compositions for new observed spectra. Based on our results, we anticipate that our proposed modeling approach may be useful in other scientific domains with high-dimensional, complex structure where it is important to quantify predictive uncertainty.
Submitted 14 December, 2022;
originally announced December 2022.
-
Abstract Images Have Different Levels of Retrievability Per Reverse Image Search Engine
Authors:
Shawn M. Jones,
Diane Oyen
Abstract:
Much computer vision research has focused on natural images, but technical documents typically consist of abstract images, such as charts, drawings, diagrams, and schematics. How well do general web search engines discover abstract images? Recent advancements in computer vision and machine learning have led to the rise of reverse image search engines. Where conventional search engines accept a text query and return a set of document results, including images, a reverse image search accepts an image as a query and returns a set of images as results. This paper evaluates how well common reverse image search engines discover abstract images. We conducted an experiment leveraging images from Wikimedia Commons, a website known to be well indexed by Baidu, Bing, Google, and Yandex. We measure how difficult an image is to find again (retrievability), what percentage of images returned are relevant (precision), and the average number of results a visitor must review before finding the submitted image (mean reciprocal rank). When trying to discover the same image again among similar images, Yandex performs best. When searching for pages containing a specific image, Google and Yandex outperform the others when discovering photographs, with precision scores of 0.8191 and 0.8297, respectively. In both of these cases, Google and Yandex perform better with natural images than with abstract ones, with a difference in retrievability as high as 54% between images in these categories. These results affect anyone applying common web search engines to search for technical documents that use abstract images.
Submitted 3 November, 2022;
originally announced November 2022.
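The precision and mean reciprocal rank measures named in the abstract above can be sketched as follows. This is a minimal illustration using the standard textbook definitions, not the authors' evaluation code; the function names and the input conventions (a list of relevance flags, and a list of 1-based ranks with `None` for queries where the submitted image was never returned) are assumptions made for the example.

```python
from typing import Optional, Sequence


def precision(relevant_flags: Sequence[bool]) -> float:
    """Fraction of returned results that are relevant."""
    if not relevant_flags:
        raise ValueError("need at least one returned result")
    return sum(relevant_flags) / len(relevant_flags)


def mean_reciprocal_rank(ranks: Sequence[Optional[int]]) -> float:
    """Mean of 1/rank over queries.

    Each entry is the 1-based position at which the submitted image
    first appears in the results, or None if it never appears
    (such a query contributes 0 to the sum).
    """
    if not ranks:
        raise ValueError("need at least one query")
    return sum(1.0 / r for r in ranks if r is not None) / len(ranks)


if __name__ == "__main__":
    # Hypothetical data: four results, three of them relevant.
    print(precision([True, True, False, True]))
    # Hypothetical data: image found at ranks 1, 2, and 4; one query missed.
    print(mean_reciprocal_rank([1, 2, None, 4]))
```

A mean reciprocal rank close to 1 means the submitted image tends to appear at the top of the results; values near 0 mean it appears far down the list or not at all.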
-
Robustness to Label Noise Depends on the Shape of the Noise Distribution in Feature Space
Authors:
Diane Oyen,
Michal Kucer,
Nick Hengartner,
Har Simrat Singh
Abstract:
Machine learning classifiers have been demonstrated, both empirically and theoretically, to be robust to label noise under certain conditions -- notably the typical assumption is that label noise is independent of the features given the class label. We provide a theoretical framework that generalizes beyond this typical assumption by modeling label noise as a distribution over feature space. We show that both the scale and the shape of the noise distribution influence the posterior likelihood; and the shape of the noise distribution has a stronger impact on classification performance if the noise is concentrated in feature space where the decision boundary can be moved. For the special case of uniform label noise (independent of features and the class label), we show that the Bayes optimal classifier for $c$ classes is robust to label noise until the ratio of noisy samples goes above $\frac{c-1}{c}$ (e.g. 90% for 10 classes), which we call the tipping point. However, for the special case of class-dependent label noise (independent of features given the class label), the tipping point can be as low as 50%. Most importantly, we show that when the noise distribution targets decision boundaries (label noise is directly dependent on feature space), classification robustness can drop off even at a small scale of noise. Even when evaluating recent label-noise mitigation methods we see reduced accuracy when label noise is dependent on features. These findings explain why machine learning often handles label noise well if the noise distribution is uniform in feature-space; yet it also points to the difficulty of overcoming label noise when it is concentrated in a region of feature space where a decision boundary can move.
Submitted 2 June, 2022;
originally announced June 2022.
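The $\frac{c-1}{c}$ tipping point quoted in the abstract above can be checked with a short worked example. The sketch assumes one simple reading of uniform label noise: a fraction `noise_ratio` of samples receive a wrong label chosen uniformly among the other $c-1$ classes. The paper's exact noise model may differ, and the function names are hypothetical.

```python
from fractions import Fraction


def tipping_point(c: int) -> Fraction:
    """Noise ratio (c - 1) / c quoted in the abstract for c classes."""
    if c < 2:
        raise ValueError("need at least two classes")
    return Fraction(c - 1, c)


def true_class_still_wins(noise_ratio: Fraction, c: int) -> bool:
    """Check whether the true class remains the most probable label.

    Assumed noise model: a noisy sample gets one of the c - 1 wrong
    labels uniformly at random, so the observed label equals the true
    class with probability 1 - noise_ratio and equals any particular
    wrong class with probability noise_ratio / (c - 1).
    """
    p_true = 1 - noise_ratio
    p_each_wrong = noise_ratio / (c - 1)
    return p_true > p_each_wrong


if __name__ == "__main__":
    c = 10
    print(tipping_point(c))                              # 9/10
    print(true_class_still_wins(Fraction(89, 100), c))   # just below 90%
    print(true_class_still_wins(Fraction(91, 100), c))   # just above 90%
```

Under this assumed model, the inequality $1 - \rho > \rho/(c-1)$ rearranges to $\rho < (c-1)/c$, which matches the 90% figure given for 10 classes.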
-
On visual self-supervision and its effect on model robustness
Authors:
Michal Kucer,
Diane Oyen,
Garrett Kenyon
Abstract:
Recent self-supervision methods have found success in learning feature representations that could rival ones from full supervision, and have been shown to be beneficial to the model in several ways: for example, improving model robustness and out-of-distribution detection. In our paper, we conduct an empirical study to understand more precisely in what ways self-supervised learning, as a pre-training technique or part of adversarial training, affects model robustness to $l_2$ and $l_{\infty}$ adversarial perturbations and natural image corruptions. Self-supervision can indeed improve model robustness; however, it turns out the devil is in the details. If one simply adds self-supervision loss in tandem with adversarial training, then one sees improvement in accuracy of the model when evaluated with adversarial perturbations smaller than or comparable to the value of $ε_{train}$ that the robust model is trained with. However, if one observes the accuracy for $ε_{test} \ge ε_{train}$, the model accuracy drops. In fact, the larger the weight of the supervision loss, the larger the drop in performance, i.e. harming the robustness of the model. We identify primary ways in which self-supervision can be added to adversarial training, and observe that using a self-supervised loss both to optimize network parameters and to find adversarial examples leads to the strongest improvement in model robustness, as this can be viewed as a form of ensemble adversarial training. Although self-supervised pre-training yields benefits in improving adversarial training as compared to random weight initialization, we observe no benefit in model robustness or accuracy if self-supervision is incorporated into adversarial training.
Submitted 8 December, 2021;
originally announced December 2021.
-
SpectroscopyNet: Learning to pre-process Spectroscopy Signals without clean data
Authors:
Juan Castorena,
Diane Oyen
Abstract:
In this work we propose a deep learning approach to clean spectroscopy signals using only uncleaned data. Cleaning signals from spectroscopy instrument noise is challenging because the noise follows an unknown multivariate distribution with non-zero mean. Our framework is a siamese neural net that learns identifiable disentanglement of the signal and noise components under a stationarity assumption. The disentangled representations satisfy reconstruction fidelity, reduce consistency with measurements of unrelated targets, and impose relaxed-orthogonality constraints between the signal and noise representations. Evaluations on a laser induced breakdown spectroscopy (LIBS) dataset from the ChemCam instrument onboard the Martian Curiosity rover show superior performance in cleaning LIBS measurements compared to the standard feature-engineered approaches being used by the ChemCam team.
Submitted 3 January, 2023; v1 submitted 26 October, 2021;
originally announced October 2021.
-
Neural density estimation and uncertainty quantification for laser induced breakdown spectroscopy spectra
Authors:
Katiana Kontolati,
Natalie Klein,
Nishant Panda,
Diane Oyen
Abstract:
Constructing probability densities for inference in high-dimensional spectral data is often intractable. In this work, we use normalizing flows on structured spectral latent spaces to estimate such densities, enabling downstream inference tasks. In addition, we evaluate a method for uncertainty quantification when predicting unobserved state vectors associated with each spectrum. We demonstrate the capability of this approach on laser-induced breakdown spectroscopy data collected by the ChemCam instrument on the Mars rover Curiosity. Using our approach, we are able to generate realistic spectral samples and to accurately predict state vectors with associated well-calibrated uncertainties. We anticipate that this methodology will enable efficient probabilistic modeling of spectral data, leading to potential advances in several areas, including out-of-distribution detection and sensitivity analysis.
Submitted 16 August, 2021;
originally announced August 2021.
-
Diff2Dist: Learning Spectrally Distinct Edge Functions, with Applications to Cell Morphology Analysis
Authors:
Cory Braker Scott,
Eric Mjolsness,
Diane Oyen,
Chie Kodera,
David Bouchez,
Magalie Uyttewaal
Abstract:
We present a method for learning "spectrally descriptive" edge weights for graphs. We generalize a previously known distance measure on graphs (Graph Diffusion Distance), thereby allowing it to be tuned to minimize an arbitrary loss function. Because all steps involved in calculating this modified GDD are differentiable, we demonstrate that it is possible for a small neural network model to learn edge weights which minimize loss. GDD alone does not effectively discriminate between graphs constructed from shoot apical meristem images of wild-type vs. mutant Arabidopsis thaliana specimens. However, training edge weights and kernel parameters with contrastive loss produces a learned distance metric with large margins between these graph categories. We demonstrate this by showing improved performance of a simple k-nearest-neighbors classifier on the learned distance matrix. We also demonstrate a further application of this method to biological image analysis: once trained, we use our model to compute the distance between the biological graphs and a set of graphs output by a cell division simulator. This allows us to identify simulation parameter regimes which are similar to each class of graph in our original dataset.
Submitted 29 June, 2021;
originally announced June 2021.
-
Deep Spectral CNN for Laser Induced Breakdown Spectroscopy
Authors:
Juan Castorena,
Diane Oyen,
Ann Ollila,
Carey Legget,
Nina Lanza
Abstract:
This work proposes a spectral convolutional neural network (CNN) operating on laser induced breakdown spectroscopy (LIBS) signals to learn to (1) disentangle spectral signals from the sources of sensor uncertainty (i.e., pre-process) and (2) get qualitative and quantitative measures of chemical content of a sample given a spectral signal (i.e., calibrate). Once the spectral CNN is trained, it can accomplish either task through a single feed-forward pass, with real-time benefits and without any additional side information requirements including dark current, system response, temperature and detector-to-target range. Our experiments demonstrate that the proposed method outperforms the existing approaches used by the Mars Science Lab for pre-processing and calibration for remote sensing observations from the Mars rover, 'Curiosity'.
Submitted 2 December, 2020;
originally announced December 2020.
-
StressNet: Deep Learning to Predict Stress With Fracture Propagation in Brittle Materials
Authors:
Yinan Wang,
Diane Oyen,
Weihong Guo,
Anishi Mehta,
Cory Braker Scott,
Nishant Panda,
M. Giselle Fernández-Godino,
Gowri Srinivasan,
Xiaowei Yue
Abstract:
Catastrophic failure in brittle materials is often due to the rapid growth and coalescence of cracks aided by high internal stresses. Hence, accurate prediction of maximum internal stress is critical to predicting time to failure and improving the fracture resistance and reliability of materials. Existing high-fidelity methods, such as the Finite-Discrete Element Model (FDEM), are limited by their high computational cost. Therefore, to reduce computational cost while preserving accuracy, a novel deep learning model, "StressNet," is proposed to predict the entire sequence of maximum internal stress based on fracture propagation and the initial stress data. More specifically, the Temporal Independent Convolutional Neural Network (TI-CNN) is designed to capture the spatial features of fractures like fracture path and spall regions, and the Bidirectional Long Short-term Memory (Bi-LSTM) Network is adapted to capture the temporal features. By fusing these features, the evolution in time of the maximum internal stress can be accurately predicted. Moreover, an adaptive loss function is designed by dynamically integrating the Mean Squared Error (MSE) and the Mean Absolute Percentage Error (MAPE), to reflect the fluctuations in maximum internal stress. After training, the proposed model is able to compute accurate multi-step predictions of maximum internal stress in approximately 20 seconds, as compared to the FDEM run time of 4 hours, with an average MAPE of 2% relative to test data.
Submitted 20 November, 2020;
originally announced November 2020.
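The StressNet abstract above describes a loss that integrates the Mean Squared Error (MSE) and the Mean Absolute Percentage Error (MAPE). The sketch below shows only a fixed-weight combination of the two. The abstract does not say how the weighting is adapted during training, so the `alpha` parameter and the function names are hypothetical placeholders rather than the authors' adaptive scheme.

```python
from typing import Sequence


def mse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean squared error."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def mape(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean absolute percentage error, as a fraction (0.02 means 2%).

    Assumes no true value is zero.
    """
    return sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)


def combined_loss(
    y_true: Sequence[float], y_pred: Sequence[float], alpha: float = 0.5
) -> float:
    """Fixed-weight blend: alpha * MSE + (1 - alpha) * MAPE.

    alpha is a hypothetical constant here; an adaptive loss would
    update it during training by some rule not given in the abstract.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return alpha * mse(y_true, y_pred) + (1 - alpha) * mape(y_true, y_pred)


if __name__ == "__main__":
    # Hypothetical stress values, not data from the paper.
    print(combined_loss([2.0, 4.0], [1.0, 5.0], alpha=0.5))
```

MSE penalizes large absolute errors, while MAPE is scale-free, which is one plausible reason to blend them when the target fluctuates over a wide range.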
-
Diagram Image Retrieval using Sketch-Based Deep Learning and Transfer Learning
Authors:
Manish Bhattarai,
Diane Oyen,
Juan Castorena,
Liping Yang,
Brendt Wohlberg
Abstract:
Resolution of the complex problem of image retrieval for diagram images has yet to be reached. Deep learning methods continue to excel in the fields of object detection and image classification applied to natural imagery. However, the application of such methodologies to binary imagery remains limited due to the lack of crucial features such as texture, color, and intensity information. This paper presents a deep learning based method for image-based search for binary patent images by taking advantage of existing large natural image repositories for image search and sketch-based methods. (Sketches are not identical to diagrams, but they do share some characteristics; for example, both imagery types are gray scale (binary), composed of contours, and are lacking in texture.)
We begin by using deep learning to generate sketches from natural images for image retrieval and then train a second deep learning model on the sketches. We then use our small set of manually labeled patent diagram images via transfer learning to adapt the image search from sketches of natural images to diagrams. Our experimental results show the effectiveness of deep learning with transfer learning for detecting near-identical copies in patent images and querying similar images based on content.
Submitted 22 April, 2020;
originally announced April 2020.
-
Learning Spatial Relationships between Samples of Patent Image Shapes
Authors:
Juan Castorena,
Manish Bhattarai,
Diane Oyen
Abstract:
Binary image based classification and retrieval of documents of an intellectual nature is a very challenging problem. Variations in how the binary images are generated, which are left to the document's designer and include drawing style, viewpoint, and the inclusion of multiple image components, plausibly increase the complexity of the problem. In this work, we propose a method suited to binary images which carries over some of the successes of deep learning (DL) to alleviate the problems introduced by the aforementioned variations. The method consists of extracting the shape of interest from the binary image and applying a non-Euclidean geometric neural-net architecture to learn the local and global spatial relationships of the shape. Empirical results show that our method is in some sense invariant to the image generation mechanism variations and outperforms existing methods on a patent image dataset benchmark.
Submitted 27 April, 2020; v1 submitted 12 April, 2020;
originally announced April 2020.
-
TGGLines: A Robust Topological Graph Guided Line Segment Detector for Low Quality Binary Images
Authors:
Ming Gong,
Liping Yang,
Catherine Potts,
Vijayan K. Asari,
Diane Oyen,
Brendt Wohlberg
Abstract:
Line segment detection is an essential task in computer vision and image analysis, as it is the critical foundation for advanced tasks such as shape modeling and road lane line detection for autonomous driving. We present a robust topological graph guided approach for line segment detection in low quality binary images (hence, we call it TGGLines). Due to the graph-guided approach, TGGLines not only detects line segments, but also organizes the segments with a line segment connectivity graph, which means the topological relationships (e.g., intersection, an isolated line segment) of the detected line segments are captured and stored; whereas other line detectors only retain a collection of loose line segments. Our empirical results show that the TGGLines detector visually and quantitatively outperforms state-of-the-art line segment detection methods. In addition, our TGGLines approach has the following two competitive advantages: (1) our method only requires one parameter and it is adaptive, whereas almost all other line segment detection methods require multiple (non-adaptive) parameters, and (2) the line segments detected by TGGLines are organized by a line segment connectivity graph.
Submitted 27 February, 2020;
originally announced February 2020.
-
The ISTI Rapid Response on Exploring Cloud Computing 2018
Authors:
Carleton Coffrin,
James Arnold,
Stephan Eidenbenz,
Derek Aberle,
John Ambrosiano,
Zachary Baker,
Sara Brambilla,
Michael Brown,
K. Nolan Carter,
Pinghan Chu,
Patrick Conry,
Keeley Costigan,
Ariane Eberhardt,
David M. Fobes,
Adam Gausmann,
Sean Harris,
Donovan Heimer,
Marlin Holmes,
Bill Junor,
Csaba Kiss,
Steve Linger,
Rodman Linn,
Li-Ta Lo,
Jonathan MacCarthy,
Omar Marcillo
, et al. (23 additional authors not shown)
Abstract:
This report describes eighteen projects that explored how commercial cloud computing services can be utilized for scientific computation at national laboratories. These demonstrations ranged from deploying proprietary software in a cloud environment to leveraging established cloud-based analytics workflows for processing scientific datasets. By and large, the projects were successful and collectively they suggest that cloud computing can be a valuable computational resource for scientific computation at national laboratories.
Submitted 4 January, 2019;
originally announced January 2019.
-
Quantum Algorithm Implementations for Beginners
Authors:
Abhijith J.,
Adetokunbo Adedoyin,
John Ambrosiano,
Petr Anisimov,
William Casper,
Gopinath Chennupati,
Carleton Coffrin,
Hristo Djidjev,
David Gunter,
Satish Karra,
Nathan Lemons,
Shizeng Lin,
Alexander Malyzhenkov,
David Mascarenas,
Susan Mniszewski,
Balu Nadiga,
Daniel O'Malley,
Diane Oyen,
Scott Pakin,
Lakshman Prasad,
Randy Roberts,
Phillip Romero,
Nandakishore Santhi,
Nikolai Sinitsyn,
Pieter J. Swart,
et al. (9 additional authors not shown)
Abstract:
As quantum computers become available to the general public, the need has arisen to train a cohort of quantum programmers, many of whom have been developing classical computer programs for most of their careers. While currently available quantum computers have fewer than 100 qubits, quantum computing hardware is widely expected to grow in terms of qubit count, quality, and connectivity. This review aims to explain the principles of quantum programming, which are quite different from classical programming, with straightforward algebra that makes understanding of the underlying fascinating quantum mechanical principles optional. We give an introduction to quantum computing algorithms and their implementation on real quantum hardware. We survey 20 different quantum algorithms, attempting to describe each in a succinct and self-contained fashion. We show how these algorithms can be implemented on IBM's quantum computer, and in each case, we discuss the results of the implementation with respect to differences between the simulator and the actual hardware runs. This article introduces computer scientists, physicists, and engineers to quantum algorithms and provides a blueprint for their implementations.
Submitted 26 June, 2022; v1 submitted 10 April, 2018;
originally announced April 2018.
-
Controlling the Precision-Recall Tradeoff in Differential Dependency Network Analysis
Authors:
Diane Oyen,
Alexandru Niculescu-Mizil,
Rachel Ostroff,
Alex Stewart,
Vincent P. Clark
Abstract:
Graphical models have gained a lot of attention recently as a tool for learning and representing dependencies among variables in multivariate data. Often, domain scientists are looking specifically for differences among the dependency networks of different conditions or populations (e.g. differences between regulatory networks of different species, or differences between dependency networks of diseased versus healthy populations). The standard method for finding these differences is to learn the dependency networks for each condition independently and compare them. We show that this approach is prone to high false discovery rates (low precision) that can render the analysis useless. We then show that by imposing a bias towards learning similar dependency networks for each condition the false discovery rates can be reduced to acceptable levels, at the cost of finding a reduced number of differences. Algorithms developed in the transfer learning literature can be used to vary the strength of the imposed similarity bias and provide a natural mechanism to smoothly adjust this differential precision-recall tradeoff to cater to the requirements of the analysis conducted. We present real case studies (oncological and neurological) where domain experts use the proposed technique to extract useful differential networks that shed light on the biological processes involved in cancer and brain function.
Submitted 9 July, 2013;
originally announced July 2013.
-
Bayesian Discovery of Multiple Bayesian Networks via Transfer Learning
Authors:
Diane Oyen,
Terran Lane
Abstract:
Bayesian network structure learning algorithms with limited data are being used in domains such as systems biology and neuroscience to gain insight into the underlying processes that produce observed data. Learning reliable networks from limited data is difficult; transfer learning can therefore improve the robustness of learned networks by leveraging data from related tasks. Existing transfer learning algorithms for Bayesian network structure learning give a single maximum a posteriori estimate of network models. Yet, many other models may be equally likely, and so a more informative result is provided by Bayesian structure discovery. Bayesian structure discovery algorithms estimate posterior probabilities of structural features, such as edges. We present transfer learning for Bayesian structure discovery which allows us to explore the shared and unique structural features among related tasks. Efficient computation requires that our transfer learning objective factors into local calculations, which we prove is given by a broad class of transfer biases. Theoretically, we show the efficiency of our approach. Empirically, we show that compared to single task learning, transfer learning is better able to positively identify true edges. We apply the method to whole-brain neuroimaging data.
Submitted 8 July, 2013;
originally announced July 2013.