-
Large Language Models Assume People are More Rational than We Really are
Authors:
Ryan Liu,
Jiayi Geng,
Joshua C. Peterson,
Ilia Sucholutsky,
Thomas L. Griffiths
Abstract:
In order for AI systems to communicate effectively with people, they must understand how we make decisions. However, people's decisions are not always rational, so the implicit internal models of human decision-making in Large Language Models (LLMs) must account for this. Previous empirical evidence seems to suggest that these implicit models are accurate -- LLMs offer believable proxies of human behavior, acting how we expect humans would in everyday interactions. However, by comparing LLM behavior and predictions to a large dataset of human decisions, we find that this is actually not the case: when both simulating and predicting people's choices, a suite of cutting-edge LLMs (GPT-4o & 4-Turbo, Llama-3-8B & 70B, Claude 3 Opus) assume that people are more rational than we really are. Specifically, these models deviate from human behavior and align more closely with a classic model of rational choice -- expected value theory. Interestingly, people also tend to assume that other people are rational when interpreting their behavior. As a consequence, when we compare the inferences that LLMs and people draw from the decisions of others using another psychological dataset, we find that these inferences are highly correlated. Thus, the implicit decision-making models of LLMs appear to be aligned with the human expectation that other people will act rationally, rather than with how people actually act.
Submitted 1 July, 2024; v1 submitted 24 June, 2024;
originally announced June 2024.
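As a small illustration of the expected value theory mentioned in the abstract above, the following sketch computes the choice an idealized expected-value maximizer would make between two gambles. The gambles and the names `expected_value` and `ev_choice` are hypothetical and are not taken from the paper or its dataset.

```python
from typing import List, Tuple

# A gamble is a list of (probability, payoff) pairs.
Gamble = List[Tuple[float, float]]


def expected_value(gamble: Gamble) -> float:
    """Probability-weighted average payoff of a gamble."""
    total_p = sum(p for p, _ in gamble)
    if abs(total_p - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    return sum(p * x for p, x in gamble)


def ev_choice(a: Gamble, b: Gamble) -> str:
    """Choice of an idealized expected-value maximizer (ties go to 'A')."""
    return "A" if expected_value(a) >= expected_value(b) else "B"


# Hypothetical gambles, chosen only for illustration.
safe = [(1.0, 30.0)]                # 30 for sure
risky = [(0.5, 80.0), (0.5, 0.0)]   # 50% chance of 80, otherwise nothing

print(expected_value(safe), expected_value(risky), ev_choice(safe, risky))
```

Under these made-up numbers the rational model always picks the risky gamble, whereas a risk-averse person may well prefer the sure payoff. That is the kind of gap between rational-choice predictions and actual human behavior the abstract refers to.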
-
Spatially parallel decoding for multi-qubit lattice surgery
Authors:
Sophia Fuhui Lin,
Eric C. Peterson,
Krishanu Sankar,
Prasahnt Sivarajah
Abstract:
Running quantum algorithms protected by quantum error correction requires a real time, classical decoder. To prevent the accumulation of a backlog, this decoder must process syndromes from the quantum device at a faster rate than they are generated. Most prior work on real time decoding has focused on an isolated logical qubit encoded in the surface code. However, for surface code, quantum programs of utility will require multi-qubit interactions performed via lattice surgery. A large merged patch can arise during lattice surgery -- possibly as large as the entire device. This puts a significant strain on a real time decoder, which must decode errors on this merged patch and maintain the level of fault-tolerance that it achieves on isolated logical qubits.
These requirements are relaxed by using spatially parallel decoding, which can be accomplished by dividing the physical qubits on the device into multiple overlapping groups and assigning a decoder module to each. We refer to this approach as spatially parallel windows. While previous work has explored similar ideas, none have addressed system-specific considerations pertinent to the task or the constraints from using hardware accelerators. In this work, we demonstrate how to configure spatially parallel windows, so that the scheme (1) is compatible with hardware accelerators, (2) supports general lattice surgery operations, (3) maintains the fidelity of the logical qubits, and (4) meets the throughput requirement for real time decoding. Furthermore, our results reveal the importance of optimally choosing the buffer width to achieve a balance between accuracy and throughput -- a decision that should be influenced by the device's physical noise.
Submitted 6 May, 2024; v1 submitted 2 March, 2024;
originally announced March 2024.
-
ReLU Neural Networks, Polyhedral Decompositions, and Persistent Homology
Authors:
Yajing Liu,
Christina M Cole,
Chris Peterson,
Michael Kirby
Abstract:
A ReLU neural network leads to a finite polyhedral decomposition of input space and a corresponding finite dual graph. We show that while this dual graph is a coarse quantization of input space, it is sufficiently robust that it can be combined with persistent homology to detect homological signals of manifolds in the input space from samples. This property holds for a variety of networks trained for a wide range of purposes that have nothing to do with this topological application. We found this feature to be surprising and interesting; we hope it will also be useful.
Submitted 30 June, 2023;
originally announced June 2023.
-
The Universal Law of Generalization Holds for Naturalistic Stimuli
Authors:
Raja Marjieh,
Nori Jacoby,
Joshua C. Peterson,
Thomas L. Griffiths
Abstract:
Shepard's universal law of generalization is a remarkable hypothesis about how intelligent organisms should perceive similarity. In its broadest form, the universal law states that the level of perceived similarity between a pair of stimuli should decay as a concave function of their distance when embedded in an appropriate psychological space. While extensively studied, evidence in support of the universal law has relied on low-dimensional stimuli and small stimulus sets that are very different from their real-world counterparts. This is largely because pairwise comparisons -- as required for similarity judgments -- scale quadratically in the number of stimuli. We provide direct evidence for the universal law in a naturalistic high-dimensional regime by analyzing an existing dataset of 214,200 human similarity judgments and a newly collected dataset of 390,819 human generalization judgments (N=2406 US participants) across three sets of natural images.
Submitted 14 June, 2023;
originally announced June 2023.
-
Hamming Similarity and Graph Laplacians for Class Partitioning and Adversarial Image Detection
Authors:
Huma Jamil,
Yajing Liu,
Turgay Caglar,
Christina M. Cole,
Nathaniel Blanchard,
Christopher Peterson,
Michael Kirby
Abstract:
Researchers typically investigate neural network representations by examining activation outputs for one or more layers of a network. Here, we investigate the potential for ReLU activation patterns (encoded as bit vectors) to aid in understanding and interpreting the behavior of neural networks. We utilize Representational Dissimilarity Matrices (RDMs) to investigate the coherence of data within the embedding spaces of a deep neural network. From each layer of a network, we extract and utilize bit vectors to construct similarity scores between images. From these similarity scores, we build a similarity matrix for a collection of images drawn from 2 classes. We then apply Fiedler partitioning to the associated Laplacian matrix to separate the classes. Our results indicate, through bit vector representations, that the network continues to refine class detectability with the last ReLU layer achieving better than 95\% separation accuracy. Additionally, we demonstrate that bit vectors aid in adversarial image detection, again achieving over 95\% accuracy in separating adversarial and non-adversarial images using a simple classifier.
Submitted 5 May, 2023; v1 submitted 2 May, 2023;
originally announced May 2023.
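The pipeline outlined in the abstract above (bit vectors, pairwise similarity scores, a similarity matrix, then Fiedler partitioning of the associated Laplacian) can be sketched roughly as follows. This is a minimal pure-Python sketch under stated assumptions, not the authors' implementation: the bit vectors are made up, the similarity is taken to be the fraction of agreeing bits, and the Fiedler vector is approximated with a shifted power iteration rather than a library eigensolver.

```python
from math import sqrt


def hamming_similarity(a, b):
    """Fraction of positions at which two equal-length bit vectors agree."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def laplacian(bits):
    """Graph Laplacian L = D - W, with W the pairwise Hamming similarities."""
    n = len(bits)
    W = [[0.0 if i == j else hamming_similarity(bits[i], bits[j])
          for j in range(n)] for i in range(n)]
    return [[(sum(W[i]) if i == j else -W[i][j]) for j in range(n)]
            for i in range(n)]


def fiedler_vector(L, iters=500):
    """Approximate the eigenvector for the second-smallest eigenvalue of L.

    Runs power iteration on (shift*I - L), restricted to vectors orthogonal
    to the all-ones vector (the eigenvector of L for eigenvalue 0).
    """
    n = len(L)
    shift = 2.0 * max(L[i][i] for i in range(n))  # Gershgorin upper bound
    v = [float(i) for i in range(n)]
    for _ in range(iters):
        mean = sum(v) / n
        v = [x - mean for x in v]  # project out the constant component
        w = [shift * v[i] - sum(L[i][j] * v[j] for j in range(n))
             for i in range(n)]
        norm = sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v


def fiedler_partition(bits):
    """Split items into two groups by the sign of the Fiedler vector."""
    return [x >= 0 for x in fiedler_vector(laplacian(bits))]


# Made-up activation patterns for six images from two hypothetical classes.
bits = [
    [1, 1, 1, 1, 0, 0, 0, 0],
    [1, 1, 1, 0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1, 0, 0, 0],
    [0, 0, 0, 0, 1, 1, 1, 1],
    [0, 0, 0, 1, 1, 1, 1, 1],
    [0, 0, 0, 0, 0, 1, 1, 1],
]
labels = fiedler_partition(bits)
print(labels)
```

Which group ends up labeled `True` is arbitrary, since an eigenvector is only defined up to sign. What matters is that the first three items land in one group and the last three in the other.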
-
Automated deep learning segmentation of high-resolution 7 T postmortem MRI for quantitative analysis of structure-pathology correlations in neurodegenerative diseases
Authors:
Pulkit Khandelwal,
Michael Tran Duong,
Shokufeh Sadaghiani,
Sydney Lim,
Amanda Denning,
Eunice Chung,
Sadhana Ravikumar,
Sanaz Arezoumandan,
Claire Peterson,
Madigan Bedard,
Noah Capp,
Ranjit Ittyerah,
Elyse Migdal,
Grace Choi,
Emily Kopp,
Bridget Loja,
Eusha Hasan,
Jiacheng Li,
Alejandra Bahena,
Karthik Prabhakaran,
Gabor Mizsei,
Marianna Gabrielyan,
Theresa Schuck,
Winifred Trotman,
John Robinson
, et al. (12 additional authors not shown)
Abstract:
Postmortem MRI allows brain anatomy to be examined at high resolution and to link pathology measures with morphometric measurements. However, automated segmentation methods for brain mapping in postmortem MRI are not well developed, primarily due to limited availability of labeled datasets, and heterogeneity in scanner hardware and acquisition protocols. In this work, we present a high-resolution dataset of 135 postmortem human brain tissue specimens imaged at 0.3 mm$^{3}$ isotropic using a T2w sequence on a 7T whole-body MRI scanner. We developed a deep learning pipeline to segment the cortical mantle by benchmarking the performance of nine deep neural architectures, followed by post-hoc topological correction. We then segment four subcortical structures (caudate, putamen, globus pallidus, and thalamus), white matter hyperintensities, and the normal appearing white matter. We show generalizing capabilities across whole brain hemispheres in different specimens, and also on unseen images acquired at 0.28 mm^3 and 0.16 mm^3 isotropic T2*w FLASH sequence at 7T. We then compute localized cortical thickness and volumetric measurements across key regions, and link them with semi-quantitative neuropathological ratings. Our code, Jupyter notebooks, and the containerized executables are publicly available at: https://pulkit-khandelwal.github.io/exvivo-brain-upenn
Submitted 17 October, 2023; v1 submitted 21 March, 2023;
originally announced March 2023.
-
Dual Graphs of Polyhedral Decompositions for the Detection of Adversarial Attacks
Authors:
Huma Jamil,
Yajing Liu,
Christina M. Cole,
Nathaniel Blanchard,
Emily J. King,
Michael Kirby,
Christopher Peterson
Abstract:
Previous work has shown that a neural network with the rectified linear unit (ReLU) activation function leads to a convex polyhedral decomposition of the input space. These decompositions can be represented by a dual graph with vertices corresponding to polyhedra and edges corresponding to polyhedra sharing a facet, which is a subgraph of a Hamming graph. This paper illustrates how one can utilize the dual graph to detect and analyze adversarial attacks in the context of digital images. When an image passes through a network containing ReLU nodes, the firing or non-firing at a node can be encoded as a bit ($1$ for ReLU activation, $0$ for ReLU non-activation). The sequence of all bit activations identifies the image with a bit vector, which identifies it with a polyhedron in the decomposition and, in turn, identifies it with a vertex in the dual graph. We identify ReLU bits that are discriminators between non-adversarial and adversarial images and examine how well collections of these discriminators can ensemble vote to build an adversarial image detector. Specifically, we examine the similarities and differences of ReLU bit vectors for adversarial images, and their non-adversarial counterparts, using a pre-trained ResNet-50 architecture. While this paper focuses on adversarial digital images, ResNet-50 architecture, and the ReLU activation function, our methods extend to other network architectures, activation functions, and types of datasets.
Submitted 2 December, 2022; v1 submitted 23 November, 2022;
originally announced November 2022.
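The bit-vector encoding described in the abstract above ($1$ for ReLU activation, $0$ for non-activation) can be illustrated on a toy fully connected network. This is only a sketch: the weights, the two inputs, and the helper names `relu_bits` and `hamming_distance` are made up for illustration, and the paper itself works with a pre-trained ResNet-50 rather than a network like this one.

```python
def relu_bits(x, layers):
    """Return the ReLU activation pattern of input x as a flat list of bits.

    `layers` is a list of (weights, biases) pairs. Each ReLU unit contributes
    1 if its pre-activation is positive (the unit fires) and 0 otherwise.
    """
    bits = []
    h = x
    for W, b in layers:
        pre = [sum(w * v for w, v in zip(row, h)) + bias
               for row, bias in zip(W, b)]
        bits.extend(1 if z > 0 else 0 for z in pre)
        h = [max(z, 0.0) for z in pre]  # apply ReLU before the next layer
    return bits


def hamming_distance(a, b):
    """Number of positions at which two bit vectors differ."""
    return sum(x != y for x, y in zip(a, b))


# Toy network with hypothetical weights: 2 inputs -> 3 ReLU units -> 2 ReLU units.
layers = [
    ([[1.0, -1.0], [0.5, 0.5], [-1.0, 2.0]], [0.0, -1.0, 0.5]),
    ([[1.0, -1.0, 0.0], [0.0, 1.0, -1.0]], [0.0, 0.25]),
]

clean = [1.0, 0.5]
perturbed = [0.4, 0.5]  # stands in for an adversarially modified input

clean_bits = relu_bits(clean, layers)
perturbed_bits = relu_bits(perturbed, layers)
print(clean_bits, perturbed_bits, hamming_distance(clean_bits, perturbed_bits))
```

Each distinct bit vector corresponds to one polyhedron of the decomposition, so a nonzero Hamming distance means the two inputs fall in different polyhedra. The bits that flip are the sort of candidate discriminators the abstract describes.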
-
On the Informativeness of Supervision Signals
Authors:
Ilia Sucholutsky,
Ruairidh M. Battleday,
Katherine M. Collins,
Raja Marjieh,
Joshua C. Peterson,
Pulkit Singh,
Umang Bhatt,
Nori Jacoby,
Adrian Weller,
Thomas L. Griffiths
Abstract:
Supervised learning typically focuses on learning transferable representations from training examples annotated by humans. While rich annotations (like soft labels) carry more information than sparse annotations (like hard labels), they are also more expensive to collect. For example, while hard labels only provide information about the closest class an object belongs to (e.g., "this is a dog"), soft labels provide information about the object's relationship with multiple classes (e.g., "this is most likely a dog, but it could also be a wolf or a coyote"). We use information theory to compare how a number of commonly-used supervision signals contribute to representation-learning performance, as well as how their capacity is affected by factors such as the number of labels, classes, dimensions, and noise. Our framework provides theoretical justification for using hard labels in the big-data regime, but richer supervision signals for few-shot learning and out-of-distribution generalization. We validate these results empirically in a series of experiments with over 1 million crowdsourced image annotations and conduct a cost-benefit analysis to establish a tradeoff curve that enables users to optimize the cost of supervising representation learning on their own datasets.
Submitted 4 July, 2023; v1 submitted 2 November, 2022;
originally announced November 2022.
-
A distributed blossom algorithm for minimum-weight perfect matching
Authors:
Eric C. Peterson,
Peter J. Karalekas
Abstract:
We describe a distributed, asynchronous variant of Edmonds's exact algorithm for producing perfect matchings of minimum weight. The development of this algorithm is driven by an application to online error correction in quantum computing, first envisioned by Fowler; we analyze the performance of our algorithm as applied to this domain in a sequel.
Submitted 25 October, 2022;
originally announced October 2022.
-
The Flag Median and FlagIRLS
Authors:
Nathan Mankovich,
Emily King,
Chris Peterson,
Michael Kirby
Abstract:
Finding prototypes (e.g., mean and median) for a dataset is central to a number of common machine learning algorithms. Subspaces have been shown to provide useful, robust representations for datasets of images, videos and more. Since subspaces correspond to points on a Grassmann manifold, one is led to consider the idea of a subspace prototype for a Grassmann-valued dataset. While a number of different subspace prototypes have been described, the calculation of some of these prototypes has proven to be computationally expensive while other prototypes are affected by outliers and produce highly imperfect clustering on noisy data. This work proposes a new subspace prototype, the flag median, and introduces the FlagIRLS algorithm for its calculation. We provide evidence that the flag median is robust to outliers and can be used effectively in algorithms like Linde-Buzo-Gray (LBG) to produce improved clusterings on Grassmannians. Numerical experiments include a synthetic dataset, the MNIST handwritten digits dataset, the Mind's Eye video dataset and the UCF YouTube action dataset. The flag median is compared to the other leading algorithms for computing prototypes on the Grassmannian, namely the $\ell_2$-median and the flag mean. We find that using FlagIRLS to compute the flag median converges in $4$ iterations on a synthetic dataset. We also see that Grassmannian LBG with a codebook size of $20$ and using the flag median produces at least a $10\%$ improvement in cluster purity over Grassmannian LBG using the flag mean or $\ell_2$-median on the Mind's Eye dataset.
Submitted 8 March, 2022;
originally announced March 2022.
-
Supporting Massive DLRM Inference Through Software Defined Memory
Authors:
Ehsan K. Ardestani,
Changkyu Kim,
Seung Jae Lee,
Luoshang Pan,
Valmiki Rampersad,
Jens Axboe,
Banit Agrawal,
Fuxun Yu,
Ansha Yu,
Trung Le,
Hector Yuen,
Shishir Juluri,
Akshat Nanda,
Manoj Wodekar,
Dheevatsa Mudigere,
Krishnakumar Nair,
Maxim Naumov,
Chris Peterson,
Mikhail Smelyanskiy,
Vijay Rao
Abstract:
Deep Learning Recommendation Models (DLRM) are widespread, account for a considerable data center footprint, and grow by more than 1.5x per year. With model sizes soon to reach the terabyte range, leveraging Storage Class Memory (SCM) for inference enables lower power consumption and cost. This paper evaluates the major challenges in extending the memory hierarchy to SCM for DLRM, and presents different techniques to improve performance through a Software Defined Memory. We show how underlying technologies such as NAND Flash and 3DXP differ and how they relate to real-world scenarios, enabling power savings of 5% to 29%.
Submitted 8 November, 2021; v1 submitted 21 October, 2021;
originally announced October 2021.
-
Gray Matter Segmentation in Ultra High Resolution 7 Tesla ex vivo T2w MRI of Human Brain Hemispheres
Authors:
Pulkit Khandelwal,
Shokufeh Sadaghiani,
Michael Tran Duong,
Sadhana Ravikumar,
Sydney Lim,
Sanaz Arezoumandan,
Claire Peterson,
Eunice Chung,
Madigan Bedard,
Noah Capp,
Ranjit Ittyerah,
Elyse Migdal,
Grace Choi,
Emily Kopp,
Bridget Loja,
Eusha Hasan,
Jiacheng Li,
Karthik Prabhakaran,
Gabor Mizsei,
Marianna Gabrielyan,
Theresa Schuck,
John Robinson,
Daniel Ohm,
Edward Lee,
John Q. Trojanowski
, et al. (8 additional authors not shown)
Abstract:
Ex vivo MRI of the brain provides remarkable advantages over in vivo MRI for visualizing and characterizing detailed neuroanatomy. However, automated cortical segmentation methods in ex vivo MRI are not well developed, primarily due to limited availability of labeled datasets, and heterogeneity in scanner hardware and acquisition protocols. In this work, we present a high resolution 7 Tesla dataset of 32 ex vivo human brain specimens. We benchmark the cortical mantle segmentation performance of nine neural network architectures, trained and evaluated using manually-segmented 3D patches sampled from specific cortical regions, and show excellent generalizing capabilities across whole brain hemispheres in different specimens, and also on unseen images acquired at different magnetic field strength and imaging sequences. Finally, we provide cortical thickness measurements across key regions in 3D ex vivo human brain images. Our code and processed datasets are publicly available at https://github.com/Pulkit-Khandelwal/picsl-ex-vivo-segmentation.
Submitted 3 March, 2022; v1 submitted 14 October, 2021;
originally announced October 2021.
-
gazel: Supporting Source Code Edits in Eye-Tracking Studies
Authors:
Sarah Fakhoury,
Devjeet Roy,
Harry Pines,
Tyler Cleveland,
Cole Peterson,
Venera Arnaoudova,
Bonita Sharif,
Jonathan Maletic
Abstract:
Eye tracking tools are used in software engineering research to study various software development activities. However, a major limitation of these tools is their inability to track gaze data for activities that involve source code editing. We present a novel solution to support eye tracking experiments for tasks involving source code edits as an extension of the iTrace community infrastructure. We introduce the iTrace-Atom plugin and gazel -- a Python data processing pipeline that maps gaze information to changing source code elements and provides researchers with a way to query this dynamic data. iTrace-Atom is evaluated via a series of simulations and is over 99% accurate at high eye-tracking speeds of over 1,000Hz. iTrace and gazel completely revolutionize the way eye tracking studies are conducted in realistic settings with the presence of scrolling, context switching, and now editing. This opens the doors to support many day-to-day software engineering tasks such as bug fixing, adding new features, and refactoring.
Submitted 19 June, 2021;
originally announced June 2021.
-
Locally Linear Attributes of ReLU Neural Networks
Authors:
Ben Sattelberg,
Renzo Cavalieri,
Michael Kirby,
Chris Peterson,
Ross Beveridge
Abstract:
A ReLU neural network determines a continuous piecewise linear map from an input space to an output space. The weights in the neural network determine a decomposition of the input space into convex polytopes, and on each of these polytopes the network can be described by a single affine mapping. The structure of the decomposition, together with the affine map attached to each polytope, can be analyzed to investigate the behavior of the associated neural network.
Submitted 30 November, 2020;
originally announced December 2020.
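The statement in the abstract above, that the network reduces to a single affine map on each polytope, can be checked numerically on a toy example. The sketch below assumes a network with one hidden ReLU layer, $f(x) = W_2\,\mathrm{ReLU}(W_1 x + b_1) + b_2$. On the polytope whose activation pattern is $s$, this gives $f(x) = A x + c$ with $A = W_2\,\mathrm{diag}(s)\,W_1$ and $c = W_2\,\mathrm{diag}(s)\,b_1 + b_2$. All weights and helper names are made up for illustration.

```python
def matvec(M, v):
    """Matrix-vector product for lists of lists."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def forward(x, W1, b1, W2, b2):
    """Evaluate f(x) = W2 * relu(W1 x + b1) + b2."""
    h = [max(z + b, 0.0) for z, b in zip(matvec(W1, x), b1)]
    return [z + b for z, b in zip(matvec(W2, h), b2)]


def local_affine_map(x, W1, b1, W2, b2):
    """Return (A, c) such that f(y) = A y + c on the polytope containing x."""
    # Activation pattern s: 1.0 where a hidden unit fires at x, else 0.0.
    s = [1.0 if z + b > 0 else 0.0 for z, b in zip(matvec(W1, x), b1)]
    # Zero out the rows of W1 and entries of b1 belonging to inactive units.
    W1m = [[si * w for w in row] for si, row in zip(s, W1)]
    b1m = [si * b for si, b in zip(s, b1)]
    A = [[sum(W2[i][k] * W1m[k][j] for k in range(len(W1)))
          for j in range(len(x))] for i in range(len(W2))]
    c = [z + b for z, b in zip(matvec(W2, b1m), b2)]
    return A, c


# Hypothetical weights: 2 inputs -> 3 hidden ReLU units -> 2 outputs.
W1 = [[1.0, -1.0], [0.5, 0.5], [-1.0, 2.0]]
b1 = [0.0, -1.0, 0.5]
W2 = [[1.0, -1.0, 0.0], [0.0, 1.0, -1.0]]
b2 = [0.0, 0.25]

x = [1.0, 0.5]
A, c = local_affine_map(x, W1, b1, W2, b2)

# A nearby point with the same activation pattern lies in the same polytope.
y = [1.1, 0.5]
affine_at_y = [z + ci for z, ci in zip(matvec(A, y), c)]
print(A, c, forward(y, W1, b1, W2, b2), affine_at_y)
```

For deeper networks the same idea applies layer by layer: the masked weight matrices are multiplied together, and the bias terms are propagated through the later layers.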
-
aether: Distributed system emulation in Common Lisp
Authors:
Eric C. Peterson,
Peter J. Karalekas
Abstract:
We describe a Common Lisp package suitable for the high-level design, specification, simulation, and instrumentation of real-time distributed algorithms and hardware on which to run them. We discuss various design decisions around the package structure, and we explore their consequences with small examples.
Submitted 23 April, 2021; v1 submitted 11 November, 2020;
originally announced November 2020.
-
End-to-end Deep Prototype and Exemplar Models for Predicting Human Behavior
Authors:
Pulkit Singh,
Joshua C. Peterson,
Ruairidh M. Battleday,
Thomas L. Griffiths
Abstract:
Traditional models of category learning in psychology focus on representation at the category level as opposed to the stimulus level, even though the two are likely to interact. The stimulus representations employed in such models are either hand-designed by the experimenter, inferred circuitously from human judgments, or borrowed from pretrained deep neural networks that are themselves competing models of category learning. In this work, we extend classic prototype and exemplar models to learn both stimulus and category representations jointly from raw input. This new class of models can be parameterized by deep neural networks (DNN) and trained end-to-end. Following their namesakes, we refer to them as Deep Prototype Models, Deep Exemplar Models, and Deep Gaussian Mixture Models. Compared to typical DNNs, we find that their cognitively inspired counterparts both provide better intrinsic fit to human behavior and improve ground-truth classification.
Submitted 16 July, 2020;
originally announced July 2020.
-
The flag manifold as a tool for analyzing and comparing data sets
Authors:
Xiaofeng Ma,
Michael Kirby,
Chris Peterson
Abstract:
The shape and orientation of data clouds reflect variability in observations that can confound pattern recognition systems. Subspace methods, utilizing Grassmann manifolds, have been a great aid in dealing with such variability. However, this usefulness begins to falter when the data cloud contains sufficiently many outliers corresponding to stray elements from another class or when the number of data points is larger than the number of features. We illustrate how nested subspace methods, utilizing flag manifolds, can help to deal with such additional confounding factors. Flag manifolds, which are parameter spaces for nested subspaces, are a natural geometric generalization of Grassmann manifolds. To make practical comparisons on a flag manifold, algorithms are proposed for determining the distances between points $[A], [B]$ on a flag manifold, where $A$ and $B$ are arbitrary orthogonal matrix representatives for $[A]$ and $[B]$, and for determining the initial direction of these minimal length geodesics. The approach is illustrated in the context of (hyper) spectral imagery showing the impact of ambient dimension, sample dimension, and flag structure.
Submitted 24 June, 2020;
originally announced June 2020.
-
An Open-Source, Industrial-Strength Optimizing Compiler for Quantum Programs
Authors:
Robert S. Smith,
Eric C. Peterson,
Mark G. Skilbeck,
Erik J. Davis
Abstract:
Quilc is an open-source, optimizing compiler for gate-based quantum programs written in Quil or QASM, two popular quantum programming languages. The compiler was designed with attention toward NISQ-era quantum computers, specifically recognizing that each quantum gate has a non-negligible and often irrecoverable cost toward a program's successful execution. Quilc's primary goal is to make authoring quantum software a simpler exercise by making architectural details less burdensome to the author. Using Quilc allows one to write programs faster while usually not compromising---and indeed sometimes improving---their execution fidelity on a given hardware architecture. In this paper, we describe many of the principles behind Quilc's design, and demonstrate the compiler with various examples.
Submitted 31 March, 2020;
originally announced March 2020.
-
A quantum-classical cloud platform optimized for variational hybrid algorithms
Authors:
Peter J. Karalekas,
Nikolas A. Tezak,
Eric C. Peterson,
Colm A. Ryan,
Marcus P. da Silva,
Robert S. Smith
Abstract:
In order to support near-term applications of quantum computing, a new compute paradigm has emerged--the quantum-classical cloud--in which quantum computers (QPUs) work in tandem with classical computers (CPUs) via a shared cloud infrastructure. In this work, we enumerate the architectural requirements of a quantum-classical cloud platform, and present a framework for benchmarking its runtime performance. In addition, we walk through two platform-level enhancements, parametric compilation and active qubit reset, that specifically optimize a quantum-classical architecture to support variational hybrid algorithms (VHAs), the most promising applications of near-term quantum hardware. Finally, we show that integrating these two features into the Rigetti Quantum Cloud Services (QCS) platform results in considerable improvements to the latencies that govern algorithm runtime.
Submitted 30 May, 2020; v1 submitted 13 January, 2020;
originally announced January 2020.
-
Scaling up Psychology via Scientific Regret Minimization: A Case Study in Moral Decisions
Authors:
Mayank Agrawal,
Joshua C. Peterson,
Thomas L. Griffiths
Abstract:
Do large datasets provide value to psychologists? Without a systematic methodology for working with such datasets, there is a valid concern that analyses will produce noise artifacts rather than true effects. In this paper, we offer a way to enable researchers to systematically build models and identify novel phenomena in large datasets. One traditional approach is to analyze the residuals of models---the biggest errors they make in predicting the data---to discover what might be missing from those models. However, once a dataset is sufficiently large, machine learning algorithms approximate the true underlying function better than the data, suggesting instead that the predictions of these data-driven models should be used to guide model-building. We call this approach "Scientific Regret Minimization" (SRM) as it focuses on minimizing errors for cases that we know should have been predictable. We demonstrate this methodology on a subset of the Moral Machine dataset, a public collection of roughly forty million moral decisions. Using SRM, we found that incorporating a set of deontological principles that capture dimensions along which groups of agents can vary (e.g. sex and age) improves a computational model of human moral judgment. Furthermore, we were able to identify and independently validate three interesting moral phenomena: criminal dehumanization, age of responsibility, and asymmetric notions of responsibility.
Submitted 8 January, 2020; v1 submitted 16 October, 2019;
originally announced October 2019.
-
Human uncertainty makes classification more robust
Authors:
Joshua C. Peterson,
Ruairidh M. Battleday,
Thomas L. Griffiths,
Olga Russakovsky
Abstract:
The classification performance of deep neural networks has begun to asymptote at near-perfect levels. However, their ability to generalize outside the training set and their robustness to adversarial attacks have not. In this paper, we make progress on this problem by training with full label distributions that reflect human perceptual uncertainty. We first present a new benchmark dataset which we call CIFAR10H, containing a full distribution of human labels for each image of the CIFAR10 test set. We then show that, while contemporary classifiers fail to exhibit human-like uncertainty on their own, explicit training on our dataset closes this gap, supports improved generalization to increasingly out-of-training-distribution test datasets, and confers robustness to adversarial attacks.
Submitted 19 August, 2019;
originally announced August 2019.
-
More chemical detection through less sampling: amplifying chemical signals in hyperspectral data cubes through compressive sensing
Authors:
Henry Kvinge,
Elin Farnell,
Julia R. Dupuis,
Michael Kirby,
Chris Peterson,
Elizabeth C. Schundler
Abstract:
Compressive sensing (CS) is a method of sampling which permits some classes of signals to be reconstructed with high accuracy even when they were under-sampled. In this paper we explore a phenomenon in which bandwise CS sampling of a hyperspectral data cube followed by reconstruction can actually result in amplification of chemical signals contained in the cube. Perhaps most surprisingly, chemical signal amplification generally seems to increase as the level of sampling decreases. In some examples, the chemical signal is significantly stronger in a data cube reconstructed from 10% CS sampling than it is in the raw, 100% sampled data cube. We explore this phenomenon in two real-world datasets including the Physical Sciences Inc. Fabry-Pérot interferometer sensor multispectral dataset and the Johns Hopkins Applied Physics Lab FTIR-based longwave infrared sensor hyperspectral dataset. Each of these datasets contains the release of a chemical simulant, such as glacial acetic acid, triethyl phosphate, and sulfur hexafluoride, and in all cases we use the adaptive coherence estimator (ACE) to detect a target signal in the hyperspectral data cube. We end the paper by suggesting some theoretical justifications for why chemical signals would be amplified in CS sampled and reconstructed hyperspectral data cubes and discuss some practical implications.
Submitted 27 June, 2019;
originally announced June 2019.
-
A data-driven approach to sampling matrix selection for compressive sensing
Authors:
Elin Farnell,
Henry Kvinge,
John P. Dixon,
Julia R. Dupuis,
Michael Kirby,
Chris Peterson,
Elizabeth C. Schundler,
Christian W. Smith
Abstract:
Sampling is a fundamental aspect of any implementation of compressive sensing. Typically, the choice of sampling method is guided by the reconstruction basis. However, this approach can be problematic with respect to certain hardware constraints and is not responsive to domain-specific context. We propose a method for defining an order for a sampling basis that is optimal with respect to capturing variance in data, thus allowing for meaningful sensing at any desired level of compression. We focus on the Walsh-Hadamard sampling basis for its relevance to hardware constraints, but our approach applies to any sampling basis of interest. We illustrate the effectiveness of our method on the Physical Sciences Inc. Fabry-Pérot interferometer sensor multispectral dataset, the Johns Hopkins Applied Physics Lab FTIR-based longwave infrared sensor hyperspectral dataset, and a Colorado State University Swiss Ranger depth image dataset. The spectral datasets consist of simulant experiments, including releases of chemicals such as GAA and SF6. We combine our sampling and reconstruction with the adaptive coherence estimator (ACE) and bulk coherence for chemical detection and we incorporate an algorithmic threshold for ACE values to determine the presence or absence of a chemical. We compare results across sampling methods in this context. We have successful chemical detection at a compression rate of 90%. For all three datasets, we compare our sampling approach to standard orderings of sampling basis such as random, sequency, and an analog of sequency that we term `frequency.' In one instance, the peak signal to noise ratio was improved by over 30% across a test set of depth images.
Submitted 20 June, 2019;
originally announced June 2019.
-
Read-Uncommitted Transactions for Smart Contract Performance
Authors:
Victor Cook,
Zachary Painter,
Christina Peterson,
Damian Dechev
Abstract:
Smart contract transactions demonstrate issues of performance and correctness that application programmers must work around. Although the blockchain consensus mechanism approaches ACID compliance, use cases that rely on frequent state changes are impractical due to the block publishing interval of $O(10^1)$ seconds. The effective isolation level is Read-Committed, only revealing state transitions at the end of the block interval. Values read may be stale and not match program order, causing many transactions to fail when a block is committed. This paper perceives the blockchain as a transactional data structure, using this analogy in the development of a new algorithm, Hash-Mark-Set (HMS), that improves transaction throughput by providing a Read-Uncommitted view of state variables. HMS creates a directed acyclic graph (DAG) from the pending transaction pool. The transaction order derived from the DAG is used to provide a Read-Uncommitted view of the data for new transactions, which enter the DAG as they are received. An implementation of HMS is provided, interoperable with Ethereum and ready for use in smart contracts. Over a wide range of transaction mixes, HMS is demonstrated to improve throughput. A side product of the implementation is a new technique, Runtime Argument Augmentation (RAA), that allows smart contracts to communicate with external data services before submitting a transaction. RAA has use cases beyond HMS and can serve as a lightweight replacement for blockchain oracles.
Submitted 29 May, 2019;
originally announced May 2019.
-
Cognitive Model Priors for Predicting Human Decisions
Authors:
David D. Bourgin,
Joshua C. Peterson,
Daniel Reichman,
Thomas L. Griffiths,
Stuart J. Russell
Abstract:
Human decision-making underlies all economic behavior. For the past four decades, human decision-making under uncertainty has continued to be explained by theoretical models based on prospect theory, a framework that was awarded the Nobel Prize in Economic Sciences. However, theoretical models of this kind have developed slowly, and robust, high-precision predictive models of human decisions remain a challenge. While machine learning is a natural candidate for solving these problems, it is currently unclear to what extent it can improve predictions obtained by current theories. We argue that this is mainly due to data scarcity, since noisy human behavior requires massive sample sizes to be accurately captured by off-the-shelf machine learning methods. To solve this problem, what is needed are machine learning models with appropriate inductive biases for capturing human behavior, and larger datasets. We offer two contributions towards this end: first, we construct "cognitive model priors" by pretraining neural networks with synthetic data generated by cognitive models (i.e., theoretical models developed by cognitive psychologists). We find that fine-tuning these networks on small datasets of real human decisions results in unprecedented state-of-the-art improvements on two benchmark datasets. Second, we present the first large-scale dataset for human decision-making, containing over 240,000 human judgments across over 13,000 decision problems. This dataset reveals the circumstances where cognitive model priors are useful, and provides a new standard for benchmarking prediction of human decisions under uncertainty.
Submitted 22 May, 2019;
originally announced May 2019.
-
Quantifiability: Concurrent Correctness from First Principles
Authors:
Victor Cook,
Christina Peterson,
Zachary Painter,
Damian Dechev
Abstract:
Architectural imperatives due to the slowing of Moore's Law, the broad acceptance of relaxed semantics and the O(n!) worst case verification complexity of generating sequential histories motivate a new approach to concurrent correctness. Desiderata for a new correctness condition are that it be independent of sequential histories, compositional, flexible as to timing, modular as to semantics and free of inherent locking or waiting. We propose Quantifiability, a novel correctness condition based on intuitive first principles. Quantifiability models a system in vector space to launch a new mathematical analysis of concurrency. The vector space model is suitable for a wide range of concurrent systems and their associated data structures. This paper formally defines quantifiability and demonstrates useful properties such as compositionality. Analysis is facilitated with linear algebra, better supported and of much more efficient time complexity than traditional combinatorial methods. We present results showing that quantifiable data structures are highly scalable due to the usage of relaxed semantics and propose entropy to evaluate the implementation trade-offs permitted by quantifiability.
Submitted 16 July, 2019; v1 submitted 15 May, 2019;
originally announced May 2019.
-
Capturing human categorization of natural images at scale by combining deep networks and cognitive models
Authors:
Ruairidh M. Battleday,
Joshua C. Peterson,
Thomas L. Griffiths
Abstract:
Human categorization is one of the most important and successful targets of cognitive modeling in psychology, yet decades of development and assessment of competing models have been contingent on small sets of simple, artificial experimental stimuli. Here we extend this modeling paradigm to the domain of natural images, revealing the crucial role that stimulus representation plays in categorization and its implications for conclusions about how people form categories. Applying psychological models of categorization to natural images required two significant advances. First, we conducted the first large-scale experimental study of human categorization, involving over 500,000 human categorization judgments of 10,000 natural images from ten non-overlapping object categories. Second, we addressed the traditional bottleneck of representing high-dimensional images in cognitive models by exploring the best of current supervised and unsupervised deep and shallow machine learning methods. We find that selecting sufficiently expressive, data-driven representations is crucial to capturing human categorization, and using these representations allows simple models that represent categories with abstract prototypes to outperform the more complex memory-based exemplar accounts of categorization that have dominated in studies using less naturalistic stimuli.
Submitted 26 April, 2019;
originally announced April 2019.
-
Predicting human decisions with behavioral theories and machine learning
Authors:
Ori Plonsky,
Reut Apel,
Eyal Ert,
Moshe Tennenholtz,
David Bourgin,
Joshua C. Peterson,
Daniel Reichman,
Thomas L. Griffiths,
Stuart J. Russell,
Evan C. Carter,
James F. Cavanagh,
Ido Erev
Abstract:
Predicting human decision-making under risk and uncertainty represents a quintessential challenge that spans economics, psychology, and related disciplines. Despite decades of research effort, no model can be said to accurately describe and predict human choice even for the most stylized tasks like choice between lotteries. Here, we introduce BEAST Gradient Boosting (BEAST-GB), a novel hybrid model that synergizes behavioral theories, specifically the model BEAST, with machine learning techniques. First, we show the effectiveness of BEAST-GB by describing CPC18, an open competition for prediction of human decision making under risk and uncertainty, in which BEAST-GB won. Second, we show that it achieves state-of-the-art performance on the largest publicly available dataset of human risky choice, outperforming purely data-driven neural networks, indicating the continued relevance of BEAST theoretical insights in the presence of large data. Third, we demonstrate BEAST-GB's superior predictive power in an ensemble of choice experiments in which the BEAST model alone falters, underscoring the indispensable role of machine learning in interpreting complex idiosyncratic behavioral data. Finally, we show BEAST-GB also displays robust domain generalization capabilities as it effectively predicts choice behavior in new experimental contexts that it was not trained on. These results confirm the potency of combining domain-specific theoretical frameworks with machine learning, underscoring a methodological advance with broad implications for modeling decisions in diverse environments.
Submitted 18 April, 2024; v1 submitted 15 April, 2019;
originally announced April 2019.
-
Analysis of Commutativity with State-Chart Graph Representation of Concurrent Programs
Authors:
Kishore Debnath,
Christina Peterson,
Damian Dechev
Abstract:
We present a new approach to check for commutativity in concurrent programs from their state-chart graphs. A set of operations are commutative if changing the order of their execution on an object does not affect the abstract state of the object and returns the same response. Concurrent operations that commute at object-level can be executed concurrently at transaction-level, which boosts performance while preserving the appearance of atomicity and isolation. Utilizing object-level commutativity in transactional execution enables the reuse of existing non-blocking programming techniques for thread-level synchronization. In our approach, we generate state-chart graphs by tracking data on the atomic instructions invoked on the concurrent object during model checking and represent the atomic instructions as states in a state-transition representation. Considering the non-deterministic nature of concurrent programs, we determine commutativity by exhaustively searching for identical object states captured at a thread-level granularity across all thread interleavings. With this methodology, a user can not only verify commutativity among operations, but also can visually check ways in which methods commute at object-level, which is an edge over current state-of-the-art tools. The object-level commutative information helps in identifying faulty implementations and performance improvement considerations. We use the graph database, Neo4j, to represent object states as nodes that further assists the user to check for concurrency properties using Cypher queries.
Submitted 8 April, 2019;
originally announced April 2019.
-
Lock-Free Transactional Adjacency List
Authors:
Zachary Painter,
Christina Peterson,
Damian Dechev
Abstract:
Adjacency lists are frequently used in graphing or map based applications. Although efficient concurrent linked-list algorithms are well known, it can be difficult to adapt these approaches to build a high-performance adjacency list. Furthermore, it can often be desirable to execute operations in these data structures transactionally, or perform a sequence of operations in one atomic step. In this paper, we present a lock-free transactional adjacency list based on a multi-dimensional list (MDList). We are able to combine known linked list strategies with the capability of the MDList in order to efficiently organize graph vertexes and their edges. We design our underlying data structure to be node-based and linearizable, then use the Lock-Free Transactional Transformation (LFTT) methodology to efficiently enable transactional execution. In our performance evaluation, our lock-free transactional adjacency list achieves an average of 50% speedup over a transactional boosting implementation.
Submitted 24 March, 2019;
originally announced March 2019.
-
Using Machine Learning to Guide Cognitive Modeling: A Case Study in Moral Reasoning
Authors:
Mayank Agrawal,
Joshua C. Peterson,
Thomas L. Griffiths
Abstract:
Large-scale behavioral datasets enable researchers to use complex machine learning algorithms to better predict human behavior, yet this increased predictive power does not always lead to a better understanding of the behavior in question. In this paper, we outline a data-driven, iterative procedure that allows cognitive scientists to use machine learning to generate models that are both interpretable and accurate. We demonstrate this method in the domain of moral decision-making, where standard experimental approaches often identify relevant principles that influence human judgments, but fail to generalize these findings to "real world" situations that place these principles in conflict. The recently released Moral Machine dataset allows us to build a powerful model that can predict the outcomes of these conflicts while remaining simple enough to explain the basis behind human decisions.
Submitted 10 May, 2019; v1 submitted 18 February, 2019;
originally announced February 2019.
-
Monitoring the shape of weather, soundscapes, and dynamical systems: a new statistic for dimension-driven data analysis on large data sets
Authors:
Henry Kvinge,
Elin Farnell,
Michael Kirby,
Chris Peterson
Abstract:
Dimensionality-reduction methods are a fundamental tool in the analysis of large data sets. These algorithms work on the assumption that the "intrinsic dimension" of the data is generally much smaller than the ambient dimension in which it is collected. Alongside their usual purpose of mapping data into a smaller dimension with minimal information loss, dimensionality-reduction techniques implicitly or explicitly provide information about the dimension of the data set.
In this paper, we propose a new statistic that we call the $κ$-profile for analysis of large data sets. The $κ$-profile arises from a dimensionality-reduction optimization problem: namely that of finding a projection into $k$ dimensions that optimally preserves the secants between points in the data set. From this optimal projection we extract $κ$, the norm of the shortest projected secant from among the set of all normalized secants. This $κ$ can be computed for any $k$; thus the tuple of $κ$ values (indexed by dimension) becomes a $κ$-profile. Algorithms such as the Secant-Avoidance Projection algorithm and the Hierarchical Secant-Avoidance Projection algorithm provide a computationally feasible means of estimating the $κ$-profile for large data sets, and thus a method of understanding and monitoring their behavior. As we demonstrate in this paper, the $κ$-profile serves as a useful statistic in several representative settings: weather data, soundscape data, and dynamical systems data.
Submitted 26 October, 2018;
originally announced October 2018.
-
Too many secants: a hierarchical approach to secant-based dimensionality reduction on large data sets
Authors:
Henry Kvinge,
Elin Farnell,
Michael Kirby,
Chris Peterson
Abstract:
A fundamental question in many data analysis settings is the problem of discerning the "natural" dimension of a data set. That is, when a data set is drawn from a manifold (possibly with noise), a meaningful aspect of the data is the dimension of that manifold. Various approaches exist for estimating this dimension, such as the method of Secant-Avoidance Projection (SAP). Intuitively, the SAP algorithm seeks to determine a projection which best preserves the lengths of all secants between points in a data set; by applying the algorithm to find the best projections to vector spaces of various dimensions, one may infer the dimension of the manifold of origination. That is, one may learn the dimension at which it is possible to construct a diffeomorphic copy of the data in a lower-dimensional Euclidean space. Using Whitney's embedding theorem, we can relate this information to the natural dimension of the data. A drawback of the SAP algorithm is that a data set with $T$ points has $O(T^2)$ secants, making the computation and storage of all secants infeasible for very large data sets. In this paper, we propose a novel algorithm that generalizes the SAP algorithm with an emphasis on addressing this issue. That is, we propose a hierarchical secant-based dimensionality-reduction method, which can be employed for data sets where explicitly calculating all secants is not feasible.
Submitted 5 August, 2018;
originally announced August 2018.
-
A GPU-Oriented Algorithm Design for Secant-Based Dimensionality Reduction
Authors:
Henry Kvinge,
Elin Farnell,
Michael Kirby,
Chris Peterson
Abstract:
Dimensionality-reduction techniques are a fundamental tool for extracting useful information from high-dimensional data sets. Because secant sets encode manifold geometry, they are a useful tool for designing meaningful data-reduction algorithms. In one such approach, the goal is to construct a projection that maximally avoids secant directions and hence ensures that distinct data points are not mapped too close together in the reduced space. This type of algorithm is based on a mathematical framework inspired by the constructive proof of Whitney's embedding theorem from differential topology. Computing all (unit) secants for a set of points is by nature computationally expensive, thus opening the door for exploitation of GPU architecture for achieving fast versions of these algorithms. We present a polynomial-time data-reduction algorithm that produces a meaningful low-dimensional representation of a data set by iteratively constructing improved projections within the framework described above. Key to our algorithm design and implementation is the use of GPUs which, among other things, minimizes the computational time required for the calculation of all secant lines. One goal of this report is to share ideas with GPU experts and to discuss a class of mathematical algorithms that may be of interest to the broader GPU community.
Submitted 9 July, 2018;
originally announced July 2018.
-
Endmember Extraction on the Grassmannian
Authors:
Elin Farnell,
Henry Kvinge,
Michael Kirby,
Chris Peterson
Abstract:
Endmember extraction plays a prominent role in a variety of data analysis problems as endmembers often correspond to data representing the purest or best representative of some feature. Identifying endmembers then can be useful for further identification and classification tasks. In settings with high-dimensional data, such as hyperspectral imagery, it can be useful to consider endmembers that are subspaces as they are capable of capturing a wider range of variations of a signature. The endmember extraction problem in this setting thus translates to finding the vertices of the convex hull of a set of points on a Grassmannian. In the presence of noise, it can be less clear whether a point should be considered a vertex. In this paper, we propose an algorithm to extract endmembers on a Grassmannian, identify subspaces of interest that lie near the boundary of a convex hull, and demonstrate the use of the algorithm on a synthetic example and on the 220 spectral band AVIRIS Indian Pines hyperspectral image.
Submitted 3 July, 2018;
originally announced July 2018.
-
Learning a face space for experiments on human identity
Authors:
Jordan W. Suchow,
Joshua C. Peterson,
Thomas L. Griffiths
Abstract:
Generative models of human identity and appearance have broad applicability to behavioral science and technology, but the exquisite sensitivity of human face perception means that their utility hinges on the alignment of the model's representation to human psychological representations and the photorealism of the generated images. Meeting these requirements is an exacting task, and existing models of human identity and appearance are often unworkably abstract, artificial, uncanny, or biased. Here, we use a variational autoencoder with an autoregressive decoder to learn a face space from a uniquely diverse dataset of portraits that control much of the variation irrelevant to human identity and appearance. Our method generates photorealistic portraits of fictive identities with a smooth, navigable latent space. We validate our model's alignment with human sensitivities by introducing a psychophysical Turing test for images, which humans mostly fail. Lastly, we demonstrate an initial application of our model to the problem of fast search in mental space to obtain detailed "police sketches" in a small number of trials.
Submitted 19 May, 2018;
originally announced May 2018.
-
Learning Hierarchical Visual Representations in Deep Neural Networks Using Hierarchical Linguistic Labels
Authors:
Joshua C. Peterson,
Paul Soulos,
Aida Nematzadeh,
Thomas L. Griffiths
Abstract:
Modern convolutional neural networks (CNNs) are able to achieve human-level object classification accuracy on specific tasks, and currently outperform competing models in explaining complex human visual representations. However, the categorization problem is posed differently for these networks than for humans: the accuracy of these networks is evaluated by their ability to identify single labels assigned to each image. These labels often cut arbitrarily across natural psychological taxonomies (e.g., dogs are separated into breeds, but never jointly categorized as "dogs"), and bias the resulting representations. By contrast, it is common for children to hear both "dog" and "Dalmatian" to describe the same stimulus, helping to group perceptually disparate objects (e.g., breeds) into a common mental class. In this work, we train CNN classifiers with multiple labels for each image that correspond to different levels of abstraction, and use this framework to reproduce classic patterns that appear in human generalization behavior.
Submitted 19 May, 2018;
originally announced May 2018.
-
Capturing human category representations by sampling in deep feature spaces
Authors:
Joshua C. Peterson,
Jordan W. Suchow,
Krisha Aghi,
Alexander Y. Ku,
Thomas L. Griffiths
Abstract:
Understanding how people represent categories is a core problem in cognitive science. Decades of research have yielded a variety of formal theories of categories, but validating them with naturalistic stimuli is difficult. The challenge is that human category representations cannot be directly observed and running informative experiments with naturalistic stimuli such as images requires a workable representation of these stimuli. Deep neural networks have recently been successful in solving a range of computer vision tasks and provide a way to compactly represent image features. Here, we introduce a method to estimate the structure of human categories that combines ideas from cognitive science and machine learning, blending human-based algorithms with state-of-the-art deep image generators. We provide qualitative and quantitative results as a proof-of-concept for the method's feasibility. Samples drawn from human distributions rival those from state-of-the-art generative models in quality and outperform alternative methods for estimating the structure of human categories.
Submitted 19 May, 2018;
originally announced May 2018.
-
Modeling Human Categorization of Natural Images Using Deep Feature Representations
Authors:
Ruairidh M. Battleday,
Joshua C. Peterson,
Thomas L. Griffiths
Abstract:
Over the last few decades, psychologists have developed sophisticated formal models of human categorization using simple artificial stimuli. In this paper, we use modern machine learning methods to extend this work into the realm of naturalistic stimuli, enabling human categorization to be studied over the complex visual domain in which it evolved and developed. We show that representations derived from a convolutional neural network can be used to model behavior over a database of >300,000 human natural image classifications, and find that a group of models based on these representations perform well, near the reliability of human judgments. Interestingly, this group includes both exemplar and prototype models, contrasting with the dominance of exemplar models in previous work. We are able to improve the performance of the remaining models by preprocessing neural network representations to more closely capture human similarity judgments.
Submitted 13 November, 2017;
originally announced November 2017.
-
Evaluating (and improving) the correspondence between deep neural networks and human representations
Authors:
Joshua C. Peterson,
Joshua T. Abbott,
Thomas L. Griffiths
Abstract:
Decades of psychological research have been aimed at modeling how people learn features and categories. The empirical validation of these theories is often based on artificial stimuli with simple representations. Recently, deep neural networks have reached or surpassed human accuracy on tasks such as identifying objects in natural images. These networks learn representations of real-world stimuli that can potentially be leveraged to capture psychological representations. We find that state-of-the-art object classification networks provide surprisingly accurate predictions of human similarity judgments for natural images, but fail to capture some of the structure represented by people. We show that a simple transformation that corrects these discrepancies can be obtained through convex optimization. We use the resulting representations to predict the difficulty of learning novel categories of natural images. Our results extend the scope of psychological experiments and computational modeling by enabling tractable use of large natural stimulus sets.
Submitted 23 July, 2018; v1 submitted 7 June, 2017;
originally announced June 2017.
-
Evaluating vector-space models of analogy
Authors:
Dawn Chen,
Joshua C. Peterson,
Thomas L. Griffiths
Abstract:
Vector-space representations provide geometric tools for reasoning about the similarity of a set of objects and their relationships. Recent machine learning methods for deriving vector-space embeddings of words (e.g., word2vec) have achieved considerable success in natural language processing. These vector spaces have also been shown to exhibit a surprising capacity to capture verbal analogies, with similar results for natural images, giving new life to a classic model of analogies as parallelograms that was first proposed by cognitive scientists. We evaluate the parallelogram model of analogy as applied to modern word embeddings, providing a detailed analysis of the extent to which this approach captures human relational similarity judgments in a large benchmark dataset. We find that some semantic relationships are better captured than others. We then provide evidence for deeper limitations of the parallelogram model based on the intrinsic geometric constraints of vector spaces, paralleling classic results for first-order similarity.
Submitted 8 June, 2017; v1 submitted 11 May, 2017;
originally announced May 2017.
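Note: the parallelogram model named in the abstract above treats an analogy a : b :: c : d as holding when the difference vectors b - a and d - c are nearly parallel, so d can be estimated as c + b - a. The following is a minimal sketch of that idea only, not the authors' evaluation code; the two-dimensional embeddings and the function names `relational_similarity` and `complete_analogy` are hypothetical and invented for illustration.

```python
import math

# Hypothetical toy embeddings, invented for illustration (not real word2vec vectors).
EMBEDDINGS = {
    "man": [1.0, 0.0],
    "woman": [1.0, 1.0],
    "king": [3.0, 0.0],
    "queen": [3.0, 1.0],
    "apple": [0.0, 5.0],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v)


def relational_similarity(a, b, c, d, emb=EMBEDDINGS):
    """Compare the relation a->b with c->d via the cosine of their difference vectors."""
    diff_ab = [y - x for x, y in zip(emb[a], emb[b])]
    diff_cd = [y - x for x, y in zip(emb[c], emb[d])]
    return cosine(diff_ab, diff_cd)


def complete_analogy(a, b, c, emb=EMBEDDINGS):
    """Pick the word closest (by cosine) to c + b - a, excluding the three query words."""
    target = [zc + yb - xa for xa, yb, zc in zip(emb[a], emb[b], emb[c])]
    candidates = [w for w in emb if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(emb[w], target))


print(complete_analogy("man", "woman", "king"))
print(relational_similarity("man", "woman", "king", "queen"))
```

Excluding the query words from the candidate set is a common convention in analogy-completion sketches and is an assumption here, not something the abstract specifies.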
-
Evidence for the size principle in semantic and perceptual domains
Authors:
Joshua C. Peterson,
Thomas L. Griffiths
Abstract:
Shepard's Universal Law of Generalization offered a compelling case for the first physics-like law in cognitive science that should hold for all intelligent agents in the universe. Shepard's account is based on a rational Bayesian model of generalization, providing an answer to the question of why such a law should emerge. Extending this account to explain how humans use multiple examples to make better generalizations requires an additional assumption, called the size principle: hypotheses that pick out fewer objects should make a larger contribution to generalization. The degree to which this principle warrants similarly law-like status is far from conclusive. Typically, evaluating this principle has not been straightforward, requiring additional assumptions. We present a new method for evaluating the size principle that is more direct, and apply this method to a diverse array of datasets. Our results provide support for the broad applicability of the size principle.
Submitted 9 May, 2017;
originally announced May 2017.
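Note: in the standard Bayesian formulation, the size principle mentioned in the abstract above is usually written as a likelihood. The symbols below ($h$ for a hypothesis, $|h|$ for the number of objects it picks out, $x_1, \dots, x_n$ for the observed examples) are notation assumed here for illustration, not taken from the abstract. If examples are sampled independently and uniformly from the true hypothesis, then

$$
p(x_1, \dots, x_n \mid h) =
\begin{cases}
\dfrac{1}{|h|^{n}} & \text{if } x_1, \dots, x_n \in h, \\
0 & \text{otherwise.}
\end{cases}
$$

For instance, with $n = 3$ examples consistent with both a hypothesis of size $|h_1| = 2$ and one of size $|h_2| = 10$, the likelihoods are $1/8$ and $1/1000$, so the smaller hypothesis is favored by a factor of $125$.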
-
Stratifying High Dimensional Data Based on Proximity to the Convex Hull Boundary
Authors:
Lori Ziegelmeier,
Michael Kirby,
Chris Peterson
Abstract:
The convex hull of a set of points, $C$, serves to expose extremal properties of $C$ and can help identify elements in $C$ of high interest. For many problems, particularly in the presence of noise, the true vertex set (and facets) may be difficult to determine. One solution is to expand the list of high interest candidates to points lying near the boundary of the convex hull. We propose a quadratic program for the purpose of stratifying points in a data cloud based on proximity to the boundary of the convex hull. For each data point, a quadratic program is solved to determine an associated weight vector. We show that the weight vector encodes geometric information concerning the point's relationship to the boundary of the convex hull. The computation of the weight vectors can be carried out in parallel, and for a fixed number of points and fixed neighborhood size, the overall computational complexity of the algorithm grows linearly with dimension. As a consequence, meaningful computations can be completed on reasonably large, high dimensional data sets.
Submitted 4 November, 2016;
originally announced November 2016.
-
Modelling Student Behavior using Granular Large Scale Action Data from a MOOC
Authors:
Steven Tang,
Joshua C. Peterson,
Zachary A. Pardos
Abstract:
Digital learning environments generate a precise record of the actions learners take as they interact with learning materials and complete exercises towards comprehension. With this high quantity of sequential data comes the potential to apply time series models to learn about underlying behavioral patterns and trends that characterize successful learning based on the granular record of student actions. There exist several methods for looking at longitudinal, sequential data like those recorded from learning environments. In the field of language modelling, traditional n-gram techniques and modern recurrent neural network (RNN) approaches have been applied to algorithmically find structure in language and predict the next word given the previous words in the sentence or paragraph as input. In this paper, we draw an analogy to this work by treating student sequences of resource views and interactions in a MOOC as the inputs and predicting students' next interaction as outputs. In this study, we train only on students who received a certificate of completion. In doing so, the model could potentially be used for recommendation of sequences eventually leading to success, as opposed to perpetuating unproductive behavior. Given that the MOOC used in our study had over 3,500 unique resources, predicting the exact resource that a student will interact with next might appear to be a difficult classification problem. We find that simply following the syllabus (built-in structure of the course) gives on average 23% accuracy in making this prediction, followed by the n-gram method with 70.4%, and RNN based methods with 72.2%. This research lays the groundwork for recommendation in a MOOC and other digital learning environments where high volumes of sequential data exist.
Submitted 16 August, 2016;
originally announced August 2016.
-
Adapting Deep Network Features to Capture Psychological Representations
Authors:
Joshua C. Peterson,
Joshua T. Abbott,
Thomas L. Griffiths
Abstract:
Deep neural networks have become increasingly successful at solving classic perception problems such as object recognition, semantic segmentation, and scene understanding, often reaching or surpassing human-level accuracy. This success is due in part to the ability of DNNs to learn useful representations of high-dimensional inputs, a problem that humans must also solve. We examine the relationship between the representations learned by these networks and human psychological representations recovered from similarity judgments. We find that deep features learned in service of object classification account for a significant amount of the variance in human similarity judgments for a set of animal images. However, these features do not capture some qualitative distinctions that are a key part of human representations. To remedy this, we develop a method for adapting deep features to align with human similarity judgments, resulting in image representations that can potentially be used to extend the scope of psychological experiments.
Submitted 6 August, 2016;
originally announced August 2016.
-
Persistent Homology on Grassmann Manifolds for Analysis of Hyperspectral Movies
Authors:
Sofya Chepushtanova,
Michael Kirby,
Chris Peterson,
Lori Ziegelmeier
Abstract:
The existence of characteristic structure, or shape, in complex data sets has been recognized as increasingly important for mathematical data analysis. This realization has motivated the development of new tools such as persistent homology for exploring topological invariants, or features, in large data sets. In this paper we apply persistent homology to the characterization of gas plumes in time dependent sequences of hyperspectral cubes, i.e. the analysis of 4-way arrays. We investigate hyperspectral movies of Long-Wavelength Infrared data monitoring an experimental release of chemical simulant into the air. Our approach models regions of interest within the hyperspectral data cubes as points on the real Grassmann manifold $G(k, n)$ (whose points parameterize the $k$-dimensional subspaces of $\mathbb{R}^n$), contrasting our approach with the more standard framework in Euclidean space. An advantage of this approach is that it allows a sequence of time slices in a hyperspectral movie to be collapsed to a sequence of points in such a way that some of the key structure within and between the slices is encoded by the points on the Grassmann manifold. This motivates the search for topological features, associated with the evolution of the frames of a hyperspectral movie, within the corresponding points on the Grassmann manifold. The proposed mathematical model affords the processing of large data sets while retaining valuable discriminatory information. In this paper, we discuss how embedding our data in the Grassmann manifold, together with topological data analysis, captures dynamical events that occur as the chemical plume is released and evolves.
Submitted 11 July, 2016; v1 submitted 7 July, 2016;
originally announced July 2016.
-
Persistence Images: A Stable Vector Representation of Persistent Homology
Authors:
Henry Adams,
Sofya Chepushtanova,
Tegan Emerson,
Eric Hanson,
Michael Kirby,
Francis Motta,
Rachel Neville,
Chris Peterson,
Patrick Shipman,
Lori Ziegelmeier
Abstract:
Many datasets can be viewed as a noisy sampling of an underlying space, and tools from topological data analysis can characterize this structure for the purpose of knowledge discovery. One such tool is persistent homology, which provides a multiscale description of the homological features within a dataset. A useful representation of this homological information is a persistence diagram (PD). Efforts have been made to map PDs into spaces with additional structure valuable to machine learning tasks. We convert a PD to a finite-dimensional vector representation which we call a persistence image (PI), and prove the stability of this transformation with respect to small perturbations in the inputs. The discriminatory power of PIs is compared against existing methods, showing significant performance gains. We explore the use of PIs with vector-based machine learning tools, such as linear sparse support vector machines, which identify features containing discriminating topological information. Finally, high accuracy inference of parameter values from the dynamic output of a discrete dynamical system (the linked twist map) and a partial differential equation (the anisotropic Kuramoto-Sivashinsky equation) provide a novel application of the discriminatory power of PIs.
Submitted 11 July, 2016; v1 submitted 22 July, 2015;
originally announced July 2015.
-
Network-Centric Quantum Communications with Application to Critical Infrastructure Protection
Authors:
Richard J. Hughes,
Jane E. Nordholt,
Kevin P. McCabe,
Raymond T. Newell,
Charles G. Peterson,
Rolando D. Somma
Abstract:
Network-centric quantum communications (NQC) - a new, scalable instantiation of quantum cryptography providing key management with forward security for lightweight encryption, authentication and digital signatures in optical networks - is briefly described. Results from a multi-node experimental test-bed utilizing integrated photonics quantum communications components, known as QKarDs, include: quantum identification; verifiable quantum secret sharing; multi-party authenticated key establishment, including group keying; and single-fiber quantum-secured communications that can be applied as a security retrofit/upgrade to existing optical fiber installations. A demonstration that NQC meets the challenging simultaneous latency and security requirements of electric grid control communications, which cannot be met without compromises using conventional cryptography, is described.
Submitted 1 May, 2013;
originally announced May 2013.
-
Locally Linear Embedding Clustering Algorithm for Natural Imagery
Authors:
Lori Ziegelmeier,
Michael Kirby,
Chris Peterson
Abstract:
The ability to characterize the color content of natural imagery is an important application of image processing. The pixel by pixel coloring of images may be viewed naturally as points in color space, and the inherent structure and distribution of these points affords a quantization, through clustering, of the color information in the image. In this paper, we present a novel topologically driven clustering algorithm that permits segmentation of the color features in a digital image. The algorithm blends Locally Linear Embedding (LLE) and vector quantization by mapping color information to a lower dimensional space, identifying distinct color regions, and classifying pixels together based on both a proximity measure and color content. It is observed that these techniques permit a significant reduction in color resolution while maintaining the visually important features of images.
Submitted 20 February, 2012;
originally announced February 2012.
-
An Efficient Mean Field Approach to the Set Covering Problem
Authors:
Mattias Ohlsson,
Carsten Peterson,
Bo Söderberg
Abstract:
A mean field feedback artificial neural network algorithm is developed and explored for the set covering problem. A convenient encoding of the inequality constraints is achieved by means of a multilinear penalty function. An approximate energy minimum is obtained by iterating a set of mean field equations, in combination with annealing. The approach is numerically tested against a set of publicly available test problems with sizes ranging up to 5x10^3 rows and 10^6 columns. When comparing the performance with exact results for sizes where these are available, the approach yields results within a few percent from the optimal solutions. Comparisons with other approximate methods also come out well, in particular given the very low CPU consumption required -- typically a few seconds. Arbitrary problems can be processed using the algorithm via a public domain server.
Submitted 12 February, 1999;
originally announced February 1999.