-
Quantitative determination of twist angle and strain in Van der Waals moiré superlattices
Authors:
Steven J. Tran,
Jan-Lucas Uslu,
Mihir Pendharkar,
Joe Finney,
Aaron L. Sharpe,
Marisa Hocking,
Nathan J. Bittner,
Kenji Watanabe,
Takashi Taniguchi,
Marc A. Kastner,
Andrew J. Mannix,
David Goldhaber-Gordon
Abstract:
Scanning probe techniques are popular, non-destructive ways to visualize the real space structure of Van der Waals moirés. The high lateral spatial resolution provided by these techniques enables extracting the moiré lattice vectors from a scanning probe image. We have found that the extracted values, while precise, are not necessarily accurate. Scan-to-scan variations in the behavior of the piezos which drive the scanning probe, and thermally-driven slow relative drift between probe and sample, produce systematic errors in the extraction of lattice vectors. In this Letter, we identify the errors and provide a protocol to correct for them. Applying this protocol to an ensemble of ten successive scans of near-magic-angle twisted bilayer graphene, we are able to reduce our errors in extracting lattice vectors to less than 1%. This translates to extracting twist angles with a statistical uncertainty less than 0.001° and uniaxial heterostrain with uncertainty on the order of 0.002%.
Submitted 12 June, 2024;
originally announced June 2024.
-
Deterministic fabrication of graphene hexagonal boron nitride moiré superlattices
Authors:
Rupini V. Kamat,
Aaron L. Sharpe,
Mihir Pendharkar,
Jenny Hu,
Steven J. Tran,
Gregory Zaborski Jr.,
Marisa Hocking,
Joe Finney,
Kenji Watanabe,
Takashi Taniguchi,
Marc A. Kastner,
Andrew J. Mannix,
Tony Heinz,
David Goldhaber-Gordon
Abstract:
The electronic properties of moiré heterostructures depend sensitively on the relative orientation between layers of the stack. For example, near-magic-angle twisted bilayer graphene (TBG) commonly shows superconductivity, yet a TBG sample with one of the graphene layers rotationally aligned to a hexagonal Boron Nitride (hBN) cladding layer provided the first experimental observation of orbital ferromagnetism. To create samples with aligned graphene/hBN, researchers often align edges of exfoliated flakes that appear straight in optical micrographs. However, graphene or hBN can cleave along either zig-zag or armchair lattice directions, introducing a 30 degree ambiguity in the relative orientation of two flakes. By characterizing the crystal lattice orientation of exfoliated flakes prior to stacking using Raman and second-harmonic generation for graphene and hBN, respectively, we unambiguously align monolayer graphene to hBN at a near-0 degree, not 30 degree, relative twist angle. We confirm this alignment by torsional force microscopy (TFM) of the graphene/hBN moiré on an open-face stack, and then by cryogenic transport measurements, after full encapsulation with a second, non-aligned hBN layer. This work demonstrates a key step toward systematically exploring the effects of the relative twist angle between dissimilar materials within moiré heterostructures.
Submitted 28 May, 2024;
originally announced May 2024.
-
The Life and Legacy of Bui Tuong Phong
Authors:
Yoehan Oh,
Jacinda Tran,
Theodore Kim
Abstract:
We examine the life and legacy of pioneering Vietnamese American computer scientist Bùi Tuong Phong, whose shading and lighting models turned 50 last year. We trace the trajectory of his life through Vietnam, France, and the United States, and its intersections with global conflicts. Crucially, we present evidence that his name has been cited incorrectly over the last five decades. His family name appears to be Bùi, not Phong. By presenting these facts at SIGGRAPH, we hope to collect more information about his life, and ensure that his name is remembered correctly in the future.
Submitted 22 April, 2024;
originally announced April 2024.
-
Tuning diagonal scale matrices for HMC
Authors:
Jimmy Huy Tran,
Tore Selland Kleppe
Abstract:
Three approaches for adaptively tuning diagonal scale matrices for HMC are discussed and compared. The common practice of scaling according to estimated marginal standard deviations is taken as a benchmark. Scaling according to the mean log-target gradient (ISG), and a scaling method targeting that the frequency of when the underlying Hamiltonian dynamics crosses the respective medians should be uniform across dimensions, are taken as alternatives. Numerical studies suggest that the ISG method leads in many cases to more efficient sampling than the benchmark, in particular in cases with strong correlations or non-linear dependencies. The ISG method is also easy to implement, computationally cheap and would be relatively simple to include in automatically tuned codes as an alternative to the benchmark practice.
Submitted 12 March, 2024;
originally announced March 2024.
-
A note on the equality $π^2/6= \sum_{n\geq 1} 1/n^2$
Authors:
Alain Lasjaunias,
Jean-Paul Tran
Abstract:
This short note is a comment on a historical aspect of a famous formula dating from the 18th century.
Submitted 4 December, 2023;
originally announced December 2023.
-
Generating and Evaluating Tests for K-12 Students with Language Model Simulations: A Case Study on Sentence Reading Efficiency
Authors:
Eric Zelikman,
Wanjing Anya Ma,
Jasmine E. Tran,
Diyi Yang,
Jason D. Yeatman,
Nick Haber
Abstract:
Developing an educational test can be expensive and time-consuming, as each item must be written by experts and then evaluated by collecting hundreds of student responses. Moreover, many tests require multiple distinct sets of questions administered throughout the school year to closely monitor students' progress, known as parallel tests. In this study, we focus on tests of silent sentence reading efficiency, used to assess students' reading ability over time. To generate high-quality parallel tests, we propose to fine-tune large language models (LLMs) to simulate how previous students would have responded to unseen items. With these simulated responses, we can estimate each item's difficulty and ambiguity. We first use GPT-4 to generate new test items following a list of expert-developed rules and then apply a fine-tuned LLM to filter the items based on criteria from psychological measurements. We also propose an optimal-transport-inspired technique for generating parallel tests and show the generated tests closely correspond to the original test's difficulty and reliability based on crowdworker responses. Our evaluation of a generated test with 234 students from grades 2 to 8 produces test scores highly correlated (r=0.93) to those of a standard test form written by human experts and evaluated across thousands of K-12 students.
Submitted 10 October, 2023;
originally announced October 2023.
-
Deep Learning based Fast and Accurate Beamforming for Millimeter-Wave Systems
Authors:
Tarun S Cousik,
Vijay K Shah,
Jeffrey H. Reed,
Harry X Tran,
Rittwik Jana
Abstract:
The widespread proliferation of mmW devices has led to a surge of interest in antenna arrays. This interest in arrays is due to their ability to steer beams in desired directions, for the purpose of increasing signal-power and/or decreasing interference levels. To enable beamforming, array coefficients are typically stored in look-up tables (LUTs) for subsequent referencing. While LUTs enable fast sweep times, their limited memory size restricts the number of beams the array can produce. Consequently, a receiver is likely to be offset from the main beam, thus decreasing received power, and resulting in sub-optimal performance. In this letter, we present BeamShaper, a deep neural network (DNN) framework, which enables fast and accurate beamsteering in any desirable 3-D direction. Unlike traditional finite-memory LUTs which support a fixed set of beams, BeamShaper utilizes a trained NN model to generate the array coefficients for arbitrary directions in \textit{real-time}. Our simulations show that BeamShaper outperforms contemporary LUT based solutions in terms of cosine-similarity and central angle in time scales that are slightly higher than LUT based solutions. Additionally, we show that our DNN based approach has the added advantage of being more resilient to the effects of quantization noise generated while using digital phase-shifters.
Submitted 19 September, 2023;
originally announced September 2023.
-
Torsional Force Microscopy of Van der Waals Moirés and Atomic Lattices
Authors:
Mihir Pendharkar,
Steven J. Tran,
Gregory Zaborski Jr.,
Joe Finney,
Aaron L. Sharpe,
Rupini V. Kamat,
Sandesh S. Kalantre,
Marisa Hocking,
Nathan J. Bittner,
Kenji Watanabe,
Takashi Taniguchi,
Bede Pittenger,
Christina J. Newcomb,
Marc A. Kastner,
Andrew J. Mannix,
David Goldhaber-Gordon
Abstract:
In a stack of atomically-thin Van der Waals layers, introducing interlayer twist creates a moiré superlattice whose period is a function of twist angle. Changes in that twist angle of even hundredths of a degree can dramatically transform the system's electronic properties. Setting a precise and uniform twist angle for a stack remains difficult, hence determining that twist angle and mapping its spatial variation is very important. Techniques have emerged to do this by imaging the moiré, but most of these require sophisticated infrastructure, time-consuming sample preparation beyond stack synthesis, or both. In this work, we show that Torsional Force Microscopy (TFM), a scanning probe technique sensitive to dynamic friction, can reveal surface and shallow subsurface structure of Van der Waals stacks on multiple length scales: the moirés formed between bi-layers of graphene and between graphene and hexagonal boron nitride (hBN), and also the atomic crystal lattices of graphene and hBN. In TFM, torsional motion of an AFM cantilever is monitored as it is actively driven at a torsional resonance while a feedback loop maintains contact at a set force with the sample surface. TFM works at room temperature in air, with no need for an electrical bias between the tip and the sample, making it applicable to a wide array of samples. It should enable determination of precise structural information including twist angles and strain in moiré superlattices and crystallographic orientation of VdW flakes to support predictable moiré heterostructure fabrication.
Submitted 20 December, 2023; v1 submitted 17 August, 2023;
originally announced August 2023.
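The abstract above notes that the moiré period is a function of twist angle. A minimal sketch of that relation for an unstrained twisted homobilayer follows, assuming the standard formula λ = a / (2 sin(θ/2)) and a graphene lattice constant of 0.246 nm. Both the formula and the constant are brought in for illustration and do not come from this listing; the function names are hypothetical, and this is not the authors' analysis code.

```python
import math

A_GRAPHENE_NM = 0.246  # assumed graphene lattice constant, in nm


def moire_period_nm(twist_deg, a_nm=A_GRAPHENE_NM):
    """Moiré period of an unstrained twisted homobilayer: a / (2 sin(theta/2))."""
    theta = math.radians(twist_deg)
    return a_nm / (2.0 * math.sin(theta / 2.0))


def twist_angle_deg(period_nm, a_nm=A_GRAPHENE_NM):
    """Invert the relation above: theta = 2 asin(a / (2 lambda))."""
    theta = 2.0 * math.asin(a_nm / (2.0 * period_nm))
    return math.degrees(theta)


# Near the magic angle the period is on the order of ten nanometres.
period = moire_period_nm(1.1)
recovered = twist_angle_deg(period)
print(f"period = {period:.2f} nm, recovered twist = {recovered:.3f} deg")
```

Because the period varies steeply with angle at small twist, a small fractional error in the measured period maps onto a small absolute error in angle, which is why careful calibration of the scan matters. Heterostrain, which this sketch ignores, distorts the three moiré lattice vectors unequally.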
-
NoFADE: Analyzing Diminishing Returns on CO2 Investment
Authors:
Andre Fu,
Justin Tran,
Andy Xie,
Jonathan Spraggett,
Elisa Ding,
Chang-Won Lee,
Kanav Singla,
Mahdi S. Hosseini,
Konstantinos N. Plataniotis
Abstract:
Climate change continues to be a pressing issue that currently affects society at large. It is important that we as a society, including the Computer Vision (CV) community, take steps to limit our impact on the environment. In this paper, we (a) analyze the effect of diminishing returns on CV methods, and (b) propose \textit{``NoFADE''}, a novel entropy-based metric to quantify model--dataset--complexity relationships. We show that some CV tasks are reaching saturation, while others are almost fully saturated. In this light, NoFADE allows the CV community to compare models and datasets on a similar basis, establishing an agnostic platform.
Submitted 28 November, 2021;
originally announced November 2021.
-
A Novel Epidemiological Approach to Geographically Mapping Population Dry Eye Disease in the United States through Google Trends
Authors:
Daniel B. Azzam,
Nitish Nag,
Julia Tran,
Lauren Chen,
Kaajal Visnagra,
Kailey Marshall,
Matthew Wade
Abstract:
Dry eye disease (DED) affects approximately half of the United States population. DED is characterized by dryness on the cornea surface due to a variety of causes. This study fills the spatiotemporal gaps in DED epidemiology by using Google Trends as a novel epidemiological tool for geographically mapping DED in relation to environmental risk factors. We utilized Google Trends to extract DED-related queries estimating user intent from 2004-2019 in the United States. We incorporated national climate data to generate heat maps comparing geographic, temporal, and environmental relationships of DED. Multi-variable regression models were constructed to generate quadratic forecasts predicting DED and control searches. Our results illustrated the upward trend, seasonal pattern, environmental influence, and spatial relationship of DED search volume across US geography. Localized patches of DED interest were visualized along the coastline. There was no significant difference in DED queries across US census regions. Regression model 1 predicted DED searches over time (R^2=0.97) with significant predictors being control queries (p=0.0024), time (p=0.001), and seasonality (Winter p=0.0028; Spring p<0.001; Summer p=0.018). Regression model 2 predicted DED queries per state (R^2=0.49) with significant predictors being temperature (p=0.0003) and coastal zone (p=0.025). Importantly, temperature, coastal status, and seasonality were stronger risk factors of DED searches than humidity, sunshine, pollution, or region as clinical literature may suggest. Our work paves the way for future exploration of geographic information systems for locating DED and other diseases via online search query metrics.
Submitted 21 June, 2020;
originally announced June 2020.
-
Jupiter: A Networked Computing Architecture
Authors:
Pradipta Ghosh,
Quynh Nguyen,
Pranav K Sakulkar,
Aleksandra Knezevic,
Jason A. Tran,
Jiatong Wang,
Zhifeng Lin,
Bhaskar Krishnamachari,
Murali Annavaram,
Salman Avestimehr
Abstract:
In the era of Internet of Things, there is an increasing demand for networked computing to support the requirements of the time-constrained, compute-intensive distributed applications such as multi-camera video processing and data fusion for security. We present Jupiter, an open source networked computing system that inputs a Directed Acyclic Graph (DAG)-based computational task graph to efficiently distribute the tasks among a set of networked compute nodes regardless of their geographical separations and orchestrates the execution of the DAG thereafter. This Kubernetes container-orchestration-based system supports both centralized and decentralized scheduling algorithms for optimally mapping the tasks based on information from a range of profilers: network profilers, resource profilers, and execution time profilers. While centralized scheduling algorithms with global knowledge have been popular among the grid/cloud computing community, we argue that a distributed scheduling approach is better suited for networked computing due to lower communication and computation overhead in the face of network dynamics. To this end, we propose and implement a new class of distributed scheduling algorithms called WAVE on the Jupiter system. We present a set of real world experiments on two separate testbeds - one a world-wide network of 90 cloud computers across 8 cities and the other a cluster of 30 Raspberry pi nodes, over a simple networked computing application called Distributed Network Anomaly Detector (DNAD). We show that despite using more localized knowledge, a distributed WAVE greedy algorithm can achieve similar performance as a classical centralized scheduling algorithm called Heterogeneous Earliest Finish Time (HEFT), suitably enhanced for the Jupiter system.
Submitted 23 December, 2019;
originally announced December 2019.
-
Implementing Homomorphic Encryption Based Secure Feedback Control for Physical Systems
Authors:
Julian Tran,
Farhad Farokhi,
Michael Cantoni,
Iman Shames
Abstract:
This paper is about an encryption based approach to the secure implementation of feedback controllers for physical systems. Specifically, Paillier's homomorphic encryption is used to digitally implement a class of linear dynamic controllers, which includes the commonplace static gain and PID type feedback control laws as special cases. The developed implementation is amenable to Field Programmable Gate Array (FPGA) realization. Experimental results, including timing analysis and resource usage characteristics for different encryption key lengths, are presented for the realization of an inverted pendulum controller; as this is an unstable plant, the control is necessarily fast.
Submitted 27 March, 2019; v1 submitted 19 February, 2019;
originally announced February 2019.
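The entry above relies on the additive homomorphism of Paillier encryption: multiplying ciphertexts adds the plaintexts, and raising a ciphertext to an integer power scales the plaintext, which is enough to evaluate a static-gain law on encrypted measurements. A toy sketch of that property follows. The tiny primes, the generator choice g = n + 1, the non-negative integer gains, and all function names are illustrative assumptions; this is not the paper's FPGA implementation and it is not secure at these key sizes.

```python
import math
import random


def keygen(p=1009, q=1013):
    """Toy Paillier key pair with g = n + 1 (demo-sized primes, NOT secure)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    mu = pow(lam, -1, n)  # modular inverse of lam mod n (Python 3.8+)
    return n, (lam, mu)


def encrypt(n, m, rng=random):
    """c = (1 + n)^m * r^n mod n^2, with r a random unit modulo n."""
    n2 = n * n
    while True:
        r = rng.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    return (pow(n + 1, m, n2) * pow(r, n, n2)) % n2


def decrypt(n, priv, c):
    """m = L(c^lam mod n^2) * mu mod n, where L(x) = (x - 1) // n."""
    lam, mu = priv
    l_val = (pow(c, lam, n * n) - 1) // n
    return (l_val * mu) % n


def add(n, c1, c2):
    """Ciphertext of m1 + m2."""
    return (c1 * c2) % (n * n)


def scale(n, c, k):
    """Ciphertext of k * m for a plaintext integer gain k."""
    return pow(c, k, n * n)


# Static gain u = 3*x1 + 5*x2, evaluated without decrypting the measurements.
n, priv = keygen()
enc_x = [encrypt(n, 7), encrypt(n, 2)]
enc_u = add(n, scale(n, enc_x[0], 3), scale(n, enc_x[1], 5))
print(decrypt(n, priv, enc_u))
```

A practical controller would also need a fixed-point encoding for real-valued and negative gains and signals, which this sketch leaves out.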
-
Fast Reflective Optic-Based Rotational Anisotropy Nonlinear Harmonic Generation Spectrometer
Authors:
Baozhu Lu,
Jason D. Tran,
Darius H. Torchinsky
Abstract:
We present a novel Rotational Anisotropy Nonlinear Harmonic Generation (RA-NHG) apparatus based primarily upon reflective optics. The data acquisition scheme used here allows for fast accumulation of RA-NHG traces, mitigating low frequency noise from laser drift, while permitting real-time adjustment of acquired signals with significantly more data points per unit angle rotation of the optics than other RA-NHG setups. We discuss the design and construction of the optical and electronic components of the device and present example data taken on a GaAs test sample at a variety of wavelengths. The RA-second harmonic generation data for this sample show the expected four-fold rotational symmetry across a broad range of wavelengths, while those for RA-third harmonic generation exhibit evidence of cascaded nonlinear processes possible in acentric crystal structures.
Submitted 31 October, 2018;
originally announced November 2018.
-
Resonance-enhanced optical nonlinearity in the Weyl semimetal TaAs
Authors:
Shreyas Patankar,
Liang Wu,
Baozhu Lu,
Manita Rai,
Jason D. Tran,
T. Morimoto,
D. Parker,
Adolfo Grushin,
N. L. Nair,
J. G. Analytis,
J. E. Moore,
J. Orenstein,
Darius H. Torchinsky
Abstract:
While all media can exhibit first-order conductivity describing current linearly proportional to electric field, $E$, the second-order conductivity, $σ^{(2)}$, relating current to $E^2$, is nonzero only when inversion symmetry is broken. Second order nonlinear optical responses are powerful tools in basic research, as probes of symmetry breaking, and in optical technology as the basis for generating currents from far-infrared to X-ray wavelengths. The recent surge of interest in Weyl semimetals with acentric crystal structures has led to the discovery of a host of $σ^{(2)}$-related phenomena in this class of materials, such as polarization-selective conversion of light to dc current (photogalvanic effects) and the observation of giant second-harmonic generation (SHG) efficiency in TaAs at photon energy 1.5 eV. Here, we present measurements of the SHG spectrum of TaAs revealing that the response at 1.5 eV corresponds to the high-energy tail of a resonance at 0.7 eV, at which point the second harmonic conductivity is approximately 200 times larger than seen in the standard candle nonlinear crystal, GaAs. This remarkably large SHG response provokes the question of ultimate limits on $σ^{(2)}$, which we address by a new theorem relating frequency-integrated nonlinear response functions to the third cumulant (or "skewness") of the polarization distribution function in the ground state. This theorem provides considerable insight into the factors that lead to the largest possible second-order nonlinear response, specifically showing that the spectral weight is unbounded and potentially divergent when the possibility of next-neighbor hopping is included.
Submitted 20 April, 2018; v1 submitted 18 April, 2018;
originally announced April 2018.
-
ROMANO: A Novel Overlay Lightweight Communication Protocol for Unified Control and Sensing of a Network of Robots
Authors:
Pradipta Ghosh,
Jason A. Tran,
Daniel Dsouza,
Nora Ayanian,
Bhaskar Krishnamachari
Abstract:
We present the Robotic Overlay coMmunicAtioN prOtocol (ROMANO), a lightweight, application layer overlay communication protocol for a unified sensing and control abstraction of a network of heterogeneous robots mainly consisting of low power, low-compute-capable robots. ROMANO is built to work in conjunction with the well-known MQ Telemetry Transport for Sensor Nodes (MQTT-SN) protocol, a lightweight publish-subscribe communication protocol for the Internet of Things, and makes use of its concept of "topics" to designate the addition and deletion of communication endpoints by changing the subscriptions of topics at each device. We also develop a portable implementation of ROMANO for low power IEEE 802.15.4 (Zigbee) radios and deploy it on a small testbed of commercially available, low-power, and low-compute-capable robots called Pololu 3pi robots. Based on a thorough analysis of the protocol on the real testbed, as a measure of throughput, we demonstrate that ROMANO can guarantee more than a $99.5\%$ message delivery ratio for a message generation rate up to 200 messages per second. The single hop delays in ROMANO are as low as 20ms with linear dependency on the number of robots connected. These delay numbers concur with typical delays in 802.15.4 networks and suggest that ROMANO does not introduce additional delays. Lastly, we implement four different multi-robot applications to demonstrate the scalability, adaptability, ease of integration, and reliability of ROMANO.
Submitted 21 September, 2017;
originally announced September 2017.
-
ARREST: A RSSI Based Approach for Mobile Sensing and Tracking of a Moving Object
Authors:
Pradipta Ghosh,
Jason A. Tran,
Bhaskar Krishnamachari
Abstract:
We present Autonomous Rssi based RElative poSitioning and Tracking (ARREST), a new robotic sensing system for tracking and following a moving, RF-emitting object, which we refer to as the Leader, solely based on signal strength information. This kind of system can expand the horizon of autonomous mobile tracking and distributed robotics into many scenarios with limited visibility such as nighttime, dense forests, and cluttered environments. Our proposed tracking agent, which we refer to as the TrackBot, uses a single rotating, off-the-shelf, directional antenna, novel angle and relative speed estimation algorithms, and Kalman filtering to continually estimate the relative position of the Leader with decimeter level accuracy (which is comparable to a state-of-the-art multiple access point based RF-localization system) and the relative speed of the Leader with accuracy on the order of 1 m/s. The TrackBot feeds the relative position and speed estimates into a Linear Quadratic Gaussian (LQG) controller to generate a set of control outputs to control the orientation and the movement of the TrackBot. We perform an extensive set of real world experiments with a full-fledged prototype to demonstrate that the TrackBot is able to stay within 5m of the Leader with: (1) more than $99\%$ probability in line of sight scenarios, and (2) more than $70\%$ probability in no line of sight scenarios, when it moves 1.8X faster than the Leader. For ground truth estimation in real world experiments, we also developed an integrated TDoA based distance and angle estimation system with centimeter level localization accuracy in line of sight scenarios. While providing a first proof of concept, our work opens the door to future research aimed at further improvements of autonomous RF-based tracking.
Submitted 24 October, 2017; v1 submitted 18 July, 2017;
originally announced July 2017.
-
DSD: Dense-Sparse-Dense Training for Deep Neural Networks
Authors:
Song Han,
Jeff Pool,
Sharan Narang,
Huizi Mao,
Enhao Gong,
Shijian Tang,
Erich Elsen,
Peter Vajda,
Manohar Paluri,
John Tran,
Bryan Catanzaro,
William J. Dally
Abstract:
Modern deep neural networks have a large number of parameters, making them very hard to train. We propose DSD, a dense-sparse-dense training flow, for regularizing deep neural networks and achieving better optimization performance. In the first D (Dense) step, we train a dense network to learn connection weights and importance. In the S (Sparse) step, we regularize the network by pruning the unimportant connections with small weights and retraining the network given the sparsity constraint. In the final D (re-Dense) step, we increase the model capacity by removing the sparsity constraint, re-initialize the pruned parameters from zero and retrain the whole dense network. Experiments show that DSD training can improve the performance for a wide range of CNNs, RNNs and LSTMs on the tasks of image classification, caption generation and speech recognition. On ImageNet, DSD improved the Top1 accuracy of GoogLeNet by 1.1%, VGG-16 by 4.3%, ResNet-18 by 1.2% and ResNet-50 by 1.1%, respectively. On the WSJ'93 dataset, DSD improved DeepSpeech and DeepSpeech2 WER by 2.0% and 1.1%. On the Flickr-8K dataset, DSD improved the NeuralTalk BLEU score by over 1.7. DSD is easy to use in practice: at training time, DSD incurs only one extra hyper-parameter: the sparsity ratio in the S step. At testing time, DSD doesn't change the network architecture or incur any inference overhead. The consistent and significant performance gain of DSD experiments shows the inadequacy of the current training methods for finding the best local optimum, while DSD effectively achieves superior optimization performance for finding a better solution. DSD models are available to download at https://songhan.github.io/DSD.
Submitted 21 February, 2017; v1 submitted 15 July, 2016;
originally announced July 2016.
-
A virtual instrument to standardise the calibration of atomic force microscope cantilevers
Authors:
John E. Sader,
Riccardo Borgani,
Christopher T. Gibson,
David B. Haviland,
Michael J. Higgins,
Jason I. Kilpatrick,
Jianing Lu,
Paul Mulvaney,
Cameron J. Shearer,
Ashley D. Slattery,
Per-Anders Thorén,
Jim Tran,
Heyou Zhang,
Hongrui Zhang,
Tian Zheng
Abstract:
Atomic force microscope (AFM) users often calibrate the spring constants of cantilevers using functionality built into individual instruments. This is performed without reference to a global standard, which hinders robust comparison of force measurements reported by different laboratories. In this article, we describe a virtual instrument (an internet-based initiative) whereby users from all laboratories can instantly and quantitatively compare their calibration measurements to those of others - standardising AFM force measurements - and simultaneously enabling non-invasive calibration of AFM cantilevers of any geometry. This global calibration initiative requires no additional instrumentation or data processing on the part of the user. It utilises a single website where users upload currently available data. A proof-of-principle demonstration of this initiative is presented using measured data from five independent laboratories across three countries, which also allows for an assessment of current calibration.
Submitted 25 May, 2016;
originally announced May 2016.
-
Core Course Analysis for Undergraduate Students in Mathematics
Authors:
Ritvik Kharkar,
Jessica Tran,
Charles Z. Marshak
Abstract:
In this work, we develop statistical tools to understand core courses at the university level. Traditionally, professors and administrators label courses as "core" when the courses contain foundational material. Such courses are often required to complete a major, and, in some cases, allocated additional educational resources. We identify two key attributes which we expect core courses to have. Namely, we expect core courses to be highly correlated with and highly impactful on a student's overall mathematics GPA. We use two statistical procedures to measure the strength of these attributes across courses. The first of these procedures fashions a metric out of standard correlation measures. The second utilizes sparse regression. We apply these methods on student data coming from the University of California, Los Angeles (UCLA) department of mathematics to compare core and non-core coursework.
Submitted 1 May, 2016;
originally announced May 2016.
-
Learning both Weights and Connections for Efficient Neural Networks
Authors:
Song Han,
Jeff Pool,
John Tran,
William J. Dally
Abstract:
Neural networks are both computationally intensive and memory intensive, making them difficult to deploy on embedded systems. Also, conventional networks fix the architecture before training starts; as a result, training cannot improve the architecture. To address these limitations, we describe a method to reduce the storage and computation required by neural networks by an order of magnitude without affecting their accuracy by learning only the important connections. Our method prunes redundant connections using a three-step method. First, we train the network to learn which connections are important. Next, we prune the unimportant connections. Finally, we retrain the network to fine tune the weights of the remaining connections. On the ImageNet dataset, our method reduced the number of parameters of AlexNet by a factor of 9x, from 61 million to 6.7 million, without incurring accuracy loss. Similar experiments with VGG-16 found that the number of parameters can be reduced by 13x, from 138 million to 10.3 million, again with no loss of accuracy.
Submitted 30 October, 2015; v1 submitted 8 June, 2015;
originally announced June 2015.
-
cuDNN: Efficient Primitives for Deep Learning
Authors:
Sharan Chetlur,
Cliff Woolley,
Philippe Vandermersch,
Jonathan Cohen,
John Tran,
Bryan Catanzaro,
Evan Shelhamer
Abstract:
We present a library of efficient implementations of deep learning primitives. Deep learning workloads are computationally intensive, and optimizing their kernels is difficult and time-consuming. As parallel architectures evolve, kernels must be reoptimized, which makes maintaining codebases difficult over time. Similar issues have long been addressed in the HPC community by libraries such as the Basic Linear Algebra Subroutines (BLAS). However, there is no analogous library for deep learning. Without such a library, researchers implementing deep learning workloads on parallel processors must create and optimize their own implementations of the main computational kernels, and this work must be repeated as new parallel processors emerge. To address this problem, we have created a library similar in intent to BLAS, with optimized routines for deep learning workloads. Our implementation contains routines for GPUs, although similarly to the BLAS library, these routines could be implemented for other platforms. The library is easy to integrate into existing frameworks, and provides optimized performance and memory usage. For example, integrating cuDNN into Caffe, a popular framework for convolutional networks, improves performance by 36% on a standard model while also reducing memory consumption.
Submitted 17 December, 2014; v1 submitted 3 October, 2014;
originally announced October 2014.
-
Parallel Support Vector Machines in Practice
Authors:
Stephen Tyree,
Jacob R. Gardner,
Kilian Q. Weinberger,
Kunal Agrawal,
John Tran
Abstract:
In this paper, we evaluate the performance of various parallel optimization methods for Kernel Support Vector Machines on multicore CPUs and GPUs. In particular, we provide the first comparison of algorithms with explicit and implicit parallelization. Most existing parallel implementations for multi-core or GPU architectures are based on explicit parallelization of Sequential Minimal Optimization (SMO)---the programmers identified parallelizable components and hand-parallelized them, specifically tuned for a particular architecture. We compare these approaches with each other and with implicitly parallelized algorithms---where the algorithm is expressed such that most of the work is done within few iterations with large dense linear algebra operations. These can be computed with highly-optimized libraries, that are carefully parallelized for a large variety of parallel platforms. We highlight the advantages and disadvantages of both approaches and compare them on various benchmark data sets. We find an approximate implicitly parallel algorithm which is surprisingly efficient, permits a much simpler implementation, and leads to unprecedented speedups in SVM training.
Submitted 3 April, 2014;
originally announced April 2014.
-
Site percolation on lattices with low average coordination numbers
Authors:
Ted Y. Yoo,
Jonathan Tran,
Shane P. Stahlheber,
Carina E. Kaainoa,
Kevin Djepang,
Alexander R. Small
Abstract:
We present a study of site and bond percolation on periodic lattices with (on average) fewer than three nearest neighbors per site. We have studied this issue in two contexts: By simulating oxides with a mixture of 2-coordinated and higher-coordinated sites, and by mapping site-bond percolation results onto a site model with mixed coordination number. Our results show that a conjectured power-law relationship between coordination number and site percolation threshold holds approximately if the coordination number is defined as the average number of connections available between high-coordinated sites, and suggest that the conjectured power-law relationship reflects a real phenomenon requiring further study. The solution may be to modify the power-law relationship to be an implicit formula for percolation threshold, one that takes into account aspects of the lattice beyond spatial dimension and average coordination number.
Submitted 7 March, 2014;
originally announced March 2014.
-
Percolation thresholds on 3-dimensional lattices with 3 nearest neighbors
Authors:
Jonathan Tran,
Ted Yoo,
Shane Stahlheber,
Alex Small
Abstract:
We present a study of site and bond percolation on periodic lattices with 3 nearest neighbors per site. We have considered 3 lattices, with different symmetries, different underlying Bravais lattices, and different degrees of longer-range connections. As expected, we find that the site and bond percolation thresholds in all of the 3-connected lattices studied here are significantly higher than in diamond. Interestingly, thresholds for different lattices are similar to within a few percent, despite the differences between the lattices at scales beyond nearest and next-nearest neighbors.
Submitted 28 November, 2012;
originally announced November 2012.
-
Wireless Mesh Network Performance for Urban Search and Rescue Missions
Authors:
Cristina Ribeiro,
Alexander Ferworn,
Jimmy Tran
Abstract:
In this paper we demonstrate that the Canine Pose Estimation (CPE) system can provide a reliable estimate for some poses when coupled with effective wireless transmission over a mesh network. Pose estimates are time sensitive, thus it is important that pose data arrives at its destination quickly. Propagation delay and packet delivery ratio measuring algorithms were developed and used to appraise Wireless Mesh Network (WMN) performance as a means of carriage for this time-critical data. The experiments were conducted in the rooms of a building where the radio characteristics closely resembled those of a partially collapsed building, a typical US&R environment. This paper presents the results of the experiments, which demonstrate that it is possible to receive the canine pose estimation data in real time, although the accuracy of the results depends on the network size and the deployment environment.
Submitted 16 March, 2010;
originally announced March 2010.