-
Missingness Bias in Model Debugging
Authors:
Saachi Jain,
Hadi Salman,
Eric Wong,
Pengchuan Zhang,
Vibhav Vineet,
Sai Vemprala,
Aleksander Madry
Abstract:
Missingness, or the absence of features from an input, is a concept fundamental to many model debugging tools. However, in computer vision, pixels cannot simply be removed from an image. One thus tends to resort to heuristics such as blacking out pixels, which may in turn introduce bias into the debugging process. We study such biases and, in particular, show how transformer-based architectures can enable a more natural implementation of missingness, which side-steps these issues and improves the reliability of model debugging in practice. Our code is available at https://github.com/madrylab/missingness
Submitted 13 June, 2022; v1 submitted 19 April, 2022;
originally announced April 2022.
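The contrast drawn in the abstract above can be illustrated with a small sketch. It is not taken from the authors' repository. It assumes that the "more natural implementation of missingness" amounts to dropping patch tokens from a transformer's input sequence, and the helper names (`patchify`, `black_out`, `drop_tokens`) are hypothetical. The image is a plain grayscale grid whose height and width are assumed to be divisible by the patch size.

```python
from typing import List, Set, Tuple

Image = List[List[float]]  # grayscale image stored as rows of pixel values
Token = List[float]        # one flattened patch


def patchify(image: Image, patch: int) -> List[Token]:
    """Split an image into non-overlapping patch tokens in row-major order.

    Assumes the height and width are multiples of `patch`.
    """
    height, width = len(image), len(image[0])
    tokens: List[Token] = []
    for top in range(0, height, patch):
        for left in range(0, width, patch):
            tokens.append([image[r][c]
                           for r in range(top, top + patch)
                           for c in range(left, left + patch)])
    return tokens


def black_out(image: Image, patch: int, missing: Set[int]) -> List[Token]:
    """Heuristic missingness: keep every token, overwrite the missing ones with zeros."""
    return [[0.0] * len(tok) if i in missing else tok
            for i, tok in enumerate(patchify(image, patch))]


def drop_tokens(image: Image, patch: int, missing: Set[int]) -> List[Tuple[int, Token]]:
    """Token-dropping missingness (assumed reading): keep only the present patches.

    Each kept token is paired with its original position index so that a
    position embedding could still be looked up for it.
    """
    return [(i, tok) for i, tok in enumerate(patchify(image, patch))
            if i not in missing]
```

With blacking out, the model still receives a token for the removed region, and that token carries a specific pixel value (black). With token dropping, the sequence is simply shorter, so no replacement value ever reaches the model.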
-
Recent Formation of a Spiral Disk Hosting Progenitor Globular Clusters at the center of the Perseus Brightest Cluster Galaxy: II. Progenitor Globular Clusters
Authors:
Jeremy Lim,
Emily Wong,
Youichi Ohyama,
Michael C. H. Yeung
Abstract:
We address the nature and origin of Super Star Clusters (SSCs) discovered by Holtzman et al. (1992) within a radius of $\sim$$5\,\rm kpc$ from the center of NGC 1275, the giant elliptical galaxy at the center of the Perseus Cluster. We show that, in contrast with the much more numerous population of SSCs subsequently discovered up to $\sim$$30\,\rm kpc$ from the center of this galaxy, the central SSC population have maximal masses an order of magnitude higher and a mass function with a shallower power-law slope. Furthermore, whereas the outer SSC population have ages spanning a few $\rm Myr$ to at least $\sim$$1\,\rm Gyr$, the central SSC population have ages strongly concentrated around $\sim$$500 \rm \, Myr$ with a $1\,σ$ dispersion of $\sim$$100\,\rm Myr$. These SSCs share a close spatial and temporal relationship with the "central spiral," which also has a radius $\sim$$5\,\rm kpc$ centered on NGC 1275 and a characteristic stellar age of $\sim$$150\,\rm Myr$ (Paper I). We argue that both the central SSC population and the central spiral formed from gas deposited by a residual cooling flow, with the SSCs forming first followed by the formation of the stellar body of the central spiral $\sim$$300$-$400\,\rm Myr$ later. The ages of the central SSC population imply that they are able to withstand very strong tidal fields near the center of NGC 1275, making them genuine progenitor globular clusters. Evidently, a spiral disk hosting progenitor globular clusters has recently formed at the center of a giant elliptical galaxy.
Submitted 8 March, 2022;
originally announced March 2022.
-
DistAD: Software Anomaly Detection Based on Execution Trace Distribution
Authors:
Shiyi Kong,
Jun Ai,
Minyan Lu,
Shuguang Wang,
W. Eric Wong
Abstract:
Modern software systems have become increasingly complex, which makes them difficult to test and validate. Detecting software partial anomalies in complex systems at runtime can assist with handling unintended software behaviors, avoiding catastrophic software failures, and improving software runtime availability. These detection techniques aim to identify the manifestation of faults (anomalies) before they ultimately lead to unavoidable failures, thus supporting subsequent runtime fault-tolerance techniques. In this work, we propose a novel anomaly detection method named DistAD, which is based on the distribution of software runtime dynamic execution traces. Unlike existing works that use key performance indicators, the execution trace is collected during runtime via intrusive instrumentation. Instrumentation is controlled by a sampling mechanism to avoid excessive overhead. Bi-directional Long Short-Term Memory (Bi-LSTM), a Recurrent Neural Network (RNN) architecture, is used to perform the anomaly detection. The whole framework is constructed under a One-Class Neural Network (OCNN) learning mode, which can help mitigate the lack of sufficient labeled samples and data imbalance issues. A series of controlled experiments is conducted on a widely used database system named Cassandra to demonstrate the validity and feasibility of the proposed method. The overhead introduced by the intrusive probing is also evaluated. The results show that DistAD can achieve more than 70% accuracy and 90% recall (in normal states) with no more than 2 times the overhead of unmonitored executions.
Submitted 26 April, 2022; v1 submitted 28 February, 2022;
originally announced February 2022.
-
CCOMPASSION: A Hybrid Cloudlet Placement Framework over Passive Optical Access Networks
Authors:
Sourav Mondal,
Goutam Das,
Elaine Wong
Abstract:
Cloud-based computing technology is one of the most significant technical advents of the last decade and extension of this facility towards access networks by aggregation of cloudlets is a step further. To fulfill the ravenous demand for computational resources entangled with the stringent latency requirements of computationally-heavy applications related to augmented reality, cognitive assistance and context-aware computation, installation of cloudlets near the access segment is a very promising solution because of its support for wide geographical network distribution, low latency, mobility and heterogeneity. In this paper, we propose a novel framework, Cloudlet Cost OptiMization over PASSIve Optical Network (CCOMPASSION), and formulate a nonlinear mixed-integer program to identify optimal cloudlet placement locations such that installation cost is minimized whilst meeting the capacity and latency constraints. Considering urban, suburban and rural scenarios as commonly-used network deployment models, we investigate the feasibility of the proposed model over them and provide guidance on the overall cloudlet facility installation over optical access network. We also study the percentage of incremental energy budget in the presence of cloudlets of the existing network. The final results from our proposed model can be considered as fundamental cornerstones for network planning with hybrid cloudlet network architectures.
Submitted 25 February, 2022;
originally announced February 2022.
-
Projected-Search Methods for Bound-Constrained Optimization
Authors:
Michael W. Ferry,
Philip E. Gill,
Elizabeth Wong,
Minxin Zhang
Abstract:
Projected-search methods for bound-constrained optimization are based on performing a search along a piecewise-linear continuous path obtained by projecting a search direction onto the feasible region. A benefit of these methods is that many changes to the active set can be made at the cost of computing a single search direction. As the objective function is not differentiable along the search path, it is not possible to use a projected-search method with a step that satisfies the Wolfe conditions, which require the directional derivative of the objective function at a point on the path. Thus, methods based on a simple backtracking procedure must be used to give a step that satisfies an "Armijo-like" sufficient decrease condition. As a consequence, conventional projected-search methods are unable to exploit sophisticated safeguarded polynomial interpolation techniques that have been shown to be effective for the unconstrained case.
This paper concerns the formulation and analysis of projected-search methods based on a new quasi-Wolfe line search that is appropriate for piecewise differentiable functions. The behavior of the line search is similar to conventional Wolfe line search, except that a step is accepted under a wider range of conditions. These conditions take into consideration steps at which the restriction of the objective function on the search path is not differentiable. Two new classes of method are proposed that may be broadly categorized as active-set and interior-point methods. Computational results are given for two specific methods from these general classes: a projected-search active-set method that uses a limited-memory quasi-Newton approximation of the Hessian; and a projected-search primal-dual interior-point method. The results show that in these contexts, a quasi-Wolfe line search is substantially more efficient and reliable than an Armijo line search.
Submitted 15 October, 2021;
originally announced October 2021.
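The conventional approach that the abstract above describes, backtracking along the projected path until an "Armijo-like" sufficient decrease condition holds, can be sketched as follows. This is a minimal illustration of that baseline, not the paper's quasi-Wolfe line search. The function names, the parameter defaults (`eta`, `shrink`, `max_backtracks`), and the specific form of the test, $f(x(t)) \le f(x) + \eta\,\nabla f(x)^T (x(t) - x)$ with $x(t) = P(x + t\,d)$, are assumptions chosen for the example.

```python
from typing import Callable, List, Sequence

Vector = List[float]


def project(x: Sequence[float], lower: Sequence[float], upper: Sequence[float]) -> Vector:
    """Project x onto the box {z : lower <= z <= upper}, component by component."""
    return [min(max(xi, lo), hi) for xi, lo, hi in zip(x, lower, upper)]


def projected_armijo_step(
    f: Callable[[Sequence[float]], float],
    x: Sequence[float],
    grad: Sequence[float],
    direction: Sequence[float],
    lower: Sequence[float],
    upper: Sequence[float],
    eta: float = 1e-4,         # sufficient-decrease constant (assumed default)
    shrink: float = 0.5,       # backtracking factor (assumed default)
    max_backtracks: int = 50,  # safeguard on the number of trial steps
) -> Vector:
    """Backtrack along the piecewise-linear path x(t) = P(x + t * direction)."""
    fx = f(x)
    t = 1.0
    for _ in range(max_backtracks):
        trial = project([xi + t * di for xi, di in zip(x, direction)], lower, upper)
        # Armijo-like test: decrease is measured against the projected displacement.
        predicted = sum(gi * (ti - xi) for gi, ti, xi in zip(grad, trial, x))
        if f(trial) <= fx + eta * predicted:
            return trial
        t *= shrink
    return list(x)  # no acceptable step found; stay at the current point


def minimize_on_box(f, grad_f, x0, lower, upper, iterations: int = 20) -> Vector:
    """Projected steepest descent driven by the backtracking step above."""
    x = project(x0, lower, upper)
    for _ in range(iterations):
        g = grad_f(x)
        x = projected_armijo_step(f, x, g, [-gi for gi in g], lower, upper)
    return x
```

Because only function values are tested along the path, this search never needs the directional derivative at a trial point, which is exactly the limitation the abstract attributes to Armijo-based projected-search methods.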
-
Certified Patch Robustness via Smoothed Vision Transformers
Authors:
Hadi Salman,
Saachi Jain,
Eric Wong,
Aleksander Mądry
Abstract:
Certified patch defenses can guarantee robustness of an image classifier to arbitrary changes within a bounded contiguous region. But, currently, this robustness comes at a cost of degraded standard accuracies and slower inference times. We demonstrate how using vision transformers enables significantly better certified patch robustness that is also more computationally efficient and does not incur a substantial drop in standard accuracy. These improvements stem from the inherent ability of the vision transformer to gracefully handle largely masked images. Our code is available at https://github.com/MadryLab/smoothed-vit.
Submitted 11 October, 2021;
originally announced October 2021.
-
Autonomous Investigations over WS$_2$ and Au{111} with Scanning Probe Microscopy
Authors:
John C. Thomas,
Antonio Rossi,
Darian Smalley,
Luca Francaviglia,
Zhuohang Yu,
Tianyi Zhang,
Shalini Kumari,
Joshua A. Robinson,
Mauricio Terrones,
Masahiro Ishigami,
Eli Rotenberg,
Edward S. Barnard,
Archana Raja,
Ed Wong,
D. Frank Ogletree,
Marcus M. Noack,
Alexander Weber-Bargioni
Abstract:
Individual atomic defects in 2D materials impact their macroscopic functionality. Correlating the interplay is challenging; however, intelligent hyperspectral scanning tunneling spectroscopy (STS) mapping provides a feasible solution to this technically difficult and time-consuming problem. Here, dense spectroscopic volume is collected autonomously via Gaussian process regression, where convolutional neural networks are used in tandem for spectral identification. Acquired data enable defect segmentation, and a workflow is provided for machine-driven decision making during experimentation with capability for user customization. We provide a means towards autonomous experimentation for the benefit of both enhanced reproducibility and user-accessibility. Hyperspectral investigations on WS$_2$ sulfur vacancy sites are explored, which are combined with local density of states confirmation on the Au{111} herringbone reconstruction. Chalcogen vacancies, pristine WS$_2$, Au face-centered cubic, and Au hexagonal close packed regions are examined and detected by machine learning methods to demonstrate the potential of artificial intelligence for hyperspectral STS mapping.
Submitted 2 May, 2022; v1 submitted 7 October, 2021;
originally announced October 2021.
-
A Decomposition Algorithm for Large-Scale Security-Constrained AC Optimal Power Flow
Authors:
Frank E. Curtis,
Daniel K. Molzahn,
Shenyinying Tu,
Andreas Wächter,
Ermin Wei,
Elizabeth Wong
Abstract:
A decomposition algorithm for solving large-scale security-constrained AC optimal power flow problems is presented. The formulation considered is the one used in the ARPA-E Grid Optimization (GO) Competition, Challenge 1, held from November 2018 through October 2019. The techniques found to be most effective in terms of performance in the challenge are presented, including strategies for contingency selection, fast contingency evaluation, handling complementarity constraints, avoiding issues related to degeneracy, and exploiting parallelism. The results of numerical experiments are provided to demonstrate the effectiveness of the proposed techniques as compared to alternative strategies.
Submitted 4 October, 2021;
originally announced October 2021.
-
Inhalation and deposition of spherical and pollen particles after middle turbinate resection in a human nasal cavity
Authors:
Kiao Inthavong,
Yidan Shang,
John M. Del Gaudio,
Sarah K. Wise,
Thomas S. Edwards,
Kimberley Bradshaw,
Eugene Wong,
Murray Smith,
Narinder Singh
Abstract:
Middle turbinate resection significantly alters the anatomy and redistributes the inhaled air. The superior half of the main nasal cavity is opened up, increasing accessibility to the region. This is expected to increase inhalation dosimetry to the region during exposure to airborne particles. This study investigated the influence of middle turbinate resection on the deposition of inhaled pollutants that cover spherical and non-spherical particles (e.g. pollen). A computational model of the nasal cavity from CT scans, and its corresponding post-operative model with virtual surgery performed was created. Two constant flow rates of 5 L/min, and 15 L/min were simulated under a laminar flow field. Inhaled particles including pollen (non-spherical), and a spherical particle with reference density of 1000 kg/m$^3$ were introduced in the surrounding atmosphere. The effect of surgery was most prominent in the less patent cavity side, since the change in anatomy was proportionally greater relative to the original airway space. The left cavity produced an increase in particle deposition at a flow rate of 15 L/min. The main particle deposition mechanisms were inertial impaction, and to a lesser degree gravitational sedimentation. The results are expected to provide insight into inhalation efficiency of different aerosol types, and the likelihood of deposition in different nasal cavity surfaces.
Submitted 8 August, 2021;
originally announced August 2021.
-
The impact of nasal adhesions on airflow and mucosal cooling -- a computational fluid dynamics analysis
Authors:
Praween Senanayakea,
Hana Salati,
Eugene Wong,
Kimberley Bradshaw,
Yidan Shang,
Narinder Singh,
Kiao Inthavong
Abstract:
Nasal adhesions are a known postoperative complication following surgical procedures for nasal airway obstruction (NAO); and are a common cause of surgical failure, with patients often reporting significant NAO, despite relatively minor adhesion size. Division of such nasal adhesions often provides much greater relief than anticipated, based on the minimal reduction in cross-sectional area associated with the adhesion. The available literature regarding nasal adhesions provides little evidence examining their quantitative and qualitative effects on nasal airflow using objective measures. This study examined the impact of nasal adhesions at various anatomical sites on nasal airflow and mucosal cooling using computational fluid dynamics (CFD). A high-resolution CT scan of the paranasal sinuses of a 25-year-old, healthy female patient was segmented to create a three-dimensional nasal airway model. Virtual nasal adhesions of 2.5 mm diameter were added to various locations within the nasal cavity, representing common sites seen following NAO surgery. A series of models with single adhesions were created. CFD analysis was performed on each model and compared with a baseline no-adhesion model, comparing airflow and heat and mass transfer. The nasal adhesions resulted in no significant change in bulk airflow patterns through the nasal cavity. However, significant changes were observed in local airflow and mucosal cooling around and immediately downstream to the nasal adhesions. These were most evident with anterior nasal adhesions at the internal valve and anterior inferior turbinate.
Submitted 8 July, 2021;
originally announced July 2021.
-
Analysis of the Evolution of Parametric Drivers of High-End Sea-Level Hazards
Authors:
Alana Hough,
Tony E. Wong
Abstract:
Climate models are critical tools for developing strategies to manage the risks posed by sea-level rise to coastal communities. While these models are necessary for understanding climate risks, there is a level of uncertainty inherent in each parameter in the models. This model parametric uncertainty leads to uncertainty in future climate risks. Consequently, there is a need to understand how those parameter uncertainties impact our assessment of future climate risks and the efficacy of strategies to manage them. Here, we use random forests to examine the parametric drivers of future climate risk and how the relative importances of those drivers change over time. We find that the equilibrium climate sensitivity and a factor that scales the effect of aerosols on radiative forcing are consistently the most important climate model parametric uncertainties throughout the 2020 to 2150 interval for both low and high radiative forcing scenarios. The near-term hazards of high-end sea-level rise are driven primarily by thermal expansion, while the longer-term hazards are associated with mass loss from the Antarctic and Greenland ice sheets. Our results highlight the practical importance of considering time-evolving parametric uncertainties when developing strategies to manage future climate risks.
Submitted 10 June, 2021;
originally announced June 2021.
-
DeepSplit: Scalable Verification of Deep Neural Networks via Operator Splitting
Authors:
Shaoru Chen,
Eric Wong,
J. Zico Kolter,
Mahyar Fazlyab
Abstract:
Analyzing the worst-case performance of deep neural networks against input perturbations amounts to solving a large-scale non-convex optimization problem, for which several past works have proposed convex relaxations as a promising alternative. However, even for reasonably-sized neural networks, these relaxations are not tractable, and so must be replaced by even weaker relaxations in practice. In this work, we propose a novel operator splitting method that can directly solve a convex relaxation of the problem to high accuracy, by splitting it into smaller sub-problems that often have analytical solutions. The method is modular, scales to very large problem instances, and comprises operations that are amenable to fast parallelization with GPU acceleration. We demonstrate our method in bounding the worst-case performance of large convolutional networks in image classification and reinforcement learning settings, and in reachability analysis of neural network dynamical systems.
Submitted 8 July, 2022; v1 submitted 16 June, 2021;
originally announced June 2021.
-
Binomial Determinants for Tiling Problems Yield to the Holonomic Ansatz
Authors:
Hao Du,
Christoph Koutschan,
Thotsaporn Thanatipanonda,
Elaine Wong
Abstract:
We present and prove closed form expressions for some families of binomial determinants with signed Kronecker deltas that are located along an arbitrary diagonal in the corresponding matrix. They count cyclically symmetric rhombus tilings of hexagonal regions with triangular holes. We extend a previous systematic study of these families, where the locations of the Kronecker deltas depended on an additional parameter, to families with negative Kronecker deltas. By adapting Zeilberger's holonomic ansatz to make it work for our problems, we can take full advantage of computer algebra tools for symbolic summation. This, together with the combinatorial interpretation, allows us to realize some new determinantal relationships. From there, we are able to resolve all remaining open conjectures related to these determinants, including one from 2005 due to Lascoux and Krattenthaler.
Submitted 21 September, 2021; v1 submitted 18 May, 2021;
originally announced May 2021.
-
Leveraging Sparse Linear Layers for Debuggable Deep Networks
Authors:
Eric Wong,
Shibani Santurkar,
Aleksander Mądry
Abstract:
We show how fitting sparse linear models over learned deep feature representations can lead to more debuggable neural networks. These networks remain highly accurate while also being more amenable to human interpretation, as we demonstrate quantitatively via numerical and human experiments. We further illustrate how the resulting sparse explanations can help to identify spurious correlations, explain misclassifications, and diagnose model biases in vision and language tasks. The code for our toolkit can be found at https://github.com/madrylab/debuggabledeepnetworks.
Submitted 11 May, 2021;
originally announced May 2021.
-
Localization and reduction of superconducting quantum coherent circuit losses
Authors:
M. Virginia P. Altoé,
Archan Banerjee,
Cassidy Berk,
Ahmed Hajr,
Adam Schwartzberg,
Chengyu Song,
Mohammed Al Ghadeer,
Shaul Aloni,
Michael J. Elowson,
John Mark Kreikebaum,
Ed K. Wong,
Sinead Griffin,
Saleem Rao,
Alexander Weber-Bargioni,
Andrew M. Minor,
David I. Santiago,
Stefano Cabrini,
Irfan Siddiqi,
D. Frank Ogletree
Abstract:
Quantum sensing and computation can be realized with superconducting microwave circuits. Qubits are engineered quantum systems of capacitors and inductors with non-linear Josephson junctions. They operate in the single-excitation quantum regime, photons of $27 μ$eV at 6.5 GHz. Quantum coherence is fundamentally limited by materials defects, in particular atomic-scale parasitic two-level systems (TLS) in amorphous dielectrics at circuit interfaces.[1] The electric fields driving oscillating charges in quantum circuits resonantly couple to TLS, producing phase noise and dissipation. We use coplanar niobium-on-silicon superconducting resonators to probe decoherence in quantum circuits. By selectively modifying interface dielectrics, we show that most TLS losses come from the silicon surface oxide, and most non-TLS losses are distributed throughout the niobium surface oxide. Through post-fabrication interface modification we reduced TLS losses by 85% and non-TLS losses by 72%, obtaining record single-photon resonator quality factors above 5 million and approaching a regime where non-TLS losses are dominant.
[1]Müller, C., Cole, J. H. & Lisenfeld, J. Towards understanding two-level-systems in amorphous solids: insights from quantum circuits. Rep. Prog. Phys. 82, 124501 (2019)
Submitted 14 December, 2020;
originally announced December 2020.
-
Experimental measurement of the intrinsic excitonic wavefunction
Authors:
Michael K. L. Man,
Julien Madéo,
Chakradhar Sahoo,
Kaichen Xie,
Marshall Campbell,
Vivek Pareek,
Arka Karmakar,
E Laine Wong,
Abdullah Al-Mahboob,
Nicholas S. Chan,
David R. Bacon,
Xing Zhu,
Mohamed Abdelrasoul,
Xiaoquin Li,
Tony F. Heinz,
Felipe H. da Jornada,
Ting Cao,
Keshav M. Dani
Abstract:
An exciton, a two-body composite quasiparticle formed of an electron and hole, is a fundamental optical excitation in condensed-matter systems. Since its discovery nearly a century ago, a measurement of the excitonic wavefunction has remained beyond experimental reach. Here, we directly image the excitonic wavefunction in reciprocal space by measuring the momentum distribution of electrons photoemitted from excitons in monolayer WSe2. By transforming to real space, we obtain a visual of the distribution of the electron around the hole in an exciton. Further, by also resolving the energy coordinate, we confirm the elusive theoretical prediction that the photoemitted electron exhibits an inverted energy-momentum dispersion relationship reflecting the valence band where the partner hole remains, rather than that of conduction-band states of the electron.
Submitted 25 November, 2020;
originally announced November 2020.
-
Creative Telescoping on Multiple Sums
Authors:
Christoph Koutschan,
Elaine Wong
Abstract:
We showcase a collection of practical strategies to deal with a problem arising from an analysis of integral estimators derived via quasi-Monte Carlo methods. The problem reduces to a triple binomial sum, thereby enabling us to open up the holonomic toolkit, which contains tools such as creative telescoping that can be used to deduce a recurrence satisfied by the sum. While applying these techniques, a host of issues arose that partly needed to be resolved by hand. In other words, no creative telescoping implementation currently exists that can resolve all these issues automatically. Thus, we felt the need to compile the different strategies we tried and the difficulties that we encountered along the way. In particular, we highlight the necessity of the certificate in these computations and how its complexity can greatly influence the computation time.
Submitted 17 March, 2021; v1 submitted 17 October, 2020;
originally announced October 2020.
-
Sustained formation of progenitor globular clusters in a giant elliptical galaxy
Authors:
Jeremy Lim,
Emily Wong,
Youichi Ohyama,
Tom Broadhurst,
Elinor Medezinski
Abstract:
Globular clusters (GCs) are thought to be ancient relics from the early formative phase of galaxies, although their physical origin remains uncertain. GCs are most numerous around massive elliptical galaxies, where they can exhibit a broad colour dispersion, suggesting a wide metallicity spread. Here, we show that many thousands of compact and massive (~5$\times$10$^{\rm 3}-$3$\times$ 10$^{\rm 6} M_{\odot}$) star clusters have formed at an approximately steady rate over, at least, the past ~1Gyr around NGC 1275, the central giant elliptical galaxy of the Perseus cluster. Beyond ~1Gyr, these star clusters are indistinguishable in broadband optical colours from the more numerous GCs. Their number distribution exhibits a similar dependence with luminosity and mass as the GCs, whereas their spatial distribution resembles a filamentary network of multiphase gas associated with cooling of the intracluster gas. The sustained formation of these star clusters demonstrates that progenitor GCs can form over cosmic history from cooled intracluster gas, thus contributing to both the large number and broad colour dispersion$-$owing to an age spread, in addition to a spread in metallicity$-$of GCs in massive elliptical galaxies. The progenitor GCs have minimal masses well below the maximal masses of Galactic open star clusters, affirming a common formation mechanism for star clusters over all mass scales irrespective of their formative pathways.
Submitted 10 October, 2020; v1 submitted 8 October, 2020;
originally announced October 2020.
-
Learning perturbation sets for robust machine learning
Authors:
Eric Wong,
J. Zico Kolter
Abstract:
Although much progress has been made towards robust deep learning, a significant gap in robustness remains between real-world perturbations and more narrowly defined sets typically studied in adversarial defenses. In this paper, we aim to bridge this gap by learning perturbation sets from data, in order to characterize real-world effects for robust training and evaluation. Specifically, we use a conditional generator that defines the perturbation set over a constrained region of the latent space. We formulate desirable properties that measure the quality of a learned perturbation set, and theoretically prove that a conditional variational autoencoder naturally satisfies these criteria. Using this framework, our approach can generate a variety of perturbations at different complexities and scales, ranging from baseline spatial transformations, through common image corruptions, to lighting variations. We measure the quality of our learned perturbation sets both quantitatively and qualitatively, finding that our models are capable of producing a diverse set of meaningful perturbations beyond the limited data seen during training. Finally, we leverage our learned perturbation sets to train models which are empirically and certifiably robust to adversarial image corruptions and adversarial lighting variations, while improving generalization on non-adversarial data. All code and configuration files for reproducing the experiments as well as pretrained model weights can be found at https://github.com/locuslab/perturbation_learning.
Submitted 8 October, 2020; v1 submitted 16 July, 2020;
originally announced July 2020.
-
Neural Network Virtual Sensors for Fuel Injection Quantities with Provable Performance Specifications
Authors:
Eric Wong,
Tim Schneider,
Joerg Schmitt,
Frank R. Schmidt,
J. Zico Kolter
Abstract:
Recent work has shown that it is possible to learn neural networks with provable guarantees on the output of the model when subject to input perturbations; however, these works have focused primarily on defending against adversarial examples for image classifiers. In this paper, we study how these provable guarantees can be naturally applied to other real-world settings, namely getting performance specifications for robust virtual sensors measuring fuel injection quantities within an engine. We first demonstrate that, in this setting, even simple neural network models are highly susceptible to reasonable levels of adversarial sensor noise, which are capable of increasing the mean relative error of a standard neural network from 6.6% to 43.8%. We then leverage methods for learning provably robust networks and verifying robustness properties, resulting in a robust model which we can provably guarantee has at most 16.5% mean relative error under any sensor noise. Additionally, we show how specific intervals of fuel injection quantities can be targeted to maximize robustness for certain ranges, allowing us to train a virtual sensor for fuel injection which is provably guaranteed to have at most 10.69% relative error under noise while maintaining 3% relative error on non-adversarial data within normalized fuel injection ranges of 0.6 to 1.0.
Submitted 30 June, 2020;
originally announced July 2020.
-
Evidence for increasing frequency of extreme coastal sea levels
Authors:
Tony E. Wong,
Travis Torline,
Mingxuan Zhang
Abstract:
Projections of extreme sea levels (ESLs) are critical for managing coastal risks, but are made complicated by deep uncertainties. One key uncertainty is the choice of model structure used to estimate coastal hazards. Differences in model structural choices contribute to uncertainty in estimated coastal hazard, so it is important to characterize how model structural choice affects estimates of ESL. Here, we present a collection of 36 ESL data sets, from tide gauge stations along the United States East and Gulf Coasts. The data are processed using both annual block maxima and peaks-over-thresholds approaches for modeling distributions of extremes. We use these data sets to fit a suite of potentially nonstationary extreme value models by covarying the ESL statistics with multiple climate variables. We demonstrate how this data set enables inquiry into deep uncertainty surrounding coastal hazards. For all of the models and sites considered here, we find that accounting for changes in the frequency of coastal extreme sea levels provides a better fit than using a stationary extreme value model.
Submitted 11 June, 2020;
originally announced June 2020.
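
The abstract above names two standard ways of extracting extremes from a tide gauge record: annual block maxima and peaks over threshold. The following is a minimal sketch of both, not the authors' processing code. The function names, the `min_gap` declustering rule, and the toy numbers are assumptions made for illustration.

```python
from collections import defaultdict


def annual_block_maxima(records):
    """Return {year: highest level} from an iterable of (year, level) pairs."""
    maxima = defaultdict(lambda: float("-inf"))
    for year, level in records:
        maxima[year] = max(maxima[year], level)
    return dict(maxima)


def peaks_over_threshold(levels, threshold, min_gap=1):
    """Return one peak per cluster of exceedances above `threshold`.

    Exceedances whose indices are at most `min_gap` apart are treated as one
    event (a simple, hypothetical declustering rule), and only the largest
    value of each event is kept.
    """
    peaks = []
    last_idx = None  # index of the most recent exceedance
    for i, level in enumerate(levels):
        if level <= threshold:
            continue
        if last_idx is not None and i - last_idx <= min_gap:
            # Same event as the previous exceedance: keep the larger value.
            peaks[-1] = max(peaks[-1], level)
        else:
            # A new event starts here.
            peaks.append(level)
        last_idx = i
    return peaks


# Toy data (made up for illustration only).
records = [(2000, 1.2), (2000, 1.8), (2001, 0.9)]
levels = [1.0, 2.5, 3.0, 1.0, 1.0, 2.2, 1.0]
block_maxima = annual_block_maxima(records)
pot_peaks = peaks_over_threshold(levels, threshold=2.0, min_gap=1)
```

Block maxima keep one value per year, whereas peaks over threshold can keep several events in an active year and none in a quiet one, which is one reason the two approaches can lead to different fitted extreme value models.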
-
Walsh functions, scrambled $(0,m,s)$-nets, and negative covariance: applying symbolic computation to quasi-Monte Carlo integration
Authors:
Jaspar Wiart,
Elaine Wong
Abstract:
We investigate base $b$ Walsh functions for which the variance of the integral estimator based on a scrambled $(0,m,s)$-net in base $b$ is less than or equal to that of the Monte-Carlo estimator based on the same number of points. First we compute the Walsh decomposition for the joint probability density function of two distinct points randomly chosen from a scrambled $(t,m,s)$-net in base $b$ in terms of certain counting numbers and simplify it in the special case where $t$ is zero. Using this, we obtain an expression for the covariance of the integral estimator in terms of the Walsh coefficients of the function. Finally, we prove that the covariance of the integral estimator is negative when the Walsh coefficients of the function satisfy a certain decay condition. To do this, we use creative telescoping and recurrence solving algorithms from symbolic computation to find a sign equivalent closed form expression for the covariance term.
Submitted 11 June, 2020;
originally announced June 2020.
-
Centralized and Decentralized Non-Cooperative Load-Balancing Games among Federated Cloudlets
Authors:
Sourav Mondal,
Goutam Das,
Elaine Wong
Abstract:
Edge computing servers such as cloudlets from different service providers, which compensate for the scarce computational, memory, and energy resources of mobile devices, are distributed across access networks. However, depending on the mobility pattern and dynamically varying computational requirements of associated mobile devices, cloudlets at different parts of the network become either overloaded or under-loaded. Hence, load balancing among neighboring cloudlets appears to be an essential research problem. Nonetheless, the existing load balancing frameworks are unsuitable for low-latency applications. Thus, in this paper, we propose an economic and non-cooperative load balancing game for low-latency applications among federated neighboring cloudlets from the same as well as different service providers and heterogeneous classes of job requests. Firstly, we propose a centralized incentive mechanism to compute the pure strategy Nash equilibrium load balancing strategies of the cloudlets under the supervision of a neutral mediator. With this mechanism, we ensure that the truthful revelation of private information to the mediator is a weakly-dominant strategy for all the federated cloudlets. Secondly, we propose a continuous-action reinforcement learning automata-based algorithm, which allows each cloudlet to independently compute the Nash equilibrium in a completely distributed network setting. We critically study the convergence properties of the designed learning algorithm, scaffolding our understanding of the underlying load balancing game for faster convergence. Furthermore, through extensive simulations, we study the impacts of exploration and exploitation on learning accuracy. This is the first study to show the effectiveness of reinforcement learning algorithms for load balancing games among neighboring cloudlets.
Submitted 5 May, 2021; v1 submitted 30 May, 2020;
originally announced June 2020.
-
Strain-Induced Room-Temperature Ferroelectricity in SrTiO$_3$ Membranes
Authors:
Ruijuan Xu,
Jiawei Huang,
Edward S. Barnard,
Seung Sae Hong,
Prastuti Singh,
Ed K. Wong,
Thies Jansen,
Varun Harbola,
Jun Xiao,
Bai Yang Wang,
Sam Crossley,
Di Lu,
Shi Liu,
Harold Y. Hwang
Abstract:
Advances in complex oxide heteroepitaxy have highlighted the enormous potential of utilizing strain engineering via lattice mismatch to control ferroelectricity in thin-film heterostructures. This approach, however, lacks the ability to produce large and continuously variable strain states, thus limiting the potential for designing and tuning the desired properties of ferroelectric films. Here, we observe and explore dynamic strain-induced ferroelectricity in SrTiO$_3$ by laminating freestanding oxide films onto a stretchable polymer substrate. Using a combination of scanning probe microscopy, optical second harmonic generation measurements, and atomistic modeling, we demonstrate robust room-temperature ferroelectricity in SrTiO$_3$ with 2.0% uniaxial tensile strain, corroborated by the notable features of 180° ferroelectric domains and an extrapolated transition temperature of 400 K. Our work reveals the enormous potential of employing oxide membranes to create and enhance ferroelectricity in environmentally benign lead-free oxides, which hold great promise for applications ranging from non-volatile memories to microwave electronics.
Submitted 18 May, 2020;
originally announced May 2020.
-
Directly visualizing the momentum forbidden dark excitons and their dynamics in atomically thin semiconductors
Authors:
Julien Madéo,
Michael K. L. Man,
Chakradhar Sahoo,
Marshall Campbell,
Vivek Pareek,
E Laine Wong,
Abdullah Al Mahboob,
Nicholas S. Chan,
Arka Karmakar,
Bala Murali Krishna Mariserla,
Xiaoqin Li,
Tony F. Heinz,
Ting Cao,
Keshav M. Dani
Abstract:
Resolving the momentum degree of freedom of excitons, electron-hole pairs bound by the Coulomb attraction in a photoexcited semiconductor, has remained a largely elusive goal for decades. In atomically thin semiconductors, such a capability could probe the momentum forbidden dark excitons, which critically impact proposed opto-electronic technologies, but are not directly accessible via optical techniques. Here, we probe the momentum-state of excitons in a WSe2 monolayer by photoemitting their constituent electrons, and resolving them in time, momentum and energy. We obtain a direct visual of the momentum forbidden dark excitons, and study their properties, including their near-degeneracy with bright excitons and their formation pathways in the energy-momentum landscape. These dark excitons dominate the excited state distribution, a surprising finding that highlights their importance in atomically thin semiconductors.
Submitted 1 May, 2020;
originally announced May 2020.
-
Long-Range Exciton Diffusion in Two-Dimensional Assemblies of Cesium Lead Bromide Perovskite Nanocrystals
Authors:
Erika Penzo,
Anna Loiudice,
Edward S. Barnard,
Nicholas J. Borys,
Matthew J. Jurow,
Monica Lorenzon,
Igor Rajzbaum,
Edward K. Wong,
Yi Liu,
Adam M. Schwartzberg,
Stefano Cabrini,
Stephen Whitelam,
Raffaella Buonsanti,
Alexander Weber-Bargioni
Abstract:
Förster Resonant Energy Transfer (FRET)-mediated exciton diffusion through artificial nanoscale building block assemblies could be used as a new optoelectronic design element to transport energy. However, so far nanocrystal (NC) systems have supported diffusion lengths of only 30 nm, which are too small to be useful in devices. Here, we demonstrate a FRET-mediated exciton diffusion length of 200 nm with 0.5 cm$^2$/s diffusivity through an ordered, two-dimensional assembly of cesium lead bromide perovskite nanocrystals (PNC). Exciton diffusion was directly measured via steady-state and time-resolved photoluminescence (PL) microscopy, with physical modeling providing deeper insight into the transport process. This exceptionally efficient exciton transport is facilitated by the PNCs' high PL quantum yield, large absorption cross-section, and high polarizability, together with minimal energetic and geometric disorder of the assembly. This FRET-mediated exciton diffusion length matches the perovskites' optical absorption depth, opening the possibility to design new optoelectronic device architectures with improved performances, and providing insight into the high conversion efficiencies of PNC-based optoelectronic devices.
Submitted 1 September, 2020; v1 submitted 9 March, 2020;
originally announced March 2020.
-
Overfitting in adversarially robust deep learning
Authors:
Leslie Rice,
Eric Wong,
J. Zico Kolter
Abstract:
It is common practice in deep learning to use overparameterized networks and train for as long as possible; there are numerous studies that show, both theoretically and empirically, that such practices surprisingly do not unduly harm the generalization performance of the classifier. In this paper, we empirically study this phenomenon in the setting of adversarially trained deep networks, which are trained to minimize the loss under worst-case adversarial perturbations. We find that overfitting to the training set does in fact harm robust performance to a very large degree in adversarially robust training across multiple datasets (SVHN, CIFAR-10, CIFAR-100, and ImageNet) and perturbation models ($\ell_\infty$ and $\ell_2$). Based upon this observed effect, we show that the performance gains of virtually all recent algorithmic improvements upon adversarial training can be matched by simply using early stopping. We also show that effects such as the double descent curve do still occur in adversarially trained models, yet fail to explain the observed overfitting. Finally, we study several classical and modern deep learning remedies for overfitting, including regularization and data augmentation, and find that no approach in isolation improves significantly upon the gains achieved by early stopping. All code for reproducing the experiments as well as pretrained model weights and training logs can be found at https://github.com/locuslab/robust_overfitting.
Submitted 4 March, 2020; v1 submitted 26 February, 2020;
originally announced February 2020.
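
Early stopping, which the abstract above credits with matching recent algorithmic gains, can be sketched as picking the epoch with the lowest held-out robust error. This is a minimal illustration under stated assumptions, not the authors' training loop. The abstract does not specify a stopping criterion, so the validation-based rule, the `patience` parameter, and the error values below are all hypothetical.

```python
def early_stopping(val_robust_errors, patience=2):
    """Return (best_epoch, best_error) under a simple patience rule.

    `val_robust_errors[i]` is the robust error measured on held-out data
    after epoch i. Scanning stops once `patience` epochs have passed
    without improving on the best error seen so far.
    """
    best_epoch, best_err = 0, float("inf")
    for epoch, err in enumerate(val_robust_errors):
        if err < best_err:
            # New best checkpoint: this is the one to keep.
            best_epoch, best_err = epoch, err
        elif epoch - best_epoch >= patience:
            # No improvement for `patience` epochs: stop training here.
            break
    return best_epoch, best_err


# Hypothetical per-epoch robust validation errors.
errors = [0.60, 0.52, 0.48, 0.50, 0.55, 0.40]
chosen = early_stopping(errors, patience=2)
```

With `patience=2` the scan stops at epoch 4 and keeps the checkpoint from epoch 2, so the later value 0.40 is never reached. A larger `patience` trades extra training time for a chance of finding such late improvements.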
-
An Additive Decomposition in S-Primitive Towers
Authors:
Hao Du,
Jing Guo,
Ziming Li,
Elaine Wong
Abstract:
We consider the additive decomposition problem in primitive towers and present an algorithm to decompose a function in an S-primitive tower as a sum of a derivative in the tower and a remainder which is minimal in some sense. Special instances of S-primitive towers include differential fields generated by finitely many logarithmic functions and logarithmic integrals. A function in an S-primitive tower is integrable in the tower if and only if the remainder is equal to zero. The additive decomposition is achieved by viewing our towers not as a traditional chain of extension fields, but rather as a direct sum of certain subrings. Furthermore, we can determine whether or not a function in an S-primitive tower has an elementary integral without solving any differential equations. We also show that a kind of S-primitive towers, known as logarithmic towers, can be embedded into a particular extension where we can obtain a finer remainder.
Submitted 6 February, 2020;
originally announced February 2020.
-
Fast is better than free: Revisiting adversarial training
Authors:
Eric Wong,
Leslie Rice,
J. Zico Kolter
Abstract:
Adversarial training, a method for learning robust deep networks, is typically assumed to be more expensive than traditional training due to the necessity of constructing adversarial examples via a first-order method like projected gradient descent (PGD). In this paper, we make the surprising discovery that it is possible to train empirically robust models using a much weaker and cheaper adversary, an approach that was previously believed to be ineffective, rendering the method no more costly than standard training in practice. Specifically, we show that adversarial training with the fast gradient sign method (FGSM), when combined with random initialization, is as effective as PGD-based training but has significantly lower cost. Furthermore, we show that FGSM adversarial training can be further accelerated by using standard techniques for efficient training of deep networks, allowing us to learn a robust CIFAR10 classifier with 45% robust accuracy to PGD attacks with $ε=8/255$ in 6 minutes, and a robust ImageNet classifier with 43% robust accuracy at $ε=2/255$ in 12 hours, in comparison to past work based on "free" adversarial training which took 10 and 50 hours to reach the same respective thresholds. Finally, we identify a failure mode referred to as "catastrophic overfitting" which may have caused previous attempts to use FGSM adversarial training to fail. All code for reproducing the experiments in this paper as well as pretrained model weights are at https://github.com/locuslab/fast_adversarial.
Submitted 12 January, 2020;
originally announced January 2020.
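
The core idea in the abstract above, a single FGSM step taken from a random starting point inside the $\ell_\infty$ ball, can be sketched in a few lines. This is a framework-free illustration, not the authors' released code. The function name, the `grad_fn` callback, and the step size `alpha` are assumptions; a real implementation would obtain the gradient from a deep learning framework.

```python
import random


def sign(v):
    """Return -1, 0, or 1 according to the sign of v."""
    return (v > 0) - (v < 0)


def fgsm_random_init(x, grad_fn, eps, alpha, rng=random):
    """Return an l_inf perturbation of x with every |delta_i| <= eps.

    grad_fn(x_adv) is assumed to return the gradient of the training loss
    with respect to the input, as a list of the same length as x.
    """
    # 1. Start from a uniformly random point inside the eps-ball.
    delta = [rng.uniform(-eps, eps) for _ in x]
    # 2. Evaluate the loss gradient at the randomly perturbed input.
    grad = grad_fn([xi + di for xi, di in zip(x, delta)])
    # 3. Take one signed gradient step, then clip back into the eps-ball.
    return [
        max(-eps, min(eps, di + alpha * sign(gi)))
        for di, gi in zip(delta, grad)
    ]


# Toy usage with a fixed, made-up gradient.
rng = random.Random(0)
delta = fgsm_random_init(
    [0.5, 0.5, 0.5], lambda x_adv: [1.0, -1.0, 0.0], eps=0.1, alpha=0.2, rng=rng
)
```

In adversarial training, the model would then be updated on `x + delta` instead of `x`. The single gradient evaluation per example is what makes this cheaper than a multi-step PGD adversary.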
-
Examining the surface phase diagram of IrTe$_2$ with photoemission
Authors:
M. Rumo,
C. W. Nicholson,
A. Pulkkinen,
B. Hildebrand,
G. Kremer,
B. Salzmann,
M. -L. Mottas,
K. Y. Ma,
E L. Wong,
M. K. L. Man,
K. M. Dani,
B. Barbiellini,
M. Muntwiller,
T. Jaouen,
F. O. von Rohr,
C. Monney
Abstract:
In the transition metal dichalcogenide IrTe$_2$, low-temperature charge-ordered phase transitions involving Ir dimers lead to the occurrence of stripe phases of different periodicities, and nearly degenerate energies. Bulk-sensitive measurements have shown that, upon cooling, IrTe$_2$ undergoes two such first-order transitions to $(5\times1\times5)$ and $(8\times1\times8)$ reconstructed phases at $T_{c_1}\sim 280$~K and $T_{c_2}\sim 180$~K, respectively. Here, using surface sensitive probes of the electronic structure of IrTe$_2$, we reveal the first-order phase transition at $T_{c_3}=165$~K to the $(6\times1)$ stripes phase, previously proposed to be the surface ground state. This is achieved by combining x-ray photoemission spectroscopy and angle-resolved photoemission spectroscopy, which give access to the evolution of stripe domains and a particular surface state, the energy of which is dependent on the Ir dimer length. By performing measurements over a full thermal cycle, we also report the complete hysteresis of all these phases.
Submitted 9 June, 2020; v1 submitted 10 December, 2019;
originally announced December 2019.
-
A tighter constraint on Earth-system sensitivity from long-term temperature and carbon-cycle observations
Authors:
Tony E. Wong,
Ying Cui,
Dana L. Royer,
Klaus Keller
Abstract:
The long-term temperature response to a given change in CO2 forcing, or Earth-system sensitivity (ESS), is a key parameter quantifying our understanding about the relationship between changes in Earth's radiative forcing and the resulting long-term Earth-system response. Current ESS estimates are subject to sizable uncertainties. Long-term carbon cycle models can provide a useful avenue to constrain ESS, but previous efforts either use rather informal statistical approaches or focus on discrete paleoevents. Here, we improve on previous ESS estimates by using a Bayesian approach to fuse deep-time CO2 and temperature data over the last 420 Myrs with a long-term carbon cycle model. Our median ESS estimate of 3.4 deg C (2.6-4.7 deg C; 5-95% range) shows a narrower range than previous assessments. We show that weaker chemical weathering relative to the a priori model configuration via reduced weatherable land area yields better agreement with temperature records during the Cretaceous. Research into improving the understanding about these weathering mechanisms hence provides potentially powerful avenues to further constrain this fundamental Earth-system property.
Submitted 1 March, 2021; v1 submitted 25 October, 2019;
originally announced October 2019.
-
Class Mean Vectors, Self Monitoring and Self Learning for Neural Classifiers
Authors:
Eugene Wong
Abstract:
In this paper we explore the role of sample mean in building a neural network for classification. This role is surprisingly extensive and includes: direct computation of weights without training, performance monitoring for samples without known classification, and self-training for unlabeled data. Experimental computation on a CIFAR-10 data set provides promising empirical evidence on the efficacy of a simple and widely applicable approach to some difficult problems.
Submitted 22 October, 2019;
originally announced October 2019.
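
One simple way in which sample means can define a classifier without any training is a nearest-class-mean rule, sketched below. This is an illustrative assumption, not the construction used in the paper, which the abstract does not spell out. The function names and toy data are hypothetical.

```python
def class_means(samples, labels):
    """Return {label: mean vector} computed from labelled samples."""
    sums, counts = {}, {}
    for x, y in zip(samples, labels):
        if y not in sums:
            sums[y] = [0.0] * len(x)
            counts[y] = 0
        # Accumulate a per-class running sum and count.
        sums[y] = [s + xi for s, xi in zip(sums[y], x)]
        counts[y] += 1
    return {y: [s / counts[y] for s in sums[y]] for y in sums}


def predict(means, x):
    """Return the label whose class mean is closest to x (squared Euclidean)."""
    def sq_dist(mean):
        return sum((mi - xi) ** 2 for mi, xi in zip(mean, x))
    return min(means, key=lambda y: sq_dist(means[y]))


# Toy two-class data (made up for illustration).
samples = [[0.0, 0.0], [0.0, 2.0], [4.0, 4.0], [6.0, 4.0]]
labels = [0, 0, 1, 1]
means = class_means(samples, labels)
```

Because the means are computed directly from the data, the classifier is obtained in one pass with no gradient-based training. In this sketch, the distance from a new sample to its nearest mean could also serve as a rough confidence signal for unlabeled data.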
-
Electrically driven photon emission from individual atomic defects in monolayer WS2
Authors:
Bruno Schuler,
Katherine A. Cochrane,
Christoph Kastl,
Ed Barnard,
Ed Wong,
Nicholas Borys,
Adam M. Schwartzberg,
D. Frank Ogletree,
F. Javier García de Abajo,
Alexander Weber-Bargioni
Abstract:
Optical quantum emitters are a key component of quantum devices for metrology and information processing. In particular, atomic defects in 2D materials can operate as optical quantum emitters that overcome current limitations of conventional bulk emitters, such as yielding a high single-photon generation rate and offering surface accessibility for excitation and photon extraction. Here we demonstrate electrically stimulated photon emission from individual point defects in a 2D material. Specifically, by bringing a metallic tip into close proximity to a discrete defect state in the band gap of WS2, we induce inelastic tip-to-defect electron tunneling with an excess of transition energy carried by the emitted photons. We gain atomic spatial control over the emission through the position of the tip, while the spectral characteristics are highly customizable by varying the applied tip-sample voltage. Atomically resolved emission maps of individual sulfur vacancies and chromium substituent defects are in excellent agreement with the electron density of their respective defect orbitals as imaged via conventional elastic scanning tunneling microscopy. Inelastic charge-carrier injection into localized defect states of 2D materials thus provides a powerful platform for electrically driven, broadly tunable, atomic-scale single-photon sources.
Submitted 10 October, 2019;
originally announced October 2019.
-
Adversarial Robustness Against the Union of Multiple Perturbation Models
Authors:
Pratyush Maini,
Eric Wong,
J. Zico Kolter
Abstract:
Owing to the susceptibility of deep learning systems to adversarial attacks, there has been a great deal of work in developing (both empirically and certifiably) robust classifiers. While most work has defended against a single type of attack, recent work has looked at defending against multiple perturbation models using simple aggregations of multiple attacks. However, these methods can be difficult to tune, and can easily result in imbalanced degrees of robustness to individual perturbation models, resulting in a sub-optimal worst-case loss over the union. In this work, we develop a natural generalization of the standard PGD-based procedure to incorporate multiple perturbation models into a single attack, by taking the worst-case over all steepest descent directions. This approach has the advantage of directly converging upon a trade-off between different perturbation models which minimizes the worst-case performance over the union. With this approach, we are able to train standard architectures which are simultaneously robust against $\ell_\infty$, $\ell_2$, and $\ell_1$ attacks, outperforming past approaches on the MNIST and CIFAR10 datasets and achieving adversarial accuracy of 47.0% against the union of ($\ell_\infty$, $\ell_2$, $\ell_1$) perturbations with radius = (0.03, 0.5, 12) on the latter, improving upon previous approaches which achieve 40.6% accuracy.
Submitted 28 July, 2020; v1 submitted 9 September, 2019;
originally announced September 2019.
-
Categorical Co-Frequency Analysis: Clustering Diagnosis Codes to Predict Hospital Readmissions
Authors:
Hallee E. Wong,
Brianna C. Heggeseth,
Steven J. Miller
Abstract:
Accurately predicting patients' risk of 30-day hospital readmission would enable hospitals to efficiently allocate resource-intensive interventions. We develop a new method, Categorical Co-Frequency Analysis (CoFA), for clustering diagnosis codes from the International Classification of Diseases (ICD) according to the similarity in relationships between covariates and readmission risk. CoFA measures the similarity between diagnoses by the frequency with which two diagnoses are split in the same direction versus split apart in random forests to predict readmission risk. Applying CoFA to de-identified data from Berkshire Medical Center, we identified three groups of diagnoses that vary in readmission risk. To evaluate CoFA, we compared readmission risk models using ICD majors and CoFA groups to a baseline model without diagnosis variables. We found substituting ICD majors for the CoFA-identified clusters simplified the model without compromising the accuracy of predictions. Fitting separate models for each ICD major and CoFA group did not improve predictions, suggesting that readmission risk may be more homogeneous than heterogeneous across diagnosis groups.
Submitted 31 August, 2019;
originally announced September 2019.
-
Markerless Augmented Advertising for Sports Videos
Authors:
Hallee E. Wong,
Osman Akar,
Emmanuel Antonio Cuevas,
Iuliana Tabian,
Divyaa Ravichandran,
Iris Fu,
Cambron Carter
Abstract:
Markerless augmented reality can be a challenging computer vision task, especially in live broadcast settings and in the absence of information related to the video capture such as the intrinsic camera parameters. This typically requires the assistance of a skilled artist, along with the use of advanced video editing tools in a post-production environment. We present an automated video augmentation pipeline that identifies textures of interest and overlays an advertisement onto these regions. We constrain the advertisement to be placed in a way that is aesthetic and natural. The aim is to augment the scene such that there is no longer a need for commercial breaks. In order to achieve seamless integration of the advertisement with the original video we build a 3D representation of the scene, place the advertisement in 3D, and then project it back onto the image plane. After successful placement in a single frame, we use homography-based, shape-preserving tracking such that the advertisement appears perspective correct for the duration of a video clip. The tracker is designed to handle smooth camera motion and shot boundaries.
Submitted 22 July, 2019;
originally announced July 2019.
-
Blue-Light-Emitting Color Centers in High-Quality Hexagonal Boron Nitride
Authors:
Brian Shevitski,
S. Matt Gilbert,
Christopher T. Chen,
Christoph Kastl,
Edward S. Barnard,
Ed Wong,
D. Frank Ogletree,
Kenji Watanabe,
Takashi Taniguchi,
Alex Zettl,
Shaul Aloni
Abstract:
Light emitters in wide band gap semiconductors are of great fundamental interest and have potential as optically addressable qubits. Here we describe the discovery of a new color center in high-quality hexagonal boron nitride (h-BN) with a sharp emission line at 435 nm. The emitters are activated and deactivated by electron beam irradiation and have spectral and temporal characteristics consistent with atomic color centers weakly coupled to lattice vibrations. The emitters are conspicuously absent from commercially available h-BN and are only present in ultra-high-quality h-BN grown using a high-pressure, high-temperature Ba-B-N flux/solvent, suggesting that these emitters originate from impurities or related defects specific to this unique synthetic route. Our results imply that the light emission is activated and deactivated by electron beam manipulation of the charge state of an impurity-defect complex.
Submitted 5 September, 2019; v1 submitted 27 April, 2019;
originally announced April 2019.
-
Exact Lower Bounds for Monochromatic Schur Triples and Generalizations
Authors:
Christoph Koutschan,
Elaine Wong
Abstract:
We derive exact and sharp lower bounds for the number of monochromatic generalized Schur triples $(x,y,x+ay)$ whose entries are from the set $\{1,\dots,n\}$, subject to a coloring with two different colors. Previously, only asymptotic formulas for such bounds were known, and only for $a\in\mathbb{N}$. Using symbolic computation techniques, these results are extended here to arbitrary $a\in\mathbb{R}$. Furthermore, we give exact formulas for the minimum number of monochromatic Schur triples for $a=1,2,3,4$, and briefly discuss the case $0<a<1$.
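For small $n$ and integer $a$, the quantity being bounded can be checked by exhaustive search. The sketch below is only an illustration, not the symbolic method of the paper. It assumes one particular counting convention (ordered pairs $(x,y)$ with $x$, $y$, and $x+ay$ all in $\{1,\dots,n\}$), which may differ from the convention the authors use, and the function names are hypothetical.

```python
from itertools import product

def count_monochromatic(coloring, a=1):
    """Count pairs (x, y) such that x, y, and x + a*y share one color.

    coloring[i] is the color (0 or 1) of the integer i + 1.
    Assumes a is a positive integer.
    """
    n = len(coloring)
    count = 0
    for x in range(1, n + 1):
        for y in range(1, n + 1):
            z = x + a * y
            if z <= n and coloring[x - 1] == coloring[y - 1] == coloring[z - 1]:
                count += 1
    return count

def min_monochromatic(n, a=1):
    """Minimum count over all 2^n two-colorings of {1, ..., n} (brute force)."""
    return min(count_monochromatic(c, a) for c in product((0, 1), repeat=n))

# The coloring {1, 4} vs {2, 3} avoids every monochromatic triple for n = 4.
no_triples = count_monochromatic((0, 1, 1, 0))
```

The search is exponential in $n$, so it is useful only as a sanity check against closed-form bounds for very small cases.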
Submitted 12 October, 2020; v1 submitted 3 April, 2019;
originally announced April 2019.
-
Wasserstein Adversarial Examples via Projected Sinkhorn Iterations
Authors:
Eric Wong,
Frank R. Schmidt,
J. Zico Kolter
Abstract:
A rapidly growing area of work has studied the existence of adversarial examples, datapoints which have been perturbed to fool a classifier, but the vast majority of these works have focused primarily on threat models defined by $\ell_p$ norm-bounded perturbations. In this paper, we propose a new threat model for adversarial attacks based on the Wasserstein distance. In the image classification setting, such distances measure the cost of moving pixel mass, which naturally cover "standard" image manipulations such as scaling, rotation, translation, and distortion (and can potentially be applied to other settings as well). To generate Wasserstein adversarial examples, we develop a procedure for projecting onto the Wasserstein ball, based upon a modified version of the Sinkhorn iteration. The resulting algorithm can successfully attack image classification models, bringing traditional CIFAR10 models down to 3% accuracy within a Wasserstein ball with radius 0.1 (i.e., moving 10% of the image mass 1 pixel), and we demonstrate that PGD-based adversarial training can improve this adversarial accuracy to 76%. In total, this work opens up a new direction of study in adversarial robustness, more formally considering convex metrics that accurately capture the invariances that we typically believe should exist in classifiers. Code for all experiments in the paper is available at https://github.com/locuslab/projected_sinkhorn.
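As background for the Sinkhorn iteration mentioned above, here is a minimal NumPy sketch of the standard (unmodified) Sinkhorn iteration for entropy-regularized optimal transport. It is not the projected variant developed in the paper; the cost matrix, the regularization strength `reg`, and the iteration count are illustrative assumptions.

```python
import numpy as np

def sinkhorn(a, b, cost, reg=0.5, n_iter=500):
    """Entropy-regularized transport plan between histograms a and b.

    Alternately rescales rows and columns of the Gibbs kernel exp(-cost / reg)
    until the plan's marginals match a and b.
    """
    K = np.exp(-cost / reg)
    u = np.ones_like(a)
    v = np.ones_like(b)
    for _ in range(n_iter):
        u = a / (K @ v)        # match row marginals
        v = b / (K.T @ u)      # match column marginals
    return u[:, None] * K * v[None, :]

# Toy example (hypothetical values): move mass between two pixels at distance 1.
a = np.array([0.5, 0.5])
b = np.array([0.25, 0.75])
cost = np.array([[0.0, 1.0],
                 [1.0, 0.0]])
plan = sinkhorn(a, b, cost)
transport_cost = float((plan * cost).sum())
```

Here `plan[i, j]` is the amount of mass moved from pixel `i` to pixel `j`, and `transport_cost` approximates the cost of moving pixel mass that the Wasserstein threat model bounds.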
Submitted 18 January, 2020; v1 submitted 21 February, 2019;
originally announced February 2019.
-
Confidence Trigger Detection: Accelerating Real-time Tracking-by-detection Systems
Authors:
Zhicheng Ding,
Zhixin Lai,
Siyang Li,
Panfeng Li,
Qikai Yang,
Edward Wong
Abstract:
Real-time object tracking necessitates a delicate balance between speed and accuracy, a challenge exacerbated by the computational demands of deep learning methods. In this paper, we propose Confidence-Triggered Detection (CTD), an innovative approach that strategically bypasses object detection for frames closely resembling intermediate states, leveraging tracker confidence scores. CTD not only enhances tracking speed but also preserves accuracy, surpassing existing tracking algorithms. Through extensive evaluation across various tracker confidence thresholds, we identify an optimal trade-off between tracking speed and accuracy, providing crucial insights for parameter fine-tuning and enhancing CTD's practicality in real-world scenarios. Our experiments across diverse detection models underscore the robustness and versatility of the CTD framework, demonstrating its potential to enable real-time tracking in resource-constrained environments.
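The control flow suggested by this description can be sketched as follows. This is a hedged illustration based only on the abstract: the `detect` and `track` callables, the frame dictionaries, and the threshold value are hypothetical stand-ins, and the actual CTD trigger rule may differ. The detector runs on the first frame and afterwards only when the tracker's confidence falls below the threshold.

```python
def track_with_confidence_trigger(frames, detect, track, threshold=0.5):
    """Run the detector only when tracker confidence drops below threshold.

    detect(frame) -> box
    track(frame, previous_box) -> (box, confidence)
    Returns the per-frame boxes and the number of detector calls.
    """
    boxes = []
    detector_calls = 0
    box = None
    for frame in frames:
        if box is None:
            box = detect(frame)              # bootstrap on the first frame
            detector_calls += 1
        else:
            box, confidence = track(frame, box)
            if confidence < threshold:       # tracker is unsure: re-detect
                box = detect(frame)
                detector_calls += 1
        boxes.append(box)
    return boxes, detector_calls

# Toy stand-ins (hypothetical): each frame carries its true box and a confidence.
frames = [
    {"box": (0, 0), "conf": 1.0},
    {"box": (1, 0), "conf": 0.9},
    {"box": (5, 5), "conf": 0.2},
    {"box": (6, 5), "conf": 0.8},
]
boxes, detector_calls = track_with_confidence_trigger(
    frames,
    detect=lambda f: f["box"],
    track=lambda f, prev: (prev, f["conf"]),
)
```

Raising the threshold triggers the detector more often, which trades speed for accuracy; this is the trade-off the abstract describes evaluating across thresholds.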
Submitted 24 May, 2024; v1 submitted 1 February, 2019;
originally announced February 2019.
-
Impact bombardment on the regular satellites of Jupiter and Uranus during an episode of giant planet migration
Authors:
E. W. Wong,
R. Brasser,
S. C. Werner
Abstract:
The intensity and effects of early impact bombardment on the major satellites of the giant planets during an episode of giant planet migration are still poorly known. We use a combination of dynamical N-body and Monte Carlo simulations to determine impact probabilities, impact velocities, and expected masses that collide with these satellites to determine the chronology of impacts during the migration. Volatile loss through bombardment is typically 20% for Miranda, a few percent for the larger Uranian satellites and negligible for the Galilean satellites. Due to its small size and the high impact velocity there is a >99% chance that Miranda suffered a catastrophic impact that shattered the satellite. Subsequent re-accretion from a circum-Uranian ring could account for its peculiar surface morphology and low density. The probability of destroying Ariel and Umbriel is 15%, and 1% for Titania and Oberon. Approximately 90% of the mass in planetesimals that passes through the Jovian and Uranian satellite systems (about $4 {\rm \ M_{\oplus}}$ and $2 {\rm \ M_{\oplus}}$ respectively) does so in about 15 Myr. This extremely rapid and intense bombardment causes repeated local crustal melting on all satellites. The combination of these effects results in an entirely different impact chronology than that of the inner solar system. We conclude that the simple extrapolation of the lunar chronology to the outer solar system satellites is not correct. The tail end (after 25 Myr) of the chronology function has an e-folding time of 100 Myr at Jupiter, but follows a cumulative Weibull distribution at Uranus, making direct comparisons between the gas and ice giant planets difficult. Based on our results the surfaces of the Uranian satellites, Callisto, and possibly Ganymede, are all about the same age, and are roughly 150 Myr younger than the timing of the dynamical instability.
Submitted 12 November, 2018;
originally announced November 2018.
-
Cathodoluminescence-based nanoscopic thermometry in a lanthanide-doped phosphor
Authors:
Clarice D. Aiello,
Andrea D. Pickel,
Edward Barnard,
Rebecca B. Wai,
Christian Monachon,
Edward Wong,
Shaul Aloni,
D. Frank Ogletree,
Chris Dames,
Naomi Ginsberg
Abstract:
Crucial to analyze phenomena as varied as plasmonic hot spots and the spread of cancer in living tissue, nanoscale thermometry is challenging: probes are usually larger than the sample under study, and contact techniques may alter the sample temperature itself. Many photostable nanomaterials whose luminescence is temperature-dependent, such as lanthanide-doped phosphors, have been shown to be good non-contact thermometric sensors when optically excited. Using such nanomaterials, in this work we accomplished the key milestone of enabling far-field thermometry with a spatial resolution that is not diffraction-limited at readout.
We explore thermal effects on the cathodoluminescence of lanthanide-doped NaYF$_4$ nanoparticles. Whereas cathodoluminescence from such lanthanide-doped nanomaterials has been previously observed, here we use quantitative features of such emission for the first time towards an application beyond localization. We demonstrate a thermometry scheme that is based on cathodoluminescence lifetime changes as a function of temperature that achieves $\sim$ 30 mK sensitivity in sub-$μ$m nanoparticle patches. The scheme is robust against spurious effects related to electron beam radiation damage and optical alignment fluctuations.
We foresee the potential of single nanoparticles, of sheets of nanoparticles, and also of thin films of lanthanide-doped NaYF$_4$ to yield temperature information via cathodoluminescence changes when in the vicinity of a sample of interest; the phosphor may even protect the sample from direct contact to damaging electron beam radiation. Cathodoluminescence-based thermometry is thus a valuable novel tool towards temperature monitoring at the nanoscale, with broad applications including heat dissipation in miniaturized electronics and biological diagnostics.
Submitted 11 October, 2018;
originally announced October 2018.
-
Self Configuration in Machine Learning
Authors:
Eugene Wong
Abstract:
In this paper we first present a class of algorithms for training multi-level neural networks with a quadratic cost function one layer at a time starting from the input layer. The algorithm is based on the fact that for any layer to be trained, the effect of a direct connection to an optimized linear output layer can be computed without the connection being made. Thus, starting from the input layer, we can train each layer in succession in isolation from the other layers. Once trained, the weights are kept fixed and the outputs of the trained layer then serve as the inputs to the next layer to be trained. The result is a very fast algorithm. The simplicity of this training arrangement allows the activation function and step size in weight adjustment to be adaptive and self-adjusting. Furthermore, the stability of the training process allows relatively large steps to be taken, thereby achieving even greater speed. Finally, in our context configuring the network means determining the number of outputs for each layer. By decomposing the overall cost function into separate components related to approximation and estimation, we obtain an optimization formula for determining the number of outputs for each layer. With the ability to self-configure and set parameters, we now have more than a fast training algorithm: we have the ability to automatically build a fully trained deep neural network starting with nothing more than data.
Submitted 17 September, 2018;
originally announced September 2018.
-
In Vitro Vascularized Tumor Platform for Modeling Tumor-Vasculature Interactions of Inflammatory Breast Cancer
Authors:
Manasa Gadde,
Caleb Phillips,
Neda Ghousifam,
Anna G. Sorace,
Enoch Wong,
Savitri Krishnamurthy,
Anum Syed,
Omar Rahal,
Thomas E. Yankeelov,
Wendy A. Woodward,
Marissa Nichole Rylander
Abstract:
Inflammatory breast cancer (IBC), a rare form of breast cancer associated with increased angiogenesis and metastasis, is largely driven by tumor-stromal interactions with the vasculature and the extracellular matrix (ECM). However, there is currently a lack of understanding of the role these interactions play in initiation and progression of the disease. In this study, we developed the first three-dimensional, in vitro, vascularized, breast tumor platform to quantify the spatial and temporal dynamics of tumor-vasculature and tumor-ECM interactions specific to IBC. Platforms consisting of collagen type 1 ECM with an endothelialized blood vessel were cultured with IBC cells, MDA-IBC3 (HER2+) or SUM149 (triple negative), and, for comparison, with non-IBC cells, MDA-MB-231 (triple negative). An acellular collagen platform with an endothelial blood vessel served as control. SUM149 and MDA-MB-231 platforms exhibited a significantly (p<0.05) higher vessel permeability and decreased endothelial coverage of the vessel lumen compared to the control. Both IBC platforms, MDA-IBC3 and SUM149, expressed higher levels of VEGF (p<0.05) and increased collagen ECM porosity compared to non-IBC MDA-MB-231 (p<0.05) and control (p<0.01) platforms. Additionally, unique to the MDA-IBC3 platform, we observed progressive sprouting of the endothelium over time resulting in viable vessels with lumen. The newly sprouted vessels encircled clusters of MDA-IBC3 cells, replicating a feature of in vivo IBC. The IBC in vitro vascularized platforms introduced in this study model well-described in vivo and clinical IBC phenotypes and provide an adaptable, high-throughput tool for systematically and quantitatively investigating tumor-stromal mechanisms and dynamics of tumor progression.
Submitted 10 January, 2020; v1 submitted 17 September, 2018;
originally announced September 2018.
-
An Integration and Assessment of Covariates of Nonstationary Storm Surge Statistical Behavior by Bayesian Model Averaging
Authors:
Tony E. Wong
Abstract:
Projections of storm surge return levels are a basic requirement for effective management of coastal risks. A common approach to estimate hazards posed by extreme sea levels is to use a statistical model, which may use a time series of a climate variable as a covariate to modulate the statistical model and account for potentially nonstationary storm surge behavior. Previous work using nonstationary statistical approaches, however, has demonstrated the importance of accounting for the many inherent modeling uncertainties. Additionally, previous assessments of coastal flood hazard using statistical modeling have typically relied on a single climate covariate, which likely leaves out important processes and leads to potential biases. Here, I employ a recently developed approach to integrate stationary and nonstationary statistical models, and examine the effects of choice of covariate time series on projected flood hazard. Furthermore, I expand upon this approach by developing a nonstationary storm surge statistical model that makes use of multiple covariate time series: global mean temperature, sea level, North Atlantic Oscillation index and time. I show that a storm surge model that accounts for additional processes raises the projected 100-year storm surge return level by up to 23 centimeters relative to a stationary model or one that employs a single covariate time series. I find that the total marginal model likelihood associated with each set of nonstationary models given by the candidate covariates, as well as a stationary model, is about 20%. These results shed light on how best to account for potential nonstationary coastal surge behavior, and incorporate more processes into surge projections. By including a wider range of physical process information and considering nonstationary behavior, these methods will better enable modeling efforts to inform coastal risk management.
Submitted 25 August, 2018; v1 submitted 20 August, 2018;
originally announced August 2018.
-
Scaling provable adversarial defenses
Authors:
Eric Wong,
Frank R. Schmidt,
Jan Hendrik Metzen,
J. Zico Kolter
Abstract:
Recent work has developed methods for learning deep network classifiers that are provably robust to norm-bounded adversarial perturbation; however, these methods are currently only possible for relatively small feedforward networks. In this paper, in an effort to scale these approaches to substantially larger models, we extend previous work in three main directions. First, we present a technique for extending these training procedures to much more general networks, with skip connections (such as ResNets) and general nonlinearities; the approach is fully modular, and can be implemented automatically (analogous to automatic differentiation). Second, in the specific case of $\ell_\infty$ adversarial perturbations and networks with ReLU nonlinearities, we adopt a nonlinear random projection for training, which scales linearly in the number of hidden units (previous approaches scaled quadratically). Third, we show how to further improve robust error through cascade models. On both MNIST and CIFAR data sets, we train classifiers that improve substantially on the state of the art in provable robust adversarial error bounds: from 5.8% to 3.1% on MNIST (with $\ell_\infty$ perturbations of $ε=0.1$), and from 80% to 36.4% on CIFAR (with $\ell_\infty$ perturbations of $ε=2/255$). Code for all experiments in the paper is available at https://github.com/locuslab/convex_adversarial/.
Submitted 21 November, 2018; v1 submitted 31 May, 2018;
originally announced May 2018.
-
Spatial Image Steganography Based on Generative Adversarial Network
Authors:
Jianhua Yang,
Kai Liu,
Xiangui Kang,
Edward K. Wong,
Yun-Qing Shi
Abstract:
With the recent development of deep learning on steganalysis, embedding secret information into digital images faces great challenges. In this paper, a secure steganography algorithm using adversarial training is proposed. The architecture contains three component modules: a generator, an embedding simulator and a discriminator. A generator based on U-NET to translate a cover image into an embedding change probability is proposed. To fit the optimal embedding simulator and propagate the gradient, a function called Tanh-simulator is proposed. As for the discriminator, the selection-channel awareness (SCA) is incorporated to resist the SCA-based steganalytic methods. Experimental results have shown that the proposed framework can increase the security performance dramatically over the recently reported method ASDL-GAN, while the training time is only 30% of that used by ASDL-GAN. Furthermore, it also performs better than the hand-crafted steganographic algorithm S-UNIWARD.
Submitted 21 April, 2018;
originally announced April 2018.
-
JPEG Steganalysis Based on DenseNet
Authors:
Jianhua Yang,
Yun-Qing Shi,
Edward K. Wong,
Xiangui Kang
Abstract:
Unlike conventional deep learning work in computer vision, which is based on an image's content, deep steganalysis aims to detect secret information embedded in an image via deep learning. It poses the challenge of detecting weak, invisible information hidden in a host image, and thus of learning at a very low signal-to-noise ratio (SNR). In this paper, we propose a 32-layer convolutional neural network (CNN) that improves the efficiency of preprocessing and reuses features by concatenating all features from the previous layers with the same feature-map size, thus improving the flow of information and gradients. The shared features and bottleneck layers further improve feature propagation and reduce the number of CNN model parameters dramatically. Experimental results on the BOSSbase, BOWS2 and ImageNet datasets show that the proposed CNN architecture can improve performance and enhance robustness. To further boost detection accuracy, we propose an ensemble architecture called CNN-SCA-GFR, which is also the first work to combine a CNN architecture with a conventional method in the JPEG domain. Experiments show that it can further lower detection errors. Compared with the state-of-the-art method XuNet [1] on BOSSbase, the proposed CNN-SCA-GFR architecture reduces the detection error rate by 5.67% for 0.1 bpnzAC and by 4.41% for 0.4 bpnzAC, while the number of training parameters in the CNN is only 17% of that used by XuNet. It also decreases the detection errors of the conventional method SCA-GFR by 7.89% for 0.1 bpnzAC and 8.06% for 0.4 bpnzAC, respectively.
Submitted 17 April, 2018; v1 submitted 26 November, 2017;
originally announced November 2017.
-
Provable defenses against adversarial examples via the convex outer adversarial polytope
Authors:
Eric Wong,
J. Zico Kolter
Abstract:
We propose a method to learn deep ReLU-based classifiers that are provably robust against norm-bounded adversarial perturbations on the training data. For previously unseen examples, the approach is guaranteed to detect all adversarial examples, though it may flag some non-adversarial examples as well. The basic idea is to consider a convex outer approximation of the set of activations reachable through a norm-bounded perturbation, and we develop a robust optimization procedure that minimizes the worst case loss over this outer region (via a linear program). Crucially, we show that the dual problem to this linear program can be represented itself as a deep network similar to the backpropagation network, leading to very efficient optimization approaches that produce guaranteed bounds on the robust loss. The end result is that by executing a few more forward and backward passes through a slightly modified version of the original network (though possibly with much larger batch sizes), we can learn a classifier that is provably robust to any norm-bounded adversarial attack. We illustrate the approach on a number of tasks to train classifiers with robust adversarial guarantees (e.g. for MNIST, we produce a convolutional classifier that provably has less than 5.8% test error for any adversarial attack with bounded $\ell_\infty$ norm less than $ε= 0.1$), and code for all experiments in the paper is available at https://github.com/locuslab/convex_adversarial.
Submitted 8 June, 2018; v1 submitted 2 November, 2017;
originally announced November 2017.
-
Neglecting Model Structural Uncertainty Underestimates Upper Tails of Flood Hazard
Authors:
Tony E. Wong,
Alexandra Klufas,
Vivek Srikrishnan,
Klaus Keller
Abstract:
Coastal flooding drives considerable risks to many communities, but projections of future flood risks are deeply uncertain. The paucity of observations of extreme events often motivates the use of statistical approaches to model the distribution of extreme storm surge events. A key deep uncertainty that is often overlooked is model structural uncertainty. There is currently no strong consensus among experts regarding which class of statistical model to use as a best practice. Robust management of coastal flooding risks requires coastal managers to consider the distinct possibility of non-stationarity in storm surges. This increases the complexity of the potential models to use, which tends to increase the data required to constrain the model. Here, we use a Bayesian model averaging approach to analyze the balance between model complexity sufficient to capture decision-relevant risks and data availability to constrain complex model structures. We characterize deep model structural uncertainty through a set of calibration experiments. Specifically, we calibrate a set of models ranging in complexity using long-term tide gauge observations from the Netherlands and the United States. We find that in both cases, roughly half the model weight is associated with non-stationary models. Our approach provides a formal framework to integrate information across model structures, in light of the potentially sizable modeling uncertainties. By combining information from multiple models, our inference sharpens for the projected storm surge 100-year return levels, and estimated return levels increase by several centimeters. We assess the impacts of data availability through a set of experiments with temporal subsets and model comparison metrics. Our analysis suggests about 70 years of data are required to stabilize estimates of the 100-year return level, for the locations and methods considered here.
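The core arithmetic of Bayesian model averaging can be sketched in a few lines. This is a simplified illustration rather than the authors' calibration code: the log marginal likelihoods and per-model return levels below are made-up placeholder numbers, equal prior model weights are assumed by default, and a full analysis would average entire posterior distributions rather than point estimates.

```python
import numpy as np

def bma_weights(log_marginal_likelihoods, log_priors=None):
    """Posterior model weights from log marginal likelihoods.

    Weights are proportional to prior * marginal likelihood; the maximum is
    subtracted before exponentiating for numerical stability.
    """
    logml = np.asarray(log_marginal_likelihoods, dtype=float)
    if log_priors is None:
        log_priors = np.zeros_like(logml)    # equal prior weight per model
    logp = logml + np.asarray(log_priors, dtype=float)
    logp = logp - logp.max()
    w = np.exp(logp)
    return w / w.sum()

# Placeholder inputs (hypothetical): three candidate model structures.
log_ml = [-100.0, -100.0, -101.0]
return_levels_m = np.array([3.0, 3.2, 3.5])   # per-model 100-year return levels

weights = bma_weights(log_ml)
averaged_level = float(weights @ return_levels_m)
```

Under this scheme, a statement such as "roughly half the model weight is associated with non-stationary models" corresponds to summing the entries of `weights` over the non-stationary candidates.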
Submitted 3 June, 2018; v1 submitted 25 September, 2017;
originally announced September 2017.