-
Gravitational contributions to the electron $g$-factor
Authors:
Andrew G. Cohen,
David B. Kaplan
Abstract:
In a previous paper, the authors with Ann Nelson proposed that the UV and IR applicability of effective quantum field theories should be constrained by requiring that strong gravitational effects are nowhere encountered in a theory's domain of validity [Phys. Rev. Lett. 82, 4971 (1999)]. The constraint was proposed to delineate the boundary beyond which conventional quantum field theory, viewed as an effective theory excluding quantum gravitational effects, might be expected to break down. In this Letter we revisit this idea and show that quantum gravitational effects could lead to a deviation of size $(α/2π)\sqrt{m_e/M_p}$ from the Standard Model calculation for the electron magnetic moment. This is the same size as QED and hadronic uncertainties in the theory of $a_e$, and a little more than one order of magnitude smaller than both the dominant uncertainty in its Standard Model value arising from the accuracy with which $α$ is measured, as well as the experimental uncertainty in measurement of $a_e$.
Submitted 7 March, 2021;
originally announced March 2021.
-
Stream Distributed Coded Computing
Authors:
Alejandro Cohen,
Guillaume Thiran,
Homa Esfahanizadeh,
Muriel Médard
Abstract:
Emerging large-scale, data-hungry algorithms require computations to be delegated from a central server to several worker nodes. One major challenge in distributed computation is tackling the delays and failures caused by stragglers. To address this challenge, introducing an efficient amount of redundant computation via distributed coded computation has received significant attention. Recent approaches in this area have mainly focused on introducing minimum computational redundancy to tolerate a certain number of stragglers. To the best of our knowledge, the current literature lacks a unified end-to-end design in a heterogeneous setting where the workers can vary in their computation and communication capabilities. The contribution of this paper is to devise a novel framework for joint scheduling-coding, in a setting where the workers and the arrival of stream computational jobs are based on stochastic models. In our initial joint scheme, we propose a systematic framework that illustrates how to select a set of workers and how to split the computational load among the selected workers based on their differences in order to minimize the average in-order job execution delay. Through simulations, we demonstrate that the performance of our framework is dramatically better than that of the naive method that splits the computational load uniformly among the workers, and that it is close to the ideal performance.
Submitted 2 March, 2021;
originally announced March 2021.
-
A Scaling Limit for Utility Indifference Prices in the Discretized Bachelier Model
Authors:
Asaf Cohen,
Yan Dolinsky
Abstract:
We consider the discretized Bachelier model where hedging is done on an equidistant set of times. Exponential utility indifference prices are studied for path-dependent European options and we compute their non-trivial scaling limit for a large number of trading times $n$ and when risk aversion is scaled like $n\ell$ for some constant $\ell>0$. Our analysis is purely probabilistic. We first use a duality argument to transform the problem into an optimal drift control problem with a penalty term. We further use martingale techniques and strong invariance principles and get that the limiting problem takes the form of a volatility control problem.
Submitted 1 March, 2022; v1 submitted 23 February, 2021;
originally announced February 2021.
-
(Almost Full) EFX Exists for Four Agents (and Beyond)
Authors:
Ben Berger,
Avi Cohen,
Michal Feldman,
Amos Fiat
Abstract:
The existence of EFX allocations is a major open problem in fair division, even for additive valuations. The current state of the art is that no setting where EFX allocations are impossible is known, and EFX is known to exist for ($i$) agents with identical valuations, ($ii$) 2 agents, ($iii$) 3 agents with additive valuations, ($iv$) agents with one of two additive valuations and ($v$) agents with two-valued instances. It is also known that EFX exists if one can leave $n-1$ items unallocated, where $n$ is the number of agents.
We develop new techniques that allow us to push the boundaries of the enigmatic EFX problem beyond these known results, and, arguably, to simplify proofs of earlier results. Our main results are ($i$) every setting with 4 additive agents admits an EFX allocation that leaves at most a single item unallocated, ($ii$) every setting with $n$ additive valuations has an EFX allocation with at most $n-2$ unallocated items.
Moreover, all of our results extend beyond additive valuations to all nice cancelable valuations (a new class, including additive, unit-demand, budget-additive and multiplicative valuations, among others). Furthermore, using our new techniques, we show that previous results for additive valuations extend to nice cancelable valuations.
Submitted 28 February, 2021; v1 submitted 21 February, 2021;
originally announced February 2021.
-
Partition and Analytic Rank are Equivalent over Large Fields
Authors:
Alex Cohen,
Guy Moshkovitz
Abstract:
We prove that the partition rank and the analytic rank of tensors are equal up to a constant, over finite fields of any characteristic and any large enough cardinality depending on the analytic rank. Moreover, we show that a plausible improvement of our field cardinality requirement would imply that the ranks are equal up to 1+o(1) in the exponent over every finite field. At the core of the proof is a technique for lifting decompositions of multilinear polynomials in an open subset of an algebraic variety, and a technique for finding a large subvariety that retains all rational points such that at least one of these points satisfies a finite-field analogue of genericity with respect to it. Proving the equivalence between these two ranks, ideally over fixed finite fields, is a central question in additive combinatorics, and was reiterated by multiple authors. As a corollary we prove, allowing the field to depend on the value of the norm, the Polynomial Gowers Inverse Conjecture in the d vs. d-1 case.
Submitted 27 November, 2023; v1 submitted 20 February, 2021;
originally announced February 2021.
-
Structure vs. Randomness for Bilinear Maps
Authors:
Alex Cohen,
Guy Moshkovitz
Abstract:
We prove that the slice rank of a 3-tensor (a combinatorial notion introduced by Tao in the context of the cap-set problem), the analytic rank (a Fourier-theoretic notion introduced by Gowers and Wolf), and the geometric rank (an algebro-geometric notion introduced by Kopparty, Moshkovitz, and Zuiddam) are all equal up to an absolute constant. As a corollary, we obtain strong trade-offs on the arithmetic complexity of a biased bilinear map, and on the separation between computing a bilinear map exactly and on average. Our result settles open questions of Haramaty and Shpilka [STOC 2010], and of Lovett [Discrete Anal. 2019] for 3-tensors.
Submitted 3 October, 2022; v1 submitted 9 February, 2021;
originally announced February 2021.
-
Online Markov Decision Processes with Aggregate Bandit Feedback
Authors:
Alon Cohen,
Haim Kaplan,
Tomer Koren,
Yishay Mansour
Abstract:
We study a novel variant of online finite-horizon Markov Decision Processes with adversarially changing loss functions and initially unknown dynamics. In each episode, the learner suffers the loss accumulated along the trajectory realized by the policy chosen for the episode, and observes aggregate bandit feedback: the trajectory is revealed along with the cumulative loss suffered, rather than the individual losses encountered along the trajectory. Our main result is a computationally efficient algorithm with $O(\sqrt{K})$ regret for this setting, where $K$ is the number of episodes.
We establish this result via an efficient reduction to a novel bandit learning setting we call Distorted Linear Bandits (DLB), which is a variant of bandit linear optimization where actions chosen by the learner are adversarially distorted before they are committed. We then develop a computationally-efficient online algorithm for DLB for which we prove an $O(\sqrt{T})$ regret bound, where $T$ is the number of time steps. Our algorithm is based on online mirror descent with a self-concordant barrier regularization that employs a novel increasing learning rate schedule.
Submitted 31 January, 2021;
originally announced February 2021.
-
Multi-Group Discontinuous Asymptotic $P_1$ Approximation in Radiative Marshak Waves Experiments
Authors:
Avner P. Cohen,
Shay I. Heizler
Abstract:
We study the propagation of radiative heat (Marshak) waves, using modified $P_1$-approximation equations. In relatively optically-thin media the heat propagation is supersonic, i.e., hydrodynamic motion is negligible, and thus can be described by the radiative transfer Boltzmann equation, coupled with the material energy equation. However, the exact thermal radiative transfer problem is still difficult to solve and requires massive simulation capabilities. Hence, there still exists a need for adequate approximations that are comparatively easy to carry out. Classic approximations, such as the classic diffusion and classic $P_1$, fail to describe the correct heat wave velocity when the optical depth is not sufficiently high. Therefore, we use the recently developed discontinuous asymptotic $P_1$ approximation, which is a time-dependent analogy for the adjustment of the discontinuous asymptotic diffusion for two different zones. This approximation was tested via several benchmarks, showing better results than other common approximations, and has also demonstrated a good agreement with a main Marshak wave experiment and its Monte-Carlo gray simulation. Here we derive energy expansion of the discontinuous asymptotic $P_1$ approximation in slab geometry, and test it with numerous experimental results for propagating Marshak waves inside low density foams. The new approximation describes the heat wave propagation with good agreement. Furthermore, a comparison of the simulations to exact implicit Monte-Carlo slab-geometry multi-group simulations, in this wide range of experimental conditions, demonstrates the superiority of this approximation to others.
Submitted 10 February, 2021; v1 submitted 27 January, 2021;
originally announced January 2021.
-
Uniqueness of excited states to $-Δu+u-u^3=0$ in three dimensions
Authors:
Alex Cohen,
Zhenhao Li,
Wilhelm Schlag
Abstract:
We prove the uniqueness of several excited states to the ODE $\ddot y(t) + \frac{2}{t} \dot y(t) + f(y(t)) = 0$, $y(0) = b$, and $\dot y(0) = 0$ for the model nonlinearity $f(y) = y^3 - y$. The $n$-th excited state is a solution with exactly $n$ zeros and which tends to $0$ as $t \to \infty$. These represent all smooth radial nonzero solutions to the PDE $Δu + f(u)= 0$ in $H^1$. We interpret the ODE as a damped oscillator governed by a double-well potential, and the result is proved via rigorous numerical analysis of the energy and variation of the solutions. More specifically, the problem of uniqueness can be formulated entirely in terms of inequalities on the solutions and their variation, and these inequalities can be verified numerically.
Submitted 20 January, 2021;
originally announced January 2021.
-
Secure Optimization Through Opaque Observations
Authors:
Son Tuan Vu,
Albert Cohen,
Karine Heydemann,
Arnaud de Grandmaison,
Christophe Guillon
Abstract:
Secure applications implement software protections against side-channel and physical attacks. Such protections are meaningful at the machine code or micro-architectural level, but they typically do not carry observable semantics at source level. To prevent optimizing compilers from altering the protection, security engineers embed input/output side-effects into the protection. These side-effects are error-prone and compiler-dependent, and the current practice involves analyzing the generated machine code to make sure security or privacy properties are still enforced. Vu et al. recently demonstrated how to automate the insertion of volatile side-effects in a compiler [52], but these may be too expensive in fine-grained protections such as control-flow integrity. We introduce observations of the program state that are intrinsic to the correct execution of security protections, along with means to specify and preserve observations across the compilation flow. Such observations complement the traditional input/output-preservation contract of compilers. We show how to guarantee their preservation without modifying compilation passes and with as little performance impact as possible. We validate our approach on a range of benchmarks, expressing the secure compilation of these applications in terms of observations to be made at specific program points.
Submitted 15 January, 2021;
originally announced January 2021.
-
In Situ Geochronology for the Next Decade: Mission Designs for the Moon, Mars, and Vesta
Authors:
Barbara A. Cohen,
Kelsey E. Young,
Nicolle E. B. Zellner,
Kris Zacny,
R. Aileen Yingst,
Ryan N. Watkins,
Richard Warwick,
Sarah N. Valencia,
Timothy D. Swindle,
Stuart J. Robbins,
Noah E. Petro,
Anthony Nicoletti,
Daniel P. Moriarty, III,
Richard Lynch,
Stephen J. Indyk,
Juliane Gross,
Jennifer A. Grier,
John A. Grant,
Amani Ginyard,
Caleb I. Fassett,
Kenneth A. Farley,
Benjamin J. Farcy,
Bethany L. Ehlmann,
M. Darby Dyar,
Gerard Daelemans
, et al. (4 additional authors not shown)
Abstract:
Geochronology, or determination of absolute ages for geologic events, underpins many inquiries into the formation and evolution of planets and our Solar System. Absolute ages of ancient and recent magmatic products provide strong constraints on the dynamics of magma oceans and crustal formation, as well as the longevity and evolution of interior heat engines and distinct mantle/crustal source regions. Absolute dating also relates habitability markers to the timescale of evolution of life on Earth. However, the number of geochronologically-significant terrains across the inner Solar System far exceeds our ability to conduct sample return from all of them. In preparation for the upcoming Decadal Survey, our team formulated a set of medium-class (New Frontiers) mission concepts to three different locations (the Moon, Mars, and Vesta) where sites that record Solar System bombardment, magmatism, and/or habitability are uniquely preserved and accessible. We developed a notional payload to directly date planetary surfaces, consisting of two instruments capable of measuring radiometric ages in situ, an imaging spectrometer, optical cameras to provide site geologic context and sample characterization, a trace element analyzer to augment sample contextualization, and a sample acquisition and handling system. Landers carrying this payload to the Moon, Mars, and Vesta would likely fit into the New Frontiers cost cap in our study (~$1B). A mission of this type would provide crucial constraints on planetary history while also enabling a broad suite of investigations such as basic geologic characterization, geomorphologic analysis, ground truth for remote sensing analyses, analyses of major, minor, trace, and volatile elements, atmospheric and other long-lived monitoring, organic molecule analyses, and soil and geotechnical properties.
Submitted 4 January, 2021;
originally announced January 2021.
-
Finite state mean field games with Wright Fisher common noise as limits of $N$-player weighted games
Authors:
Erhan Bayraktar,
Alekos Cecchin,
Asaf Cohen,
François Delarue
Abstract:
Forcing finite state mean field games by a relevant form of common noise is a subtle issue, which has been addressed only recently. Among others, one possible way is to subject the simplex valued dynamics of an equilibrium by a so-called Wright-Fisher noise, very much in the spirit of stochastic models in population genetics. A key feature is that such a random forcing preserves the structure of the simplex, which is nothing but, in this setting, the probability space over the state space of the game. The purpose of this article is hence to elucidate the finite player version and, accordingly, to prove that $N$-player equilibria indeed converge towards the solution of such a kind of Wright-Fisher mean field game. Whilst part of the analysis is made easier by the fact that the corresponding master equation has already been proved to be uniquely solvable under the presence of the common noise, it becomes however more subtle than in the standard setting because the mean field interaction between the players now occurs through a weighted empirical measure. In other words, each player carries its own weight, which hence may differ from $1/N$ and which, most of all, evolves with the common noise.
Submitted 1 November, 2021; v1 submitted 8 December, 2020;
originally announced December 2020.
-
Coherent Spin Precession and Lifetime-Limited Spin Dephasing in CsPbBr3 Perovskite Nanocrystals
Authors:
Matthew J. Crane,
Laura M. Jacoby,
Theodore A. Cohen,
Yun** Huang,
Christine K. Luscombe,
Daniel R. Gamelin
Abstract:
Carrier spins in semiconductor nanocrystals are promising candidates for quantum information processing. Using a combination of time-resolved Faraday rotation and photoluminescence spectroscopies, we demonstrate optical spin polarization and coherent spin precession in colloidal CsPbBr3 nanocrystals that persists up to room temperature. By suppressing the influence of inhomogeneous hyperfine fields with a small applied magnetic field, we demonstrate inhomogeneous hole transverse spin-dephasing times (T2*) that approach the nanocrystal photoluminescence lifetime, such that nearly all emitted photons derive from coherent hole spins. Thermally activated LO phonons drive additional spin dephasing at elevated temperatures, but coherent spin precession is still observed at room temperature. These data reveal several major distinctions between spins in nanocrystalline and bulk CsPbBr3 and open the door for using metal-halide perovskite nanocrystals in spin-based quantum technologies.
Submitted 19 November, 2020;
originally announced November 2020.
-
A Study on MIMO Channel Estimation by 2D and 3D Convolutional Neural Networks
Authors:
Ben Marinberg,
Ariel Cohen,
Eilam Ben-Dror,
Haim Permuter
Abstract:
In this paper, we study the usage of Convolutional Neural Network (CNN) estimators for the task of Multiple-Input-Multiple-Output Orthogonal Frequency Division Multiplexing (MIMO-OFDM) Channel Estimation (CE). Specifically, the CNN estimators interpolate the channel values of reference signals for estimating the channel of the full OFDM resource element (RE) matrix. We have designed a 2D CNN architecture based on U-net, and a 3D CNN architecture for handling spatial correlation. We investigate the performance of various CNN architectures for a diverse data set generated according to the 5G NR standard and, in particular, we investigate the influence of spatial correlation, Doppler, and reference signal resource allocation. The CE CNN estimators are then integrated with MIMO detection algorithms for testing their influence on the system-level Bit Error Rate (BER) performance.
Submitted 12 November, 2020;
originally announced November 2020.
-
Deep-LIBRA: Artificial intelligence method for robust quantification of breast density with independent validation in breast cancer risk assessment
Authors:
Omid Haji Maghsoudi,
Aimilia Gastounioti,
Christopher Scott,
Lauren Pantalone,
Fang-Fang Wu,
Eric A. Cohen,
Stacey Winham,
Emily F. Conant,
Celine Vachon,
Despina Kontos
Abstract:
Breast density is an important risk factor for breast cancer that also affects the specificity and sensitivity of screening mammography. Current federal legislation mandates reporting of breast density for all women undergoing breast screening. Clinically, breast density is assessed visually using the American College of Radiology Breast Imaging Reporting And Data System (BI-RADS) scale. Here, we introduce an artificial intelligence (AI) method to estimate breast percentage density (PD) from digital mammograms. Our method leverages deep learning (DL) using two convolutional neural network architectures to accurately segment the breast area. A machine-learning algorithm combining superpixel generation, texture feature analysis, and support vector machine is then applied to differentiate dense from non-dense tissue regions, from which PD is estimated. Our method has been trained and validated on a multi-ethnic, multi-institutional dataset of 15,661 images (4,437 women), and then tested on an independent dataset of 6,368 digital mammograms (1,702 women; cases=414) for both PD estimation and discrimination of breast cancer. On the independent dataset, PD estimates from Deep-LIBRA and an expert reader were strongly correlated (Spearman correlation coefficient = 0.90). Moreover, Deep-LIBRA yielded a higher breast cancer discrimination performance (area under the ROC curve, AUC = 0.611 [95% confidence interval (CI): 0.583, 0.639]) compared to four other widely-used research and commercial PD assessment methods (AUCs = 0.528 to 0.588). Our results suggest a strong agreement of PD estimates between Deep-LIBRA and gold-standard assessment by an expert reader, as well as improved performance in breast cancer risk assessment over state-of-the-art open-source and commercial methods.
Submitted 18 October, 2021; v1 submitted 13 November, 2020;
originally announced November 2020.
-
The State of AI Ethics Report (October 2020)
Authors:
Abhishek Gupta,
Alexandrine Royer,
Victoria Heath,
Connor Wright,
Camylle Lanteigne,
Allison Cohen,
Marianna Bergamaschi Ganapini,
Muriam Fancy,
Erick Galinkin,
Ryan Khurana,
Mo Akif,
Renjie Butalid,
Falaah Arif Khan,
Masa Sweidan,
Audrey Balogh
Abstract:
The 2nd edition of the Montreal AI Ethics Institute's The State of AI Ethics captures the most relevant developments in the field of AI Ethics since July 2020. This report aims to help anyone, from machine learning experts to human rights activists and policymakers, quickly digest and understand the ever-changing developments in the field. Through research and article summaries, as well as expert commentary, this report distills the research and reporting surrounding various domains related to the ethics of AI, including: AI and society, bias and algorithmic justice, disinformation, humans and AI, labor impacts, privacy, risk, and future of AI ethics.
In addition, The State of AI Ethics includes exclusive content written by world-class AI Ethics experts from universities, research institutes, consulting firms, and governments. These experts include: Danit Gal (Tech Advisor, United Nations), Amba Kak (Director of Global Policy and Programs, NYU's AI Now Institute), Rumman Chowdhury (Global Lead for Responsible AI, Accenture), Brent Barron (Director of Strategic Projects and Knowledge Management, CIFAR), Adam Murray (U.S. Diplomat working on tech policy, Chair of the OECD Network on AI), Thomas Kochan (Professor, MIT Sloan School of Management), and Katya Klinova (AI and Economy Program Lead, Partnership on AI).
This report should be used not only as a point of reference and insight on the latest thinking in the field of AI Ethics, but should also be used as a tool for introspection as we aim to foster a more nuanced conversation regarding the impacts of AI on the world.
Submitted 5 November, 2020;
originally announced November 2020.
-
Inner ideals in Lie algebras and spherical buildings
Authors:
Arjeh M. Cohen
Abstract:
The correspondence found by Faulkner between inner ideals of the Lie algebra of a simple algebraic group and shadows on long root groups of the building associated with the algebraic group is shown to hold in greater generality (in particular, over perfect fields of characteristic distinct from two).
Submitted 29 October, 2020;
originally announced October 2020.
-
Optimal sampling and Christoffel functions on general domains
Authors:
Albert Cohen,
Matthieu Dolbeault
Abstract:
We consider the problem of reconstructing an unknown function $u\in L^2(D,μ)$ from its evaluations at given sampling points $x^1,\dots,x^m\in D$, where $D\subset \mathbb R^d$ is a general domain and $μ$ a probability measure. The approximation is picked from a linear space $V_n$ of interest where $n=\dim(V_n)$. Recent results have revealed that certain weighted least-squares methods achieve near best approximation with a sampling budget $m$ that is proportional to $n$, up to a logarithmic factor $\ln(2n/\varepsilon)$, where $\varepsilon>0$ is a probability of failure. The sampling points should be picked at random according to a well-chosen probability measure $σ$ whose density is given by the inverse Christoffel function that depends both on $V_n$ and $μ$. While this approach is greatly facilitated when $D$ and $μ$ have tensor product structure, it becomes problematic for domains $D$ with arbitrary geometry since the optimal measure depends on an orthonormal basis of $V_n$ in $L^2(D,μ)$ which is not explicitly given, even for simple polynomial spaces. Therefore sampling according to this measure is not practically feasible. In this paper, we discuss practical sampling strategies, which amount to using a perturbed measure $\widetilde σ$ that can be computed in an offline stage, not involving the measurement of $u$. We show that near best approximation is attained by the resulting weighted least-squares method at near-optimal sampling budget and we discuss multilevel approaches that preserve optimality of the cumulated sampling budget when the spaces $V_n$ are iteratively enriched. These strategies rely on the knowledge of a-priori upper bounds on the inverse Christoffel function. We establish such bounds for spaces $V_n$ of multivariate algebraic polynomials, and for general domains $D$.
Submitted 27 October, 2020; v1 submitted 21 October, 2020;
originally announced October 2020.
-
Noise Recycling
Authors:
Alejandro Cohen,
Amit Solomon,
Ken R. Duffy,
Muriel Médard
Abstract:
We introduce Noise Recycling, a method that enhances decoding performance of channels subject to correlated noise without joint decoding. The method can be used with any combination of codes, code-rates and decoding techniques. In the approach, a continuous realization of noise is estimated from a lead channel by subtracting its decoded output from its received signal. This estimate is then used to improve the accuracy of decoding of an orthogonal channel that is experiencing correlated noise. In this design, channels aid each other only through the provision of noise estimates post-decoding. In a Gauss-Markov model of correlated noise, we constructively establish that noise recycling employing a simple successive order enables higher rates than not recycling noise. Simulations illustrate that noise recycling can be employed with any code and decoder, and that noise recycling shows Block Error Rate (BLER) benefits when applying the same predetermined order as used to enhance the rate region. Finally, for short codes we establish that an additional BLER improvement is possible through noise recycling with racing, where the lead channel is not pre-determined, but is chosen on the fly based on which decoder completes first.
Submitted 12 October, 2020;
originally announced October 2020.
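The noise-estimation step described in this abstract can be illustrated with a toy simulation. This is only a minimal sketch under assumptions that are not taken from the paper: uncoded BPSK symbols with hard-decision decoding stand in for the codes and decoders, the two channels share Gaussian noise with a known correlation `rho`, and the function name `simulate` and all parameter values are hypothetical.

```python
import math
import random


def simulate(n_symbols=20000, sigma=0.5, rho=0.9, seed=0):
    """Toy comparison of decoding channel 2 with and without a recycled noise estimate.

    Assumptions (illustrative only): uncoded BPSK, hard decisions,
    and noise z2 = rho * z1 + sqrt(1 - rho^2) * w with w independent of z1.
    Returns (error rate without recycling, error rate with recycling).
    """
    rng = random.Random(seed)
    errors_plain = 0
    errors_recycled = 0
    for _ in range(n_symbols):
        x1 = rng.choice((-1.0, 1.0))
        x2 = rng.choice((-1.0, 1.0))
        z1 = rng.gauss(0.0, sigma)
        z2 = rho * z1 + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, sigma)
        y1 = x1 + z1  # received signal on the lead channel
        y2 = x2 + z2  # received signal on the second channel

        # Decode the lead channel, then estimate its noise realization
        # by subtracting the decoded output from the received signal.
        x1_hat = 1.0 if y1 >= 0 else -1.0
        z1_hat = y1 - x1_hat

        # Decode channel 2 directly, and again after removing the
        # predictable part of its noise (rho * z1_hat).
        x2_plain = 1.0 if y2 >= 0 else -1.0
        x2_recycled = 1.0 if (y2 - rho * z1_hat) >= 0 else -1.0

        errors_plain += x2_plain != x2
        errors_recycled += x2_recycled != x2
    return errors_plain / n_symbols, errors_recycled / n_symbols


if __name__ == "__main__":
    plain, recycled = simulate()
    print(f"symbol error rate without recycling: {plain:.4f}")
    print(f"symbol error rate with recycling:    {recycled:.4f}")
```

In this toy setting, most of the remaining errors on the second channel come from symbols where the lead channel itself was decoded incorrectly, so the noise estimate is off by a full symbol; the sketch does not model the rate-region or racing results mentioned in the abstract.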
-
Multi-Level Group Testing with Application to One-Shot Pooled COVID-19 Tests
Authors:
Amit Solomon,
Alejandro Cohen,
Nir Shlezinger,
Yonina C. Eldar,
Muriel Médard
Abstract:
A key requirement in containing contagious diseases, such as the Coronavirus disease 2019 (COVID-19) pandemic, is the ability to efficiently carry out mass diagnosis over large populations. Some of the leading testing procedures, such as those utilizing qualitative polymerase chain reaction, involve using dedicated machinery which can simultaneously process a limited amount of samples. A candidate method to increase the test throughput is to examine pooled samples comprised of a mixture of samples from different patients. In this work we study pooling based tests which operate in a one-shot fashion, while providing an indication not solely on the presence of infection, but also on its level, without additional pool tests, as often required in COVID-19 testing. As these requirements limit the application of traditional group-testing (GT) methods, we propose a multi-level GT scheme, which builds upon GT principles to enable accurate recovery using much fewer tests than patients, while operating in a one-shot manner and providing multi-level indications. We provide a theoretical analysis of the proposed scheme and characterize conditions under which the algorithm operates reliably and at affordable computational complexity. Our numerical results demonstrate that multi level GT accurately and efficiently detects infection levels, while achieving improved performance over previously proposed one-shot COVID-19 pooled-testing methods.
Submitted 30 August, 2022; v1 submitted 12 October, 2020;
originally announced October 2020.
-
Relation Classification as Two-way Span-Prediction
Authors:
Amir DN Cohen,
Shachar Rosenman,
Yoav Goldberg
Abstract:
The current supervised relation classification (RC) task uses a single embedding to represent the relation between a pair of entities. We argue that a better approach is to treat the RC task as a span-prediction (SP) problem, similar to question answering (QA). We present a span-prediction based system for RC and evaluate its performance compared to the embedding based system. We demonstrate that the supervised SP objective works significantly better than the standard classification based objective. We achieve state-of-the-art results on the TACRED and SemEval task 8 datasets.
Submitted 17 April, 2021; v1 submitted 9 October, 2020;
originally announced October 2020.
-
Place Recognition in Forests with Urquhart Tessellations
Authors:
Guilherme V. Nardari,
Avraham Cohen,
Steven W. Chen,
Xu Liu,
Vaibhav Arcot,
Roseli A. F. Romero,
Vijay Kumar
Abstract:
In this letter, we present a novel descriptor based on Urquhart tessellations derived from the position of trees in a forest. We propose a framework that uses these descriptors to detect previously seen observations and landmark correspondences, even with partial overlap and noise. We run loop closure detection experiments in simulation and real-world data map-merging from different flights of an Unmanned Aerial Vehicle (UAV) in a pine tree forest and show that our method outperforms state-of-the-art approaches in accuracy and robustness.
Submitted 16 November, 2020; v1 submitted 23 September, 2020;
originally announced October 2020.
-
A Sylvester-Gallai theorem for cubic curves
Authors:
Alex Cohen,
Frank de Zeeuw
Abstract:
We prove a variant of the Sylvester-Gallai theorem for cubics (algebraic curves of degree three): If a finite set of sufficiently many points in $\mathbb{R}^2$ is not contained in a cubic, then there is a cubic that contains exactly nine of the points. This resolves the first unknown case of a conjecture of Wiseman and Wilson from 1988, who proved a variant of Sylvester-Gallai for conics and conjectured that similar statements hold for curves of any degree.
Submitted 2 January, 2022; v1 submitted 4 October, 2020;
originally announced October 2020.
-
Bringing Network Coding into SDN: A Case-study for Highly Meshed Heterogeneous Communications
Authors:
Alejandro Cohen,
Homa Esfahanizadeh,
Bruno Sousa,
João P. Vilela,
Miguel Luís,
Duarte Raposo,
Francois Michel,
Susana Sargento,
Muriel Médard
Abstract:
Modern communications have moved away from point-to-point models to increasingly heterogeneous network models. In this article, we propose a novel controller-based protocol to deploy adaptive causal network coding in heterogeneous and highly-meshed communication networks. Specifically, we consider using Software-Defined-Network (SDN) as the main controller. We first present an architecture for the highly-meshed heterogeneous multi-source multi-destination networks that represents the practical communication networks encountered in the fifth generation of wireless networks (5G) and beyond. Next, we present a promising solution to deploy network coding over the new architecture. In fact, we investigate how to generalize adaptive and causal random linear network coding (AC-RLNC), proposed for multipath multi-hop (MP-MH) communication channels, to a protocol for the new multi-source multi-destination network architecture using controller. To this end, we present a modularized implementation of AC-RLNC solution where the modules work together in a distributed fashion and perform the AC-RLNC technology. We also present a new controller-based setting through which the network coding modules can communicate and can attain their required information. Finally, we briefly discuss how the proposed architecture and network coding solution provide a good opportunity for future technologies, e.g., distributed coded computation and storage, mmWave communication environments, and innovative and efficient security features.
Submitted 1 October, 2020;
originally announced October 2020.
-
The pursuit of stability in halide perovskites: the monovalent cation and the key for surface and bulk self-repair
Authors:
D R Ceratti,
A V Cohen,
R Tenne,
Y Rakita,
L Snarski,
L Cremonesi,
I Goldian,
I Kaplan-Ashiri,
T Bendikov,
V Kalchenko,
M Elbaum,
M A C Potenza,
L Kronik,
G Hodes,
D Cahen
Abstract:
We find significant differences between degradation and healing at the surface or in the bulk for each of the different APbBr3 single crystals (A=CH3NH3+, methylammonium (MA); HC(NH2)2+, formamidinium (FA); and cesium, Cs+). Using 1- and 2-photon microscopy and photobleaching we conclude that kinetics dominate the surface, and thermodynamics the bulk stability. Fluorescence-lifetime imaging microscopy, as well as results from several other methods, relate the (damaged) state of the halide perovskite (HaP) after photobleaching to its modified optical and electronic properties. The A cation type strongly influences both the kinetics and the thermodynamics of recovery and degradation: FA heals best the bulk material with faster self-healing; Cs+ protects the surface best, being the least volatile of the A cations and possibly through O-passivation; MA passivates defects via methylamine from photo-dissociation, which binds to Pb2+. DFT simulations not only provide insight into the latter conclusion, but also show the importance and stability of the Br3- defect. These results rationalize the use of mixed A-cation materials for optimizing both solar cell stability and overall performance of HaP-based devices, and provide a basis for designing new HaP variants.
Submitted 30 September, 2020;
originally announced September 2020.
-
PennSyn2Real: Training Object Recognition Models without Human Labeling
Authors:
Ty Nguyen,
Ian D. Miller,
Avi Cohen,
Dinesh Thakur,
Shashank Prasad,
Camillo J. Taylor,
Pratik Chaudrahi,
Vijay Kumar
Abstract:
Scalable training data generation is a critical problem in deep learning. We propose PennSyn2Real - a photo-realistic synthetic dataset consisting of more than 100,000 4K images of more than 20 types of micro aerial vehicles (MAVs). The dataset can be used to generate arbitrary numbers of training images for high-level computer vision tasks such as MAV detection and classification. Our data generation framework bootstraps chroma-keying, a mature cinematography technique with a motion tracking system, providing artifact-free and curated annotated images where object orientations and lighting are controlled. This framework is easy to set up and can be applied to a broad range of objects, reducing the gap between synthetic and real-world data. We show that synthetic data generated using this framework can be directly used to train CNN models for common object recognition tasks such as detection and segmentation. We demonstrate competitive performance in comparison with training using only real images. Furthermore, bootstrapping the generated synthetic data in few-shot learning can significantly improve the overall performance, reducing the number of required training data samples to achieve the desired accuracy.
Submitted 16 October, 2020; v1 submitted 21 September, 2020;
originally announced September 2020.
-
Optimal Stable Nonlinear Approximation
Authors:
Albert Cohen,
Ronald DeVore,
Guergana Petrova,
Przemyslaw Wojtaszczyk
Abstract:
While it is well known that nonlinear methods of approximation can often perform dramatically better than linear methods, there are still questions on how to measure the optimal performance possible for such methods. This paper studies nonlinear methods of approximation that are compatible with numerical implementation in that they are required to be numerically stable. A measure of optimal performance, called {\em stable manifold widths}, for approximating a model class $K$ in a Banach space $X$ by stable manifold methods is introduced. Fundamental inequalities between these stable manifold widths and the entropy of $K$ are established. The effects of requiring stability in the settings of deep learning and compressed sensing are discussed.
Submitted 21 September, 2020;
originally announced September 2020.
-
Nonlinear reduced models for state and parameter estimation
Authors:
Albert Cohen,
Wolfgang Dahmen,
Olga Mula,
James Nichols
Abstract:
State estimation aims at approximately reconstructing the solution $u$ to a parametrized partial differential equation from $m$ linear measurements, when the parameter vector $y$ is unknown. Fast numerical recovery methods have been proposed based on reduced models which are linear spaces of moderate dimension $n$ which are tailored to approximate the solution manifold $\mathcal{M}$ where the solution sits. These methods can be viewed as deterministic counterparts to Bayesian estimation approaches, and are proved to be optimal when the prior is expressed by approximability of the solution with respect to the reduced model. However, they are inherently limited by their linear nature, which bounds from below their best possible performance by the Kolmogorov width $d_m(\mathcal{M})$ of the solution manifold. In this paper we propose to break this barrier by using simple nonlinear reduced models that consist of a finite union of linear spaces $V_k$, each having dimension at most $m$ and leading to different estimators $u_k^*$. A model selection mechanism based on minimizing the PDE residual over the parameter space is used to select from this collection the final estimator $u^*$. Our analysis shows that $u^*$ meets optimal recovery benchmarks that are inherent to the solution manifold and not tied to its Kolmogorov width. The residual minimization procedure is computationally simple in the relevant case of affine parameter dependence in the PDE. In addition, it results in an estimator $y^*$ for the unknown parameter vector. In this setting, we also discuss an alternating minimization (coordinate descent) algorithm for joint state and parameter estimation, that potentially improves the quality of both estimators.
Submitted 24 November, 2020; v1 submitted 6 September, 2020;
originally announced September 2020.
-
Improving axial resolution in SIM using deep learning
Authors:
Miguel Boland,
Edward A. K. Cohen,
Seth Flaxman,
Mark A. A. Neil
Abstract:
Structured Illumination Microscopy is a widespread methodology to image live and fixed biological structures smaller than the diffraction limits of conventional optical microscopy. Using recent advances in image up-scaling through deep learning models, we demonstrate a method to reconstruct 3D SIM image stacks with twice the axial resolution attainable through conventional SIM reconstructions. We further evaluate our method for robustness to noise & generalisability to varying observed specimens, and discuss potential adaptions of the method to further improvements in resolution.
Submitted 18 February, 2021; v1 submitted 4 September, 2020;
originally announced September 2020.
-
Network Coding-Based Post-Quantum Cryptography
Authors:
Alejandro Cohen,
Rafael G. L. D'Oliveira,
Salman Salamatian,
Muriel Medard
Abstract:
We propose a novel hybrid universal network-coding cryptosystem (HUNCC) to obtain secure post-quantum cryptography at high communication rates. The secure network-coding scheme we offer is hybrid in the sense that it combines information-theory security with public-key cryptography. In addition, the scheme is general and can be applied to any communication network, and to any public-key cryptosystem. Our hybrid scheme is based on the information theoretic notion of individual secrecy, which traditionally relies on the assumption that an eavesdropper can only observe a subset of the communication links between the trusted parties - an assumption that is often challenging to enforce. For this setting, several code constructions have been developed, where the messages are linearly mixed before transmission over each of the paths in a way that guarantees that an adversary which observes only a subset has sufficient uncertainty about each individual message.
Instead, in this paper, we take a computational viewpoint, and construct a coding scheme in which an arbitrary secure cryptosystem is utilized on a subset of the links, while a pre-processing similar to the one in individual security is utilized. Under this scheme, we demonstrate 1) a computational security guarantee for an adversary which observes the entirety of the links 2) an information theoretic security guarantee for an adversary which observes a subset of the links, and 3) information rates which approach the capacity of the network and greatly improve upon the current solutions.
A perhaps surprising consequence of our scheme is that, to guarantee a computational security level b, it is sufficient to encrypt a single link using a computational post-quantum scheme. In addition, the information rate approaches 1 as the number of communication links increases.
Submitted 3 September, 2020;
originally announced September 2020.
-
Centralized vs Decentralized Targeted Brute-Force Attacks: Guessing with Side-Information
Authors:
Salman Salamatian,
Wasim Huleihel,
Ahmad Beirami,
Asaf Cohen,
Muriel Médard
Abstract:
According to recent empirical studies, a majority of users have the same, or very similar, passwords across multiple password-secured online services. This practice can have disastrous consequences, as one password being compromised puts all the other accounts at much higher risk. Generally, an adversary may use any side-information he/she possesses about the user, be it demographic information, password reuse on a previously compromised account, or any other relevant information to devise a better brute-force strategy (so called targeted attack). In this work, we consider a distributed brute-force attack scenario in which $m$ adversaries, each observing some side information, attempt breaching a password secured system. We compare two strategies: an uncoordinated attack in which the adversaries query the system based on their own side-information until they find the correct password, and a fully coordinated attack in which the adversaries pool their side-information and query the system together. For passwords $\mathbf{X}$ of length $n$, generated independently and identically from a distribution $P_X$, we establish an asymptotic closed-form expression for the uncoordinated and coordinated strategies when the side-information $\mathbf{Y}_{(m)}$ are generated independently from passing $\mathbf{X}$ through a memoryless channel $P_{Y|X}$, as the length of the password $n$ goes to infinity. We illustrate our results for binary symmetric channels and binary erasure channels, two families of side-information channels which model password reuse. We demonstrate that two coordinated agents perform asymptotically better than any finite number of uncoordinated agents for these channels, meaning that sharing side-information is very valuable in distributed attacks.
Submitted 28 August, 2020;
originally announced August 2020.
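As a small illustration of why side information helps a brute-force attacker, the sketch below computes the exact expected number of guesses for a single adversary in a much simpler setting than the one analysed in the abstract. The assumptions are ours, not the authors': the password is a uniformly random $n$-bit string, the side information is that string passed through a binary symmetric channel with crossover probability $p \le 1/2$, and the adversary guesses candidates in order of increasing Hamming distance from the side information. The function names are hypothetical, and the sketch does not cover the multi-adversary or asymptotic results.

```python
from math import comb


def expected_guesses_no_side_info(n):
    """Expected number of guesses for a uniform n-bit password with no side information."""
    return (2 ** n + 1) / 2


def expected_guesses_with_bsc(n, p):
    """Expected number of guesses when the adversary sees the password through a BSC(p).

    Assumes 0 <= p <= 0.5, so candidates closer (in Hamming distance) to the
    observed string are at least as likely and are guessed first.
    """
    expected = 0.0
    already_guessed = 0
    for d in range(n + 1):
        group = comb(n, d)  # number of candidates at distance d
        prob_each = (p ** d) * ((1 - p) ** (n - d))  # posterior of each such candidate
        # Candidates within a group are equally likely, so the true password's
        # rank inside the group is uniform over its positions.
        mean_rank = already_guessed + (group + 1) / 2
        expected += group * prob_each * mean_rank
        already_guessed += group
    return expected


if __name__ == "__main__":
    n = 8
    print("no side information:", expected_guesses_no_side_info(n))
    for p in (0.0, 0.1, 0.3, 0.5):
        print(f"BSC({p}):", expected_guesses_with_bsc(n, p))
```

With $p = 1/2$ the side information is independent of the password, so the two quantities coincide; as $p$ decreases, the expected number of guesses drops toward one.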
-
Growth-Etch Metal-Organic Chemical Vapor Deposition Approach of WS2 Atomic-Layers
Authors:
Assael Cohen,
Avinash Patsha,
Pranab K. Mohapatra,
Miri Kazes,
Kamalakannan Ranganathan,
Lothar Houben,
Dan Oron,
Ariel Ismach
Abstract:
Metal organic chemical vapor deposition (MOCVD) is one of the main methodologies used for thin film fabrication in the semiconductor industry today and is considered one of the most promising routes to achieve large-scale and high-quality 2D transition metal dichalcogenides (TMDCs). However, unless special measures are taken, MOCVD suffers from some serious drawbacks, such as small domain size and carbon contamination, resulting in poor optical and crystal quality, which may inhibit its implementation for the large-scale fabrication of atomic-thin semiconductors. Here we present a Growth-Etch MOCVD (GE-MOCVD) methodology, in which a small amount of water vapor is introduced during the growth, while the precursors are delivered in pulses. The evolution of the growth as a function of the amount of water vapor, the number and type of cycles and the gas composition is described. We show that a significant domain size increase is achieved relative to our conventional process. The improved crystal quality of WS2 (and WSe2) domains was demonstrated by means of Raman spectroscopy, photoluminescence (PL) spectroscopy and HRTEM studies. Moreover, time-resolved PL studies show very long exciton lifetimes, comparable to those observed in mechanically exfoliated flakes. Thus, the GE-MOCVD approach presented here may facilitate the integration of these materials into a wide range of applications.
Submitted 8 December, 2020; v1 submitted 17 August, 2020;
originally announced August 2020.
-
Strength In Diversity: Small Bodies as the Most Important Objects in Planetary Sciences
Authors:
Laura M. Woodney,
Andrew S. Rivkin,
Walter Harris,
Barbara A. Cohen,
Gal Sarid,
Maria Womack,
Olivier Barnouin,
Kat Volk,
Rachel Klima,
Yanga R. Fernandez,
Jordan K. Steckloff,
Paul A. Abell
Abstract:
Small bodies, the unaccreted leftovers of planetary formation, are often mistaken for the leftovers of planetary science in the sense that they are everything else after the planets and their satellites (or sometimes just their regular satellites) are accounted for. This mistaken view elides the great diversity of compositions, histories, and present-day conditions and processes found in the small bodies, and the interdisciplinary nature of their study. Understanding small bodies is critical to planetary science as a field, and we urge planetary scientists and our decision makers to continue to support science-based mission selections and to recognize that while small bodies have been grouped together for convenience, the diversity of these objects in terms of composition, mass, differentiation, evolution, activity, dynamical state, physical structure, thermal environment, thermal history, and formation vastly exceeds the observed variability in the major planets and their satellites. Treating them as a monolithic group with interchangeable members does a grave injustice to the range of fundamental questions they address. We advocate for a deep and ongoing program of missions, telescopic observations, R&A funding, and student support that respects this diversity.
Submitted 14 August, 2020;
originally announced August 2020.
-
Report prepared by the Montreal AI Ethics Institute In Response to Mila's Proposal for a Contact Tracing App
Authors:
Allison Cohen,
Abhishek Gupta
Abstract:
Contact tracing has grown in popularity as a promising solution to the COVID-19 pandemic. The benefits of automated contact tracing are two-fold. Contact tracing promises to reduce the number of infections by being able to: 1) systematically identify all of those that have been in contact with someone who has had COVID; and, 2) ensure those that have been exposed to the virus do not unknowingly infect others. "COVI" is the name of a recent contact tracing app developed by Mila and was proposed to help combat COVID-19 in Canada. The app was designed to inform each individual of their relative risk of being infected with the virus, which Mila claimed would empower citizens to make informed decisions about their movement and allow for a data-driven approach to public health policy; all the while ensuring data is safeguarded from governments, companies, and individuals. This article will provide a critical response to Mila's COVI White Paper. Specifically, this article will discuss: the extent to which diversity has been considered in the design of the app, assumptions surrounding users' interaction with the app and the app's utility, as well as unanswered questions surrounding transparency, accountability, and security. We see this as an opportunity to supplement the excellent risk analysis done by the COVI team to surface insights that can be applied to other contact- and proximity-tracing apps that are being developed and deployed across the world. Our hope is that, through a meaningful dialogue, we can ultimately help organizations develop better solutions that respect the fundamental rights and values of the communities these solutions are meant to serve.
Submitted 11 August, 2020;
originally announced August 2020.
-
Montreal AI Ethics Institute's (MAIEI) Submission to the World Intellectual Property Organization (WIPO) Conversation on Intellectual Property (IP) and Artificial Intelligence (AI) Second Session
Authors:
Allison Cohen,
Abhishek Gupta
Abstract:
This document posits that, at best, a tenuous case can be made for providing AI exclusive IP over their "inventions". Furthermore, IP protections for AI are unlikely to confer the benefit of ensuring regulatory compliance. Rather, IP protections for AI "inventors" present a host of negative externalities and obscure the fact that the genuine inventor, deserving of IP, is the human agent. This document will conclude by recommending strategies for WIPO to bring IP law into the 21st century, enabling it to productively account for AI "inventions".
Theme: IP Protection for AI-Generated and AI-Assisted Works Based on insights from the Montreal AI Ethics Institute (MAIEI) staff and supplemented by workshop contributions from the AI Ethics community convened by MAIEI on July 5, 2020.
Submitted 11 August, 2020;
originally announced August 2020.
-
Optimal Dividend Problem: Asymptotic Analysis
Authors:
Asaf Cohen,
Virginia R. Young
Abstract:
We re-visit the classical problem of optimal payment of dividends and determine the degree to which the diffusion approximation serves as a valid approximation of the classical risk model for this problem. Our results parallel some of those in Bäuerle (2004), but we obtain sharper results because we use a different technique for obtaining them. Specifically, Bäuerle (2004) uses probabilistic techniques and relies on convergence in distribution of the underlying processes. By contrast, we use comparison results from the theory of differential equations, and these methods allow us to determine the rate of convergence of the value functions in question.
Submitted 22 October, 2020; v1 submitted 21 July, 2020;
originally announced July 2020.
-
A Sylvester-Gallai result for concurrent lines in the complex plane
Authors:
Alex Cohen
Abstract:
We show that if a set of points in $\mathbb{C}^2$ lies on a family of $m$ concurrent lines, and if one of those lines contains more than $m-2$ points, then there is a line passing through exactly two points of the set. The bound $m-2$ in our result is optimal. Our main theorem resolves a conjecture of Frank de Zeeuw, and generalizes a result of Kelly and Nwankpa.
Submitted 28 September, 2020; v1 submitted 7 July, 2020;
originally announced July 2020.
-
A Macroeconomic SIR Model for COVID-19
Authors:
Erhan Bayraktar,
Asaf Cohen,
April Nellis
Abstract:
The current COVID-19 pandemic and subsequent lockdowns have highlighted the close and delicate relationship between a country's public health and economic health. Macroeconomic models that use preexisting epidemic models to calculate the impacts of a disease outbreak are therefore extremely useful for policymakers seeking to evaluate the best course of action in such a crisis. We develop an SIR model of the COVID-19 pandemic that explicitly considers herd immunity, behavior-dependent transmission rates, remote workers, and indirect externalities of lockdown. This model is presented as an exit time control problem where lockdown ends when the population achieves herd immunity, either naturally or via a vaccine. A social planner prescribes separate levels of lockdown for two separate sections of the adult population: low-risk (ages 20-64) and high-risk (ages 65 and over). These levels are determined via optimization of an objective function which assigns a macroeconomic cost to the level of lockdown and the number of deaths. We find that, by ending lockdowns once herd immunity is reached, high-risk individuals are able to leave lockdown significantly before the arrival of a vaccine without causing large increases in mortality. Moreover, if we incorporate a behavior-dependent transmission rate which represents increased personal caution in response to increased infection levels, both output loss and total mortality are lowered. Lockdown efficacy is further increased when there is less interaction between low- and high-risk individuals, and increased remote work decreases output losses. Overall, our model predicts that a lockdown which ends at the arrival of herd immunity, combined with individual actions to slow virus transmission, can reduce total mortality to one-third of the no-lockdown level, while allowing high-risk individuals to leave lockdown well before vaccine arrival.
Submitted 22 June, 2020;
originally announced June 2020.
-
The State of AI Ethics Report (June 2020)
Authors:
Abhishek Gupta,
Camylle Lanteigne,
Victoria Heath,
Marianna Bergamaschi Ganapini,
Erick Galinkin,
Allison Cohen,
Tania De Gasperis,
Mo Akif,
Renjie Butalid
Abstract:
These past few months have been especially challenging, and the deployment of technology in ways hitherto untested at an unrivalled pace has left the internet and technology watchers aghast. Artificial intelligence has become the byword for technological progress and is being used in everything from helping us combat the COVID-19 pandemic to nudging our attention in different directions as we all spend increasingly larger amounts of time online. It has never been more important that we keep a sharp eye out on the development of this field and how it is shaping our society and interactions with each other. With this inaugural edition of the State of AI Ethics we hope to bring forward the most important developments that caught our attention at the Montreal AI Ethics Institute this past quarter. Our goal is to help you navigate this ever-evolving field swiftly and allow you and your organization to make informed decisions. This pulse-check for the state of discourse, research, and development is geared towards researchers and practitioners alike who are making decisions on behalf of their organizations in considering the societal impacts of AI-enabled solutions. We cover a wide set of areas in this report spanning Agency and Responsibility, Security and Risk, Disinformation, Jobs and Labor, the Future of AI Ethics, and more. Our staff has worked tirelessly over the past quarter surfacing signal from the noise so that you are equipped with the right tools and knowledge to confidently tread this complex yet consequential domain.
Submitted 25 June, 2020;
originally announced June 2020.
-
Spin-Induced Linear Polarization of Excitonic Emission in Antiferromagnetic van der Waals Crystals
Authors:
Xingzhi Wang,
Jun Cao,
Zhengguang Lu,
Arielle Cohen,
Hikari Kitadai,
Tianshu Li,
Matthew Wilson,
Chun Hung Lui,
Dmitry Smirnov,
Sahar Sharifzadeh,
Xi Ling
Abstract:
Antiferromagnets display enormous potential in spintronics owing to their intrinsic properties, including terahertz resonance, multilevel states, and absence of stray fields. Combined with their layered nature, van der Waals (vdW) antiferromagnets hold promise for providing new insights and new designs in two-dimensional (2D) spintronics. The zero net magnetic moment of vdW antiferromagnets strengthens the spin stability but impedes the correlation between spin and other excitation elements, such as excitons. Such coupling is urgently anticipated for fundamental magneto-optical studies and potential opto-spintronic devices. Here, we report an ultra-sharp excitonic emission with excellent monochromaticity in antiferromagnetic nickel phosphorus trisulfide (NiPS3) from bulk to atomically thin flakes. We prove that the linear polarization of the excitonic luminescence is perpendicular to the ordered spin orientation in NiPS3. By applying an in-plane magnetic field to alter the spin orientation, we further manipulate the excitonic emission polarization. Such strong correlation between exciton and spins provides new insights for the study of magneto-optics in 2D materials, and hence opens a path for developing opto-spintronic devices and antiferromagnet-based quantum information technologies.
Submitted 14 June, 2020;
originally announced June 2020.
-
Noise Recycling
Authors:
Alejandro Cohen,
Amit Solomon,
Ken R. Duffy,
Muriel Médard
Abstract:
We introduce Noise Recycling, a method that substantially enhances decoding performance of orthogonal channels subject to correlated noise without the need for joint encoding or decoding. The method can be used with any combination of codes, code-rates and decoding techniques. In the approach, a continuous realization of noise is estimated from a lead channel by subtracting its decoded output from its received signal. The estimate is recycled to increase the effective Signal to Noise Ratio (SNR) of an orthogonal channel that is experiencing correlated noise and so improve the accuracy of its decoding. In this design, channels aid each other only through the provision of noise estimates post-decoding.
For a system with arbitrary noise correlation between orthogonal channels experiencing potentially distinct conditions, we introduce an algorithm that determines a static decoding order that maximizes total effective SNR. We prove that this solution results in higher effective SNR than independent decoding, which in turn leads to a larger rate region. We derive upper and lower bounds on the capacity of any sequential decoding of orthogonal channels with correlated noise where the encoders are independent and show that those bounds are almost tight. We numerically compare the upper bound with the capacity of a jointly Gaussian noise channel with joint encoding and decoding, showing that they match.
Simulation results illustrate that Noise Recycling can be employed with any combination of codes and decoders, and that it gives significant Block Error Rate (BLER) benefits when applying the static predetermined order used to enhance the rate region. We further establish that an additional BLER improvement is possible through Dynamic Noise Recycling, where the lead channel is not pre-determined but is chosen on-the-fly based on which decoder provides the most confident decoding.
Submitted 8 June, 2020;
originally announced June 2020.
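As an illustration of the idea described in the Noise Recycling abstract above, the following is a minimal sketch and not the authors' implementation. It assumes two channels with jointly Gaussian noise of equal variance and known correlation `rho`, BPSK symbols, and an idealized lead decoder that always recovers its symbol; the function name `simulate_noise_recycling` and the choice to subtract `rho` times the noise estimate are assumptions made for this example.

```python
import random


def simulate_noise_recycling(rho=0.8, sigma=1.0, n=20000, seed=0):
    """Return (raw, recycled) average noise power seen on the second channel.

    Assumptions (hypothetical, for illustration only): equal-variance jointly
    Gaussian noise with correlation rho, and a lead channel that is decoded
    without error.
    """
    rng = random.Random(seed)
    raw_power = 0.0
    recycled_power = 0.0
    for _ in range(n):
        # Correlated noise pair (z1, z2) with correlation coefficient rho.
        z1 = rng.gauss(0.0, sigma)
        z2 = rho * z1 + (1.0 - rho ** 2) ** 0.5 * rng.gauss(0.0, sigma)

        # BPSK symbols sent over the two orthogonal channels.
        x1 = rng.choice((-1.0, 1.0))
        x2 = rng.choice((-1.0, 1.0))
        y1 = x1 + z1
        y2 = x2 + z2

        # Idealized lead decoder: assume x1 is recovered exactly.
        x1_hat = x1
        # Noise estimate: received signal minus decoded output.
        z1_hat = y1 - x1_hat

        # Recycle the estimate by removing the correlated part of the noise.
        y2_recycled = y2 - rho * z1_hat

        raw_power += (y2 - x2) ** 2
        recycled_power += (y2_recycled - x2) ** 2
    return raw_power / n, recycled_power / n
```

Under these assumptions the residual noise power on the second channel is close to `(1 - rho**2) * sigma**2`, so a larger correlation gives a larger effective SNR gain before the second channel is decoded.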
-
ARTEMIS Observations of Plasma Waves in Laminar and Perturbed Interplanetary Shocks
Authors:
L. Davis,
C. A. Cattell,
L. B. Wilson III,
Z. A. Cohen,
A. W. Breneman,
E. L. M. Hanson
Abstract:
The 'Acceleration, Reconnection, Turbulence and Electrodynamics of the Moon's Interaction with the Sun' (ARTEMIS) mission provides a unique opportunity to study the structure of interplanetary shocks and the associated generation of plasma waves with frequencies between ~50-8000 Hz due to its long duration electric and magnetic field burst waveform captures. We compare wave properties and occurrence rates at 11 quasi-perpendicular interplanetary shocks with burst data within 10 minutes (~3200 proton gyroradii upstream, ~1900 downstream) of the shock ramp. A perturbed shock is defined as possessing a large amplitude whistler precursor in the quasi-static magnetic field with an amplitude greater than 1/3 the difference between the upstream and downstream average magnetic field magnitudes; laminar shocks lack these large precursors and have a smooth, step function-like transition. In addition to wave modes previously observed, including ion acoustic, whistler, and electrostatic solitary waves, waves in the ion acoustic frequency range that show rapid temporal frequency change are common. The ramp region of the two laminar shocks with burst data in the ramp contained a wide range of large amplitude wave modes whereas the one perturbed shock with ramp burst data contained no such waves. Energy dissipation through wave-particle interactions is more prominent in these laminar shocks than in the perturbed shock. The wave occurrence rates for laminar shocks are higher in the transition region, especially the ramp, than downstream. Perturbed shocks have approximately 2-3 times the downstream wave occurrence rate of laminar shocks.
Submitted 22 March, 2021; v1 submitted 1 June, 2020;
originally announced June 2020.
-
Localized Excitons in Defective Monolayer Germanium Selenide
Authors:
Arielle Cohen,
D. Kirk Lewis,
Tianlun Huang,
Sahar Sharifzadeh
Abstract:
Germanium Selenide (GeSe) is a van der Waals-bonded layered material with promising optoelectronic properties, which has been experimentally synthesized for 2D semiconductor applications. In the monolayer, due to reduced dimensionality and, thus, screening environment, perturbations such as the presence of defects have a significant impact on its properties. We apply density functional theory and many-body perturbation theory to understand the electronic and optical properties of GeSe containing a single selenium vacancy in the $-2$ charge state. We predict that the vacancy results in mid-gap "trap states" that strongly localize the electron and hole density and lead to sharp, low-energy optical absorption peaks below the predicted pristine optical gap. Analysis of the exciton wavefunction reveals that the 2D Wannier-Mott exciton of the pristine monolayer is highly localized around the defect, reducing its Bohr radius by a factor of four and producing a dipole moment along the out-of-plane axis due to the defect-induced symmetry breaking. Overall, these results suggest that the vacancy is a strong perturbation to the system, demonstrating the importance of considering defects in the context of material design.
Submitted 20 May, 2020;
originally announced May 2020.
-
Compute-and-Forward in Large Relaying Systems: Limitations and Asymptotically Optimal Scheduling
Authors:
Ori Shmuel,
Asaf Cohen,
Omer Gurewitz
Abstract:
Compute and Forward (CF) is a coding scheme which enables receivers to decode linear combinations of simultaneously transmitted messages while exploiting the linear properties of lattice codes and the additive nature of a shared medium. The scheme was originally designed for relay networks, yet, it was found useful in other communication problems, such as MIMO communication. Works in the current literature assume a fixed number of transmitters and receivers in the system. However, following the increase in communication networks density, it is interesting to investigate the performance of CF when the number of transmitters is large.
In this work, we show that as the number of transmitters grows, CF becomes degenerate, in the sense that a relay prefers to decode only one (strongest) user instead of any other linear combination of the transmitted codewords, treating the other users as noise. Moreover, the system's sum-rate tends to zero as well. This makes scheduling necessary in order to maintain the superior abilities CF provides. We thus examine the problem of scheduling for CF. We start with insights on why good scheduling opportunities can be found. Then, we provide an asymptotically optimal, polynomial-time scheduling algorithm and analyze its performance. We conclude that with proper scheduling, CF is not merely non-degenerate, but, in fact, provides a gain for the system sum-rate, up to the optimal scaling law of $O(\log{\log{L}})$.
Submitted 10 May, 2020;
originally announced May 2020.
-
Nonlinear Methods for Model Reduction
Authors:
Andrea Bonito,
Albert Cohen,
Ronald DeVore,
Diane Guignard,
Peter Jantsch,
Guergana Petrova
Abstract:
The usual approach to model reduction for parametric partial differential equations (PDEs) is to construct a linear space $V_n$ which approximates well the solution manifold $\mathcal{M}$ consisting of all solutions $u(y)$ with $y$ the vector of parameters. This linear reduced model $V_n$ is then used for various tasks such as building an online forward solver for the PDE or estimating parameters from data observations. It is well understood in other problems of numerical computation that nonlinear methods such as adaptive approximation, $n$-term approximation, and certain tree-based methods may provide improved numerical efficiency. For model reduction, a nonlinear method would replace the linear space $V_n$ by a nonlinear space $Σ_n$. This idea has already been suggested in recent papers on model reduction where the parameter domain is decomposed into a finite number of cells and a linear space of low dimension is assigned to each cell.
Up to this point, little is known in terms of performance guarantees for such a nonlinear strategy. Moreover, most numerical experiments for nonlinear model reduction use a parameter dimension of only one or two. In this work, a step is made towards a more cohesive theory for nonlinear model reduction. Framing these methods in the general setting of library approximation allows us to give a first comparison of their performance with that of standard linear approximation for any general compact set. We then turn to the study of these methods for solution manifolds of parametrized elliptic PDEs. We study a very specific example of library approximation where the parameter domain is split into a finite number $N$ of rectangular cells and where different reduced affine spaces of dimension $m$ are assigned to each cell. The performance of this nonlinear procedure is analyzed from the viewpoint of accuracy of approximation versus $m$ and $N$.
Submitted 5 May, 2020;
originally announced May 2020.
-
Asymptotic optimality of the generalized $cμ$ rule under model uncertainty
Authors:
Asaf Cohen,
Subhamay Saha
Abstract:
We consider a critically-loaded multiclass queueing control problem with model uncertainty. The model consists of $I$ types of customers and a single server. At any time instant, a decision-maker (DM) allocates the server's effort to the customers. The DM's goal is to minimize a convex holding cost that accounts for the ambiguity with respect to the model, i.e., the arrival and service rates. For this, we consider an adversary player whose role is to choose the worst-case scenario. Specifically, we assume that the DM has a reference probability model in mind and that the cost function is formulated by the supremum over equivalent admissible probability measures to the reference measure with two components, the first is the expected holding cost, and the second one is a penalty for the adversary player for deviating from the reference model. The penalty term is formulated by a general divergence measure.
We show that, although the critically loaded condition might be violated under the equivalent admissible measures, the generalized $cμ$ rule is asymptotically optimal for this problem.
Submitted 29 March, 2021; v1 submitted 2 April, 2020;
originally announced April 2020.
-
Poissonian correlation of higher order differences
Authors:
Alex Cohen
Abstract:
A sequence $(x_n)_{n=1}^{\infty}$ on the torus $\mathbb{T}$ exhibits Poissonian pair correlation if for all $s\geq0$, \begin{equation*}
\lim_{N\to\infty} \frac{1}{N}\#\left\{1\leq m\neq n \leq N : |x_m-x_n| \leq \frac{s}{N}\right\} = 2s.
\end{equation*}
It is known that this condition implies equidistribution of $(x_n)$. We generalize this result to four-fold differences: if for all $s> 0$ we have
\begin{equation*}
\lim_{N\to\infty} \frac{1}{N^2}\#\left\{\substack{1\leq m,n,k,l\leq N\\\{m,n\}\neq\{k,l\}} : |x_m+x_n-x_k-x_l| \leq \frac{s}{N^2}\right\} = 2s
\end{equation*}
then $(x_n)_{n=1}^{\infty}$ is equidistributed. This notion generalizes to higher orders, and for any $k$ we show that a sequence exhibiting $2k$-fold Poissonian correlation is equidistributed. In the course of this investigation we obtain a discrepancy bound for a sequence in terms of its closeness to $2k$-fold Poissonian correlation. This result refines earlier bounds of Grepstad & Larcher and Steinerberger in the case of pair correlation, and resolves an open question of Steinerberger.
Submitted 12 December, 2020; v1 submitted 11 March, 2020;
originally announced March 2020.
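The pair correlation statistic defined in the abstract above can be evaluated numerically for a finite sequence. The following is a minimal sketch, assuming $|x_m - x_n|$ denotes distance on the torus $\mathbb{T} = \mathbb{R}/\mathbb{Z}$; the function names `torus_dist` and `pair_correlation` are hypothetical and chosen only for this illustration.

```python
import random


def torus_dist(a, b):
    """Distance between a and b on the torus R/Z (assumed meaning of |a - b|)."""
    d = abs(a - b) % 1.0
    return min(d, 1.0 - d)


def pair_correlation(xs, s):
    """Compute (1/N) * #{1 <= m != n <= N : |x_m - x_n| <= s/N} by brute force."""
    n = len(xs)
    threshold = s / n
    count = sum(
        1
        for i in range(n)
        for j in range(n)
        if i != j and torus_dist(xs[i], xs[j]) <= threshold
    )
    return count / n


# Usage: i.i.d. uniform points should give a value near 2s, while an
# equally spaced lattice does not, since its gaps are all exactly 1/N.
rng = random.Random(0)
uniform_points = [rng.random() for _ in range(1000)]
lattice_points = [i / 100 for i in range(100)]
uniform_value = pair_correlation(uniform_points, 1.0)
lattice_value = pair_correlation(lattice_points, 0.5)
```

The brute-force double loop costs $O(N^2)$ comparisons, which is adequate for a sanity check; sorting the points first would be the natural refinement for larger $N$.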
-
MLIR: A Compiler Infrastructure for the End of Moore's Law
Authors:
Chris Lattner,
Mehdi Amini,
Uday Bondhugula,
Albert Cohen,
Andy Davis,
Jacques Pienaar,
River Riddle,
Tatiana Shpeisman,
Nicolas Vasilache,
Oleksandr Zinenko
Abstract:
This work presents MLIR, a novel approach to building reusable and extensible compiler infrastructure. MLIR aims to address software fragmentation, improve compilation for heterogeneous hardware, significantly reduce the cost of building domain specific compilers, and aid in connecting existing compilers together. MLIR facilitates the design and implementation of code generators, translators and optimizers at different levels of abstraction and also across application domains, hardware targets and execution environments. The contribution of this work includes (1) discussion of MLIR as a research artifact, built for extension and evolution, and identifying the challenges and opportunities posed by this novel design point in design, semantics, optimization specification, system, and engineering; and (2) evaluation of MLIR as a generalized infrastructure that reduces the cost of building compilers, describing diverse use-cases to show research and educational opportunities for future programming languages, compilers, execution environments, and computer architecture. The paper also presents the rationale for MLIR, its original design principles, structures and semantics.
Submitted 29 February, 2020; v1 submitted 25 February, 2020;
originally announced February 2020.
-
Near-optimal Regret Bounds for Stochastic Shortest Path
Authors:
Alon Cohen,
Haim Kaplan,
Yishay Mansour,
Aviv Rosenberg
Abstract:
Stochastic shortest path (SSP) is a well-known problem in planning and control, in which an agent has to reach a goal state in minimum total expected cost. In the learning formulation of the problem, the agent is unaware of the environment dynamics (i.e., the transition function) and has to repeatedly play for a given number of episodes while reasoning about the problem's optimal solution. Unlike other well-studied models in reinforcement learning (RL), the length of an episode is not predetermined (or bounded) and is influenced by the agent's actions. Recently, Tarbouriech et al. (2019) studied this problem in the context of regret minimization and provided an algorithm whose regret bound is inversely proportional to the square root of the minimum instantaneous cost. In this work we remove this dependence on the minimum cost---we give an algorithm that guarantees a regret bound of $\widetilde{O}(B_\star |S| \sqrt{|A| K})$, where $B_\star$ is an upper bound on the expected cost of the optimal policy, $S$ is the set of states, $A$ is the set of actions and $K$ is the number of episodes. We additionally show that any learning algorithm must have at least $Ω(B_\star \sqrt{|S| |A| K})$ regret in the worst case.
Submitted 23 February, 2020;
originally announced February 2020.
-
Logarithmic Regret for Learning Linear Quadratic Regulators Efficiently
Authors:
Asaf Cassel,
Alon Cohen,
Tomer Koren
Abstract:
We consider the problem of learning in Linear Quadratic Control systems whose transition parameters are initially unknown. Recent results in this setting have demonstrated efficient learning algorithms with regret growing with the square root of the number of decision steps. We present new efficient algorithms that achieve, perhaps surprisingly, regret that scales only (poly)logarithmically with the number of steps in two scenarios: when only the state transition matrix $A$ is unknown, and when only the state-action transition matrix $B$ is unknown and the optimal policy satisfies a certain non-degeneracy condition. On the other hand, we give a lower bound that shows that when the latter condition is violated, square root regret is unavoidable.
Submitted 1 July, 2020; v1 submitted 19 February, 2020;
originally announced February 2020.