-
Improving Autoformalization using Type Checking
Authors:
Auguste Poiroux,
Gail Weiss,
Viktor Kunčak,
Antoine Bosselut
Abstract:
Large language models show promise for autoformalization, the task of automatically translating natural language into formal languages. However, current autoformalization methods remain limited. The last reported state-of-the-art performance on the ProofNet formalization benchmark for the Lean proof assistant, achieved using Codex for Lean 3, only showed successful formalization of 16.1% of inform…
▽ More
Large language models show promise for autoformalization, the task of automatically translating natural language into formal languages. However, current autoformalization methods remain limited. The last reported state-of-the-art performance on the ProofNet formalization benchmark for the Lean proof assistant, achieved using Codex for Lean 3, only showed successful formalization of 16.1% of informal statements. Similarly, our evaluation of GPT-4o for Lean 4 only produces successful translations 34.9% of the time. Our analysis shows that the performance of these models is largely limited by their inability to generate formal statements that successfully type-check (i.e., are syntactically correct and consistent with types) - with a whop** 86.6% of GPT-4o errors starting from a type-check failure. In this work, we propose a method to fix this issue through decoding with type-check filtering, where we initially sample a diverse set of candidate formalizations for an informal statement, then use the Lean proof assistant to filter out candidates that do not type-check. Using GPT-4o as a base model, and combining our method with self-consistency, we obtain a +18.3% absolute increase in formalization accuracy, and achieve a new state-of-the-art of 53.2% on ProofNet with Lean 4.
△ Less
Submitted 11 June, 2024;
originally announced June 2024.
-
A singular perturbation approach to the Dirichlet-area minimisation problem
Authors:
Anthony Salib,
Georg S. Weiss
Abstract:
We study both one and two-phase minimisers of the Dirichlet-area energy $$E(v) = \int_{B_1} \vert\nabla v\vert^2 + Per(\{v>0\},B_1).$$ In the two-phase case, we show that the energies $$E_{\varepsilon}(v) = \int_{B_1}\vert\nabla v\vert^2 + \frac{1}{\varepsilon}W\left(\frac{v}{\varepsilon^{1/2}}\right),$$ $Γ$-converge to $E$ as $\varepsilon \to 0$, where $W$ is the double well potential extended by…
▽ More
We study both one and two-phase minimisers of the Dirichlet-area energy $$E(v) = \int_{B_1} \vert\nabla v\vert^2 + Per(\{v>0\},B_1).$$ In the two-phase case, we show that the energies $$E_{\varepsilon}(v) = \int_{B_1}\vert\nabla v\vert^2 + \frac{1}{\varepsilon}W\left(\frac{v}{\varepsilon^{1/2}}\right),$$ $Γ$-converge to $E$ as $\varepsilon \to 0$, where $W$ is the double well potential extended by zero outside of $[-1,1]$ . As a consequence, we show that bounded local minimisers of $E_{\varepsilon}$ converge to a local minimiser of $E$.
△ Less
Submitted 26 June, 2024; v1 submitted 24 May, 2024;
originally announced May 2024.
-
Kee** Behavioral Programs Alive: Specifying and Executing Liveness Requirements
Authors:
Tom Yaacov,
Achiya Elyasaf,
Gera Weiss
Abstract:
One of the benefits of using executable specifications such as Behavioral Programming (BP) is the ability to align the system implementation with its requirements. This is facilitated in BP by a protocol that allows independent implementation modules that specify what the system may, must, and must not do. By that, each module can enforce a single system requirement, including negative specificati…
▽ More
One of the benefits of using executable specifications such as Behavioral Programming (BP) is the ability to align the system implementation with its requirements. This is facilitated in BP by a protocol that allows independent implementation modules that specify what the system may, must, and must not do. By that, each module can enforce a single system requirement, including negative specifications such as "don't do X after Y." The existing BP protocol, however, allows only the enforcement of safety requirements and does not support the execution of liveness properties such as "do X at least three times." To model liveness requirements in BP directly and independently, we propose idioms for tagging states with "must-finish," indicating that tasks are yet to be completed. We show that this idiom allows a direct specification of known requirements patterns from the literature. We also offer semantics and two execution mechanisms, one based on a translation to Büchi automata and the other based on a Markov decision process (MDP). The latter approach offers the possibility of utilizing deep reinforcement learning (DRL) algorithms, which bear the potential to handle large software systems effectively. This paper presents a qualitative and quantitative assessment of the proposed approach using a proof-of-concept tool. A formal analysis of the MDP-based execution mechanism is given in an appendix.
△ Less
Submitted 2 April, 2024;
originally announced April 2024.
-
Evolving Assembly Code in an Adversarial Environment
Authors:
Irina Maliukov,
Gera Weiss,
Oded Margalit,
Achiya Elyasaf
Abstract:
In this work, we evolve Assembly code for the CodeGuru competition. The goal is to create a survivor -- an Assembly program that runs the longest in shared memory, by resisting attacks from adversary survivors and finding their weaknesses. For evolving top-notch solvers, we specify a Backus Normal Form (BNF) for the Assembly language and synthesize the code from scratch using Genetic Programming (…
▽ More
In this work, we evolve Assembly code for the CodeGuru competition. The goal is to create a survivor -- an Assembly program that runs the longest in shared memory, by resisting attacks from adversary survivors and finding their weaknesses. For evolving top-notch solvers, we specify a Backus Normal Form (BNF) for the Assembly language and synthesize the code from scratch using Genetic Programming (GP). We evaluate the survivors by running CodeGuru games against human-written winning survivors. Our evolved programs found weaknesses in the programs they were trained against and utilized them. To push evolution further, we implemented memetic operators that utilize machine learning to explore the solution space effectively. This work has important applications for cyber-security as we utilize evolution to detect weaknesses in survivors. The Assembly BNF is domain-independent; thus, by modifying the fitness function, it can detect code weaknesses and help fix them. Finally, the CodeGuru competition offers a novel platform for analyzing GP and code evolution in adversarial environments. To support further research in this direction, we provide a thorough qualitative analysis of the evolved survivors and the weaknesses found.
△ Less
Submitted 10 June, 2024; v1 submitted 28 March, 2024;
originally announced March 2024.
-
Classification of global solutions to the obstacle problem in the plane
Authors:
Anthony Salib,
Georg Weiss
Abstract:
Global solutions to the obstacle problem were first completely classified in two dimensions by Sakai using complex analysis techniques. Although the complex analysis approach produced a very succinct proof in two dimensions, it left the higher dimensional cases, and even closely related problems in two dimensions, unresolved. A complete classification in dimensions $n\geq 3$ was recently given by…
▽ More
Global solutions to the obstacle problem were first completely classified in two dimensions by Sakai using complex analysis techniques. Although the complex analysis approach produced a very succinct proof in two dimensions, it left the higher dimensional cases, and even closely related problems in two dimensions, unresolved. A complete classification in dimensions $n\geq 3$ was recently given by Eberle, Figalli and Weiss, forty years after Sakai published his proof. In this paper we give a proof of Sakai's classification result for unbounded coincidence sets in the spirit of the recent proof by Eberle, Figalli and Weiss. Our approach, in particular, avoids the need for complex analysis techniques and offers new perspectives on two-dimensional problems that complex analysis cannot address.
△ Less
Submitted 27 March, 2024;
originally announced March 2024.
-
What Formal Languages Can Transformers Express? A Survey
Authors:
Lena Strobl,
William Merrill,
Gail Weiss,
David Chiang,
Dana Angluin
Abstract:
As transformers have gained prominence in natural language processing, some researchers have investigated theoretically what problems they can and cannot solve, by treating problems as formal languages. Exploring such questions can help clarify the power of transformers relative to other models of computation, their fundamental capabilities and limits, and the impact of architectural choices. Work…
▽ More
As transformers have gained prominence in natural language processing, some researchers have investigated theoretically what problems they can and cannot solve, by treating problems as formal languages. Exploring such questions can help clarify the power of transformers relative to other models of computation, their fundamental capabilities and limits, and the impact of architectural choices. Work in this subarea has made considerable progress in recent years. Here, we undertake a comprehensive survey of this work, documenting the diverse assumptions that underlie different results and providing a unified framework for harmonizing seemingly contradictory findings.
△ Less
Submitted 6 May, 2024; v1 submitted 31 October, 2023;
originally announced November 2023.
-
Discovering Knowledge-Critical Subnetworks in Pretrained Language Models
Authors:
Deniz Bayazit,
Negar Foroutan,
Zeming Chen,
Gail Weiss,
Antoine Bosselut
Abstract:
Pretrained language models (LMs) encode implicit representations of knowledge in their parameters. However, localizing these representations and disentangling them from each other remains an open problem. In this work, we investigate whether pretrained language models contain various knowledge-critical subnetworks: particular sparse computational subgraphs responsible for encoding specific knowled…
▽ More
Pretrained language models (LMs) encode implicit representations of knowledge in their parameters. However, localizing these representations and disentangling them from each other remains an open problem. In this work, we investigate whether pretrained language models contain various knowledge-critical subnetworks: particular sparse computational subgraphs responsible for encoding specific knowledge the model has memorized. We propose a multi-objective differentiable weight masking scheme to discover these subnetworks and show that we can use them to precisely remove specific knowledge from models while minimizing adverse effects on the behavior of the original language model. We demonstrate our method on multiple GPT2 variants, uncovering highly sparse subnetworks (98%+) that are solely responsible for specific collections of relational knowledge. When these subnetworks are removed, the remaining network maintains most of its initial capacity (modeling language and other memorized relational knowledge) but struggles to express the removed knowledge, and suffers performance drops on examples needing this removed knowledge on downstream tasks after finetuning.
△ Less
Submitted 4 October, 2023;
originally announced October 2023.
-
Provengo: A Tool Suite for Scenario Driven Model-Based Testing
Authors:
Michael Bar-Sinai,
Achiya Elyasaf,
Gera Weiss,
Yeshayahu Weiss
Abstract:
We present Provengo, a comprehensive suite of tools designed to facilitate the implementation of Scenario-Driven Model-Based Testing (SDMBT), an innovative approach that utilizes scenarios to construct a model encompassing the user's perspective and the system's business value while also defining the desired outcomes. With the assistance of Provengo, testers gain the ability to effortlessly create…
▽ More
We present Provengo, a comprehensive suite of tools designed to facilitate the implementation of Scenario-Driven Model-Based Testing (SDMBT), an innovative approach that utilizes scenarios to construct a model encompassing the user's perspective and the system's business value while also defining the desired outcomes. With the assistance of Provengo, testers gain the ability to effortlessly create natural user stories and seamlessly integrate them into a model capable of generating effective tests. The demonstration illustrates how SDMBT effectively addresses the bootstrap** challenge commonly encountered in model-based testing (MBT) by enabling incremental development, starting from simple models and gradually augmenting them with additional stories.
△ Less
Submitted 30 August, 2023;
originally announced August 2023.
-
Rectifiability, finite Hausdorff measure, and compactness for non-minimizing Bernoulli free boundaries
Authors:
Dennis Kriventsov,
Georg S. Weiss
Abstract:
While there are numerous results on minimizers or stable solutions of the Bernoulli problem proving regularity of the free boundary and analyzing singularities, much less in known about critical points of the corresponding energy. Saddle points of the energy (or of closely related energies) and solutions of the corresponding time-dependent problem occur naturally in applied problems such as water…
▽ More
While there are numerous results on minimizers or stable solutions of the Bernoulli problem proving regularity of the free boundary and analyzing singularities, much less in known about critical points of the corresponding energy. Saddle points of the energy (or of closely related energies) and solutions of the corresponding time-dependent problem occur naturally in applied problems such as water waves and combustion theory.
For such critical points $u$ -- which can be obtained as limits of classical solutions or limits of a singular perturbation problem -- it has been open since [Weiss03] whether the singular set can be large and what equation the measure $Δu$ satisfies, except for the case of two dimensions. In the present result we use recent techniques such as a frequency formula for the Bernoulli problem as well as the celebrated Naber-Valtorta procedure to answer this more than 20 year old question in an affirmative way:
For a closed class we call variational solutions of the Bernoulli problem, we show that the topological free boundary $\partial \{u > 0\}$ (including degenerate singular points $x$, at which $u(x + r \cdot)/r \rightarrow 0$ as $r\rightarrow 0$) is countably $\mathcal{H}^{n-1}$-rectifiable and has locally finite $\mathcal{H}^{n-1}$-measure, and we identify the measure $Δu$ completely. This gives a more precise characterization of the free boundary of $u$ in arbitrary dimension than was previously available even in dimension two.
We also show that limits of (not necessarily minimizing) classical solutions as well as limits of critical points of a singularly perturbed energy are variational solutions, so that the result above applies directly to all of them.
△ Less
Submitted 16 June, 2023;
originally announced June 2023.
-
RECKONING: Reasoning through Dynamic Knowledge Encoding
Authors:
Zeming Chen,
Gail Weiss,
Eric Mitchell,
Asli Celikyilmaz,
Antoine Bosselut
Abstract:
Recent studies on transformer-based language models show that they can answer questions by reasoning over knowledge provided as part of the context (i.e., in-context reasoning). However, since the available knowledge is often not filtered for a particular question, in-context reasoning can be sensitive to distractor facts, additional content that is irrelevant to a question but that may be relevan…
▽ More
Recent studies on transformer-based language models show that they can answer questions by reasoning over knowledge provided as part of the context (i.e., in-context reasoning). However, since the available knowledge is often not filtered for a particular question, in-context reasoning can be sensitive to distractor facts, additional content that is irrelevant to a question but that may be relevant for a different question (i.e., not necessarily random noise). In these situations, the model fails to distinguish the knowledge that is necessary to answer the question, leading to spurious reasoning and degraded performance. This reasoning failure contrasts with the model's apparent ability to distinguish its contextual knowledge from all the knowledge it has memorized during pre-training. Following this observation, we propose teaching the model to reason more robustly by folding the provided contextual knowledge into the model's parameters before presenting it with a question. Our method, RECKONING, is a bi-level learning algorithm that teaches language models to reason by updating their parametric knowledge through back-propagation, allowing them to then answer questions using the updated parameters. During training, the inner loop rapidly adapts a copy of the model weights to encode contextual knowledge into its parameters. In the outer loop, the model learns to use the updated weights to reproduce and answer reasoning questions about the memorized knowledge. Our experiments on two multi-hop reasoning datasets show that RECKONING's performance improves over the in-context reasoning baseline (by up to 4.5%). We also find that compared to in-context reasoning, RECKONING generalizes better to longer reasoning chains unseen during training, is more robust to distractors in the context, and is more computationally efficient when multiple questions are asked about the same knowledge.
△ Less
Submitted 5 November, 2023; v1 submitted 10 May, 2023;
originally announced May 2023.
-
Automatic selfadjoint-ideal semigroups for finite matrices
Authors:
Sasmita Patnaik,
Sanehlata,
Gary Weiss
Abstract:
The notion of automatic selfadjointness of all ideals in a multiplicative semigroup of the bounded linear operators on a separable Hilbert space B(H) arose in a 2015 discussion with Heydar Radjavi who pointed out that B(H) and the finite rank operators F(H) possessed this unitary invariant property which category we named SI semigroups (for automatic selfadjoint ideal semigroups). Equivalent to th…
▽ More
The notion of automatic selfadjointness of all ideals in a multiplicative semigroup of the bounded linear operators on a separable Hilbert space B(H) arose in a 2015 discussion with Heydar Radjavi who pointed out that B(H) and the finite rank operators F(H) possessed this unitary invariant property which category we named SI semigroups (for automatic selfadjoint ideal semigroups). Equivalent to the SI property is the solvability, for each A in the semigroup, of the bilinear operator equation A^* = XAY which we believe is a new connection relating the semigroup theory with the theory of operator equations. We found in our earlier works in the subject that even at the basic level of singly generated semigroups, the investigation of SI semigroups led to interesting algebraic and analytic phenomena when generated by rank one operators, normal operators, partial and power partial isometries, subnormal-hyponormal-essentially normal operators, and weighted shift operators; and generated by commuting families of normal operators. In this paper, we focus on a separate M_n(C) treatment for singly generated SI semigroups that requires studying the solvability of the bilinear matrix equation A^* = XAY in a multiplicative semigroup of finite matrices. This separate focus is needed because the techniques employed in our earlier works we could not adapt to finite matrices. In this paper we find that for certain classes of generators, being a partial isometry is equivalent to generating an SI semigroup. Such classes are: degree 2 nilpotent matrices, weighted shifts, and non-normal Jordan matrices. For the key tools used to establish these equivalences, we developed a number of necessary conditions for singly generated semigroups to be SI for the very general classes: nonselfadjoint matrices, nonzero nilpotent matrices, nonselfadjoint invertible matrices, and Jordan blocks.
△ Less
Submitted 25 April, 2023;
originally announced April 2023.
-
Singly generated Selfadjoint-Ideal operator semigroups: spectral density of the generator and simplicity
Authors:
Sasmita Patnaik,
Sanehlata,
Gary Weiss
Abstract:
This extends our new study of the automatic selfadjoint ideal property for B(H)-operator semigroups introduced to us by Heydar Radjavi (SI semigroups for short). Our investigation here of singly generated SI semigroups led to unexpected algebraic and analytic phenomena on the simplicity of SI semigroups and on the spectral density of their generators. In particular: the SI property yields for a hy…
▽ More
This extends our new study of the automatic selfadjoint ideal property for B(H)-operator semigroups introduced to us by Heydar Radjavi (SI semigroups for short). Our investigation here of singly generated SI semigroups led to unexpected algebraic and analytic phenomena on the simplicity of SI semigroups and on the spectral density of their generators. In particular: the SI property yields for a hyponormal operator, zero planar area measure of its approximate point spectrum; the same for the essential spectrum of an essentially normal operator; and that SI semigroups generated by unilateral weighted shifts with periodic nonzero weights are simple. We also characterized the simplicity of the SI semigroups generated by certain commuting classes of normal operators.
△ Less
Submitted 25 April, 2023;
originally announced April 2023.
-
Optimization of Cartesian Tasks with Configuration Selection
Authors:
Martin G. Weiß
Abstract:
A basic task in the design of an industrial robot application is the relative placement of robot and workpiece. Process points are defined in Cartesian coordinates relative to the workpiece coordinate system, and the workpiece has to be located such that the robot can reach all points. Finding such a location is still an iterative procedure based on the developers' intuition. One difficulty is the…
▽ More
A basic task in the design of an industrial robot application is the relative placement of robot and workpiece. Process points are defined in Cartesian coordinates relative to the workpiece coordinate system, and the workpiece has to be located such that the robot can reach all points. Finding such a location is still an iterative procedure based on the developers' intuition. One difficulty is the choice of one of the several solutions of the backward transform of a typical 6R robot. % combined with the limited range of the axes. We present a novel algorithm that simultaneously optimizes the workpiece location and the robot configuration at all process points using higher order optimization algorithms. A key ingredient is the extension of the robot with a virtual prismatic axis. The practical feasibility of the approach is shown with an example using a commercial industrial robot.
△ Less
Submitted 18 February, 2023;
originally announced February 2023.
-
Non-Convergence and Limit Cycles in the Adam optimizer
Authors:
Sebastian Bock,
Martin Georg Weiß
Abstract:
One of the most popular training algorithms for deep neural networks is the Adaptive Moment Estimation (Adam) introduced by Kingma and Ba. Despite its success in many applications there is no satisfactory convergence analysis: only local convergence can be shown for batch mode under some restrictions on the hyperparameters, counterexamples exist for incremental mode. Recent results show that for s…
▽ More
One of the most popular training algorithms for deep neural networks is the Adaptive Moment Estimation (Adam) introduced by Kingma and Ba. Despite its success in many applications there is no satisfactory convergence analysis: only local convergence can be shown for batch mode under some restrictions on the hyperparameters, counterexamples exist for incremental mode. Recent results show that for simple quadratic objective functions limit cycles of period 2 exist in batch mode, but only for atypical hyperparameters, and only for the algorithm without bias correction. %More general there are several more adaptive gradient methods which try to estimate a fitting learning rate and / or search direction from the training data to improve the learning process compared to pure gradient descent with fixed learningrate. We extend the convergence analysis for Adam in the batch mode with bias correction and show that even for quadratic objective functions as the simplest case of convex functions 2-limit-cycles exist, for all choices of the hyperparameters. We analyze the stability of these limit cycles and relate our analysis to other results where approximate convergence was shown, but under the additional assumption of bounded gradients which does not apply to quadratic functions. The investigation heavily relies on the use of computer algebra due to the complexity of the equations.
△ Less
Submitted 5 October, 2022;
originally announced October 2022.
-
On Complex Analytic tools, and the Holomorphic Rotation methods
Authors:
Ronald R. Coifman,
Jacques Peyrière,
Guido Weiss
Abstract:
We describe recent nonlinear analytic approximation tools in the classical setting of Hardy spaces in the upper half plane and show how to transfer them to the higher dimensional real setting of harmonic functions in upper half spaces. It is known [6] that all harmonic functions in higher dimensions are combinations of holomorphic functions on 2 dimensional planes, extended as, constant in normal…
▽ More
We describe recent nonlinear analytic approximation tools in the classical setting of Hardy spaces in the upper half plane and show how to transfer them to the higher dimensional real setting of harmonic functions in upper half spaces. It is known [6] that all harmonic functions in higher dimensions are combinations of holomorphic functions on 2 dimensional planes, extended as, constant in normal directions. We derive representation theorems, with corresponding isometries, opening the door for applications in higher dimensions, to the processing of highly oscillatory multidimensional signals.
△ Less
Submitted 4 October, 2022;
originally announced October 2022.
-
Comparing and quantifying tail dependence
Authors:
Karl Friedrich Siburg,
Christopher Strothmann,
Gregor Weiß
Abstract:
We introduce a new stochastic order for the tail dependence between random variables. We then study different measures of tail dependence which are monotone in the proposed order, thereby extending various known tail dependence coefficients from the literature. We apply our concepts in an empirical study where we investigate the tail dependence for different pairs of S&P 500 stocks and indices, an…
▽ More
We introduce a new stochastic order for the tail dependence between random variables. We then study different measures of tail dependence which are monotone in the proposed order, thereby extending various known tail dependence coefficients from the literature. We apply our concepts in an empirical study where we investigate the tail dependence for different pairs of S&P 500 stocks and indices, and illustrate the advantage of our measures of tail dependence over the classical tail dependence coefficient.
△ Less
Submitted 22 August, 2022;
originally announced August 2022.
-
Complete classification of global solutions to the obstacle problem
Authors:
Simon Eberle,
Alessio Figalli,
Georg S. Weiss
Abstract:
The characterization of global solutions to the obstacle problems in $\mathbb{R}^N$, or equivalently of null quadrature domains, has been studied over more than 90 years. In this paper we give a conclusive answer to this problem by proving the following long-standing conjecture: The coincidence set of a global solution to the obstacle problem is either a half-space, an ellipsoid, a paraboloid, or…
▽ More
The characterization of global solutions to the obstacle problems in $\mathbb{R}^N$, or equivalently of null quadrature domains, has been studied over more than 90 years. In this paper we give a conclusive answer to this problem by proving the following long-standing conjecture: The coincidence set of a global solution to the obstacle problem is either a half-space, an ellipsoid, a paraboloid, or a cylinder with an ellipsoid or a paraboloid as base.
△ Less
Submitted 19 August, 2022; v1 submitted 5 August, 2022;
originally announced August 2022.
-
PI control of stable nonlinear plants using projected dynamical systems
Authors:
Pietro Lorenzetti,
George Weiss
Abstract:
This paper presents a novel anti-windup proportional-integral controller for stable multi-input multi-output nonlinear plants. We use tools from projected dynamical systems theory to force the integrator state to remain in a desired (compact and convex) region, such that the plant input steady-state values satisfy the operational constraints of the problem. Under suitable monotonicity assumptions…
▽ More
This paper presents a novel anti-windup proportional-integral controller for stable multi-input multi-output nonlinear plants. We use tools from projected dynamical systems theory to force the integrator state to remain in a desired (compact and convex) region, such that the plant input steady-state values satisfy the operational constraints of the problem. Under suitable monotonicity assumptions on the plant steady-state input-output map, we use singular perturbation theory results to prove the existence of a sufficiently small controller gain ensuring closed-loop (local) exponential stability and reference tracking for a feasible set of constant references. We suggest a particular controller design, which embeds (when possible) the right inverse of the plant steady-state input-output map. The relevance of the proposed controller scheme is validated through an application in the power systems domain, namely, the output (active and reactive) power regulation for a grid-connected synchronverter.
△ Less
Submitted 18 July, 2022;
originally announced July 2022.
-
Efficient One Sided Kolmogorov Approximation
Authors:
Liat Cohen,
Tal Grinshpoun,
Gera Weiss
Abstract:
We present an efficient algorithm that, given a discrete random variable $X$ and a number $m$, computes a random variable whose support is of size at most $m$ and whose Kolmogorov distance from $X$ is minimal, also for the one-sided Kolmogorov approximation. We present some variants of the algorithm, analyse their correctness and computational complexity, and present a detailed empirical evaluatio…
▽ More
We present an efficient algorithm that, given a discrete random variable $X$ and a number $m$, computes a random variable whose support is of size at most $m$ and whose Kolmogorov distance from $X$ is minimal, also for the one-sided Kolmogorov approximation. We present some variants of the algorithm, analyse their correctness and computational complexity, and present a detailed empirical evaluation that shows how they performs in practice. The main application that we examine, which is our motivation for this work, is estimation of the probability missing deadlines in series-parallel schedules. Since exact computation of these probabilities is NP-hard, we propose to use the algorithms described in this paper to obtain an approximation.
△ Less
Submitted 14 July, 2022;
originally announced July 2022.
-
Normalized gradient flow optimization in the training of ReLU artificial neural networks
Authors:
Simon Eberle,
Arnulf Jentzen,
Adrian Riekert,
Georg Weiss
Abstract:
The training of artificial neural networks (ANNs) is nowadays a highly relevant algorithmic procedure with many applications in science and industry. Roughly speaking, ANNs can be regarded as iterated compositions between affine linear functions and certain fixed nonlinear functions, which are usually multidimensional versions of a one-dimensional so-called activation function. The most popular ch…
▽ More
The training of artificial neural networks (ANNs) is nowadays a highly relevant algorithmic procedure with many applications in science and industry. Roughly speaking, ANNs can be regarded as iterated compositions between affine linear functions and certain fixed nonlinear functions, which are usually multidimensional versions of a one-dimensional so-called activation function. The most popular choice of such a one-dimensional activation function is the rectified linear unit (ReLU) activation function which maps a real number to its positive part $ \mathbb{R} \ni x \mapsto \max\{ x, 0 \} \in \mathbb{R} $. In this article we propose and analyze a modified variant of the standard training procedure of such ReLU ANNs in the sense that we propose to restrict the negative gradient flow dynamics to a large submanifold of the ANN parameter space, which is a strict $ C^{ \infty } $-submanifold of the entire ANN parameter space that seems to enjoy better regularity properties than the entire ANN parameter space but which is also sufficiently large and sufficiently high dimensional so that it can represent all ANN realization functions that can be represented through the entire ANN parameter space. In the special situation of shallow ANNs with just one-dimensional ANN layers we also prove for every Lipschitz continuous target function that every gradient flow trajectory on this large submanifold of the ANN parameter space is globally bounded. For the standard gradient flow on the entire ANN parameter space with Lipschitz continuous target functions it remains an open problem of research to prove or disprove the global boundedness of gradient flow trajectories even in the situation of shallow ANNs with just one-dimensional ANN layers.
△ Less
Submitted 13 July, 2022;
originally announced July 2022.
-
What Petri Net Obliges Us to Say: Comparing Approaches for Behavior Composition
Authors:
Achiya Elyasaf,
Tom Yaacov,
Gera Weiss
Abstract:
We identify and demonstrate a weakness of Petri Nets (PN) in specifying composite behavior of reactive systems. Specifically, we show how, when specifying multiple requirements in one PN model, modelers are obliged to specify mechanisms for combining these requirements. This yields, in many cases, over-specification and incorrect models. We demonstrate how some execution paths are missed, and some…
▽ More
We identify and demonstrate a weakness of Petri Nets (PN) in specifying composite behavior of reactive systems. Specifically, we show how, when specifying multiple requirements in one PN model, modelers are obliged to specify mechanisms for combining these requirements. This yields, in many cases, over-specification and incorrect models. We demonstrate how some execution paths are missed, and some are generated unintentionally. To support this claim, we analyze PN models from the literature, identify the combination mechanisms, and demonstrate their effect on the correctness of the model. To address this problem, we propose to model the system behavior using behavioral programming (BP), a software development and modeling paradigm designed for seamless integration of independent requirements. Specifically, we demonstrate how the semantics of BP, which define how to interweave scenarios into a single model, allow avoiding the over-specification. Additionally, while BP maintains the same mathematical properties as PN, it provides means for changing the model dynamically, thus increasing the agility of the specification. We compare BP and PN in quantitative and qualitative measures by analyzing the models, their generated execution paths, and the specification process. Finally, while BP is supported by tools that allow for applying formal methods and reasoning techniques to the model, it lacks the legacy of PN tools and algorithms. To address this issue, we propose semantics and a tool for translating BP models to PN and vice versa.
△ Less
Submitted 19 April, 2023; v1 submitted 30 April, 2022;
originally announced May 2022.
-
The structure of the regular part of the free boundary close to singularities in the obstacle problem
Authors:
Simon Eberle,
Henrik Shahgholian,
Georg Sebastian Weiss
Abstract:
We prove the -- to the best knowledge of the authors -- first result on the fine asymptotic behavior of the regular part of the free boundary of the obstacle problem close to singularities. The result is motivated by our recent partial answer to a long standing conjecture and the first partial classification of global solutions of the obstacle problem with unbounded coincidence sets in higher dime…
▽ More
We prove the -- to the best knowledge of the authors -- first result on the fine asymptotic behavior of the regular part of the free boundary of the obstacle problem close to singularities. The result is motivated by our recent partial answer to a long standing conjecture and the first partial classification of global solutions of the obstacle problem with unbounded coincidence sets in higher dimensions.
△ Less
Submitted 17 October, 2023; v1 submitted 7 April, 2022;
originally announced April 2022.
-
On commutators of compact operators via block tridiagonalization: generalizations and limitations of Anderson's approach
Authors:
Jireh Loreaux,
Sasmita Patnaik,
Srdjan Petrovic,
Gary Weiss
Abstract:
We offer a new perspective and some advances on the 1971 Pearcy--Top** problem: Is every compact operator a commutator of compact operators? Our goal is to analyze and generalize the 1970's work in this area of Joel Anderson combined with the work of the last named author of this paper. We reduce the general problem to a simpler sequence of finite matrix equations with norm constraints, while at…
▽ More
We offer a new perspective and some advances on the 1971 Pearcy--Top** problem: Is every compact operator a commutator of compact operators? Our goal is to analyze and generalize the 1970's work in this area of Joel Anderson combined with the work of the last named author of this paper. We reduce the general problem to a simpler sequence of finite matrix equations with norm constraints, while at the same time develo** strategies for counterexamples. Our approach is to ask which compact operators $T$ are commutators $AB-BA$ of compact operators $A,B$; and to analyze the implications of Joel Anderson's contributions to this problem, which will yield a generalization of his method. By extending the techniques of Anderson [1] we obtain new classes of operators that are commutators of compact operators beyond those obtained in [17] and [2]. And by employing the techniques of the last named author [22], we found obstructions to extending Anderson's techniques in terms of certain constraints for $T$, with special focus on when $T$ is a strictly positive compact diagonal operator. Some of these constraints involve general universal block tridiagonal matrix forms for operators, and some involve $\mathcal{B(H)}$-ideal constraints. And in terms of these matrix forms, we give some equivalences, some sufficient conditions and some necessary conditions for this Pearcy--Top** problem and its various offshoots to hold true. These matrix forms are a sparsification of matrix representations of an operator (an increase in the proportion of zeros in its corners by a change of basis) and we measure the support density of these forms. And finally we provide some necessary conditions for the Pearcy--Top** problem involving singular numbers and $\mathcal{B(H)}$-ideal constraints.
△ Less
Submitted 22 March, 2022;
originally announced March 2022.
-
The Normalized Edit Distance with Uniform Operation Costs is a Metric
Authors:
Dana Fisman,
Joshua Grogin,
Oded Margalit,
Gera Weiss
Abstract:
We prove that the normalized edit distance proposed in [Marzal and Vidal 1993] is a metric when the cost of all the edit operations are the same. This closes a long standing gap in the literature where several authors noted that this distance does not satisfy the triangle inequality in the general case, and that it was not known whether it is satisfied in the uniform case where all the edit costs…
▽ More
We prove that the normalized edit distance proposed in [Marzal and Vidal 1993] is a metric when the cost of all the edit operations are the same. This closes a long standing gap in the literature where several authors noted that this distance does not satisfy the triangle inequality in the general case, and that it was not known whether it is satisfied in the uniform case where all the edit costs are equal. We compare this metric to two normalized metrics proposed as alternatives in the literature, when people thought that Marzal's and Vidal's distance is not a metric, and identify key properties that explain why the original distance, now known to also be a metric, is better for some applications. Our examination is from a point of view of formal verification, but the properties and their significance are stated in an application agnostic way.
△ Less
Submitted 23 April, 2022; v1 submitted 16 January, 2022;
originally announced January 2022.
-
Generalized Coverage Criteria for Combinatorial Sequence Testing
Authors:
Achiya Elyasaf,
Eitan Farchi,
Oded Margalit,
Gera Weiss,
Yeshayahu Weiss
Abstract:
We present a new model-based approach for testing systems that use sequences of actions and assertions as test vectors. Our solution includes a method for quantifying testing quality, a tool for generating high-quality test suites based on the coverage criteria we propose, and a framework for assessing risks. For testing quality, we propose a method that specifies generalized coverage criteria ove…
▽ More
We present a new model-based approach for testing systems that use sequences of actions and assertions as test vectors. Our solution includes a method for quantifying testing quality, a tool for generating high-quality test suites based on the coverage criteria we propose, and a framework for assessing risks. For testing quality, we propose a method that specifies generalized coverage criteria over sequences of actions, which extends previous approaches. Our publicly available tool demonstrates how to extract effective test suites from test plans based on these criteria. We also present a Bayesian approach for measuring the probabilities of bugs or risks, and show how this quantification can help achieve an informed balance between exploitation and exploration in testing. Finally, we provide an empirical evaluation demonstrating the effectiveness of our tool in finding bugs, assessing risks, and achieving coverage.
△ Less
Submitted 31 October, 2023; v1 submitted 3 January, 2022;
originally announced January 2022.
-
Strong stabilization of (almost) impedance passive systems by static output feedback
Authors:
Ruth Curtain,
George Weiss
Abstract:
The plant to be stabilized is a system node $Σ$ with generating triple $(A,B,C)$ and transfer function $\bf G$, where $A$ generates a contraction semigroup on the Hilbert space $X$. The control and observation operators $B$ and $C$ may be unbounded and they are not assumed to be admissible. The crucial assumption is that there exists a bounded operator $E$ such that, if we replace ${\bf G}(s)$ by…
▽ More
The plant to be stabilized is a system node $Σ$ with generating triple $(A,B,C)$ and transfer function $\bf G$, where $A$ generates a contraction semigroup on the Hilbert space $X$. The control and observation operators $B$ and $C$ may be unbounded and they are not assumed to be admissible. The crucial assumption is that there exists a bounded operator $E$ such that, if we replace ${\bf G}(s)$ by ${\bf G}(s)+E$, the new system $Σ_E$ becomes impedance passive. An easier case is when $\bf G$ is already impedance passive and a special case is when \mm $Σ$ has colocated sensors and actuators. Such systems include many wave, beam and heat equations with sensors and actuators on the boundary. It has been shown for many particular cases that the feedback $u=-κy+v$, where $u$ is the input of the plant and $κ>0$, stabilizes $Σ$, strongly or even exponentially. Here, $y$ is the output of \m $Σ$ and $v$ is the new input. Our main result is that if for some $E\in{\mathcal L}(U)$, $Σ_E$ is impedance passive, and \m $Σ$ is approximately observable or approximately controllable in infinite time, then for sufficiently small $κ$ the closed-loop system is weakly stable. If, moreover, $σ(A)\cap i{\mathbb R}$ is countable, then the closed-loop semigroup and its dual are both strongly stable.
△ Less
Submitted 15 December, 2021;
originally announced December 2021.
-
Marginals Versus Copulas: Which Account For More Model Risk In Multivariate Risk Forecasting?
Authors:
Simon Fritzsch,
Maike Timphus,
Gregor Weiss
Abstract:
Copulas. We study the model risk of multivariate risk models in a comprehensive empirical study on Copula-GARCH models used for forecasting Value-at-Risk and Expected Shortfall. To determine whether model risk inherent in the forecasting of portfolio risk is caused by the candidate marginal or copula models, we analyze different groups of models in which we fix either the marginals, the copula, or…
▽ More
Copulas. We study the model risk of multivariate risk models in a comprehensive empirical study on Copula-GARCH models used for forecasting Value-at-Risk and Expected Shortfall. To determine whether model risk inherent in the forecasting of portfolio risk is caused by the candidate marginal or copula models, we analyze different groups of models in which we fix either the marginals, the copula, or neither. Model risk is economically significant, is especially high during periods of crisis, and is almost completely due to the choice of the copula. We then propose the use of the model confidence set procedure to narrow down the set of available models and reduce model risk for Copula-GARCH risk models. Our proposed approach leads to a significant improvement in the mean absolute deviation of one day ahead forecasts by our various candidate risk models.
△ Less
Submitted 22 September, 2021;
originally announced September 2021.
-
Existence, uniqueness, and convergence rates for gradient flows in the training of artificial neural networks with ReLU activation
Authors:
Simon Eberle,
Arnulf Jentzen,
Adrian Riekert,
Georg S. Weiss
Abstract:
The training of artificial neural networks (ANNs) with rectified linear unit (ReLU) activation via gradient descent (GD) type optimization schemes is nowadays a common industrially relevant procedure. Till this day in the scientific literature there is in general no mathematical convergence analysis which explains the numerical success of GD type optimization schemes in the training of ANNs with R…
▽ More
The training of artificial neural networks (ANNs) with rectified linear unit (ReLU) activation via gradient descent (GD) type optimization schemes is nowadays a common industrially relevant procedure. Till this day in the scientific literature there is in general no mathematical convergence analysis which explains the numerical success of GD type optimization schemes in the training of ANNs with ReLU activation. GD type optimization schemes can be regarded as temporal discretization methods for the gradient flow (GF) differential equations associated to the considered optimization problem and, in view of this, it seems to be a natural direction of research to first aim to develop a mathematical convergence theory for time-continuous GF differential equations and, thereafter, to aim to extend such a time-continuous convergence theory to implementable time-discrete GD type optimization methods. In this article we establish two basic results for GF differential equations in the training of fully-connected feedforward ANNs with one hidden layer and ReLU activation. In the first main result of this article we establish in the training of such ANNs under the assumption that the probability distribution of the input data of the considered supervised learning problem is absolutely continuous with a bounded density function that every GF differential equation admits for every initial value a solution which is also unique among a suitable class of solutions. In the second main result of this article we prove in the training of such ANNs under the assumption that the target function and the density function of the probability distribution of the input data are piecewise polynomial that every non-divergent GF trajectory converges with an appropriate rate of convergence to a critical point and that the risk of the non-divergent GF trajectory converges with rate 1 to the risk of the critical point.
△ Less
Submitted 18 August, 2021;
originally announced August 2021.
-
Thinking Like Transformers
Authors:
Gail Weiss,
Yoav Goldberg,
Eran Yahav
Abstract:
What is the computational model behind a Transformer? Where recurrent neural networks have direct parallels in finite state machines, allowing clear discussion and thought around architecture variants or trained models, Transformers have no such familiar parallel. In this paper we aim to change that, proposing a computational model for the transformer-encoder in the form of a programming language.…
▽ More
What is the computational model behind a Transformer? Where recurrent neural networks have direct parallels in finite state machines, allowing clear discussion and thought around architecture variants or trained models, Transformers have no such familiar parallel. In this paper we aim to change that, proposing a computational model for the transformer-encoder in the form of a programming language. We map the basic components of a transformer-encoder -- attention and feed-forward computation -- into simple primitives, around which we form a programming language: the Restricted Access Sequence Processing Language (RASP). We show how RASP can be used to program solutions to tasks that could conceivably be learned by a Transformer, and how a Transformer can be trained to mimic a RASP solution. In particular, we provide RASP programs for histograms, sorting, and Dyck-languages. We further use our model to relate their difficulty in terms of the number of required layers and attention heads: analyzing a RASP program implies a maximum number of heads and layers necessary to encode a task in a transformer. Finally, we see how insights gained from our abstraction might be used to explain phenomena seen in recent works.
△ Less
Submitted 19 July, 2021; v1 submitted 13 June, 2021;
originally announced June 2021.
-
Regularity of the free boundary for a parabolic cooperative system
Authors:
Gohar Aleksanyan,
Morteza Fotouhi,
Henrik Shahgholian,
Georg S. Weiss
Abstract:
In this paper we study the following parabolic system
\begin{equation*}
Δ\u -\partial_t \u =|\u|^{q-1}\u\,χ_{\{ |\u|>0 \}}, \qquad \u = (u^1, \cdots , u^m) \ ,
\end{equation*}
with free boundary $\partial \{|\u | >0\}$.
For $0\leq q<1$, we prove optimal growth rate for solutions $\u $ to the above system near free boundary points, and show that in a uniform neighbourhood of any a priori…
▽ More
In this paper we study the following parabolic system
\begin{equation*}
Δ\u -\partial_t \u =|\u|^{q-1}\u\,χ_{\{ |\u|>0 \}}, \qquad \u = (u^1, \cdots , u^m) \ ,
\end{equation*}
with free boundary $\partial \{|\u | >0\}$.
For $0\leq q<1$, we prove optimal growth rate for solutions $\u $ to the above system near free boundary points, and show that in a uniform neighbourhood of any a priori well-behaved free boundary point the free boundary is $C^{1, α}$ in space directions and half-Lipschitz in the time direction.
△ Less
Submitted 8 June, 2021;
originally announced June 2021.
-
Adapting Behaviors via Reactive Synthesis
Authors:
Gal Amram,
Suguman Bansal,
Dror Fried,
Lucas M. Tabajara,
Moshe Y. Vardi,
Gera Weiss
Abstract:
In the \emph{Adapter Design Pattern}, a programmer implements a \emph{Target} interface by constructing an \emph{Adapter} that accesses an existing \emph{Adaptee} code. In this work, we present a reactive synthesis interpretation to the adapter design pattern, wherein an algorithm takes an \emph{Adaptee} and a \emph{Target} transducers, and the aim is to synthesize an \emph{Adapter} transducer tha…
▽ More
In the \emph{Adapter Design Pattern}, a programmer implements a \emph{Target} interface by constructing an \emph{Adapter} that accesses an existing \emph{Adaptee} code. In this work, we present a reactive synthesis interpretation to the adapter design pattern, wherein an algorithm takes an \emph{Adaptee} and a \emph{Target} transducers, and the aim is to synthesize an \emph{Adapter} transducer that, when composed with the {\em Adaptee}, generates a behavior that is equivalent to the behavior of the {\em Target}. One use of such an algorithm is to synthesize controllers that achieve similar goals on different hardware platforms. While this problem can be solved with existing synthesis algorithms, current state-of-the-art tools fail to scale. To cope with the computational complexity of the problem, we introduce a special form of specification format, called {\em Separated GR($k$)}, which can be solved with a scalable synthesis algorithm but still allows for a large set of realistic specifications. We solve the realizability and the synthesis problems for Separated GR($k$), and show how to exploit the separated nature of our specification to construct better algorithms, in terms of time complexity, than known algorithms for GR($k$) synthesis. We then describe a tool, called SGR($k$), that we have implemented based on the above approach and show, by experimental evaluation, how our tool outperforms current state-of-the-art tools on various benchmarks and test-cases.
△ Less
Submitted 28 May, 2021;
originally announced May 2021.
-
Biologically Inspired Semantic Lateral Connectivity for Convolutional Neural Networks
Authors:
Tonio Weidler,
Julian Lehnen,
Quinton Denman,
Dávid Sebők,
Gerhard Weiss,
Kurt Driessens,
Mario Senden
Abstract:
Lateral connections play an important role for sensory processing in visual cortex by supporting discriminable neuronal responses even to highly similar features. In the present work, we show that establishing a biologically inspired Mexican hat lateral connectivity profile along the filter domain can significantly improve the classification accuracy of a variety of lightweight convolutional neura…
▽ More
Lateral connections play an important role for sensory processing in visual cortex by supporting discriminable neuronal responses even to highly similar features. In the present work, we show that establishing a biologically inspired Mexican hat lateral connectivity profile along the filter domain can significantly improve the classification accuracy of a variety of lightweight convolutional neural networks without the addition of trainable network parameters. Moreover, we demonstrate that it is possible to analytically determine the stationary distribution of modulated filter activations and thereby avoid using recurrence for modeling temporal dynamics. We furthermore reveal that the Mexican hat connectivity profile has the effect of ordering filters in a sequence resembling the topographic organization of feature selectivity in early visual cortex. In an ordered filter sequence, this profile then sharpens the filters' tuning curves.
△ Less
Submitted 20 May, 2021;
originally announced May 2021.
-
Identifying Hubs in Undergraduate Course Networks Based on Scaled Co-Enrollments: Extended Version
Authors:
Gary M. Weiss,
Nam Nguyen,
Karla Dominguez,
Daniel D. Leeds
Abstract:
Understanding course enrollment patterns is valuable to predict upcoming demands for future courses, and to provide student with realistic courses to pursue given their current backgrounds. This study uses undergraduate student enrollment data to form networks of courses where connections are based on student co-enrollments. The course networks generated in this paper are based on eight years of u…
▽ More
Understanding course enrollment patterns is valuable to predict upcoming demands for future courses, and to provide student with realistic courses to pursue given their current backgrounds. This study uses undergraduate student enrollment data to form networks of courses where connections are based on student co-enrollments. The course networks generated in this paper are based on eight years of undergraduate course enrollment data from a large metropolitan university. The networks are analyzed to identify "hub" courses often taken with many other courses. Two notions of hubs are considered: one focused on raw popularity across all students, and one focused on proportional likelihoods of co-enrollment with other courses. A variety of network metrics are calculated to evaluate the course networks. Academic departments and high-level academic categories, such as Humanities vs STEM, are studied for their influence over course grou**s. The identification of hub courses has practical applications, since it can help better predict the impact of changes in course offerings and in course popularity, and in the case of interdisciplinary hub courses, can be used to increase or decrease interest and enrollments in specific academic departments and areas.
△ Less
Submitted 27 April, 2021;
originally announced April 2021.
-
A Cycle Joining Construction of the Prefer-Max De Bruijn Sequence
Authors:
Gal Amram,
Amir Rubin,
Gera Weiss
Abstract:
We propose a novel construction for the well-known prefer-max De Bruijn sequence, based on the cycle joining technique. We further show that the construction implies known results from the literature in a straightforward manner. First, it implies the correctness of the onion theorem, stating that, effectively, the reverse of prefer-max is in fact an infinite De Bruijn sequence. Second, it implies…
▽ More
We propose a novel construction for the well-known prefer-max De Bruijn sequence, based on the cycle joining technique. We further show that the construction implies known results from the literature in a straightforward manner. First, it implies the correctness of the onion theorem, stating that, effectively, the reverse of prefer-max is in fact an infinite De Bruijn sequence. Second, it implies the correctness of recently discovered shift rules for prefer-max, prefer-min, and their reversals. Lastly, it forms an alternative proof for the seminal FKM-theorem.
△ Less
Submitted 7 April, 2021;
originally announced April 2021.
-
Matrix splitting and ideals in $\mathcal{B}(\mathcal{H})$
Authors:
Jireh Loreaux,
Gary Weiss
Abstract:
We investigate the relationship between ideal membership of an operator and its pieces relative to several canonical types of partitions of the entries of its matrix representation with respect to a given orthonormal basis. Our main theorems establish that if $T$ lies in an ideal $\mathcal{I}$, then $\sum P_n T P_n$ (or more generally $\sum Q_n T P_n$) lies in the arithmetic mean closure of…
▽ More
We investigate the relationship between ideal membership of an operator and its pieces relative to several canonical types of partitions of the entries of its matrix representation with respect to a given orthonormal basis. Our main theorems establish that if $T$ lies in an ideal $\mathcal{I}$, then $\sum P_n T P_n$ (or more generally $\sum Q_n T P_n$) lies in the arithmetic mean closure of $\mathcal{I}$ whenever $\{P_n\}$ (and also $\{Q_n\}$) is a sequence of mutually orthogonal projections; and in any basis for which $T$ is a block band matrix, in particular, when in Patnaik--Petrovic--Weiss universal block tridiagonal form, then all the sub/super/main-block diagonals of $T$ are in $\mathcal{I}$. And in particular, the principal ideal generated by this $T$ is the finite sum of the principal ideals generated by each sub/super/main-block diagonals.
△ Less
Submitted 16 March, 2021;
originally announced March 2021.
-
The equilibrium points and stability of grid-connected synchronverters
Authors:
Pietro Lorenzetti,
Zeev Kustanovich,
Shivprasad Shivratri,
George Weiss
Abstract:
Virtual synchronous machines are inverters with a control algorithm that causes them to behave towards the power grid like synchronous generators. A popular way to realize such inverters are synchronverters. Their control algorithm has evolved over time, but all the different formulations in the literature share the same "basic control algorithm". We investigate the equilibrium points and the stab…
▽ More
Virtual synchronous machines are inverters with a control algorithm that causes them to behave towards the power grid like synchronous generators. A popular way to realize such inverters are synchronverters. Their control algorithm has evolved over time, but all the different formulations in the literature share the same "basic control algorithm". We investigate the equilibrium points and the stability of a synchronverter described by this basic algorithm, when connected to an infinite bus. We formulate a fifth order model for a grid-connected synchronverter and derive a necessary and sufficient condition for the existence of equilibrium points. We show that the set of equilibrium points with positive field current is a two-dimensional manifold that can be parametrized by the corresponding pair $(P,Q)$, where $P$ is the active power and $Q$ is the reactive power. This parametrization has several surprizing geometric properties, for instance, the prime mover torque, the power angle and the field current can be seen directly as distances or angles in the $(P,Q)$ plane. In addition, the stable equilibrium points correspond to a subset of a certain angular sector in the $(P,Q)$ plane. Thus, we can predict the stable operating range of a synchronverter from its parameters and from the grid voltage and frequency. Our stability result is based on the intrinsic two time scales property of the system, using tools from singular perturbation theory. We illustrate our theoretical results with two numerical examples.
△ Less
Submitted 1 July, 2021; v1 submitted 14 March, 2021;
originally announced March 2021.
-
Acoustic properties of metallic glasses at low temperatures -- tunneling systems and their dephasing
Authors:
Arnold Meißner,
Tim Voigtländer,
Saskia M. Meißner,
Uta Kühn,
Susanne Schneider,
Alexander Shnirman,
Georg Weiss
Abstract:
The low temperature acoustic properties of bulk metallic glasses measured over a broad range of frequencies rigorously test the predictions of the standard tunneling model. The strength of these experiments and their analyses is mainly based on the interaction of the tunneling states with conduction electrons or quasiparticles in the superconducting state. A new series of experiments at kHz and GH…
▽ More
The low temperature acoustic properties of bulk metallic glasses measured over a broad range of frequencies rigorously test the predictions of the standard tunneling model. The strength of these experiments and their analyses is mainly based on the interaction of the tunneling states with conduction electrons or quasiparticles in the superconducting state. A new series of experiments at kHz and GHz frequencies on the same sample material essentially confirms previous measurements and their discrepancies with theoretical predictions. These discrepancies can be lifted by considering more correctly the line widths of the dominating two-level atomic-tunneling systems. In fact, dephasing caused or mediated by interaction with conduction electrons may lead to particularly large line widths and destroy the tunneling sytems' two-level character in the normal conducting state.
△ Less
Submitted 11 March, 2021;
originally announced March 2021.
-
Revised SCLP-simplex Algorithm with Application to Large-Scale Fluid Processing Networks
Authors:
Evgeny Shindin,
Michael Masin,
Gideon Weiss,
Alexander Zadorojniy
Abstract:
We describe an efficient implementation of a recent simplex-type algorithm for the exact solution of separated continuous linear programs, and compare it with linear programming approximation of these problems obtained via discretization of the time horizon. The implementation overcomes many numerical pitfalls often neglected in theoretical analysis allowing better accuracy or acceleration up to s…
▽ More
We describe an efficient implementation of a recent simplex-type algorithm for the exact solution of separated continuous linear programs, and compare it with linear programming approximation of these problems obtained via discretization of the time horizon. The implementation overcomes many numerical pitfalls often neglected in theoretical analysis allowing better accuracy or acceleration up to several orders of magnitude both versus previous implementation of the simplex-type algorithms and versus a state-of-the-art LP solver using discretization. Numerical study includes medium, large, and very large examples of scheduling problems and problems of control of fluid processing networks. We discuss online and offline optimization settings for various applications and outline future research directions.
△ Less
Submitted 6 October, 2021; v1 submitted 7 March, 2021;
originally announced March 2021.
-
Local Convergence of Adaptive Gradient Descent Optimizers
Authors:
Sebastian Bock,
Martin Georg Weiß
Abstract:
Adaptive Moment Estimation (ADAM) is a very popular training algorithm for deep neural networks and belongs to the family of adaptive gradient descent optimizers. However to the best of the authors knowledge no complete convergence analysis exists for ADAM. The contribution of this paper is a method for the local convergence analysis in batch mode for a deterministic fixed training set, which give…
▽ More
Adaptive Moment Estimation (ADAM) is a very popular training algorithm for deep neural networks and belongs to the family of adaptive gradient descent optimizers. However to the best of the authors knowledge no complete convergence analysis exists for ADAM. The contribution of this paper is a method for the local convergence analysis in batch mode for a deterministic fixed training set, which gives necessary conditions for the hyperparameters of the ADAM algorithm. Due to the local nature of the arguments the objective function can be non-convex but must be at least twice continuously differentiable. Then we apply this procedure to other adaptive gradient descent algorithms and show for most of them local convergence with hyperparameter bounds.
△ Less
Submitted 19 February, 2021;
originally announced February 2021.
-
Synthesizing Context-free Grammars from Recurrent Neural Networks (Extended Version)
Authors:
Daniel M. Yellin,
Gail Weiss
Abstract:
We present an algorithm for extracting a subclass of the context free grammars (CFGs) from a trained recurrent neural network (RNN). We develop a new framework, pattern rule sets (PRSs), which describe sequences of deterministic finite automata (DFAs) that approximate a non-regular language. We present an algorithm for recovering the PRS behind a sequence of such automata, and apply it to the sequ…
▽ More
We present an algorithm for extracting a subclass of the context free grammars (CFGs) from a trained recurrent neural network (RNN). We develop a new framework, pattern rule sets (PRSs), which describe sequences of deterministic finite automata (DFAs) that approximate a non-regular language. We present an algorithm for recovering the PRS behind a sequence of such automata, and apply it to the sequences of automata extracted from trained RNNs using the L* algorithm. We then show how the PRS may converted into a CFG, enabling a familiar and useful presentation of the learned language.
Extracting the learned language of an RNN is important to facilitate understanding of the RNN and to verify its correctness. Furthermore, the extracted CFG can augment the RNN in classifying correct sentences, as the RNN's predictive accuracy decreases when the recursion depth and distance between matching delimiters of its input sequences increases.
△ Less
Submitted 28 March, 2021; v1 submitted 20 January, 2021;
originally announced January 2021.
-
Saturating PI control of stable nonlinear systems using singular perturbations
Authors:
Pietro Lorenzetti,
George Weiss
Abstract:
This paper presents an anti-windup PI controller, using a saturating integrator, for a single-input single-output stable nonlinear plant, whose steady-state input-output map is increasing. We prove that, under reasonable assumptions, there exists an upper bound on the controller gain such that for any constant reference input, the corresponding equilibrium point of the closed-loop system is expone…
▽ More
This paper presents an anti-windup PI controller, using a saturating integrator, for a single-input single-output stable nonlinear plant, whose steady-state input-output map is increasing. We prove that, under reasonable assumptions, there exists an upper bound on the controller gain such that for any constant reference input, the corresponding equilibrium point of the closed-loop system is exponentially stable, with a "large" region of attraction. When the state of the closed-loop system converges to this equilibrium point, then the tracking error tends to zero. The closed-loop stability analysis employs Lyapunov methods in the framework of singular perturbations theory. Finally, we show that if the plant satisfies the asymptotic gain property, then the closed-loop system is globally asymptotically stable for any sufficiently small controller gain. The effectiveness of the proposed PI controller is proved by showing how it performs as part of the control algorithm of a synchronverter (a special type of DC to AC power converter).
△ Less
Submitted 1 July, 2021; v1 submitted 13 January, 2021;
originally announced January 2021.
-
Can we detect harmony in artistic compositions? A machine learning approach
Authors:
Adam Vandor,
Marie van Vollenhoven,
Gerhard Weiss,
Gerasimos Spanakis
Abstract:
Harmony in visual compositions is a concept that cannot be defined or easily expressed mathematically, even by humans. The goal of the research described in this paper was to find a numerical representation of artistic compositions with different levels of harmony. We ask humans to rate a collection of grayscale images based on the harmony they convey. To represent the images, a set of special fea…
▽ More
Harmony in visual compositions is a concept that cannot be defined or easily expressed mathematically, even by humans. The goal of the research described in this paper was to find a numerical representation of artistic compositions with different levels of harmony. We ask humans to rate a collection of grayscale images based on the harmony they convey. To represent the images, a set of special features were designed and extracted. By doing so, it became possible to assign objective measures to subjectively judged compositions. Given the ratings and the extracted features, we utilized machine learning algorithms to evaluate the efficiency of such representations in a harmony classification problem. The best performing model (SVM) achieved 80% accuracy in distinguishing between harmonic and disharmonic images, which reinforces the assumption that concept of harmony can be expressed in a mathematical way that can be assessed by humans.
△ Less
Submitted 10 December, 2020;
originally announced December 2020.
-
Volume-Scaled Common Nearest Neighbor Clustering Algorithm with Free-Energy Hierarchy
Authors:
R. Gregor Weiß,
Benjamin Ries,
Shuzhe Wang,
Sereina Riniker
Abstract:
The combination of Markov state modeling (MSM) and molecular dynamics (MD) simulations has been shown in recent years to be a valuable approach to unravel the slow processes of molecular systems with increasing complexity. While the algorithms for intermediate steps in the MSM workflow like featurization and dimensionality reduction have been specifically adapted for MD data sets, conventional clu…
▽ More
The combination of Markov state modeling (MSM) and molecular dynamics (MD) simulations has been shown in recent years to be a valuable approach to unravel the slow processes of molecular systems with increasing complexity. While the algorithms for intermediate steps in the MSM workflow like featurization and dimensionality reduction have been specifically adapted for MD data sets, conventional clustering methods are generally applied for the discretization step. This work adds to recent efforts to develop specialized density-based clustering algorithms for the Boltzmann-weighted data from MD simulations. We introduce the volume-scaled common nearest neighbor (vs-CNN) clustering that is an adapted version of the common nearest neighbor (CNN) algorithm. A major advantage of the proposed algorithm is that the introduced density-based criterion directly links to a free-energy notion via Boltzmann inversion. Such a free-energy perspective allows for a straightforward hierarchical scheme to identify conformational clusters at different levels of a generally rugged free-energy landscape of complex molecular systems.
△ Less
Submitted 18 September, 2020;
originally announced September 2020.
-
Affinity, Kinetics, and Pathways of Anisotropic Ligands Binding to Hydrophobic Model Pockets
Authors:
R. Gregor Weiß,
Richard Chudoba,
Piotr Setny,
Joachim Dzubiella
Abstract:
Using explicit-water molecular dynamics (MD) simulations of a generic pocket-ligand model we investigate how chemical and shape anisotropy of small ligands influences the affinities, kinetic rates and pathways for their association to hydrophobic binding sites. In particular, we investigate aromatic compounds, all of similar molecular size, but distinct by various hydrophilic or hydrophobic residu…
▽ More
Using explicit-water molecular dynamics (MD) simulations of a generic pocket-ligand model we investigate how chemical and shape anisotropy of small ligands influences the affinities, kinetic rates and pathways for their association to hydrophobic binding sites. In particular, we investigate aromatic compounds, all of similar molecular size, but distinct by various hydrophilic or hydrophobic residues. We demonstrate that the most hydrophobic sections are in general desolvated primarily upon binding to the cavity, suggesting that specific hydration of the different chemical units can steer the orientation pathways via a `hydrophobic torque'. Moreover, we find that ligands with bimodal orientation fluctuations have significantly increased kinetic barriers for binding compared to the kinetic barriers previously observed for spherical ligands due to translational fluctuations. We exemplify that these kinetic barriers, which are ligand specific, impact both binding and unbinding times for which we observe considerable differences between our studied ligands.
△ Less
Submitted 17 September, 2020;
originally announced September 2020.
-
Characterizing compact coincidence sets in the thin obstacle problem and the obstacle problem for the fractional Laplacian
Authors:
Simon Eberle,
Xavier Ros-Oton,
Georg S. Weiss
Abstract:
In this paper we give a full classification of global solutions of the obstacle problem for the fractional Laplacian (including the thin obstacle problem) with compact coincidence set and at most polynomial growth in dimension $N \geq 3$. We do this in terms of a bijection onto a set of polynomials describing the asymptotics of the solution. Furthermore we prove that coincidence sets of global sol…
▽ More
In this paper we give a full classification of global solutions of the obstacle problem for the fractional Laplacian (including the thin obstacle problem) with compact coincidence set and at most polynomial growth in dimension $N \geq 3$. We do this in terms of a bijection onto a set of polynomials describing the asymptotics of the solution. Furthermore we prove that coincidence sets of global solutions that are compact are also convex if the solution has at most quadratic growth.
△ Less
Submitted 15 June, 2021; v1 submitted 23 June, 2020;
originally announced June 2020.
-
Characterizing compact coincidence sets in the obstacle problem -- a short proof
Authors:
Simon Eberle,
Georg S. Weiss
Abstract:
Motivated by the almost completely open problem of characterizing unbounded coincidence sets of global solutions of the classical obstacle problem in higher dimensions, we give in this note a concise and easy-to-extend proof of the known fact that if the coincidence set $\{u=0 \}$ of a global solution $u$ is bounded with nonempty interior then it is an ellipsoid.
Motivated by the almost completely open problem of characterizing unbounded coincidence sets of global solutions of the classical obstacle problem in higher dimensions, we give in this note a concise and easy-to-extend proof of the known fact that if the coincidence set $\{u=0 \}$ of a global solution $u$ is bounded with nonempty interior then it is an ellipsoid.
△ Less
Submitted 3 June, 2020; v1 submitted 21 May, 2020;
originally announced May 2020.
-
On global solutions of the obstacle problem
Authors:
Simon Eberle,
Henrik Shahgholian,
Georg S. Weiss
Abstract:
Assuming a lower bound on the dimension, we prove a long standing conjecture concerning the classification of global solutions of the obstacle problem with unbounded coincidence sets.
Assuming a lower bound on the dimension, we prove a long standing conjecture concerning the classification of global solutions of the obstacle problem with unbounded coincidence sets.
△ Less
Submitted 5 August, 2022; v1 submitted 11 May, 2020;
originally announced May 2020.
-
A Formal Hierarchy of RNN Architectures
Authors:
William Merrill,
Gail Weiss,
Yoav Goldberg,
Roy Schwartz,
Noah A. Smith,
Eran Yahav
Abstract:
We develop a formal hierarchy of the expressive capacity of RNN architectures. The hierarchy is based on two formal properties: space complexity, which measures the RNN's memory, and rational recurrence, defined as whether the recurrent update can be described by a weighted finite-state machine. We place several RNN variants within this hierarchy. For example, we prove the LSTM is not rational, wh…
▽ More
We develop a formal hierarchy of the expressive capacity of RNN architectures. The hierarchy is based on two formal properties: space complexity, which measures the RNN's memory, and rational recurrence, defined as whether the recurrent update can be described by a weighted finite-state machine. We place several RNN variants within this hierarchy. For example, we prove the LSTM is not rational, which formally separates it from the related QRNN (Bradbury et al., 2016). We also show how these models' expressive capacity is expanded by stacking multiple layers or composing them with different pooling functions. Our results build on the theory of "saturated" RNNs (Merrill, 2019). While formally extending these findings to unsaturated RNNs is left to future work, we hypothesize that the practical learnable capacity of unsaturated RNNs obeys a similar hierarchy. Experimental findings from training unsaturated networks on formal languages support this conjecture.
△ Less
Submitted 19 September, 2020; v1 submitted 17 April, 2020;
originally announced April 2020.
-
Stabilizability properties of a linearized water waves system
Authors:
Pei Su,
Marius Tucsnak,
George Weiss
Abstract:
We consider the strong stabilization of small amplitude gravity water waves in a two dimensional rectangular domain. The control acts on one lateral boundary, by imposing the horizontal acceleration of the water along that boundary, as a multiple of a scalar input function $u$, times a given function $h$ of the height along the active boundary. The state $z$ of the system consists of two functions…
▽ More
We consider the strong stabilization of small amplitude gravity water waves in a two dimensional rectangular domain. The control acts on one lateral boundary, by imposing the horizontal acceleration of the water along that boundary, as a multiple of a scalar input function $u$, times a given function $h$ of the height along the active boundary. The state $z$ of the system consists of two functions: the water level $ζ$ along the top boundary, and its time derivative $\dotζ$. We prove that for suitable functions $h$, there exists a bounded feedback functional $F$ such that the feedback $u=Fz$ renders the closed-loop system strongly stable. Moreover, for initial states in the domain of the semigroup generator, the norm of the solution decays like $(1+t)^{-\frac{1}{6}}$. Our approach uses a detailed analysis of the partial Dirichlet to Neumann and Neumann to Neumann operators associated to certain edges of the rectangular domain, as well as recent abstract non-uniform stabilization results by Chill, Paunonen, Seifert, Stahn and Tomilov (2019).
△ Less
Submitted 23 March, 2020;
originally announced March 2020.
-
Directed FCFS Infinite Bipartite Matching
Authors:
Gideon Weiss
Abstract:
We consider an infinite sequence consisting of agents of several types and goods of several types, with a bipartite compatibility graph between agent and good types. Goods are matched with agents that appear earlier in the sequence using FCFS matching, if such are available, and are lost otherwise. This model may be used for two sided queueing applications such as ride sharing, web purchases, orga…
▽ More
We consider an infinite sequence consisting of agents of several types and goods of several types, with a bipartite compatibility graph between agent and good types. Goods are matched with agents that appear earlier in the sequence using FCFS matching, if such are available, and are lost otherwise. This model may be used for two sided queueing applications such as ride sharing, web purchases, organ transplants, and for abandon at completion redundancy parallel service queues. For this model we calculate matching rates and delays. These can be used to obtain waiting times and help with design questions for related service systems. We explore some relations of this model to other FCFS stochastic matching models.
△ Less
Submitted 1 December, 2019;
originally announced December 2019.