-
Self-Consuming Generative Models Go MAD
Authors:
Sina Alemohammad,
Josue Casco-Rodriguez,
Lorenzo Luzi,
Ahmed Imtiaz Humayun,
Hossein Babaei,
Daniel LeJeune,
Ali Siahkoohi,
Richard G. Baraniuk
Abstract:
Seismic advances in generative AI algorithms for imagery, text, and other data types have led to the temptation to use synthetic data to train next-generation models. Repeating this process creates an autophagous (self-consuming) loop whose properties are poorly understood. We conduct a thorough analytical and empirical analysis using state-of-the-art generative image models of three families of autophagous loops that differ in how fixed or fresh real training data is available through the generations of training and in whether the samples from previous generation models have been biased to trade off data quality versus diversity. Our primary conclusion across all scenarios is that without enough fresh real data in each generation of an autophagous loop, future generative models are doomed to have their quality (precision) or diversity (recall) progressively decrease. We term this condition Model Autophagy Disorder (MAD), by analogy with mad cow disease.
Submitted 4 July, 2023;
originally announced July 2023.
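The self-consuming loop described in the abstract above can be illustrated with a toy sketch. This is not the paper's experimental setup: it assumes a one-dimensional Gaussian as the "generative model", fits it by maximum likelihood, and trains every generation only on samples from the previous one, with no fresh real data. The function name `fully_synthetic_loop` and its parameters are hypothetical.

```python
import random
import statistics


def fully_synthetic_loop(n_samples=100, generations=200, seed=0):
    """Repeatedly refit a 1-D Gaussian to samples drawn from the previous fit.

    Toy stand-in for a fully synthetic loop: no real data is added
    after generation 0. Returns the list of (mean, std) per generation.
    """
    rng = random.Random(seed)
    mu, sigma = 0.0, 1.0  # generation 0: the assumed "real" distribution
    history = [(mu, sigma)]
    for _ in range(generations):
        # Sample a synthetic training set from the current model.
        samples = [rng.gauss(mu, sigma) for _ in range(n_samples)]
        # Refit the model on synthetic data only (maximum-likelihood estimates).
        mu = statistics.fmean(samples)
        sigma = statistics.pstdev(samples)
        history.append((mu, sigma))
    return history


if __name__ == "__main__":
    history = fully_synthetic_loop()
    for generation in (0, 50, 100, 200):
        mu, sigma = history[generation]
        print(f"generation {generation:3d}: mean={mu:+.3f}, std={sigma:.3f}")
```

In this toy setting the fitted standard deviation tends to drift toward zero over many generations, which loosely mirrors the loss of diversity the abstract describes; the exact trajectory depends on the seed and sample size.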
-
Simulations of multivariant Si I to Si II phase transformation in polycrystalline silicon with finite-strain scale-free phase-field approach
Authors:
Hamed Babaei,
Raghunandan Pratoori,
Valery I. Levitas
Abstract:
A scale-free phase-field approach (PFA) at large strains and corresponding finite element method (FEM) simulations for multivariant martensitic phase transformation (PT) from cubic Si I to tetragonal Si II in a polycrystalline aggregate are presented. Important features of the model are a large and very anisotropic transformation strain tensor $\varepsilon_{t}=\{0.1753;0.1753; -0.447\}$ and a stress-tensor-dependent athermal dissipative threshold for PT, which pose essential challenges for computations. 3D polycrystals with 55 and 910 stochastically oriented grains are subjected to uniaxial strain- and stress-controlled loadings under periodic boundary conditions and zero averaged lateral strains. The coupled evolution of discrete martensitic microstructure, volume fractions of martensitic variants and Si II, stress and transformation strain tensors, and texture is presented and analyzed. Macroscopic variables effectively representing multivariant transformational behavior are introduced. Macroscopic stress-strain and transformational behavior for 55 and 910 grains are close (less than 10% difference). This allows the determination of macroscopic constitutive equations by treating an aggregate with a small number of grains. Large transformation strains and grain boundaries lead to huge internal stresses of tens of GPa, which affect microstructure evolution and macroscopic behavior. In contrast to a single crystal, the local mechanical instabilities due to PT and negative local tangent modulus are stabilized at the macroscale by arresting/slowing the growth of Si II regions by the grain boundaries and generating the internal back stresses. This leads to increasing stress during PT. The developed methodology can be used for studying similar PTs with large transformation strains and for further development by including plastic strain and strain-induced PTs.
Submitted 12 February, 2023;
originally announced February 2023.
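To give a sense of how large the transformation strain quoted in the abstract above is, the following sketch computes the implied volume change. It assumes, and this is an assumption not stated in the abstract, that the three numbers are the principal values of $\varepsilon_t = U_t - I$, so the volume ratio is $J = \prod_i (1 + \varepsilon_i)$. The name `volume_ratio` is hypothetical.

```python
import math

# Principal transformation strains quoted in the abstract.
EPS_T = (0.1753, 0.1753, -0.447)


def volume_ratio(principal_strains):
    """Return J = prod(1 + eps_i), assuming eps = U - I in principal axes."""
    return math.prod(1.0 + e for e in principal_strains)


if __name__ == "__main__":
    J = volume_ratio(EPS_T)
    # Finite-strain volume change versus the small-strain (trace) estimate.
    print(f"finite strain: J = {J:.4f}, volume change = {100 * (J - 1):.1f}%")
    print(f"small-strain trace estimate = {100 * sum(EPS_T):.1f}%")
```

Under that assumption the finite-strain volume change is about -23.6%, while the small-strain trace gives only about -9.6%; the gap between the two is one way to see why a large-strain formulation matters here.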
-
TITAN: Bringing The Deep Image Prior to Implicit Representations
Authors:
Lorenzo Luzi,
Daniel LeJeune,
Ali Siahkoohi,
Sina Alemohammad,
Vishwanath Saragadam,
Hossein Babaei,
Naiming Liu,
Zichao Wang,
Richard G. Baraniuk
Abstract:
We study the interpolation capabilities of implicit neural representations (INRs) of images. In principle, INRs promise a number of advantages, such as continuous derivatives and arbitrary sampling, being freed from the restrictions of a raster grid. However, empirically, INRs have been observed to poorly interpolate between the pixels of the fit image; in other words, they do not inherently possess a suitable prior for natural images. In this paper, we propose to address and improve INRs' interpolation capabilities by explicitly integrating image prior information into the INR architecture via deep decoder, a specific implementation of the deep image prior (DIP). Our method, which we call TITAN, leverages a residual connection from the input which enables integrating the principles of the grid-based DIP into the grid-free INR. Through super-resolution and computed tomography experiments, we demonstrate that our method significantly improves upon classic INRs, thanks to the induced natural image bias. We also find that by constraining the weights to be sparse, image quality and sharpness are enhanced, increasing the Lipschitz constant.
Submitted 1 May, 2024; v1 submitted 31 October, 2022;
originally announced November 2022.
-
Covariate Balancing Methods for Randomized Controlled Trials Are Not Adversarially Robust
Authors:
Hossein Babaei,
Sina Alemohammad,
Richard Baraniuk
Abstract:
The first step towards investigating the effectiveness of a treatment via a randomized trial is to split the population into control and treatment groups and then compare the average response of the treatment group receiving the treatment to that of the control group receiving the placebo.
In order to ensure that the difference between the two groups is caused only by the treatment, it is crucial that the control and the treatment groups have similar statistics. Indeed, the validity and reliability of a trial are determined by the similarity of the two groups' statistics. Covariate balancing methods increase the similarity between the distributions of the two groups' covariates. However, often in practice, there are not enough samples to accurately estimate the groups' covariate distributions. In this paper, we empirically show that covariate balancing with the Standardized Means Difference (SMD) covariate balancing measure, as well as Pocock's sequential treatment assignment method, are susceptible to worst-case treatment assignments. Worst-case treatment assignments are those that are admitted by the covariate balance measure but result in the highest possible ATE estimation errors. We develop an adversarial attack to find an adversarial treatment assignment for any given trial. Then, we provide an index to measure how close the given trial is to the worst case. To this end, we provide an optimization-based algorithm, namely Adversarial Treatment ASsignment in TREatment Effect Trials (ATASTREET), to find the adversarial treatment assignments.
Submitted 27 August, 2022; v1 submitted 25 October, 2021;
originally announced October 2021.
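The covariate balance measure named in the abstract above can be sketched for a single covariate. This uses one common textbook definition of the standardized mean difference (difference in group means divided by the square root of the average of the two sample variances); the paper may use a different variant, and the function name below is hypothetical. It does not implement ATASTREET or the adversarial attack.

```python
import math
import statistics


def standardized_mean_difference(treated, control):
    """SMD for one covariate: mean difference over the pooled standard deviation.

    Assumes each group has at least two values and that the pooled
    variance is non-zero.
    """
    mean_diff = statistics.fmean(treated) - statistics.fmean(control)
    pooled_var = (statistics.variance(treated) + statistics.variance(control)) / 2
    if pooled_var == 0:
        raise ValueError("pooled variance is zero; SMD is undefined")
    return mean_diff / math.sqrt(pooled_var)


if __name__ == "__main__":
    # Hypothetical ages in a small trial.
    treated_ages = [34, 41, 29, 50, 38]
    control_ages = [36, 40, 31, 47, 39]
    smd = standardized_mean_difference(treated_ages, control_ages)
    print(f"SMD for age: {smd:+.3f}")
```

A small absolute SMD says only that the group means of that covariate are close relative to its spread. With few samples, many different assignments can pass such a check, which is the kind of slack the abstract says a worst-case assignment exploits.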
-
NFT-K: Non-Fungible Tangent Kernels
Authors:
Sina Alemohammad,
Hossein Babaei,
CJ Barberan,
Naiming Liu,
Lorenzo Luzi,
Blake Mason,
Richard G. Baraniuk
Abstract:
Deep neural networks have become essential for numerous applications, such as vision, RL, and classification, due to their strong empirical performance. Unfortunately, these networks are quite difficult to interpret, and this limits their applicability in settings where interpretability is important for safety, such as medical imaging. One type of deep neural network is the neural tangent kernel, which is similar to a kernel machine and provides some aspect of interpretability. To further contribute interpretability with respect to classification and the layers, we develop a new network as a combination of multiple neural tangent kernels, one to model each layer of the deep neural network individually, as opposed to past work, which attempts to represent the entire network via a single neural tangent kernel. We demonstrate the interpretability of this model on two datasets, showing that the multiple kernels model elucidates the interplay between the layers and predictions.
Submitted 10 October, 2021;
originally announced October 2021.
-
Wearing a MASK: Compressed Representations of Variable-Length Sequences Using Recurrent Neural Tangent Kernels
Authors:
Sina Alemohammad,
Hossein Babaei,
Randall Balestriero,
Matt Y. Cheung,
Ahmed Imtiaz Humayun,
Daniel LeJeune,
Naiming Liu,
Lorenzo Luzi,
Jasper Tan,
Zichao Wang,
Richard G. Baraniuk
Abstract:
High dimensionality poses many challenges to the use of data, from visualization and interpretation, to prediction and storage for historical preservation. Techniques abound to reduce the dimensionality of fixed-length sequences, yet these methods rarely generalize to variable-length sequences. To address this gap, we extend existing methods that rely on the use of kernels to variable-length sequences via use of the Recurrent Neural Tangent Kernel (RNTK). Since a deep neural network with ReLU activation is a Max-Affine Spline Operator (MASO), we dub our approach Max-Affine Spline Kernel (MASK). We demonstrate how MASK can be used to extend principal components analysis (PCA) and t-distributed stochastic neighbor embedding (t-SNE) and apply these new algorithms to separate synthetic time series data sampled from second-order differential equations.
Submitted 17 April, 2021; v1 submitted 26 October, 2020;
originally announced October 2020.
-
Machine-learning based interatomic potential for phonon transport in perfect crystalline Si and crystalline Si with vacancies
Authors:
Hasan Babaei,
Ruiqiang Guo,
Amirreza Hashemi,
Sangyeop Lee
Abstract:
We report that a single interatomic potential, developed using Gaussian regression of density functional theory calculation data, has high accuracy and flexibility to describe phonon transport with ab initio accuracy in two different atomistic configurations: perfect crystalline Si and crystalline Si with vacancies. The high accuracy of second- and third-order force constants from the Gaussian approximation potential (GAP) is demonstrated with phonon dispersion, Grüneisen parameter, three-phonon scattering rate, phonon-vacancy scattering rate, and thermal conductivity, all of which are very close to the results from density functional theory calculation. We also show that the widely used empirical potentials (Stillinger-Weber and Tersoff) produce much larger errors compared to the GAP. The computational cost of GAP is higher than that of the two empirical potentials, but five orders of magnitude lower than that of the density functional theory calculation. Our work shows that GAP can provide a new opportunity for studying phonon transport in partially disordered crystalline phases with the high predictive power of ab initio calculation but at a feasible computational cost.
Submitted 23 May, 2019;
originally announced May 2019.
-
Quantum Mechanics of a Photon
Authors:
Hassan Babaei,
Ali Mostafazadeh
Abstract:
A first quantized free photon is a complex massless vector field $A=(A^\mu)$ whose field strength satisfies Maxwell's equations in vacuum. We construct the Hilbert space $\mathscr{H}$ of the photon by endowing the vector space of the fields $A$ in the temporal-Coulomb gauge with a positive-definite and relativistically invariant inner product. We give an explicit expression for this inner product, identify the Hamiltonian for the photon with the generator of time translations in $\mathscr{H}$, determine the operators representing the momentum and the helicity of the photon, and introduce a chirality operator whose eigenfunctions correspond to fields having a definite sign of energy. We also construct a position operator for the photon whose components commute with each other and with the chirality and helicity operators. This allows for the construction of the localized states of the photon with a definite sign of energy and helicity. We derive an explicit formula for the latter and compute the corresponding electric and magnetic fields. These turn out to diverge not just at the point where the photon is localized but on a plane containing this point. We identify the axis normal to this plane with an associated symmetry axis, and show that each choice of this axis specifies a particular position operator, a corresponding position basis, and a position representation of the quantum mechanics of the photon. In particular, we examine the position wave functions determined by such a position basis, elucidate their relationship with the Riemann-Silberstein and Landau-Peierls wave functions, and give an explicit formula for the probability density of the spatial localization of the photon.
Submitted 18 September, 2017; v1 submitted 23 August, 2016;
originally announced August 2016.