-
An Adaptive Tangent Feature Perspective of Neural Networks
Authors:
Daniel LeJeune,
Sina Alemohammad
Abstract:
In order to better understand feature learning in neural networks, we propose a framework for understanding linear models in tangent feature space where the features are allowed to be transformed during training. We consider linear transformations of features, resulting in a joint optimization over parameters and transformations with a bilinear interpolation constraint. We show that this optimization problem has an equivalent linearly constrained optimization with structured regularization that encourages approximately low rank solutions. Specializing to neural network structure, we gain insights into how the features and thus the kernel function change, providing additional nuance to the phenomenon of kernel alignment when the target function is poorly represented using tangent features. We verify our theoretical observations in the kernel alignment of real neural networks.
Submitted 20 February, 2024; v1 submitted 29 August, 2023;
originally announced August 2023.
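For readers unfamiliar with the term, the kernel alignment mentioned in the abstract above is commonly measured as a normalized Frobenius inner product between a kernel Gram matrix and the ideal target kernel. The expression below is the standard textbook definition, given here only as an assumption for illustration; the abstract does not say which variant the paper uses. Here $K$ denotes the Gram matrix on $n$ training points and $y$ the vector of targets (both symbols are introduced for this sketch).

```latex
A(K, yy^\top)
  = \frac{\langle K,\, yy^\top \rangle_F}{\lVert K \rVert_F \,\lVert yy^\top \rVert_F}
  = \frac{y^\top K y}{\lVert K \rVert_F \,\lVert y \rVert_2^2}
```

The second equality uses $\langle K, yy^\top \rangle_F = y^\top K y$ and $\lVert yy^\top \rVert_F = \lVert y \rVert_2^2$.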
-
Self-Consuming Generative Models Go MAD
Authors:
Sina Alemohammad,
Josue Casco-Rodriguez,
Lorenzo Luzi,
Ahmed Imtiaz Humayun,
Hossein Babaei,
Daniel LeJeune,
Ali Siahkoohi,
Richard G. Baraniuk
Abstract:
Seismic advances in generative AI algorithms for imagery, text, and other data types have led to the temptation to use synthetic data to train next-generation models. Repeating this process creates an autophagous (self-consuming) loop whose properties are poorly understood. We conduct a thorough analytical and empirical analysis using state-of-the-art generative image models of three families of autophagous loops that differ in how fixed or fresh real training data is available through the generations of training and in whether the samples from previous generation models have been biased to trade off data quality versus diversity. Our primary conclusion across all scenarios is that without enough fresh real data in each generation of an autophagous loop, future generative models are doomed to have their quality (precision) or diversity (recall) progressively decrease. We term this condition Model Autophagy Disorder (MAD), making analogy to mad cow disease.
Submitted 4 July, 2023;
originally announced July 2023.
-
TITAN: Bringing The Deep Image Prior to Implicit Representations
Authors:
Lorenzo Luzi,
Daniel LeJeune,
Ali Siahkoohi,
Sina Alemohammad,
Vishwanath Saragadam,
Hossein Babaei,
Naiming Liu,
Zichao Wang,
Richard G. Baraniuk
Abstract:
We study the interpolation capabilities of implicit neural representations (INRs) of images. In principle, INRs promise a number of advantages, such as continuous derivatives and arbitrary sampling, being freed from the restrictions of a raster grid. However, empirically, INRs have been observed to poorly interpolate between the pixels of the fit image; in other words, they do not inherently possess a suitable prior for natural images. In this paper, we propose to address and improve INRs' interpolation capabilities by explicitly integrating image prior information into the INR architecture via deep decoder, a specific implementation of the deep image prior (DIP). Our method, which we call TITAN, leverages a residual connection from the input which enables integrating the principles of the grid-based DIP into the grid-free INR. Through super-resolution and computed tomography experiments, we demonstrate that our method significantly improves upon classic INRs, thanks to the induced natural image bias. We also find that by constraining the weights to be sparse, image quality and sharpness are enhanced, increasing the Lipschitz constant.
Submitted 1 May, 2024; v1 submitted 31 October, 2022;
originally announced November 2022.
-
NeuroView-RNN: It's About Time
Authors:
CJ Barberan,
Sina Alemohammad,
Naiming Liu,
Randall Balestriero,
Richard G. Baraniuk
Abstract:
Recurrent Neural Networks (RNNs) are important tools for processing sequential data such as time-series or video. Interpretability is defined as the ability to be understood by a person and is different from explainability, which is the ability to be explained in a mathematical formulation. A key interpretability issue with RNNs is that it is not clear how each hidden state per time step contributes to the decision-making process in a quantitative manner. We propose NeuroView-RNN as a family of new RNN architectures that explains how all the time steps are used for the decision-making process. Each member of the family is derived from a standard RNN architecture by concatenation of the hidden steps into a global linear classifier. The global linear classifier has all the hidden states as the input, so the weights of the classifier have a linear mapping to the hidden states. Hence, from the weights, NeuroView-RNN can quantify how important each time step is to a particular decision. As a bonus, NeuroView-RNN also offers higher accuracy in many cases compared to the RNNs and their variants. We showcase the benefits of NeuroView-RNN by evaluating on a multitude of diverse time-series datasets.
Submitted 23 February, 2022;
originally announced February 2022.
-
Covariate Balancing Methods for Randomized Controlled Trials Are Not Adversarially Robust
Authors:
Hossein Babaei,
Sina Alemohammad,
Richard Baraniuk
Abstract:
The first step towards investigating the effectiveness of a treatment via a randomized trial is to split the population into control and treatment groups, then compare the average response of the treatment group receiving the treatment to that of the control group receiving the placebo. In order to ensure that the difference between the two groups is caused only by the treatment, it is crucial that the control and the treatment groups have similar statistics. Indeed, the validity and reliability of a trial are determined by the similarity of the two groups' statistics. Covariate balancing methods increase the similarity between the distributions of the two groups' covariates. However, often in practice, there are not enough samples to accurately estimate the groups' covariate distributions. In this paper, we empirically show that covariate balancing with the Standardized Means Difference (SMD) covariate balancing measure, as well as Pocock's sequential treatment assignment method, are susceptible to worst-case treatment assignments. Worst-case treatment assignments are those that are admitted by the covariate balance measure but result in the highest possible ATE estimation errors. We develop an adversarial attack to find adversarial treatment assignments for any given trial. Then, we provide an index to measure how close the given trial is to the worst case. To this end, we provide an optimization-based algorithm, namely Adversarial Treatment ASsignment in TREatment Effect Trials (ATASTREET), to find the adversarial treatment assignments.
Submitted 27 August, 2022; v1 submitted 25 October, 2021;
originally announced October 2021.
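The SMD balance measure named in the abstract above can be illustrated with a minimal sketch. It assumes the common definition for a single covariate (difference in group means divided by the pooled standard deviation); the abstract does not give the paper's exact formula. The function name and the covariate values are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def standardized_mean_difference(treated, control):
    """SMD for one covariate: difference in group means over the pooled SD.

    Assumes the common pooled form sqrt((s_t^2 + s_c^2) / 2) with sample
    variances; raises ZeroDivisionError if both groups have zero variance.
    """
    pooled_sd = sqrt((variance(treated) + variance(control)) / 2)
    return (mean(treated) - mean(control)) / pooled_sd


# Hypothetical covariate values (e.g. ages); not data from the paper.
treated_ages = [50, 52, 54]
control_ages = [48, 50, 52]
smd = standardized_mean_difference(treated_ages, control_ages)
print(smd)
```

A value near zero indicates that the two groups are balanced on that covariate. The abstract's point is that an assignment can pass such a check and still yield a large treatment effect estimation error.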
-
NFT-K: Non-Fungible Tangent Kernels
Authors:
Sina Alemohammad,
Hossein Babaei,
CJ Barberan,
Naiming Liu,
Lorenzo Luzi,
Blake Mason,
Richard G. Baraniuk
Abstract:
Deep neural networks have become essential for numerous applications, such as vision, RL, and classification, due to their strong empirical performance. Unfortunately, these networks are quite difficult to interpret, and this limits their applicability in settings where interpretability is important for safety, such as medical imaging. One type of deep neural network model is the neural tangent kernel, which is similar to a kernel machine and provides some aspect of interpretability. To further contribute interpretability with respect to classification and the layers, we develop a new network as a combination of multiple neural tangent kernels, one to model each layer of the deep neural network individually, as opposed to past work, which attempts to represent the entire network via a single neural tangent kernel. We demonstrate the interpretability of this model on two datasets, showing that the multiple-kernels model elucidates the interplay between the layers and predictions.
Submitted 10 October, 2021;
originally announced October 2021.
-
Enhanced Recurrent Neural Tangent Kernels for Non-Time-Series Data
Authors:
Sina Alemohammad,
Randall Balestriero,
Zichao Wang,
Richard Baraniuk
Abstract:
Kernels derived from deep neural networks (DNNs) in the infinite-width regime provide not only high performance in a range of machine learning tasks but also new theoretical insights into DNN training dynamics and generalization. In this paper, we extend the family of kernels associated with recurrent neural networks (RNNs), which were previously derived only for simple RNNs, to more complex architectures including bidirectional RNNs and RNNs with average pooling. We also develop a fast GPU implementation to exploit the full practical potential of the kernels. Though RNNs are typically only applied to time-series data, we demonstrate that classifiers using RNN-based kernels outperform a range of baseline methods on 90 non-time-series datasets from the UCI data repository.
Submitted 19 October, 2021; v1 submitted 8 December, 2020;
originally announced December 2020.
-
Wearing a MASK: Compressed Representations of Variable-Length Sequences Using Recurrent Neural Tangent Kernels
Authors:
Sina Alemohammad,
Hossein Babaei,
Randall Balestriero,
Matt Y. Cheung,
Ahmed Imtiaz Humayun,
Daniel LeJeune,
Naiming Liu,
Lorenzo Luzi,
Jasper Tan,
Zichao Wang,
Richard G. Baraniuk
Abstract:
High dimensionality poses many challenges to the use of data, from visualization and interpretation, to prediction and storage for historical preservation. Techniques abound to reduce the dimensionality of fixed-length sequences, yet these methods rarely generalize to variable-length sequences. To address this gap, we extend existing methods that rely on the use of kernels to variable-length sequences via use of the Recurrent Neural Tangent Kernel (RNTK). Since a deep neural network with ReLU activation is a Max-Affine Spline Operator (MASO), we dub our approach Max-Affine Spline Kernel (MASK). We demonstrate how MASK can be used to extend principal components analysis (PCA) and t-distributed stochastic neighbor embedding (t-SNE) and apply these new algorithms to separate synthetic time series data sampled from second-order differential equations.
Submitted 17 April, 2021; v1 submitted 26 October, 2020;
originally announced October 2020.
-
The Recurrent Neural Tangent Kernel
Authors:
Sina Alemohammad,
Zichao Wang,
Randall Balestriero,
Richard Baraniuk
Abstract:
The study of deep neural networks (DNNs) in the infinite-width limit, via the so-called neural tangent kernel (NTK) approach, has provided new insights into the dynamics of learning, generalization, and the impact of initialization. One key DNN architecture remains to be kernelized, namely, the recurrent neural network (RNN). In this paper we introduce and study the Recurrent Neural Tangent Kernel (RNTK), which provides new insights into the behavior of overparametrized RNNs. A key property of the RNTK that should greatly benefit practitioners is its ability to compare inputs of different lengths. To this end, we characterize how the RNTK weights different time steps to form its output under different initialization parameters and nonlinearity choices. Experiments on one synthetic and 56 real-world data sets demonstrate that the RNTK offers significant performance gains over other kernels, including standard NTKs, across a wide array of data sets.
Submitted 14 June, 2021; v1 submitted 17 June, 2020;
originally announced June 2020.
-
Characterization of Vegetation and Soil Scattering Mechanisms across Different Biomes using P-band SAR Polarimetry
Authors:
Seyed Hamed Alemohammad,
Alexandra G. Konings,
Thomas Jagdhuber,
Mahta Moghaddam,
Dara Entekhabi
Abstract:
Understanding the scattering mechanisms from the ground surface in the presence of different vegetation densities is necessary for the interpretation of P-band Synthetic Aperture Radar (SAR) observations and for the design of geophysical retrieval algorithms. In this study, a quantitative analysis of vegetation and soil scattering mechanisms estimated from the observations of an airborne P-band SAR instrument across nine different biomes in North America is presented. The goal is to apply a hybrid (model- and eigen-based) three-component decomposition approach to separate the contributions of surface, double-bounce and vegetation volume scattering across a wide range of biome conditions. The decomposition makes no prior assumptions about vegetation structure. We characterize the dynamics of the decomposition across different North American biomes and assess their characteristic range. Impacts of vegetation cover seasonality and soil surface roughness on the contributions of each scattering mechanism are also investigated. Observations used here are part of the NASA Airborne Microwave Observatory of Subcanopy and Subsurface (AirMOSS) mission, and the data were collected between 2013 and 2015.
Submitted 8 August, 2017; v1 submitted 8 November, 2016;
originally announced November 2016.
-
A Framework for Modelling Probabilistic Uncertainty in Rainfall Scenario Analysis
Authors:
Seyed Hamed Alemohammad,
Reza Ardakanian,
Akbar Karimi
Abstract:
Predicting probable future values of model parameters is an essential prerequisite for assessing the reliability of model decisions in an uncertain environment. Scenario Analysis is a methodology for modelling uncertainty in water resources management. Uncertainty, if not considered appropriately in decision making, will decrease the reliability of decisions, especially in long-term planning. One of the challenges in Scenario Analysis is how scenarios are generated. One of the most widely accepted methods is statistical modelling based on Auto-Regressive models. Future stream flow scenarios in developed basins, where humans have altered the natural flow process, cannot be generated directly by ARMA modelling. In this case, generating scenarios for monthly rainfall and using them in a water resources system model makes more sense. Rainfall is an ephemeral process that has zero values in some months, which limits the use of a monthly ARMA model. Therefore, a two-stage modelling approach is adopted here, in which yearly modelling is done in the first stage. Within this yearly model, three ranges are identified: Dry, Normal and Wet. In the Normal range, yearly ARMA modelling is used. The Dry and Wet ranges are considered random processes and are modelled by frequency analysis. The monthly distribution of rainfall, which is extracted from a moving average of the available data, is considered to be deterministic and fixed in time. Each rainfall scenario is composed of a yearly ARMA process superimposed with dry and wet events according to the frequency analysis. This modelling framework is applied to available data from three rain-gauge stations in Iran. Results show that this modelling approach is more consistent with observed data than ARMA modelling alone.
Submitted 15 April, 2013;
originally announced April 2013.
-
Global Warming and Caspian Sea Level Fluctuations
Authors:
Reza Ardakanian,
Seyed Hamed Alemohammad
Abstract:
Coastal regions have high social, economic and environmental importance. Because of this importance, sea level fluctuations can have many adverse consequences. In this research, the correlation between the increasing trend of temperature at coastal stations due to Global Warming and the Caspian Sea level is established. The Caspian Sea level data were obtained from the Jason-1 satellite. The results show that the monthly correlation between temperature and sea level is high and positive, and almost the same for all stations. The yearly correlation, however, is negative, meaning that the sea level has decreased as temperature has increased.
Submitted 23 June, 2014; v1 submitted 11 April, 2013;
originally announced April 2013.
-
Merging Satellite Measurements of Rainfall Using Multi-scale Imagery Technique
Authors:
Seyed Hamed Alemohammad,
Dara Entekhabi
Abstract:
Several passive microwave satellites orbit the Earth and measure rainfall. These measurements have the advantage of almost full global coverage when compared to surface rain gauges. However, these satellites have low temporal revisit and missing data over some regions. Image fusion is a useful technique to fill in the gaps of one image (one satellite measurement) using another one. The proposed algorithm uses an iterative fusion scheme to integrate information from two satellite measurements. The algorithm is implemented on two datasets for 7 years of half-hourly data. The results show significant improvements in rain detection and rain intensity in the merged measurements.
Submitted 11 April, 2013;
originally announced April 2013.