-
Transparent Anomaly Detection via Concept-based Explanations
Authors:
Laya Rafiee Sevyeri,
Ivaxi Sheth,
Farhood Farahnak,
Samira Ebrahimi Kahou,
Shirin Abbasinejad Enger
Abstract:
Advancements in deep learning techniques have given a boost to the performance of anomaly detection. However, real-world and safety-critical applications demand a level of transparency and reasoning beyond accuracy. The task of anomaly detection (AD) focuses on finding whether a given sample follows the learned distribution. Existing methods lack the ability to reason with clear explanations for their outcomes. Hence, to overcome this challenge, we propose Transparent Anomaly Detection Concept Explanations (ACE). ACE is able to provide human-interpretable explanations in the form of concepts along with anomaly prediction. To the best of our knowledge, this is the first paper that proposes interpretable-by-design anomaly detection. In addition to promoting transparency in AD, it allows for effective human-model interaction. Our proposed model achieves results higher than or comparable to those of black-box, uninterpretable models. We validate the performance of ACE across three realistic datasets: bird classification on CUB-200-2011, challenging histopathology slide image classification on TIL-WSI-TCGA, and gender classification on CelebA. We further demonstrate that our concept learning paradigm can be seamlessly integrated with other classification-based AD methods.
Submitted 1 November, 2023; v1 submitted 16 October, 2023;
originally announced October 2023.
-
Source-free Domain Adaptation Requires Penalized Diversity
Authors:
Laya Rafiee Sevyeri,
Ivaxi Sheth,
Farhood Farahnak,
Alexandre See,
Samira Ebrahimi Kahou,
Thomas Fevens,
Mohammad Havaei
Abstract:
While neural networks are capable of achieving human-like performance in many tasks such as image classification, the impressive performance of each model is limited to its own dataset. Source-free domain adaptation (SFDA) was introduced to address knowledge transfer between different domains in the absence of source data, thus increasing data privacy. Diversity in representation space can be vital to a model's adaptability in varied and difficult domains. In unsupervised SFDA, the diversity is limited to learning a single hypothesis on the source or learning multiple hypotheses with a shared feature extractor. Motivated by the improved predictive performance of ensembles, we propose a novel unsupervised SFDA algorithm that promotes representational diversity through the use of separate feature extractors with Distinct Backbone Architectures (DBA). Although diversity in feature space is increased, the unconstrained mutual information (MI) maximization may amplify weak hypotheses. Thus, we introduce the Weak Hypothesis Penalization (WHP) regularizer as a mitigation strategy. Our work proposes Penalized Diversity (PD), where the synergy of DBA and WHP is applied to unsupervised source-free domain adaptation for covariate shift. In addition, PD is augmented with a weighted MI maximization objective for label distribution shift. Empirical results on natural, synthetic, and medical domains demonstrate the effectiveness of PD under different distributional shifts.
Submitted 12 April, 2023; v1 submitted 5 April, 2023;
originally announced April 2023.
-
AdaBest: Minimizing Client Drift in Federated Learning via Adaptive Bias Estimation
Authors:
Farshid Varno,
Marzie Saghayi,
Laya Rafiee Sevyeri,
Sharut Gupta,
Stan Matwin,
Mohammad Havaei
Abstract:
In Federated Learning (FL), a number of clients or devices collaborate to train a model without sharing their data. Models are optimized locally at each client and further communicated to a central hub for aggregation. While FL is an appealing decentralized training paradigm, heterogeneity among data from different clients can cause the local optimization to drift away from the global objective. In order to estimate and therefore remove this drift, variance reduction techniques have been incorporated into FL optimization recently. However, these approaches inaccurately estimate the clients' drift and ultimately fail to remove it properly. In this work, we propose an adaptive algorithm that accurately estimates drift across clients. In comparison to previous works, our approach necessitates less storage and communication bandwidth, as well as lower compute costs. Additionally, our proposed methodology induces stability by constraining the norm of estimates for client drift, making it more practical for large scale FL. Experimental findings demonstrate that the proposed algorithm converges significantly faster and achieves higher accuracy than the baselines across various FL benchmarks.
Submitted 24 July, 2023; v1 submitted 27 April, 2022;
originally announced April 2022.
-
Bohemian Matrix Geometry
Authors:
Robert M. Corless,
George Labahn,
Dan Piponi,
Leili Rafiee Sevyeri
Abstract:
A Bohemian matrix family is a set of matrices all of whose entries are drawn from a fixed, usually discrete and hence bounded, subset of a field of characteristic zero. Originally these were integers -- hence the name, from the acronym BOunded HEight Matrix of Integers (BOHEMI) -- but other kinds of entries are also interesting. Some kinds of questions about Bohemian matrices can be answered by numerical computation, but sometimes exact computation is better. In this paper we explore some Bohemian families (symmetric, upper Hessenberg, or Toeplitz) computationally, and answer some open questions posed about the distributions of eigenvalue densities.
Submitted 26 April, 2022; v1 submitted 15 February, 2022;
originally announced February 2022.
-
On the effectiveness of generative adversarial network on anomaly detection
Authors:
Laya Rafiee Sevyeri,
Thomas Fevens
Abstract:
Identifying anomalies refers to detecting samples that do not resemble the training data distribution. Many generative models have been used to find anomalies, and among them, generative adversarial network (GAN)-based approaches are currently very popular. GANs mainly rely on the rich contextual information of these models to identify the actual training distribution. Following this analogy, we suggested a new unsupervised model based on GANs: a combination of an autoencoder and a GAN. Further, a new scoring function was introduced to target anomalies, where a linear combination of the internal representation of the discriminator and the generator's visual representation, plus the encoded representation of the autoencoder, come together to define the proposed anomaly score. The model was further evaluated on benchmark datasets such as SVHN, CIFAR10, and MNIST, as well as a public medical dataset of leukemia images. In all the experiments, our model outperformed its existing counterparts while slightly improving the inference time.
Submitted 31 December, 2021;
originally announced December 2021.
-
Approximate GCD in Lagrange bases
Authors:
Leili Rafiee Sevyeri,
Robert M. Corless
Abstract:
For a pair of polynomials with real or complex coefficients, given in any particular basis, the problem of finding their GCD is known to be ill-posed. An answer is still desired for many applications, however. Hence, looking for a GCD of so-called approximate polynomials where this term explicitly denotes small uncertainties in the coefficients has received significant attention in the field of hybrid symbolic-numeric computation. In this paper we give an algorithm, based on one of Victor Ya. Pan, to find an approximate GCD for a pair of approximate polynomials given in a Lagrange basis. More precisely, we suppose that these polynomials are given by their approximate values at distinct known points. We first find each of their roots by using a Lagrange basis companion matrix for each polynomial, cluster the roots of each polynomial to identify multiple roots, and then "marry" the two polynomials to find their GCD. At no point do we change to the monomial basis, thus preserving the good conditioning properties of the original Lagrange basis. We discuss advantages and drawbacks of this method. The computational cost is dominated by the rootfinding step; unless special-purpose eigenvalue algorithms are used, the cost is cubic in the degrees of the polynomials. In principle, this cost could be reduced but we do not do so here.
Submitted 25 March, 2021;
originally announced March 2021.
-
Equivalences for Linearizations of Matrix Polynomials
Authors:
Robert M. Corless,
Leili Rafiee Sevyeri,
B. David Saunders
Abstract:
One useful standard method to compute eigenvalues of matrix polynomials ${\bf P}(z) \in \mathbb{C}^{n\times n}[z]$ of degree at most $\ell$ in $z$ (denoted of grade $\ell$, for short) is to first transform ${\bf P}(z)$ to an equivalent linear matrix polynomial ${\bf L}(z)=z{\bf B}-{\bf A}$, called a companion pencil, where ${\bf A}$ and ${\bf B}$ are usually of larger dimension than ${\bf P}(z)$ but ${\bf L}(z)$ is now only of grade $1$ in $z$. The eigenvalues and eigenvectors of ${\bf L}(z)$ can be computed numerically by, for instance, the QZ algorithm. The eigenvectors of ${\bf P}(z)$, including those for infinite eigenvalues, can also be recovered from eigenvectors of ${\bf L}(z)$ if ${\bf L}(z)$ is what is called a "strong linearization" of ${\bf P}(z)$.
In this paper we show how to use algorithms for computing the Hermite Normal Form of a companion matrix for a scalar polynomial to direct the discovery of unimodular matrix polynomial cofactors ${\bf E}(z)$ and ${\bf F}(z)$ which, via the equation ${\bf E}(z){\bf L}(z){\bf F}(z) = \mathrm{diag}( {\bf P}(z), {\bf I}_n, \ldots, {\bf I}_n)$, explicitly show the equivalence of ${\bf P}(z)$ and ${\bf L}(z)$. By this method we give new explicit constructions for several linearizations using different polynomial bases. We contrast these new unimodular pairs with those constructed by strict equivalence, some of which are also new to this paper. We discuss the limitations of this experimental, computational discovery method of finding unimodular cofactors.
Submitted 18 February, 2021;
originally announced February 2021.
-
Approximate GCD in a Bernstein basis
Authors:
Robert M. Corless,
Leili Rafiee Sevyeri
Abstract:
We adapt Victor Y. Pan's root-based algorithm for finding approximate GCD to the case where the polynomials are expressed in Bernstein bases. We use the numerically stable companion pencil of Gudbjörn F. Jónsson to compute the roots, and the Hopcroft-Karp bipartite matching method to find the degree of the approximate GCD. We offer some refinements to improve the process.
Submitted 4 October, 2019;
originally announced October 2019.
-
Generalized Standard Triples for Algebraic Linearizations of Matrix Polynomials
Authors:
Eunice Y. S. Chan,
Robert M. Corless,
Leili Rafiee Sevyeri
Abstract:
We define \emph{generalized standard triples} $\mathbf{X}$, $\mathbf{Y}$, and $L(z) = z\mathbf{C}_{1} - \mathbf{C}_{0}$, where $L(z)$ is a linearization of a regular matrix polynomial $\mathbf{P}(z) \in \mathbb{C}^{n \times n}[z]$, in order to use the representation $\mathbf{X}(z \mathbf{C}_{1}~-~\mathbf{C}_{0})^{-1}\mathbf{Y}~=~\mathbf{P}^{-1}(z)$ which holds except when $z$ is an eigenvalue of $\mathbf{P}$. This representation can be used in constructing so-called \emph{algebraic linearizations} for matrix polynomials of the form $\mathbf{H}(z) = z \mathbf{A}(z)\mathbf{B}(z) + \mathbf{C} \in \mathbb{C}^{n \times n}[z]$ from generalized standard triples of $\mathbf{A}(z)$ and $\mathbf{B}(z)$. This can be done even if $\mathbf{A}(z)$ and $\mathbf{B}(z)$ are expressed in differing polynomial bases. Our main theorem is that $\mathbf{X}$ can be expressed using the coefficients of the expression $1 = \sum_{k=0}^\ell e_k \phi_k(z)$ in terms of the relevant polynomial basis. For convenience, we tabulate generalized standard triples for orthogonal polynomial bases, the monomial basis, and Newton interpolational bases; for the Bernstein basis; for Lagrange interpolational bases; and for Hermite interpolational bases. We account for the possibility of common similarity transformations.
Submitted 25 March, 2021; v1 submitted 11 May, 2018;
originally announced May 2018.
-
The Runge Example for Interpolation and Wilkinson's Examples for Rootfinding
Authors:
Robert M. Corless,
Leili Rafiee Sevyeri
Abstract:
We look at two classical examples in the theory of numerical analysis, namely the Runge example for interpolation and Wilkinson's example (actually two examples) for rootfinding. We use the modern theory of backward error analysis and conditioning, as instigated and popularized by Wilkinson, but refined by Farouki and Rajan. By this means, we arrive at a satisfactory explanation of the puzzling phenomena encountered by students when they try to fit polynomials to numerical data, or when they try to use numerical rootfinding to find polynomial zeros. Computer algebra, with its controlled, arbitrary precision, plays an important didactic role.
Submitted 23 April, 2018;
originally announced April 2018.
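The Runge example mentioned in the abstract above can be reproduced in a few lines. The following is only an illustrative sketch, not the paper's backward-error analysis: it assumes the usual test function 1/(1 + 25x^2) on [-1, 1], a degree of 20, and a plain Lagrange evaluation, and all function names are hypothetical. It compares the maximum interpolation error for equispaced nodes with that for Chebyshev nodes.

```python
import math

def runge(x):
    # The classical Runge test function on [-1, 1].
    return 1.0 / (1.0 + 25.0 * x * x)

def lagrange_eval(nodes, values, x):
    # Evaluate the interpolating polynomial at x using the Lagrange formula.
    total = 0.0
    for j, xj in enumerate(nodes):
        term = values[j]
        for k, xk in enumerate(nodes):
            if k != j:
                term *= (x - xk) / (xj - xk)
        total += term
    return total

def max_interp_error(nodes, samples=2001):
    # Largest |p(x) - f(x)| over a fine uniform grid on [-1, 1].
    values = [runge(t) for t in nodes]
    grid = [-1.0 + 2.0 * i / (samples - 1) for i in range(samples)]
    return max(abs(lagrange_eval(nodes, values, x) - runge(x)) for x in grid)

n = 20  # polynomial degree (an assumed value chosen for illustration)
equispaced = [-1.0 + 2.0 * i / n for i in range(n + 1)]
chebyshev = [math.cos((2 * i + 1) * math.pi / (2 * n + 2)) for i in range(n + 1)]

err_equispaced = max_interp_error(equispaced)
err_chebyshev = max_interp_error(chebyshev)
print("max error, equispaced nodes:", err_equispaced)
print("max error, Chebyshev nodes: ", err_chebyshev)
```

With equispaced nodes the error grows near the ends of the interval as the degree increases, while Chebyshev nodes keep it small; this is the kind of puzzling behavior the abstract says students encounter when fitting polynomials to data.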
-
Stirling's Original Asymptotic Series from a Formula like one of Binet's and its Evaluation by Sequence Acceleration
Authors:
Robert M. Corless,
Leili Rafiee Sevyeri
Abstract:
We give an apparently new proof of Stirling's original asymptotic formula for the behavior of $\ln z!$ for large $z$. Stirling's original formula is not the formula widely known as "Stirling's formula", which was actually due to De Moivre. We also show by experiment that this old formula is quite effective for numerical evaluation of $\ln z!$ over $\mathbb{C}$, when coupled with the sequence acceleration method known as Levin's $u$-transform. As an homage to Stirling, who apparently used inverse symbolic computation to identify the constant term in his formula, we do the same in our proof.
Submitted 6 May, 2019; v1 submitted 14 April, 2018;
originally announced April 2018.
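As a rough numerical illustration of the abstract above, the sketch below evaluates the leading terms of the series usually attributed to Stirling himself, which expands $\ln z!$ in powers of $1/(z + 1/2)$, and compares the result with the standard library's `math.lgamma`. The abstract does not state the series, so the specific form and coefficients used here are an assumption; the function name is hypothetical, and the Levin $u$-transform acceleration used in the paper is not implemented.

```python
import math

def ln_factorial_stirling(z, terms=2):
    # Leading part of the series in Z = z + 1/2 (assumed form):
    #   ln z! ~ Z ln Z - Z + (1/2) ln(2*pi) - 1/(24 Z) + 7/(2880 Z^3) - ...
    Z = z + 0.5
    result = Z * math.log(Z) - Z + 0.5 * math.log(2.0 * math.pi)
    corrections = [-1.0 / (24.0 * Z), 7.0 / (2880.0 * Z ** 3)]
    return result + sum(corrections[:terms])

z = 10.0
approx = ln_factorial_stirling(z)
exact = math.lgamma(z + 1.0)  # ln(z!) = ln Gamma(z + 1)
print("series value:", approx)
print("lgamma value:", exact)
print("abs. error:  ", abs(approx - exact))
```

Even with only two correction terms, the truncated series agrees closely with `math.lgamma` at moderate real $z$; the paper's experiments concern evaluation over $\mathbb{C}$ with sequence acceleration, which this sketch does not attempt.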