
Transparent Anomaly Detection via Concept-based Explanations

Authors: Laya Rafiee Sevyeri, Ivaxi Sheth, Farhood Farahnak, Samira Ebrahimi Kahou, Shirin Abbasinejad Enger

Abstract: Advancements in deep learning techniques have given a boost to the performance of anomaly detection. However, real-world and safety-critical applications demand a level of transparency and reasoning beyond accuracy. The task of anomaly detection (AD) focuses on finding whether a given sample follows the learned distribution. Existing methods lack the ability to reason with clear explanations for their outcomes. Hence, to overcome this challenge, we propose Transparent Anomaly Detection Concept Explanations (ACE). ACE is able to provide human interpretable explanations in the form of concepts along with anomaly prediction. To the best of our knowledge, this is the first paper that proposes interpretable by-design anomaly detection. In addition to promoting transparency in AD, it allows for effective human-model interaction. Our proposed model shows either higher or comparable results to black-box uninterpretable models. We validate the performance of ACE across three realistic datasets - bird classification on CUB-200-2011, challenging histopathology slide image classification on TIL-WSI-TCGA, and gender classification on CelebA. We further demonstrate that our concept learning paradigm can be seamlessly integrated with other classification-based AD methods.

Submitted 1 November, 2023; v1 submitted 16 October, 2023; originally announced October 2023.

Comments: Accepted at NeurIPS XAI in Action workshop

arXiv:2304.02798 [pdf, other]

Source-free Domain Adaptation Requires Penalized Diversity

Authors: Laya Rafiee Sevyeri, Ivaxi Sheth, Farhood Farahnak, Alexandre See, Samira Ebrahimi Kahou, Thomas Fevens, Mohammad Havaei

Abstract: While neural networks are capable of achieving human-like performance in many tasks such as image classification, the impressive performance of each model is limited to its own dataset. Source-free domain adaptation (SFDA) was introduced to address knowledge transfer between different domains in the absence of source data, thus, increasing data privacy. Diversity in representation space can be vital to a model's adaptability in varied and difficult domains. In unsupervised SFDA, the diversity is limited to learning a single hypothesis on the source or learning multiple hypotheses with a shared feature extractor. Motivated by the improved predictive performance of ensembles, we propose a novel unsupervised SFDA algorithm that promotes representational diversity through the use of separate feature extractors with Distinct Backbone Architectures (DBA). Although diversity in feature space is increased, the unconstrained mutual information (MI) maximization may potentially introduce amplification of weak hypotheses. Thus we introduce the Weak Hypothesis Penalization (WHP) regularizer as a mitigation strategy. Our work proposes Penalized Diversity (PD) where the synergy of DBA and WHP is applied to unsupervised source-free domain adaptation for covariate shift. In addition, PD is augmented with a weighted MI maximization objective for label distribution shift. Empirical results on natural, synthetic, and medical domains demonstrate the effectiveness of PD under different distributional shifts.

Submitted 12 April, 2023; v1 submitted 5 April, 2023; originally announced April 2023.

arXiv:2204.13170 [pdf, other]

doi 10.1007/978-3-031-20050-2_41

AdaBest: Minimizing Client Drift in Federated Learning via Adaptive Bias Estimation

Authors: Farshid Varno, Marzie Saghayi, Laya Rafiee Sevyeri, Sharut Gupta, Stan Matwin, Mohammad Havaei

Abstract: In Federated Learning (FL), a number of clients or devices collaborate to train a model without sharing their data. Models are optimized locally at each client and further communicated to a central hub for aggregation. While FL is an appealing decentralized training paradigm, heterogeneity among data from different clients can cause the local optimization to drift away from the global objective. In order to estimate and therefore remove this drift, variance reduction techniques have been incorporated into FL optimization recently. However, these approaches inaccurately estimate the clients' drift and ultimately fail to remove it properly. In this work, we propose an adaptive algorithm that accurately estimates drift across clients. In comparison to previous works, our approach necessitates less storage and communication bandwidth, as well as lower compute costs. Additionally, our proposed methodology induces stability by constraining the norm of estimates for client drift, making it more practical for large scale FL. Experimental findings demonstrate that the proposed algorithm converges significantly faster and achieves higher accuracy than the baselines across various FL benchmarks.

Submitted 24 July, 2023; v1 submitted 27 April, 2022; originally announced April 2022.

Comments: Published as a conference paper at ECCV 2022; Corrected some typos in the text and a baseline algorithm

ACM Class: I.2; I.4; I.5

arXiv:2202.07769 [pdf, other]

doi 10.1145/3476446.3536177

Bohemian Matrix Geometry

Authors: Robert M. Corless, George Labahn, Dan Piponi, Leili Rafiee Sevyeri

Abstract: A Bohemian matrix family is a set of matrices all of whose entries are drawn from a fixed, usually discrete and hence bounded, subset of a field of characteristic zero. Originally these were integers -- hence the name, from the acronym BOunded HEight Matrix of Integers (BOHEMI) -- but other kinds of entries are also interesting. Some kinds of questions about Bohemian matrices can be answered by numerical computation, but sometimes exact computation is better. In this paper we explore some Bohemian families (symmetric, upper Hessenberg, or Toeplitz) computationally, and answer some open questions posed about the distributions of eigenvalue densities.
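
As a rough illustration of the numerical side of this abstract, the sketch below enumerates one small Bohemian family and collects its eigenvalues. The entry set {-1, 0, 1}, the size 2, and the function name `bohemian_eigenvalues` are assumptions chosen for brevity; they are not taken from the paper, which studies structured (symmetric, upper Hessenberg, Toeplitz) families.

```python
import itertools
import numpy as np

def bohemian_eigenvalues(entries, n):
    """Eigenvalues of every n-by-n matrix whose entries are drawn from `entries`.

    Brute-force enumeration: only feasible for very small n and entry sets.
    """
    # Each tuple of n*n entries is one matrix, stored row by row.
    mats = np.array(list(itertools.product(entries, repeat=n * n)), dtype=float)
    mats = mats.reshape(-1, n, n)
    # eigvals accepts a stack of matrices and returns one row of eigenvalues each.
    return np.linalg.eigvals(mats)

# Hypothetical example family: all 2-by-2 matrices with entries in {-1, 0, 1}.
eigs = bohemian_eigenvalues((-1, 0, 1), 2)
print(eigs.shape)
```

Plotting `eigs.real` against `eigs.imag` for larger families gives the kind of eigenvalue density picture the abstract refers to; floating-point enumeration like this cannot replace the exact computation the abstract says is sometimes preferable.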

Submitted 26 April, 2022; v1 submitted 15 February, 2022; originally announced February 2022.

Comments: 22 pages; 12 figures

MSC Class: 15B05 ACM Class: I.1.4

arXiv:2112.15541 [pdf, other]

doi 10.1007/978-3-030-61609-0_38

On the Effectiveness of Generative Adversarial Network on Anomaly Detection

Authors: Laya Rafiee Sevyeri, Thomas Fevens

Abstract: Identifying anomalies refers to detecting samples that do not resemble the training data distribution. Many generative models have been used to find anomalies, and among them, generative adversarial network (GAN)-based approaches are currently very popular. GANs mainly rely on the rich contextual information of these models to identify the actual training distribution. Following this analogy, we suggested a new unsupervised model based on GANs -- a combination of an autoencoder and a GAN. Further, a new scoring function was introduced to target anomalies where a linear combination of the internal representation of the discriminator and the generator's visual representation, plus the encoded representation of the autoencoder, come together to define the proposed anomaly score. The model was further evaluated on benchmark datasets such as SVHN, CIFAR10, and MNIST, as well as a public medical dataset of leukemia images. In all the experiments, our model outperformed its existing counterparts while slightly improving the inference time.

Submitted 31 December, 2021; originally announced December 2021.

Comments: This paper is an improved version of an existing paper published by the same authors in ICANN2020

arXiv:2103.13949 [pdf, other]

Approximate GCD in Lagrange bases

Authors: Leili Rafiee Sevyeri, Robert M. Corless

Abstract: For a pair of polynomials with real or complex coefficients, given in any particular basis, the problem of finding their GCD is known to be ill-posed. An answer is still desired for many applications, however. Hence, looking for a GCD of so-called approximate polynomials where this term explicitly denotes small uncertainties in the coefficients has received significant attention in the field of hybrid symbolic-numeric computation. In this paper we give an algorithm, based on one of Victor Ya. Pan, to find an approximate GCD for a pair of approximate polynomials given in a Lagrange basis. More precisely, we suppose that these polynomials are given by their approximate values at distinct known points. We first find each of their roots by using a Lagrange basis companion matrix for each polynomial, cluster the roots of each polynomial to identify multiple roots, and then "marry" the two polynomials to find their GCD. At no point do we change to the monomial basis, thus preserving the good conditioning properties of the original Lagrange basis. We discuss advantages and drawbacks of this method. The computational cost is dominated by the rootfinding step; unless special-purpose eigenvalue algorithms are used, the cost is cubic in the degrees of the polynomials. In principle, this cost could be reduced but we do not do so here.

Submitted 25 March, 2021; originally announced March 2021.

MSC Class: 68W25

arXiv:2102.09726 [pdf, ps, other]

Equivalences for Linearizations of Matrix Polynomials

Authors: Robert M. Corless, Leili Rafiee Sevyeri, B. David Saunders

Abstract: One useful standard method to compute eigenvalues of matrix polynomials ${\bf P}(z) \in \mathbb{C}^{n\times n}[z]$ of degree at most $\ell$ in $z$ (denoted of grade $\ell$, for short) is to first transform ${\bf P}(z)$ to an equivalent linear matrix polynomial ${\bf L}(z)=z{\bf B}-{\bf A}$, called a companion pencil, where ${\bf A}$ and ${\bf B}$ are usually of larger dimension than ${\bf P}(z)$ but ${\bf L}(z)$ is now only of grade $1$ in $z$. The eigenvalues and eigenvectors of ${\bf L}(z)$ can be computed numerically by, for instance, the QZ algorithm. The eigenvectors of ${\bf P}(z)$, including those for infinite eigenvalues, can also be recovered from eigenvectors of ${\bf L}(z)$ if ${\bf L}(z)$ is what is called a "strong linearization" of ${\bf P}(z)$. In this paper we show how to use algorithms for computing the Hermite Normal Form of a companion matrix for a scalar polynomial to direct the discovery of unimodular matrix polynomial cofactors ${\bf E}(z)$ and ${\bf F}(z)$ which, via the equation ${\bf E}(z){\bf L}(z){\bf F}(z) = \mathrm{diag}( {\bf P}(z), {\bf I}_n, \ldots, {\bf I}_n)$, explicitly show the equivalence of ${\bf P}(z)$ and ${\bf L}(z)$. By this method we give new explicit constructions for several linearizations using different polynomial bases. We contrast these new unimodular pairs with those constructed by strict equivalence, some of which are also new to this paper. We discuss the limitations of this experimental, computational discovery method of finding unimodular cofactors.
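
To make the first sentence of this abstract concrete, here is a minimal sketch of a monomial-basis companion pencil $z{\bf B}-{\bf A}$ for a grade-2 matrix polynomial ${\bf P}(z) = {\bf A}_2 z^2 + {\bf A}_1 z + {\bf A}_0$. The function names, the choice of grade 2, and the test polynomial are assumptions for illustration. The abstract mentions the QZ algorithm; this sketch instead solves with ${\bf B}$ directly, which is only valid when ${\bf B}$ (here, ${\bf A}_2$) is nonsingular and therefore cannot find infinite eigenvalues.

```python
import numpy as np

def companion_pencil(A0, A1, A2):
    """Return (A, B) such that z*B - A linearizes P(z) = A2 z^2 + A1 z + A0."""
    n = A0.shape[0]
    I = np.eye(n)
    Z = np.zeros((n, n))
    # With v = [z*x; x], (z*B - A) v = [P(z) x; 0], so P(z) x = 0 gives an eigenpair.
    A = np.block([[-A1, -A0], [I, Z]])
    B = np.block([[A2, Z], [Z, I]])
    return A, B

def pencil_eigenvalues(A, B):
    """Finite eigenvalues of z*B - A, assuming B is nonsingular (no QZ here)."""
    return np.linalg.eigvals(np.linalg.solve(B, A))

# Hypothetical test case: P(z) = diag((z-1)(z-2), (z-3)(z-4)).
A2 = np.eye(2)
A1 = np.diag([-3.0, -7.0])
A0 = np.diag([2.0, 12.0])
A, B = companion_pencil(A0, A1, A2)
pencil_eigs = np.sort(pencil_eigenvalues(A, B).real)
print(pencil_eigs)
```

The pencil is 4-by-4 while ${\bf P}(z)$ is 2-by-2, which is the trade described in the abstract: larger dimension in exchange for grade 1 in $z$.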

Submitted 18 February, 2021; originally announced February 2021.

MSC Class: 65F15; 15A22; 65D05

arXiv:1910.01998 [pdf, ps, other]

Approximate GCD in a Bernstein basis

Authors: Robert M. Corless, Leili Rafiee Sevyeri

Abstract: We adapt Victor Y. Pan's root-based algorithm for finding approximate GCD to the case where the polynomials are expressed in Bernstein bases. We use the numerically stable companion pencil of Gudbjörn F. Jónsson to compute the roots, and the Hopcroft-Karp bipartite matching method to find the degree of the approximate GCD. We offer some refinements to improve the process.
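
The matching step mentioned in this abstract can be sketched independently of the polynomial basis: given the two root lists, connect roots that lie within a tolerance and take the size of a maximum bipartite matching as the candidate GCD degree. Everything below is an assumption for illustration: the function name `approx_gcd_degree`, the tolerance rule, and the use of a simple augmenting-path matching in place of Hopcroft-Karp (both return a maximum matching; Hopcroft-Karp is faster on large inputs). The roots are taken as given, so the Bernstein-basis rootfinding is not shown.

```python
def approx_gcd_degree(roots_f, roots_g, tol):
    """Size of a maximum matching between roots closer than `tol`.

    Roots are listed with multiplicity; each listed root is one vertex.
    """
    # adj[i] lists the indices of roots of g within tol of the i-th root of f.
    adj = [[j for j, s in enumerate(roots_g) if abs(r - s) < tol]
           for r in roots_f]
    match_g = [-1] * len(roots_g)  # match_g[j] = index of the f-root paired with g-root j

    def try_augment(i, seen):
        # Depth-first search for an augmenting path starting at f-root i.
        for j in adj[i]:
            if j in seen:
                continue
            seen.add(j)
            if match_g[j] == -1 or try_augment(match_g[j], seen):
                match_g[j] = i
                return True
        return False

    return sum(try_augment(i, set()) for i in range(len(roots_f)))

# Hypothetical data: two shared roots up to a small perturbation, one unshared.
deg = approx_gcd_degree([1.0, 2.0, 3.0], [1.0 + 1e-8, 2.0 - 1e-8, 5.0], tol=1e-6)
print(deg)
```

A matching is used rather than a greedy nearest-root pairing because a greedy choice can block a later pairing; with roots `[0.0, 0.9]` and `[0.5, -0.6]` at tolerance 0.7, pairing 0.0 with 0.5 first would leave 0.9 unmatched, while the maximum matching pairs both.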

Submitted 4 October, 2019; originally announced October 2019.

arXiv:1805.04488 [pdf, ps, other]

Generalized Standard Triples for Algebraic Linearizations of Matrix Polynomials

Authors: Eunice Y. S. Chan, Robert M. Corless, Leili Rafiee Sevyeri

Abstract: We define \emph{generalized standard triples} $\mathbf{X}$, $\mathbf{Y}$, and $L(z) = z\mathbf{C}_{1} - \mathbf{C}_{0}$, where $L(z)$ is a linearization of a regular matrix polynomial $\mathbf{P}(z) \in \mathbb{C}^{n \times n}[z]$, in order to use the representation $\mathbf{X}(z \mathbf{C}_{1}~-~\mathbf{C}_{0})^{-1}\mathbf{Y}~=~\mathbf{P}^{-1}(z)$ which holds except when $z$ is an eigenvalue of $\mathbf{P}$. This representation can be used in constructing so-called \emph{algebraic linearizations} for matrix polynomials of the form $\mathbf{H}(z) = z \mathbf{A}(z)\mathbf{B}(z) + \mathbf{C} \in \mathbb{C}^{n \times n}[z]$ from generalized standard triples of $\mathbf{A}(z)$ and $\mathbf{B}(z)$. This can be done even if $\mathbf{A}(z)$ and $\mathbf{B}(z)$ are expressed in differing polynomial bases. Our main theorem is that $\mathbf{X}$ can be expressed using the coefficients of the expression $1 = \sum_{k=0}^\ell e_k \phi_k(z)$ in terms of the relevant polynomial basis. For convenience, we tabulate generalized standard triples for orthogonal polynomial bases, the monomial basis, and Newton interpolational bases; for the Bernstein basis; for Lagrange interpolational bases; and for Hermite interpolational bases. We account for the possibility of common similarity transformations.

Submitted 25 March, 2021; v1 submitted 11 May, 2018; originally announced May 2018.

Comments: 18 pages

MSC Class: 65F15; 15A22; 65D05

arXiv:1804.08561 [pdf, other]

The Runge Example for Interpolation and Wilkinson's Examples for Rootfinding

Authors: Robert M. Corless, Leili Rafiee Sevyeri

Abstract: We look at two classical examples in the theory of numerical analysis, namely the Runge example for interpolation and Wilkinson's example (actually two examples) for rootfinding. We use the modern theory of backward error analysis and conditioning, as instigated and popularized by Wilkinson, but refined by Farouki and Rajan. By this means, we arrive at a satisfactory explanation of the puzzling phenomena encountered by students when they try to fit polynomials to numerical data, or when they try to use numerical rootfinding to find polynomial zeros. Computer algebra, with its controlled, arbitrary precision, plays an important didactic role.
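
The Runge example named in this abstract is easy to reproduce. The sketch below interpolates $1/(1+25x^2)$ on $[-1,1]$ at equally spaced nodes and at Chebyshev nodes and compares the maximum errors. The degree 20, the evaluation grid, the barycentric evaluation scheme, and all function names are assumptions chosen for illustration; the paper's own analysis rests on backward error and conditioning, which this sketch does not attempt.

```python
import numpy as np

def bary_weights(nodes):
    """Barycentric weights w_j = 1 / prod_{k != j} (x_j - x_k)."""
    diff = nodes[:, None] - nodes[None, :]
    np.fill_diagonal(diff, 1.0)  # skip the k == j factor
    return 1.0 / diff.prod(axis=1)

def bary_eval(nodes, values, x):
    """Evaluate the interpolating polynomial at points x (barycentric form)."""
    w = bary_weights(nodes)
    d = x[:, None] - nodes[None, :]
    hit = d == 0.0          # evaluation points that coincide with a node
    d[hit] = 1.0            # placeholder to avoid division by zero
    terms = w / d
    p = (terms * values).sum(axis=1) / terms.sum(axis=1)
    rows, cols = np.nonzero(hit)
    p[rows] = values[cols]  # at a node, the interpolant equals the data
    return p

def runge(x):
    return 1.0 / (1.0 + 25.0 * x**2)

def max_interp_error(nodes):
    x = np.linspace(-1.0, 1.0, 2001)
    return np.max(np.abs(bary_eval(nodes, runge(nodes), x) - runge(x)))

n = 20  # polynomial degree (assumed for this example)
equispaced = np.linspace(-1.0, 1.0, n + 1)
chebyshev = np.cos((2 * np.arange(n + 1) + 1) * np.pi / (2 * n + 2))
err_equi = max_interp_error(equispaced)
err_cheb = max_interp_error(chebyshev)
print(err_equi, err_cheb)
```

The equispaced error is large near the ends of the interval while the Chebyshev error is small, which is the puzzling behaviour for data fitting that the abstract sets out to explain.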

Submitted 23 April, 2018; originally announced April 2018.

MSC Class: 97N40

arXiv:1804.05263 [pdf, other]

Stirling's Original Asymptotic Series from a Formula like one of Binet's and its Evaluation by Sequence Acceleration

Authors: Robert M. Corless, Leili Rafiee Sevyeri

Abstract: We give an apparently new proof of Stirling's original asymptotic formula for the behavior of $\ln z!$ for large $z$. Stirling's original formula is not the formula widely known as "Stirling's formula", which was actually due to De Moivre. We also show by experiment that this old formula is quite effective for numerical evaluation of $\ln z!$ over $\mathbb{C}$, when coupled with the sequence acceleration method known as Levin's $u$-transform. As an homage to Stirling, who apparently used inverse symbolic computation to identify the constant term in his formula, we do the same in our proof.
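
The abstract does not write the series out. As commonly quoted, Stirling's original expansion is in powers of $Z = z + \tfrac12$: $\ln z! \sim Z\ln Z - Z + \tfrac12\ln 2\pi - \frac{1}{24Z} + \frac{7}{2880Z^3} - \cdots$. The sketch below evaluates this truncated series for real $z$ and compares it with the standard library's log-gamma. The truncation after two correction terms, the restriction to real arguments, and the function name are assumptions for illustration; Levin's $u$-transform, which the abstract pairs with the series, is not implemented here.

```python
import math

def stirling_original(z):
    """Truncated series for ln z! in powers of Z = z + 1/2 (real z only).

    Assumed form of the expansion; only two correction terms are kept.
    """
    Z = z + 0.5
    return (Z * math.log(Z) - Z + 0.5 * math.log(2.0 * math.pi)
            - 1.0 / (24.0 * Z) + 7.0 / (2880.0 * Z**3))

z = 10.0
approx = stirling_original(z)
exact = math.lgamma(z + 1.0)  # ln Gamma(z + 1) = ln z!
print(approx, exact, abs(approx - exact))
```

Because the series is asymptotic rather than convergent, adding terms helps only up to a point for fixed $z$, which is where a sequence acceleration method becomes relevant.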

Submitted 6 May, 2019; v1 submitted 14 April, 2018; originally announced April 2018.

MSC Class: 3303

Showing 1–11 of 11 results for author: Sevyeri, L R