Search | arXiv e-print repository

Formal Verification of Object Detection

Authors: Avraham Raviv, Yizhak Y. Elboher, Michelle Aluf-Medina, Yael Leibovich Weiss, Omer Cohen, Roy Assa, Guy Katz, Hillel Kugler

Abstract: Deep Neural Networks (DNNs) are ubiquitous in real-world applications, yet they remain vulnerable to errors and adversarial attacks. This work tackles the challenge of applying formal verification to ensure the safety of computer vision models, extending verification beyond image classification to object detection. We propose a general formulation for certifying the robustness of object detection… ▽ More Deep Neural Networks (DNNs) are ubiquitous in real-world applications, yet they remain vulnerable to errors and adversarial attacks. This work tackles the challenge of applying formal verification to ensure the safety of computer vision models, extending verification beyond image classification to object detection. We propose a general formulation for certifying the robustness of object detection models using formal verification and outline implementation strategies compatible with state-of-the-art verification tools. Our approach enables the application of these tools, originally designed for verifying classification models, to object detection. We define various attacks for object detection, illustrating the diverse ways adversarial inputs can compromise neural network outputs. Our experiments, conducted on several common datasets and networks, reveal potential errors in object detection models, highlighting system vulnerabilities and emphasizing the need for expanding formal verification to these new domains. This work paves the way for further research in integrating formal verification across a broader range of computer vision applications. △ Less

Submitted 1 July, 2024; originally announced July 2024.

arXiv:2404.07153 [pdf, other]

Lost in Translation: Modern Neural Networks Still Struggle With Small Realistic Image Transformations

Authors: Ofir Shifman, Yair Weiss

Abstract: Deep neural networks that achieve remarkable performance in image classification have previously been shown to be easily fooled by tiny transformations such as a one pixel translation of the input image. In order to address this problem, two approaches have been proposed in recent years. The first approach suggests using huge datasets together with data augmentation in the hope that a highly varie… ▽ More Deep neural networks that achieve remarkable performance in image classification have previously been shown to be easily fooled by tiny transformations such as a one pixel translation of the input image. In order to address this problem, two approaches have been proposed in recent years. The first approach suggests using huge datasets together with data augmentation in the hope that a highly varied training set will teach the network to learn to be invariant. The second approach suggests using architectural modifications based on sampling theory to deal explicitly with image translations. In this paper, we show that these approaches still fall short in robustly handling 'natural' image translations that simulate a subtle change in camera orientation. Our findings reveal that a mere one-pixel translation can result in a significant change in the predicted image representation for approximately 40% of the test images in state-of-the-art models (e.g. open-CLIP trained on LAION-2B or DINO-v2) , while models that are explicitly constructed to be robust to cyclic translations can still be fooled with 1 pixel realistic (non-cyclic) translations 11% of the time. We present Robust Inference by Crop Selection: a simple method that can be proven to achieve any desired level of consistency, although with a modest tradeoff with the model's accuracy. Importantly, we demonstrate how employing this method reduces the ability to fool state-of-the-art models with a 1 pixel translation to less than 5% while suffering from only a 1% drop in classification accuracy. Additionally, we show that our method can be easy adjusted to deal with circular shifts as well. In such case we achieve 100% robustness to integer shifts with state-of-the-art accuracy, and with no need for any further training. △ Less

Submitted 10 April, 2024; originally announced April 2024.

Comments: 14 pages, 6 appendices, 17 figures

arXiv:2404.06313 [pdf, other]

On adversarial training and the 1 Nearest Neighbor classifier

Authors: Amir Hagai, Yair Weiss

Abstract: The ability to fool deep learning classifiers with tiny perturbations of the input has lead to the development of adversarial training in which the loss with respect to adversarial examples is minimized in addition to the training examples. While adversarial training improves the robustness of the learned classifiers, the procedure is computationally expensive, sensitive to hyperparameters and may… ▽ More The ability to fool deep learning classifiers with tiny perturbations of the input has lead to the development of adversarial training in which the loss with respect to adversarial examples is minimized in addition to the training examples. While adversarial training improves the robustness of the learned classifiers, the procedure is computationally expensive, sensitive to hyperparameters and may still leave the classifier vulnerable to other types of small perturbations. In this paper we analyze the adversarial robustness of the 1 Nearest Neighbor (1NN) classifier and compare its performance to adversarial training. We prove that under reasonable assumptions, the 1 NN classifier will be robust to {\em any} small image perturbation of the training images and will give high adversarial accuracy on test images as the number of training examples goes to infinity. In experiments with 45 different binary image classification problems taken from CIFAR10, we find that 1NN outperform TRADES (a powerful adversarial training algorithm) in terms of average adversarial accuracy. In additional experiments with 69 pretrained robust models for CIFAR10, we find that 1NN outperforms almost all of them in terms of robustness to perturbations that are only slightly different from those seen during training. Taken together, our results suggest that modern adversarial training methods still fall short of the robustness of the simple 1NN classifier. our code can be found at https://github.com/amirhagai/On-Adversarial-Training-And-The-1-Nearest-Neighbor-Classifier △ Less

Submitted 11 April, 2024; v1 submitted 9 April, 2024; originally announced April 2024.

arXiv:2403.11756 [pdf, other]

doi 10.1145/3613904.3642864

Just Undo It: Exploring Undo Mechanics in Multi-User Virtual Reality

Authors: Julian Rasch, Florian Perzl, Yannick Weiss, Florian Müller

Abstract: With the proliferation of VR and a metaverse on the horizon, many multi-user activities are migrating to the VR world, calling for effective collaboration support. As one key feature, traditional collaborative systems provide users with undo mechanics to reverse errors and other unwanted changes. While undo has been extensively researched in this domain and is now considered industry standard, it… ▽ More With the proliferation of VR and a metaverse on the horizon, many multi-user activities are migrating to the VR world, calling for effective collaboration support. As one key feature, traditional collaborative systems provide users with undo mechanics to reverse errors and other unwanted changes. While undo has been extensively researched in this domain and is now considered industry standard, it is strikingly absent for VR systems in research and industry. This work addresses this research gap by exploring different undo techniques for basic object manipulation in different collaboration modes in VR. We conducted a study involving 32 participants organized in teams of two. Here, we studied users' performance and preferences in a tower stacking task, varying the available undo techniques and their mode of collaboration. The results suggest that users desire and use undo in VR and that the choice of the undo technique impacts users' performance and social connection. △ Less

Submitted 18 March, 2024; originally announced March 2024.

Comments: To appear in Proceedings of the CHI Conference on Human Factors in Computing Systems (CHI '24), May 11-16, 2024, Honolulu, HI, USA

arXiv:2402.14098 [pdf, other]

Intriguing Properties of Modern GANs

Authors: Roy Friedman, Yair Weiss

Abstract: Modern GANs achieve remarkable performance in terms of generating realistic and diverse samples. This has led many to believe that ``GANs capture the training data manifold''. In this work we show that this interpretation is wrong. We empirically show that the manifold learned by modern GANs does not fit the training distribution: specifically the manifold does not pass through the training exampl… ▽ More Modern GANs achieve remarkable performance in terms of generating realistic and diverse samples. This has led many to believe that ``GANs capture the training data manifold''. In this work we show that this interpretation is wrong. We empirically show that the manifold learned by modern GANs does not fit the training distribution: specifically the manifold does not pass through the training examples and passes closer to out-of-distribution images than to in-distribution images. We also investigate the distribution over images implied by the prior over the latent codes and study whether modern GANs learn a density that approximates the training distribution. Surprisingly, we find that the learned density is very far from the data distribution and that GANs tend to assign higher density to out-of-distribution images. Finally, we demonstrate that the set of images used to train modern GANs are often not part of the typical set described by the GANs' distribution. △ Less

Submitted 21 February, 2024; originally announced February 2024.

arXiv:2308.15938 [pdf, other]

Provengo: A Tool Suite for Scenario Driven Model-Based Testing

Authors: Michael Bar-Sinai, Achiya Elyasaf, Gera Weiss, Yeshayahu Weiss

Abstract: We present Provengo, a comprehensive suite of tools designed to facilitate the implementation of Scenario-Driven Model-Based Testing (SDMBT), an innovative approach that utilizes scenarios to construct a model encompassing the user's perspective and the system's business value while also defining the desired outcomes. With the assistance of Provengo, testers gain the ability to effortlessly create… ▽ More We present Provengo, a comprehensive suite of tools designed to facilitate the implementation of Scenario-Driven Model-Based Testing (SDMBT), an innovative approach that utilizes scenarios to construct a model encompassing the user's perspective and the system's business value while also defining the desired outcomes. With the assistance of Provengo, testers gain the ability to effortlessly create natural user stories and seamlessly integrate them into a model capable of generating effective tests. The demonstration illustrates how SDMBT effectively addresses the bootstrap** challenge commonly encountered in model-based testing (MBT) by enabling incremental development, starting from simple models and gradually augmenting them with additional stories. △ Less

Submitted 30 August, 2023; originally announced August 2023.

Comments: 4 pages, 3 figures, 2 listing

arXiv:2206.02454 [pdf, other]

What do CNNs Learn in the First Layer and Why? A Linear Systems Perspective

Authors: Rhea Chowers, Yair Weiss

Abstract: It has previously been reported that the representation that is learned in the first layer of deep Convolutional Neural Networks (CNNs) is highly consistent across initializations and architectures. In this work, we quantify this consistency by considering the first layer as a filter bank and measuring its energy distribution. We find that the energy distribution is very different from that of the… ▽ More It has previously been reported that the representation that is learned in the first layer of deep Convolutional Neural Networks (CNNs) is highly consistent across initializations and architectures. In this work, we quantify this consistency by considering the first layer as a filter bank and measuring its energy distribution. We find that the energy distribution is very different from that of the initial weights and is remarkably consistent across random initializations, datasets, architectures and even when the CNNs are trained with random labels. In order to explain this consistency, we derive an analytical formula for the energy profile of linear CNNs and show that this profile is mostly dictated by the second order statistics of image patches in the training set and it will approach a whitening transformation when the number of iterations goes to infinity. Finally, we show that this formula for linear CNNs also gives an excellent fit for the energy profiles learned by commonly used nonlinear CNNs such as ResNet and VGG, and that the first layer of these CNNs indeed perform approximate whitening of their inputs. △ Less

Submitted 14 February, 2023; v1 submitted 6 June, 2022; originally announced June 2022.

arXiv:2203.11862 [pdf, other]

Generating natural images with direct Patch Distributions Matching

Authors: Ariel Elnekave, Yair Weiss

Abstract: Many traditional computer vision algorithms generate realistic images by requiring that each patch in the generated image be similar to a patch in a training image and vice versa. Recently, this classical approach has been replaced by adversarial training with a patch discriminator. The adversarial approach avoids the computational burden of finding nearest neighbors of patches but often requires… ▽ More Many traditional computer vision algorithms generate realistic images by requiring that each patch in the generated image be similar to a patch in a training image and vice versa. Recently, this classical approach has been replaced by adversarial training with a patch discriminator. The adversarial approach avoids the computational burden of finding nearest neighbors of patches but often requires very long training times and may fail to match the distribution of patches. In this paper we leverage the recently developed Sliced Wasserstein Distance and develop an algorithm that explicitly and efficiently minimizes the distance between patch distributions in two images. Our method is conceptually simple, requires no training and can be implemented in a few lines of codes. On a number of image generation tasks we show that our results are often superior to single-image-GANs, require no training, and can generate high quality images in a few seconds. Our implementation is available at https://github.com/ariel415el/GPDM △ Less

Submitted 4 September, 2022; v1 submitted 22 March, 2022; originally announced March 2022.

Comments: Corrected typos In text and figures (Thanks to Ronen Schaffer)

arXiv:2201.00522 [pdf, other]

doi 10.1109/TSE.2023.3279570

Generalized Coverage Criteria for Combinatorial Sequence Testing

Authors: Achiya Elyasaf, Eitan Farchi, Oded Margalit, Gera Weiss, Yeshayahu Weiss

Abstract: We present a new model-based approach for testing systems that use sequences of actions and assertions as test vectors. Our solution includes a method for quantifying testing quality, a tool for generating high-quality test suites based on the coverage criteria we propose, and a framework for assessing risks. For testing quality, we propose a method that specifies generalized coverage criteria ove… ▽ More We present a new model-based approach for testing systems that use sequences of actions and assertions as test vectors. Our solution includes a method for quantifying testing quality, a tool for generating high-quality test suites based on the coverage criteria we propose, and a framework for assessing risks. For testing quality, we propose a method that specifies generalized coverage criteria over sequences of actions, which extends previous approaches. Our publicly available tool demonstrates how to extract effective test suites from test plans based on these criteria. We also present a Bayesian approach for measuring the probabilities of bugs or risks, and show how this quantification can help achieve an informed balance between exploitation and exploration in testing. Finally, we provide an empirical evaluation demonstrating the effectiveness of our tool in finding bugs, assessing risks, and achieving coverage. △ Less

Submitted 31 October, 2023; v1 submitted 3 January, 2022; originally announced January 2022.

Comments: 12 pages, 5 tables, 5 figures, and 2 listing

Journal ref: in IEEE Transactions on Software Engineering, vol. 49, no. 8, pp. 4023-4034, 24 May 2023

arXiv:2112.01538 [pdf, other]

Testing Reactive Systems Using Behavioural Programming, a Model Centric Approach

Authors: Yeshayahu Weiss

Abstract: Testing is a significant aspect of software development. As systems become complex and their use becomes critical to the security and the function of society, the need for testing methodologies that ensure reliability and detect faults as early as possible becomes critical. The most promising approach is the model-based approach where a model is developed that defines how the system is expected to… ▽ More Testing is a significant aspect of software development. As systems become complex and their use becomes critical to the security and the function of society, the need for testing methodologies that ensure reliability and detect faults as early as possible becomes critical. The most promising approach is the model-based approach where a model is developed that defines how the system is expected to behave and how it is meant to react. The tests are derived from the model and an analysis of the test results is conducted based on it. We will investigate the prospects of using the Behavioral Programming (BP) for a model-based testing (MBT) approach that we will develop. We will develop a natural language for representing the requirements. The model will be fed to algorithms that we will develop. This includes algorithms for the automatic creation of minimal sets of test cases that cover all of the system's requirements, analysing the results of the tests, and other tools that support the testing process. The focus of our methodology will be to find faults caused by the interaction between different requirements in ways that are difficult for the testers to detect. Specifically, we will focus our attention to concurrency issues such as deadlocks and logical race condition. We will use a variety of methods that are made possible by BP, such as non-deterministic execution of scenarios and use of in-code model-checking for building test scenarios and for finding minimal coverage of the test scenarios for the system requirements using Combinatorial Test Design (CTD) methodologies. We will develop a proof-of-concept tool kit which will allow us to demonstrate and evaluate the above mentioned capabilities. We will compare the performance of our tools with the performance of manual testers and of other model-based tools using comparison criteria that we will define and develop. △ Less

Submitted 7 December, 2021; v1 submitted 2 December, 2021; originally announced December 2021.

Comments: 31 pages, 7 figures

arXiv:2104.09895 [pdf, other]

Posterior Sampling for Image Restoration using Explicit Patch Priors

Authors: Roy Friedman, Yair Weiss

Abstract: Almost all existing methods for image restoration are based on optimizing the mean squared error (MSE), even though it is known that the best estimate in terms of MSE may yield a highly atypical image due to the fact that there are many plausible restorations for a given noisy image. In this paper, we show how to combine explicit priors on patches of natural images in order to sample from the post… ▽ More Almost all existing methods for image restoration are based on optimizing the mean squared error (MSE), even though it is known that the best estimate in terms of MSE may yield a highly atypical image due to the fact that there are many plausible restorations for a given noisy image. In this paper, we show how to combine explicit priors on patches of natural images in order to sample from the posterior probability of a full image given a degraded image. We prove that our algorithm generates correct samples from the distribution $p(x|y) \propto \exp(-E(x|y))$ where $E(x|y)$ is the cost function minimized in previous patch-based approaches that compute a single restoration. Unlike previous approaches that computed a single restoration using MAP or MMSE, our method makes explicit the uncertainty in the restored images and guarantees that all patches in the restored images will be typical given the patch prior. Unlike previous approaches that used implicit priors on fixed-size images, our approach can be used with images of any size. Our experimental results show that posterior sampling using patch priors yields images of high perceptual quality and high PSNR on a range of challenging image restoration problems. △ Less

Submitted 20 April, 2021; originally announced April 2021.

arXiv:2010.10148 [pdf, other]

Don't Drone Yourself in Work: Discussing DronOS as a Framework for Human-Drone Interaction

Authors: Matthias Hoppe, Yannick Weiß, Marinus Burger, Thomas Kosch

Abstract: More and more off-the-shelf drones provide frameworks that enable the programming of flight paths. These frameworks provide vendor-dependent programming and communication interfaces that are intended for flight path definitions. However, they are often limited to outdoor and GPS-based use only. A key disadvantage of such a solution is that they are complicated to use and require readjustments when… ▽ More More and more off-the-shelf drones provide frameworks that enable the programming of flight paths. These frameworks provide vendor-dependent programming and communication interfaces that are intended for flight path definitions. However, they are often limited to outdoor and GPS-based use only. A key disadvantage of such a solution is that they are complicated to use and require readjustments when changing the drone model. This is time-consuming since it requires redefining the flight path for the new framework. This workshop paper proposes additional features for DronOS, a community-driven framework that enables model-independent automatisation and programming of drones. We enhanced DronOS to include additional functions to account for the specific design constraints in human-drone-interaction. This paper provides a starting point for discussing the requirements involved in designing a drone system with other researchers within the human-drone interaction community. We envision DronOS as a community-driven framework that can be applied to generic drone models, hence enabling the automatisation for any commercially available drone. Our goal is to build DronOS as a software tool that can be easily used by researchers and practitioners to prototype novel drone-based systems. △ Less

Submitted 20 October, 2020; originally announced October 2020.

arXiv:2007.12568 [pdf, other]

The Surprising Effectiveness of Linear Unsupervised Image-to-Image Translation

Authors: Eitan Richardson, Yair Weiss

Abstract: Unsupervised image-to-image translation is an inherently ill-posed problem. Recent methods based on deep encoder-decoder architectures have shown impressive results, but we show that they only succeed due to a strong locality bias, and they fail to learn very simple nonlocal transformations (e.g. map** upside down faces to upright faces). When the locality bias is removed, the methods are too po… ▽ More Unsupervised image-to-image translation is an inherently ill-posed problem. Recent methods based on deep encoder-decoder architectures have shown impressive results, but we show that they only succeed due to a strong locality bias, and they fail to learn very simple nonlocal transformations (e.g. map** upside down faces to upright faces). When the locality bias is removed, the methods are too powerful and may fail to learn simple local transformations. In this paper we introduce linear encoder-decoder architectures for unsupervised image to image translation. We show that learning is much easier and faster with these architectures and yet the results are surprisingly effective. In particular, we show a number of local problems for which the results of the linear methods are comparable to those of state-of-the-art architectures but with a fraction of the training time, and a number of nonlocal problems for which the state-of-the-art fails while linear methods succeed. △ Less

Submitted 24 July, 2020; originally announced July 2020.

Comments: Preprint - under review

arXiv:2002.08859 [pdf, other]

A Bayes-Optimal View on Adversarial Examples

Authors: Eitan Richardson, Yair Weiss

Abstract: Since the discovery of adversarial examples - the ability to fool modern CNN classifiers with tiny perturbations of the input, there has been much discussion whether they are a "bug" that is specific to current neural architectures and training methods or an inevitable "feature" of high dimensional geometry. In this paper, we argue for examining adversarial examples from the perspective of Bayes-O… ▽ More Since the discovery of adversarial examples - the ability to fool modern CNN classifiers with tiny perturbations of the input, there has been much discussion whether they are a "bug" that is specific to current neural architectures and training methods or an inevitable "feature" of high dimensional geometry. In this paper, we argue for examining adversarial examples from the perspective of Bayes-Optimal classification. We construct realistic image datasets for which the Bayes-Optimal classifier can be efficiently computed and derive analytic conditions on the distributions under which these classifiers are provably robust against any adversarial attack even in high dimensions. Our results show that even when these "gold standard" optimal classifiers are robust, CNNs trained on the same datasets consistently learn a vulnerable classifier, indicating that adversarial examples are often an avoidable "bug". We further show that RBF SVMs trained on the same data consistently learn a robust classifier. The same trend is observed in experiments with real images in different datasets. △ Less

Submitted 17 March, 2021; v1 submitted 20 February, 2020; originally announced February 2020.

Comments: Minor revision per journal review, 28 pages

arXiv:1805.12462 [pdf, other]

On GANs and GMMs

Authors: Eitan Richardson, Yair Weiss

Abstract: A longstanding problem in machine learning is to find unsupervised methods that can learn the statistical structure of high dimensional signals. In recent years, GANs have gained much attention as a possible solution to the problem, and in particular have shown the ability to generate remarkably realistic high resolution sampled images. At the same time, many authors have pointed out that GANs may… ▽ More A longstanding problem in machine learning is to find unsupervised methods that can learn the statistical structure of high dimensional signals. In recent years, GANs have gained much attention as a possible solution to the problem, and in particular have shown the ability to generate remarkably realistic high resolution sampled images. At the same time, many authors have pointed out that GANs may fail to model the full distribution ("mode collapse") and that using the learned models for anything other than generating samples may be very difficult. In this paper, we examine the utility of GANs in learning statistical models of images by comparing them to perhaps the simplest statistical model, the Gaussian Mixture Model. First, we present a simple method to evaluate generative models based on relative proportions of samples that fall into predetermined bins. Unlike previous automatic methods for evaluating models, our method does not rely on an additional neural network nor does it require approximating intractable computations. Second, we compare the performance of GANs to GMMs trained on the same datasets. While GMMs have previously been shown to be successful in modeling small patches of images, we show how to train them on full sized images despite the high dimensionality. Our results show that GMMs can generate realistic samples (although less sharp than those of GANs) but also capture the full distribution, which GANs fail to do. Furthermore, GMMs allow efficient inference and explicit representation of the underlying statistical structure. Finally, we discuss how GMMs can be used to generate sharp images. △ Less

Submitted 3 November, 2018; v1 submitted 31 May, 2018; originally announced May 2018.

Comments: Accepted to NIPS 2018

arXiv:1805.12177 [pdf, other]

Why do deep convolutional networks generalize so poorly to small image transformations?

Authors: Aharon Azulay, Yair Weiss

Abstract: Convolutional Neural Networks (CNNs) are commonly assumed to be invariant to small image transformations: either because of the convolutional architecture or because they were trained using data augmentation. Recently, several authors have shown that this is not the case: small translations or rescalings of the input image can drastically change the network's prediction. In this paper, we quantify… ▽ More Convolutional Neural Networks (CNNs) are commonly assumed to be invariant to small image transformations: either because of the convolutional architecture or because they were trained using data augmentation. Recently, several authors have shown that this is not the case: small translations or rescalings of the input image can drastically change the network's prediction. In this paper, we quantify this phenomena and ask why neither the convolutional architecture nor data augmentation are sufficient to achieve the desired invariance. Specifically, we show that the convolutional architecture does not give invariance since architectures ignore the classical sampling theorem, and data augmentation does not give invariance because the CNNs learn to be invariant to transformations only for images that are very similar to typical images from the training set. We discuss two possible solutions to this problem: (1) antialiasing the intermediate representations and (2) increasing data augmentation and show that they provide only a partial solution at best. Taken together, our results indicate that the problem of insuring invariance to small image transformations in neural networks while preserving high accuracy remains unsolved. △ Less

Submitted 31 December, 2019; v1 submitted 30 May, 2018; originally announced May 2018.

Journal ref: JMLR 20(184) 1-25 2019

arXiv:1702.05958 [pdf, other]

Reflection Separation Using Guided Annotation

Authors: Ofer Springer, Yair Weiss

Abstract: Photographs taken through a glass surface often contain an approximately linear superposition of reflected and transmitted layers. Decomposing an image into these layers is generally an ill-posed task and the use of an additional image prior and user provided cues is presently necessary in order to obtain good results. Current annotation approaches rely on a strong sparsity assumption. For images… ▽ More Photographs taken through a glass surface often contain an approximately linear superposition of reflected and transmitted layers. Decomposing an image into these layers is generally an ill-posed task and the use of an additional image prior and user provided cues is presently necessary in order to obtain good results. Current annotation approaches rely on a strong sparsity assumption. For images with significant texture this assumption does not typically hold, thus rendering the annotation process unviable. In this paper we show that using a Gaussian Mixture Model patch prior, the correct local decomposition can almost always be found as one of 100 likely modes of the posterior. Thus, the user need only choose one of these modes in a sparse set of patches and the decomposition may then be completed automatically. We demonstrate the performance of our method using synthesized and real reflection images. △ Less

Submitted 10 May, 2017; v1 submitted 20 February, 2017; originally announced February 2017.

Comments: To be presented at ICIP 2017, 6 pages, 7 figures

arXiv:1608.05275 [pdf, other]

A Tight Convex Upper Bound on the Likelihood of a Finite Mixture

Authors: Elad Mezuman, Yair Weiss

Abstract: The likelihood function of a finite mixture model is a non-convex function with multiple local maxima and commonly used iterative algorithms such as EM will converge to different solutions depending on initial conditions. In this paper we ask: is it possible to assess how far we are from the global maximum of the likelihood? Since the likelihood of a finite mixture model can grow unboundedly by ce… ▽ More The likelihood function of a finite mixture model is a non-convex function with multiple local maxima and commonly used iterative algorithms such as EM will converge to different solutions depending on initial conditions. In this paper we ask: is it possible to assess how far we are from the global maximum of the likelihood? Since the likelihood of a finite mixture model can grow unboundedly by centering a Gaussian on a single datapoint and shrinking the covariance, we constrain the problem by assuming that the parameters of the individual models are members of a large discrete set (e.g. estimating a mixture of two Gaussians where the means and variances of both Gaussians are members of a set of a million possible means and variances). For this setting we show that a simple upper bound on the likelihood can be computed using convex optimization and we analyze conditions under which the bound is guaranteed to be tight. This bound can then be used to assess the quality of solutions found by EM (where the final result is projected on the discrete set) or any other mixture estimation algorithm. For any dataset our method allows us to find a finite mixture model together with a dataset-specific bound on how far the likelihood of this mixture is from the global optimum of the likelihood △ Less

Submitted 18 August, 2016; originally announced August 2016.

Comments: icpr 2016

arXiv:1604.02902 [pdf, other]

Statistics of RGBD Images

Authors: Dan Rosenbaum, Yair Weiss

Abstract: Cameras that can measure the depth of each pixel in addition to its color have become easily available and are used in many consumer products worldwide. Often the depth channel is captured at lower quality compared to the RGB channels and different algorithms have been proposed to improve the quality of the D channel given the RGB channels. Typically these approaches work by assuming that edges in… ▽ More Cameras that can measure the depth of each pixel in addition to its color have become easily available and are used in many consumer products worldwide. Often the depth channel is captured at lower quality compared to the RGB channels and different algorithms have been proposed to improve the quality of the D channel given the RGB channels. Typically these approaches work by assuming that edges in RGB are correlated with edges in D. In this paper we approach this problem from the standpoint of natural image statistics. We obtain examples of high quality RGBD images from a computer graphics generated movie (MPI-Sintel) and we use these examples to compare different probabilistic generative models of RGBD image patches. We then use the generative models together with a degradation model and obtain a Bayes Least Squares (BLS) estimator of the D channel given the RGB channels. Our results show that learned generative models outperform the state-of-the-art in improving the quality of depth channels given the color channels in natural images even when training is performed on artificially generated images. △ Less

Submitted 11 April, 2016; originally announced April 2016.

arXiv:1604.02815 [pdf, other]

Beyond Brightness Constancy: Learning Noise Models for Optical Flow

Authors: Dan Rosenbaum, Yair Weiss

Abstract: Optical flow is typically estimated by minimizing a "data cost" and an optional regularizer. While there has been much work on different regularizers many modern algorithms still use a data cost that is not very different from the ones used over 30 years ago: a robust version of brightness constancy or gradient constancy. In this paper we leverage the recent availability of ground-truth optical fl… ▽ More Optical flow is typically estimated by minimizing a "data cost" and an optional regularizer. While there has been much work on different regularizers many modern algorithms still use a data cost that is not very different from the ones used over 30 years ago: a robust version of brightness constancy or gradient constancy. In this paper we leverage the recent availability of ground-truth optical flow databases in order to learn a data cost. Specifically we take a generative approach in which the data cost models the distribution of noise after war** an image according to the flow and we measure the "goodness" of a data cost by how well it matches the true distribution of flow warp error. Consistent with current practice, we find that robust versions of gradient constancy are better models than simple brightness constancy but a learned GMM that models the density of patches of warp error gives a much better fit than any existing assumption of constancy. This significant advantage of the GMM is due to an explicit modeling of the spatial structure of warp errors, a feature which is missing from almost all existing data costs in optical flow. Finally, we show how a good density model of warp error patches can be used for optical flow estimation on whole images. We replace the data cost by the expected patch log-likelihood (EPLL), and show how this cost can be optimized iteratively using an additional step of denoising the warp error image. The results of our experiments are promising and show that patch models with higher likelihood lead to better optical flow estimation. △ Less

Submitted 11 April, 2016; originally announced April 2016.

arXiv:1309.6848 [pdf]

Tighter Linear Program Relaxations for High Order Graphical Models

Authors: Elad Mezuman, Daniel Tarlow, Amir Globerson, Yair Weiss

Abstract: Graphical models with High Order Potentials (HOPs) have received considerable interest in recent years. While there are a variety of approaches to inference in these models, nearly all of them amount to solving a linear program (LP) relaxation with unary consistency constraints between the HOP and the individual variables. In many cases, the resulting relaxations are loose, and in these cases the… ▽ More Graphical models with High Order Potentials (HOPs) have received considerable interest in recent years. While there are a variety of approaches to inference in these models, nearly all of them amount to solving a linear program (LP) relaxation with unary consistency constraints between the HOP and the individual variables. In many cases, the resulting relaxations are loose, and in these cases the results of inference can be poor. It is thus desirable to look for more accurate ways of performing inference in these models. In this work, we study the LP relaxations that result from enforcing additional consistency constraints between the HOP and the rest of the model. We address theoretical questions about the strength of the resulting relaxations compared to the relaxations that arise in standard approaches, and we develop practical and efficient message passing algorithms for optimizing the LPs. Empirically, we show that the LPs with additional consistency constraints lead to more accurate inference on some challenging problems that include a combination of low order and high order terms. △ Less

Submitted 26 September, 2013; originally announced September 2013.

Comments: Appears in Proceedings of the Twenty-Ninth Conference on Uncertainty in Artificial Intelligence (UAI2013)

Report number: UAI-P-2013-PG-421-430

arXiv:1301.6725 [pdf]

Loopy Belief Propagation for Approximate Inference: An Empirical Study

Authors: Kevin Murphy, Yair Weiss, Michael I. Jordan

Abstract: Recently, researchers have demonstrated that loopy belief propagation - the use of Pearls polytree algorithm IN a Bayesian network WITH loops OF error- correcting codes.The most dramatic instance OF this IS the near Shannon - limit performance OF Turbo Codes codes whose decoding algorithm IS equivalent TO loopy belief propagation IN a chain - structured Bayesian network. IN this paper we ask : IS… ▽ More Recently, researchers have demonstrated that loopy belief propagation - the use of Pearls polytree algorithm IN a Bayesian network WITH loops OF error- correcting codes.The most dramatic instance OF this IS the near Shannon - limit performance OF Turbo Codes codes whose decoding algorithm IS equivalent TO loopy belief propagation IN a chain - structured Bayesian network. IN this paper we ask : IS there something special about the error - correcting code context, OR does loopy propagation WORK AS an approximate inference schemeIN a more general setting? We compare the marginals computed using loopy propagation TO the exact ones IN four Bayesian network architectures, including two real - world networks : ALARM AND QMR.We find that the loopy beliefs often converge AND WHEN they do, they give a good approximation TO the correct marginals.However,ON the QMR network, the loopy beliefs oscillated AND had no obvious relationship TO the correct posteriors. We present SOME initial investigations INTO the cause OF these oscillations, AND show that SOME simple methods OF preventing them lead TO the wrong results. △ Less

Submitted 23 January, 2013; originally announced January 2013.

Comments: Appears in Proceedings of the Fifteenth Conference on Uncertainty in Artificial Intelligence (UAI1999)

Report number: UAI-P-1999-PG-467-476

arXiv:1301.2296 [pdf]

The Factored Frontier Algorithm for Approximate Inference in DBNs

Authors: Kevin Murphy, Yair Weiss

Abstract: The Factored Frontier (FF) algorithm is a simple approximate inferencealgorithm for Dynamic Bayesian Networks (DBNs). It is very similar tothe fully factorized version of the Boyen-Koller (BK) algorithm, butinstead of doing an exact update at every step followed bymarginalisation (projection), it always works with factoreddistributions. Hence it can be applied to models for which the exactupdate… ▽ More The Factored Frontier (FF) algorithm is a simple approximate inferencealgorithm for Dynamic Bayesian Networks (DBNs). It is very similar tothe fully factorized version of the Boyen-Koller (BK) algorithm, butinstead of doing an exact update at every step followed bymarginalisation (projection), it always works with factoreddistributions. Hence it can be applied to models for which the exactupdate step is intractable. We show that FF is equivalent to (oneiteration of) loopy belief propagation (LBP) on the original DBN, andthat BK is equivalent (to one iteration of) LBP on a DBN where wecluster some of the nodes. We then show empirically that byiterating, LBP can improve on the accuracy of both FF and BK. Wecompare these algorithms on two real-world DBNs: the first is a modelof a water treatment plant, and the second is a coupled HMM, used tomodel freeway traffic. △ Less

Submitted 10 January, 2013; originally announced January 2013.

Comments: Appears in Proceedings of the Seventeenth Conference on Uncertainty in Artificial Intelligence (UAI2001)

Report number: UAI-P-2001-PG-378-385

arXiv:1206.5286 [pdf]

MAP Estimation, Linear Programming and Belief Propagation with Convex Free Energies

Authors: Yair Weiss, Chen Yanover, Talya Meltzer

Abstract: Finding the most probable assignment (MAP) in a general graphical model is known to be NP hard but good approximations have been attained with max-product belief propagation (BP) and its variants. In particular, it is known that using BP on a single-cycle graph or tree reweighted BP on an arbitrary graph will give the MAP solution if the beliefs have no ties. In this paper we extend the setting un… ▽ More Finding the most probable assignment (MAP) in a general graphical model is known to be NP hard but good approximations have been attained with max-product belief propagation (BP) and its variants. In particular, it is known that using BP on a single-cycle graph or tree reweighted BP on an arbitrary graph will give the MAP solution if the beliefs have no ties. In this paper we extend the setting under which BP can be used to provably extract the MAP. We define Convex BP as BP algorithms based on a convex free energy approximation and show that this class includes ordinary BP with single-cycle, tree reweighted BP and many other BP variants. We show that when there are no ties, fixed-points of convex max-product BP will provably give the MAP solution. We also show that convex sum-product BP at sufficiently small temperatures can be used to solve linear programs that arise from relaxing the MAP problem. Finally, we derive a novel condition that allows us to derive the MAP solution even if some of the convex BP beliefs have ties. In experiments, we show that our theorems allow us to find the MAP in many real-world instances of graphical models where exact inference using junction-tree is impossible. △ Less

Submitted 20 June, 2012; originally announced June 2012.

Comments: Appears in Proceedings of the Twenty-Third Conference on Uncertainty in Artificial Intelligence (UAI2007)

Report number: UAI-P-2007-PG-416-425

arXiv:1206.3288 [pdf]

Tightening LP Relaxations for MAP using Message Passing

Authors: David Sontag, Talya Meltzer, Amir Globerson, Tommi S. Jaakkola, Yair Weiss

Abstract: Linear Programming (LP) relaxations have become powerful tools for finding the most probable (MAP) configuration in graphical models. These relaxations can be solved efficiently using message-passing algorithms such as belief propagation and, when the relaxation is tight, provably find the MAP configuration. The standard LP relaxation is not tight enough in many real-world problems, however, and t… ▽ More Linear Programming (LP) relaxations have become powerful tools for finding the most probable (MAP) configuration in graphical models. These relaxations can be solved efficiently using message-passing algorithms such as belief propagation and, when the relaxation is tight, provably find the MAP configuration. The standard LP relaxation is not tight enough in many real-world problems, however, and this has lead to the use of higher order cluster-based LP relaxations. The computational cost increases exponentially with the size of the clusters and limits the number and type of clusters we can use. We propose to solve the cluster selection problem monotonically in the dual LP, iteratively selecting clusters with guaranteed improvement, and quickly re-solving with the added clusters by reusing the existing solution. Our dual message-passing algorithm finds the MAP configuration in protein sidechain placement, protein design, and stereo problems, in cases where the standard LP relaxation fails. △ Less

Submitted 13 June, 2012; originally announced June 2012.

Comments: Appears in Proceedings of the Twenty-Fourth Conference on Uncertainty in Artificial Intelligence (UAI2008)

Report number: UAI-P-2008-PG-503-510

arXiv:1206.3254 [pdf]

Latent Topic Models for Hypertext

Authors: Amit Gruber, Michal Rosen-Zvi, Yair Weiss

Abstract: Latent topic models have been successfully applied as an unsupervised topic discovery technique in large document collections. With the proliferation of hypertext document collection such as the Internet, there has also been great interest in extending these approaches to hypertext [6, 9]. These approaches typically model links in an analogous fashion to how they model words - the document-link co… ▽ More Latent topic models have been successfully applied as an unsupervised topic discovery technique in large document collections. With the proliferation of hypertext document collection such as the Internet, there has also been great interest in extending these approaches to hypertext [6, 9]. These approaches typically model links in an analogous fashion to how they model words - the document-link co-occurrence matrix is modeled in the same way that the document-word co-occurrence matrix is modeled in standard topic models. In this paper we present a probabilistic generative model for hypertext document collections that explicitly models the generation of links. Specifically, links from a word w to a document d depend directly on how frequent the topic of w is in d, in addition to the in-degree of d. We show how to perform EM learning on this model efficiently. By not modeling links as analogous to words, we end up using far fewer free parameters and obtain better link prediction results. △ Less

Submitted 13 June, 2012; originally announced June 2012.

Comments: Appears in Proceedings of the Twenty-Fourth Conference on Uncertainty in Artificial Intelligence (UAI2008)

Report number: UAI-P-2008-PG-230-239

arXiv:1205.2625 [pdf]

Convergent message passing algorithms - a unifying view

Authors: Talya Meltzer, Amir Globerson, Yair Weiss

Abstract: Message-passing algorithms have emerged as powerful techniques for approximate inference in graphical models. When these algorithms converge, they can be shown to find local (or sometimes even global) optima of variational formulations to the inference problem. But many of the most popular algorithms are not guaranteed to converge. This has lead to recent interest in convergent message-passing alg… ▽ More Message-passing algorithms have emerged as powerful techniques for approximate inference in graphical models. When these algorithms converge, they can be shown to find local (or sometimes even global) optima of variational formulations to the inference problem. But many of the most popular algorithms are not guaranteed to converge. This has lead to recent interest in convergent message-passing algorithms. In this paper, we present a unified view of convergent message-passing algorithms. We present a simple derivation of an abstract algorithm, tree-consistency bound optimization (TCBO) that is provably convergent in both its sum and max product forms. We then show that many of the existing convergent algorithms are instances of our TCBO algorithm, and obtain novel convergent algorithms "for free" by exchanging maximizations and summations in existing algorithms. In particular, we show that Wainwright's non-convergent sum-product algorithm for tree based variational bounds, is actually convergent with the right update order for the case where trees are monotonic chains. △ Less

Submitted 9 May, 2012; originally announced May 2012.

Comments: Appears in Proceedings of the Twenty-Fifth Conference on Uncertainty in Artificial Intelligence (UAI2009)

Report number: UAI-P-2009-PG-393-401

arXiv:0901.4275 [pdf, ps, other]

Informative Sensing

Authors: Hyun Sung Chang, Yair Weiss, William T. Freeman

Abstract: Compressed sensing is a recent set of mathematical results showing that sparse signals can be exactly reconstructed from a small number of linear measurements. Interestingly, for ideal sparse signals with no measurement noise, random measurements allow perfect reconstruction while measurements based on principal component analysis (PCA) or independent component analysis (ICA) do not. At the same… ▽ More Compressed sensing is a recent set of mathematical results showing that sparse signals can be exactly reconstructed from a small number of linear measurements. Interestingly, for ideal sparse signals with no measurement noise, random measurements allow perfect reconstruction while measurements based on principal component analysis (PCA) or independent component analysis (ICA) do not. At the same time, for other signal and noise distributions, PCA and ICA can significantly outperform random projections in terms of enabling reconstruction from a small number of measurements. In this paper we ask: given the distribution of signals we wish to measure, what are the optimal set of linear projections for compressed sensing? We consider the problem of finding a small number of linear projections that are maximally informative about the signal. Formally, we use the InfoMax criterion and seek to maximize the mutual information between the signal, x, and the (possibly noisy) projection y=Wx. We show that in general the optimal projections are not the principal components of the data nor random projections, but rather a seemingly novel set of projections that capture what is still uncertain about the signal, given the knowledge of distribution. We present analytic solutions for certain special cases including natural images. In particular, for natural images, the near-optimal projections are bandwise random, i.e., incoherent to the sparse bases at a particular frequency band but with more weights on the low-frequencies, which has a physical relation to the multi-resolution representation of images. △ Less

Submitted 27 January, 2009; originally announced January 2009.

Comments: 26 pages; submitted to IEEE Transactions on Information Theory

Showing 1–28 of 28 results for author: Weiss, Y