Search | arXiv e-print repository

arXiv:2310.01904 [pdf, other]

Beyond the Benchmark: Detecting Diverse Anomalies in Videos

Abstract: Video Anomaly Detection (VAD) plays a crucial role in modern surveillance systems, aiming to identify various anomalies in real-world situations. However, current benchmark datasets predominantly emphasize simple, single-frame anomalies such as novel object detection. This narrow focus restricts the advancement of VAD models. In this research, we advocate for an expansion of VAD investigations to encompass intricate anomalies that extend beyond conventional benchmark boundaries. To facilitate this, we introduce two datasets, HMDB-AD and HMDB-Violence, to challenge models with diverse action-based anomalies. These datasets are derived from the HMDB51 action recognition dataset. We further present Multi-Frame Anomaly Detection (MFAD), a novel method built upon the AI-VAD framework. AI-VAD utilizes single-frame features such as pose estimation and deep image encoding, and two-frame features such as object velocity. It then applies a density estimation algorithm to compute anomaly scores. To address complex multi-frame anomalies, we add deep video encoding features that capture long-range temporal dependencies, and logistic regression to enhance the final score calculation. Experimental results confirm our assumptions, highlighting existing models' limitations with new anomaly types. MFAD excels in both simple and complex anomaly detection scenarios.

Submitted 3 October, 2023; originally announced October 2023.

arXiv:2303.02698 [pdf, other]

Robust affine point matching via quadratic assignment on Grassmannians

Authors: Alexander Kolpakov, Michael Werman

Abstract: Robust Affine matching with Grassmannians (RAG) is a new algorithm to perform affine registration of point clouds. The algorithm is based on minimizing the Frobenius distance between two elements of the Grassmannian. For this purpose, an indefinite relaxation of the Quadratic Assignment Problem (QAP) is used, and several approaches to affine feature matching are studied and compared. Experiments demonstrate that RAG is more robust to noise and point discrepancy than previous methods.

Submitted 4 May, 2024; v1 submitted 5 March, 2023; originally announced March 2023.

Comments: 8 pages, 23 figures; GitHub repository at (https://github.com/sashakolpakov/rag); Section IV: added comparison to GrassGraph (https://doi.org/10.1109/TIP.2019.2959722); notably, GrassGraph quickly loses accuracy on our test examples with noise and occlusion

arXiv:2212.05332 [pdf, other]

doi 10.1109/TPAMI.2023.3287468

An approach to robust ICP initialization

Authors: Alexander Kolpakov, Michael Werman

Abstract: In this note, we propose an approach to initialize the Iterative Closest Point (ICP) algorithm to match unlabelled point clouds related by rigid transformations. The method is based on matching the ellipsoids defined by the points' covariance matrices and then testing the various principal half-axes matchings that differ by elements of a finite reflection group. We derive bounds on the robustness of our approach to noise and numerical experiments confirm our theoretical findings.

Submitted 25 June, 2023; v1 submitted 10 December, 2022; originally announced December 2022.

Comments: 9 pages, 18 figures, 1 table; GitHub repository at (https://github.com/sashakolpakov/icp-init)

arXiv:2207.01127 [pdf, other]

DecisioNet: A Binary-Tree Structured Neural Network

Authors: Noam Gottlieb, Michael Werman

Abstract: Deep neural networks (DNNs) and decision trees (DTs) are both state-of-the-art classifiers. DNNs perform well due to their representational learning capabilities, while DTs are computationally efficient as they perform inference along one route (root-to-leaf) that is dependent on the input data. In this paper, we present DecisioNet (DN), a binary-tree structured neural network. We propose a systematic way to convert an existing DNN into a DN to create a lightweight version of the original model. DecisioNet takes the best of both worlds - it uses neural modules to perform representational learning and utilizes its tree structure to perform only a portion of the computations. We evaluate various DN architectures, along with their corresponding baseline models on the FashionMNIST, CIFAR10, and CIFAR100 datasets. We show that the DN variants achieve similar accuracy while significantly reducing the computational cost of the original network.

Submitted 19 November, 2022; v1 submitted 3 July, 2022; originally announced July 2022.

Comments: The paper has been accepted to the ACCV2022 conference. A short summary video about the paper can be found at https://whova.com/portal/webapp/hybri1_202112/Artifact/70297

arXiv:2203.10670 [pdf, other]

Fully Convolutional Fractional Scaling

Authors: Michael Soloveitchik, Michael Werman

Abstract: We introduce a fully convolutional fractional scaling component, FCFS. Fully convolutional networks can be applied to any size input and previously did not support non-integer scaling. Our architecture is simple with an efficient single layer implementation. Examples and code implementations of three common scaling methods are published.

Submitted 20 March, 2022; originally announced March 2022.

arXiv:2112.10600 [pdf, other]

DeePaste -- Inpainting for Pasting

Authors: Levi Kassel, Michael Werman

Abstract: One of the challenges of supervised learning training is the need to procure a substantial amount of tagged data. A well-known method of solving this problem is to use synthetic data in a copy-paste fashion, so that we cut objects and paste them onto relevant backgrounds. Pasting the objects naively results in artifacts that cause models to give poor results on real data. We present a new method for cleanly pasting objects on different backgrounds so that the dataset created gives competitive performance on real data. The main emphasis is on the treatment of the border of the pasted object using inpainting. We show state-of-the-art results both on instance detection and foreground segmentation.

Submitted 26 December, 2021; v1 submitted 20 December, 2021; originally announced December 2021.

arXiv:2103.12980 [pdf, ps, other]

On a realization of motion and similarity group equivalence classes of labeled points in $\mathbb R^k$ with applications to computer vision

Authors: Steven B. Damelin, David L. Ragozin, Michael Werman

Abstract: We study a realization of motion and similarity group equivalence classes of $n\geq 1$ labeled points in $\mathbb R^k,\, k\geq 1$ as a metric space with a computable metric. Our study is motivated by applications in computer vision.

Submitted 24 March, 2021; originally announced March 2021.

MSC Class: 70E15; 15A16; 14B16; 68T45; 49K35; 49N15

arXiv:2011.07954 [pdf, other]

Using a Supervised Method without supervision for foreground segmentation

Authors: Levi Kassel, Michael Werman

Abstract: Neural networks are a powerful framework for foreground segmentation in video acquired by static cameras, segmenting moving objects from the background in a robust way in various challenging scenarios. The premier methods are those based on supervision requiring a final training stage on a database of tens to hundreds of manually segmented images from the specific static camera. In this work, we propose a method to automatically create an "artificial" database that is sufficient for training the supervised methods so that they perform better than current unsupervised methods. It uses a foreground segmenter that is weak compared to the supervised method to extract suitable objects from the training images, and randomly inserts these objects back into a background image. Test results are shown on the test sequences in CDnet.

Submitted 20 June, 2021; v1 submitted 26 October, 2020; originally announced November 2020.

arXiv:1911.12706 [pdf, other]

Cameras Viewing Cameras Geometry

Authors: Danail Brezov, Michael Werman

Abstract: A basic problem in computer vision is to understand the structure of a real-world scene given several images of it. Here we study several theoretical aspects of the intra multi-view geometry of calibrated cameras when all that they can reliably recognize is each other. With the proliferation of wearable cameras, autonomous vehicles and drones, the geometry of these multiple cameras is a timely and relevant problem to study.

Submitted 28 November, 2019; originally announced November 2019.

arXiv:1903.02582 [pdf, other]

Clear Skies Ahead: Towards Real-Time Automatic Sky Replacement in Video

Authors: Tavi Halperin, Harel Cain, Ofir Bibi, Michael Werman

Abstract: Digital videos such as those captured by a smartphone often exhibit exposure inconsistencies, a poorly exposed sky, or simply suffer from an uninteresting or plain looking sky. Professionals may edit these videos using advanced and time-consuming tools unavailable to most users, to replace the sky with a more expressive or imaginative sky. In this work, we propose an algorithm for automatic replacement of the sky region in a video with a different sky, providing nonprofessional users with a simple yet efficient tool to seamlessly replace the sky. The method is fast, achieving close to real-time performance on mobile devices and the user's involvement can remain as limited as simply selecting the replacement sky.

Submitted 6 March, 2019; originally announced March 2019.

Comments: Eurographics 2019. Supplementary video: https://youtu.be/1uZ46YzX-pI

arXiv:1812.09025 [pdf, other]

Detection of distal radius fractures trained by a small set of X-ray images and Faster R-CNN

Authors: Erez Yahalomi, Michael Chernofsky, Michael Werman

Abstract: Distal radius fractures are the most common fractures of the upper extremity in humans. As such, they account for a significant portion of the injuries that present to emergency rooms and clinics throughout the world. We trained a Faster R-CNN, a machine vision neural network for object detection, to identify and locate distal radius fractures in anteroposterior X-ray images. We achieved an accuracy of 96% in identifying fractures and mean Average Precision, mAP, of 0.866. This is significantly more accurate than the detection achieved by physicians and radiologists. These results were obtained by training the deep learning network with only 38 original anteroposterior hand X-ray images with fractures. This opens the possibility of using this type of neural network to detect rare diseases or rare symptoms of common diseases, where only a small set of diagnosed X-ray images could be collected for each disease.

Submitted 21 December, 2018; originally announced December 2018.

Journal ref: Computing Conference 2019

arXiv:1812.02302 [pdf, other]

doi 10.1007/978-3-030-69637-5_19

On Min-Max affine approximants of convex or concave real valued functions from $\mathbb R^k$, Chebyshev equioscillation and graphics

Authors: Steven B. Damelin, David L. Ragozin, Michael Werman

Abstract: We study Min-Max affine approximants of a continuous convex or concave function $f:Δ\subset \mathbb R^k\xrightarrow{} \mathbb R$ where $Δ$ is a convex compact subset of $\mathbb R^k$. In the case when $Δ$ is a simplex we prove that there is a vertical translate of the supporting hyperplane in $\mathbb R^{k+1}$ of the graph of $f$ at the vertices which is the unique best affine approximant to $f$ on $Δ$. For $k=1$, this result provides an extension of the Chebyshev equioscillation theorem for linear approximants. Our result has interesting connections to the computer graphics problem of rapid rendering of projective transformations.

Submitted 17 August, 2021; v1 submitted 5 December, 2018; originally announced December 2018.

MSC Class: 49J30; 49J35; 49J10; 49J21 (Primary); 94A08 (Secondary)

Journal ref: In: Hirn, M., Li, S., Okoudjou, K.A., Saliani, S. (eds.) Excursions in Harmonic Analysis. Applied and Numerical Harmonic Analysis, vol. 6. Springer, Cham (2021)
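For the $k=1$ case described in the abstract above, the simplex is an interval $[a,b]$ and the hyperplane through the graph at the vertices is the chord joining $(a,f(a))$ and $(b,f(b))$. The following is a minimal sketch of that special case only, not the authors' code: the function name `best_affine_1d` and the `samples` parameter are hypothetical, and the largest gap between chord and function is estimated on a uniform grid rather than computed exactly.

```python
def best_affine_1d(f, a, b, samples=1000):
    """Approximate the min-max affine fit to a convex or concave f on [a, b].

    Returns (slope, intercept, error). The chord through the endpoints is
    shifted vertically by half of its largest gap to f, so the error
    equioscillates between the endpoints and the point of largest gap.
    """
    slope = (f(b) - f(a)) / (b - a)

    def chord(x):
        return f(a) + slope * (x - a)

    # Assumption: a uniform grid is fine enough to locate the largest gap.
    xs = [a + (b - a) * i / samples for i in range(samples + 1)]
    # The gap is >= 0 everywhere for convex f and <= 0 for concave f.
    gap = max((chord(x) - f(x) for x in xs), key=abs)

    intercept = f(a) - slope * a - gap / 2
    return slope, intercept, abs(gap) / 2


slope, intercept, error = best_affine_1d(lambda x: x * x, 0.0, 1.0)
print(slope, intercept, error)
```

For $f(x)=x^2$ on $[0,1]$ the chord is $y=x$, its largest gap to $f$ is $1/4$ at $x=1/2$, and the shifted line $y=x-1/8$ has error $1/8$ at $x=0$, $x=1/2$ and $x=1$ with alternating sign.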

arXiv:1811.06287 [pdf, other]

Sketch based Reduced Memory Hough Transform

Authors: Levi Offen, Michael Werman

Abstract: This paper proposes using sketch algorithms to represent the votes in Hough transforms. Replacing the accumulator array with a sketch (Sketch Hough Transform - SHT) significantly reduces the memory needed to compute a Hough transform. We also present a new sketch, Count Median Update, which works better than known sketch methods for replacing the accumulator array in the Hough Transform.

Submitted 15 November, 2018; originally announced November 2018.

Comments: 5 pages

MSC Class: 1

Journal ref: 2018 25th IEEE International Conference on Image Processing (ICIP)
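The idea of replacing the Hough accumulator array with a sketch can be illustrated as follows. This is only a sketch under stated assumptions: it uses a standard Count-Min sketch, not the paper's Count Median Update (whose details are not given in the abstract), and the class and function names, the hash parameters, and the bin sizes are all hypothetical choices made for the example.

```python
import math


class CountMinSketch:
    """Standard Count-Min sketch over non-negative integer keys."""

    def __init__(self, width, depth, prime=2_147_483_647):
        self.width = width
        self.prime = prime
        # Fixed (a, b) pairs keep the example deterministic; a real
        # implementation would draw them at random.
        self.params = [(2 * i + 3, 7 * i + 1) for i in range(depth)]
        self.rows = [[0] * width for _ in range(depth)]

    def _slots(self, key):
        for row, (a, b) in zip(self.rows, self.params):
            yield row, ((a * key + b) % self.prime) % self.width

    def add(self, key):
        for row, slot in self._slots(key):
            row[slot] += 1

    def estimate(self, key):
        # Collisions only add counts, so the minimum never underestimates.
        return min(row[slot] for row, slot in self._slots(key))


def bin_key(t, rho, rho_max=64):
    """Map a (theta index, rho) pair to a single integer key."""
    return t * (2 * rho_max + 1) + round(rho) + rho_max


def hough_votes(points, sketch, n_theta=180, rho_max=64):
    """Cast line votes (rho = x cos(theta) + y sin(theta)) into the sketch."""
    for x, y in points:
        for t in range(n_theta):
            theta = math.pi * t / n_theta
            rho = x * math.cos(theta) + y * math.sin(theta)
            sketch.add(bin_key(t, rho, rho_max))


# Ten points on the horizontal line y = 5 (theta = 90 degrees, rho = 5).
sketch = CountMinSketch(width=512, depth=4)
hough_votes([(x, 5) for x in range(10)], sketch)
print(sketch.estimate(bin_key(90, 5)))
```

Here the sketch holds 4 x 512 counters, while a dense accumulator with the same binning would need 180 x 129 cells. The price is that estimates can be inflated by hash collisions.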

arXiv:1811.06277 [pdf, other]

Image declipping with deep networks

Authors: Shachar Honig, Michael Werman

Abstract: We present a deep network to recover pixel values lost to clipping. The clipped area of the image is typically a uniform area of minimum or maximum brightness, losing image detail and color fidelity. The degree to which the clipping is visually noticeable depends on the amount by which values were clipped, and the extent of the clipped area. Clipping may occur in any (or all) of the pixel's color channels. Although clipped pixels are common and occur to some degree in almost every image we tested, current automatic solutions have only partial success in repairing clipped pixels and work only in limited cases such as only with overexposure (not under-exposure) and when some of the color channels are not clipped. Using neural networks and their ability to model natural images allows our neural network, DeclipNet, to reconstruct data in clipped regions producing state of the art results.

Submitted 15 November, 2018; originally announced November 2018.

Comments: 5 pages

MSC Class: 68

Journal ref: 2018 25th IEEE International Conference on Image Processing (ICIP)

arXiv:1810.09496 [pdf, other]

Two view constraints on the epipoles from few correspondences

Authors: Yoni Kasten, Michael Werman

Abstract: In general it requires at least 7 point correspondences to compute the fundamental matrix between views. We use the cross ratio invariance between corresponding epipolar lines, stemming from epipolar line homography, to derive a simple formulation for the relationship between epipoles and corresponding points. We show how it can be used to reduce the number of required points for the epipolar geometry when some information about the epipoles is available and demonstrate this with a buddy search app.

Submitted 22 October, 2018; originally announced October 2018.
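The cross ratio invariance mentioned in the abstract above can be checked numerically in its simplest form. The sketch below is an illustration, not the paper's formulation: it assumes each epipolar line in a pencil is described by a single scalar parameter and that corresponding lines are related by a 1D homography; the function names and the example matrix are hypothetical.

```python
from fractions import Fraction


def cross_ratio(a, b, c, d):
    """Cross ratio (a, b; c, d) of four distinct scalar parameters."""
    return ((a - c) * (b - d)) / ((a - d) * (b - c))


def homography_1d(x, m):
    """Apply the 1D homography x -> (p*x + q) / (r*x + s), m = ((p, q), (r, s))."""
    (p, q), (r, s) = m
    return (p * x + q) / (r * x + s)


# Four line parameters in the first view (exact rational arithmetic).
params = [Fraction(0), Fraction(1), Fraction(2), Fraction(3)]
# An arbitrary invertible map standing in for the epipolar line homography.
m = ((Fraction(2), Fraction(1)), (Fraction(1), Fraction(3)))
mapped = [homography_1d(x, m) for x in params]

print(cross_ratio(*params), cross_ratio(*mapped))
```

Both cross ratios equal 4/3, so a fourth corresponding line is constrained once three correspondences and the cross ratio are known.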

arXiv:1710.01692 [pdf, other]

IQ of Neural Networks

Authors: Dokhyam Hoshen, Michael Werman

Abstract: IQ tests are an accepted method for assessing human intelligence. The tests consist of several parts that must be solved under a time constraint. Of all the tested abilities, pattern recognition has been found to have the highest correlation with general intelligence. This is primarily because pattern recognition is the ability to find order in a noisy environment, a necessary skill for intelligent agents. In this paper, we propose a convolutional neural network (CNN) model for solving geometric pattern recognition problems. The CNN receives as input multiple ordered input images and outputs the next image according to the pattern. Our CNN is able to solve problems involving rotation, reflection, color, size and shape patterns and score within the top 5% of human performance.

Submitted 29 September, 2017; originally announced October 2017.

arXiv:1703.09725 [pdf, other]

An Epipolar Line from a Single Pixel

Authors: Tavi Halperin, Michael Werman

Abstract: Computing the epipolar geometry from feature points between cameras with very different viewpoints is often error prone, as an object's appearance can vary greatly between images. For such cases, it has been shown that using motion extracted from video can achieve much better results than using a static image. This paper extends these earlier works based on the scene dynamics. In this paper we propose a new method to compute the epipolar geometry from a video stream, by exploiting the following observation: For a pixel p in Image A, all pixels corresponding to p in Image B are on the same epipolar line. Equivalently, the image of the line going through camera A's center and p is an epipolar line in B. Therefore, when cameras A and B are synchronized, the momentary images of two objects projecting to the same pixel, p, in camera A at times t1 and t2, lie on an epipolar line in camera B. Based on this observation we achieve fast and precise computation of epipolar lines. Calibrating cameras based on our method of finding epipolar lines is much faster and more robust than previous methods.

Submitted 15 December, 2018; v1 submitted 28 March, 2017; originally announced March 2017.

Comments: WACV 2018

arXiv:1609.05257 [pdf, other]

A convolutional approach to reflection symmetry

Authors: Marcelo Cicconet, Vighnesh Birodkar, Mads Lund, Michael Werman, Davi Geiger

Abstract: We present a convolutional approach to reflection symmetry detection in 2D. Our model, built on the products of complex-valued wavelet convolutions, simplifies previous edge-based pairwise methods. Being parameter-centered, as opposed to feature-centered, it has certain computational advantages when the object sizes are known a priori, as demonstrated in an ellipse detection application. The method outperforms the best-performing algorithm on the CVPR 2013 Symmetry Detection Competition Database in the single-symmetry case. Code and a new database for 2D symmetry detection are available.

Submitted 16 September, 2016; originally announced September 2016.

Comments: This paper is under consideration at Pattern Recognition Letters

arXiv:1607.07660 [pdf, other]

Fundamental Matrices from Moving Objects Using Line Motion Barcodes

Authors: Yoni Kasten, Gil Ben-Artzi, Shmuel Peleg, Michael Werman

Abstract: Computing the epipolar geometry between cameras with very different viewpoints is often very difficult. The appearance of objects can vary greatly, and it is difficult to find corresponding feature points. Prior methods searched for corresponding epipolar lines using points on the convex hull of the silhouette of a single moving object. These methods fail when the scene includes multiple moving objects. This paper extends previous work to scenes having multiple moving objects by using the "Motion Barcodes", a temporal signature of lines. Corresponding epipolar lines have similar motion barcodes, and candidate pairs of corresponding epipolar lines are found by the similarity of their motion barcodes. As in previous methods we assume that cameras are relatively stationary and that moving objects have already been extracted using background subtraction.

Submitted 26 July, 2016; originally announced July 2016.

Journal ref: ECCV'16, Amsterdam, Oct. 2016, Vol II, pp. 220-118

arXiv:1604.04848 [pdf, other]

Epipolar Geometry Based On Line Similarity

Authors: Gil Ben-Artzi, Tavi Halperin, Michael Werman, Shmuel Peleg

Abstract: It is known that epipolar geometry can be computed from three epipolar line correspondences but this computation is rarely used in practice since there are no simple methods to find corresponding lines. Instead, methods for finding corresponding points are widely used. This paper proposes a similarity measure between lines that indicates whether two lines are corresponding epipolar lines and enables finding epipolar line correspondences as needed for the computation of epipolar geometry. A similarity measure between two lines, suitable for video sequences of a dynamic scene, has been previously described. This paper suggests a stereo matching similarity measure suitable for images. It is based on the quality of stereo matching between the two lines, as corresponding epipolar lines yield a good stereo correspondence. Instead of an exhaustive search over all possible pairs of lines, the search space is substantially reduced when two corresponding point pairs are given. We validate the proposed method using real-world images and compare it to state-of-the-art methods. We found this method to be more accurate by a factor of five compared to the standard method using seven corresponding points and comparable to the 8-points algorithm.

Submitted 7 January, 2017; v1 submitted 17 April, 2016; originally announced April 2016.

Comments: ICPR 2016, Cancun, Dec 2016

Journal ref: ICPR'16, Cancun, Dec. 2016, pp. 1865-1870

arXiv:1506.07866 [pdf, other]

Camera Calibration from Dynamic Silhouettes Using Motion Barcodes

Authors: Gil Ben-Artzi, Yoni Kasten, Shmuel Peleg, Michael Werman

Abstract: Computing the epipolar geometry between cameras with very different viewpoints is often problematic as matching points are hard to find. In these cases, it has been proposed to use information from dynamic objects in the scene for suggesting point and line correspondences. We propose a speed up of about two orders of magnitude, as well as an increase in robustness and accuracy, to methods computing epipolar geometry from dynamic silhouettes. This improvement is based on a new temporal signature: motion barcode for lines. Motion barcode is a binary temporal sequence for lines, indicating for each frame the existence of at least one foreground pixel on that line. The motion barcodes of two corresponding epipolar lines are very similar, so the search for corresponding epipolar lines can be limited only to lines having similar barcodes. The use of motion barcodes leads to increased speed, accuracy, and robustness in computing the epipolar geometry.

Submitted 7 January, 2017; v1 submitted 25 June, 2015; originally announced June 2015.

Comments: Update metadata

Journal ref: Proc. CVPR'16, Las Vegas, June 2016, pp. 4095-4103
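The definition of a motion barcode for a line, as given in the abstract above, can be sketched in a few lines. This is an illustration under assumptions rather than the authors' implementation: foreground masks are taken to be nested lists of 0/1 values, a line is given as an explicit list of (row, column) pixels, and the agreement fraction used to compare barcodes is a stand-in, since the abstract does not specify the similarity measure.

```python
def motion_barcode(masks, line_pixels):
    """One bit per frame: 1 if any pixel on the line is foreground."""
    return [
        int(any(mask[r][c] for r, c in line_pixels))
        for mask in masks
    ]


def barcode_agreement(a, b):
    """Fraction of frames on which two barcodes agree (hypothetical measure)."""
    if len(a) != len(b) or not a:
        raise ValueError("barcodes must be non-empty and of equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


# Three frames of 2x3 foreground masks (1 = moving object).
masks = [
    [[0, 0, 0], [0, 1, 0]],
    [[0, 0, 0], [0, 0, 0]],
    [[1, 0, 0], [0, 0, 0]],
]
row0 = [(0, 0), (0, 1), (0, 2)]
row1 = [(1, 0), (1, 1), (1, 2)]

print(motion_barcode(masks, row0))  # [0, 0, 1]
print(motion_barcode(masks, row1))  # [1, 0, 0]
```

In a two-camera setting, each candidate line in one view would be compared only against lines in the other view whose barcodes score highly, which is what restricts the search.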

arXiv:1505.08070 [pdf, other]

General Deformations of Point Configurations Viewed By a Pinhole Model Camera

Authors: Yirmeyahu Kaminski, Michael Werman

Abstract: This paper is a theoretical study of the following Non-Rigid Structure from Motion problem. What can be computed from a monocular view of a parametrically deforming set of points? We treat various variations of this problem for affine and polynomial deformations with calibrated and uncalibrated cameras. We show that in general at least three images with two quasi-identical deformations are needed in order to have a finite set of solutions of the points' structure and calculate some simple examples.

Submitted 9 January, 2022; v1 submitted 29 May, 2015; originally announced May 2015.

MSC Class: 53Z99; 53A07; 68T45; 14P05

arXiv:1502.00561

Quantum Pairwise Symmetry: Applications in 2D Shape Analysis

Authors: Marcelo Cicconet, Davi Geiger, Michael Werman

Abstract: A pair of rooted tangents -- defining a quantum triangle -- with an associated quantum wave of spin 1/2 is proposed as the primitive to represent and compute symmetry. Measures of the spin characterize how "isosceles" or how "degenerate" these triangles are -- which corresponds to their mirror or parallel symmetry. We also introduce a complex-valued kernel to model probability errors in the parameter space, which is more robust to noise and clutter than the classical model.

Submitted 9 February, 2015; v1 submitted 2 February, 2015; originally announced February 2015.

Comments: The paper has been withdrawn since the authors concluded a more comprehensive study on the choice of parameters needs to be performed

arXiv:1502.00558

Complex-Valued Hough Transforms for Circles

Authors: Marcelo Cicconet, Davi Geiger, Michael Werman

Abstract: This paper advocates the use of complex variables to represent votes in the Hough transform for circle detection. Replacing the positive numbers classically used in the parameter space of the Hough transforms by complex numbers allows cancellation effects when adding up the votes. Cancellation and the computation of shape likelihood via a complex number's magnitude square lead to more robust solutions than the "classic" algorithms, as shown by computational experiments on synthetic and real datasets.

Submitted 9 February, 2015; v1 submitted 2 February, 2015; originally announced February 2015.

Comments: The paper has been withdrawn since the authors concluded a more comprehensive study on the choice of parameters needs to be performed

arXiv:1412.1455 [pdf, other]

Event Retrieval Using Motion Barcodes

Authors: Gil Ben-Artzi, Michael Werman, Shmuel Peleg

Abstract: We introduce a simple and effective method for retrieval of videos showing a specific event, even when the videos of that event were captured from significantly different viewpoints. Appearance-based methods fail in such cases, as appearances change with large changes of viewpoints. Our method is based on a pixel-based feature, "motion barcode", which records the existence/non-existence of motion as a function of time. While appearance, motion magnitude, and motion direction can vary greatly between disparate viewpoints, the existence of motion is viewpoint invariant. Based on the motion barcode, a similarity measure is developed for videos of the same event taken from very different viewpoints. This measure is robust to occlusions common under different viewpoints, and can be computed efficiently. Event retrieval is demonstrated using challenging videos from stationary and hand held cameras.

Submitted 12 May, 2015; v1 submitted 3 December, 2014; originally announced December 2014.

Journal ref: Proc. ICIP'15, Quebec City, Sept. 2015, pp 2621-2625

arXiv:1211.5556 [pdf, ps, other]

Improving Perceptual Color Difference using Basic Color Terms

Authors: Ofir Pele, Michael Werman

Abstract: We suggest a new color distance based on two observations. First, perceptual color differences were designed to be used to compare very similar colors. They do not capture human perception for medium and large color differences well. Thresholding was proposed to solve the problem for large color differences, i.e. two totally different colors are always the same distance apart. We show that thresholding alone cannot improve medium color differences. We suggest to alleviate this problem using basic color terms. Second, when a color distance is used for edge detection, many small distances around the just noticeable difference may account for false edges. We suggest to reduce the effect of small distances.

Submitted 23 November, 2012; originally announced November 2012.

arXiv:1107.4958 [pdf, other]

Efficient and Accurate Gaussian Image Filtering Using Running Sums

Authors: Elhanan Elboher, Michael Werman

Abstract: This paper presents a simple and efficient method to convolve an image with a Gaussian kernel. The computation is performed in a constant number of operations per pixel using running sums along the image rows and columns. We investigate the error function used for kernel approximation and its relation to the properties of the input signal. Based on natural image statistics we propose a quadratic form kernel error function so that the output image l2 error is minimized. We apply the proposed approach to approximate the Gaussian kernel by a linear combination of constant functions. This results in a very efficient Gaussian filtering method. Our experiments show that the proposed technique is faster than state-of-the-art methods while preserving a similar accuracy.

Submitted 25 July, 2011; originally announced July 2011.

Showing 1–27 of 27 results for author: Werman, M