Search | arXiv e-print repository

DINO-Tracker: Taming DINO for Self-Supervised Point Tracking in a Single Video

Authors: Narek Tumanyan, Assaf Singer, Shai Bagon, Tali Dekel

Abstract: We present DINO-Tracker -- a new framework for long-term dense tracking in video. The pillar of our approach is combining test-time training on a single video, with the powerful localized semantic features learned by a pre-trained DINO-ViT model. Specifically, our framework simultaneously adopts DINO's features to fit to the motion observations of the test video, while training a tracker that directly leverages the refined features. The entire framework is trained end-to-end using a combination of self-supervised losses, and regularization that allows us to retain and benefit from DINO's semantic prior. Extensive evaluation demonstrates that our method achieves state-of-the-art results on known benchmarks. DINO-Tracker significantly outperforms self-supervised methods and is competitive with state-of-the-art supervised trackers, while outperforming them in challenging cases of tracking under long-term occlusions.

Submitted 21 March, 2024; originally announced March 2024.

arXiv:2312.05052 [pdf, other]

doi 10.1145/3357384.3357801

Soft Frequency Capping for Improved Ad Click Prediction in Yahoo Gemini Native

Authors: Michal Aharon, Yohay Kaplan, Rina Levy, Oren Somekh, Ayelet Blanc, Neetai Eshel, Avi Shahar, Assaf Singer, Alex Zlotnik

Abstract: Yahoo's native advertising (also known as Gemini native) serves billions of ad impressions daily, reaching a yearly run-rate of many hundreds of millions USD. Driving the Gemini native models that are used to predict both click probability (pCTR) and conversion probability (pCONV) is OFFSET - a feature enhanced collaborative-filtering (CF) based event prediction algorithm. OFFSET is a one-pass algorithm that updates its model for every new batch of logged data using a stochastic gradient descent (SGD) based approach. Since OFFSET represents its users by their features (i.e., user-less model) due to sparsity issues, rule based hard frequency capping (HFC) is used to control the number of times a certain user views a certain ad. Moreover, related statistics reveal that user ad fatigue results in a dramatic drop in click through rate (CTR). Therefore, to improve click prediction accuracy, we propose a soft frequency capping (SFC) approach, where the frequency feature is incorporated into the OFFSET model as a user-ad feature and its weight vector is learned via logistic regression as part of OFFSET training. Online evaluation of the soft frequency capping algorithm via bucket testing showed a significant 7.3% revenue lift. Since then, the frequency feature enhanced model has been pushed to production serving all traffic, and is generating a hefty revenue lift for Yahoo Gemini native. We also report related statistics that reveal, among other things, that while users' gender does not affect ad fatigue, the latter seems to increase with users' age.

Submitted 8 December, 2023; originally announced December 2023.

Comments: In Proc. CIKM'2019. arXiv admin note: text overlap with arXiv:2111.07866 by other authors

arXiv:2308.10386 [pdf, other]

Unsupervised Opinion Aggregation -- A Statistical Perspective

Authors: Noyan C. Sevuktekin, Andrew C. Singer

Abstract: Complex decision-making systems rarely have direct access to the current state of the world and they instead rely on opinions to form an understanding of what the ground truth could be. Even in problems where experts provide opinions without any intention to manipulate the decision maker, it is challenging to decide which expert's opinion is more reliable -- a challenge that is further amplified when the decision-maker has limited, delayed, or no access to the ground truth after the fact. This paper explores a statistical approach to infer the competence of each expert based on their opinions without any need for the ground truth. Echoing the logic behind what is commonly referred to as \textit{the wisdom of crowds}, we propose measuring the competence of each expert by their likeliness to agree with their peers. We further show that the more reliable an expert is the more likely it is that they agree with their peers. We leverage this fact to propose a completely unsupervised version of the naïve Bayes classifier and show that the proposed technique is asymptotically optimal for a large class of problems. In addition to aggregating a large block of opinions, we further apply our technique for online opinion aggregation and for decision-making based on a limited number of opinions.

Submitted 20 August, 2023; originally announced August 2023.

Comments: This research was conducted during Noyan Sevuktekin's time at University of Illinois at Urbana-Champaign and the results were first presented in Chapter 3 of his dissertation, entitled "Learning From Opinions". Permalink: https://hdl.handle.net/2142/110814
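
The peer-agreement idea in the abstract above can be illustrated with a small sketch. It is not the authors' estimator: it assumes binary opinions that are conditionally independent given the ground truth, and the function names `expected_agreement` and `peer_agreement_scores` are hypothetical. Under that assumption, an expert with accuracy p_i agrees with a peer of accuracy p_j with probability p_i p_j + (1 - p_i)(1 - p_j), which increases with p_i whenever p_j > 1/2.

```python
def expected_agreement(p_i, p_j):
    """Probability that two experts give the same binary opinion.

    Assumes (for illustration only) that each expert is correct with
    probability p_i / p_j, independently given the ground truth.
    """
    return p_i * p_j + (1 - p_i) * (1 - p_j)


def peer_agreement_scores(opinions):
    """Empirical agreement rate of each expert with all other experts.

    opinions[e][t] is expert e's opinion on task t. No ground truth is
    used: the score only counts how often an expert matches its peers.
    """
    n_experts = len(opinions)
    n_tasks = len(opinions[0])
    scores = []
    for i in range(n_experts):
        agree = sum(
            opinions[i][t] == opinions[j][t]
            for j in range(n_experts)
            if j != i
            for t in range(n_tasks)
        )
        scores.append(agree / ((n_experts - 1) * n_tasks))
    return scores


if __name__ == "__main__":
    # Toy opinion matrix: 3 experts, 4 tasks (values are made up).
    toy = [[1, 0, 1, 1], [1, 0, 1, 0], [0, 1, 1, 0]]
    print(peer_agreement_scores(toy))
```

Ranking experts by such a score requires no labels, which is the sense in which the approach is unsupervised; how the scores feed into a naïve Bayes style aggregator is described in the paper, not in this sketch.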

arXiv:2209.10531 [pdf, other]

doi 10.1073/pnas.2216507120

Autocorrelation analysis for cryo-EM with sparsity constraints: Improved sample complexity and projection-based algorithms

Authors: Tamir Bendory, Yuehaw Khoo, Joe Kileel, Oscar Mickelin, Amit Singer

Abstract: The number of noisy images required for molecular reconstruction in single-particle cryo-electron microscopy (cryo-EM) is governed by the autocorrelations of the observed, randomly-oriented, noisy projection images. In this work, we consider the effect of imposing sparsity priors on the molecule. We use techniques from signal processing, optimization, and applied algebraic geometry to obtain new theoretical and computational contributions for this challenging non-linear inverse problem with sparsity constraints. We prove that molecular structures modeled as sums of Gaussians are uniquely determined by the second-order autocorrelation of their projection images, implying that the sample complexity is proportional to the square of the variance of the noise. This theory improves upon the non-sparse case, where the third-order autocorrelation is required for uniformly-oriented particle images and the sample complexity scales with the cube of the noise variance. Furthermore, we build a computational framework to reconstruct molecular structures which are sparse in the wavelet basis. This method combines the sparse representation for the molecule with projection-based techniques used for phase retrieval in X-ray crystallography.

Submitted 1 May, 2023; v1 submitted 21 September, 2022; originally announced September 2022.

Comments: 31 pages, 5 figures, 1 movie

Journal ref: Proceedings of the National Academy of Sciences 120.18 (2023): e2216507120

arXiv:2207.13674 [pdf, other]

Fast expansion into harmonics on the disk: a steerable basis with fast radial convolutions

Authors: Nicholas F. Marshall, Oscar Mickelin, Amit Singer

Abstract: We present a fast and numerically accurate method for expanding digitized $L \times L$ images representing functions on $[-1,1]^2$ supported on the disk $\{x \in \mathbb{R}^2 : |x|<1\}$ in the harmonics (Dirichlet Laplacian eigenfunctions) on the disk. Our method, which we refer to as the Fast Disk Harmonics Transform (FDHT), runs in $O(L^2 \log L)$ operations. This basis is also known as the Fourier-Bessel basis, and it has several computational advantages: it is orthogonal, ordered by frequency, and steerable in the sense that images expanded in the basis can be rotated by applying a diagonal transform to the coefficients. Moreover, we show that convolution with radial functions can also be efficiently computed by applying a diagonal transform to the coefficients.

Submitted 21 December, 2022; v1 submitted 27 July, 2022; originally announced July 2022.

Comments: 26 pages, 5 figures, 1 table

MSC Class: 65R10; 65D18; 42-04; 33C10
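
The steerability property mentioned in the abstract above can be written out in a few lines. The notation below is assumed for illustration and is not taken from the listing: $J_n$ is the Bessel function of the first kind, $\lambda_{nk}$ its $k$-th positive root, $c_{nk}$ a normalization constant, and $a_{nk}$ the expansion coefficients.

```latex
% Assumed notation: Fourier-Bessel basis function in polar coordinates on the unit disk.
\begin{align*}
\psi_{nk}(r,\theta) &= c_{nk}\, J_n(\lambda_{nk} r)\, e^{i n \theta},
  \qquad 0 \le r < 1, \\
f(r,\theta) &= \sum_{n,k} a_{nk}\, \psi_{nk}(r,\theta), \\
% Rotating f by an angle \alpha only multiplies each coefficient by a phase:
f(r,\theta-\alpha) &= \sum_{n,k} \bigl(e^{-i n \alpha}\, a_{nk}\bigr)\, \psi_{nk}(r,\theta).
\end{align*}
```

The last line follows from $\psi_{nk}(r,\theta-\alpha) = e^{-in\alpha}\psi_{nk}(r,\theta)$, so rotation acts on the coefficient vector as the diagonal map $a_{nk} \mapsto e^{-in\alpha} a_{nk}$.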

arXiv:2202.09388 [pdf, other]

A Molecular Prior Distribution for Bayesian Inference Based on Wilson Statistics

Authors: Marc Aurèle Gilles, Amit Singer

Abstract: Background and Objective: Wilson statistics describe well the power spectrum of proteins at high frequencies. Therefore, it has found several applications in structural biology, e.g., it is the basis for sharpening steps used in cryogenic electron microscopy (cryo-EM). A recent paper gave the first rigorous proof of Wilson statistics based on a formalism of Wilson's original argument. This new analysis also leads to statistical estimates of the scattering potential of proteins that reveal a correlation between neighboring Fourier coefficients. Here we exploit these estimates to craft a novel prior that can be used for Bayesian inference of molecular structures. Methods: We describe the properties of the prior and the computation of its hyperparameters. We then evaluate the prior on two synthetic linear inverse problems, and compare against a popular prior in cryo-EM reconstruction at a range of SNRs. Results: We show that the new prior effectively suppresses noise and fills-in low SNR regions in the spectral domain. Furthermore, it improves the resolution of estimates on the problems considered for a wide range of SNR and produces Fourier Shell Correlation curves that are insensitive to masking effects. Conclusions: We analyze the assumptions in the model, discuss relations to other regularization strategies, and postulate on potential implications for structure determination in cryo-EM.

Submitted 2 May, 2022; v1 submitted 18 February, 2022; originally announced February 2022.

arXiv:2202.07737 [pdf, other]

Ab-initio Contrast Estimation and Denoising of Cryo-EM Images

Authors: Yunpeng Shi, Amit Singer

Abstract: Background and Objective: The contrast of cryo-EM images varies from one to another, primarily due to the uneven thickness of the ice layer. This contrast variation can affect the quality of 2-D class averaging, 3-D ab-initio modeling, and 3-D heterogeneity analysis. Contrast estimation is currently performed during 3-D iterative refinement. As a result, the estimates are not available at the earlier computational stages of class averaging and ab-initio modeling. This paper aims to solve the contrast estimation problem directly from the picked particle images in the ab-initio stage, without estimating the 3-D volume, image rotations, or class averages. Methods: The key observation underlying our analysis is that the 2-D covariance matrix of the raw images is related to the covariance of the underlying clean images, the noise variance, and the contrast variability between images. We show that the contrast variability can be derived from the 2-D covariance matrix and we apply the existing Covariance Wiener Filtering (CWF) framework to estimate it. We also demonstrate a modification of CWF to estimate the contrast of individual images. Results: Our method improves the contrast estimation by a large margin, compared to the previous CWF method. Its estimation accuracy is often comparable to that of an oracle that knows the ground truth covariance of the clean images. The more accurate contrast estimation also improves the quality of image restoration as demonstrated in both synthetic and experimental datasets. Conclusions: This paper proposes an effective method for contrast estimation directly from noisy images without using any 3-D volume information. It enables contrast correction in the earlier stage of single particle analysis, and may improve the accuracy of downstream processing.

Submitted 30 June, 2022; v1 submitted 15 February, 2022; originally announced February 2022.

MSC Class: 15A29; 65D18; 62H35; 94A08

arXiv:2109.11656 [pdf, other]

Sparse multi-reference alignment: sample complexity and computational hardness

Authors: Tamir Bendory, Oscar Mickelin, Amit Singer

Abstract: Motivated by the problem of determining the atomic structure of macromolecules using single-particle cryo-electron microscopy (cryo-EM), we study the sample and computational complexities of the sparse multi-reference alignment (MRA) model: the problem of estimating a sparse signal from its noisy, circularly shifted copies. Based on its tight connection to the crystallographic phase retrieval problem, we establish that if the number of observations is proportional to the square of the variance of the noise, then the sparse MRA problem is statistically feasible for sufficiently sparse signals. To investigate its computational hardness, we consider three types of computational frameworks: projection-based algorithms, bispectrum inversion, and convex relaxations. We show that a state-of-the-art projection-based algorithm achieves the optimal estimation rate, but its computational complexity is exponential in the sparsity level. The bispectrum framework provides a statistical-computational trade-off: it requires more observations (so its estimation rate is suboptimal), but its computational load is provably polynomial in the signal's length. The convex relaxation approach provides polynomial time algorithms (with a large exponent) that recover sufficiently sparse signals at the optimal estimation rate. We conclude the paper by discussing potential statistical and algorithmic implications for cryo-EM.

Submitted 23 September, 2021; originally announced September 2021.
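
As a rough illustration of the MRA observation model described in the abstract above, the sketch below generates noisy, circularly shifted copies of a sparse signal and computes the power spectrum, a shift-invariant feature equivalent to the second-order autocorrelation. It is not one of the algorithms analyzed in the paper; the function names, the uniform shift distribution, and the Gaussian noise model are assumptions made for the example.

```python
import cmath
import random


def circular_shift(x, s):
    """Return x circularly shifted by s positions."""
    n = len(x)
    return [x[(i - s) % n] for i in range(n)]


def dft(x):
    """Naive O(n^2) discrete Fourier transform (fine for short signals)."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def power_spectrum(x):
    """|DFT|^2, which does not change under circular shifts of x."""
    return [abs(c) ** 2 for c in dft(x)]


def mra_observations(x, num_obs, sigma, rng):
    """Draw num_obs copies of x, each with a uniform random circular
    shift and additive Gaussian noise of standard deviation sigma
    (both distributions are assumptions of this sketch)."""
    obs = []
    for _ in range(num_obs):
        shifted = circular_shift(x, rng.randrange(len(x)))
        obs.append([v + rng.gauss(0.0, sigma) for v in shifted])
    return obs


if __name__ == "__main__":
    signal = [0, 0, 3, 0, 0, -1, 0, 0]  # sparse toy signal
    data = mra_observations(signal, num_obs=4, sigma=0.5, rng=random.Random(0))
    print(len(data), len(data[0]))
```

Because the power spectrum discards Fourier phases, recovering the signal from it is a phase retrieval problem, which is the connection to crystallography that the abstract refers to.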

arXiv:2105.09831 [pdf, other]

Uncoded Binary Signaling through Modulo AWGN Channel

Authors: Gizem Tabak, Andrew Singer

Abstract: Modulo-wrapping receivers have attracted interest in several areas of digital communications, including precoding and lattice coding. The asymptotic capacity and error performance of the modulo AWGN channel have been well established. However, due to underlying assumptions of the asymptotic analyses, these findings might not always be realistic in physical world applications, which are often dimension- or delay-limited. In this work, the optimum ways to achieve minimum probability of error for binary signaling through a scalar modulo AWGN channel are examined under different scenarios where the receiver has access to full or partial information. In case of partial information at the receiver, an iterative estimation rule is proposed to reduce the error rate, and the performance of different estimators is demonstrated in simulated experiments.

Submitted 20 May, 2021; originally announced May 2021.

arXiv:2104.01078 [pdf, other]

Blind Exploration and Exploitation of Stochastic Experts

Authors: Noyan C. Sevuktekin, Andrew C. Singer

Abstract: We present blind exploration and exploitation (BEE) algorithms for identifying the most reliable stochastic expert based on formulations that employ posterior sampling, upper-confidence bounds, empirical Kullback-Leibler divergence, and minmax methods for the stochastic multi-armed bandit problem. Joint sampling and consultation of experts whose opinions depend on the hidden and random state of the world becomes challenging in the unsupervised, or blind, framework as feedback from the true state is not available. We propose an empirically realizable measure of expert competence that can be inferred instantaneously using only the opinions of other experts. This measure preserves the ordering of true competences and thus enables joint sampling and consultation of stochastic experts based on their opinions on dynamically changing tasks. Statistics derived from the proposed measure are instantaneously available, allowing both blind exploration-exploitation and unsupervised opinion aggregation. We discuss how the lack of supervision affects the asymptotic regret of BEE architectures that rely on UCB1, KL-UCB, MOSS, IMED, and Thompson sampling. We demonstrate the performance of different BEE algorithms empirically and compare them to their standard, or supervised, counterparts.

Submitted 2 April, 2021; originally announced April 2021.

arXiv:2102.05958 [pdf, other]

EventScore: An Automated Real-time Early Warning Score for Clinical Events

Authors: Ibrahim Hammoud, Prateek Prasanna, IV Ramakrishnan, Adam Singer, Mark Henry, Henry Thode

Abstract: Early prediction of patients at risk of clinical deterioration can help physicians intervene and alter their clinical course towards better outcomes. In addition to the accuracy requirement, early warning systems must make the predictions early enough to give physicians enough time to intervene. Interpretability is also one of the challenges when building such systems since being able to justify the reasoning behind model decisions is desirable in clinical practice. In this work, we built an interpretable model for the early prediction of various adverse clinical events indicative of clinical deterioration. The model is evaluated on two datasets and four clinical events. The first dataset is collected in a predominantly COVID-19 positive population at Stony Brook Hospital. The second dataset is the MIMIC III dataset. The model was trained to provide early warning scores for ventilation, ICU transfer, and mortality prediction tasks on the Stony Brook Hospital dataset and to predict mortality and the need for vasopressors on the MIMIC III dataset. Our model first separates each feature into multiple ranges and then uses logistic regression with lasso penalization to select the subset of ranges for each feature. The model training is completely automated and doesn't require expert knowledge like other early warning scores. We compare our model to the Modified Early Warning Score (MEWS) and quick SOFA (qSOFA), commonly used in hospitals. We show that our model outperforms these models in the area under the receiver operating characteristic curve (AUROC) while having a similar or better median detection time on all clinical events, even when using fewer features. Unlike MEWS and qSOFA, our model can be entirely automated without requiring any manually recorded features. We also show that discretization improves model performance by comparing our model to a baseline logistic regression model.

Submitted 13 February, 2021; v1 submitted 11 February, 2021; originally announced February 2021.

arXiv:2101.07709 [pdf, other]

Multi-target detection with rotations

Authors: Tamir Bendory, Ti-Yen Lan, Nicholas F. Marshall, Iris Rukshin, Amit Singer

Abstract: We consider the multi-target detection problem of estimating a two-dimensional target image from a large noisy measurement image that contains many randomly rotated and translated copies of the target image. Motivated by single-particle cryo-electron microscopy, we focus on the low signal-to-noise regime, where it is difficult to estimate the locations and orientations of the target images in the measurement. Our approach uses autocorrelation analysis to estimate rotationally and translationally invariant features of the target image. We demonstrate that, regardless of the level of noise, our technique can be used to recover the target image when the measurement is sufficiently large.

Submitted 2 September, 2022; v1 submitted 19 January, 2021; originally announced January 2021.

Comments: 20 pages, 5 figures

arXiv:2012.14172 [pdf, other]

doi 10.1007/s00041-021-09879-2

Manifold learning with arbitrary norms

Authors: Joe Kileel, Amit Moscovich, Nathan Zelesko, Amit Singer

Abstract: Manifold learning methods play a prominent role in nonlinear dimensionality reduction and other tasks involving high-dimensional data sets with low intrinsic dimensionality. Many of these methods are graph-based: they associate a vertex with each data point and a weighted edge with each pair. Existing theory shows that the Laplacian matrix of the graph converges to the Laplace-Beltrami operator of the data manifold, under the assumption that the pairwise affinities are based on the Euclidean norm. In this paper, we determine the limiting differential operator for graph Laplacians constructed using $\textit{any}$ norm. Our proof involves an interplay between the second fundamental form of the manifold and the convex geometry of the given norm's unit ball. To demonstrate the potential benefits of non-Euclidean norms in manifold learning, we consider the task of mapping the motion of large molecules with continuous variability. In a numerical simulation we show that a modified Laplacian eigenmaps algorithm, based on the Earthmover's distance, outperforms the classic Euclidean Laplacian eigenmaps, both in terms of computational cost and the sample size needed to recover the intrinsic geometry.

Submitted 15 July, 2021; v1 submitted 28 December, 2020; originally announced December 2020.

Comments: 53 pages, 8 figures, 3 tables, to appear in Journal of Fourier Analysis and Applications

Journal ref: Journal of Fourier Analysis and Applications 27, 82 (2021)

arXiv:2012.03860 [pdf, other]

doi 10.1121/10.0005314

Modeling the effects of dynamic range compression on signals in noise

Authors: Ryan M. Corey, Andrew C. Singer

Abstract: Hearing aids use dynamic range compression (DRC), a form of automatic gain control, to make quiet sounds louder and loud sounds quieter. Compression can improve listening comfort, but it can also cause distortion in noisy environments. It has been widely reported that DRC performs poorly in noise, but there has been little mathematical analysis of these distortion effects. This work introduces a mathematical model to study the behavior of DRC in noise. Using statistical assumptions about the signal envelopes, we define an effective compression function that models the compression applied to one signal in the presence of another. This framework is used to prove results about DRC that have been previously observed experimentally: that when DRC is applied to a mixture of signals, uncorrelated signal envelopes become negatively correlated; that the effective compression applied to each sound in a mixture is weaker than it would have been for the signal alone; and that compression can reduce the long-term signal-to-noise ratio in certain conditions. These theoretical results are supported by software experiments using recorded speech signals.

Submitted 7 December, 2020; originally announced December 2020.
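
The claim in the abstract above that compression of a sound is weaker inside a mixture can be illustrated with a toy calculation. This is not the paper's model: it assumes a memoryless hard-knee compressor driven by the instantaneous level of its input, uncorrelated signals whose powers add, and hypothetical parameter values (threshold of -40 dB, ratio of 3).

```python
import math


def compressor_gain_db(level_db, threshold_db=-40.0, ratio=3.0):
    """Gain (dB) of a hard-knee compressor: above the threshold, each
    extra dB of input yields only 1/ratio dB of output."""
    if level_db <= threshold_db:
        return 0.0
    return (threshold_db - level_db) * (1.0 - 1.0 / ratio)


def mix_level_db(a_db, b_db):
    """Level of the sum of two uncorrelated signals (powers add)."""
    return 10.0 * math.log10(10 ** (a_db / 10.0) + 10 ** (b_db / 10.0))


def target_output_db(target_db, noise_db=None):
    """Output level of the target when the gain is controlled by the
    target alone (noise_db=None) or by the target-plus-noise mixture."""
    control = target_db if noise_db is None else mix_level_db(target_db, noise_db)
    return target_db + compressor_gain_db(control)


if __name__ == "__main__":
    # Raise the target by 10 dB and see how much its output level rises.
    alone = target_output_db(-10.0) - target_output_db(-20.0)
    mixed = target_output_db(-10.0, -10.0) - target_output_db(-20.0, -10.0)
    print(f"alone: {alone:.2f} dB, in noise: {mixed:.2f} dB")
```

In this toy setting the target's output follows its input more closely when noise shares the compressor, that is, the effective compression ratio seen by the target is smaller than the nominal one. Attack and release dynamics, which real hearing aids have, are ignored here.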

arXiv:2010.09989 [pdf, other]

Wasserstein K-Means for Clustering Tomographic Projections

Authors: Rohan Rao, Amit Moscovich, Amit Singer

Abstract: Motivated by the 2D class averaging problem in single-particle cryo-electron microscopy (cryo-EM), we present a k-means algorithm based on a rotationally-invariant Wasserstein metric for images. Unlike existing methods that are based on Euclidean ($L_2$) distances, we prove that the Wasserstein metric better accommodates for the out-of-plane angular differences between different particle views. We demonstrate on a synthetic dataset that our method gives superior results compared to an $L_2$ baseline. Furthermore, there is little computational overhead, thanks to the use of a fast linear-time approximation to the Wasserstein-1 metric, also known as the Earthmover's distance.

Submitted 19 October, 2020; originally announced October 2020.

Comments: 11 pages, 5 figures, 1 table

MSC Class: 62H30 (Primary) 92C55; 68U10 (Secondary) ACM Class: I.5.3; I.4.0

Journal ref: Machine Learning for Structural Biology Workshop, NeurIPS 2020

arXiv:2010.09908 [pdf, other]

Product Manifold Learning

Authors: Sharon Zhang, Amit Moscovich, Amit Singer

Abstract: We consider problems of dimensionality reduction and learning data representations for continuous spaces with two or more independent degrees of freedom. Such problems occur, for example, when observing shapes with several components that move independently. Mathematically, if the parameter space of each continuous independent motion is a manifold, then their combination is known as a product manifold. In this paper, we present a new paradigm for non-linear independent component analysis called manifold factorization. Our factorization algorithm is based on spectral graph methods for manifold learning and the separability of the Laplacian operator on product spaces. Recovering the factors of a manifold yields meaningful lower-dimensional representations and provides a new way to focus on particular aspects of the data space while ignoring others. We demonstrate the potential use of our method for an important and challenging problem in structural biology: mapping the motions of proteins and other large molecules using cryo-electron microscopy datasets.

Submitted 19 October, 2020; originally announced October 2020.

Comments: 10 pages, 4 figures

MSC Class: 68T10 (Primary); 42-08; 57Z25 (Secondary) ACM Class: I.5.0; I.2.0

Journal ref: Proceedings of The 24th International Conference on Artificial Intelligence and Statistics. 130 (2021) 3241-3249

arXiv:2008.04521 [pdf, other]

doi 10.1121/10.0002279

Acoustic effects of medical, cloth, and transparent face masks on speech signals

Authors: Ryan M. Corey, Uriah Jones, Andrew C. Singer

Abstract: Face masks muffle speech and make communication more difficult, especially for people with hearing loss. This study examines the acoustic attenuation caused by different face masks, including medical, cloth, and transparent masks, using a head-shaped loudspeaker and a live human talker. The results suggest that all masks attenuate frequencies above 1 kHz, that attenuation is greatest in front of the talker, and that there is substantial variation between mask types, especially cloth masks with different materials and weaves. Transparent masks have poor acoustic performance compared to both medical and cloth masks. Most masks have little effect on lapel microphones, suggesting that existing sound reinforcement and assistive listening systems may be effective for verbal communication with masks.

Submitted 11 August, 2020; originally announced August 2020.

Journal ref: The Journal of the Acoustical Society of America, 148(4), pp. 2371-2375, Oct. 2020

arXiv:2008.03641 [pdf, other]

NMR Assignment through Linear Programming

Authors: Jose F. S. Bravo-Ferreira, David Cowburn, Yuehaw Khoo, Amit Singer

Abstract: Nuclear Magnetic Resonance (NMR) Spectroscopy is the second most used technique (after X-ray crystallography) for structural determination of proteins. A computational challenge in this technique involves solving a discrete optimization problem that assigns the resonance frequency to each atom in the protein. This paper introduces LIAN (LInear programming Assignment for NMR), a novel linear programming formulation of the problem which yields state-of-the-art results in simulated and experimental datasets.

Submitted 7 September, 2021; v1 submitted 8 August, 2020; originally announced August 2020.

Comments: 28 pages, 10 figures

arXiv:2006.15354 [pdf, other]

Super-resolution multi-reference alignment

Authors: Tamir Bendory, Ariel Jaffe, William Leeb, Nir Sharon, Amit Singer

Abstract: We study super-resolution multi-reference alignment, the problem of estimating a signal from many circularly shifted, down-sampled, and noisy observations. We focus on the low SNR regime, and show that a signal in $\mathbb{R}^M$ is uniquely determined when the number $L$ of samples per observation is of the order of the square root of the signal's length $(L=O(\sqrt{M}))$. Phrased more informally, one can square the resolution. This result holds if the number of observations is proportional to at least 1/SNR$^3$. In contrast, with fewer observations recovery is impossible even when the observations are not down-sampled ($L=M$). The analysis combines tools from statistical signal processing and invariant theory. We design an expectation-maximization algorithm and demonstrate that it can super-resolve the signal in challenging SNR regimes.

Submitted 9 November, 2020; v1 submitted 27 June, 2020; originally announced June 2020.

arXiv:2006.11830 [pdf, ps, other]

The NYU-CUBoulder Systems for SIGMORPHON 2020 Task 0 and Task 2

Authors: Assaf Singer, Katharina Kann

Abstract: We describe the NYU-CUBoulder systems for the SIGMORPHON 2020 Task 0 on typologically diverse morphological inflection and Task 2 on unsupervised morphological paradigm completion. The former consists of generating morphological inflections from a lemma and a set of morphosyntactic features describing the target form. The latter requires generating entire paradigms for a set of given lemmas from raw text alone. We model morphological inflection as a sequence-to-sequence problem, where the input is the sequence of the lemma's characters with morphological tags, and the output is the sequence of the inflected form's characters. First, we apply a transformer model to the task. Second, as inflected forms share most characters with the lemma, we further propose a pointer-generator transformer model to allow easy copying of input characters. Our best performing system for Task 0 is placed 6th out of 23 systems. We further use our inflection systems as subcomponents of approaches for Task 2. Our best performing system for Task 2 is the 2nd best out of 7 submissions.

Submitted 21 June, 2020; originally announced June 2020.

Comments: 8 pages, 2 figures

ACM Class: I.2.7; I.2.6

arXiv:2006.09505 [pdf]

Temporal clustering network for self-diagnosing faults from vibration measurements

Authors: G. Zhang, A. R. Singer, N. Vlahopoulos

Abstract: There is a need to build intelligence in operating machinery and use data analysis on monitored signals in order to quantify the health of the operating system and self-diagnose any initiations of fault. Built-in control procedures can automatically take corrective actions in order to avoid catastrophic failure when a fault is diagnosed. This paper presents a Temporal Clustering Network (TCN) capability for processing acceleration measurement(s) made on the operating system (i.e. machinery foundation, machinery casing, etc.), or any other type of temporal signals, and determining based on the monitored signal when a fault is at its onset. The new capability uses: one-dimensional convolutional neural networks (1D-CNN) for processing the measurements; unsupervised learning (i.e. no labeled signals from the different operating conditions and no signals at pristine vs. damaged conditions are necessary for training the 1D-CNN); clustering (i.e. grouping signals in different clusters reflective of the operating conditions); and statistical analysis for identifying fault signals that are not members of any of the clusters associated with the pristine operating conditions. A case study demonstrating its operation is included in the paper. Finally, topics for further research are identified.

Submitted 16 June, 2020; originally announced June 2020.

Comments: 9 pages, 8 figures

MSC Class: 74H45 (primary); 74R99 (secondary)

arXiv:2004.11956 [pdf, other]

Binaural Audio Source Remixing with Microphone Array Listening Devices

Authors: Ryan M. Corey, Andrew C. Singer

Abstract: Augmented listening devices, such as hearing aids and augmented reality headsets, enhance human perception by changing the sounds that we hear. Microphone arrays can improve the performance of listening systems in noisy environments, but most array-based listening systems are designed to isolate a single sound source from a mixture. This work considers a source-remixing filter that alters the relative level of each source independently. Remixing rather than separating sounds can help to improve perceptual transparency: it causes less distortion to the signal spectrum and especially to the interaural cues that humans use to localize sounds in space.

Submitted 24 April, 2020; originally announced April 2020.

Comments: To appear at ICASSP 2020

arXiv:1912.05043 [pdf, other]

Motion-Tolerant Beamforming with Deformable Microphone Arrays

Authors: Ryan M. Corey, Andrew C. Singer

Abstract: Microphone arrays are usually assumed to have rigid geometries: the microphones may move with respect to the sound field but remain fixed relative to each other. However, many useful arrays, such as those in wearable devices, have sensors that can move relative to each other. We compare two approaches to beamforming with deformable microphone arrays: first, by explicitly tracking the geometry of the array as it changes over time, and second, by designing a time-invariant beamformer based on the second-order statistics of the moving array. The time-invariant approach is shown to be appropriate when the motion of the array is small relative to the acoustic wavelengths of interest. The performance of the proposed beamforming system is demonstrated using a wearable microphone array on a moving human listener in a cocktail-party scenario.

Submitted 10 December, 2019; originally announced December 2019.

Comments: Presented at WASPAA 2019

arXiv:1912.05038 [pdf, other]

Cooperative Audio Source Separation and Enhancement Using Distributed Microphone Arrays and Wearable Devices

Authors: Ryan M. Corey, Matthew D. Skarha, Andrew C. Singer

Abstract: Augmented listening devices such as hearing aids often perform poorly in noisy and reverberant environments with many competing sound sources. Large distributed microphone arrays can improve performance, but data from remote microphones often cannot be used for delay-constrained real-time processing. We present a cooperative audio source separation and enhancement system that leverages wearable listening devices and other microphone arrays spread around a room. The full distributed array is used to separate sound sources and estimate their statistics. Each listening device uses these statistics to design real-time binaural audio enhancement filters using its own local microphones. The system is demonstrated experimentally using 10 speech sources and 160 microphones in a large, reverberant room.

Submitted 10 December, 2019; originally announced December 2019.

Comments: To appear at CAMSAP 2019

arXiv:1911.06107 [pdf, other]

doi 10.1109/ISBI45749.2020.9098723

Earthmover-based manifold learning for analyzing molecular conformation spaces

Authors: Nathan Zelesko, Amit Moscovich, Joe Kileel, Amit Singer

Abstract: In this paper, we propose a novel approach for manifold learning that combines the Earthmover's distance (EMD) with the diffusion maps method for dimensionality reduction. We demonstrate the potential benefits of this approach for learning shape spaces of proteins and other flexible macromolecules using a simulated dataset of 3-D density maps that mimic the non-uniform rotary motion of ATP synthase. Our results show that EMD-based diffusion maps require far fewer samples to recover the intrinsic geometry than the standard diffusion maps algorithm that is based on the Euclidean distance. To reduce the computational burden of calculating the EMD for all volume pairs, we employ a wavelet-based approximation to the EMD which reduces the computation of the pairwise EMDs to a computation of pairwise weighted-$\ell_1$ distances between wavelet coefficient vectors.

Submitted 15 October, 2019; originally announced November 2019.

Comments: 5 pages, 4 figures, 1 table

Journal ref: IEEE 17th International Symposium on Biomedical Imaging (ISBI) 2020

arXiv:1910.10006 [pdf, other]

doi 10.1109/ICASSP40776.2020.9053932

Image recovery from rotational and translational invariants

Authors: Nicholas F. Marshall, Ti-Yen Lan, Tamir Bendory, Amit Singer

Abstract: We introduce a framework for recovering an image from its rotationally and translationally invariant features based on autocorrelation analysis. This work is an instance of the multi-target detection statistical model, which is mainly used to study the mathematical and computational properties of single-particle reconstruction using cryo-electron microscopy (cryo-EM) at low signal-to-noise ratios. We demonstrate with synthetic numerical experiments that an image can be reconstructed from rotationally and translationally invariant features and show that the reconstruction is robust to noise. These results constitute an important step towards the goal of structure determination of small biomolecules using cryo-EM.

Submitted 22 October, 2019; originally announced October 2019.

Comments: 5 pages, 3 figures

arXiv:1908.03454 [pdf, other]

Bias and variance reduction and denoising for CTF Estimation

Authors: Ayelet Heimowitz, Joakim Andén, Amit Singer

Abstract: When using an electron microscope for imaging of particles embedded in vitreous ice, the objective lens will inevitably corrupt the projection images. This corruption manifests as a band-pass filter on the micrograph. In addition, it causes the phase of several frequency bands to be flipped and distorts frequency bands. As a precursor to compensating for this distortion, the corrupting point spread function, which is termed the contrast transfer function (CTF) in reciprocal space, must be estimated. In this paper, we will present a novel method for CTF estimation. Our method is based on the multi-taper method for power spectral density estimation, which aims to reduce the bias and variance of the estimator. Furthermore, we use known properties of the CTF and of the background of the power spectrum to increase the accuracy of our estimation. We will show that the resulting estimates capture the zero-crossings of the CTF in the low-mid frequency range.

Submitted 28 January, 2020; v1 submitted 9 August, 2019; originally announced August 2019.

arXiv:1908.00574 [pdf, other]

doi 10.1109/MSP.2019.2957822

Single-particle cryo-electron microscopy: Mathematical theory, computational challenges, and opportunities

Authors: Tamir Bendory, Alberto Bartesaghi, Amit Singer

Abstract: In recent years, an abundance of new molecular structures have been elucidated using cryo-electron microscopy (cryo-EM), largely due to advances in hardware technology and data processing techniques. Owing to these new exciting developments, cryo-EM was selected by Nature Methods as Method of the Year 2015, and the Nobel Prize in Chemistry 2017 was awarded to three pioneers in the field. The main goal of this article is to introduce the challenging and exciting computational tasks involved in reconstructing 3-D molecular structures by cryo-EM. Determining molecular structures requires a wide range of computational tools in a variety of fields, including signal processing, estimation and detection theory, high-dimensional statistics, convex and non-convex optimization, spectral algorithms, dimensionality reduction, and machine learning. The tools from these fields must be adapted to work under exceptionally challenging conditions, including extreme noise levels, the presence of missing data, and massively large datasets as large as several Terabytes. In addition, we present two statistical models: multi-reference alignment and multi-target detection, that abstract away much of the intricacies of cryo-EM, while retaining some of its essential features. Based on these abstractions, we discuss some recent intriguing results in the mathematical theory of cryo-EM, and delineate relations with group theory, invariant theory, and information theory.

Submitted 7 October, 2019; v1 submitted 1 August, 2019; originally announced August 2019.

arXiv:1907.01898 [pdf, other]

doi 10.1088/1361-6420/ab4f55

Cryo-EM reconstruction of continuous heterogeneity by Laplacian spectral volumes

Authors: Amit Moscovich, Amit Halevi, Joakim Andén, Amit Singer

Abstract: Single-particle electron cryomicroscopy is an essential tool for high-resolution 3D reconstruction of proteins and other biological macromolecules. An important challenge in cryo-EM is the reconstruction of non-rigid molecules with parts that move and deform. Traditional reconstruction methods fail in these cases, resulting in smeared reconstructions of the moving parts. This poses a major obstacle for structural biologists, who need high-resolution reconstructions of entire macromolecules, moving parts included. To address this challenge, we present a new method for the reconstruction of macromolecules exhibiting continuous heterogeneity. The proposed method uses projection images from multiple viewing directions to construct a graph Laplacian through which the manifold of three-dimensional conformations is analyzed. The 3D molecular structures are then expanded in a basis of Laplacian eigenvectors, using a novel generalized tomographic reconstruction algorithm to compute the expansion coefficients. These coefficients, which we name spectral volumes, provide a high-resolution visualization of the molecular dynamics. We provide a theoretical analysis and evaluate the method empirically on several simulated data sets.

Submitted 26 September, 2019; v1 submitted 1 July, 2019; originally announced July 2019.

Comments: 33 pages, 10 figures

MSC Class: 62P10; 65R32; 94A08

ACM Class: I.4.5; I.5.4; J.3

Journal ref: Inverse Problems, 36:2 (2020)

arXiv:1907.01589 [pdf, other]

doi 10.1088/1361-6420/ab5ede

Hyper-Molecules: on the Representation and Recovery of Dynamical Structures, with Application to Flexible Macro-Molecular Structures in Cryo-EM

Authors: Roy R. Lederman, Joakim Andén, Amit Singer

Abstract: Cryo-electron microscopy (cryo-EM), the subject of the 2017 Nobel Prize in Chemistry, is a technology for determining the 3-D structure of macromolecules from many noisy 2-D projections of instances of these macromolecules, whose orientations and positions are unknown. The molecular structures are not rigid objects, but flexible objects involved in dynamical processes. The different conformations are exhibited by different instances of the macromolecule observed in a cryo-EM experiment, each of which is recorded as a particle image. The range of conformations and the conformation of each particle are not known a priori; one of the great promises of cryo-EM is to map this conformation space. Remarkable progress has been made in determining rigid structures from homogeneous samples of molecules in spite of the unknown orientation of each particle image and significant progress has been made in recovering a few distinct states from mixtures of rather distinct conformations, but more complex heterogeneous samples remain a major challenge. We introduce the "hyper-molecule" framework for modeling structures across different states of heterogeneous molecules, including continuums of states. The key idea behind this framework is representing heterogeneous macromolecules as high-dimensional objects, with the additional dimensions representing the conformation space. This idea is then refined to model properties such as localized heterogeneity.
In addition, we introduce an algorithmic framework for recovering such maps of heterogeneous objects from experimental data using a Bayesian formulation of the problem and Markov chain Monte Carlo (MCMC) algorithms to address the computational challenges in recovering these high dimensional hyper-molecules. We demonstrate these ideas in a prototype applied to synthetic data.

Submitted 2 July, 2019; originally announced July 2019.

arXiv:1905.03176 [pdf, other]

doi 10.1109/TSP.2020.2975943

Multi-target Detection with an Arbitrary Spacing Distribution

Authors: Ti-Yen Lan, Tamir Bendory, Nicolas Boumal, Amit Singer

Abstract: Motivated by the structure reconstruction problem in single-particle cryo-electron microscopy, we consider the multi-target detection model, where multiple copies of a target signal occur at unknown locations in a long measurement, further corrupted by additive Gaussian noise. At low noise levels, one can easily detect the signal occurrences and estimate the signal by averaging. However, in the presence of high noise, which is the focus of this paper, detection is impossible. Here, we propose two approaches (autocorrelation analysis and an approximate expectation maximization algorithm) to reconstruct the signal without the need to detect signal occurrences in the measurement. In particular, our methods apply to an arbitrary spacing distribution of signal occurrences. We demonstrate reconstructions with synthetic data and empirically show that the sample complexity of both methods scales as 1/SNR^3 in the low SNR regime.

Submitted 22 January, 2020; v1 submitted 8 May, 2019; originally announced May 2019.

Comments: 13 pages, 8 figures

arXiv:1903.06022 [pdf, other]

doi 10.1088/1361-6420/ab2aec

Multi-target detection with application to cryo-electron microscopy

Authors: Tamir Bendory, Nicolas Boumal, William Leeb, Eitan Levin, Amit Singer

Abstract: We consider the multi-target detection problem of recovering a set of signals that appear multiple times at unknown locations in a noisy measurement. In the low noise regime, one can estimate the signals by first detecting occurrences, then clustering and averaging them. In the high noise regime however, neither detection nor clustering can be performed reliably, so that strategies along these lines are destined to fail. Notwithstanding, using autocorrelation analysis, we show that the impossibility to detect and cluster signal occurrences in the presence of high noise does not necessarily preclude signal estimation. Specifically, to estimate the signals, we derive simple relations between the autocorrelations of the observation and those of the signals. These autocorrelations can be estimated accurately at any noise level given a sufficiently long measurement. To recover the signals from the observed autocorrelations, we solve a set of polynomial equations through nonlinear least-squares. We provide analysis regarding well-posedness of the task, and demonstrate numerically the effectiveness of the method in a variety of settings. The main goal of this work is to provide theoretical and numerical support for a recently proposed framework to image 3-D structures of biological macromolecules using cryo-electron microscopy in extreme noise levels.

Submitted 3 June, 2019; v1 submitted 12 March, 2019; originally announced March 2019.

Comments: arXiv admin note: text overlap with arXiv:1810.00226

arXiv:1903.02094 [pdf, other]

doi 10.1109/ICASSP.2019.8682733

Acoustic Impulse Responses for Wearable Audio Devices

Authors: Ryan M. Corey, Naoki Tsuda, Andrew C. Singer

Abstract: We present an open-access dataset of over 8000 acoustic impulse responses from 160 microphones spread across the body and affixed to wearable accessories. The data can be used to evaluate audio capture and array processing systems using wearable devices such as hearing aids, headphones, eyeglasses, jewelry, and clothing. We analyze the acoustic transfer functions of different parts of the body, measure the effects of clothing worn over microphones, compare measurements from a live human subject to those from a mannequin, and simulate the noise-reduction performance of several beamformers. The results suggest that arrays of microphones spread across the body are more effective than those confined to a single device.

Submitted 5 March, 2019; originally announced March 2019.

Comments: To appear at ICASSP 2019

Journal ref: 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

arXiv:1812.08789 [pdf, other]

Steerable $e$PCA: Rotationally Invariant Exponential Family PCA

Authors: Zhizhen Zhao, Lydia T. Liu, Amit Singer

Abstract: In photon-limited imaging, the pixel intensities are affected by photon count noise. Many applications, such as 3-D reconstruction using correlation analysis in X-ray free electron laser (XFEL) single molecule imaging, require an accurate estimation of the covariance of the underlying 2-D clean images. Accurate estimation of the covariance from low-photon count images must take into account that pixel intensities are Poisson distributed, hence the classical sample covariance estimator is sub-optimal. Moreover, in single molecule imaging, including in-plane rotated copies of all images could further improve the accuracy of covariance estimation. In this paper we introduce an efficient and accurate algorithm for covariance matrix estimation of count noise 2-D images, including their uniform planar rotations and possibly reflections. Our procedure, steerable $e$PCA, combines in a novel way two recently introduced innovations. The first is a methodology for principal component analysis (PCA) for Poisson distributions, and more generally, exponential family distributions, called $e$PCA. The second is steerable PCA, a fast and accurate procedure for including all planar rotations for PCA. The resulting principal components are invariant to the rotation and reflection of the input images. We demonstrate the efficiency and accuracy of steerable $e$PCA in numerical experiments involving simulated XFEL datasets and rotated Yale B face data.

Submitted 17 December, 2019; v1 submitted 20 December, 2018; originally announced December 2018.

arXiv:1810.00226 [pdf, other]

Toward single particle reconstruction without particle picking: Breaking the detection limit

Authors: Tamir Bendory, Nicolas Boumal, William Leeb, Eitan Levin, Amit Singer

Abstract: Single-particle cryo-electron microscopy (cryo-EM) has recently joined X-ray crystallography and NMR spectroscopy as a high-resolution structural method to resolve biological macromolecules. In a cryo-EM experiment, the microscope produces images called micrographs. Projections of the molecule of interest are embedded in the micrographs at unknown locations, and under unknown viewing directions. Standard imaging techniques first locate these projections (detection) and then reconstruct the 3-D structure from them. Unfortunately, high noise levels hinder detection. When reliable detection is rendered impossible, the standard techniques fail. This is a problem, especially for small molecules. In this paper, we pursue a radically different approach: we contend that the structure could, in principle, be reconstructed directly from the micrographs, without intermediate detection. The aim is to bring small molecules within reach for cryo-EM. To this end, we design an autocorrelation analysis technique that allows one to go directly from the micrographs to the sought structures. This involves only one pass over the micrographs, allowing online, streaming processing for large experiments. We show numerical results and discuss challenges that lie ahead to turn this proof-of-concept into a complementary approach to state-of-the-art algorithms.

Submitted 27 October, 2022; v1 submitted 29 September, 2018; originally announced October 2018.

Comments: Older citations to this paper refer to version arXiv:1810.00226v1, parts of which now appear in: Tamir Bendory, Nicolas Boumal, William Leeb, Eitan Levin, and Amit Singer. "Multi-target detection with application to cryo-electron microscopy." Inverse Problems 35, no. 10 (2019): 104003

arXiv:1808.00096 [pdf, other]

doi 10.1109/IWAENC.2018.8521260

Speech Separation Using Partially Asynchronous Microphone Arrays Without Resampling

Authors: Ryan M. Corey, Andrew C. Singer

Abstract: We consider the problem of separating speech sources captured by multiple spatially separated devices, each of which has multiple microphones and samples its signals at a slightly different rate. Most asynchronous array processing methods rely on sample rate offset estimation and resampling, but these offsets can be difficult to estimate if the sources or microphones are moving. We propose a source separation method that does not require offset estimation or signal resampling. Instead, we divide the distributed array into several synchronous subarrays. All arrays are used jointly to estimate the time-varying signal statistics, and those statistics are used to design separate time-varying spatial filters in each array. We demonstrate the method for speech mixtures recorded on both stationary and moving microphone arrays.

Submitted 31 July, 2018; originally announced August 2018.

Comments: To appear at the International Workshop on Acoustic Signal Enhancement (IWAENC 2018)

Journal ref: 2018 16th International Workshop on Acoustic Signal Enhancement (IWAENC)

arXiv:1808.00082 [pdf, other]

doi 10.1109/IWAENC.2018.8521263

Delay-Performance Tradeoffs in Causal Microphone Array Processing

Authors: Ryan M. Corey, Naoki Tsuda, Andrew C. Singer

Abstract: In real-time listening enhancement applications, such as hearing aid signal processing, sounds must be processed with no more than a few milliseconds of delay to sound natural to the listener. Listening devices can achieve better performance with lower delay by using microphone arrays to filter acoustic signals in both space and time. Here, we analyze the tradeoff between delay and squared-error performance of causal multichannel Wiener filters for microphone array noise reduction. We compute exact expressions for the delay-error curves in two special cases and present experimental results from real-world microphone array recordings. We find that delay-performance characteristics are determined by both the spatial and temporal correlation structures of the signals.

Submitted 31 July, 2018; originally announced August 2018.

Comments: To appear at the International Workshop on Acoustic Signal Enhancement (IWAENC 2018)

Journal ref: 2018 16th International Workshop on Acoustic Signal Enhancement (IWAENC)

arXiv:1806.08968 [pdf, ps, other]

doi 10.1109/JSTSP.2018.2863189

A Modulo-Based Architecture for Analog-to-Digital Conversion

Authors: Or Ordentlich, Gizem Tabak, Pavan Kumar Hanumolu, Andrew C. Singer, Gregory W. Wornell

Abstract: Systems that capture and process analog signals must first acquire them through an analog-to-digital converter. While subsequent digital processing can remove statistical correlations present in the acquired data, the dynamic range of the converter is typically scaled to match that of the input analog signal. The present paper develops an approach for analog-to-digital conversion that aims at minimizing the number of bits per sample at the output of the converter. This is attained by reducing the dynamic range of the analog signal by performing a modulo operation on its amplitude, and then quantizing the result. While the converter itself is universal and agnostic of the statistics of the signal, the decoder operation on the output of the quantizer can exploit the statistical structure in order to unwrap the modulo folding. The performance of this method is shown to approach information theoretical limits, as captured by the rate-distortion function, in various settings. An architecture for modulo analog-to-digital conversion via ring oscillators is suggested, and its merits are numerically demonstrated.

Submitted 23 June, 2018; originally announced June 2018.
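
The fold-then-quantize idea in the abstract above can be illustrated with a small sketch. This is not the paper's decoder: the function names (`fold`, `modulo_adc`, `unwrap`) are hypothetical, and the unwrapping step assumes a slowly varying signal whose first sample already lies inside the folding interval, so that each sample can be predicted from the previous one. The paper's decoder exploits more general statistical structure.

```python
import numpy as np

def fold(x, delta):
    """Centered modulo: map x into the interval [-delta/2, delta/2)."""
    return (x + delta / 2) % delta - delta / 2

def modulo_adc(x, delta, bits):
    """Fold the amplitude, then apply a uniform quantizer with 2**bits levels."""
    step = delta / 2**bits
    folded = fold(x, delta)
    # Mid-rise quantizer: each cell of width `step` maps to its midpoint.
    return np.floor(folded / step) * step + step / 2

def unwrap(y, delta):
    """Undo the folding, assuming consecutive samples differ by less than
    delta/2 (an illustrative smoothness assumption, not the paper's model)."""
    out = np.empty_like(y)
    out[0] = y[0]  # assumes the first sample was not folded
    for n in range(1, len(y)):
        # The true increment is small, so folding the observed increment
        # recovers it without the unknown multiple of delta.
        out[n] = out[n - 1] + fold(y[n] - out[n - 1], delta)
    return out

if __name__ == "__main__":
    t = np.arange(200) / 200
    x = 3.0 * np.sin(2 * np.pi * t)      # dynamic range far larger than delta
    y = modulo_adc(x, delta=1.0, bits=4)
    x_hat = unwrap(y, delta=1.0)
    print(np.max(np.abs(x_hat - x)))     # bounded by half a quantizer step
```

The point of the sketch is that the quantizer only has to cover an interval of width `delta`, even though the input spans several times that range.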

arXiv:1806.01357 [pdf, other]

Adversarial Domain Adaptation for Classification of Prostate Histopathology Whole-Slide Images

Authors: Jian Ren, Ilker Hacihaliloglu, Eric A. Singer, David J. Foran, Xin Qi

Abstract: Automatic and accurate Gleason grading of histopathology tissue slides is crucial for prostate cancer diagnosis, treatment, and prognosis. Usually, histopathology tissue slides from different institutions show heterogeneous appearances because of different tissue preparation and staining procedures, thus the predictable model learned from one domain may not be applicable to a new domain directly. Here we propose to adopt unsupervised domain adaptation to transfer the discriminative knowledge obtained from the source domain to the target domain without requiring labeling of images at the target domain. The adaptation is achieved through adversarial training to find an invariant feature space along with the proposed Siamese architecture on the target domain to add a regularization that is appropriate for the whole-slide images. We validate the method on two prostate cancer datasets and obtain significant classification improvement of Gleason scores as compared with the baseline models.

Submitted 6 June, 2018; v1 submitted 4 June, 2018; originally announced June 2018.

Comments: Accepted to MICCAI 2018

arXiv:1802.00469 [pdf, other]

APPLE Picker: Automatic Particle Picking, a Low-Effort Cryo-EM Framework

Authors: Ayelet Heimowitz, Joakim Andén, Amit Singer

Abstract: Particle picking is a crucial first step in the computational pipeline of single-particle cryo-electron microscopy (cryo-EM). Selecting particles from the micrographs is difficult especially for small particles with low contrast. As high-resolution reconstruction typically requires hundreds of thousands of particles, manually picking that many particles is often too time-consuming. While semi-automated particle picking is currently a popular approach, it may suffer from introducing manual bias into the selection process. In addition, semi-automated particle picking is still somewhat time-consuming. This paper presents the APPLE (Automatic Particle Picking with Low user Effort) picker, a simple and novel approach for fast, accurate, and fully automatic particle picking. While our approach was inspired by template matching, it is completely template-free. This approach is evaluated on publicly available datasets containing micrographs of $β$-galactosidase and keyhole limpet hemocyanin projections.

Submitted 14 June, 2018; v1 submitted 1 February, 2018; originally announced February 2018.

Comments: 18 pages, 14 figures

arXiv:1801.04366 [pdf, other]

Estimation in the group action channel

Authors: Emmanuel Abbe, João M. Pereira, Amit Singer

Abstract: We analyze the problem of estimating a signal from multiple measurements on a $\mbox{group action channel}$ that linearly transforms a signal by a random group action followed by a fixed projection and additive Gaussian noise. This channel is motivated by applications such as multi-reference alignment and cryo-electron microscopy. We focus on the large noise regime prevalent in these applications. We give a lower bound on the mean square error (MSE) of any asymptotically unbiased estimator of the signal's orbit in terms of the signal's moment tensors, which implies that the MSE is bounded away from 0 when $N/σ^{2d}$ is bounded from above, where $N$ is the number of observations, $σ$ is the noise standard deviation, and $d$ is the so-called $\mbox{moment order cutoff}$. In contrast, the maximum likelihood estimator is shown to be consistent if $N /σ^{2d}$ diverges.

Submitted 12 January, 2018; originally announced January 2018.

Comments: 5 pages, conference

MSC Class: 94A15; 62B10

arXiv:1710.02793 [pdf, other]

Multireference Alignment is Easier with an Aperiodic Translation Distribution

Authors: Emmanuel Abbe, Tamir Bendory, William Leeb, João Pereira, Nir Sharon, Amit Singer

Abstract: In the multireference alignment model, a signal is observed by the action of a random circular translation and the addition of Gaussian noise. The goal is to recover the signal's orbit by accessing multiple independent observations. Of particular interest is the sample complexity, i.e., the number of observations/samples needed in terms of the signal-to-noise ratio (the signal energy divided by the noise variance) in order to drive the mean-square error (MSE) to zero. Previous work showed that if the translations are drawn from the uniform distribution, then, in the low SNR regime, the sample complexity of the problem scales as $ω(1/\text{SNR}^3)$. In this work, using a generalization of the Chapman--Robbins bound for orbits and expansions of the $χ^2$ divergence at low SNR, we show that in the same regime the sample complexity for any aperiodic translation distribution scales as $ω(1/\text{SNR}^2)$. This rate is achieved by a simple spectral algorithm. We propose two additional algorithms based on non-convex optimization and expectation-maximization. We also draw a connection between the multireference alignment problem and the spiked covariance model.

Submitted 3 November, 2018; v1 submitted 8 October, 2017; originally announced October 2017.

arXiv:1710.02590 [pdf, other]

Heterogeneous multireference alignment: a single pass approach

Authors: Nicolas Boumal, Tamir Bendory, Roy R. Lederman, Amit Singer

Abstract: Multireference alignment (MRA) is the problem of estimating a signal from many noisy and cyclically shifted copies of itself. In this paper, we consider an extension called heterogeneous MRA, where $K$ signals must be estimated, and each observation comes from one of those signals, unknown to us. This is a simplified model for the heterogeneity problem notably arising in cryo-electron microscopy. We propose an algorithm which estimates the $K$ signals without estimating either the shifts or the classes of the observations. It requires only one pass over the data and is based on low-order moments that are invariant under cyclic shifts. Given sufficiently many measurements, one can estimate these invariant features averaged over the $K$ signals. We then design a smooth, non-convex optimization problem to compute a set of signals which are consistent with the estimated averaged features. We find that, in many cases, the proposed approach estimates the set of signals accurately despite non-convexity, and conjecture the number of signals $K$ that can be resolved as a function of the signal length $L$ is on the order of $\sqrt{L}$.

Submitted 31 January, 2018; v1 submitted 6 October, 2017; originally announced October 2017.

Comments: 6 pages, 3 figures

arXiv:1707.00943 [pdf, other]

The sample complexity of multi-reference alignment

Authors: Amelia Perry, Jonathan Weed, Afonso S. Bandeira, Philippe Rigollet, Amit Singer

Abstract: The growing role of data-driven approaches to scientific discovery has unveiled a large class of models that involve latent transformations with a rigid algebraic constraint. Three-dimensional molecule reconstruction in Cryo-Electron Microscopy (cryo-EM) is a central problem in this class. Despite decades of algorithmic and software development, there is still little theoretical understanding of the sample complexity of this problem, that is, number of images required for 3-D reconstruction. Here we consider multi-reference alignment (MRA), a simple model that captures fundamental aspects of the statistical and algorithmic challenges arising in cryo-EM and related problems. In MRA, an unknown signal is subject to two types of corruption: a latent cyclic shift and the more traditional additive white noise. The goal is to recover the signal at a certain precision from independent samples. While at high signal-to-noise ratio (SNR), the number of observations needed to recover a generic signal is proportional to $1/\mathrm{SNR}$, we prove that it rises to a surprising $1/\mathrm{SNR}^3$ in the low SNR regime. This precise phenomenon was observed empirically more than twenty years ago for cryo-EM but has remained unexplained to date. Furthermore, our techniques can easily be extended to the heterogeneous MRA model where the samples come from a mixture of signals, as is often the case in applications such as cryo-EM, where molecules may have different conformations. This provides a first step towards a statistical theory for heterogeneous cryo-EM.

Submitted 3 June, 2019; v1 submitted 4 July, 2017; originally announced July 2017.

Comments: To appear in SIAM Journal on Mathematics of Data Science

MSC Class: 62B10; 92C55

arXiv:1705.07779 [pdf, other]

Cost-Performance Tradeoffs in Fusing Unreliable Computational Units

Authors: Mehmet A. Donmez, Maxim Raginsky, Andrew C. Singer, Lav R. Varshney

Abstract: We investigate fusing several unreliable computational units that perform the same task. We model an unreliable computational outcome as an additive perturbation to its error-free result in terms of its fidelity and cost. We analyze performance of repetition-based strategies that distribute cost across several unreliable units and fuse their outcomes. When the cost is a convex function of fidelity, the optimal repetition-based strategy in terms of incurred cost while achieving a target mean-square error (MSE) performance may fuse several computational units. For concave and linear costs, a single more reliable unit incurs lower cost compared to fusion of several lower cost and less reliable units while achieving the same MSE performance. We show how our results give insight into problems from theoretical neuroscience, circuits, and crowdsourcing.

Submitted 22 May, 2017; originally announced May 2017.

arXiv:1705.07070 [pdf, other]

EE-Grad: Exploration and Exploitation for Cost-Efficient Mini-Batch SGD

Authors: Mehmet A. Donmez, Maxim Raginsky, Andrew C. Singer

Abstract: We present a generic framework for trading off fidelity and cost in computing stochastic gradients when the costs of acquiring stochastic gradients of different quality are not known a priori. We consider a mini-batch oracle that distributes a limited query budget over a number of stochastic gradients and aggregates them to estimate the true gradient. Since the optimal mini-batch size depends on the unknown cost-fidelity function, we propose an algorithm, EE-Grad, that sequentially explores the performance of mini-batch oracles and exploits the accumulated knowledge to estimate the one achieving the best performance in terms of cost-efficiency. We provide performance guarantees for EE-Grad with respect to the optimal mini-batch oracle, and illustrate these results in the case of strongly convex objectives. We also provide a simple numerical example that corroborates our theoretical findings.

Submitted 19 May, 2017; originally announced May 2017.

arXiv:1705.00641 [pdf, other]

doi 10.1109/TSP.2017.2775591

Bispectrum Inversion with Application to Multireference Alignment

Authors: Tamir Bendory, Nicolas Boumal, Chao Ma, Zhizhen Zhao, Amit Singer

Abstract: We consider the problem of estimating a signal from noisy circularly-translated versions of itself, called multireference alignment (MRA). One natural approach to MRA could be to estimate the shifts of the observations first, and infer the signal by aligning and averaging the data. In contrast, we consider a method based on estimating the signal directly, using features of the signal that are invariant under translations. Specifically, we estimate the power spectrum and the bispectrum of the signal from the observations. Under mild assumptions, these invariant features contain enough information to infer the signal. In particular, the bispectrum can be used to estimate the Fourier phases. To this end, we propose and analyze a few algorithms. Our main methods consist of non-convex optimization over the smooth manifold of phases. Empirically, in the absence of noise, these non-convex algorithms appear to converge to the target signal with random initialization. The algorithms are also robust to noise. We then suggest three additional methods. These methods are based on frequency marching, semidefinite relaxation and integer programming. The first two methods provably recover the phases exactly in the absence of noise. In the high noise level regime, the invariant features approach for MRA results in stable estimation if the number of measurements scales like the cube of the noise variance, which is the information-theoretic rate. Additionally, it requires only one pass over the data which is important at low signal-to-noise ratio when the number of observations must be large.

Submitted 6 October, 2017; v1 submitted 1 May, 2017; originally announced May 2017.
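
The translation invariance that the abstract above relies on can be checked numerically. The sketch below is illustrative only: the function names are hypothetical, and the bispectrum indexing convention, B[k1, k2] = X[k1] · conj(X[k2]) · X[(k2 − k1) mod n], is one common choice that may differ from the paper's. It shows that a circular shift changes the Fourier coefficients but leaves the power spectrum and bispectrum unchanged.

```python
import numpy as np

def power_spectrum(x):
    """|X[k]|^2 for the DFT X of x; unaffected by circular shifts."""
    X = np.fft.fft(x)
    return np.abs(X) ** 2

def bispectrum(x):
    """B[k1, k2] = X[k1] * conj(X[k2]) * X[(k2 - k1) mod n].

    A circular shift by s multiplies X[k] by exp(-2*pi*i*k*s/n); the three
    phase factors cancel in the product, so B does not depend on s.
    """
    X = np.fft.fft(x)
    n = len(x)
    k = np.arange(n)
    diff = (k[None, :] - k[:, None]) % n   # diff[k1, k2] = (k2 - k1) mod n
    return X[:, None] * np.conj(X)[None, :] * X[diff]

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.standard_normal(16)
    shifted = np.roll(x, 5)                # circular translation by 5 samples
    print(np.allclose(power_spectrum(x), power_spectrum(shifted)))
    print(np.allclose(bispectrum(x), bispectrum(shifted)))
```

Because both features are shift-invariant, they can be averaged over many noisy observations without first estimating the shifts, which is what allows a single pass over the data.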

arXiv:1704.07969 [pdf, other]

Anisotropic twicing for single particle reconstruction using autocorrelation analysis

Authors: Tejal Bhamre, Teng Zhang, Amit Singer

Abstract: The missing phase problem in X-ray crystallography is commonly solved using the technique of molecular replacement, which borrows phases from a previously solved homologous structure, and appends them to the measured Fourier magnitudes of the diffraction patterns of the unknown structure. More recently, molecular replacement has been proposed for solving the missing orthogonal matrices problem arising in Kam's autocorrelation analysis for single particle reconstruction using X-ray free electron lasers and cryo-EM. In classical molecular replacement, it is common to estimate the magnitudes of the unknown structure as twice the measured magnitudes minus the magnitudes of the homologous structure, a procedure known as `twicing'. Mathematically, this is equivalent to finding an unbiased estimator for a complex-valued scalar. We generalize this scheme for the case of estimating real or complex valued matrices arising in single particle autocorrelation analysis. We name this approach "Anisotropic Twicing" because unlike the scalar case, the unbiased estimator is not obtained by a simple magnitude isotropic correction. We compare the performance of the least squares, twicing and anisotropic twicing estimators on synthetic and experimental datasets. We demonstrate 3D homology modeling in cryo-EM directly from experimental data without iterative refinement or class averaging, for the first time.

Submitted 26 April, 2017; originally announced April 2017.
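
The classical twicing rule stated in the abstract above can be written as a single formula. The notation here is an assumption introduced for illustration and is not taken from the paper: $|F_{\mathrm{obs}}|$ is the measured Fourier magnitude, $|F_{\mathrm{hom}}|$ is the magnitude of the homologous structure, and $|\hat{F}|$ is the resulting estimate for the unknown structure.

```latex
\[
  |\hat{F}| \;=\; 2\,|F_{\mathrm{obs}}| \;-\; |F_{\mathrm{hom}}|
\]
```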

arXiv:1704.02899 [pdf, ps, other]

Continuously heterogeneous hyper-objects in cryo-EM and 3-D movies of many temporal dimensions

Authors: Roy R. Lederman, Amit Singer

Abstract: Single particle cryo-electron microscopy (EM) is an increasingly popular method for determining the 3-D structure of macromolecules from noisy 2-D images of single macromolecules whose orientations and positions are random and unknown. One of the great opportunities in cryo-EM is to recover the structure of macromolecules in heterogeneous samples, where multiple types or multiple conformations are mixed together. Indeed, in recent years, many tools have been introduced for the analysis of multiple discrete classes of molecules mixed together in a cryo-EM experiment. However, many interesting structures have a continuum of conformations which do not fit discrete models nicely; the analysis of such continuously heterogeneous models has remained a more elusive goal. In this manuscript, we propose to represent heterogeneous molecules and similar structures as higher dimensional objects. We generalize the basic operations used in many existing reconstruction algorithms, making our approach generic in the sense that, in principle, existing algorithms can be adapted to reconstruct those higher dimensional objects. As proof of concept, we present a prototype of a new algorithm which we use to solve simulated reconstruction problems.

Submitted 10 April, 2017; originally announced April 2017.

arXiv:1702.03023 [pdf, other]

A New Rank Constraint on Multi-view Fundamental Matrices, and its Application to Camera Location Recovery

Authors: Soumyadip Sengupta, Tal Amir, Meirav Galun, Tom Goldstein, David W. Jacobs, Amit Singer, Ronen Basri

Abstract: Accurate estimation of camera matrices is an important step in structure from motion algorithms. In this paper we introduce a novel rank constraint on collections of fundamental matrices in multi-view settings. We show that in general, with the selection of proper scale factors, a matrix formed by stacking fundamental matrices between pairs of images has rank 6. Moreover, this matrix forms the symmetric part of a rank 3 matrix whose factors relate directly to the corresponding camera matrices. We use this new characterization to produce better estimations of fundamental matrices by optimizing an L1-cost function using Iterative Re-weighted Least Squares and Alternate Direction Method of Multiplier. We further show that this procedure can improve the recovery of camera locations, particularly in multi-view settings in which fewer images are available.

Submitted 9 February, 2017; originally announced February 2017.

Showing 1–50 of 84 results for author: Singer, A