Search | arXiv e-print repository

Nonparametric Estimation via Variance-Reduced Sketching

Authors: Yuehaw Khoo, Yifan Peng, Daren Wang

Abstract: Nonparametric models are of great interest in various scientific and engineering disciplines. Classical kernel methods, while numerically robust and statistically sound in low-dimensional settings, become inadequate in higher-dimensional settings due to the curse of dimensionality. In this paper, we introduce a new framework called Variance-Reduced Sketching (VRS), specifically designed to estimate density functions and nonparametric regression functions in higher dimensions with a reduced curse of dimensionality. Our framework conceptualizes multivariable functions as infinite-size matrices, and facilitates a new sketching technique motivated by numerical linear algebra literature to reduce the variance in estimation problems. We demonstrate the robust numerical performance of VRS through a series of simulated experiments and real-world data applications. Notably, VRS shows remarkable improvement over existing neural network estimators and classical kernel methods in numerous density estimation and nonparametric regression models. Additionally, we offer theoretical justifications for VRS to support its ability to deliver nonparametric estimation with a reduced curse of dimensionality.

Submitted 21 January, 2024; originally announced January 2024.

Comments: 64 pages, 8 figures

arXiv:2312.12641 [pdf, other]

Robust Point Matching with Distance Profiles

Authors: YoonHaeng Hur, Yuehaw Khoo

Abstract: While matching procedures based on pairwise distances are conceptually appealing and thus favored in practice, theoretical guarantees for such procedures are rarely found in the literature. We propose and analyze matching procedures based on distance profiles that are easily implementable in practice, showing these procedures are robust to outliers and noise. We demonstrate the performance of the proposed method using a real data example and provide simulation studies to complement the theoretical findings.

Submitted 15 May, 2024; v1 submitted 19 December, 2023; originally announced December 2023.

arXiv:2312.04982 [pdf, other]

Boosting Prompt-Based Self-Training With Mapping-Free Automatic Verbalizer for Multi-Class Classification

Authors: Yookyung Kho, Jaehee Kim, Pilsung Kang

Abstract: Recently, prompt-based fine-tuning has garnered considerable interest as a core technique for few-shot text classification tasks. This approach reformulates the fine-tuning objective to align with the Masked Language Modeling (MLM) objective. Leveraging unlabeled data, prompt-based self-training has shown greater effectiveness in binary and three-class classification. However, prompt-based self-training for multi-class classification has not been adequately investigated, despite its significant applicability to real-world scenarios. Moreover, extending current methods to multi-class classification suffers from the verbalizer that extracts the predicted value of a manually pre-defined single label word for each class from MLM predictions. Consequently, we introduce a novel, efficient verbalizer structure, named Mapping-free Automatic Verbalizer (MAV). Comprising two fully connected layers, MAV serves as a trainable verbalizer that automatically extracts the requisite word features for classification by capitalizing on all available information from MLM predictions. Experimental results on five multi-class classification datasets indicate MAV's superior self-training efficacy.

Submitted 8 December, 2023; originally announced December 2023.

Comments: EMNLP 2023 findings

arXiv:2309.05253 [pdf, other]

A quantum tug of war between randomness and symmetries on homogeneous spaces

Authors: Rahul Arvind, Kishor Bharti, Jun Yong Khoo, Dax Enshan Koh, Jian Feng Kong

Abstract: We explore the interplay between symmetry and randomness in quantum information. Adopting a geometric approach, we consider states as $H$-equivalent if related by a symmetry transformation characterized by the group $H$. We then introduce the Haar measure on the homogeneous space $\mathbb{U}/H$, characterizing true randomness for $H$-equivalent systems. While this mathematical machinery is well-studied by mathematicians, it has seen limited application in quantum information: we believe our work to be the first instance of utilizing homogeneous spaces to characterize symmetry in quantum information. This is followed by a discussion of approximations of true randomness, commencing with $t$-wise independent approximations and defining $t$-designs on $\mathbb{U}/H$ and $H$-equivalent states. Transitioning further, we explore pseudorandomness, defining pseudorandom unitaries and states within homogeneous spaces. Finally, as a practical demonstration of our findings, we study the expressibility of quantum machine learning ansatze in homogeneous spaces. Our work provides a fresh perspective on the relationship between randomness and symmetry in the quantum world.

Submitted 11 September, 2023; originally announced September 2023.

Comments: 9 + 1 pages, 3 figures

arXiv:2306.02043 [pdf, other]

Painsight: An Extendable Opinion Mining Framework for Detecting Pain Points Based on Online Customer Reviews

Authors: Yukyung Lee, Jaehee Kim, Doyoon Kim, Yookyung Kho, Younsun Kim, Pilsung Kang

Abstract: As the e-commerce market continues to expand and online transactions proliferate, customer reviews have emerged as a critical element in shaping the purchasing decisions of prospective buyers. Previous studies have endeavored to identify key aspects of customer reviews through the development of sentiment analysis models and topic models. However, extracting specific dissatisfaction factors remains a challenging task. In this study, we delineate the pain point detection problem and propose Painsight, an unsupervised framework for automatically extracting distinct dissatisfaction factors from customer reviews without relying on ground truth labels. Painsight employs pre-trained language models to construct sentiment analysis and topic models, leveraging attribution scores derived from model gradients to extract dissatisfaction factors. Upon application of the proposed methodology to customer review data spanning five product categories, we successfully identified and categorized dissatisfaction factors within each group, as well as isolated factors for each type. Notably, Painsight outperformed benchmark methods, achieving substantial performance enhancements and exceptional results in human evaluations.

Submitted 3 June, 2023; originally announced June 2023.

Comments: WASSA at ACL 2023

arXiv:2305.02460 [pdf, other]

Tensorizing flows: a tool for variational inference

Authors: Yuehaw Khoo, Michael Lindsey, Hongli Zhao

Abstract: Fueled by the expressive power of deep neural networks, normalizing flows have achieved spectacular success in generative modeling, or learning to draw new samples from a distribution given a finite dataset of training samples. Normalizing flows have also been applied successfully to variational inference, wherein one attempts to learn a sampler based on an expression for the log-likelihood or energy function of the distribution, rather than on data. In variational inference, the unimodality of the reference Gaussian distribution used within the normalizing flow can cause difficulties in learning multimodal distributions. We introduce an extension of normalizing flows in which the Gaussian reference is replaced with a reference distribution that is constructed via a tensor network, specifically a matrix product state or tensor train. We show that by combining flows with tensor networks on difficult variational inference tasks, we can improve on the results obtained by using either tool without the other.

Submitted 3 May, 2023; originally announced May 2023.

Comments: 24 pages, 16 figures. Authors listed alphabetically

arXiv:2304.14604 [pdf, other]

doi 10.1016/j.cam.2024.115782

Deep Neural-network Prior for Orbit Recovery from Method of Moments

Authors: Yuehaw Khoo, Sounak Paul, Nir Sharon

Abstract: Orbit recovery problems are a class of problems that often arise in practice and in various forms. In these problems, we aim to estimate an unknown function after it has been distorted by a group action and observed via a known operator. Typically, the observations are contaminated with a non-trivial level of noise. Two particular orbit recovery problems of interest in this paper are multireference alignment and single-particle cryo-EM modelling. In order to suppress the noise, we suggest using the method of moments approach for both problems while introducing deep neural network priors. In particular, our neural networks should output the signals and the distribution of group elements, with moments being the input. In the multireference alignment case, we demonstrate the advantage of using the NN to accelerate the convergence for the reconstruction of signals from the moments. Finally, we use our method to reconstruct simulated and biological volumes in the cryo-EM setting.

Submitted 30 January, 2024; v1 submitted 27 April, 2023; originally announced April 2023.

Journal ref: J. Comput. Appl. Math. 115782 (2024)

arXiv:2304.05305 [pdf, other]

Generative Modeling via Hierarchical Tensor Sketching

Authors: Yifan Peng, Yian Chen, E. Miles Stoudenmire, Yuehaw Khoo

Abstract: We propose a hierarchical tensor-network approach for approximating high-dimensional probability density via empirical distribution. This leverages randomized singular value decomposition (SVD) techniques and involves solving linear equations for tensor cores in this tensor network. The complexity of the resulting algorithm scales linearly in the dimension of the high-dimensional density. An analysis of estimation error demonstrates the effectiveness of this method through several numerical experiments.

Submitted 11 April, 2023; originally announced April 2023.

MSC Class: 15A69; 62Gxx

arXiv:2212.00759 [pdf, other]

High-dimensional density estimation with tensorizing flow

Authors: Yinuo Ren, Hongli Zhao, Yuehaw Khoo, Lexing Ying

Abstract: We propose the tensorizing flow method for estimating high-dimensional probability density functions from the observed data. The method is based on tensor-train and flow-based generative modeling. Our method first efficiently constructs an approximate density in the tensor-train form via solving the tensor cores from a linear system based on the kernel density estimators of low-dimensional marginals. We then train a continuous-time flow model from this tensor-train density to the observed empirical distribution by performing a maximum likelihood estimation. The proposed method combines the optimization-less feature of the tensor-train with the flexibility of the flow-based generative models. Numerical results are included to demonstrate the performance of the proposed method.

Submitted 1 December, 2022; originally announced December 2022.

arXiv:2210.10654 [pdf, other]

POGD: Gradient Descent with New Stochastic Rules

Authors: Feihu Han, Sida Xing, Sui Yang Khoo

Abstract: We introduce Particle Optimized Gradient Descent (POGD), an algorithm that is based on gradient descent but integrates the particle swarm optimization (PSO) principle into the iteration. The experiments indicate that this algorithm has adaptive learning ability. They focus mainly on the training speed needed to reach the target value and on the ability to avoid local minima, and are carried out using convolutional neural network (CNN) image classification on the MNIST and CIFAR-10 datasets.

Submitted 15 October, 2022; originally announced October 2022.

arXiv:2209.10531 [pdf, other]

doi 10.1073/pnas.2216507120

Autocorrelation analysis for cryo-EM with sparsity constraints: Improved sample complexity and projection-based algorithms

Authors: Tamir Bendory, Yuehaw Khoo, Joe Kileel, Oscar Mickelin, Amit Singer

Abstract: The number of noisy images required for molecular reconstruction in single-particle cryo-electron microscopy (cryo-EM) is governed by the autocorrelations of the observed, randomly-oriented, noisy projection images. In this work, we consider the effect of imposing sparsity priors on the molecule. We use techniques from signal processing, optimization, and applied algebraic geometry to obtain new theoretical and computational contributions for this challenging non-linear inverse problem with sparsity constraints. We prove that molecular structures modeled as sums of Gaussians are uniquely determined by the second-order autocorrelation of their projection images, implying that the sample complexity is proportional to the square of the variance of the noise. This theory improves upon the non-sparse case, where the third-order autocorrelation is required for uniformly-oriented particle images and the sample complexity scales with the cube of the noise variance. Furthermore, we build a computational framework to reconstruct molecular structures which are sparse in the wavelet basis. This method combines the sparse representation for the molecule with projection-based techniques used for phase retrieval in X-ray crystallography.

Submitted 1 May, 2023; v1 submitted 21 September, 2022; originally announced September 2022.

Comments: 31 pages, 5 figures, 1 movie

Journal ref: Proceedings of the National Academy of Sciences 120.18 (2023): e2216507120

arXiv:2209.01341 [pdf, other]

Generative Modeling via Tree Tensor Network States

Authors: Xun Tang, Yoonhaeng Hur, Yuehaw Khoo, Lexing Ying

Abstract: In this paper, we present a density estimation framework based on tree tensor-network states. The proposed method consists of determining the tree topology with the Chow-Liu algorithm, and obtaining a linear system of equations that defines the tensor-network components via sketching techniques. Novel choices of sketch functions are developed in order to consider graphical models that contain loops. Sample complexity guarantees are provided and further corroborated by numerical experiments.

Submitted 3 September, 2022; originally announced September 2022.

MSC Class: 62-08; 60-08; 15A69

arXiv:2206.04186 [pdf, other]

Reinforced Inverse Scattering

Authors: Hanyang Jiang, Yuehaw Khoo, Haizhao Yang

Abstract: Inverse wave scattering aims at determining the properties of an object using data on how the object scatters incoming waves. In order to collect information, sensors are put in different locations to send and receive waves from each other. The choice of sensor positions and incident wave frequencies determines the reconstruction quality of scatterer properties. This paper introduces reinforcement learning to develop precision imaging that decides sensor positions and wave frequencies adaptive to different scatterers in an intelligent way, thus obtaining a significant improvement in reconstruction quality with limited imaging resources. Extensive numerical results will be provided to demonstrate the superiority of the proposed method over existing methods.

Submitted 2 November, 2022; v1 submitted 8 June, 2022; originally announced June 2022.

MSC Class: 68Txx; 49MXX; 65N21

arXiv:2202.11788 [pdf, other]

Generative modeling via tensor train sketching

Authors: YH. Hur, J. G. Hoskins, M. Lindsey, E. M. Stoudenmire, Y. Khoo

Abstract: In this paper, we introduce a sketching algorithm for constructing a tensor train representation of a probability density from its samples. Our method deviates from the standard recursive SVD-based procedure for constructing a tensor train. Instead, we formulate and solve a sequence of small linear systems for the individual tensor train cores. This approach can avoid the curse of dimensionality that threatens both the algorithmic and sample complexities of the recovery problem. Specifically, for Markov models under natural conditions, we prove that the tensor cores can be recovered with a sample complexity that scales logarithmically in the dimensionality. Finally, we illustrate the performance of the method with several numerical experiments.

Submitted 23 June, 2023; v1 submitted 23 February, 2022; originally announced February 2022.

MSC Class: 15A69; 62Gxx

arXiv:2112.13199 [pdf, other]

A Spectral Method for Joint Community Detection and Orthogonal Group Synchronization

Authors: Yifeng Fan, Yuehaw Khoo, Zhizhen Zhao

Abstract: Community detection and orthogonal group synchronization are both fundamental problems with a variety of important applications in science and engineering. In this work, we consider the joint problem of community detection and orthogonal group synchronization which aims to recover the communities and perform synchronization simultaneously. To this end, we propose a simple algorithm that consists of a spectral decomposition step followed by a blockwise column pivoted QR factorization (CPQR). The proposed algorithm is efficient and scales linearly with the number of edges in the graph. We also leverage the recently developed 'leave-one-out' technique to establish a near-optimal guarantee for exact recovery of the cluster memberships and stable recovery of the orthogonal transforms. Numerical experiments demonstrate the efficiency and efficacy of our algorithm and confirm our theoretical characterization of it.

Submitted 15 September, 2022; v1 submitted 25 December, 2021; originally announced December 2021.

arXiv:2108.00700 [pdf]

Piecewise Linear Units Improve Deep Neural Networks

Authors: Jordan Inturrisi, Sui Yang Khoo, Abbas Kouzani, Riccardo Pagliarella

Abstract: The activation function is at the heart of a deep neural network's nonlinearity; the choice of the function has great impact on the success of training. Currently, many practitioners prefer the Rectified Linear Unit (ReLU) due to its simplicity and reliability, despite its few drawbacks. While most previous functions proposed to supplant ReLU have been hand-designed, recent work on learning the function during training has shown promising results. In this paper we propose an adaptive piecewise linear activation function, the Piecewise Linear Unit (PiLU), which can be learned independently for each dimension of the neural network. We demonstrate how PiLU is a generalised rectifier unit and note its similarities with the Adaptive Piecewise Linear Units, namely adaptive and piecewise linear. Across a distribution of 30 experiments, we show that for the same model architecture, hyperparameters, and pre-processing, PiLU significantly outperforms ReLU: reducing classification error by 18.53% on CIFAR-10 and 13.13% on CIFAR-100, for a minor increase in the number of neurons. Further work should be dedicated to exploring generalised piecewise linear units, as well as verifying these results across other challenging domains and larger problems.

Submitted 22 August, 2021; v1 submitted 2 August, 2021; originally announced August 2021.

Comments: 13 pages, 6 figures, 5 tables, replaced some figures and wording

arXiv:2105.06031 [pdf, other]

Joint Community Detection and Rotational Synchronization via Semidefinite Programming

Authors: Yifeng Fan, Yuehaw Khoo, Zhizhen Zhao

Abstract: In the presence of heterogeneous data, where randomly rotated objects fall into multiple underlying categories, it is challenging to simultaneously classify them into clusters and synchronize them based on pairwise relations. This gives rise to the joint problem of community detection and synchronization. We propose a series of semidefinite relaxations, and prove their exact recovery when extending the celebrated stochastic block model to this new setting where both rotations and cluster identities are to be determined. Numerical experiments demonstrate the efficacy of our proposed algorithms and confirm our theoretical result which indicates a sharp phase transition for exact recovery.

Submitted 14 September, 2023; v1 submitted 12 May, 2021; originally announced May 2021.

arXiv:2012.06727 [pdf, other]

A semigroup method for high dimensional committor functions based on neural network

Authors: Haoya Li, Yuehaw Khoo, Yinuo Ren, Lexing Ying

Abstract: This paper proposes a new method based on neural networks for computing the high-dimensional committor functions that satisfy Fokker-Planck equations. Instead of working with partial differential equations, the new method works with an integral formulation based on the semigroup of the differential operator. The variational form of the new formulation is then solved by parameterizing the committor function as a neural network. There are two major benefits of this new approach. First, stochastic gradient descent type algorithms can be applied in the training of the committor function without the need of computing any mixed second-order derivatives. Moreover, unlike the previous methods that enforce the boundary conditions through penalty terms, the new method takes into account the boundary conditions automatically. Numerical results are provided to demonstrate the performance of the proposed method.

Submitted 5 May, 2021; v1 submitted 12 December, 2020; originally announced December 2020.

Comments: 21 pages, 14 figures; Final version accepted at MSML 2021

arXiv:2008.03641 [pdf, other]

NMR Assignment through Linear Programming

Authors: Jose F. S. Bravo-Ferreira, David Cowburn, Yuehaw Khoo, Amit Singer

Abstract: Nuclear Magnetic Resonance (NMR) Spectroscopy is the second most used technique (after X-ray crystallography) for structural determination of proteins. A computational challenge in this technique involves solving a discrete optimization problem that assigns the resonance frequency to each atom in the protein. This paper introduces LIAN (LInear programming Assignment for NMR), a novel linear programming formulation of the problem which yields state-of-the-art results in simulated and experimental datasets.

Submitted 7 September, 2021; v1 submitted 8 August, 2020; originally announced August 2020.

Comments: 28 pages, 10 figures

arXiv:1811.05850 [pdf, other]

Drop-Activation: Implicit Parameter Reduction and Harmonic Regularization

Authors: Senwei Liang, Yuehaw Khoo, Haizhao Yang

Abstract: Overfitting frequently occurs in deep learning. In this paper, we propose a novel regularization method called Drop-Activation to reduce overfitting and improve generalization. The key idea is to drop nonlinear activation functions by setting them to be identity functions randomly during training time. During testing, we use a deterministic network with a new activation function to encode the average effect of dropping activations randomly. Our theoretical analyses support the regularization effect of Drop-Activation as implicit parameter reduction and verify its capability to be used together with Batch Normalization (Ioffe and Szegedy 2015). The experimental results on CIFAR-10, CIFAR-100, SVHN, EMNIST, and ImageNet show that Drop-Activation generally improves the performance of popular neural network architectures for the image classification task. Furthermore, as a regularizer Drop-Activation can be used in harmony with standard training and regularization techniques such as Batch Normalization and Auto Augment (Cubuk et al. 2019). The code is available at https://github.com/LeungSamWai/Drop-Activation.

Submitted 28 March, 2020; v1 submitted 14 November, 2018; originally announced November 2018.

arXiv:1810.09675 [pdf, other]

SwitchNet: a neural network model for forward and inverse scattering problems

Authors: Yuehaw Khoo, Lexing Ying

Abstract: We propose a novel neural network architecture, SwitchNet, for solving the wave equation based inverse scattering problems via providing maps between the scatterers and the scattered field (and vice versa). The main difficulty of using a neural network for this problem is that a scatterer has a global impact on the scattered wave field, rendering typical convolutional neural networks with local connections inapplicable. While it is possible to deal with such a problem using a fully connected network, the number of parameters grows quadratically with the size of the input and output data. By leveraging the inherent low-rank structure of the scattering problems and introducing a novel switching layer with sparse connections, the SwitchNet architecture uses far fewer parameters and facilitates the training process. Numerical experiments show promising accuracy in learning the forward and inverse maps between the scatterers and the scattered wave field.

Submitted 23 October, 2018; originally announced October 2018.

Comments: 19 pages, 7 figures

arXiv:1802.10275 [pdf, other]

Solving for high dimensional committor functions using artificial neural networks

Authors: Yuehaw Khoo, Jianfeng Lu, Lexing Ying

Abstract: In this note we propose a method based on artificial neural networks to study the transition between states governed by stochastic processes. In particular, we aim for numerical schemes for the committor function, the central object of transition path theory, which satisfies a high-dimensional Fokker-Planck equation. By working with the variational formulation of such a partial differential equation and parameterizing the committor function in terms of a neural network, approximations can be obtained by optimizing the neural network weights using stochastic algorithms. The numerical examples show that moderate accuracy can be achieved for high-dimensional problems.

Submitted 28 February, 2018; originally announced February 2018.

Comments: 12 pages, 6 figures

MSC Class: 65Nxx

arXiv:1706.01115 [pdf, other]

A Random-Fern based Feature Approach for Image Matching

Authors: Yong Khoo, Seo-hyeon Keun

Abstract: Image or object recognition is an important task in computer vision. With the high-speed processing power of modern platforms and the availability of mobile phones everywhere, millions of photos are uploaded to the internet per minute, so it is critical to establish a generic framework for fast and accurate image processing for automatic recognition and information retrieval. In this paper, we propose an efficient image recognition and matching method that is derived from the Naive Bayesian classification method to construct a probabilistic model. Our method supports real-time performance and has a very high ability to distinguish similar images with fine details. Experiments are conducted together with an intensive comparison with state-of-the-art image matching methods, such as Ferns recognition and SIFT recognition. The results demonstrate satisfactory performance.

Submitted 4 June, 2017; originally announced June 2017.

Comments: Computer Imaging, 2017

arXiv:1705.05508 [pdf, other]

Automated Body Structure Extraction from Arbitrary 3D Mesh

Authors: Yong Khoo, Sang Chung

Abstract: This paper presents an automated method for 3D character skeleton extraction that can be applied to generic 3D shapes. Our work is motivated by prior skeleton-based work on automatic rigging, focuses on skeleton extraction, and can automatically align the extracted structure to fit the 3D shape of the given 3D mesh. The body mesh can subsequently be skinned based on the extracted skeleton, which enables the rigging process. In the experiment, we use a public dataset to drive the estimated skeleton from different body shapes, as well as real data obtained from 3D scanning systems. Satisfactory results are obtained compared to the existing approaches.

Submitted 15 May, 2017; originally announced May 2017.

Journal ref: Imaging and Graphics, 2017

arXiv:1705.05016 [pdf, other]

A Correspondence Relaxation Approach for 3D Shape Reconstruction

Authors: Yong Khoo

Abstract: This paper presents a new method for 3D shape reconstruction based on two existing methods. A 3D reconstruction from a single photograph is introduced by both papers: the first one uses a photograph and a set of existing 3D models to generate the 3D object in the photograph, while the second one uses a photograph and a selected similar model to create the 3D object in the photograph. Based on their difference, we propose a relaxation-based method for more accurate correspondence establishment and shape recovery. The experiment demonstrates promising results compared to the state-of-the-art work on 3D shape estimation.

Submitted 14 May, 2017; originally announced May 2017.

arXiv:1608.05045 [pdf, other]

Large Angle based Skeleton Extraction for 3D Animation

Authors: Hugo Martin, Raphael Fernandez, Yong Khoo

Abstract: In this paper, we present a solution for arbitrary 3D character deformation by investigating the rotation angle of decomposition and preserving the mesh topology structure. In computer graphics, skeleton extraction and skeleton-driven animation is an active area that is gaining increasing interest from researchers. Accuracy is critical for realistic animation and related applications. There have been extensive studies on skeleton-based 3D deformation. However, scenarios involving large angle rotation of different body parts have been relatively less addressed by state-of-the-art methods, which often yield unsatisfactory results. Besides 3D animation problems, we also note that for many 3D skeleton detection or tracking applications from video or depth streams, large angle rotation is a critical factor in regression accuracy and robustness. We introduce a distortion metric function to quantify the surface curviness before and after deformation, which is a major clue for large angle rotation detection. The intensive experimental results show that our method is suitable for 3D modeling, animation, and skeleton-based tracking applications.

Submitted 17 August, 2016; originally announced August 2016.

arXiv:1606.06975 [pdf, ps, other]

Bias Correction in Saupe Tensor Estimation

Authors: Yuehaw Khoo, Amit Singer, David Cowburn

Abstract: Estimation of the Saupe tensor is central to the determination of molecular structures from residual dipolar couplings (RDC) or chemical shift anisotropies. Assuming a given template structure, the singular value decomposition (SVD) method proposed in Losonczi et al. 1999 has been used traditionally to estimate the Saupe tensor. Despite its simplicity, whenever the template structure has large structural noise, the eigenvalues of the estimated tensor have a magnitude systematically smaller than their actual values. This leads to systematic error when calculating the eigenvalue dependent parameters, magnitude and rhombicity. We propose here a Monte Carlo simulation method to remove such bias. We further demonstrate the effectiveness of our method in the setting when the eigenvalue estimates from multiple template protein fragments are available and their average is used as an improved eigenvalue estimator. For both synthetic and experimental RDC datasets of ubiquitin, when using template fragments corrupted by large noise, the magnitude of our proposed bias-reduced estimator generally reaches at least 90% of the actual value, whereas the magnitude of SVD estimator can be shrunk below 80% of the true value.

Submitted 22 June, 2016; originally announced June 2016.

Comments: 24 pages, 5 figures

ACM Class: G.3; J.2

arXiv:1604.01504 [pdf, other]

Integrating NOE and RDC using sum-of-squares relaxation for protein structure determination

Authors: Yuehaw Khoo, Amit Singer, David Cowburn

Abstract: We revisit the problem of protein structure determination from geometrical restraints from NMR, using convex optimization. It is well-known that the NP-hard distance geometry problem of determining atomic positions from pairwise distance restraints can be relaxed into a convex semidefinite program. Often the NOE distance restraints are too imprecise and sparse for accurate structure determination. Residual dipolar coupling (RDC) measurements provide additional geometric information on the angles between atom-pair directions and axes of the principal-axis-frame. The optimization problem involving RDC is highly non-convex and requires a good initialization even within the simulated annealing framework. In this paper, we model the protein backbone as an articulated structure composed of rigid units. Determining the rotation of each rigid unit gives the full protein structure. We propose solving the non-convex optimization problems using the sum-of-squares (SOS) hierarchy. The two algorithms, RDC-SOS and RDC-NOE-SOS, have polynomial time complexity in the number of amino-acid residues and run efficiently on a standard desktop. In many instances, the proposed methods exactly recover the solution to the original non-convex optimization problem. We introduce a statistical tool, the Cramer-Rao bound (CRB), to provide an information theoretic bound on the highest resolution one can hope to achieve when determining protein structure from noisy measurements using any methodology. Our simulation results show that when the RDC measurements are corrupted by Gaussian noise of realistic variance, both SOS based algorithms attain the CRB. We successfully apply our method in a divide-and-conquer fashion to determine the structure of ubiquitin from experimental NOE and RDC measurements, achieving more accurate and faster reconstructions compared to the current state of the art.

Submitted 26 February, 2017; v1 submitted 6 April, 2016; originally announced April 2016.

Comments: 41 pages, 5 figures

MSC Class: 90C90; 90C22; 68T40; 92C40

ACM Class: I.4.9; G.1.6

arXiv:1501.00630 [pdf, other]

doi 10.1109/TIP.2016.2540810

Non-iterative rigid 2D/3D point-set registration using semidefinite programming

Authors: Yuehaw Khoo, Ankur Kapoor

Abstract: We describe a convex programming framework for pose estimation in 2D/3D point-set registration with unknown point correspondences. We give two mixed-integer nonlinear program (MINP) formulations of the 2D/3D registration problem when there are multiple 2D images, and propose convex relaxations for both of the MINPs to semidefinite programs (SDP) that can be solved efficiently by interior point methods. Our approach to the 2D/3D registration problem is non-iterative in nature as we jointly solve for pose and correspondence. Furthermore, these convex programs can readily incorporate feature descriptors of points to enhance registration results. We prove that the convex programs exactly recover the solution to the original nonconvex 2D/3D registration problem under noiseless condition. We apply these formulations to the registration of 3D models of coronary vessels to their 2D projections obtained from multiple intra-operative fluoroscopic images. For this application, we experimentally corroborate the exact recovery property in the absence of noise and further demonstrate robustness of the convex programs in the presence of noise.

Submitted 6 April, 2016; v1 submitted 3 January, 2015; originally announced January 2015.

Comments: 15 pages, 7 figures

MSC Class: 90C22; 92C55

ACM Class: G.1.6; I.4.9

arXiv:1404.2655 [pdf, other]

Open problem: Tightness of maximum likelihood semidefinite relaxations

Authors: Afonso S. Bandeira, Yuehaw Khoo, Amit Singer

Abstract: We have observed an interesting, yet unexplained, phenomenon: Semidefinite programming (SDP) based relaxations of maximum likelihood estimators (MLE) tend to be tight in recovery problems with noisy data, even when MLE cannot exactly recover the ground truth. Several results establish tightness of SDP based relaxations in the regime where exact recovery from MLE is possible. However, to the best of our knowledge, their tightness is not understood beyond this regime. As an illustrative example, we focus on the generalized Procrustes problem.

Submitted 9 April, 2014; originally announced April 2014.

arXiv:1310.8135 [pdf, other]

doi 10.1109/ICASSP.2015.7178491

Large-Scale Sensor Network Localization via Rigid Subnetwork Registration

Authors: Kunal N. Chaudhury, Yuehaw Khoo, Amit Singer

Abstract: In this paper, we describe an algorithm for sensor network localization (SNL) that proceeds by dividing the whole network into smaller subnetworks, then localizes them in parallel using some fast and accurate algorithm, and finally registers the localized subnetworks in a global coordinate system. We demonstrate that this divide-and-conquer algorithm can be used to leverage existing high-precision SNL algorithms to large-scale networks, which could otherwise only be applied to small-to-medium sized networks. The main contribution of this paper concerns the final registration phase. In particular, we consider a least-squares formulation of the registration problem (both with and without anchor constraints) and demonstrate how this otherwise non-convex problem can be relaxed into a tractable convex program. We provide some preliminary simulation results for large-scale SNL demonstrating that the proposed registration algorithm (together with an accurate localization scheme) offers a good tradeoff between run time and accuracy.

Submitted 15 January, 2015; v1 submitted 29 October, 2013; originally announced October 2013.

Comments: 5 pages, 8 figures, 1 table. To appear in Proc. IEEE International Conference on Acoustics, Speech, and Signal Processing, April 19-24, 2015

arXiv:1306.5226 [pdf, other]

Global registration of multiple point clouds using semidefinite programming

Authors: Kunal N. Chaudhury, Yuehaw Khoo, Amit Singer

Abstract: Consider $N$ points in $\mathbb{R}^d$ and $M$ local coordinate systems that are related through unknown rigid transforms. For each point we are given (possibly noisy) measurements of its local coordinates in some of the coordinate systems. Alternatively, for each coordinate system, we observe the coordinates of a subset of the points. The problem of estimating the global coordinates of the $N$ points (up to a rigid transform) from such measurements comes up in distributed approaches to molecular conformation and sensor network localization, and also in computer vision and graphics. The least-squares formulation of this problem, though non-convex, has a well known closed-form solution when $M=2$ (based on the singular value decomposition). However, no closed form solution is known for $M\geq 3$. In this paper, we demonstrate how the least-squares formulation can be relaxed into a convex program, namely a semidefinite program (SDP). By setting up connections between the uniqueness of this SDP and results from rigidity theory, we prove conditions for exact and stable recovery for the SDP relaxation. In particular, we prove that the SDP relaxation can guarantee recovery under more adversarial conditions compared to earlier proposed spectral relaxations, and derive error bounds for the registration error incurred by the SDP relaxation. We also present results of numerical experiments on simulated data to confirm the theoretical findings. We empirically demonstrate that (a) unlike the spectral relaxation, the relaxation gap is mostly zero for the semidefinite program (i.e., we are able to solve the original non-convex least-squares problem) up to a certain noise threshold, and (b) the semidefinite program performs significantly better than spectral and manifold-optimization methods, particularly at large noise levels.

Submitted 23 December, 2014; v1 submitted 21 June, 2013; originally announced June 2013.

Comments: 33 pages, 12 figures. To appear in SIAM Journal on Optimization

MSC Class: 90C22; 52C25; 05C50

Showing 1–32 of 32 results for author: Khoo, Y