Search | arXiv e-print repository

A General Theory of Correct, Incorrect, and Extrinsic Equivariance

Authors: Dian Wang, Xupeng Zhu, Jung Yeon Park, Mingxi Jia, Guanang Su, Robert Platt, Robin Walters

Abstract: Although equivariant machine learning has proven effective at many tasks, success depends heavily on the assumption that the ground truth function is symmetric over the entire domain matching the symmetry in an equivariant neural network. A missing piece in the equivariant learning literature is the analysis of equivariant networks when symmetry exists only partially in the domain. In this work, we present a general theory for such a situation. We propose pointwise definitions of correct, incorrect, and extrinsic equivariance, which allow us to quantify continuously the degree of each type of equivariance a function displays. We then study the impact of various degrees of incorrect or extrinsic symmetry on model error. We prove error lower bounds for invariant or equivariant networks in classification or regression settings with partially incorrect symmetry. We also analyze the potentially harmful effects of extrinsic equivariance. Experiments validate these results in three different environments.

Submitted 28 October, 2023; v1 submitted 8 March, 2023; originally announced March 2023.

Comments: Published at NeurIPS 2023

arXiv:2001.04029

doi 10.1103/PhysRevE.102.012152

Tangent-Space Gradient Optimization of Tensor Network for Machine Learning

Authors: Zheng-zhi Sun, Shi-ju Ran, Gang Su

Abstract: The gradient-based optimization method for deep machine learning models suffers from gradient vanishing and exploding problems, particularly when the computational graph becomes deep. In this work, we propose the tangent-space gradient optimization (TSGO) for probabilistic models to keep the gradients from vanishing or exploding. The central idea is to guarantee the orthogonality between the variational parameters and the gradients. The optimization is then implemented by rotating the parameter vector towards the direction of the gradient. We explain and test TSGO in tensor network (TN) machine learning, where the TN describes the joint probability distribution as a normalized state $|\psi\rangle$ in Hilbert space. We show that the gradient can be restricted to the tangent space of the $\langle\psi|\psi\rangle = 1$ hypersphere. Instead of relying on additional adaptive methods to control the learning rate, as in deep learning, the learning rate of TSGO is naturally determined by the angle $\theta$ as $\eta = \tan\theta$. Our numerical results reveal better convergence of TSGO in comparison to the off-the-shelf Adam.

Submitted 10 January, 2020; originally announced January 2020.

Comments: 5 pages, 4 figures

Journal ref: Phys. Rev. E 102, 012152 (2020)

arXiv:1912.10729

TextNAS: A Neural Architecture Search Space tailored for Text Representation

Authors: Yujing Wang, Yaming Yang, Yiren Chen, Jing Bai, Ce Zhang, Guinan Su, Xiaoyu Kou, Yunhai Tong, Mao Yang, Lidong Zhou

Abstract: Learning text representation is crucial for text classification and other language-related tasks. There is a diverse set of text representation networks in the literature, and how to find the optimal one is a non-trivial problem. Recently, the emerging Neural Architecture Search (NAS) techniques have demonstrated good potential to solve the problem. Nevertheless, most existing work on NAS focuses on the search algorithms and pays little attention to the search space. In this paper, we argue that the search space is also an important human prior for the success of NAS in different applications. Thus, we propose a novel search space tailored for text representation. Through automatic search, the discovered network architecture outperforms state-of-the-art models on various public datasets on text classification and natural language inference tasks. Furthermore, some of the design principles found in the automatic network agree well with human intuition.

Submitted 23 December, 2019; originally announced December 2019.

arXiv:1907.10290

doi 10.1103/PhysRevResearch.2.033293

Quantum Compressed Sensing with Unsupervised Tensor-Network Machine Learning

Authors: Shi-Ju Ran, Zheng-Zhi Sun, Shao-Ming Fei, Gang Su, Maciej Lewenstein

Abstract: We propose tensor-network compressed sensing (TNCS) by combining the ideas of compressed sensing, tensor network (TN), and machine learning, which permits novel and efficient quantum communications of realistic data. The strategy is to use the unsupervised TN machine learning algorithm to obtain the entangled state $|\Psi\rangle$ that describes the probability distribution of a huge amount of classical information considered to be communicated. To transfer a specific piece of information with $|\Psi\rangle$, our proposal is to encode such information in the separable state with the minimal distance to the measured state $|\Phi\rangle$ that is obtained by partially measuring on $|\Psi\rangle$ in a designed way. To this end, a measuring protocol analogous to the compressed sensing with neural-network machine learning is suggested, where the measurements are designed to minimize uncertainty of information from the probability distribution given by $|\Phi\rangle$. In this way, those who have $|\Phi\rangle$ can reliably access the information by simply measuring on $|\Phi\rangle$. We propose q-sparsity to characterize the sparsity of quantum states and the efficiency of the quantum communications by TNCS. The high q-sparsity is essentially due to the fact that the TN states describing nicely the probability distribution obey the area law of entanglement entropy. Testing on realistic datasets (hand-written digits and fashion images), TNCS is shown to possess high efficiency and accuracy, where the security of communications is guaranteed by the fundamental quantum principles.

Submitted 13 October, 2019; v1 submitted 24 July, 2019; originally announced July 2019.

Comments: 5+6 pages, 3+6 figures. Essential changes and new data were added to this new version

Journal ref: Phys. Rev. Research 2, 033293 (2020)

arXiv:1903.10742

doi 10.1103/PhysRevB.101.075135

Generative Tensor Network Classification Model for Supervised Machine Learning

Authors: Zheng-Zhi Sun, Cheng Peng, Ding Liu, Shi-Ju Ran, Gang Su

Abstract: Tensor network (TN) has recently triggered extensive interest in developing machine-learning models in quantum many-body Hilbert space. Here we propose a generative TN classification (GTNC) approach for supervised learning. The strategy is to train the generative TN for each class of the samples to construct the classifiers. The classification is implemented by comparing the distance in the many-body Hilbert space. The numerical experiments by GTNC show impressive performance on the MNIST and Fashion-MNIST datasets. The testing accuracy is competitive with the state-of-the-art convolutional neural network while higher than the naive Bayes classifier (a generative classifier) and support vector machine. Moreover, GTNC is more efficient than the existing TN models that are in general discriminative. By investigating the distances in the many-body Hilbert space, we find that (a) the samples are naturally clustering in such a space; and (b) bounding the bond dimensions of the TNs to finite values corresponds to removing redundant information in the image recognition. These two characteristics make GTNC an adaptive and universal model of excellent performance.

Submitted 26 March, 2019; originally announced March 2019.

Comments: 7 pages, 5 figures

Journal ref: Phys. Rev. B 101, 075135 (2020)

arXiv:1710.04833

doi 10.1088/1367-2630/ab31ef

Machine Learning by Unitary Tensor Network of Hierarchical Tree Structure

Authors: Ding Liu, Shi-Ju Ran, Peter Wittek, Cheng Peng, Raul Blázquez García, Gang Su, Maciej Lewenstein

Abstract: The resemblance between the methods used in quantum many-body physics and in machine learning has drawn considerable attention. In particular, tensor networks (TNs) and deep learning architectures bear striking similarities to the extent that TNs can be used for machine learning. Previous results used one-dimensional TNs in image recognition, showing limited scalability and flexibility. In this work, we train two-dimensional hierarchical TNs to solve image recognition problems, using a training algorithm derived from the multi-scale entanglement renormalization ansatz. This approach introduces mathematical connections among quantum many-body physics, quantum information theory, and machine learning. While keeping the TN unitary in the training phase, TN states are defined, which encode classes of images into quantum many-body states. We study the quantum features of the TN states, including quantum entanglement and fidelity. We find these quantities could be properties that characterize the image classes, as well as the machine learning tasks.

Submitted 10 March, 2019; v1 submitted 13 October, 2017; originally announced October 2017.

Comments: 6 pages, 4 figures

Journal ref: New Journal of Physics, 21, 073059 (2019)

arXiv:1606.05798

Interpretable Two-level Boolean Rule Learning for Classification

Authors: Guolong Su, Dennis Wei, Kush R. Varshney, Dmitry M. Malioutov

Abstract: As a contribution to interpretable machine learning research, we develop a novel optimization framework for learning accurate and sparse two-level Boolean rules. We consider rules in both conjunctive normal form (AND-of-ORs) and disjunctive normal form (OR-of-ANDs). A principled objective function is proposed to trade classification accuracy and interpretability, where we use Hamming loss to characterize accuracy and sparsity to characterize interpretability. We propose efficient procedures to optimize these objectives based on linear programming (LP) relaxation, block coordinate descent, and alternating minimization. Experiments show that our new algorithms provide very good tradeoffs between accuracy and interpretability.

Submitted 18 June, 2016; originally announced June 2016.

Comments: presented at 2016 ICML Workshop on Human Interpretability in Machine Learning (WHI 2016), New York, NY

Report number: WHI 2016 submission

Showing 1–7 of 7 results for author: Su, G