
The Role of Linguistic Priors in Measuring Compositional Generalization of Vision-Language Models

Authors: Chenwei Wu, Li Erran Li, Stefano Ermon, Patrick Haffner, Rong Ge, Zaiwei Zhang

Abstract: Compositionality is a common property in many modalities including natural languages and images, but the compositional generalization of multi-modal models is not well-understood. In this paper, we identify two sources of visual-linguistic compositionality: linguistic priors and the interplay between images and texts. We show that current attempts to improve compositional generalization rely on linguistic priors rather than on information in the image. We also propose a new metric for compositionality without such linguistic priors.

Submitted 4 October, 2023; originally announced October 2023.

arXiv:2211.08978 [pdf]

doi 10.1109/ICASSP.1992.225874

Rapid Connectionist Speaker Adaptation

Authors: Michael Witbrock, Patrick Haffner

Abstract: We present SVCnet, a system for modelling speaker variability. Encoder Neural Networks specialized for each speech sound produce low dimensionality models of acoustical variation, and these models are further combined into an overall model of voice variability. A training procedure is described which minimizes the dependence of this model on which sounds have been uttered. Using the trained model (SVCnet) and a brief, unconstrained sample of a new speaker's voice, the system produces a Speaker Voice Code that can be used to adapt a recognition system to the new speaker without retraining. A system which combines SVCnet with an MS-TDNN recognizer is described.

Submitted 14 November, 2022; originally announced November 2022.

Comments: 6 Figures, Two Tables, ICASSP-92

Journal ref: ICASSP-92: 1992 IEEE International Conference on Acoustics, Speech, and Signal Processing, 1992, pp. 453-456 vol.1

arXiv:2104.08967 [pdf, other]

Deep Clustering with Measure Propagation

Authors: Minhua Chen, Badrinath Jayakumar, Padmasundari Gopalakrishnan, Qiming Huang, Michael Johnston, Patrick Haffner

Abstract: Deep models have improved state-of-the-art for both supervised and unsupervised learning. For example, deep embedded clustering (DEC) has greatly improved the unsupervised clustering performance, by using stacked autoencoders for representation learning. However, one weakness of deep modeling is that the local neighborhood structure in the original space is not necessarily preserved in the latent space. To preserve local geometry, various methods have been proposed in the supervised and semi-supervised learning literature (e.g., spectral clustering and label propagation) using graph Laplacian regularization. In this paper, we combine the strength of deep representation learning with measure propagation (MP), a KL-divergence based graph regularization method originally used in the semi-supervised scenario. The main assumption of MP is that if two data points are close in the original space, they are likely to belong to the same class, measured by KL-divergence of class membership distribution. By taking the same assumption in the unsupervised learning scenario, we propose our Deep Embedded Clustering Aided by Measure Propagation (DECAMP) model. We evaluate DECAMP on short text clustering tasks. On three public datasets, DECAMP performs competitively with other state-of-the-art baselines, including baselines using additional data to generate word embeddings used in the clustering process. As an example, on the Stackoverflow dataset, DECAMP achieved a clustering accuracy of 79%, which is about 5% higher than all existing baselines. These empirical results suggest that DECAMP is a very effective method for unsupervised learning.

Submitted 26 April, 2021; v1 submitted 18 April, 2021; originally announced April 2021.

Comments: This work was presented as a poster in 14th Annual Machine Learning Symposium in The New York Academy of Sciences

arXiv:1711.11542 [pdf, other]

Learning to Adapt by Minimizing Discrepancy

Authors: Alexander G. Ororbia II, Patrick Haffner, David Reitter, C. Lee Giles

Abstract: We explore whether useful temporal neural generative models can be learned from sequential data without back-propagation through time. We investigate the viability of a more neurocognitively-grounded approach in the context of unsupervised generative modeling of sequences. Specifically, we build on the concept of predictive coding, which has gained influence in cognitive science, in a neural framework. To do so we develop a novel architecture, the Temporal Neural Coding Network, and its learning algorithm, Discrepancy Reduction. The underlying directed generative model is fully recurrent, meaning that it employs structural feedback connections and temporal feedback connections, yielding information propagation cycles that create local learning signals. This facilitates a unified bottom-up and top-down approach for information transfer inside the architecture. Our proposed algorithm shows promise on the bouncing balls generative modeling problem. Further experiments could be conducted to explore the strengths and weaknesses of our approach.

Submitted 30 November, 2017; originally announced November 2017.

Comments: Note: Additional experiments in support of this paper are still running (updates will be made as they are completed)

arXiv:1411.6725 [pdf, ps, other]

Accelerated Parallel Optimization Methods for Large Scale Machine Learning

Authors: Haipeng Luo, Patrick Haffner, Jean-Francois Paiement

Abstract: The growing amount of high dimensional data in different machine learning applications requires more efficient and scalable optimization algorithms. In this work, we consider combining two techniques, parallelism and Nesterov's acceleration, to design faster algorithms for L1-regularized loss. We first simplify BOOM, a variant of gradient descent, and study it in a unified framework, which allows us to not only propose a refined measurement of sparsity to improve BOOM, but also show that BOOM is provably slower than FISTA. Moving on to parallel coordinate descent methods, we then propose an efficient accelerated version of Shotgun, improving the convergence rate from $O(1/t)$ to $O(1/t^2)$. Our algorithm enjoys a concise form and analysis compared to previous work, and also allows one to study several connected work in a unified way.

Submitted 24 November, 2014; originally announced November 2014.

Comments: Appears in the 7th NIPS Workshop on Optimization for Machine Learning

arXiv:cs/0506101 [pdf, ps, other]

Efficient Multiclass Implementations of L1-Regularized Maximum Entropy

Authors: Patrick Haffner, Steven Phillips, Rob Schapire

Abstract: This paper discusses the application of L1-regularized maximum entropy modeling or SL1-Max [9] to multiclass categorization problems. A new modification to the SL1-Max fast sequential learning algorithm is proposed to handle conditional distributions. Furthermore, unlike most previous studies, the present research goes beyond a single type of conditional distribution. It describes and compares a variety of modeling assumptions about the class distribution (independent or exclusive) and various types of joint or conditional distributions. It results in a new methodology for combining binary regularized classifiers to achieve multiclass categorization. In this context, Maximum Entropy can be considered as a generic and efficient regularized classification tool that matches or outperforms the state of the art represented by AdaBoost and SVMs.

Submitted 29 June, 2005; originally announced June 2005.

Comments: 13 pages, describes new conditional maxent algorithm, to be submitted

Showing 1–6 of 6 results for author: Haffner, P