Memory-efficient Stochastic methods for Memory-based Transformers

Authors: Vishwajit Kumar Vishnu, C. Chandra Sekhar

Abstract: Training memory-based transformers can require a large amount of memory and can be quite inefficient. We propose a novel two-phase training mechanism and a novel regularization technique to improve the training efficiency of memory-based transformers, which are often used for long-range context problems. For our experiments, we consider Transformer-XL, one of the memory-based transformer models, as our baseline. We show that our resultant model, Skip Cross-head Transformer-XL, outperforms the baseline on a character-level language modeling task with a similar number of parameters and outperforms the baseline on a word-level language modeling task with almost 20% fewer parameters. Our proposed methods do not require any additional memory. We also demonstrate the effectiveness of our regularization mechanism on BERT, which shows similar performance with a reduction of around 30% in the standard deviation of scores on multiple GLUE tasks.

Submitted 14 November, 2023; originally announced November 2023.

arXiv:2108.02850 [pdf, other]

Unsupervised Domain Adaptation in Speech Recognition using Phonetic Features

Authors: Rupam Ojha, C Chandra Sekhar

Abstract: Automatic speech recognition is a difficult problem in pattern recognition because several sources of variability exist in the speech input: channel variations, clean or noisy input, different speaker accents, variations in gender, etc. As a result, domain adaptation is important in speech recognition, where we train the model for a particular source domain and test it on a different target domain. In this paper, we propose a technique to perform unsupervised gender-based domain adaptation in speech recognition using phonetic features. The experiments are performed on the TIMIT dataset, and there is a considerable decrease in the phoneme error rate using the proposed approach.

Submitted 4 August, 2021; originally announced August 2021.

Comments: 5 pages, 3 figures

arXiv:2105.05061 [pdf, other]

Semi-Supervised Metric Learning: A Deep Resurrection

Authors: Ujjal Kr Dutta, Mehrtash Harandi, Chellu Chandra Sekhar

Abstract: Distance Metric Learning (DML) seeks to learn a discriminative embedding where similar examples are closer, and dissimilar examples are apart. In this paper, we address the problem of Semi-Supervised DML (SSDML) that tries to learn a metric using a few labeled examples, and abundantly available unlabeled examples. SSDML is important because it is infeasible to manually annotate all the examples present in a large dataset. Surprisingly, with the exception of a few classical approaches that learn a linear Mahalanobis metric, SSDML has not been studied in recent years, and lacks approaches in the deep SSDML scenario. In this paper, we address this challenging problem, and revamp SSDML with respect to deep learning. In particular, we propose a stochastic, graph-based approach that first propagates the affinities between the pairs of examples from labeled data to the unlabeled pairs. The propagated affinities are used to mine triplet-based constraints for metric learning. We impose an orthogonality constraint on the metric parameters, as it leads to better performance by avoiding a model collapse.

Submitted 10 May, 2021; originally announced May 2021.

Comments: In AAAI-2021

arXiv:2103.03215 [pdf, other]

Front-end Diarization for Percussion Separation in Taniavartanam of Carnatic Music Concerts

Authors: Nauman Dawalatabad, Jilt Sebastian, Jom Kuriakose, C. Chandra Sekhar, Shrikanth Narayanan, Hema A. Murthy

Abstract: Instrument separation in an ensemble is a challenging task. In this work, we address the problem of separating the percussive voices in the taniavartanam segments of Carnatic music. In taniavartanam, a number of percussive instruments play together or in tandem. Separation of instruments in regions where only one percussion is present leads to interference and artifacts at the output, as source separation algorithms assume the presence of multiple percussive voices throughout the audio segment. We prevent this by first subjecting the taniavartanam to diarization. This process results in homogeneous clusters consisting of segments of either a single voice or multiple voices. A cluster of segments with multiple voices is identified using the Gaussian mixture model (GMM), which is then subjected to source separation. A deep recurrent neural network (DRNN) based approach is used to separate the multiple instrument segments. The effectiveness of the proposed system is evaluated on a standard Carnatic music dataset. The proposed approach provides close-to-oracle performance for non-overlapping segments and a significant improvement over traditional separation schemes.

Submitted 4 March, 2021; originally announced March 2021.

arXiv:2008.09880 [pdf, other]

doi 10.1109/TAI.2020.3026982

Unsupervised Deep Metric Learning via Orthogonality based Probabilistic Loss

Authors: Ujjal Kr Dutta, Mehrtash Harandi, Chellu Chandra Sekhar

Abstract: Metric learning is an important problem in machine learning. It aims to group similar examples together. Existing state-of-the-art metric learning approaches require class labels to learn a metric. As obtaining class labels in all applications is not feasible, we propose an unsupervised approach that learns a metric without making use of class labels. The lack of class labels is compensated by obtaining pseudo-labels of data using a graph-based clustering approach. The pseudo-labels are used to form triplets of examples, which guide the metric learning. We propose a probabilistic loss that minimizes the chances of each triplet violating an angular constraint. A weight function and an orthogonality constraint in the objective speed up the convergence and avoid a model collapse. We also provide a stochastic formulation of our method to scale up to large-scale datasets. Our studies demonstrate the competitiveness of our approach against state-of-the-art methods. We also thoroughly study the effect of the different components of our method.

Submitted 22 August, 2020; originally announced August 2020.

Comments: In the IEEE Transactions on Artificial Intelligence (IEEE TAI)

arXiv:2002.12394 [pdf, other]

Affinity guided Geometric Semi-Supervised Metric Learning

Authors: Ujjal Kr Dutta, Mehrtash Harandi, Chellu Chandra Sekhar

Abstract: In this paper, we revamp the forgotten classical Semi-Supervised Distance Metric Learning (SSDML) problem from a Riemannian geometric lens, to leverage stochastic optimization within an end-to-end deep framework. The motivation comes from the fact that apart from a few classical SSDML approaches learning a linear Mahalanobis metric, deep SSDML has not been studied. We first extend existing SSDML methods to their deep counterparts and then propose a new method to overcome their limitations. Due to the nature of constraints on our metric parameters, we leverage Riemannian optimization. Our deep SSDML method with a novel affinity propagation based triplet mining strategy outperforms its competitors.

Submitted 6 November, 2020; v1 submitted 27 February, 2020; originally announced February 2020.

Comments: Paper accepted in NeurIPS 2020 workshop on Differential Geometry meets Deep Learning

arXiv:1905.08574 [pdf]

Online Signature Verification Based on Writer Specific Feature Selection and Fuzzy Similarity Measure

Authors: Chandra Sekhar V, Prerana Mukherjee, D. S. Guru, Viswanath Pulabaigari

Abstract: Online Signature Verification (OSV) is a widely used biometric attribute for user behavioral characteristic verification in digital forensics. In this manuscript, owing to large intra-individual variability, a novel method for OSV based on an interval symbolic representation and a fuzzy similarity measure grounded on writer-specific parameter selection is proposed. The two parameters, namely the writer-specific acceptance threshold and the optimal feature set to be used for authenticating the writer, are selected based on the minimum equal error rate (EER) attained during the parameter fixation phase using the training signature samples. This is in contrast to current techniques for OSV, which are primarily writer independent, in which a common set of features and acceptance threshold are chosen. To prove the robustness of our system, we have exhaustively assessed our system with four standard datasets, i.e. MCYT-100 (DB1), MCYT-330 (DB2), SUSIG-Visual corpus and SVC-2004-Task2. The experimental outcome confirms the effectiveness of fuzzy similarity metric-based writer-dependent parameter selection for OSV by achieving a lower error rate as compared to many recent and state-of-the-art OSV models.

Submitted 21 May, 2019; originally announced May 2019.

Comments: accepted in Applications of Computer Vision and Pattern Recognition to Media Forensics, CVPRW, 2019, Long Beach, California

arXiv:1904.00240 [pdf, other]

OSVNet: Convolutional Siamese Network for Writer Independent Online Signature Verification

Authors: Chandra Sekhar, Prerana Mukherjee, Devanur S Guru, Viswanath Pulabaigari

Abstract: Online signature verification (OSV) is one of the most challenging tasks in writer identification and digital forensics. Owing to the large intra-individual variability, there is a critical requirement to accurately learn the intra-personal variations of the signature to achieve higher classification accuracy. To achieve this, in this paper, we propose an OSV framework based on a deep convolutional Siamese network (DCSN). DCSN automatically extracts robust feature descriptions based on a metric-based loss function, which decreases intra-writer variability (Genuine-Genuine) and increases inter-individual variability (Genuine-Forgery) and directs the DCSN towards effective discriminative representation learning for online signatures; we also extend it to a one-shot learning framework. Comprehensive experimentation conducted on three widely accepted benchmark datasets, MCYT-100 (DB1), MCYT-330 (DB2) and SVC-2004-Task2, demonstrates the capability of our framework to distinguish the genuine and forgery samples. Experimental results confirm the efficiency of deep convolutional Siamese network based OSV by achieving a lower error rate as compared to many recent and state-of-the-art OSV techniques.

Submitted 21 May, 2019; v1 submitted 30 March, 2019; originally announced April 2019.

Comments: accepted in International Conference on Document Analysis and Recognition (ICDAR 2019), University of Technology Sydney (UTS), Australia

arXiv:1902.08051 [pdf, other]

doi 10.1109/ICASSP.2019.8683114

Incremental Transfer Learning in Two-pass Information Bottleneck based Speaker Diarization System for Meetings

Authors: Nauman Dawalatabad, Srikanth Madikeri, C Chandra Sekhar, Hema A Murthy

Abstract: The two-pass information bottleneck (TPIB) based speaker diarization system operates independently on different conversational recordings. The TPIB system does not consider previously learned speaker discriminative information while diarizing new conversations. Hence, the real time factor (RTF) of the TPIB system is high owing to the training time required for the artificial neural network (ANN). This paper attempts to improve the RTF of the TPIB system using an incremental transfer learning approach where the parameters learned by the ANN from other conversations are updated using the current conversation rather than learning parameters from scratch. This reduces the RTF significantly. The effectiveness of the proposed approach compared to the baseline IB and the TPIB systems is demonstrated on standard NIST and AMI conversational meeting datasets. With a minor degradation in performance, the proposed system shows a significant improvement of 33.07% and 24.45% in RTF with respect to the TPIB system on the NIST RT-04Eval and AMI-1 datasets, respectively.

Submitted 21 February, 2019; originally announced February 2019.

Comments: 5 pages, 2 figures, To appear in Proc. ICASSP 2019, May 12-17, 2019, Brighton, UK

arXiv:1401.6121 [pdf]

A Robust Password-Based Multi-Server Authentication Scheme

Authors: Vorugunti Chandra Sekhar, Mrudula Sarvabhatla

Abstract: In 2013, Tsai et al. cryptanalyzed the scheme of Yeh et al., showed that it is vulnerable to various cryptographic attacks, and proposed an improved scheme. In this poster we show that the scheme of Tsai et al. is also vulnerable to an undetectable online password guessing attack; on success of the attack, the adversary can perform all major cryptographic attacks. As a part of our contribution, we have proposed an improved scheme which overcomes the defects in the schemes of Tsai et al. and Yeh et al.

Submitted 6 December, 2013; originally announced January 2014.

arXiv:1401.1318 [pdf]

A Robust Biometric-Based Three-factor Remote User Authentication Scheme

Authors: Vorugunti Chandra Sekhar, Mrudula Sarvabhatla

Abstract: The rapid development of Internet of Things (IoT) technology, which is an interconnection of networks through an insecure public channel, i.e. the Internet, demands authentication of the remote user trying to access secure network resources. In 2013, Ankita et al. proposed an improved three-factor remote user authentication scheme. In this poster we show that the scheme of Ankita et al. is vulnerable to a known session-specific temporary information attack; on successfully performing the attack, the adversary can perform all other major cryptographic attacks. As a part of our contribution, we propose an improved scheme which is resistant to all major cryptographic attacks and overcomes the defects in the scheme of Ankita et al.

Submitted 7 January, 2014; originally announced January 2014.

Showing 1–11 of 11 results for author: Sekhar, C