Search | arXiv e-print repository

RepCNN: Micro-sized, Mighty Models for Wakeword Detection

Authors: Arnav Kundu, Prateeth Nayak, Hywel Richards, Priyanka Padmanabhan, Devang Naik

Abstract: Always-on machine learning models require a very low memory and compute footprint. Their restricted parameter count limits the model's capacity to learn, and the effectiveness of the usual training algorithms to find the best parameters. Here we show that a small convolutional model can be better trained by first refactoring its computation into a larger redundant multi-branched architecture. Then, for inference, we algebraically re-parameterize the trained model into the single-branched form with fewer parameters for a lower memory footprint and compute cost. Using this technique, we show that our always-on wake-word detector model, RepCNN, provides a good trade-off between latency and accuracy during inference. RepCNN re-parameterized models are 43% more accurate than a uni-branch convolutional model while having the same runtime. RepCNN also meets the accuracy of complex architectures like BC-ResNet, while having 2x lower peak memory usage and 10x faster runtime.

Submitted 4 June, 2024; originally announced June 2024.
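
The algebraic re-parameterization mentioned in the RepCNN abstract relies on the linearity of convolution: parallel linear branches that are summed can be folded into one kernel. The sketch below is a generic 1-D illustration of that idea, not the RepCNN architecture itself; the branch layout (a 3-tap branch, a 1-tap branch, and an identity skip) and the function names `multi_branch` and `reparameterize` are assumptions made for the example.

```python
import numpy as np

def multi_branch(x, k3, k1):
    """Training-time form: 3-tap branch + 1-tap branch + identity skip, summed."""
    return np.convolve(x, k3, mode="same") + np.convolve(x, k1, mode="same") + x

def reparameterize(k3, k1):
    """Fold the three linear branches into a single 3-tap kernel.

    The 1-tap kernel and the identity both act only on the center tap,
    so they are added to the center of the 3-tap kernel.
    """
    merged = k3.copy()
    merged[1] += k1[0] + 1.0
    return merged

rng = np.random.default_rng(0)
x = rng.standard_normal(16)
k3 = rng.standard_normal(3)
k1 = rng.standard_normal(1)

merged = reparameterize(k3, k1)
y_train = multi_branch(x, k3, k1)              # three branches
y_infer = np.convolve(x, merged, mode="same")  # one branch, same output
print(np.allclose(y_train, y_infer))
```

The single-branch form stores one 3-tap kernel instead of three sets of parameters. A real model would also need to fold any normalization layers in each branch, which this sketch leaves out.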

arXiv:2405.05329 [pdf, other]

KV-Runahead: Scalable Causal LLM Inference by Parallel Key-Value Cache Generation

Authors: Minsik Cho, Mohammad Rastegari, Devang Naik

Abstract: Large Language Model or LLM inference has two phases, the prompt (or prefill) phase to output the first token and the extension (or decoding) phase to generate subsequent tokens. In this work, we propose an efficient parallelization scheme, KV-Runahead, to accelerate the prompt phase. The key observation is that the extension phase generates tokens faster than the prompt phase because of key-value cache (KV-cache). Hence, KV-Runahead parallelizes the prompt phase by orchestrating multiple processes to populate the KV-cache and minimizes the time-to-first-token (TTFT). Dual-purposing the KV-cache scheme has two main benefits. First, since KV-cache is designed to leverage the causal attention map, we minimize computation and computation automatically. Second, since it already exists for the extension phase, KV-Runahead is easy to implement. We further propose context-level load-balancing to handle uneven KV-cache generation (due to the causal attention) and to optimize TTFT. Compared with an existing parallelization scheme such as tensor or sequential parallelization where keys and values are locally generated and exchanged via all-gather collectives, our experimental results demonstrate that KV-Runahead can offer over 1.4x and 1.6x speedups for Llama 7B and Falcon 7B respectively.

Submitted 13 May, 2024; v1 submitted 8 May, 2024; originally announced May 2024.

Comments: preprint for ICML 2024
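
The KV-Runahead abstract builds on the fact that a key-value cache reproduces causal attention exactly. As a minimal sketch of that underlying property only (single head, no learned projections, and none of the paper's multi-process scheme), the following compares masked attention over the whole sequence with token-by-token attention over a growing cache. All function names are hypothetical.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def causal_attention(q, k, v):
    """Attention for all positions at once, masking out future positions."""
    t, d = q.shape
    scores = q @ k.T / np.sqrt(d)
    future = np.triu(np.ones((t, t), dtype=bool), k=1)
    scores = np.where(future, -np.inf, scores)
    return softmax(scores) @ v

def cached_attention(q, k, v):
    """One token at a time: append its key/value to the cache, then attend."""
    d = q.shape[1]
    k_cache, v_cache, outputs = [], [], []
    for q_t, k_t, v_t in zip(q, k, v):
        k_cache.append(k_t)
        v_cache.append(v_t)
        K = np.stack(k_cache)  # keys seen so far
        V = np.stack(v_cache)  # values seen so far
        weights = softmax(K @ q_t / np.sqrt(d))
        outputs.append(weights @ V)
    return np.stack(outputs)

rng = np.random.default_rng(0)
q, k, v = rng.standard_normal((3, 6, 4))  # 6 tokens, head dimension 4

full = causal_attention(q, k, v)
step = cached_attention(q, k, v)
print(np.allclose(full, step))
```

In the cached form, the token at position t attends to t + 1 cache entries, so later positions cost more than earlier ones. This unevenness is what the abstract attributes to causal attention when it motivates load balancing.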

arXiv:2312.09299 [pdf, other]

Weight subcloning: direct initialization of transformers using larger pretrained ones

Authors: Mohammad Samragh, Mehrdad Farajtabar, Sachin Mehta, Raviteja Vemulapalli, Fartash Faghri, Devang Naik, Oncel Tuzel, Mohammad Rastegari

Abstract: Training large transformer models from scratch for a target task requires lots of data and is computationally demanding. The usual practice of transfer learning overcomes this challenge by initializing the model with weights of a pretrained model of the same size and specification to increase the convergence and training speed. However, what if no pretrained model of the required size is available? In this paper, we introduce a simple yet effective technique to transfer the knowledge of a pretrained model to smaller variants. Our approach called weight subcloning expedites the training of scaled-down transformers by initializing their weights from larger pretrained models. Weight subcloning involves an operation on the pretrained model to obtain the equivalent initialized scaled-down model. It consists of two key steps: first, we introduce neuron importance ranking to decrease the embedding dimension per layer in the pretrained model. Then, we remove blocks from the transformer model to match the number of layers in the scaled-down network. The result is a network ready to undergo training, which gains significant improvements in training speed compared to random initialization. For instance, we achieve 4x faster training for vision transformers in image classification and language models designed for next token prediction.

Submitted 14 December, 2023; originally announced December 2023.

arXiv:2311.16240 [pdf, other]

Quantum hard disks on a lattice

Authors: Vighnesh Dattatraya Naik, Fabian Ballar Trigueros, Markus Heyl

Abstract: We formulate a quantum version of the hard-disk problem on lattices, which exhibits a natural realization in systems of Rydberg atoms. We find that quantum hard disks exhibit unique dynamical quantum features. In 1D, the crystal melting process displays ballistic behavior as opposed to classical sub-diffusion. For 2D, crystal structures remain intact against most defects, whereas classically they are washed out completely. We link this peculiar quantum behavior to quantum many-body scars. Our study highlights the potential of constrained 2D quantum matter to display unique dynamical behaviors.

Submitted 15 January, 2024; v1 submitted 27 November, 2023; originally announced November 2023.

Comments: 9 pages, 10 figures

arXiv:2309.00964 [pdf, other]

eDKM: An Efficient and Accurate Train-time Weight Clustering for Large Language Models

Authors: Minsik Cho, Keivan A. Vahid, Qichen Fu, Saurabh Adya, Carlo C Del Mundo, Mohammad Rastegari, Devang Naik, Peter Zatloukal

Abstract: Since Large Language Models or LLMs have demonstrated high-quality performance on many complex language tasks, there is a great interest in bringing these LLMs to mobile devices for faster responses and better privacy protection. However, the size of LLMs (i.e., billions of parameters) requires highly effective compression to fit into storage-limited devices. Among many compression techniques, weight-clustering, a form of non-linear quantization, is one of the leading candidates for LLM compression, and supported by modern smartphones. Yet, its training overhead is prohibitively significant for LLM fine-tuning. Especially, Differentiable KMeans Clustering, or DKM, has shown the state-of-the-art trade-off between compression ratio and accuracy regression, but its large memory complexity makes it nearly impossible to apply to train-time LLM compression. In this paper, we propose a memory-efficient DKM implementation, eDKM, powered by novel techniques to reduce the memory footprint of DKM by orders of magnitude. For a given tensor to be saved on CPU for the backward pass of DKM, we compressed the tensor by applying uniquification and sharding after checking if there is no duplicated tensor previously copied to CPU. Our experimental results demonstrate that eDKM can fine-tune and compress a pretrained LLaMA 7B model from 12.6 GB to 2.5 GB (3bit/weight) with the Alpaca dataset by reducing the train-time memory footprint of a decoder layer by 130$\times$, while delivering good accuracy on broader LLM benchmarks (i.e., 77.7% for PIQA, 66.1% for Winogrande, and so on).

Submitted 13 September, 2023; v1 submitted 2 September, 2023; originally announced September 2023.

Comments: preprint
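
The eDKM abstract describes weight clustering as a form of non-linear quantization at 3 bits per weight. As a rough illustration of what that storage format means, the sketch below runs plain, non-differentiable k-means on a flat weight vector and keeps one codebook of 2^3 = 8 centroids plus a small integer index per weight. It is not DKM or eDKM: the quantile initialization, the iteration count, and the name `cluster_weights` are assumptions made for the example.

```python
import numpy as np

def cluster_weights(w, bits=3, iters=20):
    """Plain k-means on a 1-D weight vector; returns (codebook, indices)."""
    k = 2 ** bits
    # Assumed initialization: evenly spaced quantiles of the weights.
    centroids = np.quantile(w, np.linspace(0.0, 1.0, k))
    for _ in range(iters):
        # Assign each weight to its nearest centroid.
        idx = np.abs(w[:, None] - centroids[None, :]).argmin(axis=1)
        # Move each centroid to the mean of its members (skip empty clusters).
        for c in range(k):
            members = w[idx == c]
            if members.size:
                centroids[c] = members.mean()
    idx = np.abs(w[:, None] - centroids[None, :]).argmin(axis=1)
    return centroids, idx.astype(np.uint8)

rng = np.random.default_rng(0)
w = rng.standard_normal(4096).astype(np.float32)

centroids, idx = cluster_weights(w)
w_hat = centroids[idx]  # reconstructed weights take at most 8 distinct values
print(len(np.unique(w_hat)), float(np.mean((w - w_hat) ** 2)))
```

Each index needs only 3 bits once packed (the sketch stores them unpacked as `uint8` for simplicity), so the per-weight cost drops from 16 or 32 bits to 3 bits plus a shared codebook.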

arXiv:2309.00140 [pdf, other]

doi 10.1109/ICASSP48485.2024.10447485

Improving vision-inspired keyword spotting using dynamic module skipping in streaming conformer encoder

Authors: Alexandre Bittar, Paul Dixon, Mohammad Samragh, Kumari Nishu, Devang Naik

Abstract: Using a vision-inspired keyword spotting framework, we propose an architecture with input-dependent dynamic depth capable of processing streaming audio. Specifically, we extend a conformer encoder with trainable binary gates that allow us to dynamically skip network modules according to the input audio. Our approach improves detection and localization accuracy on continuous speech using Librispeech top-1000 most frequent words while maintaining a small memory footprint. The inclusion of gates also reduces the average amount of processing without affecting the overall performance. These benefits are shown to be even more pronounced using the Google speech commands dataset placed over background noise where up to 97% of the processing is skipped on non-speech inputs, therefore making our method particularly interesting for an always-on keyword spotter.

Submitted 31 August, 2023; originally announced September 2023.

Journal ref: ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

arXiv:2308.06472 [pdf, other]

Flexible Keyword Spotting based on Homogeneous Audio-Text Embedding

Authors: Kumari Nishu, Minsik Cho, Paul Dixon, Devang Naik

Abstract: Spotting user-defined/flexible keywords represented in text frequently uses an expensive text encoder for joint analysis with an audio encoder in an embedding space, which can suffer from heterogeneous modality representation (i.e., large mismatch) and increased complexity. In this work, we propose a novel architecture to efficiently detect arbitrary keywords based on an audio-compliant text encoder which inherently has homogeneous representation with audio embedding, and it is also much smaller than a compatible text encoder. Our text encoder converts the text to phonemes using a grapheme-to-phoneme (G2P) model, and then to an embedding using representative phoneme vectors, extracted from the paired audio encoder on rich speech datasets. We further augment our method with confusable keyword generation to develop an audio-text embedding verifier with strong discriminative power. Experimental results show that our scheme outperforms the state-of-the-art results on Libriphrase hard dataset, increasing Area Under the ROC Curve (AUC) metric from 84.21% to 92.7% and reducing Equal-Error-Rate (EER) metric from 23.36% to 14.4%.

Submitted 12 August, 2023; originally announced August 2023.

arXiv:2306.05245 [pdf, other]

Matching Latent Encoding for Audio-Text based Keyword Spotting

Authors: Kumari Nishu, Minsik Cho, Devang Naik

Abstract: Using audio and text embeddings jointly for Keyword Spotting (KWS) has shown high-quality results, but the key challenge of how to semantically align two embeddings for multi-word keywords of different sequence lengths remains largely unsolved. In this paper, we propose an audio-text-based end-to-end model architecture for flexible keyword spotting (KWS), which builds upon learned audio and text embeddings. Our architecture uses a novel dynamic programming-based algorithm, Dynamic Sequence Partitioning (DSP), to optimally partition the audio sequence into the same length as the word-based text sequence using the monotonic alignment of spoken content. Our proposed model consists of an encoder block to get audio and text embeddings, a projector block to project individual embeddings to a common latent space, and an audio-text aligner containing a novel DSP algorithm, which aligns the audio and text embeddings to determine if the spoken content is the same as the text. Experimental results show that our DSP is more effective than other partitioning schemes, and the proposed architecture outperformed the state-of-the-art results on the public dataset in terms of Area Under the ROC Curve (AUC) and Equal-Error-Rate (EER) by 14.4% and 28.9%, respectively.

Submitted 8 June, 2023; originally announced June 2023.

arXiv:2305.11203 [pdf, other]

PDP: Parameter-free Differentiable Pruning is All You Need

Authors: Minsik Cho, Saurabh Adya, Devang Naik

Abstract: DNN pruning is a popular way to reduce the size of a model, improve the inference latency, and minimize the power consumption on DNN accelerators. However, existing approaches might be too complex, expensive or ineffective to apply to a variety of vision/language tasks, DNN architectures and to honor structured pruning constraints. In this paper, we propose an efficient yet effective train-time pruning scheme, Parameter-free Differentiable Pruning (PDP), which offers state-of-the-art qualities in model size, accuracy, and training cost. PDP uses a dynamic function of weights during training to generate soft pruning masks for the weights in a parameter-free manner for a given pruning target. While differentiable, the simplicity and efficiency of PDP make it universal enough to deliver state-of-the-art random/structured/channel pruning results on various vision and natural language tasks. For example, for MobileNet-v1, PDP can achieve 68.2% top-1 ImageNet1k accuracy at 86.6% sparsity, which is 1.7% higher accuracy than those from the state-of-the-art algorithms. Also, PDP yields over 83.1% accuracy on Multi-Genre Natural Language Inference with 90% sparsity for BERT, while the next best from the existing techniques shows 81.5% accuracy. In addition, PDP can be applied to structured pruning, such as N:M pruning and channel pruning. For 1:4 structured pruning of ResNet18, PDP improved the top-1 ImageNet1k accuracy by over 3.6% over the state-of-the-art. For channel pruning of ResNet50, PDP reduced the top-1 ImageNet1k accuracy by 0.6% from the state-of-the-art.

Submitted 17 November, 2023; v1 submitted 18 May, 2023; originally announced May 2023.

Journal ref: NeurIPS 2023

arXiv:2301.04896 [pdf]

doi 10.1364/OE.484049

Fabrication and characterization of iodine photonic microcell for sub-Doppler spectroscopy and laser stabilization

Authors: Clement Goicoechea, Thomas Billotte, Matthieu Chafer, Martin Maurel, Jenny Jouin, Philippe Thomas, Devang Naik, Frederic Gerome, Benoit Debord, Fetah Benabid

Abstract: We report on the development of all-fiber stand-alone Iodine-filled Photonic Microcells demonstrating record absorption contrast at room temperature. The microcell's fiber is made of inhibited coupling guiding hollow-core photonic crystal fibers. The fiber-core loading with Iodine was undertaken at 10$^{-1}$ to 10$^{-2}$ mbar vapor pressure using a novel gas-manifold based on metallic vacuum parts with ceramic coated inner surfaces for corrosion resistance. The fiber is then sealed on the tips and mounted on FC/APC connectors for better integration with standard fiber components. The stand-alone microcells display Doppler lines with contrasts up to 73% in the 633 nm wavelength range, and an insertion loss between 3 and 4 dB. Sub-Doppler spectroscopy based on saturable absorption has been carried out to resolve the hyperfine structure of the P(33)6-3 lines at room temperature with a full-width at half maximum of 24 MHz on the b4 component with the help of lock-in amplification. Also, we demonstrate distinguishable hyperfine components on the R(39)6-3 line at room temperature without any recourse to signal-to-noise ratio amplification techniques.

Submitted 12 January, 2023; originally announced January 2023.

Comments: 11 pages, 4 figures

arXiv:2210.15425 [pdf, other]

HEiMDaL: Highly Efficient Method for Detection and Localization of wake-words

Authors: Arnav Kundu, Mohammad Samragh Razlighi, Minsik Cho, Priyanka Padmanabhan, Devang Naik

Abstract: Streaming keyword spotting is a widely used solution for activating voice assistants. Deep Neural Networks with Hidden Markov Model (DNN-HMM) based methods have proven to be efficient and widely adopted in this space, primarily because of the ability to detect and identify the start and end of the wake-up word at low compute cost. However, such hybrid systems suffer from loss metric mismatch when the DNN and HMM are trained independently. Sequence discriminative training cannot fully mitigate the loss-metric mismatch due to the inherent Markovian style of the operation. We propose a low-footprint CNN model, called HEiMDaL, to detect and localize keywords in streaming conditions. We introduce an alignment-based classification loss to detect the occurrence of the keyword along with an offset loss to predict the start of the keyword. HEiMDaL shows 73% reduction in detection metrics along with equivalent localization accuracy and with the same memory footprint as existing DNN-HMM style models for a given wake-word.

Submitted 26 October, 2022; originally announced October 2022.

arXiv:2210.13567 [pdf, ps, other]

I see what you hear: a vision-inspired method to localize words

Authors: Mohammad Samragh, Arnav Kundu, Ting-Yao Hu, Minsik Cho, Aman Chadha, Ashish Shrivastava, Oncel Tuzel, Devang Naik

Abstract: This paper explores the possibility of using visual object detection techniques for word localization in speech data. Object detection has been thoroughly studied in the contemporary literature for visual data. Noting that audio can be interpreted as a 1-dimensional image, object localization techniques can be fundamentally useful for word localization. Building upon this idea, we propose a lightweight solution for word detection and localization. We use bounding box regression for word localization, which enables our model to detect the occurrence, offset, and duration of keywords in a given audio stream. We experiment with LibriSpeech and train a model to localize 1000 words. Compared to existing work, our method reduces model size by 94%, and improves the F1 score by 6.5%.

Submitted 24 October, 2022; originally announced October 2022.

arXiv:2209.09504 [pdf, other]

doi 10.1142/S2251171722500088

Performance Analysis Techniques for Real-time Broadband RFI Filtering System of uGMRT

Authors: Kaushal D. Buch, Ruta Kale, Kishor D. Naik, Rahul Aragade, Mekhala Muley, Sanjay Kudale, Ajith Kumar B

Abstract: Electromagnetic radiation from human activities, known as man-made Radio Frequency Interference (RFI), adversely affects radio astronomy observations. In the vicinity of the Upgraded Giant Metrewave Radio Telescope (uGMRT) array, the sparking on power lines is the major cause of interference at observing frequencies less than 800 MHz. A real-time broadband RFI detection and filtering system is implemented as part of the uGMRT wideband signal processing backend to mitigate the effect of broadband RFI. Performance analysis techniques used for testing and commissioning the system for observations in the beamformer and correlator modes of the uGMRT are presented. The concept and implementation of recording simultaneous unfiltered and filtered data along with data analysis and interpretation is illustrated using an example. For the beamformer mode, the spectrogram, a single spectral channel, and its Fourier transform are used for performance analysis, whereas in the correlator mode, analysis of the cross-correlation function, closure phase, and visibilities from the simultaneously recorded unfiltered and filtered data is carried out. These techniques are used for testing the performance of the broadband RFI filter and releasing it for uGMRT users.

Submitted 20 September, 2022; originally announced September 2022.

Comments: 14 pages, 5 figures

Journal ref: Journal of Astronomical Instrumentation, Vol. 11, No. 2 (2022) 2250008 (12 pages)

arXiv:2106.11388 [pdf, other]

How well do you know your summarization datasets?

Authors: Priyam Tejaswin, Dhruv Naik, Pengfei Liu

Abstract: State-of-the-art summarization systems are trained and evaluated on massive datasets scraped from the web. Despite their prevalence, we know very little about the underlying characteristics (data noise, summarization complexity, etc.) of these datasets, and how these affect system performance and the reliability of automatic metrics like ROUGE. In this study, we manually analyze 600 samples from three popular summarization datasets. Our study is driven by a six-class typology which captures different noise types (missing facts, entities) and degrees of summarization difficulty (extractive, abstractive). We follow with a thorough analysis of 27 state-of-the-art summarization models and 5 popular metrics, and report our key insights: (1) Datasets have distinct data quality and complexity distributions, which can be traced back to their collection process. (2) The performance of models and reliability of metrics is dependent on sample complexity. (3) Faithful summaries often receive low scores because of the poor diversity of references. We release the code, annotated data and model outputs.

Submitted 21 June, 2021; originally announced June 2021.

Comments: Accepted into Findings of ACL-IJCNLP 2021

arXiv:2102.02166 [pdf, other]

doi 10.1103/PhysRevLett.127.013202

Fast control of atom-light interaction in a narrow linewidth cavity

Authors: A. Bertoldi, C. -H. Feng, D. S. Naik, B. Canuel, P. Bouyer, M. Prevedelli

Abstract: We propose a method to exploit high finesse optical resonators for light assisted coherent manipulation of atomic ensembles, overcoming the limit imposed by the finite response time of the cavity. The key element of our scheme is to rapidly switch the interaction between the atoms and the cavity field with an auxiliary control process as, for example, the light shift induced by an optical beam. The scheme is applicable to many different atomic species, both in trapped and free fall configurations, and can be adopted to control the internal and/or external atomic degrees of freedom. Our method will open new possibilities in cavity-aided atom interferometry and in the preparation of highly non-classical atomic states.

Submitted 3 February, 2021; originally announced February 2021.

Comments: 5 pages, 3 figures

Journal ref: Phys. Rev. Lett. 127, 013202 (2021)

arXiv:2011.09324 [pdf, other]

doi 10.1063/1.5129595

A control hardware based on a field programmable gate array for experiments in atomic physics

Authors: A. Bertoldi, C. -H. Feng, H. Eneriz Imaz, M. Carey, D. S. Naik, J. Junca, X. Zou, D. O. Sabulsky, B. Canuel, P. Bouyer, M. Prevedelli

Abstract: Experiments in Atomic, Molecular, and Optical (AMO) physics require precise and accurate control of digital, analog, and radio frequency (RF) signals. We present a control hardware based on a field programmable gate array (FPGA) core which drives various modules via a simple interface bus. The system supports an operating frequency of 10 MHz and a memory depth of 8 M (2$^{23}$) instructions, both easily scalable. Successive experimental sequences can be stacked with no dead time and synchronized with external events at any instruction. Two or more units can be cascaded and synchronized to a common clock, a feature useful to operate large experimental setups in a modular way.

Submitted 18 November, 2020; originally announced November 2020.

Journal ref: Review of Scientific Instruments 91, 033203 (2020)

arXiv:2011.01151 [pdf, other]

Optimize what matters: Training DNN-HMM Keyword Spotting Model Using End Metric

Authors: Ashish Shrivastava, Arnav Kundu, Chandra Dhir, Devang Naik, Oncel Tuzel

Abstract: Deep Neural Network--Hidden Markov Model (DNN-HMM) based methods have been successfully used for many always-on keyword spotting algorithms that detect a wake word to trigger a device. The DNN predicts the state probabilities of a given speech frame, while the HMM decoder combines the DNN predictions of multiple speech frames to compute the keyword detection score. The DNN, in prior methods, is trained independently of the HMM parameters to minimize the cross-entropy loss between the predicted and the ground-truth state probabilities. The mismatch between the DNN training loss (cross-entropy) and the end metric (detection score) is the main source of sub-optimal performance for the keyword spotting task. We address this loss-metric mismatch with a novel end-to-end training strategy that learns the DNN parameters by optimizing for the detection score. To this end, we make the HMM decoder (dynamic programming) differentiable and back-propagate through it to maximize the score for the keyword and minimize the scores for non-keyword speech segments. Our method does not require any change in the model architecture or the inference framework; therefore, there is no overhead in run-time memory or compute requirements. Moreover, we show significant reduction in false rejection rate (FRR) at the same false trigger experience (> 70% over independent DNN training).

Submitted 25 February, 2021; v1 submitted 2 November, 2020; originally announced November 2020.

Comments: Accepted at ICASSP 2021

arXiv:2010.10591 [pdf, other]

Knowledge Transfer for Efficient On-device False Trigger Mitigation

Authors: Pranay Dighe, Erik Marchi, Srikanth Vishnubhotla, Sachin Kajarekar, Devang Naik

Abstract: In this paper, we address the task of determining whether a given utterance is directed towards a voice-enabled smart-assistant device or not. An undirected utterance is termed as a "false trigger" and false trigger mitigation (FTM) is essential for designing a privacy-centric non-intrusive smart assistant. The directedness of an utterance can be identified by running automatic speech recognition (ASR) on it and determining the user intent by analyzing the ASR transcript. But in case of a false trigger, transcribing the audio using ASR itself is strongly undesirable. To alleviate this issue, we propose an LSTM-based FTM architecture which determines the user intent from acoustic features directly without explicitly generating ASR transcripts from the audio. The proposed models are small footprint and can be run on-device with limited computational resources. During training, the model parameters are optimized using a knowledge transfer approach where a more accurate self-attention graph neural network model serves as the teacher. Given the whole audio snippets, our approach mitigates 87% of false triggers at 99% true positive rate (TPR), and in a streaming audio scenario, the system listens to only 1.69s of the false trigger audio before rejecting it while achieving the same TPR.

Submitted 20 October, 2020; originally announced October 2020.

arXiv:2008.08113 [pdf, other]

Complementary Language Model and Parallel Bi-LRNN for False Trigger Mitigation

Authors: Rishika Agarwal, Xiaochuan Niu, Pranay Dighe, Srikanth Vishnubhotla, Sameer Badaskar, Devang Naik

Abstract: False triggers in voice assistants are unintended invocations of the assistant, which not only degrade the user experience but may also compromise privacy. False trigger mitigation (FTM) is a process to detect the false trigger events and respond appropriately to the user. In this paper, we propose a novel solution to the FTM problem by introducing a parallel ASR decoding process with a special language model trained from "out-of-domain" data sources. Such a language model is complementary to the existing language model optimized for the assistant task. A bidirectional lattice RNN (Bi-LRNN) classifier trained from the lattices generated by the complementary language model shows a $38.34\%$ relative reduction of the false trigger (FT) rate at the fixed rate of $0.4\%$ false suppression (FS) of correct invocations, compared to the current Bi-LRNN model. In addition, we propose to train a parallel Bi-LRNN model based on the decoding lattices from both language models, and examine various ways of implementation. The resulting model leads to further reduction in the false trigger rate by $10.8\%$.

Submitted 18 August, 2020; originally announced August 2020.

arXiv:2004.12031 [pdf, ps, other]

On the Role of Visual Cues in Audiovisual Speech Enhancement

Authors: Zakaria Aldeneh, Anushree Prasanna Kumar, Barry-John Theobald, Erik Marchi, Sachin Kajarekar, Devang Naik, Ahmed Hussen Abdelaziz

Abstract: We present an introspection of an audiovisual speech enhancement model. In particular, we focus on interpreting how a neural audiovisual speech enhancement model uses visual cues to improve the quality of the target speech signal. We show that visual cues provide not only high-level information about speech activity, i.e., speech/silence, but also fine-grained visual information about the place of articulation. One byproduct of this finding is that the learned visual embeddings can be used as features for other visual speech applications. We demonstrate the effectiveness of the learned visual embeddings for classifying visemes (the visual analogy to phonemes). Our results provide insight into important aspects of audiovisual speech enhancement and demonstrate how such models can be used for self-supervision tasks for visual speech applications.

Submitted 25 February, 2021; v1 submitted 24 April, 2020; originally announced April 2020.

Comments: ICASSP 2021

arXiv:2004.01189 [pdf, other]

doi 10.1103/PRXQuantum.2.010325

Non-Gaussianity as a signature of a quantum theory of gravity

Authors: Richard Howl, Vlatko Vedral, Devang Naik, Marios Christodoulou, Carlo Rovelli, Aditya Iyer

Abstract: Table-top tests of quantum gravity (QG) have long been thought to be practically impossible. However, remarkably, due to rapid progress in quantum information science (QIS), such tests may soon be achievable. Here, we uncover an exciting new theoretical link between QG and QIS that also leads to a radical new way of testing QG with QIS experiments. Specifically, we find that only a quantum, not classical, theory of gravity can create non-Gaussianity, a QIS resource that is necessary for universal quantum computation, in the quantum field state of matter. This allows for tests based on QIS in which non-Gaussianity in matter is used as a signature of QG. In comparison to previous studies of testing QG with QIS where entanglement is used to witness QG when all other quantum interactions are excluded, our non-Gaussianity witness cannot be created by direct classical gravity interactions, facilitating tests that are not constrained by the existence of such processes. Our new signature of QG also enables tests that are based on just a single rather than multi-partite quantum system, simplifying previously considered experimental setups. We describe a table-top test of QG that uses our non-Gaussianity signature and which is based on just a single quantum system, a Bose-Einstein condensate (BEC), in a single location. In contrast to proposals based on opto-mechanical setups, BECs have already been manipulated into massive non-classical states, aiding the prospect of testing QG in the near future.

Submitted 20 July, 2021; v1 submitted 2 April, 2020; originally announced April 2020.

Comments: 34 pages

Journal ref: PRX Quantum 2, 010325 (2021)

arXiv:2003.00786 [pdf, ps, other]

doi 10.1142/S0219887820501054

Riemann solitons and almost Riemann solitons on almost Kenmotsu manifolds

Authors: V. Venkatesha, H. Aruna Kumara, Devaraja Mallesha Naik

Abstract: The aim of this article is to study the Riemann soliton and gradient almost Riemann soliton on a certain class of almost Kenmotsu manifolds. Also some suitable examples of Kenmotsu and $(κ,μ)'$-almost Kenmotsu manifolds are constructed to justify our results.

Submitted 2 March, 2020; originally announced March 2020.

MSC Class: 53C25; 53C15; 53D15

arXiv:2002.01323 [pdf, other]

Detecting Emotion Primitives from Speech and their use in discerning Categorical Emotions

Authors: Vasudha Kowtha, Vikramjit Mitra, Chris Bartels, Erik Marchi, Sue Booker, William Caruso, Sachin Kajarekar, Devang Naik

Abstract: Emotion plays an essential role in human-to-human communication, enabling us to convey feelings such as happiness, frustration, and sincerity. While modern speech technologies rely heavily on speech recognition and natural language understanding for speech content understanding, the investigation of vocal expression is increasingly gaining attention. Key considerations for building robust emotion models include characterizing and improving the extent to which a model, given its training data distribution, is able to generalize to unseen data conditions. This work investigated a long short-term memory (LSTM) network and a time convolution - LSTM (TC-LSTM) to detect primitive emotion attributes such as valence, arousal, and dominance, from speech. It was observed that training with multiple datasets and using robust features improved the concordance correlation coefficient (CCC) for valence, by 30% with respect to the baseline system. Additionally, this work investigated how emotion primitives can be used to detect categorical emotions such as happiness, disgust, contempt, anger, and surprise from neutral speech, and results indicated that arousal, followed by dominance was a better detector of such emotions.

Submitted 30 January, 2020; originally announced February 2020.

Comments: 5 pages

arXiv:2001.10822 [pdf, other]

Lattice-based Improvements for Voice Triggering Using Graph Neural Networks

Authors: Pranay Dighe, Saurabh Adya, Nuoyu Li, Srikanth Vishnubhotla, Devang Naik, Adithya Sagar, Ying Ma, Stephen Pulman, Jason Williams

Abstract: Voice-triggered smart assistants often rely on detection of a trigger-phrase before they start listening for the user request. Mitigation of false triggers is an important aspect of building a privacy-centric non-intrusive smart assistant. In this paper, we address the task of false trigger mitigation (FTM) using a novel approach based on analyzing automatic speech recognition (ASR) lattices using graph neural networks (GNN). The proposed approach uses the fact that decoding lattice of a falsely triggered audio exhibits uncertainties in terms of many alternative paths and unexpected words on the lattice arcs as compared to the lattice of a correctly triggered audio. A pure trigger-phrase detector model doesn't fully utilize the intent of the user speech whereas by using the complete decoding lattice of user audio, we can effectively mitigate speech not intended for the smart assistant. We deploy two variants of GNNs in this paper based on 1) graph convolution layers and 2) self-attention mechanism respectively. Our experiments demonstrate that GNNs are highly accurate in FTM task by mitigating ~87% of false triggers at 99% true positive rate (TPR). Furthermore, the proposed models are fast to train and efficient in parameter requirements.

Submitted 24 January, 2020; originally announced January 2020.

arXiv:2001.10816 [pdf, other]

doi 10.1109/ICASSP40776.2020.9054760

Multi-task Learning for Speaker Verification and Voice Trigger Detection

Authors: Siddharth Sigtia, Erik Marchi, Sachin Kajarekar, Devang Naik, John Bridle

Abstract: Automatic speech transcription and speaker recognition are usually treated as separate tasks even though they are interdependent. In this study, we investigate training a single network to perform both tasks jointly. We train the network in a supervised multi-task learning setup, where the speech transcription branch of the network is trained to minimise a phonetic connectionist temporal classification (CTC) loss while the speaker recognition branch of the network is trained to label the input sequence with the correct label for the speaker. We present a large-scale empirical study where the model is trained using several thousand hours of labelled training data for each task. We evaluate the speech transcription branch of the network on a voice trigger detection task while the speaker recognition branch is evaluated on a speaker verification task. Results demonstrate that the network is able to encode both phonetic and speaker information in its learnt representations while yielding accuracies at least as good as the baseline models for each task, with the same number of parameters as the independent models.

Submitted 26 January, 2020; originally announced January 2020.

Journal ref: International Conference on Acoustics, Speech and Signal Processing (ICASSP), Spain, 2020, pp. 6844-6848

arXiv:1910.12849 [pdf, other]

doi 10.1103/PhysRevResearch.2.013212

Loading and Cooling in an Optical Trap via Hyperfine Dark States

Authors: D. S. Naik, H. Eneriz-Imaz, M. Carey, T. Freegarde, F. Minardi, B. Battelier, P. Bouyer, A. Bertoldi

Abstract: We present a novel optical cooling scheme that relies on hyperfine dark states to enhance loading and cooling atoms inside deep optical dipole traps. We demonstrate a seven-fold increase in the number of atoms loaded in the conservative potential with strongly shifted excited states. In addition, we use the energy selective dark-state to efficiently cool the atoms trapped inside the conservative potential rapidly and without losses. Our findings open the door to optically assisted cooling of trapped atoms and molecules which lack the closed cycling transitions normally needed to achieve low temperatures and the high initial densities required for evaporative cooling.

Submitted 25 November, 2019; v1 submitted 28 October, 2019; originally announced October 2019.

Comments: 5 pages, 5 figures + supplemental material 2 pages

Journal ref: Phys. Rev. Research 2, 013212 (2020)

arXiv:1907.00112 [pdf]

Leveraging Acoustic Cues and Paralinguistic Embeddings to Detect Expression from Voice

Authors: Vikramjit Mitra, Sue Booker, Erik Marchi, David Scott Farrar, Ute Dorothea Peitz, Bridget Cheng, Ermine Teves, Anuj Mehta, Devang Naik

Abstract: Millions of people reach out to digital assistants such as Siri every day, asking for information, making phone calls, seeking assistance, and much more. The expectation is that such assistants should understand the intent of the user's query. Detecting the intent of a query from a short, isolated utterance is a difficult task. Intent cannot always be obtained from speech-recognized transcriptions. A transcription driven approach can interpret what has been said but fails to acknowledge how it has been said, and as a consequence, may ignore the expression present in the voice. Our work investigates whether a system can reliably detect vocal expression in queries using acoustic and paralinguistic embedding. Results show that the proposed method offers a relative equal error rate (EER) decrease of 60% compared to a bag-of-word based system, corroborating that expression is significantly represented by vocal attributes, rather than being purely lexical. Addition of emotion embedding helped to reduce the EER by 30% relative to the acoustic embedding, demonstrating the relevance of emotion in expressive voice.

Submitted 28 June, 2019; originally announced July 2019.

Comments: 5 pages, 6 figures

arXiv:1906.10063 [pdf, other]

doi 10.1103/PhysRevLett.123.240402

All-Optical Bose-Einstein Condensates in Microgravity

Authors: Gabriel Condon, Martin Rabault, Brynle Barrett, Laure Chichet, Romain Arguel, Hodei Eneriz-Imaz, Devang Naik, Andrea Bertoldi, Baptiste Battelier, Arnaud Landragin, Philippe Bouyer

Abstract: We report on the all-optical production of Bose-Einstein condensates in microgravity using a combination of grey molasses cooling, light-shift engineering and optical trapping in a painted potential. Forced evaporative cooling in a 3-m high Einstein elevator results in $4 \times 10^4$ condensed atoms every 13.5 s, with a temperature as low as 35 nK. In this system, the atomic cloud can expand in weightlessness for up to 400 ms, paving the way for atom interferometry experiments with extended interrogation times and studies of ultra-cold matter physics at low energies on ground or in Space.

Submitted 25 June, 2019; v1 submitted 24 June, 2019; originally announced June 2019.

Comments: 4 pages + references, 5 figures

Journal ref: Phys. Rev. Lett. 123, 240402 (2019)

arXiv:1901.05222 [pdf, ps, other]

doi 10.1515/ms-2017-0321

$*$-Ricci solitons and gradient almost $*$-Ricci solitons on Kenmotsu manifolds

Authors: Venkatesha Venkatesh, Devaraja Mallesha Naik, H Aruna Kumara

Abstract: In this paper, we consider $*$-Ricci soliton in the framework of Kenmotsu manifolds. First, we prove that if the metric of a Kenmotsu manifold $M$ is a $*$-Ricci soliton, then soliton constant $λ$ is zero. For 3-dimensional case, if $M$ admits a $*$-Ricci soliton, then we show that $M$ is of constant sectional curvature -1. Next, we show that if $M$ admits a $*$-Ricci soliton whose potential vector field is collinear with the characteristic vector field $ξ$, then $M$ is Einstein and soliton vector field is equal to $ξ$. Finally, we prove that if $g$ is a gradient almost $*$-Ricci soliton, then either $M$ is Einstein or the potential vector field is collinear with the characteristic vector field on an open set of $M$. We verify our result by constructing examples for both $*$-Ricci soliton and gradient almost $*$-Ricci soliton.

Submitted 16 January, 2019; originally announced January 2019.

Comments: 12 pages

MSC Class: 53C25; 53C44; 53D10; 53D15

Journal ref: Math. Slovaca, Volume 69, Issue 6, Pages 1447-1458, (2019)

arXiv:1808.06090 [pdf, ps, other]

doi 10.18514/MMN.2019.2905

Certain results on Kenmotsu pseudo-metric manifolds

Authors: Devaraja Mallesha Naik, Venkatesha, D. G. Prakasha

Abstract: In this paper, a systematic study of Kenmotsu pseudo-metric manifolds is introduced. After studying the properties of these manifolds, we provide necessary and sufficient condition for Kenmotsu pseudo-metric manifold to have constant $\varphi$-sectional curvature, and prove the structure theorem for $ξ$-conformally flat and $\varphi$-conformally flat Kenmotsu pseudo-metric manifolds. Next, we consider Ricci solitons on these manifolds. In particular, we prove that an $η$-Einstein Kenmotsu pseudo-metric manifold of dimension higher than 3 admitting a Ricci soliton is Einstein, and a Kenmotsu pseudo-metric 3-manifold admitting a Ricci soliton is of constant curvature $-\varepsilon$.

Submitted 24 December, 2019; v1 submitted 18 August, 2018; originally announced August 2018.

Comments: 17 pages

Journal ref: Miskolc Math Notes, Vol. 20, No. 2, pp. 1083-1099, (2019)

arXiv:1805.12384 [pdf, ps, other]

Certain results on almost contact pseudo-metric manifolds

Authors: Venkatesha, Devaraja Mallesha Naik, Mukut Mani Tripathi

Abstract: We study the geometry of almost contact pseudo-metric manifolds in terms of tensor fields $h:=\frac{1}{2}£_ξ\varphi$ and $\ell := R(\cdot,ξ)ξ$, emphasizing analogies and differences with respect to the contact metric case. Certain identities involving $ξ$-sectional curvatures are obtained. We establish necessary and sufficient condition for a nondegenerate almost $CR$ structure $(\mathcal{H}(M), J, θ)$ corresponding to almost contact pseudo-metric manifold $M$ to be $CR$ manifold. Finally, we prove that a contact pseudo-metric manifold $(M,\varphi,ξ,η,g)$ is Sasakian if and only if the corresponding nondegenerate almost $CR$ structure $(\mathcal{H}(M), J)$ is integrable and $J$ is parallel along $ξ$ with respect to the Bott partial connection.

Submitted 31 May, 2018; originally announced May 2018.

MSC Class: 53C15; 53C25; 53D10

arXiv:1712.06491 [pdf, other]

BEC array in a Malleable Optical Trap formed in a Traveling Wave Cavity

Authors: D. S. Naik, G. Kuyumjyan, D. Pandey, P. Bouyer, A. Bertoldi

Abstract: Although quantum degenerate gases of neutral atoms have shown remarkable progress in the study of many body quantum physics, condensed matter physics, precision measurements, and quantum information processing, experimental progress is needed in order to reach their full potential in these fields. More complex spatial geometries as well as novel methods for engineering interesting interactions are needed. Here we demonstrate a novel experimental platform for the realization of quantum degenerate gases with a wide range of tunability in the spatial geometries experienced by the atoms and with the possibility of non-trivial long-range interactions both within and between multiple 87Rb Bose-Einstein condensates (BECs). We explore the use of a large mode-volume bow-tie ring cavity resonant at two wavelengths, $λ$ = 1560 and 780 nm, for the creation of multiple BECs within a Malleable optical trap which also possesses the ability of photon-mediated long-range interactions. By exciting diverse transverse modes at 1560 nm, we can realize many optical trapping geometries which can open the door to spatial quantum state engineering with cavity-coupled BECs. As representative examples we realize a BEC in the fundamental TEM00 and a double BEC in the TEM01 mode of the cavity. By controlling the power between the fundamental and the higher transverse cavity mode, splitting and merging of cold thermal atomic ensemble is shown as well as the potential of creating more complex trapping geometries such as uniform potentials. Due to the double resonance of the cavity, we can envision a quantum network of BECs coupled via cavity-mediated interactions in non-trivial geometries.

Submitted 27 June, 2018; v1 submitted 18 December, 2017; originally announced December 2017.

Comments: 20 pages, 9 figures

arXiv:1709.06467 [pdf, ps, other]

doi 10.1038/s41598-018-19814-z

$Λ$-enhanced grey molasses on the $D_2$ transition of Rubidium-87 atoms

Authors: Sara Rosi, Alessia Burchianti, Stefano Conclave, Devang S. Naik, Giacomo Roati, Chiara Fort, Francesco Minardi

Abstract: Laser cooling based on dark states, i.e. states decoupled from light, has proven to be effective to increase the phase-space density of cold trapped atoms. Dark-states cooling requires open atomic transitions, in contrast to the ordinary laser cooling used for example in magneto-optical traps (MOTs), which operate on closed atomic transitions. For alkali atoms, dark-states cooling is therefore commonly operated on the $D_1$ transition $n S_{1/2}\rightarrow n P_{1/2}$. We show that, for $^{87}\text{Rb}$, thanks to the large hyperfine structure separations the use of this transition is not strictly necessary and that "quasi-dark state" cooling is efficient also on the $D_2$ line, $5 S_{1/2}\rightarrow 5 P_{3/2}$. We report temperatures as low as $(4.0\pm 0.3)\,μ$K and an increase of almost an order of magnitude in the phase space density with respect to ordinary laser sub-Doppler cooling.

Submitted 20 September, 2017; v1 submitted 19 September, 2017; originally announced September 2017.

Journal ref: Scientific Reports 8, (2018) 1031

arXiv:1106.0828 [pdf, ps, other]

doi 10.1103/PhysRevA.85.023623

Quantum dynamics of impurities in a 1D Bose gas

Authors: J. Catani, G. Lamporesi, D. Naik, M. Gring, M. Inguscio, F. Minardi, A. Kantian, T. Giamarchi

Abstract: Using a species-selective dipole potential, we create initially localized impurities and investigate their interactions with a majority species of bosonic atoms in a one-dimensional configuration during expansion. We find an interaction-dependent amplitude reduction of the oscillation of the impurities' size with no measurable frequency shift, and study it as a function of the interaction strength. We discuss possible theoretical interpretations of the data. We compare, in particular, with a polaronic mass shift model derived following Feynman variational approach.

Submitted 21 February, 2012; v1 submitted 4 June, 2011; originally announced June 2011.

Comments: 7 pages, 6 figures

Journal ref: Phys. Rev. A 85, 023623 (2012)

arXiv:1011.5192 [pdf, ps, other]

doi 10.1103/PhysRevLett.106.115304

Hydrodynamic Expansion of a Strongly Interacting Fermi-Fermi Mixture

Authors: A. Trenkwalder, C. Kohstall, M. Zaccanti, D. Naik, A. I. Sidorov, F. Schreck, R. Grimm

Abstract: We report on the expansion of a Fermi-Fermi mixture of Li-6 and K-40 atoms under conditions of strong interactions realized near the center of an interspecies Feshbach resonance. We observe two different phenomena of hydrodynamic behavior. The first one is the well-known inversion of the aspect ratio. The second one is a collective expansion, where both species stick together and despite their different masses expand jointly. Our work constitutes a first step to explore the intriguing many-body physics of this novel system.

Submitted 16 March, 2011; v1 submitted 23 November, 2010; originally announced November 2010.

Journal ref: Phys. Rev. Lett. 106, 115304 (2011)

arXiv:1010.3662 [pdf, ps, other]

doi 10.1140/epjd/e2010-10591-2

Feshbach resonances in the 6Li-40K Fermi-Fermi mixture: Elastic versus inelastic interactions

Authors: D. Naik, A. Trenkwalder, C. Kohstall, F. M. Spiegelhalder, M. Zaccanti, G. Hendl, F. Schreck, R. Grimm, T. M. Hanna, P. S. Julienne

Abstract: We present a detailed theoretical and experimental study of Feshbach resonances in the 6Li-40K mixture. Particular attention is given to the inelastic scattering properties, which have not been considered before. As an important example, we thoroughly investigate both elastic and inelastic scattering properties of a resonance that occurs near 155 G. Our theoretical predictions based on a coupled channels calculation are found in excellent agreement with the experimental results. We also present theoretical results on the molecular state that underlies the 155G resonance, in particular concerning its lifetime against spontaneous dissociation. We then present a survey of resonances in the system, fully characterizing the corresponding elastic and inelastic scattering properties. This provides the essential information to identify optimum resonances for applications relying on interaction control in this Fermi-Fermi mixture.

Submitted 24 November, 2010; v1 submitted 18 October, 2010; originally announced October 2010.

Comments: Submitted to EPJD, EuroQUAM special issues "Cold Quantum Matter - Achievements and Prospects", v2 with updated calibration of magnetic field (+4mG correction) and updated figures 4 and 6

Journal ref: Eur. Phys. J. D 65, 55-65 (2011)

arXiv:1001.5253 [pdf, other]

doi 10.1103/PhysRevA.81.043637

All-optical production of a degenerate mixture of 6Li and 40K and creation of heteronuclear molecules

Authors: F. M. Spiegelhalder, A. Trenkwalder, D. Naik, G. Kerner, E. Wille, G. Hendl, F. Schreck, R. Grimm

Abstract: We present the essential experimental steps of our all-optical approach to prepare a double-degenerate Fermi-Fermi mixture of 6Li and 40K atoms, which then serves as a starting point for molecule formation. We first describe the optimized trap loading procedures, the internal-state preparation of the sample, and the combined evaporative and sympathetic cooling process. We then discuss the preparation of the sample near an interspecies Feshbach resonance, and we demonstrate the formation of heteronuclear molecules by a magnetic field ramp across the resonance.

Submitted 28 January, 2010; originally announced January 2010.

Comments: 13 pages, 17 figures

Journal ref: Phys. Rev. A 81, 043637 (2010)

arXiv:0908.1101 [pdf, ps, other]

doi 10.1103/PhysRevLett.103.223203

Collisional Stability of 40K Immersed in a Strongly Interacting Fermi Gas of 6Li

Authors: F. M. Spiegelhalder, A. Trenkwalder, D. Naik, G. Hendl, F. Schreck, R. Grimm

Abstract: We investigate the collisional stability of a sample of 40K atoms immersed in a tunable spin mixture of 6Li atoms. In this three-component Fermi-Fermi mixture, we find very low loss rates in a wide range of interactions as long as molecule formation of 6Li is avoided. The stable fermionic mixture with two resonantly interacting spin states of one species together with another species is a promising system for a broad variety of phenomena in few- and many-body quantum physics.

Submitted 24 November, 2009; v1 submitted 7 August, 2009; originally announced August 2009.

Comments: 4 pages, 4 figures

Journal ref: Phys. Rev. Lett. 103, 223203 (2009)

arXiv:0711.2916 [pdf, other]

doi 10.1103/PhysRevLett.100.053201

Exploring an ultracold Fermi-Fermi mixture: Interspecies Feshbach resonances and scattering properties of 6Li and 40K

Authors: E. Wille, F. M. Spiegelhalder, G. Kerner, D. Naik, A. Trenkwalder, G. Hendl, F. Schreck, R. Grimm, T. G. Tiecke, J. T. M. Walraven, S. J. J. M. F. Kokkelmans, E. Tiesinga, P. S. Julienne

Abstract: We report on the observation of Feshbach resonances in an ultracold mixture of two fermionic species, 6Li and 40K. The experimental data are interpreted using a simple asymptotic bound state model and full coupled channels calculations. This unambiguously assigns the observed resonances in terms of various s- and p-wave molecular states and fully characterizes the ground-state scattering properties in any combination of spin states.

Submitted 1 February, 2008; v1 submitted 19 November, 2007; originally announced November 2007.

Comments: 4 pages, 4 figures, 1 table

Journal ref: Phys. Rev. Lett. 100, 053201 (2008)

arXiv:cond-mat/0606540 [pdf, ps, other]

doi 10.1364/OE.14.008947

Axicon Lens for Coherent Matter Waves

Authors: S. R. Muniz, S. D. Jenkins, T. A. B. Kennedy, D. S. Naik, C. Raman

Abstract: We have realized a conical matter wave lens. The repulsive potential of a focused laser beam was used to launch a Bose-Einstein condensate into a radially expanding wavepacket whose perfect ring shape was ensured by energy conservation. In spite of significant interactions between atoms, the spatial and velocity widths of the ring along its radial dimension remained extremely narrow, as also confirmed by numerical simulations. Our results open the possibility for cylindrical atom optics without the perturbing effect of mean-field interactions.

Submitted 21 June, 2006; originally announced June 2006.

Comments: 11 pages, 5 figures, Multimedia files (movies) available in our website: http://www.physics.gatech.edu/chandra/index.htm

Journal ref: Optics Express, Vol. 14, Issue 20, pp. 8947-8957 (2006)

arXiv:cond-mat/0606129 [pdf, ps, other]

doi 10.1016/j.matcom.2006.10.029

Dynamics of rotating Bose-Einstein condensates probed by Bragg scattering

Authors: S. R. Muniz, D. S. Naik, M. Bhattacharya, C. Raman

Abstract: Gaseous Bose-Einstein condensates (BECs) have become an important test bed for studying the dynamics of quantized vortices. In this work we use two-photon Doppler sensitive Bragg scattering to study the rotation of sodium BECs. We analyze the microscopic flow field and present laboratory measurements of the coarse-grained velocity profile. Unlike time-of-flight imaging, Bragg scattering is sensitive to the direction of rotation and therefore to the phase of the condensate. In addition, we have non-destructively probed the vortex flow field using a sequence of two Bragg pulses.

Submitted 5 June, 2006; originally announced June 2006.

Comments: 13 pages, 5 figures. Invited paper submitted to a special issue on "Nonlinear Waves" of the (Elsevier) journal 'Math. Comput. Simul.', for participants in the 4th IMACS International Conference on Nonlinear Evolution Equations and Wave Phenomena (2005). Visit our website at http://www.physics.gatech.edu/chandra for additional information

Journal ref: Math. Comput. Simul., 74, 397-404 (2007)

arXiv:cond-mat/0510165 [pdf, ps, other]

doi 10.1103/PhysRevA.72.051606

Metastable Bose-Einstein Condensate in a Linear Potential

Authors: D. S. Naik, S. R. Muniz, C. Raman

Abstract: We have created a Bose-Einstein condensate whose spin orientation is metastable. Condensates were transferred into a quadrupole magnetic trap, where Majorana transitions limited the lifetime to a few hundred milliseconds, about 30 times the trapping period. Atoms held in the trap frequently displayed a ring-shaped time-of-flight distribution. We speculate that such a ring could be either a quantized vortex or a feature of the Majorana loss dynamics in the quantum regime.

Submitted 7 October, 2005; v1 submitted 7 October, 2005; originally announced October 2005.

Journal ref: Phys. Rev. A 72 (5): Art. No. 051606(R) (Nov, 2005)

arXiv:cond-mat/0508326 [pdf, ps, other]

doi 10.1103/PhysRevA.73.041605

Bragg Spectroscopy of Vortex Lattices in Bose-Einstein condensates

Authors: S. R. Muniz, D. S. Naik, C. Raman

Abstract: We have measured the velocity field of a vortex lattice within a sodium Bose-Einstein condensate using Bragg scattering. The phase gradient of the macroscopic wavefunction was mapped into the spatial structure of the diffracted atom cloud, allowing for single shot measurement of the rotation parameters. A combination of spectral and spatial information yields a complete description of the superfluid flow, coarse-grained over the lattice structure, including direct and independent measurements of the rate and sense of rotation. Signatures of the microscopic quantum rotation have also been observed.

Submitted 7 December, 2005; v1 submitted 12 August, 2005; originally announced August 2005.

Comments: 5 pages, 5 Figures, A movie built from the CM data is available in our Webpage: http://www.physics.gatech.edu/chandra/index.htm; added Fig.5 presents new data, showing signatures of the microscopic vortex structure in the diffracted cloud

Journal ref: Phys. Rev. A 73, 041605(R) (2006)

arXiv:cond-mat/0406341 [pdf, ps, other]

doi 10.1103/PhysRevA.71.033617

An Optically Plugged Quadrupole Trap for Bose-Einstein Condensates

Authors: D. S. Naik, C. Raman

Abstract: We created sodium Bose-Einstein condensates in an optically plugged quadrupole magnetic trap (OPT). A focused, 532nm laser beam repelled atoms from the coil center where Majorana loss is significant. We produced condensates of up to $3 \times 10^7$ atoms, a factor of 60 improvement over previous work [1], a number comparable to the best all-magnetic traps, and transferred up to $9 \times 10^6$ atoms into a purely optical trap. Due to the tight axial confinement and azimuthal symmetry of the quadrupole coils, the OPT shows promise for creating Bose-Einstein condensates in a ring geometry.

Submitted 15 June, 2004; originally announced June 2004.

arXiv:quant-ph/9912105 [pdf, ps, other]

doi 10.1103/PhysRevLett.84.4733

Entangled state quantum cryptography: Eavesdropping on the Ekert protocol

Authors: D. S. Naik, C. G. Peterson, A. G. White, A. J. Berglund, P. G. Kwiat

Abstract: Using polarization-entangled photons from spontaneous parametric downconversion, we have implemented Ekert's quantum cryptography protocol. The near-perfect correlations of the photons allow the sharing of a secret key between two parties. The presence of an eavesdropper is continually checked by measuring Bell's inequalities. We investigated several possible eavesdropper strategies, including pseudo-quantum non-demolition measurements. In all cases, the eavesdropper's presence was readily apparent. We discuss a procedure to increase her detectability.

Submitted 22 December, 1999; originally announced December 1999.

Comments: 4 pages, 2 encapsulated postscript files, PRL (tentatively) accepted

Report number: LAUR-99-5760

Journal ref: Physical Review Letters 84, 4733-4736 (2000).

Showing 1–45 of 45 results for author: Naik, D