Search | arXiv e-print repository

Mirror Symmetry and Level-rank Duality for 3d $\mathcal{N} = 4$ Rank 0 SCFTs

Authors: Thomas Creutzig, Niklas Garner, Heeyeon Kim

Abstract: We introduce a family of 3d $\mathcal{N} = 4$ superconformal field theories that have zero-dimensional Coulomb and Higgs branches and propose that the rational vertex operator algebras $W^{\text{min}}_{k - \scriptstyle{\frac{1}{2}}}(\mathfrak{sp}_{2N})$ and $L_{k}(\mathfrak{osp}_{1|2N})$ model the modular tensor categories of line operators in their topological $A$ and $B$ twists, respectively. Our analysis indicates that the action of 3d mirror symmetry on this family of theories is related to a novel level-rank duality and leads to several conjectural $q$-series identities of independent interest.

Submitted 31 May, 2024; originally announced June 2024.

Comments: 29 pages, 1 figure; comments welcome!

arXiv:2404.14024 [pdf, other]

Exploring neural oscillations during speech perception via surrogate gradient spiking neural networks

Authors: Alexandre Bittar, Philip N. Garner

Abstract: Understanding cognitive processes in the brain demands sophisticated models capable of replicating neural dynamics at large scales. We present a physiologically inspired speech recognition architecture, compatible and scalable with deep learning frameworks, and demonstrate that end-to-end gradient descent training leads to the emergence of neural oscillations in the central spiking neural network. Significant cross-frequency couplings, indicative of these oscillations, are measured within and across network layers during speech processing, whereas no such interactions are observed when handling background noise inputs. Furthermore, our findings highlight the crucial inhibitory role of feedback mechanisms, such as spike frequency adaptation and recurrent connections, in regulating and synchronising neural activity to improve recognition performance. Overall, on top of developing our understanding of synchronisation phenomena notably observed in the human auditory pathway, our architecture exhibits dynamic and efficient information processing, with relevance to neuromorphic technology.

Submitted 22 April, 2024; originally announced April 2024.

arXiv:2402.12220 [pdf, ps, other]

Bayesian Parameter-Efficient Fine-Tuning for Overcoming Catastrophic Forgetting

Authors: Haolin Chen, Philip N. Garner

Abstract: Although motivated by the adaptation of text-to-speech synthesis models, we argue that more generic parameter-efficient fine-tuning (PEFT) is an appropriate framework to do such adaptation. However, catastrophic forgetting remains an issue with PEFT, damaging the pre-trained model's inherent capabilities. We demonstrate that existing Bayesian learning techniques can be applied to PEFT to prevent catastrophic forgetting as long as the parameter shift of the fine-tuned layers can be calculated differentiably. In a principled series of experiments on language modeling and speech synthesis tasks, we utilize established Laplace approximations, including diagonal and Kronecker factored approaches, to regularize PEFT with the low-rank adaptation (LoRA) and compare their performance in pre-training knowledge preservation. Our results demonstrate that catastrophic forgetting can be overcome by our methods without degrading the fine-tuning performance, and using the Kronecker factored approximations produces a better preservation of the pre-training knowledge than the diagonal ones.

Submitted 19 February, 2024; originally announced February 2024.

Comments: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

arXiv:2401.16192 [pdf, other]

B-twisted Gaiotto-Witten theory and topological quantum field theory

Authors: Niklas Garner, Nathan Geer, Matthew B. Young

Abstract: We develop representation theoretic techniques to construct three dimensional non-semisimple topological quantum field theories which model homologically truncated topological B-twists of abelian Gaiotto-Witten theory with linear matter. Our constructions are based on relative modular structures on the category of weight modules over an unrolled quantization of a Lie superalgebra. The Lie superalgebra, originally defined by Gaiotto and Witten, is associated to a complex symplectic representation of a metric abelian Lie algebra. The physical theories we model admit alternative realizations as Chern-Simons-Rozansky-Witten theories and supergroup Chern-Simons theories and include as particular examples global forms of $\mathfrak{gl}(1 \vert 1)$-Chern-Simons theory and toral Chern-Simons theory. Fundamental to our approach is the systematic incorporation of non-genuine line operators which source flat connections for the topological flavour symmetry of the theory.

Submitted 29 January, 2024; originally announced January 2024.

Comments: 46 pages

MSC Class: Primary 57R56; Secondary 17B10

arXiv:2311.17655 [pdf, other]

Vulnerability of Automatic Identity Recognition to Audio-Visual Deepfakes

Authors: Pavel Korshunov, Haolin Chen, Philip N. Garner, Sebastien Marcel

Abstract: The task of deepfakes detection is far from being solved by speech or vision researchers. Several publicly available databases of fake synthetic video and speech were built to aid the development of detection methods. However, existing databases typically focus on visual or voice modalities and provide no proof that their deepfakes can in fact impersonate any real person. In this paper, we present the first realistic audio-visual database of deepfakes SWAN-DF, where lips and speech are well synchronized and video have high visual and audio qualities. We took the publicly available SWAN dataset of real videos with different identities to create audio-visual deepfakes using several models from DeepFaceLab and blending techniques for face swapping and HiFiVC, DiffVC, YourTTS, and FreeVC models for voice conversion. From the publicly available speech dataset LibriTTS, we also created a separate database of only audio deepfakes LibriTTS-DF using several latest text to speech methods: YourTTS, Adaspeech, and TorToiSe. We demonstrate the vulnerability of a state of the art speaker recognition system, such as ECAPA-TDNN-based model from SpeechBrain, to the synthetic voices. Similarly, we tested face recognition system based on the MobileFaceNet architecture to several variants of our visual deepfakes. The vulnerability assessment show that by tuning the existing pretrained deepfake models to specific identities, one can successfully spoof the face and speaker recognition systems in more than 90% of the time and achieve a very realistic looking and sounding fake video of a given person.

Submitted 29 November, 2023; originally announced November 2023.

Comments: 10 pages, 3 figures, 3 tables

ACM Class: I.4.3; I.2.10; H.5.1

arXiv:2311.05087 [pdf, other]

Boundary vertex algebras for 3d $\mathcal{N}=4$ rank-0 SCFTs

Authors: Andrea E. V. Ferrari, Niklas Garner, Heeyeon Kim

Abstract: We initiate the study of boundary Vertex Operator Algebras (VOAs) of topologically twisted 3d $\mathcal{N}=4$ rank-0 SCFTs. This is a recently introduced class of $\mathcal{N}=4$ SCFTs that by definition have zero-dimensional Higgs and Coulomb branches. We briefly explain why it is reasonable to obtain rational VOAs at the boundary of their topological twists. When a rank-0 SCFT is realized as the IR fixed point of a $\mathcal{N}=2$ Lagrangian theory, we propose a technique for the explicit construction of its topological twists and boundary VOAs based on deformations of the holomorphic-topological twist of the $\mathcal{N}=2$ microscopic description. We apply this technique to the $B$ twist of a newly discovered family of 3d $\mathcal{N}=4$ rank-0 SCFTs ${\mathcal T}_r$ and argue that they admit the simple affine VOAs $L_r(\mathfrak{osp}(1|2))$ at their boundary. In the simplest case, this leads to a novel level-rank duality between $L_1(\mathfrak{osp}(1|2))$ and the minimal model $M(2,5)$. As an aside, we present a TQFT obtained by twisting a 3d $\mathcal{N}=2$ QFT that admits the $M(3,4)$ minimal model as a boundary VOA and briefly comment on the classical freeness of VOAs at the boundary of 3d TQFTs.

Submitted 27 June, 2024; v1 submitted 8 November, 2023; originally announced November 2023.

Comments: minor revision

arXiv:2310.08524 [pdf, ps, other]

Higgs and Coulomb branches from superconformal raviolo vertex algebras

Authors: Niklas Garner, Surya Raghavendran, Brian R. Williams

Abstract: We propose a method for extracting the Higgs and Coulomb branches of a three-dimensional N = 4 quantum field theory from the algebra of local operators in its holomorphic-topological twist using the formalism of raviolo vertex algebras. Our construction parallels that of the chiral ring and twisted chiral ring of an N = 2 superconformal vertex operator algebra.

Submitted 12 October, 2023; originally announced October 2023.

Comments: 45 pages; feedback welcome

arXiv:2310.08516 [pdf, ps, other]

Enhanced symmetries in minimally-twisted three-dimensional supersymmetric theories

Authors: Niklas Garner, Surya Raghavendran, Brian R. Williams

Abstract: We show that the action of residual supersymmetries in holomorphic-topological twists of $N = 2$ theories in three dimensions naturally extends to the action of certain infinite dimensional Lie superalgebras. We demonstrate this in a range of examples, including $N = 4$ Yang-Mills theories and superconformal Chern-Simons theories, describing how the symmetries are implemented at the level of local operators.

Submitted 12 October, 2023; originally announced October 2023.

Comments: 41 pages, feedback welcome!

arXiv:2308.04414 [pdf, ps, other]

Raviolo vertex algebras

Authors: Niklas Garner, Brian R. Williams

Abstract: We develop an algebraic structure modeling local operators in a three-dimensional quantum field theory which is partially holomorphic and partially topological. The geometric space organizing our algebraic structure is called the raviolo (or bubble) and replaces the punctured disk underlying vertex algebras; we refer to this structure as a raviolo vertex algebra. The raviolo has appeared in many contexts related to three-dimensional supersymmetric gauge theory, especially in work on the affine Grassmannian. We prove a number of structure theorems for raviolo vertex algebras and provide simple examples that share many similarities with their vertex algebra counterparts.

Submitted 8 August, 2023; originally announced August 2023.

Comments: 78 pages. Comments and feedback are welcomed!

arXiv:2305.13512 [pdf, other]

Can ChatGPT Detect Intent? Evaluating Large Language Models for Spoken Language Understanding

Authors: Mutian He, Philip N. Garner

Abstract: Recently, large pretrained language models have demonstrated strong language understanding capabilities. This is particularly reflected in their zero-shot and in-context learning abilities on downstream tasks through prompting. To assess their impact on spoken language understanding (SLU), we evaluate several such models like ChatGPT and OPT of different sizes on multiple benchmarks. We verify the emergent ability unique to the largest models as they can reach intent classification accuracy close to that of supervised models with zero or few shots on various languages given oracle transcripts. By contrast, the results for smaller models fitting a single GPU fall far behind. We note that the error cases often arise from the annotation scheme of the dataset; responses from ChatGPT are still reasonable. We show, however, that the model is worse at slot filling, and its performance is sensitive to ASR errors, suggesting serious challenges for the application of those textual models on SLU.

Submitted 17 August, 2023; v1 submitted 22 May, 2023; originally announced May 2023.

Comments: 6 pages, 2 figures; Accepted by Interspeech 2023

arXiv:2305.09652 [pdf, other]

The Interpreter Understands Your Meaning: End-to-end Spoken Language Understanding Aided by Speech Translation

Authors: Mutian He, Philip N. Garner

Abstract: End-to-end spoken language understanding (SLU) remains elusive even with current large pretrained language models on text and speech, especially in multilingual cases. Machine translation has been established as a powerful pretraining objective on text as it enables the model to capture high-level semantics of the input utterance and associations between different languages, which is desired for speech models that work on lower-level acoustic frames. Motivated particularly by the task of cross-lingual SLU, we demonstrate that the task of speech translation (ST) is a good means of pretraining speech models for end-to-end SLU on both intra- and cross-lingual scenarios. By introducing ST, our models reach higher performance over baselines on monolingual and multilingual intent classification as well as spoken question answering using SLURP, MINDS-14, and NMSQA benchmarks. To verify the effectiveness of our methods, we also create new benchmark datasets from both synthetic and real sources, for speech summarization and low-resource/zero-shot transfer from English to French or Spanish. We further show the value of preserving knowledge for the ST pretraining task for better downstream performance, possibly using Bayesian transfer regularizers.

Submitted 17 October, 2023; v1 submitted 16 May, 2023; originally announced May 2023.

Comments: 16 pages, 3 figures; accepted by Findings of EMNLP 2023

arXiv:2305.00049 [pdf, ps, other]

Twistorial monopoles & chiral algebras

Authors: Niklas Garner, Natalie M. Paquette

Abstract: We initiate the study of how the insertion of magnetically charged states in 4d self-dual gauge theories impacts the 2d chiral algebras supported on the celestial sphere at asymptotic null infinity, from the point of view of the 4d/2d twistorial correspondence introduced by Costello and the second author. By reducing the 6d twistorial theory to a 3d holomorphic-topological theory with suitable boundary conditions, we can motivate certain non-perturbative enhancements of the celestial chiral algebra corresponding to extensions by modules arising from 3d boundary monopole operators. We also identify the insertion of 4d (non-abelian) monopoles with families of spectral flow automorphisms of the celestial chiral algebra.

Submitted 6 July, 2023; v1 submitted 28 April, 2023; originally announced May 2023.

Comments: 43+1 pages; v2 added references and a discussion of modules arising from holomorphic Wilson lines; v3 minor revisions, corrected error in section 3.2.3; comments welcome!

arXiv:2304.05414 [pdf, ps, other]

Line Operators in $U(1|1)$ Chern-Simons Theory

Authors: Niklas Garner, Wenjun Niu

Abstract: We analyze the non-semisimple category of line operators in Chern-Simons gauge theories based off the Lie superalgebra $\mathfrak{gl}(1|1)$. Our proposal is that the category of line operators $\mathcal{C}$ can be identified with the derived category of modules for a boundary vertex operator algebra $\mathcal{V}$ realized as a certain infinite-order simple current extension of the affine current algebra $V(\mathfrak{gl}(1|1))$ by boundary monopole operators. By translating this simple current extension of $V(\mathfrak{gl}(1|1))$ to the unrolled, restricted quantum group ${\overline{U}}^E(\mathfrak{gl}(1|1))$, we show that our category of line operators admits a second description in terms of a quantum group $\mathcal{A}$ realized by uprolling. We also compare our results across an expected physical duality with the cyclic orbifold of a free, $B$-twisted hypermultiplet and find a slight discrepancy at the level of braiding. We end with a detailed analysis of coupling to background flat $GL(1, \mathbb{C})$ connections and the resulting category of non-genuine line operators.

Submitted 13 September, 2023; v1 submitted 11 April, 2023; originally announced April 2023.

Comments: 49+9 pages. Corrected a mistake in Appendix B. Comments welcome!

arXiv:2303.01849 [pdf, other]

An investigation into the adaptability of a diffusion-based TTS model

Authors: Haolin Chen, Philip N. Garner

Abstract: Given the recent success of diffusion in producing natural-sounding synthetic speech, we investigate how diffusion can be used in speaker adaptive TTS. Taking cues from more traditional adaptation approaches, we show that adaptation can be included in a diffusion pipeline using conditional layer normalization with a step embedding. However, we show experimentally that, whilst the approach has merit, such adaptation alone cannot approach the performance of Transformer-based techniques. In a second experiment, we show that diffusion can be optimally combined with Transformer, with the latter taking the bulk of the adaptation load and the former contributing to improved naturalness.

Submitted 3 March, 2023; originally announced March 2023.

arXiv:2212.01187 [pdf, other]

Surrogate Gradient Spiking Neural Networks as Encoders for Large Vocabulary Continuous Speech Recognition

Authors: Alexandre Bittar, Philip N. Garner

Abstract: Compared to conventional artificial neurons that produce dense and real-valued responses, biologically-inspired spiking neurons transmit sparse and binary information, which can also lead to energy-efficient implementations. Recent research has shown that spiking neural networks can be trained like standard recurrent neural networks using the surrogate gradient method. They have shown promising results on speech command recognition tasks. Using the same technique, we show that they are scalable to large vocabulary continuous speech recognition, where they are capable of replacing LSTMs in the encoder with only minor loss of performance. This suggests that they may be applicable to more involved sequence-to-sequence tasks. Moreover, in contrast to their recurrent non-spiking counterparts, they show robustness to exploding gradient problems without the need to use gates.

Submitted 16 February, 2023; v1 submitted 1 December, 2022; originally announced December 2022.

arXiv:2208.11700 [pdf, ps, other]

Low-Level Physiological Implications of End-to-End Learning of Speech Recognition

Authors: Louise Coppieters de Gibson, Philip N. Garner

Abstract: Current speech recognition architectures perform very well from the point of view of machine learning, hence user interaction. This suggests that they are emulating the human biological system well. We investigate whether the inference can be inverted to provide insights into that biological system; in particular the hearing mechanism. Using SincNet, we confirm that end-to-end systems do learn well known filterbank structures. However, we also show that wider band-width filters are important in the learned structure. Whilst some benefits can be gained by initialising both narrow and wide-band filters, physiological constraints suggest that such filters arise in mid-brain rather than the cochlea. We show that standard machine learning architectures must be modified to allow this process to be emulated neurally.

Submitted 22 August, 2022; originally announced August 2022.

Comments: Submitted to INTERSPEECH 2022

arXiv:2207.10486 [pdf, ps, other]

doi 10.21437/Interspeech.2022-11035

Bayesian Recurrent Units and the Forward-Backward Algorithm

Authors: Alexandre Bittar, Philip N. Garner

Abstract: Using Bayes's theorem, we derive a unit-wise recurrence as well as a backward recursion similar to the forward-backward algorithm. The resulting Bayesian recurrent units can be integrated as recurrent neural networks within deep learning frameworks, while retaining a probabilistic interpretation from the direct correspondence with hidden Markov models. Whilst the contribution is mainly theoretical, experiments on speech recognition indicate that adding the derived units at the end of state-of-the-art recurrent architectures can improve the performance at a very low cost in terms of trainable parameters.

Submitted 21 July, 2022; originally announced July 2022.

Comments: Submitted to INTERSPEECH 2022

arXiv:2204.02997 [pdf, other]

Twisted Formalism for 3d $\mathcal{N}=4$ Theories

Authors: Niklas Garner

Abstract: We describe the topological $A$ and $B$ twists of 3d $\mathcal{N}=4$ theories of hypermultiplets gauged by $\mathcal{N}=4$ vector multiplets as certain deformations of the holomorphic-topological ($HT$) twist of those theories, utilizing the twisted superfields of Aganagic-Costello-Vafa-McNamara describing $HT$-twisted 3d $\mathcal{N}=2$ theories. We rederive many known results from this perspective, including state spaces on Riemann surfaces, deformations induced by flavor symmetries, the boundary VOAs of Costello-Gaiotto, and the category of line operators as proposed by Costello-Dimofte-Gaiotto-Hilburn-Yoo. Along the way, we show how the secondary product of local operators in the holomorphic-topological twist is related to the secondary product in the fully topological twist.

Submitted 17 March, 2023; v1 submitted 6 April, 2022; originally announced April 2022.

Comments: 44 pages; v2 added discussion of line operators, added references, fixed minor typos

arXiv:2204.02991 [pdf, other]

Vertex Operator Algebras and Topologically Twisted Chern-Simons-Matter Theories

Authors: Niklas Garner

Abstract: We consider several topologically twisted Chern-Simons-matter theories and propose boundary VOAs whose module categories should model the category of line operators of the 3d bulk. Our main examples come from the topological $A$ and $B$ twists of the exotic $\mathcal{N}=4$ Chern-Simons-matter theories of Gaiotto-Witten, but we show that there is a topological "$A$-twist" for a much larger class of $\mathcal{N}\neq4$ theories. We illustrate a particular example of this new class of theories that admits the $p=2$ singlet VOA $\mathfrak{M}(2)$ on its boundary and comment on its relation to the $ψ\to \infty$ limit of the Gaiotto-Rapčák corner VOA $Y_{1,1,0}[ψ]$.

Submitted 10 July, 2023; v1 submitted 6 April, 2022; originally announced April 2022.

Comments: 40 pages; v2 added references, corrected typos; v3 minor edits

arXiv:2204.01914 [pdf, other]

TASI Lectures on the Mathematics of String Dualities

Authors: Niklas Garner, Natalie M. Paquette

Abstract: In these lecture proceedings, we describe some of the fundamental mathematical concepts that underlie supersymmetric string theory and field theory, and their role in describing and testing dualities. In particular, we provide a pedagogical introduction to topological and holomorphic twisting, descent, and higher algebraic structures. Our primary examples are worldsheet theories of topological strings, namely the A- and B-models, which we briefly review. These proceedings are based on lectures given by the second author at TASI 2021.

Submitted 12 April, 2022; v1 submitted 4 April, 2022; originally announced April 2022.

Comments: 61 pages + 3 appendices, 14 figures; v2 added references and minor corrections

arXiv:2112.01559 [pdf, other]

A QFT for non-semisimple TQFT

Authors: Thomas Creutzig, Tudor Dimofte, Niklas Garner, Nathan Geer

Abstract: We construct a family of 3d quantum field theories $\mathcal T_{n,k}^A$ that conjecturally provide a physical realization -- and derived generalization -- of non-semisimple mathematical TQFT's based on the modules for the quantum group $U_q(\mathfrak{sl}_n)$ at an even root of unity $q=\text{exp}(iπ/k)$. The theories $\mathcal T_{n,k}^A$ are defined as topological twists of certain 3d $\mathcal N=4$ Chern-Simons-matter theories, which also admit string/M-theory realizations. They may be thought of as $SU(n)_{k-n}$ Chern-Simons theories, coupled to a twisted $\mathcal N=4$ matter sector (the source of non-semisimplicity). We show that $\mathcal T_{n,k}^A$ admits holomorphic boundary conditions supporting two different logarithmic vertex operator algebras, one of which is an $\mathfrak{sl}_n$-type Feigin-Tipunin algebra; and we conjecture that these two vertex operator algebras are related by a novel logarithmic level-rank duality. (We perform detailed computations to support the conjecture.) We thus relate the category of line operators in $\mathcal T_{n,k}^A$ to the derived category of modules for a boundary Feigin-Tipunin algebra, and -- using a logarithmic Kazhdan-Lusztig-like correspondence that has been established for $n=2$ and expected for general $n$ -- to the derived category of $U_q(\mathfrak{sl}_n)$ modules. We analyze many other key features of $\mathcal T_{n,k}^A$ and match them from quantum-group and VOA perspectives, including deformations by flat $PSL(n,\mathbb C)$ connections, one-form symmetries, and indices of (derived) genus-$g$ state spaces.

Submitted 2 December, 2021; originally announced December 2021.

Comments: 195 pages and many figures

arXiv:2006.05389 [pdf, other]

doi 10.1109/LSP.2020.3001843

A t-distribution based operator for enhancing out of distribution robustness of neural network classifiers

Authors: Niccolò Antonello, Philip N. Garner

Abstract: Neural Network (NN) classifiers can assign extreme probabilities to samples that have not appeared during training (out-of-distribution samples) resulting in erroneous and unreliable predictions. One of the causes for this unwanted behaviour lies in the use of the standard softmax operator which pushes the posterior probabilities to be either zero or unity hence failing to model uncertainty. The statistical derivation of the softmax operator relies on the assumption that the distributions of the latent variables for a given class are Gaussian with known variance. However, it is possible to use different assumptions in the same derivation and attain from other families of distributions as well. This allows derivation of novel operators with more favourable properties. Here, a novel operator is proposed that is derived using $t$-distributions which are capable of providing a better description of uncertainty. It is shown that classifiers that adopt this novel operator can be more robust to out of distribution samples, often outperforming NNs that use the standard softmax operator. These enhancements can be reached with minimal changes to the NN architecture.

Submitted 9 October, 2020; v1 submitted 9 June, 2020; originally announced June 2020.

Comments: 5 pages, 5 figures, to be published in IEEE Signal Processing Letters, reproducible code https://github.com/idiap/tsoftmax

arXiv:2004.15024 [pdf, other]

Generalized affine Springer theory and Hilbert schemes on planar curves

Authors: Niklas Garner, Oscar Kivinen

Abstract: We show that Hilbert schemes of planar curve singularities and their parabolic variants can be interpreted as certain generalized affine Springer fibers for $GL_n$, as defined by Goresky-Kottwitz-MacPherson. Using a generalization of affine Springer theory for Braverman-Finkelberg-Nakajima's Coulomb branch algebras, we construct a rational Cherednik algebra action on the homology of the Hilbert schemes, and compute it in examples. Along the way, we generalize to the parahoric setting the recent construction of Hilburn-Kamnitzer-Weekes, which may be of independent interest. In the spherical case, we make our computations explicit through a new general localization formula for Coulomb branches. Via results of Hogancamp-Mellit, we also show the rational Cherednik algebra acts on the HOMFLY homologies of torus knots. This work was inspired in part by a construction in three-dimensional $\mathcal{N}=4$ gauge theory.

Submitted 27 January, 2022; v1 submitted 30 April, 2020; originally announced April 2020.

Comments: 51 pages, to appear in IMRN. Comments welcome!

MSC Class: 22E50; 14C05; 57M27; 20C08

arXiv:1910.11247 [pdf, other]

doi 10.1109/TPAMI.2020.2976978

A Bayesian Approach to Recurrence in Neural Networks

Authors: Philip N. Garner, Sibo Tong

Abstract: We begin by reiterating that common neural network activation functions have simple Bayesian origins. In this spirit, we go on to show that Bayes's theorem also implies a simple recurrence relation; this leads to a Bayesian recurrent unit with a prescribed feedback formulation. We show that introduction of a context indicator leads to a variable feedback that is similar to the forget mechanism in conventional recurrent units. A similar approach leads to a probabilistic input gate. The Bayesian formulation leads naturally to the two pass algorithm of the Kalman smoother or forward-backward algorithm, meaning that inference naturally depends upon future inputs as well as past ones. Experiments on speech recognition confirm that the resulting architecture can perform as well as a bidirectional recurrent network with the same number of parameters as a unidirectional one. Further, when configured explicitly bidirectionally, the architecture can exceed the performance of a conventional bidirectional recurrence.

Submitted 20 April, 2020; v1 submitted 24 October, 2019; originally announced October 2019.

arXiv:1908.00013 [pdf, other]

doi 10.1007/JHEP02(2020)075

Mirror symmetry and line operators

Authors: Tudor Dimofte, Niklas Garner, Michael Geracie, Justin Hilburn

Abstract: We study half-BPS line operators in 3d N=4 gauge theories, focusing in particular on the algebras of local operators at their junctions. It is known that there are two basic types of such line operators, distinguished by the SUSY subalgebras that they preserve; the two types can roughly be called "Wilson lines" and "vortex lines", and are exchanged under 3d mirror symmetry. We describe a large class of vortex lines that can be characterized by basic algebraic data, and propose a mathematical scheme to compute the algebras of local operators at their junctions --- including monopole operators --- in terms of this data. The computation generalizes mathematical and physical definitions/analyses of the bulk Coulomb-branch chiral ring. We fully classify the junctions of half-BPS Wilson lines and of half-BPS vortex lines in abelian gauge theories with sufficient matter. We also test our computational scheme in a non-abelian quiver gauge theory, using a 3d-mirror-map of line operators from work of Assel and Gomis.

Submitted 10 February, 2020; v1 submitted 31 July, 2019; originally announced August 2019.

Comments: 117 pages + appendices; v2 added references, corrected typo in (C.31); v3 references and many small clarifications added; typos fixed; v4 minor edits to appendix C; v5 minor typos fixed, references clarified

arXiv:1808.05226 [pdf, other]

doi 10.1007/JHEP02(2019)004

Coulomb Branches of Star-Shaped Quivers

Authors: Tudor Dimofte, Niklas Garner

Abstract: We study the Coulomb branches of 3d N=4 `star-shaped' quiver gauge theories and their deformation quantizations, by applying algebraic techniques that have been developed in the mathematics and physics literature over the last few years. The algebraic techniques supply an abelianization map, which embeds the Coulomb-branch chiral ring into a vastly simpler abelian algebra A. Relations among chiral-ring operators, and their deformation quantization, are canonically induced from the embedding into A. In the case of star-shaped quivers -- whose Coulomb branches are related to Higgs branches of 4d N=2 theories of Class S -- this allows us to systematically verify known relations, to generalize them, and to quantize them. In the quantized setting, we find several new families of relations.

Submitted 7 December, 2018; v1 submitted 15 August, 2018; originally announced August 2018.

Comments: 63 pages + appendices; added references

arXiv:1806.08685 [pdf, other]

A Variational Prosody Model for Mapping the Context-Sensitive Variation of Functional Prosodic Prototypes

Authors: Branislav Gerazov, Gérard Bailly, Omar Mohammed, Yi Xu, Philip N. Garner

Abstract: The quest for comprehensive generative models of intonation that link linguistic and paralinguistic functions to prosodic forms has been a longstanding challenge of speech communication research. Traditional intonation models have given way to the overwhelming performance of deep learning (DL) techniques for training general purpose end-to-end mappings using millions of tunable parameters. The shift towards black box machine learning models has nonetheless posed the reverse problem -- a compelling need to discover knowledge, to explain, visualise and interpret. Our work bridges between a comprehensive generative model of intonation and state-of-the-art DL techniques. We build upon the modelling paradigm of the Superposition of Functional Contours (SFC) model and propose a Variational Prosody Model (VPM) that uses a network of variational contour generators to capture the context-sensitive variation of the constituent elementary prosodic contours. We show that the VPM can give insight into the intrinsic variability of these prosodic prototypes through learning a meaningful prosodic latent space representation structure. We also show that the VPM is able to capture prosodic phenomena that have multiple dimensions of context based variability. Since it is based on the principle of superposition, the VPM does not necessitate the use of specially crafted corpora for the analysis, opening up the possibilities of using big data for prosody analysis. In a speech synthesis scenario, the model can be used to generate a dynamic and natural prosody contour that is devoid of averaging effects.

Submitted 18 March, 2019; v1 submitted 22 June, 2018; originally announced June 2018.

Comments: Updated with recurrent version of contour generators, unified prosodic latent space, and performance evaluation with baseline

arXiv:1711.10025 [pdf, other]

Multilingual Training and Cross-lingual Adaptation on CTC-based Acoustic Model

Authors: Sibo Tong, Philip N. Garner, Hervé Bourlard

Abstract: Multilingual models for Automatic Speech Recognition (ASR) are attractive as they have been shown to benefit from more training data, and better lend themselves to adaptation to under-resourced languages. However, initialisation from monolingual context-dependent models leads to an explosion of context-dependent states. Connectionist Temporal Classification (CTC) is a potential solution to this as it performs well with monophone labels. We investigate multilingual CTC in the context of adaptation and regularisation techniques that have been shown to be beneficial in more conventional contexts. The multilingual model is trained to model a universal International Phonetic Alphabet (IPA)-based phone set using the CTC loss function. Learning Hidden Unit Contribution (LHUC) is investigated to perform language adaptive training. In addition, dropout during cross-lingual adaptation is also studied and tested in order to mitigate the overfitting problem. Experiments show that the performance of the universal phoneme-based CTC system can be improved by applying LHUC and it is extensible to new phonemes during cross-lingual adaptation. Updating all the parameters shows consistent improvement on limited data. Applying dropout during adaptation can further improve the system and achieve competitive performance with Deep Neural Network / Hidden Markov Model (DNN/HMM) systems on limited data.

Submitted 23 January, 2018; v1 submitted 27 November, 2017; originally announced November 2017.

arXiv:1604.04383 [pdf, other]

doi 10.1109/TASLP.2016.2604566

Composition of Deep and Spiking Neural Networks for Very Low Bit Rate Speech Coding

Authors: Milos Cernak, Alexandros Lazaridis, Afsaneh Asaei, Philip N. Garner

Abstract: Most current very low bit rate (VLBR) speech coding systems use hidden Markov model (HMM) based speech recognition/synthesis techniques. This allows transmission of information (such as phonemes) segment by segment that decreases the bit rate. However, the encoder based on a phoneme speech recognition may create bursts of segmental errors. Segmental errors are further propagated to optional suprasegmental (such as syllable) information coding. Together with the errors of voicing detection in pitch parametrization, HMM-based speech coding creates speech discontinuities and unnatural speech sound artefacts. In this paper, we propose a novel VLBR speech coding framework based on neural networks (NNs) for end-to-end speech analysis and synthesis without HMMs. The speech coding framework relies on phonological (sub-phonetic) representation of speech, and it is designed as a composition of deep and spiking NNs: a bank of phonological analysers at the transmitter, and a phonological synthesizer at the receiver, both realised as deep NNs, and a spiking NN as an incremental and robust encoder of syllable boundaries for coding of continuous fundamental frequency (F0). A combination of phonological features defines much more sound patterns than phonetic features defined by HMM-based speech coders, and the finer analysis/synthesis code contributes into smoother encoded speech. Listeners significantly prefer the NN-based approach due to fewer discontinuities and speech artefacts of the encoded speech. A single forward pass is required during the speech encoding and decoding. The proposed VLBR speech coding operates at a bit rate of approximately 360 bits/s.

Submitted 29 August, 2016; v1 submitted 15 April, 2016; originally announced April 2016.

Report number: Idiap-RR-11-2016

Journal ref: IEEE/ACM Transactions on Audio, Speech, and Language Processing, Volume: 24, Issue: 12, Dec. 2016

arXiv:1409.0203 [pdf, other]

Ad Hoc Microphone Array Calibration: Euclidean Distance Matrix Completion Algorithm and Theoretical Guarantees

Authors: Mohammad J. Taghizadeh, Reza Parhizkar, Philip N. Garner, Herve Bourlard, Afsaneh Asaei

Abstract: This paper addresses the problem of ad hoc microphone array calibration where only partial information about the distances between microphones is available. We construct a matrix consisting of the pairwise distances and propose to estimate the missing entries based on a novel Euclidean distance matrix completion algorithm by alternative low-rank matrix completion and projection onto the Euclidean distance space. This approach confines the recovered matrix to the EDM cone at each iteration of the matrix completion algorithm. The theoretical guarantees of the calibration performance are obtained considering the random and locally structured missing entries as well as the measurement noise on the known distances. This study elucidates the links between the calibration error and the number of microphones along with the noise level and the ratio of missing distances. Thorough experiments on real data recordings and simulated setups are conducted to demonstrate these theoretical insights. A significant improvement is achieved by the proposed Euclidean distance matrix completion algorithm over the state-of-the-art techniques for ad hoc microphone array calibration.

Submitted 31 August, 2014; originally announced September 2014.

Comments: In Press, available online, August 1, 2014. http://www.sciencedirect.com/science/article/pii/S0165168414003508, Signal Processing, 2014

Showing 1–30 of 30 results for author: Garner, N