-
Sign Stitching: A Novel Approach to Sign Language Production
Authors:
Harry Walsh,
Ben Saunders,
Richard Bowden
Abstract:
Sign Language Production (SLP) is a challenging task, given the limited resources available and the inherent diversity within sign data. As a result, previous works have suffered from the problem of regression to the mean, leading to under-articulated and incomprehensible signing. In this paper, we propose using dictionary examples and a learnt codebook of facial expressions to create expressive sign language sequences. However, simply concatenating signs and adding the face creates robotic and unnatural sequences. To address this, we present a 7-step approach to effectively stitch sequences together. First, by normalizing each sign into a canonical pose, cropping, and stitching, we create a continuous sequence. Then, by applying filtering in the frequency domain and resampling each sign, we create cohesive natural sequences that mimic the prosody found in the original data. We leverage a SignGAN model to map the output to a photo-realistic signer and present a complete Text-to-Sign (T2S) SLP pipeline. Our evaluation demonstrates the effectiveness of the approach, showcasing state-of-the-art performance across all datasets. Finally, a user evaluation shows our approach outperforms the baseline model and is capable of producing realistic sign language sequences.
Submitted 13 May, 2024;
originally announced May 2024.
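The Sign Stitching abstract mentions filtering in the frequency domain and resampling each sign. The following is a minimal sketch of those two operations on a pose sequence, not the authors' implementation: it assumes poses are stored as a NumPy array of shape (frames, features), and the function names and the `keep_ratio` parameter are hypothetical.

```python
import numpy as np


def low_pass_filter(poses: np.ndarray, keep_ratio: float = 0.25) -> np.ndarray:
    """Suppress high-frequency motion along the time axis.

    poses: array of shape (frames, features).
    keep_ratio: hypothetical parameter, the fraction of the lowest
        frequency bins that are kept; the rest are set to zero.
    """
    spectrum = np.fft.rfft(poses, axis=0)
    cutoff = max(1, int(spectrum.shape[0] * keep_ratio))
    spectrum[cutoff:] = 0.0
    # Passing n restores the original number of frames.
    return np.fft.irfft(spectrum, n=poses.shape[0], axis=0)


def resample(poses: np.ndarray, target_frames: int) -> np.ndarray:
    """Linearly interpolate a pose sequence to target_frames frames."""
    src = np.linspace(0.0, 1.0, poses.shape[0])
    dst = np.linspace(0.0, 1.0, target_frames)
    columns = [np.interp(dst, src, poses[:, j]) for j in range(poses.shape[1])]
    return np.stack(columns, axis=1)
```

A hard cutoff is the simplest possible low-pass filter; a smoother window would reduce ringing, and the abstract does not say which filter the paper uses.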
-
Select and Reorder: A Novel Approach for Neural Sign Language Production
Authors:
Harry Walsh,
Ben Saunders,
Richard Bowden
Abstract:
Sign languages, often categorised as low-resource languages, face significant challenges in achieving accurate translation due to the scarcity of parallel annotated datasets. This paper introduces Select and Reorder (S&R), a novel approach that addresses data scarcity by breaking down the translation process into two distinct steps: Gloss Selection (GS) and Gloss Reordering (GR). Our method leverages large spoken language models and the substantial lexical overlap between source spoken languages and target sign languages to establish an initial alignment. Both steps make use of Non-AutoRegressive (NAR) decoding for reduced computation and faster inference speeds. Through this disentanglement of tasks, we achieve state-of-the-art BLEU and ROUGE scores on the Meine DGS Annotated (mDGS) dataset, demonstrating a substantial BLEU-1 improvement of 37.88% in Text to Gloss (T2G) Translation. This innovative approach paves the way for more effective translation models for sign languages, even in resource-constrained settings.
Submitted 17 April, 2024;
originally announced April 2024.
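Several abstracts in this listing report BLEU-1 scores. As a rough illustration of what that metric measures, here is a sentence-level sketch with a single reference: clipped unigram precision multiplied by the brevity penalty. The function name `bleu1` is hypothetical, and published scores are normally computed at corpus level with a standard toolkit, so this sketch will not reproduce the reported numbers.

```python
import math
from collections import Counter


def bleu1(hypothesis: list[str], reference: list[str]) -> float:
    """Sentence-level BLEU-1 against a single reference (illustrative only)."""
    if not hypothesis:
        return 0.0
    hyp_counts = Counter(hypothesis)
    ref_counts = Counter(reference)
    # Each hypothesis token is credited at most as often as it occurs
    # in the reference (clipped counts).
    clipped = sum(min(count, ref_counts[tok]) for tok, count in hyp_counts.items())
    precision = clipped / len(hypothesis)
    # The brevity penalty punishes hypotheses shorter than the reference.
    if len(hypothesis) >= len(reference):
        brevity_penalty = 1.0
    else:
        brevity_penalty = math.exp(1.0 - len(reference) / len(hypothesis))
    return brevity_penalty * precision
```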
-
A Data-Driven Representation for Sign Language Production
Authors:
Harry Walsh,
Abolfazl Ravanshad,
Mariam Rahmani,
Richard Bowden
Abstract:
Phonetic representations are used when recording spoken languages, but no equivalent exists for recording signed languages. As a result, linguists have proposed several annotation systems that operate on the gloss or sub-unit level; however, these resources are notably irregular and scarce.
Sign Language Production (SLP) aims to automatically translate spoken language sentences into continuous sequences of sign language. However, current state-of-the-art approaches rely on scarce linguistic resources to work, which has limited progress in the field. This paper introduces an innovative solution by transforming the continuous pose generation problem into a discrete sequence generation problem, thus overcoming the need for costly annotation. If annotation is available, however, we leverage the additional information to enhance our approach.
By applying Vector Quantisation (VQ) to sign language data, we first learn a codebook of short motions that can be combined to create a natural sequence of sign, where the tokens in the codebook can be thought of as the lexicon of our representation. Then, using a transformer, we perform a translation from spoken language text to a sequence of codebook tokens. Each token can be directly mapped to a sequence of poses, allowing the translation to be performed by a single network. Furthermore, we present a sign stitching method to effectively join tokens together. We evaluate on the RWTH-PHOENIX-Weather-2014T (PHOENIX14T) and the more challenging Meine DGS Annotated (mDGS) datasets. An extensive evaluation shows our approach outperforms previous methods, increasing the BLEU-1 back translation score by up to 72%.
Submitted 17 April, 2024;
originally announced April 2024.
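The abstract above describes mapping sign motion to tokens of a learnt codebook and mapping tokens back to poses. The lookup step of Vector Quantisation can be sketched as a nearest-neighbour search, shown below under stated assumptions: each motion segment is flattened to a vector, the codebook is already learnt, and the names `quantise` and `decode` are hypothetical. How the paper learns the codebook is not shown here.

```python
import numpy as np


def quantise(segments: np.ndarray, codebook: np.ndarray) -> np.ndarray:
    """Return, for each segment, the index of its nearest codebook entry.

    segments: array of shape (n, d), one flattened motion segment per row.
    codebook: array of shape (k, d), one codebook entry per row.
    """
    # Pairwise Euclidean distances, shape (n, k).
    dists = np.linalg.norm(segments[:, None, :] - codebook[None, :, :], axis=-1)
    return dists.argmin(axis=1)


def decode(tokens: np.ndarray, codebook: np.ndarray) -> np.ndarray:
    """Map a token sequence back to codebook vectors, shape (n, d)."""
    return codebook[tokens]
```

Because every token indexes a fixed codebook row, a translation model only has to predict discrete indices, which is the reformulation the abstract describes.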
-
Interface fluctuations associated with split Fermi seas
Authors:
Harriet Walsh
Abstract:
We consider the asymptotic behaviour of a family of unidimensional lattice fermion models, which are in exact correspondence with certain probability laws on partitions and on unitary matrices. These models exhibit limit shapes, and in the case where the bulk of these shapes are described by analytic functions, the fluctuations around their interfaces have been shown to follow a universal Tracy-Widom distribution or its higher-order analogue. Non-differentiable bulk limit shape functions arise when a gap appears in some quantum numbers of the model, in other words when the Fermi sea is split. We show that split Fermi seas give rise to new interface fluctuations, governed by integer powers of universal distributions. This breakdown in universality is analogous to the behaviour of a random Hermitian matrix when the support of its limiting eigenvalue distribution has multiple cuts, with oscillations appearing in the limit of the two-point correlation function. We show that when the Fermi sea is split in the lattice fermion model, there are multiple cuts in the eigenvalue support of the corresponding unitary matrix model.
Submitted 27 January, 2024; v1 submitted 3 November, 2023;
originally announced November 2023.
-
Gloss Alignment Using Word Embeddings
Authors:
Harry Walsh,
Ozge Mercanoglu Sincan,
Ben Saunders,
Richard Bowden
Abstract:
Capturing and annotating sign language datasets is a time-consuming and costly process. Current datasets are orders of magnitude too small to successfully train unconstrained Sign Language Translation (SLT) models. As a result, research has turned to TV broadcast content as a source of large-scale training data, consisting of both the sign language interpreter and the associated audio subtitle. However, lack of sign language annotation limits the usability of this data and has led to the development of automatic annotation techniques such as sign spotting. These spottings are aligned to the video rather than the subtitle, which often results in a misalignment between the subtitle and spotted signs. In this paper we propose a method for aligning spottings with their corresponding subtitles using large spoken language models. Using a single modality means our method is computationally inexpensive and can be utilized in conjunction with existing alignment techniques. We quantitatively demonstrate the effectiveness of our method on the Meine DGS Annotated (mDGS) and BBC-Oxford British Sign Language (BOBSL) datasets, recovering up to a 33.22 BLEU-1 score in word alignment.
Submitted 8 August, 2023;
originally announced August 2023.
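The abstract above proposes aligning spotted signs to subtitles using word embeddings. One simple way to picture this is sketched below; it is an assumption for illustration and not the paper's algorithm. It assigns each spotted gloss to the subtitle containing the most similar word under cosine similarity. The embeddings are taken as given vectors, and the function names are hypothetical.

```python
import numpy as np


def cosine(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two non-zero vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


def align(gloss_vecs: list[np.ndarray],
          subtitle_word_vecs: list[list[np.ndarray]]) -> list[int]:
    """Assign each gloss to the subtitle containing its most similar word.

    gloss_vecs: one embedding per spotted gloss.
    subtitle_word_vecs: for each candidate subtitle, the embeddings of
        its words (each subtitle must contain at least one word).
    """
    assignments = []
    for gloss in gloss_vecs:
        # Score a subtitle by its best-matching word.
        scores = [max(cosine(gloss, word) for word in words)
                  for words in subtitle_word_vecs]
        assignments.append(int(np.argmax(scores)))
    return assignments
```

In practice the candidate set would likely be restricted to subtitles near the spotting's timestamp; that restriction is omitted here for brevity.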
-
Multicritical Schur measures and higher-order analogues of the Tracy-Widom distribution
Authors:
Dan Betea,
Jérémie Bouttier,
Harriet Walsh
Abstract:
We introduce multicritical Schur measures, which are probability laws on integer partitions which give rise to non-generic fluctuations at their edge. They are in the same universality classes as one-dimensional momentum-space models of free fermions in flat confining potentials, studied by Le Doussal, Majumdar and Schehr. These universality classes involve critical exponents of the form 1/(2m+1), with m a positive integer, and asymptotic distributions given by Fredholm determinants constructed from higher order Airy kernels, extending the generic Tracy-Widom GUE distribution recovered for m=1. We also compute limit shapes for the multicritical Schur measures, discuss the finite temperature setting, and exhibit an exact mapping to the multicritical unitary matrix models previously encountered by Periwal and Shevitz.
Submitted 27 January, 2024; v1 submitted 9 July, 2023;
originally announced July 2023.
-
Changing the Representation: Examining Language Representation for Neural Sign Language Production
Authors:
Harry Walsh,
Ben Saunders,
Richard Bowden
Abstract:
Neural Sign Language Production (SLP) aims to automatically translate from spoken language sentences to sign language videos. Historically the SLP task has been broken into two steps: firstly, translating from a spoken language sentence to a gloss sequence, and secondly, producing a sign language video given a sequence of glosses. In this paper we apply Natural Language Processing techniques to the first step of the SLP pipeline. We use language models such as BERT and Word2Vec to create better sentence-level embeddings, and apply several tokenization techniques, demonstrating how these improve performance on the low-resource translation task of Text to Gloss. We introduce Text to HamNoSys (T2H) translation, and show the advantages of using a phonetic representation for sign language translation rather than a sign-level gloss representation. Furthermore, we use HamNoSys to extract the hand shape of a sign and use this as additional supervision during training, further increasing the performance on T2H. Assembling best practice, we achieve a BLEU-4 score of 26.99 on the Meine DGS (mDGS) dataset and 25.09 on PHOENIX14T, two new state-of-the-art baselines.
Submitted 16 September, 2022;
originally announced October 2022.
-
Random partitions under the Plancherel-Hurwitz measure, high genus Hurwitz numbers and maps
Authors:
Guillaume Chapuy,
Baptiste Louf,
Harriet Walsh
Abstract:
We study the asymptotic behaviour of random integer partitions under a new probability law that we introduce, the Plancherel-Hurwitz measure. This distribution, which has a natural definition in terms of Young tableaux, is a deformation of the classical Plancherel measure which appears naturally in the context of Hurwitz numbers, enumerating certain transposition factorisations in symmetric groups.
We study a regime in which the number of factors in the underlying factorisations grows linearly with the order of the group, and the corresponding topological objects, Hurwitz maps, are of high genus. We prove that the limiting behaviour exhibits a new, twofold, phenomenon: the first part becomes very large, while the rest of the partition has the standard Vershik-Kerov-Logan-Shepp limit shape. As a consequence, we obtain asymptotic estimates for unconnected Hurwitz numbers with linear Euler characteristic, which we use to study random Hurwitz maps in this regime. This result can also be interpreted as the return probability of the transposition random walk on the symmetric group after linearly many steps.
Submitted 3 August, 2023; v1 submitted 22 June, 2022;
originally announced June 2022.
-
Multicritical random partitions
Authors:
Dan Betea,
Jérémie Bouttier,
Harriet Walsh
Abstract:
We study two families of probability measures on integer partitions, which are Schur measures with parameters tuned in such a way that the edge fluctuations are characterized by a critical exponent different from the generic $1/3$. We find that the first part asymptotically follows a "higher-order analogue" of the Tracy-Widom GUE distribution, previously encountered by Le Doussal, Majumdar and Schehr in quantum statistical physics. We also compute limit shapes, and discuss an exact mapping between one of our families and the multicritical unitary matrix models introduced by Periwal and Shevitz.
Submitted 16 April, 2021; v1 submitted 3 December, 2020;
originally announced December 2020.
-
On $T\bar{T}$ deformations and supersymmetry
Authors:
Marco Baggio,
Alessandro Sfondrini,
Gabriele Tartaglino-Mazzucchelli,
Harriet Walsh
Abstract:
We investigate the "$T\bar{T}$" deformations of two-dimensional supersymmetric quantum field theories. More precisely, we show that, by using the conservation equations for the supercurrent multiplet, the $T\bar{T}$ deforming operator can be constructed as a supersymmetric descendant. Here we focus on $\mathcal{N}=(1,0)$ and $\mathcal{N}=(1,1)$ supersymmetry. As an example, we analyse in detail the $T\bar{T}$ deformation of a free $\mathcal{N}=(1,0)$ supersymmetric action. We also argue that the link between $T\bar{T}$ and string theory can be extended to superstrings: by analysing the light-cone gauge fixing for superstrings in flat space, we show the correspondence of the string action to the $T\bar{T}$ deformation of a free theory of eight $\mathcal{N}=(1,1)$ scalar multiplets on the nose. We comment on how these constructions relate to the geometrical interpretations of $T\bar{T}$ deformations that have recently been discussed in the literature.
Submitted 13 June, 2019; v1 submitted 1 November, 2018;
originally announced November 2018.