Showing 1–17 of 17 results for author: San, N

  1. arXiv:2406.16746  [pdf, other]

    cs.LG cs.AI cs.CL

    The Responsible Foundation Model Development Cheatsheet: A Review of Tools & Resources

    Authors: Shayne Longpre, Stella Biderman, Alon Albalak, Hailey Schoelkopf, Daniel McDuff, Sayash Kapoor, Kevin Klyman, Kyle Lo, Gabriel Ilharco, Nay San, Maribeth Rauh, Aviya Skowron, Bertie Vidgen, Laura Weidinger, Arvind Narayanan, Victor Sanh, David Adelani, Percy Liang, Rishi Bommasani, Peter Henderson, Sasha Luccioni, Yacine Jernite, Luca Soldaini

    Abstract: Foundation model development attracts a rapidly expanding body of contributors, scientists, and applications. To help shape responsible development practices, we introduce the Foundation Model Development Cheatsheet: a growing collection of 250+ tools and resources spanning text, vision, and speech modalities. We draw on a large body of prior work to survey resources (e.g. software, documentation,…

    Submitted 25 June, 2024; v1 submitted 24 June, 2024; originally announced June 2024.

  2. arXiv:2402.02302  [pdf, other]

    eess.AS cs.CL

    Predicting positive transfer for improved low-resource speech recognition using acoustic pseudo-tokens

    Authors: Nay San, Georgios Paraskevopoulos, Aryaman Arora, Xiluo He, Prabhjot Kaur, Oliver Adams, Dan Jurafsky

    Abstract: While massively multilingual speech models like wav2vec 2.0 XLSR-128 can be directly fine-tuned for automatic speech recognition (ASR), downstream performance can still be relatively poor on languages that are under-represented in the pre-training data. Continued pre-training on 70-200 hours of untranscribed speech in these languages can help -- but what about languages without that much recorded…

    Submitted 3 February, 2024; originally announced February 2024.

    Comments: Accepted for SIGTYP2024

  3. arXiv:2306.06086  [pdf, ps, other]

    cs.CL cs.SD eess.AS

    Developing Speech Processing Pipelines for Police Accountability

    Authors: Anjalie Field, Prateek Verma, Nay San, Jennifer L. Eberhardt, Dan Jurafsky

    Abstract: Police body-worn cameras have the potential to improve accountability and transparency in policing. Yet in practice, they result in millions of hours of footage that is never reviewed. We investigate the potential of large pre-trained speech models for facilitating reviews, focusing on ASR and officer speech detection in footage from traffic stops. Our proposed pipeline includes training data alig…

    Submitted 9 June, 2023; originally announced June 2023.

    Comments: Accepted to INTERSPEECH 2023

  4. arXiv:2305.10951  [pdf, other]

    cs.CL eess.AS

    Making More of Little Data: Improving Low-Resource Automatic Speech Recognition Using Data Augmentation

    Authors: Martijn Bartelds, Nay San, Bradley McDonnell, Dan Jurafsky, Martijn Wieling

    Abstract: The performance of automatic speech recognition (ASR) systems has advanced substantially in recent years, particularly for languages for which a large amount of transcribed speech is available. Unfortunately, for low-resource languages, such as minority languages, regional languages or dialects, ASR performance generally remains much lower. In this study, we investigate whether data augmentation t…

    Submitted 18 May, 2023; v1 submitted 18 May, 2023; originally announced May 2023.

    Comments: Accepted at ACL 2023

  5. arXiv:2302.04975  [pdf, other]

    cs.CL

    Leveraging supplementary text data to kick-start automatic speech recognition system development with limited transcriptions

    Authors: Nay San, Martijn Bartelds, Blaine Billings, Ella de Falco, Hendi Feriza, Johan Safri, Wawan Sahrozi, Ben Foley, Bradley McDonnell, Dan Jurafsky

    Abstract: Recent research using pre-trained transformer models suggests that just 10 minutes of transcribed speech may be enough to fine-tune such a model for automatic speech recognition (ASR) -- at least if we can also leverage vast amounts of text data (803 million tokens). But is that much text data necessary? We study the use of different amounts of text data, both for creating a lexicon that constrain…

    Submitted 9 February, 2023; originally announced February 2023.

    Comments: Accepted for ComputEL-6

  6. arXiv:2302.01669  [pdf, other]

    quant-ph

    All-coupling solution for the continuous polaron problem in the Schrödinger representation

    Authors: I. D. Feranchuk, N. Q. San, O. D. Skoromnik

    Abstract: The solution for the large-radius Fröhlich polaron in the Schrödinger representation of the quantum theory is constructed in the entire range of variation of the coupling constant. The energy and the effective mass of the polaron are calculated by simple algebraic transformations and are analogous to the results found by Feynman on the basis of the variational principle for the path-integrals of t…

    Submitted 3 February, 2023; originally announced February 2023.

    Comments: 6 pages, 3 figures

  7. Superradiant parametric X-ray emission

    Authors: I. D. Feranchuk, N. Q. San, O. D. Skoromnik

    Abstract: We compute a spectrum of parametric X-ray radiation (PXR) inside a crystal from a bunch of electrons, which is periodically modulated in density. We consider that the bunch of electrons is exiting from a XFEL channel. We demonstrate that in the case of a resonance between the frequency of parametric X-ray radiation and a frequency of modulation of an electron bunch the sequence of strong quasi-mon…

    Submitted 19 June, 2022; originally announced June 2022.

    Comments: 11 pages, 4 figures

  8. arXiv:2204.07272  [pdf, other]

    cs.CL cs.SD eess.AS

    Automated speech tools for helping communities process restricted-access corpora for language revival efforts

    Authors: Nay San, Martijn Bartelds, Tolúlopé Ògúnrèmí, Alison Mount, Ruben Thompson, Michael Higgins, Roy Barker, Jane Simpson, Dan Jurafsky

    Abstract: Many archival recordings of speech from endangered languages remain unannotated and inaccessible to community members and language learning programs. One bottleneck is the time-intensive nature of annotation. An even narrower bottleneck occurs for recordings with access constraints, such as language that must be vetted or filtered by authorised community members before annotation can begin. We pro…

    Submitted 24 April, 2022; v1 submitted 14 April, 2022; originally announced April 2022.

    Comments: Accepted at ComputEL-5

  9. Eigenstates of two-level systems in a single-mode quantum field: from quantum Rabi model to $N$-atom Dicke model

    Authors: A. U. Leonau, N. Q. San, A. P. Ulyanenkov, O. D. Skoromnik, I. D. Feranchuk

    Abstract: In the present paper we show that the Hamiltonian describing the resonant interaction of $N$ two-level systems with a single-mode electromagnetic quantum field in the Coulomb gauge can be diagonalized with a high degree of accuracy using a simple basis set of states. This allows one to find an analytical approximation for the eigenvectors and eigenvalues of the system, which interpolates the numer…

    Submitted 7 February, 2022; originally announced February 2022.

  10. arXiv:2104.01176  [pdf]

    cs.CY q-fin.GN

    Trends in eBusiness and eGovernment

    Authors: Antonio Sánchez-Bayón, Miguel Ángel García-Ramos Lucero, Annie Ng Cheng San, Choy Johnn Yee, Krishna Moorthy, Alex Foo Tun Lee, Angelita Kithatu-Kiwekete, Shikha Vyas-Doorgapersad, Anthony Kiryagana Isabirye, Nobukhosi Dlodlo, Lydia Mbati, Edmore Tarambiwa, Chengedzai Mafini, Anastas Djurovski, Ephrem Habtemichael Redda, Jhalukpreya Surujlal

    Abstract: The first chapter is a critical review and a case study in eBusiness, with special attention to the digital currencies resource and its possibilities. The second chapter attempts to incorporate the UTAUT model with perceived risk theory to explore its impact on the intention to use m-government services. The third chapter aims to assess the level of gender inclusivity in the municipal e-procurement processes in…

    Submitted 2 April, 2021; originally announced April 2021.

  11. arXiv:2103.14583  [pdf, other]

    cs.CL cs.SD eess.AS

    Leveraging pre-trained representations to improve access to untranscribed speech from endangered languages

    Authors: Nay San, Martijn Bartelds, Mitchell Browne, Lily Clifford, Fiona Gibson, John Mansfield, David Nash, Jane Simpson, Myfany Turpin, Maria Vollmer, Sasha Wilmoth, Dan Jurafsky

    Abstract: Pre-trained speech representations like wav2vec 2.0 are a powerful tool for automatic speech recognition (ASR). Yet many endangered languages lack sufficient data for pre-training such models, or are predominantly oral vernaculars without a standardised writing system, precluding fine-tuning. Query-by-example spoken term detection (QbE-STD) offers an alternative for iteratively indexing untranscri…

    Submitted 13 September, 2021; v1 submitted 26 March, 2021; originally announced March 2021.

    Comments: Accepted at ASRU 2021

  12. Radiation induced interaction potential of two qubits strongly coupled with a quantized electromagnetic field

    Authors: I. D. Feranchuk, N. Q. San, A. U. Leonau, O. D. Skoromnik

    Abstract: We investigate the interaction of two two-level qubits with a single mode quantum field in a cavity without rotating wave approximation and considering that qubits can be located at an arbitrary distance from each other. We demonstrate that there exists a radiation induced interaction potential between atoms. We studied the properties of the system numerically and in addition constructed a simple…

    Submitted 4 September, 2020; v1 submitted 31 May, 2020; originally announced June 2020.

    Journal ref: Phys. Rev. A 102, 043702 (2020)

  13. arXiv:2002.03702  [pdf, other]

    quant-ph

    Exact solution for the quantum Rabi model with the $\boldsymbol{\mathsf{A}}^{2}$ term

    Authors: I. D. Feranchuk, N. Q. San, A. U. Leonau, O. D. Skoromnik

    Abstract: Quantum Rabi model (QRM) is widely used for the analysis of the radiation-matter interaction at the fundamental level in cavity quantum electrodynamics. Typically the QRM Hamiltonian includes only $\boldsymbol{\mathsf{p}} \cdot \boldsymbol{\mathsf{A}}$ term, however, the complete nonrelativistic Hamiltonian of quantum electrodynamics includes $\boldsymbol{\mathsf{A}}^{2}$ term as well. Here we fin…

    Submitted 18 February, 2020; v1 submitted 10 February, 2020; originally announced February 2020.

    Comments: 7 pages, 4 figures

  14. arXiv:math/0405430  [pdf, ps, other]

    math.SG math.DG

    A singular Poincare lemma

    Authors: Eva Miranda, Vu Ngoc San

    Abstract: We prove a Poincare lemma for a set of r smooth functions on a 2n-dimensional smooth manifold satisfying a commutation relation determined by r singular vector fields associated to a Cartan subalgebra of $\frak{sp}(2r,\mathbb R)$. This result has a natural interpretation in terms of the cohomology associated to the infinitesimal deformation of a completely integrable system.

    Submitted 23 May, 2004; originally announced May 2004.

    Comments: 18 pages

    MSC Class: 37G05; 53D20; 57R70; 70H06

    Journal ref: final version at Int Math Res Notices, n 1, 27-46, 2005

  15. arXiv:math/0306392  [pdf, ps, other]

    math.DS math.SG nlin.SI

    Vanishing Twist near Focus-Focus Points

    Authors: Holger R. Dullin, Vu Ngoc San

    Abstract: We show that near a focus-focus point in a Liouville integrable Hamiltonian system with two degrees of freedom lines of locally constant rotation number in the image of the energy-momentum map are spirals determined by the eigenvalue of the equilibrium. From this representation of the rotation number we derive that the twist condition for the isoenergetic KAM condition vanishes on a curve in the…

    Submitted 27 June, 2003; originally announced June 2003.

    Comments: 13 pages

    MSC Class: 37J35; 37J15; 37J40; 70H06; 70H08; 37G20

    Journal ref: Nonlinearity, 17:1777--1786, 2004

  16. arXiv:math/9810137  [pdf, ps, other]

    math.AP math-ph math.SP

    Bohr-Sommerfeld conditions for Integrable Systems with critical manifolds of focus-focus type

    Authors: Vu Ngoc San

    Abstract: We present a detailed study, in the semi-classical regime $h \to 0$, of microlocal properties of systems of two commuting h-PDOs $P_1(h)$, $P_2(h)$ such that the joint principal symbol $p=(p_1,p_2)$ has a special kind of singularity called a "focus-focus" singularity. Typical examples include the quantum spherical pendulum or the quantum Champagne bottle. In the spirit of Colin de Verdière an…

    Submitted 22 October, 1998; originally announced October 1998.

    Comments: 70 pages, 12 figures (prefer the .ps file) \usepackage{amsfonts,amssymb,euscript,a4,epsfig} preprint Institut Fourier/Utrecht Univ

    Report number: IF 433/UU 1076

    MSC Class: 34C20; 34E20; 35P20; 57R70; 58F07; 81Q20

  17. arXiv:math/9803027  [pdf, ps, other]

    math.AP math.DS

    Formes normales semi-classiques des systemes completement integrables au voisinage d'un point critique de l'application moment

    Authors: Vu Ngoc San

    Abstract: The semi-classical study of a 1-dimensional Schrödinger operator near a non-degenerate maximum of the potential has led Colin de Verdière and Parisse to prove a microlocal normal form theorem for any 1-dimensional pseudo-differential operator with the same kind of singularity. We present here a generalization of this result to pseudo-differential integrable systems of any finite degree of freed…

    Submitted 9 March, 1998; originally announced March 1998.

    Comments: 1 figure, 28 pages, in french uses isolatin1.sty

    Report number: IF-377

    MSC Class: 58F05; 57R70; 58G15; 58F36; 34C20; 81Q20; 81S05