
Showing 1–28 of 28 results for author: Udagawa, T

  1. arXiv:2405.10725  [pdf, other]

    cs.CL cs.IR

    INDUS: Effective and Efficient Language Models for Scientific Applications

    Authors: Bishwaranjan Bhattacharjee, Aashka Trivedi, Masayasu Muraoka, Muthukumaran Ramasubramanian, Takuma Udagawa, Iksha Gurung, Rong Zhang, Bharath Dandala, Rahul Ramachandran, Manil Maskey, Kaylin Bugbee, Mike Little, Elizabeth Fancher, Lauren Sanders, Sylvain Costes, Sergi Blanco-Cuaresma, Kelly Lockhart, Thomas Allen, Felix Grezes, Megan Ansdell, Alberto Accomazzi, Yousef El-Kurdi, Davis Wertheimer, Birgit Pfitzmann, Cesar Berrospi Ramis, et al. (9 additional authors not shown)

    Abstract: Large language models (LLMs) trained on general domain corpora showed remarkable results on natural language processing (NLP) tasks. However, previous research demonstrated LLMs trained using domain-focused corpora perform better on specialized tasks. Inspired by this pivotal insight, we developed INDUS, a comprehensive suite of LLMs tailored for the Earth science, biology, physics, heliophysics,…

    Submitted 20 May, 2024; v1 submitted 17 May, 2024; originally announced May 2024.

  2. arXiv:2311.02353  [pdf, ps, other]

    math-ph hep-th math.DG

    Solutions of the tt*-equations constructed from the SU(2)$_k$-fusion ring, and Smyth potentials

    Authors: Tadashi Udagawa

    Abstract: Cecotti and Vafa introduced the tt*-equation (topological-antitopological fusion equation), whose solutions describe massive deformations of supersymmetric conformal field theories. We describe some solutions of the tt*-equation constructed from the SU(2)$_k$-fusion algebra. The idea of the construction is due to Cecotti and Vafa, but we give a precise mathematical formulation and a description of…

    Submitted 4 November, 2023; originally announced November 2023.

    Comments: 21 pages

    MSC Class: 53Z05; 17B80

  3. arXiv:2310.13331  [pdf, ps, other]

    math.DG

    Globality of the DPW construction for Smyth potentials in the case of SU(1,1)

    Authors: Tadashi Udagawa

    Abstract: We construct harmonic maps into SU(1,1)/U(1) starting from Smyth potentials ξ by the DPW method. In this method, harmonic maps are obtained from the Iwasawa factorization of a solution L of L^{-1} dL = ξ. However, the Iwasawa factorization in the case of a noncompact group is not always global. We show that L can be expressed in terms of Bessel functions and from the asymptotic expansion of Besse…

    Submitted 20 October, 2023; originally announced October 2023.

    Comments: 24 pages

    MSC Class: 53A10; 35Q15

  4. arXiv:2310.08797  [pdf, other]

    cs.CL cs.AI

    A Comparative Analysis of Task-Agnostic Distillation Methods for Compressing Transformer Language Models

    Authors: Takuma Udagawa, Aashka Trivedi, Michele Merler, Bishwaranjan Bhattacharjee

    Abstract: Large language models have become a vital component in modern NLP, achieving state of the art performance in a variety of tasks. However, they are often inefficient for real-world deployment due to their expensive inference costs. Knowledge distillation is a promising technique to improve their efficiency while retaining most of their effectiveness. In this paper, we reproduce, compare and analyze…

    Submitted 12 October, 2023; originally announced October 2023.

    Comments: Accepted to EMNLP 2023 Industry Track

  5. arXiv:2309.04031  [pdf, other]

    cs.CL cs.SD eess.AS

    Multiple Representation Transfer from Large Language Models to End-to-End ASR Systems

    Authors: Takuma Udagawa, Masayuki Suzuki, Gakuto Kurata, Masayasu Muraoka, George Saon

    Abstract: Transferring the knowledge of large language models (LLMs) is a promising technique to incorporate linguistic knowledge into end-to-end automatic speech recognition (ASR) systems. However, existing works only transfer a single representation of LLM (e.g. the last layer of pretrained BERT), while the representation of a text is inherently non-unique and can be obtained variously from different laye…

    Submitted 25 December, 2023; v1 submitted 7 September, 2023; originally announced September 2023.

    Comments: Accepted to ICASSP 2024

  6. arXiv:2303.09639  [pdf, other]

    cs.CL

    Neural Architecture Search for Effective Teacher-Student Knowledge Transfer in Language Models

    Authors: Aashka Trivedi, Takuma Udagawa, Michele Merler, Rameswar Panda, Yousef El-Kurdi, Bishwaranjan Bhattacharjee

    Abstract: Large pretrained language models have achieved state-of-the-art results on a variety of downstream tasks. Knowledge Distillation (KD) into a smaller student model addresses their inefficiency, allowing for deployment in resource-constrained environments. However, KD can be ineffective when the student is manually selected from a set of existing options, since it can be a sub-optimal choice within…

    Submitted 13 October, 2023; v1 submitted 16 March, 2023; originally announced March 2023.

    Comments: 11 pages, 5 figures

  7. arXiv:2301.13352  [pdf, other]

    cs.CL cs.AI

    Sentence Identification with BOS and EOS Label Combinations

    Authors: Takuma Udagawa, Hiroshi Kanayama, Issei Yoshida

    Abstract: The sentence is a fundamental unit in many NLP applications. Sentence segmentation is widely used as the first preprocessing task, where an input text is split into consecutive sentences considering the end of the sentence (EOS) as their boundaries. This task formulation relies on a strong assumption that the input text consists only of sentences, or what we call the sentential units (SUs). Howeve…

    Submitted 30 January, 2023; originally announced January 2023.

    Comments: Accepted to EACL 2023 (Findings)

  8. arXiv:2211.13904  [pdf, other]

    cs.LG

    Policy-Adaptive Estimator Selection for Off-Policy Evaluation

    Authors: Takuma Udagawa, Haruka Kiyohara, Yusuke Narita, Yuta Saito, Kei Tateno

    Abstract: Off-policy evaluation (OPE) aims to accurately evaluate the performance of counterfactual policies using only offline logged data. Although many estimators have been developed, there is no single estimator that dominates the others, because the estimators' accuracy can vary greatly depending on a given OPE task such as the evaluation policy, number of actions, and noise level. Thus, the data-drive…

    Submitted 29 January, 2023; v1 submitted 25 November, 2022; originally announced November 2022.

    Comments: accepted at AAAI'23

  9. arXiv:2204.00212  [pdf, ps, other]

    cs.CL cs.SD eess.AS

    Effect and Analysis of Large-scale Language Model Rescoring on Competitive ASR Systems

    Authors: Takuma Udagawa, Masayuki Suzuki, Gakuto Kurata, Nobuyasu Itoh, George Saon

    Abstract: Large-scale language models (LLMs) such as GPT-2, BERT and RoBERTa have been successfully applied to ASR N-best rescoring. However, whether or how they can benefit competitive, near state-of-the-art ASR systems remains unexplored. In this study, we incorporate LLM rescoring into one of the most competitive ASR baselines: the Conformer-Transducer model. We demonstrate that consistent improvement is…

    Submitted 18 August, 2022; v1 submitted 1 April, 2022; originally announced April 2022.

    Comments: Accepted to Interspeech 2022

  10. arXiv:2109.08621  [pdf, ps, other]

    cs.AI

    Data-Driven Off-Policy Estimator Selection: An Application in User Marketing on An Online Content Delivery Service

    Authors: Yuta Saito, Takuma Udagawa, Kei Tateno

    Abstract: Off-policy evaluation (OPE) is the method that attempts to estimate the performance of decision making policies using historical data generated by different policies without conducting costly online A/B tests. Accurate OPE is essential in domains such as healthcare, marketing or recommender systems to avoid deploying poorly performing policies, as such policies may harm human lives or destroy the us…

    Submitted 17 September, 2021; originally announced September 2021.

    Comments: presented at REVEAL workshop, RecSys2020

  11. arXiv:2108.13703  [pdf, other]

    stat.ML cs.AI cs.LG

    Evaluating the Robustness of Off-Policy Evaluation

    Authors: Yuta Saito, Takuma Udagawa, Haruka Kiyohara, Kazuki Mogi, Yusuke Narita, Kei Tateno

    Abstract: Off-policy Evaluation (OPE), or offline evaluation in general, evaluates the performance of hypothetical policies leveraging only offline log data. It is particularly useful in applications where the online interaction involves high stakes and expensive setting such as precision medicine and recommender systems. Since many OPE estimators have been proposed and some of them have hyperparameters to…

    Submitted 31 August, 2021; originally announced August 2021.

    Comments: Accepted at RecSys2021

  12. arXiv:2105.14207  [pdf, other]

    cs.CL cs.AI

    Maintaining Common Ground in Dynamic Environments

    Authors: Takuma Udagawa, Akiko Aizawa

    Abstract: Common grounding is the process of creating and maintaining mutual understandings, which is a critical aspect of sophisticated human communication. While various task settings have been proposed in existing literature, they mostly focus on creating common ground under static context and ignore the aspect of maintaining them over time under dynamic context. In this work, we propose a novel task sett…

    Submitted 29 May, 2021; originally announced May 2021.

    Comments: Accepted at TACL; pre-MIT Press publication version

  13. arXiv:2010.03127  [pdf, other]

    cs.CL cs.AI

    A Linguistic Analysis of Visually Grounded Dialogues Based on Spatial Expressions

    Authors: Takuma Udagawa, Takato Yamazaki, Akiko Aizawa

    Abstract: Recent models achieve promising results in visually grounded dialogues. However, existing datasets often contain undesirable biases and lack sophisticated linguistic analyses, which make it difficult to understand how well current models recognize their precise linguistic structures. To address this problem, we make two design choices: first, we focus on OneCommon Corpus \citep{udagawa2019natural,…

    Submitted 6 October, 2020; originally announced October 2020.

    Comments: 16 pages, Findings of EMNLP 2020

  14. arXiv:1911.07588  [pdf, other]

    cs.CL cs.AI

    An Annotated Corpus of Reference Resolution for Interpreting Common Grounding

    Authors: Takuma Udagawa, Akiko Aizawa

    Abstract: Common grounding is the process of creating, repairing and updating mutual understandings, which is a fundamental aspect of natural language conversation. However, interpreting the process of common grounding is a challenging task, especially under continuous and partially-observable context where complex ambiguity, uncertainty, partial understandings and misunderstandings are introduced. Interpre…

    Submitted 18 November, 2019; originally announced November 2019.

    Comments: 9 pages, 7 figures, 6 tables, Accepted by AAAI 2020

  15. arXiv:1907.03399  [pdf, other]

    cs.CL cs.AI

    A Natural Language Corpus of Common Grounding under Continuous and Partially-Observable Context

    Authors: Takuma Udagawa, Akiko Aizawa

    Abstract: Common grounding is the process of creating, repairing and updating mutual understandings, which is a critical aspect of sophisticated human communication. However, traditional dialogue systems have limited capability of establishing common ground, and we also lack task formulations which introduce natural difficulty in terms of common grounding while enabling easy evaluation and analysis of compl…

    Submitted 8 July, 2019; originally announced July 2019.

    Comments: AAAI 2019

  16. arXiv:1701.00346  [pdf, ps, other]

    cond-mat.stat-mech math-ph quant-ph

    Finite-size Gap, Magnetization, and Entanglement of Deformed Fredkin Spin Chain

    Authors: Takuma Udagawa, Hosho Katsura

    Abstract: We investigate ground- and excited-state properties of the deformed Fredkin spin chain proposed by Salberger, Zhang, Klich, Korepin, and the authors. This model is a one-parameter deformation of the Fredkin spin chain, whose Hamiltonian is $3$-local and translationally invariant in the bulk. The model is frustration-free and its unique ground state can be expressed as a weighted superposition of c…

    Submitted 7 September, 2017; v1 submitted 2 January, 2017; originally announced January 2017.

    Comments: 16 pages, 5 figures. v2: Sec. 5.3 has been modified, references added v3: Sec. 4.2 has been modified

    Journal ref: J. Phys. A: Math. Theor. 50 (2017) 405002

  17. arXiv:1611.04983  [pdf, other]

    cond-mat.stat-mech hep-th math-ph quant-ph

    Deformed Fredkin Spin Chain with Extensive Entanglement

    Authors: Olof Salberger, Takuma Udagawa, Zhao Zhang, Hosho Katsura, Israel Klich, Vladimir Korepin

    Abstract: We introduce a new spin chain which is a deformation of the Fredkin spin chain and has a phase transition between bounded and extensive entanglement entropy scaling. In this chain, spins have a local interaction of three nearest neighbors. The Hamiltonian is frustration-free and its ground state can be described analytically as a weighted superposition of Dyck paths. In the purely spin $1/2$ case,…

    Submitted 15 November, 2016; originally announced November 2016.

    Journal ref: J. Stat. Mech. (2017) 063103

  18. arXiv:0807.1464   

    nucl-th

    Extended Optical Model Analyses of Elastic Scattering and Fusion Cross Section Data for the 9Be+28Si, 144Sm, and 208Pb Systems at Near-Coulomb-Barrier Energies using Double Folding Potential

    Authors: W. Y. So, T. Udagawa, K. S. Kim, S. W. Hong, B. T. Kim

    Abstract: Based on the extended optical model with the double folding potential, in which the polarization potential is decomposed into direct reaction (DR) and fusion parts, simultaneous $χ^{2}$ analyses are performed of elastic scattering and fusion cross section data for the $^{9}$Be+$^{28}$Si, $^{144}$Sm, and $^{208}$Pb systems at near-Coulomb-barrier energies. We find that the real part of the result…

    Submitted 14 March, 2010; v1 submitted 9 July, 2008; originally announced July 2008.

    Comments: 28 pages, 6 figures, submitted to Physical Review C

  19. Extended Optical Model Analyses of Elastic Scattering and Fusion Cross Section Data for the $^{12}$C+$^{208}$Pb System at Near-Coulomb-Barrier Energies by using a Folding Potential

    Authors: W. Y. So, T. Udagawa, S. W. Hong, B. T. Kim

    Abstract: Simultaneous $χ^{2}$ analyses are performed for elastic scattering and fusion cross section data for the $^{12}$C+$^{208}$Pb system at near-Coulomb-barrier energies by using the extended optical model approach in which the polarization potential is decomposed into direct reaction (DR) and fusion parts. Use is made of the double folding potential as a bare potential. It is found that the experime…

    Submitted 14 January, 2008; originally announced January 2008.

    Comments: 26 pages, 7 figures, submitted to Physical Review C

    Journal ref: Phys.Rev.C77:024609,2008

  20. Extended Optical Model Analyses of Elastic Scattering and Fusion Cross Section Data for the 7Li+208Pb System at Near-Coulomb-Barrier Energies using the Folding Potential

    Authors: W. Y. So, T. Udagawa, K. S. Kim, S. W. Hong, B. T. Kim

    Abstract: Simultaneous $χ^{2}$ analyses previously made for elastic scattering and fusion cross section data for the $^{6}$Li+$^{208}$Pb system are extended to the $^{7}$Li+$^{208}$Pb system at near-Coulomb-barrier energies based on the extended optical model approach, in which the polarization potential is decomposed into direct reaction (DR) and fusion parts. Use is made of the double folding potential a…

    Submitted 5 June, 2007; originally announced June 2007.

    Comments: 7 figures, submitted to PRC

    Journal ref: Phys.Rev.C76:024613,2007

  21. Extended Optical Model Analyses of Elastic Scattering and Fusion Cross Sections for 6Li + 208Pb System at Near-Coulomb-Barrier Energies by using Folding Potential

    Authors: W. Y. So, T. Udagawa, K. S. Kim, S. W. Hong, B. T. Kim

    Abstract: Based on the extended optical model approach in which the polarization potential is decomposed into direct reaction (DR) and fusion parts, simultaneous $χ^{2}$ analyses are performed for elastic scattering and fusion cross section data for the $^{6}$Li+$^{208}$Pb system at near-Coulomb-barrier energies. A folding potential is used as the bare potential. It is found that the real part of the resu…

    Submitted 13 December, 2006; originally announced December 2006.

    Comments: 6 figures

    Journal ref: Phys.Rev. C75 (2007) 024610

  22. Extended Optical Model Analyses of Elastic Scattering, Direct Reaction, and Fusion Cross Sections for the 9Be + 208Pb System at Near-Coulomb-Barrier Energies

    Authors: W. Y. So, S. W. Hong, B. T. Kim, T. Udagawa

    Abstract: Based on the extended optical model approach in which the polarization potential is decomposed into direct reaction (DR) and fusion parts, simultaneous $χ^{2}$ analyses are performed for elastic scattering, DR, and fusion cross section data for the $^{9}$Be+$^{208}$Pb system at near-Coulomb-barrier energies. Similar $χ^{2}$ analyses are also performed by only taking into account the elastic scat…

    Submitted 27 September, 2005; originally announced September 2005.

    Comments: 5 figures

    Journal ref: Phys.Rev. C72 (2005) 064602

  23. arXiv:physics/0201066  [pdf, ps, other]

    physics.atom-ph physics.comp-ph

    A Novel Method for the Solution of the Schroedinger Eq. in the Presence of Exchange Terms

    Authors: G. H. Rawitscher, S. Y. Kang, I. Koltracht, E. Zerrad, K. Zerrad, B. T. Kim, T. Udagawa

    Abstract: In the Hartree-Fock approximation the Pauli exclusion principle leads to a Schroedinger Eq. of an integro-differential form. We describe a new spectral noniterative method (S-IEM), previously developed for solving the Lippman-Schwinger integral equation with local potentials, which has now been extended so as to include the exchange nonlocality. We apply it to the restricted case of electron-Hyd…

    Submitted 29 January, 2002; originally announced January 2002.

    Comments: 29 pages, 4 figures, submitted to Phys. Rev. A

  24. Simultaneous Optical Model Analyses of Elastic Scattering, Breakup, and Fusion Cross Section Data for the $^{6}$He + $^{209}$Bi System at Near-Coulomb-Barrier Energies

    Authors: B. T. Kim, W. Y. So, S. W. Hong, T. Udagawa

    Abstract: Based on an approach recently proposed by us, simultaneous $χ^{2}$-analyses are performed for elastic scattering, direct reaction (DR) and fusion cross sections data for the $^{6}$He+$^{209}$Bi system at near-Coulomb-barrier energies to determine the parameters of the polarization potential consisting of DR and fusion parts. We show that the data are well reproduced by the resultant potential, w…

    Submitted 22 November, 2001; originally announced November 2001.

    Journal ref: Phys.Rev. C65 (2002) 044616

  25. Semi-classical Characters and Optical Model Description of Heavy Ion Scattering, Direct Reactions, and Fusion at Near-barrier Energies

    Authors: B. T. Kim, W. Y. So, S. W. Hong, T. Udagawa

    Abstract: An approach is proposed to calculate the direct reaction (DR) and fusion probabilities for heavy ion collisions at near-Coulomb-barrier energies as functions of the distance of closest approach D within the framework of the optical model that introduces two types of imaginary potentials, DR and fusion. The probabilities are calculated by using partial DR and fusion cross sections, together with…

    Submitted 2 November, 2001; v1 submitted 1 November, 2001; originally announced November 2001.

    Journal ref: Phys.Rev. C65 (2002) 044607

  26. Can Doubly Strange Dibaryon Resonances be Discovered at RHIC?

    Authors: S. D. Paganis, G. W. Hoffmann, R. L. Ray, J. -L. Tang, T. Udagawa, R. S. Longacre

    Abstract: The baryon-baryon continuum invariant mass spectrum generated from relativistic nucleus + nucleus collision data may reveal the existence of doubly-strange dibaryons not stable against strong decay if they lie within a few MeV of threshold. Furthermore, since the dominant component of these states is a superposition of two color-octet clusters which can be produced intermediately in a color-deco…

    Submitted 12 June, 2000; v1 submitted 8 October, 1999; originally announced October 1999.

    Comments: 28 pages, 5 figures, revised version

    Journal ref: Phys.Rev. C62 (2000) 024906

  27. Can only flavor-nonsinglet H dibaryons be stable against strong decays?

    Authors: Stathes D. Paganis, Takeshi Udagawa, G. W. Hoffmann, R. L. Ray

    Abstract: Using the QCD sum rule approach, we show that the flavor-nonsinglet $H$ dibaryon states with J$^π = 1^+$, J$^π = 0^+$, I=1 (27plet) are nearly degenerate with the J$^π = 0^+$, I=0 singlet $H_0$ dibaryon, which has been predicted to be stable against strong decay, but has not been observed. Our calculation, which does not require an instanton correction, suggests that the $H_0$ is slightly heavie…

    Submitted 19 June, 1997; originally announced June 1997.

    Comments: 10 pages, 2 figures

    Journal ref: Phys.Rev.C56:570-573,1997

  28. Damping mechanisms of the Delta resonance in nuclei

    Authors: B. Koerfgen, P. Oltmanns, F. Osterfeld, T. Udagawa

    Abstract: The damping mechanisms of the Delta(1232) resonance in nuclei are studied by analyzing the quasi-free decay reactions 12C(pi+,pi+ p)11B and 12C(3He,t pi+ p)11B and the 2p emission reactions 12C(pi+,pp)10B and 12C(3He,t pp)10B. The coincidence cross sections are calculated within the framework of the isobar-hole model. It is found that the 2p emission process induced by the decay of the Delta res…

    Submitted 17 December, 1996; originally announced December 1996.

    Comments: 9 pages, 5 Postscript figures, uses RevTex, psfig.sty. Accepted by Physical Review C

    Report number: IKP(Th)-96-23

    Journal ref: Phys.Rev.C55:1819-1825,1997