
Showing 1–16 of 16 results for author: Morioka, N

  1. arXiv:2403.05530  [pdf, other]

    cs.CL cs.AI

    Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context

    Authors: Gemini Team, Petko Georgiev, Ving Ian Lei, Ryan Burnell, Libin Bai, Anmol Gulati, Garrett Tanzer, Damien Vincent, Zhufeng Pan, Shibo Wang, Soroosh Mariooryad, Yifan Ding, Xinyang Geng, Fred Alcober, Roy Frostig, Mark Omernick, Lexi Walker, Cosmin Paduraru, Christina Sorokin, Andrea Tacchetti, Colin Gaffney, Samira Daruki, Olcan Sercinoglu, Zach Gleicher, Juliette Love, et al. (1092 additional authors not shown)

    Abstract: In this report, we introduce the Gemini 1.5 family of models, representing the next generation of highly compute-efficient multimodal models capable of recalling and reasoning over fine-grained information from millions of tokens of context, including multiple long documents and hours of video and audio. The family includes two new models: (1) an updated Gemini 1.5 Pro, which exceeds the February…

    Submitted 14 June, 2024; v1 submitted 8 March, 2024; originally announced March 2024.

  2. arXiv:2402.18932  [pdf, other]

    eess.AS cs.SD

    Extending Multilingual Speech Synthesis to 100+ Languages without Transcribed Data

    Authors: Takaaki Saeki, Gary Wang, Nobuyuki Morioka, Isaac Elias, Kyle Kastner, Andrew Rosenberg, Bhuvana Ramabhadran, Heiga Zen, Françoise Beaufays, Hadar Shemtov

    Abstract: Collecting high-quality studio recordings of audio is challenging, which limits the language coverage of text-to-speech (TTS) systems. This paper proposes a framework for scaling a multilingual TTS model to 100+ languages using found data without supervision. The proposed framework combines speech-text encoder pretraining with unsupervised training using untranscribed speech and unspoken text data…

    Submitted 29 February, 2024; originally announced February 2024.

    Comments: To appear in ICASSP 2024

  3. arXiv:2311.00945  [pdf, other]

    cs.SD cs.CL cs.LG eess.AS

    E3 TTS: Easy End-to-End Diffusion-based Text to Speech

    Authors: Yuan Gao, Nobuyuki Morioka, Yu Zhang, Nanxin Chen

    Abstract: We propose Easy End-to-End Diffusion-based Text to Speech, a simple and efficient end-to-end text-to-speech model based on diffusion. E3 TTS directly takes plain text as input and generates an audio waveform through an iterative refinement process. Unlike many prior work, E3 TTS does not rely on any intermediate representations like spectrogram features or alignment information. Instead, E3 TTS mo…

    Submitted 1 November, 2023; originally announced November 2023.

    Comments: Accepted by ASRU 2023

  4. arXiv:2305.18802  [pdf, other]

    eess.AS cs.SD

    LibriTTS-R: A Restored Multi-Speaker Text-to-Speech Corpus

    Authors: Yuma Koizumi, Heiga Zen, Shigeki Karita, Yifan Ding, Kohei Yatabe, Nobuyuki Morioka, Michiel Bacchiani, Yu Zhang, Wei Han, Ankur Bapna

    Abstract: This paper introduces a new speech dataset called "LibriTTS-R" designed for text-to-speech (TTS) use. It is derived by applying speech restoration to the LibriTTS corpus, which consists of 585 hours of speech data at 24 kHz sampling rate from 2,456 speakers and the corresponding texts. The constituent samples of LibriTTS-R are identical to those of LibriTTS, with only the sound quality improved.…

    Submitted 30 May, 2023; originally announced May 2023.

    Comments: Accepted to Interspeech 2023

  5. arXiv:2303.01664  [pdf, other]

    cs.SD cs.LG eess.AS

    Miipher: A Robust Speech Restoration Model Integrating Self-Supervised Speech and Text Representations

    Authors: Yuma Koizumi, Heiga Zen, Shigeki Karita, Yifan Ding, Kohei Yatabe, Nobuyuki Morioka, Yu Zhang, Wei Han, Ankur Bapna, Michiel Bacchiani

    Abstract: Speech restoration (SR) is a task of converting degraded speech signals into high-quality ones. In this study, we propose a robust SR model called Miipher, and apply Miipher to a new SR application: increasing the amount of high-quality training data for speech generation by converting speech samples collected from the Web to studio-quality. To make our SR model robust against various degradation,…

    Submitted 14 August, 2023; v1 submitted 2 March, 2023; originally announced March 2023.

    Comments: Accepted to WASPAA 2023

  6. arXiv:2210.15868  [pdf, other]

    cs.SD cs.CL eess.AS

    Residual Adapters for Few-Shot Text-to-Speech Speaker Adaptation

    Authors: Nobuyuki Morioka, Heiga Zen, Nanxin Chen, Yu Zhang, Yifan Ding

    Abstract: Adapting a neural text-to-speech (TTS) model to a target speaker typically involves fine-tuning most if not all of the parameters of a pretrained multi-speaker backbone model. However, serving hundreds of fine-tuned neural TTS models is expensive as each of them requires significant footprint and separate computational resources (e.g., accelerators, memory). To scale speaker adapted neural TTS voi…

    Submitted 27 October, 2022; originally announced October 2022.

    Comments: Submitted to ICASSP 2023

  7. arXiv:2210.15447  [pdf, other]

    cs.SD cs.CL eess.AS

    Virtuoso: Massive Multilingual Speech-Text Joint Semi-Supervised Learning for Text-To-Speech

    Authors: Takaaki Saeki, Heiga Zen, Zhehuai Chen, Nobuyuki Morioka, Gary Wang, Yu Zhang, Ankur Bapna, Andrew Rosenberg, Bhuvana Ramabhadran

    Abstract: This paper proposes Virtuoso, a massively multilingual speech-text joint semi-supervised learning framework for text-to-speech synthesis (TTS) models. Existing multilingual TTS typically supports tens of languages, which are a small fraction of the thousands of languages in the world. One difficulty to scale multilingual TTS to hundreds of languages is collecting high-quality speech-text paired da…

    Submitted 15 March, 2023; v1 submitted 27 October, 2022; originally announced October 2022.

    Comments: To appear in ICASSP 2023

  8. arXiv:2203.13339  [pdf, other]

    cs.CL cs.LG cs.SD eess.AS

    Leveraging unsupervised and weakly-supervised data to improve direct speech-to-speech translation

    Authors: Ye Jia, Yifan Ding, Ankur Bapna, Colin Cherry, Yu Zhang, Alexis Conneau, Nobuyuki Morioka

    Abstract: End-to-end speech-to-speech translation (S2ST) without relying on intermediate text representations is a rapidly emerging frontier of research. Recent works have demonstrated that the performance of such direct S2ST systems is approaching that of conventional cascade S2ST when trained on comparable datasets. However, in practice, the performance of direct S2ST is bounded by the availability of pai…

    Submitted 27 June, 2022; v1 submitted 24 March, 2022; originally announced March 2022.

    Comments: Interspeech 2022

  9. Spin-optical dynamics and quantum efficiency of single V1 center in silicon carbide

    Authors: Naoya Morioka, Di Liu, Öney O. Soykal, Izel Gediz, Charles Babin, Rainer Stöhr, Takeshi Ohshima, Nguyen Tien Son, Jawad Ul-Hassan, Florian Kaiser, Jörg Wrachtrup

    Abstract: Color centers in silicon carbide are emerging candidates for distributed spin-based quantum applications due to the scalability of host materials and the demonstration of integration into nanophotonic resonators. Recently, silicon vacancy centers in silicon carbide have been identified as a promising system with excellent spin and optical properties. Here, we in-depth study the spin-optical dynami…

    Submitted 19 April, 2022; v1 submitted 15 March, 2022; originally announced March 2022.

    Comments: 32 pages, 12 figures

  10. Nanofabricated and integrated colour centres in silicon carbide with high-coherence spin-optical properties

    Authors: Charles Babin, Rainer Stöhr, Naoya Morioka, Tobias Linkewitz, Timo Steidl, Raphael Wörnle, Di Liu, Erik Hesselmeier, Vadim Vorobyov, Andrej Denisenko, Mario Hentschel, Christian Gobert, Patrick Berwian, Georgy V. Astakhov, Wolfgang Knolle, Sridhar Majety, Pranta Saha, Marina Radulaski, Nguyen Tien Son, Jawad Ul-Hassan, Florian Kaiser, Jörg Wrachtrup

    Abstract: Optically addressable spin defects in silicon carbide (SiC) are an emerging platform for quantum information processing. Lending themselves to modern semiconductor nanofabrication, they promise scalable high-efficiency spin-photon interfaces. We demonstrate here nanoscale fabrication of silicon vacancy centres (VSi) in 4H-SiC without deterioration of their intrinsic spin-optical properties. In par…

    Submitted 29 September, 2021; v1 submitted 10 September, 2021; originally announced September 2021.

    Comments: 18 pages, 4 figures

    Journal ref: Nature Materials 21, 67-73 (2022)

  11. arXiv:2003.12591  [pdf, other]

    quant-ph physics.optics

    Spectrally reconfigurable quantum emitters enabled by optimized fast modulation

    Authors: Daniil M. Lukin, Alexander D. White, Rahul Trivedi, Melissa A. Guidry, Naoya Morioka, Charles Babin, Öney O. Soykal, Jawad Ul Hassan, Nguyen Tien Son, Takeshi Ohshima, Praful K. Vasireddy, Mamdouh H. Nasr, Shuo Sun, Jean-Phillipe W. MacLean, Constantin Dory, Emilio A. Nanni, Jörg Wrachtrup, Florian Kaiser, Jelena Vučković

    Abstract: The ability to shape photon emission facilitates strong photon-mediated interactions between disparate physical systems, thereby enabling applications in quantum information processing, simulation and communication. Spectral control in solid state platforms such as color centers, rare earth ions, and quantum dots is particularly attractive for realizing such applications on-chip. Here we propose t…

    Submitted 27 July, 2020; v1 submitted 27 March, 2020; originally announced March 2020.

    Comments: 9 pages, 6 figures; Supplementary Information

    Journal ref: npj Quantum Inf 6, 80 (2020)

  12. arXiv:2001.02459  [pdf, ps, other]

    quant-ph cond-mat.mtrl-sci physics.atom-ph

    Vibronic states and their effect on the temperature and strain dependence of silicon-vacancy qubits in 4H silicon carbide

    Authors: Péter Udvarhelyi, Gergő Thiering, Naoya Morioka, Charles Babin, Florian Kaiser, Daniil Lukin, Takeshi Ohshima, Jawad Ul-Hassan, Nguyen Tien Son, Jelena Vučković, Jörg Wrachtrup, Adam Gali

    Abstract: Silicon-vacancy qubits in silicon carbide (SiC) are emerging tools in quantum technology applications due to their excellent optical and spin properties. In this paper, we explore the effect of temperature and strain on these properties by focusing on the two silicon-vacancy qubits, V1 and V2, in 4H SiC. We apply density functional theory beyond the Born-Oppenheimer approximation to describe the t…

    Submitted 18 April, 2020; v1 submitted 8 January, 2020; originally announced January 2020.

    Comments: 8 pages, 5 figures

    Journal ref: Phys. Rev. Applied 13, 054017 (2020)

  13. Spin-controlled generation of indistinguishable and distinguishable photons from silicon vacancy centres in silicon carbide

    Authors: Naoya Morioka, Charles Babin, Roland Nagy, Izel Gediz, Erik Hesselmeier, Di Liu, Matthew Joliffe, Matthias Niethammer, Durga Dasari, Vadim Vorobyov, Roman Kolesov, Rainer Stöhr, Jawad Ul-Hassan, Nguyen Tien Son, Takeshi Ohshima, Péter Udvarhelyi, Gergő Thiering, Adam Gali, Jörg Wrachtrup, Florian Kaiser

    Abstract: Quantum systems combining indistinguishable photon generation and spin-based quantum information processing are essential for remote quantum applications and networking. However, identification of suitable systems in scalable platforms remains a challenge. Here, we investigate the silicon vacancy centre in silicon carbide and demonstrate controlled emission of indistinguishable and distinguishable…

    Submitted 10 January, 2020; v1 submitted 8 January, 2020; originally announced January 2020.

    Comments: Manuscript and Methods: 21 pages, 4 figures; Supplementary Information: 18 pages, 6 figures, 1 table

    Journal ref: Nature Communications 11, 2516 (2020)

  14. arXiv:1906.05964  [pdf]

    cond-mat.mes-hall cond-mat.mtrl-sci quant-ph

    Electrical charge state manipulation of single silicon vacancies in a silicon carbide quantum optoelectronic device

    Authors: Matthias Widmann, Matthias Niethammer, Dmitry Yu. Fedyanin, Igor A. Khramtsov, Torsten Rendler, Ian D. Booker, Jawad Ul Hassan, Naoya Morioka, Yu-Chen Chen, Ivan G. Ivanov, Nguyen Tien Son, Takeshi Ohshima, Michel Bockstedte, Adam Gali, Cristian Bonato, Sang-Yun Lee, Jörg Wrachtrup

    Abstract: Colour centres with long-lived spins are established platforms for quantum sensing and quantum information applications. Colour centres exist in different charge states, each of them with distinct optical and spin properties. Application to quantum technology requires the capability to access and stabilize charge states for each specific task. Here, we investigate charge state manipulation of indi…

    Submitted 23 June, 2019; v1 submitted 13 June, 2019; originally announced June 2019.

  15. arXiv:1903.12236  [pdf, other]

    cond-mat.mes-hall physics.app-ph quant-ph

    Coherent electrical readout of defect spins in 4H-SiC by photo-ionization at ambient conditions

    Authors: Matthias Niethammer, Matthias Widmann, Torsten Rendler, Naoya Morioka, Yu-Chen Chen, Rainer Stöhr, Jawad Ul Hassan, Shinobu Onoda, Takeshi Ohshima, Sang-Yun Lee, Amlan Mukherjee, Junichi Isoya, Nguyen Tien Son, Jörg Wrachtrup

    Abstract: Quantum technology relies on proper hardware, enabling coherent quantum state control as well as efficient quantum state readout. In this regard, wide-bandgap semiconductors are an emerging material platform with scalable wafer fabrication methods, hosting several promising spin-active point defects. Conventional readout protocols for such defect spins rely on fluorescence detection and are limite…

    Submitted 28 March, 2019; originally announced March 2019.

    Journal ref: Nature Communications vol 10, 5569 (2019)

  16. arXiv:1812.04284  [pdf]

    cond-mat.mes-hall physics.app-ph quant-ph

    Laser writing of scalable single colour centre in silicon carbide

    Authors: Yu-Chen Chen, Patrick S. Salter, Matthias Niethammer, Matthias Widmann, Florian Kaiser, Roland Nagy, Naoya Morioka, Charles Babin, Jürgen Erlekampf, Patrick Berwian, Martin Booth, Jörg Wrachtrup

    Abstract: Single photon emitters in silicon carbide (SiC) are attracting attention as quantum photonic systems. However, to achieve scalable devices it is essential to generate single photon emitters at desired locations on demand. Here we report the controlled creation of single silicon vacancy ($V_{Si}$) centres in 4H-SiC using laser writing without any post-annealing process. Due to the aberration correc…

    Submitted 11 December, 2018; originally announced December 2018.