Skip to main content

Showing 1–22 of 22 results for author: Roberts, A

Searching in archive eess. Search in all archives.
.
  1. arXiv:2402.02292  [pdf

    physics.comp-ph eess.SP

    A 3-D Full-Wave Model to Study the Impact of Soybean Components and Structure on L-Band Backscatter

    Authors: Kaiser Niknam, Jasmeet Judge, A. Kaleo Roberts, Alejandro Monsivais-Huertero, Robert Moore, Kamal Sarabandi, Jiayi Wu

    Abstract: Microwave remote sensing offers a powerful tool for monitoring the growth of short, dense vegetation like soybean. As the plants mature, changes in their biomass and 3-D structure impact the electromagnetic (EM) backscatter signal. This backscatter information holds valuable insights into crop health and yield, prompting the need for a comprehensive understanding of how structural and biophysical… ▽ More

    Submitted 3 February, 2024; originally announced February 2024.

  2. arXiv:2304.11476  [pdf

    eess.IV

    Maximum Spherical Mean Value (mSMV) Filtering for Whole Brain Quantitative Susceptibility Map**

    Authors: Alexandra G. Roberts, Dominick J. Romano, Mert Şişman, Alexey V. Dimov, Pascal Spincemaille, Thanh D. Nguyen, Ilhami Kovanlikaya, Susan A. Gauthier, Yi Wang

    Abstract: To develop a tissue field filtering algorithm, called maximum Spherical Mean Value (mSMV), for reducing shadow artifacts in quantitative susceptibility map** (QSM) of the brain without requiring brain tissue erosion.Residual background field is a major source of shadow artifacts in QSM. The mSMV algorithm filters large field values near the border, where the maximum value of the harmonic backgro… ▽ More

    Submitted 27 November, 2023; v1 submitted 22 April, 2023; originally announced April 2023.

    Comments: 12 pages, 5 figures

  3. arXiv:2301.12662  [pdf, other

    cs.SD cs.AI cs.LG cs.MM eess.AS

    SingSong: Generating musical accompaniments from singing

    Authors: Chris Donahue, Antoine Caillon, Adam Roberts, Ethan Manilow, Philippe Esling, Andrea Agostinelli, Mauro Verzetti, Ian Simon, Olivier Pietquin, Neil Zeghidour, Jesse Engel

    Abstract: We present SingSong, a system that generates instrumental music to accompany input vocals, potentially offering musicians and non-musicians alike an intuitive new way to create music featuring their own voice. To accomplish this, we build on recent developments in musical source separation and audio generation. Specifically, we apply a state-of-the-art source separation algorithm to a large corpus… ▽ More

    Submitted 29 January, 2023; originally announced January 2023.

  4. arXiv:2301.11325  [pdf, other

    cs.SD cs.LG eess.AS

    MusicLM: Generating Music From Text

    Authors: Andrea Agostinelli, Timo I. Denk, Zalán Borsos, Jesse Engel, Mauro Verzetti, Antoine Caillon, Qingqing Huang, Aren Jansen, Adam Roberts, Marco Tagliasacchi, Matt Sharifi, Neil Zeghidour, Christian Frank

    Abstract: We introduce MusicLM, a model generating high-fidelity music from text descriptions such as "a calming violin melody backed by a distorted guitar riff". MusicLM casts the process of conditional music generation as a hierarchical sequence-to-sequence modeling task, and it generates music at 24 kHz that remains consistent over several minutes. Our experiments show that MusicLM outperforms previous s… ▽ More

    Submitted 26 January, 2023; originally announced January 2023.

    Comments: Supplementary material at https://google-research.github.io/seanet/musiclm/examples and https://kaggle.com/datasets/googleai/musiccaps

  5. arXiv:2206.14639  [pdf, other

    eess.AS cs.LG cs.SD

    DDKtor: Automatic Diadochokinetic Speech Analysis

    Authors: Yael Segal, Kasia Hitczenko, Matthew Goldrick, Adam Buchwald, Angela Roberts, Joseph Keshet

    Abstract: Diadochokinetic speech tasks (DDK), in which participants repeatedly produce syllables, are commonly used as part of the assessment of speech motor impairments. These studies rely on manual analyses that are time-intensive, subjective, and provide only a coarse-grained picture of speech. This paper presents two deep neural network models that automatically segment consonants and vowels from unanno… ▽ More

    Submitted 29 June, 2022; originally announced June 2022.

    Comments: Accepted to Interspeech 2022

  6. arXiv:2206.05408  [pdf, other

    cs.SD cs.LG eess.AS

    Multi-instrument Music Synthesis with Spectrogram Diffusion

    Authors: Curtis Hawthorne, Ian Simon, Adam Roberts, Neil Zeghidour, Josh Gardner, Ethan Manilow, Jesse Engel

    Abstract: An ideal music synthesizer should be both interactive and expressive, generating high-fidelity audio in realtime for arbitrary combinations of instruments and notes. Recent neural synthesizers have exhibited a tradeoff between domain-specific models that offer detailed control of only specific instruments, or raw waveform models that can train on any music but with minimal control and slow generat… ▽ More

    Submitted 12 December, 2022; v1 submitted 10 June, 2022; originally announced June 2022.

  7. Articulatory Coordination for Speech Motor Tracking in Huntington Disease

    Authors: Matthew Perez, Amrit Romana, Angela Roberts, Noelle Carlozzi, Jennifer Ann Miner, Praveen Dayalu, Emily Mower Provost

    Abstract: Huntington Disease (HD) is a progressive disorder which often manifests in motor impairment. Motor severity (captured via motor score) is a key component in assessing overall HD severity. However, motor score evaluation involves in-clinic visits with a trained medical professional, which are expensive and not always accessible. Speech analysis provides an attractive avenue for tracking HD severity… ▽ More

    Submitted 28 September, 2021; originally announced September 2021.

  8. Quantifying the unknown impact of segmentation uncertainty on image-based simulations

    Authors: Michael C. Krygier, Tyler LaBonte, Carianne Martinez, Chance Norris, Krish Sharma, Lincoln N. Collins, Partha P. Mukherjee, Scott A. Roberts

    Abstract: Image-based simulation, the use of 3D images to calculate physical quantities, fundamentally relies on image segmentation to create the computational geometry. However, this process introduces image segmentation uncertainty because there is a variety of different segmentation tools (both manual and machine-learning-based) that will each produce a unique and valid segmentation. First, we demonstrat… ▽ More

    Submitted 9 September, 2021; v1 submitted 17 December, 2020; originally announced December 2020.

    Journal ref: Nature Communications 12, 5414 (2021)

  9. arXiv:2010.08503  [pdf, other

    eess.AS cs.SD

    Classification of Manifest Huntington Disease using Vowel Distortion Measures

    Authors: Amrit Romana, John Bandon, Noelle Carlozzi, Angela Roberts, Emily Mower Provost

    Abstract: Huntington disease (HD) is a fatal autosomal dominant neurocognitive disorder that causes cognitive disturbances, neuropsychiatric symptoms, and impaired motor abilities (e.g., gait, speech, voice). Due to its progressive nature, HD treatment requires ongoing clinical monitoring of symptoms. Individuals with the gene mutation which causes HD may exhibit a range of speech symptoms as they progress… ▽ More

    Submitted 19 October, 2020; v1 submitted 16 October, 2020; originally announced October 2020.

  10. Classification of Huntington Disease using Acoustic and Lexical Features

    Authors: Matthew Perez, Wenyu **, Duc Le, Noelle Carlozzi, Praveen Dayalu, Angela Roberts, Emily Mower Provost

    Abstract: Speech is a critical biomarker for Huntington Disease (HD), with changes in speech increasing in severity as the disease progresses. Speech analyses are currently conducted using either transcriptions created manually by trained professionals or using global rating scales. Manual transcription is both expensive and time-consuming and global rating scales may lack sufficient sensitivity and fidelit… ▽ More

    Submitted 7 August, 2020; originally announced August 2020.

    Comments: 4 pages

  11. arXiv:2001.04643  [pdf, other

    cs.LG cs.SD eess.AS eess.SP stat.ML

    DDSP: Differentiable Digital Signal Processing

    Authors: Jesse Engel, Lamtharn Hantrakul, Chenjie Gu, Adam Roberts

    Abstract: Most generative models of audio directly generate samples in one of two domains: time or frequency. While sufficient to express any signal, these representations are inefficient, as they do not utilize existing knowledge of how sound is generated and perceived. A third approach (vocoders/synthesizers) successfully incorporates strong domain knowledge of signal processing and perception, but has be… ▽ More

    Submitted 14 January, 2020; originally announced January 2020.

  12. arXiv:1910.10793  [pdf, other

    eess.IV cs.CV cs.LG

    We Know Where We Don't Know: 3D Bayesian CNNs for Credible Geometric Uncertainty

    Authors: Tyler LaBonte, Carianne Martinez, Scott A. Roberts

    Abstract: Deep learning has been successfully applied to the segmentation of 3D Computed Tomography (CT) scans. Establishing the credibility of these segmentations requires uncertainty quantification (UQ) to identify untrustworthy predictions. Recent UQ architectures include Monte Carlo dropout networks (MCDNs), which approximate deep Gaussian processes, and Bayesian neural networks (BNNs), which learn the… ▽ More

    Submitted 1 April, 2020; v1 submitted 23 October, 2019; originally announced October 2019.

    Comments: Preprint

    Report number: SAND2020-3269 R

  13. arXiv:1907.06637  [pdf, other

    cs.SD cs.HC cs.LG eess.AS stat.ML

    The Bach Doodle: Approachable music composition with machine learning at scale

    Authors: Cheng-Zhi Anna Huang, Curtis Hawthorne, Adam Roberts, Monica Dinculescu, James Wexler, Leon Hong, Jacob Howcroft

    Abstract: To make music composition more approachable, we designed the first AI-powered Google Doodle, the Bach Doodle, where users can create their own melody and have it harmonized by a machine learning model Coconet (Huang et al., 2017) in the style of Bach. For users to input melodies, we designed a simplified sheet-music based interface. To support an interactive experience at scale, we re-implemented… ▽ More

    Submitted 14 July, 2019; originally announced July 2019.

    Comments: Proceedings of the 18th International Society for Music Information Retrieval Conference, ISMIR 2019

  14. arXiv:1906.08827  [pdf, other

    physics.med-ph eess.IV

    Deformable Slice-to-Volume Registration for Motion Correction in Fetal Body MRI

    Authors: Alena Uus, Tong Zhang, Laurence H. Jackson, Thomas A. Roberts, Mary A. Rutherford, Joseph V. Hajnal, Maria Deprez

    Abstract: In in-utero MRI, motion correction for fetal body and placenta poses a particular challenge due to the presence of local non-rigid transformations of organs caused by bending and stretching. The existing slice-to-volume registration (SVR) reconstruction methods are widely employed for motion correction of fetal brain that undergoes only rigid transformation. However, for reconstruction of fetal bo… ▽ More

    Submitted 28 February, 2020; v1 submitted 20 June, 2019; originally announced June 2019.

  15. arXiv:1905.06118  [pdf, other

    cs.SD cs.LG cs.MM eess.AS stat.ML

    Learning to Groove with Inverse Sequence Transformations

    Authors: Jon Gillick, Adam Roberts, Jesse Engel, Douglas Eck, David Bamman

    Abstract: We explore models for translating abstract musical ideas (scores, rhythms) into expressive performances using Seq2Seq and recurrent Variational Information Bottleneck (VIB) models. Though Seq2Seq models usually require painstakingly aligned corpora, we show that it is possible to adapt an approach from the Generative Adversarial Network (GAN) literature (e.g. Pix2Pix (Isola et al., 2017) and Vid2V… ▽ More

    Submitted 26 July, 2019; v1 submitted 14 May, 2019; originally announced May 2019.

    Comments: Blog post and links: https://g.co/magenta/groovae

    ACM Class: J.5; I.2

    Journal ref: Proceedings of the 36th International Conference on Machine Learning, PMLR 97:2269-2279, 2019

  16. arXiv:1903.07227  [pdf, other

    cs.LG cs.SD eess.AS stat.ML

    Counterpoint by Convolution

    Authors: Cheng-Zhi Anna Huang, Tim Cooijmans, Adam Roberts, Aaron Courville, Douglas Eck

    Abstract: Machine learning models of music typically break up the task of composition into a chronological process, composing a piece of music in a single pass from beginning to end. On the contrary, human composers write music in a nonlinear fashion, scribbling motifs here and there, often revisiting choices previously made. In order to better approximate this process, we train a convolutional neural netwo… ▽ More

    Submitted 17 March, 2019; originally announced March 2019.

    Comments: Proceedings of the 18th International Society for Music Information Retrieval Conference, ISMIR 2017

    ACM Class: H.5.5; I.2

  17. arXiv:1902.08710  [pdf, other

    cs.SD cs.LG eess.AS stat.ML

    GANSynth: Adversarial Neural Audio Synthesis

    Authors: Jesse Engel, Kumar Krishna Agrawal, Shuo Chen, Ishaan Gulrajani, Chris Donahue, Adam Roberts

    Abstract: Efficient audio synthesis is an inherently difficult machine learning task, as human perception is sensitive to both global structure and fine-scale waveform coherence. Autoregressive models, such as WaveNet, model local structure at the expense of global latent structure and slow iterative sampling, while Generative Adversarial Networks (GANs), have global latent conditioning and efficient parall… ▽ More

    Submitted 14 April, 2019; v1 submitted 22 February, 2019; originally announced February 2019.

    Comments: Colab Notebook: http://goo.gl/magenta/gansynth-demo

  18. arXiv:1810.12247  [pdf, other

    cs.SD cs.LG eess.AS stat.ML

    Enabling Factorized Piano Music Modeling and Generation with the MAESTRO Dataset

    Authors: Curtis Hawthorne, Andriy Stasyuk, Adam Roberts, Ian Simon, Cheng-Zhi Anna Huang, Sander Dieleman, Erich Elsen, Jesse Engel, Douglas Eck

    Abstract: Generating musical audio directly with neural networks is notoriously difficult because it requires coherently modeling structure at many different timescales. Fortunately, most music is also highly structured and can be represented as discrete note events played on musical instruments. Herein, we show that by using notes as an intermediate representation, we can train a suite of models capable of… ▽ More

    Submitted 17 January, 2019; v1 submitted 29 October, 2018; originally announced October 2018.

    Comments: Examples available at https://goo.gl/magenta/maestro-examples

  19. arXiv:1806.00195  [pdf, other

    stat.ML cs.LG cs.SD eess.AS

    Learning a Latent Space of Multitrack Measures

    Authors: Ian Simon, Adam Roberts, Colin Raffel, Jesse Engel, Curtis Hawthorne, Douglas Eck

    Abstract: Discovering and exploring the underlying structure of multi-instrumental music using learning-based approaches remains an open problem. We extend the recent MusicVAE model to represent multitrack polyphonic measures as vectors in a latent space. Our approach enables several useful operations such as generating plausible measures from scratch, interpolating between measures in a musically meaningfu… ▽ More

    Submitted 1 June, 2018; originally announced June 2018.

  20. arXiv:1803.05428  [pdf, other

    cs.LG cs.SD eess.AS stat.ML

    A Hierarchical Latent Vector Model for Learning Long-Term Structure in Music

    Authors: Adam Roberts, Jesse Engel, Colin Raffel, Curtis Hawthorne, Douglas Eck

    Abstract: The Variational Autoencoder (VAE) has proven to be an effective model for producing semantically meaningful latent representations for natural data. However, it has thus far seen limited application to sequential data, and, as we demonstrate, existing recurrent VAE models have difficulty modeling sequences with long-term structure. To address this issue, we propose the use of a hierarchical decode… ▽ More

    Submitted 11 November, 2019; v1 submitted 13 March, 2018; originally announced March 2018.

    Comments: ICML Camera Ready Version (w/ fixed typos)

    Journal ref: ICML 2018

  21. arXiv:1710.11153  [pdf, other

    cs.SD cs.LG eess.AS stat.ML

    Onsets and Frames: Dual-Objective Piano Transcription

    Authors: Curtis Hawthorne, Erich Elsen, Jialin Song, Adam Roberts, Ian Simon, Colin Raffel, Jesse Engel, Sageev Oore, Douglas Eck

    Abstract: We advance the state of the art in polyphonic piano music transcription by using a deep convolutional and recurrent neural network which is trained to jointly predict onsets and frames. Our model predicts pitch onset events and then uses those predictions to condition framewise pitch predictions. During inference, we restrict the predictions from the framewise detector by not allowing a new note t… ▽ More

    Submitted 5 June, 2018; v1 submitted 30 October, 2017; originally announced October 2017.

    Comments: Examples available at https://goo.gl/magenta/onsets-frames-examples

  22. arXiv:1106.0016  [pdf, ps, other

    math.OC eess.SY

    A New Position Control Strategy for VTOL UAVs using IMU and GPS measurements

    Authors: Andrew Roberts, Abdelhamid Tayebi

    Abstract: We propose a new position control strategy for VTOL-UAVs using IMU and GPS measurements. Since there is no sensor that measures the attitude, our approach does not rely on the knowledge (or reconstruction) of the system orientation as usually done in the existing literature. Instead, IMU and GPS measurements are directly incorporated in the control law. An important feature of the proposed strateg… ▽ More

    Submitted 20 March, 2012; v1 submitted 31 May, 2011; originally announced June 2011.

    Comments: Submitted to Automatica