Showing 1–50 of 50 results for author: Williams, J

Searching in archive eess.
  1. arXiv:2406.14861  [pdf, other]

    eess.SY cs.ET

    Resilience of the Electric Grid through Trustable IoT-Coordinated Assets

    Authors: Vineet J. Nair, Venkatesh Venkataramanan, Priyank Srivastava, Partha S. Sarker, Anurag Srivastava, Laurentiu D. Marinovici, Jun Zha, Christopher Irwin, Prateek Mittal, John Williams, H. Vincent Poor, Anuradha M. Annaswamy

    Abstract: The electricity grid has evolved from a physical system to a cyber-physical system with digital devices that perform measurement, control, communication, computation, and actuation. The increased penetration of distributed energy resources (DERs) that include renewable generation, flexible loads, and storage provides extraordinary opportunities for improvements in efficiency and sustainability. Ho…

    Submitted 21 June, 2024; originally announced June 2024.

    Comments: Submitted to the Proceedings of the National Academy of Sciences (PNAS), under review

  2. arXiv:2406.07797  [pdf, other]

    eess.SP physics.app-ph

    Real-time Deformation Correction in Additively Printed Flexible Antenna Arrays

    Authors: Sreeni Poolakkal, Abdullah Islam, Shrestha Bansal, Arpit Rao, Ted Dabrowski, Kalsi Kwan, Amit Mishra, Quiyan Xu, Erfan Ghaderi, Pradeep Lall, Sudip Shekhar, Julio Navarro, Shenqiang Ren, John Williams, Subhanshu Gupta

    Abstract: Conformal phased arrays provide multiple degrees of freedom to the scan angle, which is typically limited by antenna aperture in rigid arrays. Silicon-based RF signal processing offers reliable, reconfigurable, multi-functional, and compact control for conformal phased arrays that can be used for on-the-move communication. While the lightweight, compactness, and shape-changing properties of the co…

    Submitted 21 June, 2024; v1 submitted 11 June, 2024; originally announced June 2024.

  3. arXiv:2403.06899  [pdf, other]

    eess.SP

    Multiobject Tracking for Thresholded Cell Measurements

    Authors: Thomas Kropfreiter, Jason L. Williams, Florian Meyer

    Abstract: In many multiobject tracking applications, including radar and sonar tracking, after prefiltering the received signal, measurement data is typically structured in cells. The cells, e.g., represent different range and bearing values. However, conventional multiobject tracking methods use so-called point measurements. Point measurements are provided by a preprocessing stage that applies a threshold…

    Submitted 1 June, 2024; v1 submitted 11 March, 2024; originally announced March 2024.

    Comments: FUSION-24 conference

  4. arXiv:2402.06304  [pdf, ps, other]

    cs.SD cs.AI eess.AS

    A New Approach to Voice Authenticity

    Authors: Nicolas M. Müller, Piotr Kawa, Shen Hu, Matthias Neu, Jennifer Williams, Philip Sperl, Konstantin Böttinger

    Abstract: Voice faking, driven primarily by recent advances in text-to-speech (TTS) synthesis technology, poses significant societal challenges. Currently, the prevailing assumption is that unaltered human speech can be considered genuine, while fake speech comes from TTS synthesis. We argue that this binary distinction is oversimplified. For instance, altered playback speeds can be used for malicious purpo…

    Submitted 9 February, 2024; originally announced February 2024.

  5. arXiv:2402.04753  [pdf, other]

    eess.IV cs.CV

    Cortical Surface Diffusion Generative Models

    Authors: Zhenshan Xie, Simon Dahan, Logan Z. J. Williams, M. Jorge Cardoso, Emma C. Robinson

    Abstract: Cortical surface analysis has gained increased prominence, given its potential implications for neurological and developmental disorders. Traditional vision diffusion models, while effective in generating natural images, present limitations in capturing intricate development patterns in neuroimaging due to limited datasets. This is particularly true for generating cortical surfaces where individua…

    Submitted 7 February, 2024; originally announced February 2024.

    Comments: 4 pages

  6. arXiv:2401.03936  [pdf, other]

    eess.AS cs.CR cs.LG cs.SD

    Exploratory Evaluation of Speech Content Masking

    Authors: Jennifer Williams, Karla Pizzi, Paul-Gauthier Noe, Sneha Das

    Abstract: Most recent speech privacy efforts have focused on anonymizing acoustic speaker attributes but there has not been as much research into protecting information from speech content. We introduce a toy problem that explores an emerging type of privacy called "content masking" which conceals selected words and phrases in speech. In our efforts to define this problem space, we evaluate an introductory…

    Submitted 8 January, 2024; originally announced January 2024.

    Comments: Accepted to ITG Speech Conference 2023

  7. arXiv:2308.05474  [pdf, other]

    eess.IV cs.CV

    Spatio-Temporal Encoding of Brain Dynamics with Surface Masked Autoencoders

    Authors: Simon Dahan, Logan Z. J. Williams, Yourong Guo, Daniel Rueckert, Emma C. Robinson

    Abstract: The development of robust and generalisable models for encoding the spatio-temporal dynamics of human brain activity is crucial for advancing neuroscientific discoveries. However, significant individual variation in the organisation of the human cerebral cortex makes it difficult to identify population-level trends in these signals. Recently, Surface Vision Transformers (SiTs) have emerged as a pr…

    Submitted 11 June, 2024; v1 submitted 10 August, 2023; originally announced August 2023.

    Comments: Accepted for publication at MIDL 2024; 20 figures; 7 figures

  8. arXiv:2307.05426  [pdf, ps, other]

    eess.SP cs.AI cs.LG

    Using BOLD-fMRI to Compute the Respiration Volume per Time (RTV) and Respiration Variation (RV) with Convolutional Neural Networks (CNN) in the Human Connectome Development Cohort

    Authors: Abdoljalil Addeh, Fernando Vega, Rebecca J Williams, Ali Golestani, G. Bruce Pike, M. Ethan MacDonald

    Abstract: In many fMRI studies, respiratory signals are unavailable or do not have acceptable quality. Consequently, the direct removal of low-frequency respiratory variations from BOLD signals is not possible. This study proposes a one-dimensional CNN model for reconstruction of two respiratory measures, RV and RVT. Results show that a CNN can capture informative features from resting BOLD signals and reco…

    Submitted 3 July, 2023; originally announced July 2023.

    Comments: 6 pages, 5 figures

  9. arXiv:2306.01375  [pdf, other]

    eess.IV cs.CV cs.LG

    Robust and Generalisable Segmentation of Subtle Epilepsy-causing Lesions: a Graph Convolutional Approach

    Authors: Hannah Spitzer, Mathilde Ripart, Abdulah Fawaz, Logan Z. J. Williams, MELD project, Emma Robinson, Juan Eugenio Iglesias, Sophie Adler, Konrad Wagstyl

    Abstract: Focal cortical dysplasia (FCD) is a leading cause of drug-resistant focal epilepsy, which can be cured by surgery. These lesions are extremely subtle and often missed even by expert neuroradiologists. "Ground truth" manual lesion masks are therefore expensive, limited and have large inter-rater variability. Existing FCD detection methods are limited by high numbers of false positive predictions, p…

    Submitted 5 June, 2023; v1 submitted 2 June, 2023; originally announced June 2023.

    Comments: accepted at MICCAI 2023

  10. arXiv:2303.11909  [pdf, other]

    eess.IV cs.CV q-bio.NC

    The Multiscale Surface Vision Transformer

    Authors: Simon Dahan, Logan Z. J. Williams, Daniel Rueckert, Emma C. Robinson

    Abstract: Surface meshes are a favoured domain for representing structural and functional information on the human cortex, but their complex topology and geometry pose significant challenges for deep learning analysis. While Transformers have excelled as domain-agnostic architectures for sequence-to-sequence learning, the quadratic cost of the self-attention operation remains an obstacle for many dense pred…

    Submitted 11 June, 2024; v1 submitted 21 March, 2023; originally announced March 2023.

    Comments: Accepted for publication at MIDL 2024, 17 pages, 6 figures

  11. arXiv:2301.08925  [pdf, other]

    eess.AS cs.CR cs.SD

    New Challenges for Content Privacy in Speech and Audio

    Authors: Jennifer Williams, Karla Pizzi, Shuvayanti Das, Paul-Gauthier Noe

    Abstract: Privacy in speech and audio has many facets. A particularly under-developed area of privacy in this domain involves consideration for information related to content and context. Speech content can include words and their meaning or even stylistic markers, pathological speech, intonation patterns, or emotion. More generally, audio captured in-the-wild may contain background speech or reveal context…

    Submitted 21 January, 2023; originally announced January 2023.

    Comments: Accepted for publication in ISCA SPSC Symposium 2022

  12. arXiv:2301.04416  [pdf, other]

    q-bio.QM cs.CV eess.IV

    pyssam -- a Python library for statistical modelling of biomedical shape and appearance

    Authors: Josh Williams, Ali Ozel, Uwe Wolfram

    Abstract: pyssam is a Python library for creating statistical shape and appearance models (SSAMs) for biological (and other) shapes such as bones, lungs or other organs. A point cloud best describing the anatomical 'landmarks' of the organ are required from each sample in a small population as an input. Additional information such as landmark gray-value can be included to incorporate joint correlations of s…

    Submitted 11 January, 2023; originally announced January 2023.

    Comments: 5 pages, 3 figures, Journal of Open Source Software submission

  13. arXiv:2212.11594  [pdf, other]

    eess.SP

    Electromagnetic Based Communication Model for Dynamic Metasurface Antennas

    Authors: Robin Jess Williams, Pablo Ramirez-Espinosa, Jide Yuan, Elisabeth De Carvalho

    Abstract: Dynamic metasurface antennas (DMAs) arise as a promising technology in the field of massive multiple-input multiple-output (mMIMO) systems, offering the possibility of integrating a large number of antennas in a limited -- and potentially large -- aperture while keeping the required number of radio-frequency (RF) chains under control. Although envisioned as practical realizations of mMIMO systems,…

    Submitted 22 December, 2022; originally announced December 2022.

  14. arXiv:2211.15510  [pdf, other]

    cs.CV cs.LG eess.IV

    Localized Shortcut Removal

    Authors: Nicolas M. Müller, Jochen Jacobs, Jennifer Williams, Konstantin Böttinger

    Abstract: Machine learning is a data-driven field, and the quality of the underlying datasets plays a crucial role in learning success. However, high performance on held-out test data does not necessarily indicate that a model generalizes or learns anything meaningful. This is often due to the existence of machine learning shortcuts - features in the data that are predictive but unrelated to the problem at…

    Submitted 23 May, 2023; v1 submitted 24 November, 2022; originally announced November 2022.

    Comments: Accepted at XAI4CV @ CVPR2023

  15. arXiv:2207.10164  [pdf, other]

    eess.SP

    Trajectory PMB Filters for Extended Object Tracking Using Belief Propagation

    Authors: Yuxuan Xia, Ángel F. García-Fernández, Florian Meyer, Jason L. Williams, Karl Granström, Lennart Svensson

    Abstract: In this paper, we propose a Poisson multi-Bernoulli (PMB) filter for extended object tracking (EOT), which directly estimates the set of object trajectories, using belief propagation (BP). The proposed filter propagates a PMB density on the posterior of sets of trajectories through the filtering recursions over time, where the PMB mixture (PMBM) posterior after the update step is approximated as a…

    Submitted 19 September, 2023; v1 submitted 20 July, 2022; originally announced July 2022.

    Comments: Accepted for publication in IEEE Transactions on Aerospace and Electronic Systems. MATLAB implementation available at https://github.com/yuhsuansia/Trajectory-PMB-EOT-BP

  16. arXiv:2206.13245  [pdf, other]

    eess.SP

    Performance Evaluation of Dynamic Metasurface Antennas: Impact of Insertion Losses and Coupling

    Authors: Pablo Ramírez-Espinosa, Robin Jess Williams, Jide Yuan, Elisabeth de Carvalho

    Abstract: This paper evaluates the performance of multi-user massive multiple-input multiple-output (MIMO) systems in which the base station is equipped with a dynamic metasurface antenna (DMA). Due to the physical implementation of DMAs, conventional models widely-used in MIMO are no longer valid, and electromagnetic phenomena such as mutual coupling, insertion losses and reflections inside the waveguides…

    Submitted 27 June, 2022; originally announced June 2022.

  17. Multiple Object Trajectory Estimation Using Backward Simulation

    Authors: Yuxuan Xia, Lennart Svensson, Ángel F. García-Fernández, Jason L. Williams, Daniel Svensson, Karl Granström

    Abstract: This paper presents a general solution for computing the multi-object posterior for sets of trajectories from a sequence of multi-object (unlabelled) filtering densities and a multi-object dynamic model. Importantly, the proposed solution opens an avenue of trajectory estimation possibilities for multi-object filters that do not explicitly estimate trajectories. In this paper, we first derive a ge…

    Submitted 16 June, 2022; originally announced June 2022.

    Comments: Accepted for publication in IEEE Transactions on Signal Processing

  18. arXiv:2204.03408  [pdf, other]

    eess.IV cs.CV q-bio.NC

    Surface Vision Transformers: Flexible Attention-Based Modelling of Biomedical Surfaces

    Authors: Simon Dahan, Hao Xu, Logan Z. J. Williams, Abdulah Fawaz, Chunhui Yang, Timothy S. Coalson, Michelle C. Williams, David E. Newby, A. David Edwards, Matthew F. Glasser, Alistair A. Young, Daniel Rueckert, Emma C. Robinson

    Abstract: Recent state-of-the-art performances of Vision Transformers (ViT) in computer vision tasks demonstrate that a general-purpose architecture, which implements long-range self-attention, could replace the local feature learning operations of convolutional neural networks. In this paper, we extend ViTs to surfaces by reformulating the task of surface learning as a sequence-to-sequence learning problem…

    Submitted 7 April, 2022; originally announced April 2022.

    Comments: 10 pages, 3 figures, Submitted to IEEE Transactions on Medical Imaging

  19. arXiv:2203.16414  [pdf, other]

    cs.CV eess.IV q-bio.NC

    Surface Vision Transformers: Attention-Based Modelling applied to Cortical Analysis

    Authors: Simon Dahan, Abdulah Fawaz, Logan Z. J. Williams, Chunhui Yang, Timothy S. Coalson, Matthew F. Glasser, A. David Edwards, Daniel Rueckert, Emma C. Robinson

    Abstract: The extension of convolutional neural networks (CNNs) to non-Euclidean geometries has led to multiple frameworks for studying manifolds. Many of those methods have shown design limitations resulting in poor modelling of long-range associations, as the generalisation of convolutions to irregular surfaces is non-trivial. Motivated by the success of attention-modelling in computer vision, we translat…

    Submitted 30 March, 2022; originally announced March 2022.

    Comments: 22 pages, 6 figures, Accepted to MIDL 2022, OpenReview link https://openreview.net/forum?id=mpp843Bsf-

    Journal ref: Proceedings of Machine Learning Research. 172 (2022) 282-303

  20. arXiv:2203.14640  [pdf, other]

    eess.AS

    Analysis of Voice Conversion and Code-Switching Synthesis Using VQ-VAE

    Authors: Shuvayanti Das, Jennifer Williams, Catherine Lai

    Abstract: This paper presents an analysis of speech synthesis quality achieved by simultaneously performing voice conversion and language code-switching using multilingual VQ-VAE speech synthesis in German, French, English and Italian. In this paper, we utilize VQ code indices representing phone information from VQ-VAE to perform code-switching and a VQ speaker code to perform voice conversion in a single s…

    Submitted 28 March, 2022; originally announced March 2022.

    Comments: Submitted to Interspeech 2022

  21. arXiv:2112.06048  [pdf, other]

    q-bio.NC cs.LG eess.IV

    Behavior measures are predicted by how information is encoded in an individual's brain

    Authors: Jennifer Williams, Leila Wehbe

    Abstract: Similar to how differences in the proficiency of the cardiovascular and musculoskeletal system predict an individual's athletic ability, differences in how the same brain region encodes information across individuals may explain their behavior. However, when studying how the brain encodes information, researchers choose different neuroimaging tasks (e.g., language or motor tasks), which can rely o…

    Submitted 11 December, 2021; originally announced December 2021.

  22. arXiv:2110.06760  [pdf, other]

    eess.AS

    Revisiting Speech Content Privacy

    Authors: Jennifer Williams, Junichi Yamagishi, Paul-Gauthier Noe, Cassia Valentini Botinhao, Jean-Francois Bonastre

    Abstract: In this paper, we discuss an important aspect of speech privacy: protecting spoken content. New capabilities from the field of machine learning provide a unique and timely opportunity to revisit speech content protection. There are many different applications of content privacy, even though this area has been under-explored in speech technology research. This paper presents several scenarios that…

    Submitted 13 October, 2021; originally announced October 2021.

    Comments: Accepted to ISCA Security and Privacy in Speech Communication (1st SPSC Symposium)

  23. arXiv:2109.03115  [pdf, other]

    q-bio.NC cs.CV cs.LG eess.IV

    Improving Phenotype Prediction using Long-Range Spatio-Temporal Dynamics of Functional Connectivity

    Authors: Simon Dahan, Logan Z. J. Williams, Daniel Rueckert, Emma C. Robinson

    Abstract: The study of functional brain connectivity (FC) is important for understanding the underlying mechanisms of many psychiatric disorders. Many recent analyses adopt graph convolutional networks, to study non-linear interactions between functionally-correlated states. However, although patterns of brain activation are known to be hierarchically organised in both space and time, many methods have fail…

    Submitted 7 September, 2021; originally announced September 2021.

    Comments: MLCN 2021

  24. arXiv:2109.01490  [pdf, other]

    eess.SP

    A Scalable Track-Before-Detect Method With Poisson/Multi-Bernoulli Model

    Authors: Thomas Kropfreiter, Jason L. Williams, Florian Meyer

    Abstract: We propose a scalable track-before-detect (TBD) tracking method based on a Poisson/multi-Bernoulli model. To limit computational complexity, we approximate the exact multi-Bernoulli mixture posterior probability density function (pdf) by a multi-Bernoulli pdf. Data association based on the sum-product algorithm and recycling of Bernoulli components enable the detection and tracking of low-observab…

    Submitted 3 September, 2021; originally announced September 2021.

    Comments: published at FUSION conference 2021

  25. arXiv:2107.09667  [pdf, other]

    cs.HC cs.AI cs.SD eess.AS

    Human Perception of Audio Deepfakes

    Authors: Nicolas M. Müller, Karla Pizzi, Jennifer Williams

    Abstract: The recent emergence of deepfakes has brought manipulated and generated content to the forefront of machine learning research. Automatic detection of deepfakes has seen many new machine learning techniques, however, human detection capabilities are far less explored. In this paper, we present results from comparing the abilities of humans and machines for detecting audio deepfakes used to imitate…

    Submitted 6 October, 2022; v1 submitted 20 July, 2021; originally announced July 2021.

    Comments: Published at ACM Multimedia 2022 Workshop DDAM First International Workshop on Deepfake Detection for Audio Multimedia at ACM Multimedia 2022

  26. arXiv:2106.12914  [pdf, other]

    cs.SD eess.AS

    Speech is Silver, Silence is Golden: What do ASVspoof-trained Models Really Learn?

    Authors: Nicolas M. Müller, Franziska Dieckmann, Pavel Czempin, Roman Canals, Konstantin Böttinger, Jennifer Williams

    Abstract: We present our analysis of a significant data artifact in the official 2019/2021 ASVspoof Challenge Dataset. We identify an uneven distribution of silence duration in the training and test splits, which tends to correlate with the target prediction label. Bonafide instances tend to have significantly longer leading and trailing silences than spoofed instances. In this paper, we explore this phenom…

    Submitted 28 September, 2021; v1 submitted 23 June, 2021; originally announced June 2021.

    Journal ref: ASVspoof 2021 Workshop

  27. arXiv:2105.01573  [pdf, other]

    eess.AS cs.SD

    Exploring Disentanglement with Multilingual and Monolingual VQ-VAE

    Authors: Jennifer Williams, Jason Fong, Erica Cooper, Junichi Yamagishi

    Abstract: This work examines the content and usefulness of disentangled phone and speaker representations from two separately trained VQ-VAE systems: one trained on multilingual data and another trained on monolingual data. We explore the multi- and monolingual models using four small proof-of-concept tasks: copy-synthesis, voice transformation, linguistic code-switching, and content-based privacy masking.…

    Submitted 28 June, 2021; v1 submitted 4 May, 2021; originally announced May 2021.

    Comments: Accepted to Speech Synthesis Workshop 2021 (SSW11)

  28. Scalable Detection and Tracking of Geometric Extended Objects

    Authors: Florian Meyer, Jason L. Williams

    Abstract: Multiobject tracking provides situational awareness that enables new applications for modern convenience, public safety, and homeland security. This paper presents a factor graph formulation and a particle-based sum-product algorithm (SPA) for scalable detection and tracking of extended objects. The proposed method dynamically introduces states of newly detected objects, efficiently performs proba…

    Submitted 9 December, 2021; v1 submitted 20 March, 2021; originally announced March 2021.

    Comments: 29 pages, 8 figures, 2 tables

  29. arXiv:2103.02561  [pdf, other]

    cs.CV cs.LG eess.IV

    ICAM-reg: Interpretable Classification and Regression with Feature Attribution for Mapping Neurological Phenotypes in Individual Scans

    Authors: Cher Bass, Mariana da Silva, Carole Sudre, Logan Z. J. Williams, Petru-Daniel Tudosiu, Fidel Alfaro-Almagro, Sean P. Fitzgibbon, Matthew F. Glasser, Stephen M. Smith, Emma C. Robinson

    Abstract: An important goal of medical imaging is to be able to precisely detect patterns of disease specific to individual scans; however, this is challenged in brain imaging by the degree of heterogeneity of shape and appearance. Traditional methods, based on image registration to a global template, historically fail to detect variable features of disease, as they utilise population-based analyses, suited…

    Submitted 3 March, 2021; originally announced March 2021.

  30. arXiv:2011.00922  [pdf, other]

    cs.IT eess.SP

    Multiuser MIMO with Large Intelligent Surfaces: Communication Model and Transmit Design

    Authors: Robin Jess Williams, Pablo Ramírez-Espinosa, Elisabeth de Carvalho, Thomas L. Marzetta

    Abstract: This paper proposes a communication model for multiuser multiple-input multiple-output (MIMO) systems based on large intelligent surfaces (LIS), where the LIS is modeled as a collection of tightly packed antenna elements. The LIS system is first represented in a circuital way, obtaining expressions for the radiated and received powers, as well as for the coupling between the distinct elements. The…

    Submitted 2 November, 2020; originally announced November 2020.

    Comments: 6 pages, 3 figures; This paper is submitted to IEEE International Conference on Communications (ICC) 2021

  31. arXiv:2010.10727  [pdf, other]

    eess.AS cs.LG cs.SD

    Learning Disentangled Phone and Speaker Representations in a Semi-Supervised VQ-VAE Paradigm

    Authors: Jennifer Williams, Yi Zhao, Erica Cooper, Junichi Yamagishi

    Abstract: We present a new approach to disentangle speaker voice and phone content by introducing new components to the VQ-VAE architecture for speech synthesis. The original VQ-VAE does not generalize well to unseen speakers or content. To alleviate this problem, we have incorporated a speaker encoder and speaker VQ codebook that learns global speaker characteristics entirely separate from the existing sub…

    Submitted 10 February, 2021; v1 submitted 20 October, 2020; originally announced October 2020.

    Comments: Accepted to ICASSP 2021

  32. arXiv:2008.02051  [pdf, ps, other]

    eess.SP

    Backward Simulation for Sets of Trajectories

    Authors: Yuxuan Xia, Lennart Svensson, Ángel F. García-Fernández, Karl Granström, Jason L. Williams

    Abstract: This paper presents a solution for recovering full trajectory information, via the calculation of the posterior of the set of trajectories, from a sequence of multitarget (unlabelled) filtering densities and the multitarget dynamic model. Importantly, the proposed solution opens an avenue of trajectory estimation possibilities for multitarget filters that do not explicitly estimate trajectories. I…

    Submitted 22 February, 2021; v1 submitted 5 August, 2020; originally announced August 2020.

    Comments: Published in 23rd International Conference on Information Fusion. This arXiv version contains more detailed derivations

  33. arXiv:2007.15652  [pdf, other]

    cs.RO eess.IV

    Canopy Density Estimation in Perennial Horticulture Crops Using 3D Spinning Lidar SLAM

    Authors: Thomas Lowe, Peyman Moghadam, Everard Edwards, Jason Williams

    Abstract: We propose a novel, canopy density estimation solution using a 3D ray cloud representation for perennial horticultural crops at the field scale. To attain high spatial and temporal fidelity in field conditions, we propose the application of continuous-time 3D SLAM (Simultaneous Localisation and Mapping) to a spinning lidar payload (AgScan3D) mounted on a moving farm vehicle. The AgScan3D data is p…

    Submitted 14 December, 2020; v1 submitted 30 July, 2020; originally announced July 2020.

    Comments: Accepted to Journal of Field Robotics. More information at https://github.com/csiro-robotics/agscan3d

  34. arXiv:2006.06563  [pdf, other]

    eess.SP cs.CV cs.LG

    A Primer on Large Intelligent Surface (LIS) for Wireless Sensing in an Industrial Setting

    Authors: Cristian J. Vaca-Rubio, Pablo Ramirez-Espinosa, Robin Jess Williams, Kimmo Kansanen, Zheng-Hua Tan, Elisabeth de Carvalho, Petar Popovski

    Abstract: One of the beyond-5G developments that is often highlighted is the integration of wireless communication and radio sensing. This paper addresses the potential of communication-sensing integration of Large Intelligent Surfaces (LIS) in an exemplary Industry 4.0 scenario. Besides the potential for high throughput and efficient multiplexing of wireless links, an LIS can offer a high-resolution render…

    Submitted 16 November, 2020; v1 submitted 11 June, 2020; originally announced June 2020.

  35. arXiv:2005.07884  [pdf, other]

    eess.AS cs.SD

    Improved Prosody from Learned F0 Codebook Representations for VQ-VAE Speech Waveform Reconstruction

    Authors: Yi Zhao, Haoyu Li, Cheng-I Lai, Jennifer Williams, Erica Cooper, Junichi Yamagishi

    Abstract: Vector Quantized Variational AutoEncoders (VQ-VAE) are a powerful representation learning framework that can discover discrete groups of features from a speech signal without supervision. Until now, the VQ-VAE architecture has previously modeled individual types of speech features, such as only phones or only F0. This paper introduces an important extension to VQ-VAE for learning F0-related supras…

    Submitted 16 May, 2020; originally announced May 2020.

    Comments: Submitted to Interspeech 2020

  36. arXiv:2002.12696  [pdf, other]

    eess.SP

    Spatiotemporal Constraints for Sets of Trajectories with Applications to PMBM Densities

    Authors: Karl Granström, Lennart Svensson, Yuxuan Xia, Angel F. Garcia-Fernandez, Jason Williams

    Abstract: In this paper we introduce spatiotemporal constraints for trajectories, i.e., restrictions that the trajectory must be in some part of the state space (spatial constraint) at some point in time (temporal constraint). Spatiotemporal constraints on trajectories can be used to answer a range of important questions, including, e.g., "where did the person that were in area A at time t, go afterwards?".…

    Submitted 28 February, 2020; originally announced February 2020.

  37. arXiv:2002.12645  [pdf, other]

    cs.CL cs.LG cs.SD eess.AS

    Comparison of Speech Representations for Automatic Quality Estimation in Multi-Speaker Text-to-Speech Synthesis

    Authors: Jennifer Williams, Joanna Rownicka, Pilar Oplustil, Simon King

    Abstract: We aim to characterize how different speakers contribute to the perceived output quality of multi-speaker Text-to-Speech (TTS) synthesis. We automatically rate the quality of TTS using a neural network (NN) trained on human mean opinion score (MOS) ratings. First, we train and evaluate our NN model on 13 different TTS and voice conversion (VC) systems from the ASVSpoof 2019 Logical Access (LA) Dat…

    Submitted 27 April, 2020; v1 submitted 28 February, 2020; originally announced February 2020.

    Comments: accepted at Speaker Odyssey 2020

  38. arXiv:2001.10822  [pdf, other]

    eess.AS cs.CL cs.LG cs.SD stat.ML

    Lattice-based Improvements for Voice Triggering Using Graph Neural Networks

    Authors: Pranay Dighe, Saurabh Adya, Nuoyu Li, Srikanth Vishnubhotla, Devang Naik, Adithya Sagar, Ying Ma, Stephen Pulman, Jason Williams

    Abstract: Voice-triggered smart assistants often rely on detection of a trigger-phrase before they start listening for the user request. Mitigation of false triggers is an important aspect of building a privacy-centric non-intrusive smart assistant. In this paper, we address the task of false trigger mitigation (FTM) using a novel approach based on analyzing automatic speech recognition (ASR) lattices using…

    Submitted 24 January, 2020; originally announced January 2020.

  39. arXiv:1912.08718  [pdf, other]

    eess.SP cs.RO eess.IV stat.CO

    Poisson Multi-Bernoulli Mixtures for Sets of Trajectories

    Authors: Karl Granström, Lennart Svensson, Yuxuan Xia, Jason Williams, Ángel F. García-Fernández

    Abstract: For the standard point target model with Poisson birth process, the Poisson Multi-Bernoulli Mixture (PMBM) is a conjugate multi-target density. The PMBM filter for sets of targets has been shown to have state-of-the-art performance and a structure similar to the Multiple Hypothesis Tracker (MHT). In this paper we consider a recently developed formulation of multiple target tracking as a random fin…

    Submitted 17 December, 2019; originally announced December 2019.

    Comments: arXiv admin note: text overlap with arXiv:1812.05131

  40. arXiv:1912.06644  [pdf, other]

    cs.IT eess.SP

    A Communication Model for Large Intelligent Surfaces

    Authors: Robin Jess Williams, Elisabeth De Carvalho, Thomas L. Marzetta

    Abstract: The purpose of this paper is to introduce a communication model for Large Intelligent Surfaces (LIS). A LIS is modelled as a collection of tiny closely spaced antenna elements. Due to the proximity of the elements, mutual coupling arises. An optimal transmitter design depends on the mutual coupling matrix. For single user communication, the optimal transmitter uses the inverse of the mutual coupli…

    Submitted 4 May, 2020; v1 submitted 13 December, 2019; originally announced December 2019.

    Comments: 6 pages, 7 figures; typos corrected

  41. arXiv:1912.01748  [pdf, ps, other]

    eess.SP

    Multi-Scan Implementation of the Trajectory Poisson Multi-Bernoulli Mixture Filter

    Authors: Yuxuan Xia, Karl Granström, Lennart Svensson, Ángel F. García-Fernández, Jason L. Williams

    Abstract: The Poisson multi-Bernoulli mixture (PMBM) and the multi-Bernoulli mixture (MBM) are two multi-target distributions for which closed-form filtering recursions exist. The PMBM has a Poisson birth process, whereas the MBM has a multi-Bernoulli birth process. This paper considers a recently developed formulation of the multi-target tracking problem using a random finite set of trajectories, through w…

    Submitted 27 February, 2020; v1 submitted 3 December, 2019; originally announced December 2019.

    Comments: Published in Journal of Advances in Information Fusion, Special issue on Multiple Hypothesis Tracking, Volume 14, Number 2, Page 213-235, December 2019. MATLAB code is available at https://github.com/yuhsuansia/Multi-scan-trajectory-PMBM-filter

    Journal ref: Journal of Advances in Information Fusion Volume 14 Number 2 December 2019

  42. arXiv:1911.09025  [pdf, ps, other]

    eess.SP

    Extended target Poisson multi-Bernoulli mixture trackers based on sets of trajectories

    Authors: Yuxuan Xia, Karl Granström, Lennart Svensson, Ángel F. García-Fernández, Jason L. Williams

    Abstract: The Poisson multi-Bernoulli mixture (PMBM) is a multi-target distribution for which the prediction and update are closed. By applying the random finite set (RFS) framework to multi-target tracking with sets of trajectories as the variable of interest, the PMBM trackers can efficiently estimate the set of target trajectories. This paper derives two trajectory RFS filters for extended target trackin…

    Submitted 19 November, 2019; originally announced November 2019.

    Comments: MATLAB code is available at https://github.com/yuhsuansia/Extended-Target-PMBM-Tracker. arXiv admin note: text overlap with arXiv:1812.05131

    Journal ref: Proceedings of the 22nd International Conference on Information Fusion, 2019

  43. arXiv:1908.08819  [pdf, other]

    eess.SP cs.CV stat.AP

    Gaussian implementation of the multi-Bernoulli mixture filter

    Authors: Ángel F. García-Fernández, Yuxuan Xia, Karl Granström, Lennart Svensson, Jason L. Williams

    Abstract: This paper presents the Gaussian implementation of the multi-Bernoulli mixture (MBM) filter. The MBM filter provides the filtering (multi-target) density for the standard dynamic and radar measurement models when the birth model is multi-Bernoulli or multi-Bernoulli mixture. Under linear/Gaussian models, the single target densities of the MBM mixture admit Gaussian closed-form expressions. Murty's…

    Submitted 23 August, 2019; originally announced August 2019.

    Comments: Matlab code for the MBM and PMBM filters is available at https://github.com/Agarciafernandez/MTT. Additional information on MTT, including the PMBM and MBM filters, can be found in the online course https://www.youtube.com/channel/UCa2-fpj6AV8T6JK1uTRuFpw

    Journal ref: Proceedings of the 22nd International Conference on Information Fusion, 2019

  44. arXiv:1812.05131  [pdf, other]

    eess.SP eess.SY

    Poisson multi-Bernoulli mixture trackers: continuity through random finite sets of trajectories

    Authors: Karl Granström, Lennart Svensson, Yuxuan Xia, Jason Williams, Angel F Garcia-Fernandez

    Abstract: The Poisson multi-Bernoulli mixture (PMBM) is an unlabelled multi-target distribution for which the prediction and update are closed. It has a Poisson birth process, and new Bernoulli components are generated on each new measurement as a part of the Bayesian measurement update. The PMBM filter is similar to the multiple hypothesis tracker (MHT), but seemingly does not provide explicit continuity b…

    Submitted 12 December, 2018; originally announced December 2018.

  45. arXiv:1801.01353  [pdf, other]

    eess.SP

    Poisson Multi-Bernoulli Approximations for Multiple Extended Object Filtering

    Authors: Yuxuan Xia, Karl Granström, Lennart Svensson, Maryam Fatemi, Ángel F. García-Fernández, Jason L. Williams

    Abstract: The Poisson multi-Bernoulli mixture (PMBM) is a multi-object conjugate prior for the closed-form Bayes random finite sets filter. The extended object PMBM filter provides a closed-form solution for multiple extended object filtering with standard models. This paper considers computationally lighter alternatives to the extended object PMBM filter by propagating a Poisson multi-Bernoulli (PMB) densi…

    Submitted 13 August, 2021; v1 submitted 4 January, 2018; originally announced January 2018.

    Comments: Accepted for publication in IEEE T-AES

  46. arXiv:1607.07942  [pdf, other]

    cs.AI cs.IT eess.SY

    Multiple scan data association by convex variational inference

    Authors: Jason L. Williams, Roslyn A. Lau

    Abstract: Data association, the reasoning over correspondence between targets and measurements, is a problem of fundamental importance in target tracking. Recently, belief propagation (BP) has emerged as a promising method for estimating the marginal probabilities of measurement to target association, providing fast, accurate estimates. The excellent performance of BP in the particular formulation used may…

    Submitted 23 January, 2018; v1 submitted 26 July, 2016; originally announced July 2016.

  47. An efficient, variational approximation of the best fitting multi-Bernoulli filter

    Authors: Jason L. Williams

    Abstract: The joint probabilistic data association (JPDA) filter is a popular tracking methodology for problems involving well-spaced targets, but it is rarely applied in problems with closely-spaced targets due to its complexity in these cases, and due to the well-known phenomenon of coalescence. This paper addresses these difficulties using random finite sets (RFSs) and variational inference, deriving a h…

    Submitted 18 November, 2014; v1 submitted 19 March, 2014; originally announced March 2014.

    Comments: Accepted, IEEE Transactions on Signal Processing, http://dx.doi.org/10.1109/TSP.2014.2370946

    Journal ref: IEEE Transactions on Signal Processing, vol 63, no 1, pp 258-273, January 2015

  48. Marginal multi-Bernoulli filters: RFS derivation of MHT, JIPDA and association-based MeMBer

    Authors: Jason L. Williams

    Abstract: Recent developments in random finite sets (RFSs) have yielded a variety of tracking methods that avoid data association. This paper derives a form of the full Bayes RFS filter and observes that data association is implicitly present, in a data structure similar to MHT. Subsequently, algorithms are obtained by approximating the distribution of associations. Two algorithms result: one nearly identic…

    Submitted 24 August, 2016; v1 submitted 13 March, 2012; originally announced March 2012.

    Comments: Journal version at http://ieeexplore.ieee.org/document/7272821. Matlab code of simple implementation included with ancillary files

    Journal ref: IEEE Transactions on Aerospace and Electronic Systems, vol 51, no 3, pp 1664-1687, July 2015

  49. arXiv:1203.2992  [pdf, other]

    eess.SY cs.CV

    Hybrid Poisson and multi-Bernoulli filters

    Authors: Jason L. Williams

    Abstract: The probability hypothesis density (PHD) and multi-target multi-Bernoulli (MeMBer) filters are two leading algorithms that have emerged from random finite sets (RFS). In this paper we study a method which combines these two approaches. Our work is motivated by a sister paper, which proves that the full Bayes RFS filter naturally incorporates a Poisson component representing targets that have never…

    Submitted 13 March, 2012; originally announced March 2012.

    Comments: Submitted to 15th International Conference on Information Fusion (2012)

  50. arXiv:1105.3298  [pdf, other]

    eess.SY math.OC

    Graphical model approximations of random finite set filters

    Authors: Jason L. Williams

    Abstract: Random finite sets (RFSs) has been a fruitful area of research in recent years, yielding new approximate filters such as the probability hypothesis density (PHD), cardinalised PHD (CPHD), and multiple target multi-Bernoulli (MeMBer). These new methods have largely been based on approximations that side-step the need for measurement-to-track association. Comparably, RFS methods that incorporate dat…

    Submitted 30 August, 2011; v1 submitted 17 May, 2011; originally announced May 2011.

    Comments: Extended version; first version submitted to Fusion 2011