Search | arXiv e-print repository

arXiv:2208.05517

Beyond the Blue Sky of Multimodal Interaction: A Centennial Vision of Interplanetary Virtual Spaces in Turn-based Metaverse

Authors: Lik Hang Lee, Carlos Bermejo Fernandez, Ahmad Alhilal, Tristan Braud, Simo Hosio, Pan Hui, Esmée Henrieke Anne de Haas

Abstract: Human habitation across multiple planets requires communication and social connection between planets. When the infrastructure of a deep space network becomes mature, immersive cyberspace, known as the Metaverse, can exchange diversified user data and host multitudinous virtual worlds. Nevertheless, such immersive cyberspace unavoidably encounters latency in minutes, and thus operates in a turn-taking manner. This Blue Sky paper illustrates a vision of an interplanetary Metaverse that connects Earthian and Martian users in a turn-based Metaverse. Accordingly, we briefly discuss several grand challenges to catalyze research initiatives for the `Digital Big Bang' on Mars.

Submitted 28 July, 2022; originally announced August 2022.

Comments: Accepted Paper in ACM ICMI 2022 (Blue Sky Track)

MSC Class: Nil ACM Class: A.1; K.0

arXiv:2206.04385

HideNseek: Federated Lottery Ticket via Server-side Pruning and Sign Supermask

Authors: Anish K. Vallapuram, Pengyuan Zhou, Young D. Kwon, Lik Hang Lee, Hengwei Xu, Pan Hui

Abstract: Federated learning alleviates the privacy risk in distributed learning by transmitting only the local model updates to the central server. However, it faces challenges including statistical heterogeneity of clients' datasets and resource constraints of client devices, which severely impact the training performance and user experience. Prior works have tackled these challenges by combining personalization with model compression schemes including quantization and pruning. However, the pruning is data-dependent and thus must be done on the client side, which requires considerable computation cost. Moreover, the pruning normally trains a binary supermask $\in \{0, 1\}$, which significantly limits the model capacity yet offers no computation benefit. Consequently, the training requires high computation cost and a long time to converge while the model performance does not pay off. In this work, we propose HideNseek, which employs one-shot data-agnostic pruning at initialization to get a subnetwork based on weights' synaptic saliency. Each client then optimizes a sign supermask $\in \{-1, +1\}$ multiplied by the unpruned weights to allow faster convergence with the same compression rates as state-of-the-art. Empirical results from three datasets demonstrate that, compared to state-of-the-art, HideNseek improves inference accuracy by up to 40.6\% while reducing the communication cost and training time by up to 39.7\% and 46.8\% respectively.

Submitted 9 June, 2022; originally announced June 2022.

arXiv:2205.03247

Musical Score Following and Audio Alignment

Authors: Lin Hao Lee

Abstract: Real-time tracking of the position of a musical performance on a musical score, i.e. score following, can be useful in music practice, performance and production. Example applications of such technology include computer-aided accompaniment and automatic page turning. Score following is a challenging task, especially when considering deviations in performance data from the score stemming from mistakes or expressive choices. In this project, the extensive research present in the field is first explored before two open-source evaluation testbenches for score following--one quantitative and the other qualitative--are introduced. A new way of obtaining quantitative testbench data is proposed, and the QualScofo dataset for qualitative benchmarking is introduced. Subsequently, three different score followers, each of a different class, are implemented. First, a beat-based follower for an interactive conductor application--the TuneApp Conductor--is created to demonstrate an entertaining application of score following. Then, an Approximate String Matching (ASM) non-real-time follower is implemented to complement the quantitative testbench and provide more technical background details of score following. Finally, a Constant Q-Transform (CQT) Dynamic Time Warping (DTW) score follower robust against major challenges in score following (such as polyphonic music and performance deviations) is outlined and implemented; it is shown that this CQT-based approach consistently and significantly outperforms a commonly used FFT-based approach in extracting audio features for score following.

Submitted 6 May, 2022; originally announced May 2022.

Comments: Imperial College London MEng Final Year Project Report

arXiv:2108.12719

A Dual Adversarial Calibration Framework for Automatic Fetal Brain Biometry

Authors: Yuan Gao, Lok Hin Lee, Richard Droste, Rachel Craik, Sridevi Beriwal, Aris Papageorghiou, Alison Noble

Abstract: This paper presents a novel approach to automatic fetal brain biometry motivated by needs in low- and medium-income countries. Specifically, we leverage high-end (HE) ultrasound images to build a biometry solution for low-cost (LC) point-of-care ultrasound images. We propose a novel unsupervised domain adaptation approach to train deep models to be invariant to significant image distribution shift between the image types. Our proposed method, which employs a Dual Adversarial Calibration (DAC) framework, consists of adversarial pathways which enforce model invariance to: i) adversarial perturbations in the feature space derived from LC images, and ii) appearance domain discrepancy. Our Dual Adversarial Calibration method estimates transcerebellar diameter and head circumference on images from low-cost ultrasound devices with a mean absolute error (MAE) of 2.43 mm and 1.65 mm, compared with 7.28 mm and 5.65 mm respectively for SOTA.

Submitted 28 August, 2021; originally announced August 2021.

Comments: CVAMD ICCV 2021

arXiv:2103.07895

Principled Ultrasound Data Augmentation for Classification of Standard Planes

Authors: Lok Hin Lee, Yuan Gao, J. Alison Noble

Abstract: Deep learning models with large learning capacities often overfit to medical imaging datasets. This is because training sets are often relatively small due to the significant time and financial costs incurred in medical data acquisition and labelling. Data augmentation is therefore often used to expand the availability of training data and to increase generalization. However, augmentation strategies are often chosen on an ad-hoc basis without justification. In this paper, we present an augmentation policy search method with the goal of improving model classification performance. We include in the augmentation policy search additional transformations that are often used in medical image analysis and evaluate their performance. In addition, we extend the augmentation policy search to include non-linear mixed-example data augmentation strategies. Using these learned policies, we show that principled data augmentation for medical image model training can lead to significant improvements in ultrasound standard plane detection, with an average F1-score improvement of 7.0% overall over naive data augmentation strategies in ultrasound fetal standard plane classification. We find that the learned representations of ultrasound images are better clustered and defined with optimized data augmentation.

Submitted 14 March, 2021; originally announced March 2021.

Comments: Information Processing in Medical Imaging (IPMI) 2021

arXiv:2007.09207

Towards Augmented Reality-driven Human-City Interaction: Current Research on Mobile Headsets and Future Challenges

Authors: Lik Hang Lee, Tristan Braud, Simo Hosio, Pan Hui

Abstract: Interaction design for Augmented Reality (AR) is gaining increasing attention from both academia and industry. This survey discusses 260 articles (68.8% of articles published between 2015 - 2019) to review the field of human interaction in connected cities with emphasis on augmented reality-driven interaction. We provide an overview of Human-City Interaction and related technological approaches, followed by a review of the latest trends of information visualization, constrained interfaces, and embodied interaction for AR headsets. We highlight under-explored issues in interface design and input techniques that warrant further research, and conjecture that AR with complementary Conversational User Interfaces (CUIs) is a key enabler for ubiquitous interaction with immersive systems in smart cities. Our work helps researchers understand the current potential and future needs of AR in Human-City Interaction.

Submitted 25 May, 2021; v1 submitted 17 July, 2020; originally announced July 2020.

Comments: 39 pages, 12 figures, ACM Computing Survey (Accepted, May 2021)

MSC Class: 68-02 ACM Class: B.4; H.5

Showing 1–6 of 6 results for author: Lee, L H