Showing 1–50 of 118 results for author: Wong, T

Searching in archive cs.
  1. arXiv:2406.10916  [pdf, other]

    cs.RO cs.DC

    M-SET: Multi-Drone Swarm Intelligence Experimentation with Collision Avoidance Realism

    Authors: Chuhao Qin, Alexander Robins, Callum Lillywhite-Roake, Adam Pearce, Hritik Mehta, Scott James, Tsz Ho Wong, Evangelos Pournaras

    Abstract: Distributed sensing by cooperative drone swarms is crucial for several Smart City applications, such as traffic monitoring and disaster response. Using an indoor lab with inexpensive drones, a testbed supports complex and ambitious studies on these systems while maintaining low cost, rigor, and external validity. This paper introduces the Multi-drone Sensing Experimentation Testbed (M-SET), a nove…

    Submitted 16 June, 2024; originally announced June 2024.

    Comments: 7 pages, 7 figures. This work has been submitted to the IEEE conference

  2. arXiv:2405.17933  [pdf, other]

    cs.CV

    ToonCrafter: Generative Cartoon Interpolation

    Authors: Jinbo Xing, Hanyuan Liu, Menghan Xia, Yong Zhang, Xintao Wang, Ying Shan, Tien-Tsin Wong

    Abstract: We introduce ToonCrafter, a novel approach that transcends traditional correspondence-based cartoon video interpolation, paving the way for generative interpolation. Traditional methods, that implicitly assume linear motion and the absence of complicated phenomena like dis-occlusion, often struggle with the exaggerated non-linear and large motions with occlusion commonly found in cartoons, resulti…

    Submitted 28 May, 2024; originally announced May 2024.

    Comments: Project page: https://doubiiu.github.io/projects/ToonCrafter/

  3. Physics-based Scene Layout Generation from Human Motion

    Authors: Jianan Li, Tao Huang, Qingxu Zhu, Tien-Tsin Wong

    Abstract: Creating scenes for captured motions that achieve realistic human-scene interaction is crucial for 3D animation in movies or video games. As character motion is often captured in a blue-screened studio without real furniture or objects in place, there may be a discrepancy between the planned motion and the captured one. This gives rise to the need for automatic scene layout generation to relieve t…

    Submitted 20 May, 2024; originally announced May 2024.

    Comments: SIGGRAPH conference

  4. arXiv:2403.08266  [pdf, other]

    cs.CV cs.GR

    Sketch2Manga: Shaded Manga Screening from Sketch with Diffusion Models

    Authors: Jian Lin, Xueting Liu, Chengze Li, Minshan Xie, Tien-Tsin Wong

    Abstract: While manga is a popular entertainment form, creating manga is tedious, especially adding screentones to the created sketch, namely manga screening. Unfortunately, there is no existing method that tailors for automatic manga screening, probably due to the difficulty of generating high-quality shaded high-frequency screentones. The classic manga screening approaches generally require user input to…

    Submitted 13 March, 2024; originally announced March 2024.

    Comments: 7 pages, 6 figures

    ACM Class: I.4.6; I.3.3; I.3.8

  5. arXiv:2402.15903  [pdf, other]

    cs.LG cs.AI cs.NI

    ESFL: Efficient Split Federated Learning over Resource-Constrained Heterogeneous Wireless Devices

    Authors: Guangyu Zhu, Yiqin Deng, Xianhao Chen, Haixia Zhang, Yuguang Fang, Tan F. Wong

    Abstract: Federated learning (FL) allows multiple parties (distributed devices) to train a machine learning model without sharing raw data. How to effectively and efficiently utilize the resources on devices and the central server is a highly interesting yet challenging problem. In this paper, we propose an efficient split federated learning algorithm (ESFL) to take full advantage of the powerful computing…

    Submitted 16 April, 2024; v1 submitted 24 February, 2024; originally announced February 2024.

  6. arXiv:2402.08788  [pdf]

    cs.CL cs.SD eess.AS

    Syllable based DNN-HMM Cantonese Speech to Text System

    Authors: Timothy Wong, Claire Li, Sam Lam, Billy Chiu, Qin Lu, Minglei Li, Dan Xiong, Roy Shing Yu, Vincent T. Y. Ng

    Abstract: This paper reports our work on building up a Cantonese Speech-to-Text (STT) system with a syllable based acoustic model. This is a part of an effort in building a STT system to aid dyslexic students who have cognitive deficiency in writing skills but have no problem expressing their ideas through speech. For Cantonese speech recognition, the basic unit of acoustic models can either be the conventi…

    Submitted 13 February, 2024; originally announced February 2024.

    Comments: 7 pages, 3 figures, LREC 2016

    MSC Class: 94-06 ACM Class: I.2.7

  7. arXiv:2402.07916  [pdf, other]

    cs.HC cs.GR

    Perceptual Thresholds for Radial Optic Flow Distortion in Near-Eye Stereoscopic Displays

    Authors: Mohammad R. Saeedpour-Parizi, Niall L. Williams, Tim Wong, Phillip Guan, Dinesh Manocha, Ian M. Erkelens

    Abstract: We provide the first perceptual quantification of user's sensitivity to radial optic flow artifacts and demonstrate a promising approach for masking this optic flow artifact via blink suppression. Near-eye HMDs allow users to feel immersed in virtual environments by providing visual cues, like motion parallax and stereoscopy, that mimic how we view the physical world. However, these systems exhibi…

    Submitted 1 February, 2024; originally announced February 2024.

  8. arXiv:2402.02463  [pdf, other]

    cs.LG stat.ML

    A Fast Method for Lasso and Logistic Lasso

    Authors: Siu-Wing Cheng, Man Ting Wong

    Abstract: We propose a fast method for solving compressed sensing, Lasso regression, and Logistic Lasso regression problems that iteratively runs an appropriate solver using an active set approach. We design a strategy to update the active set that achieves a large speedup over a single call of several solvers, including gradient projection for sparse reconstruction (GPSR), lassoglm of Matlab, and glmnet. F…

    Submitted 4 February, 2024; originally announced February 2024.

  9. arXiv:2312.00933  [pdf, other]

    cs.IT

    Privacy Preserving Event Detection

    Authors: Xiaoshan Wang, Tan F. Wong

    Abstract: This paper presents a privacy-preserving event detection scheme based on measurements made by a network of sensors. A diameter-like decision statistic made up of the marginal types of the measurements observed by the sensors is employed. The proposed detection scheme can achieve the best type-I error exponent as the type-II error rate is required to be negligible. Detection performance with finite…

    Submitted 1 December, 2023; originally announced December 2023.

    Comments: 26 pages, 9 figures, submitted to IEEE Transactions on Information Theory

  10. arXiv:2311.14343  [pdf, other]

    cs.CV

    Highly Detailed and Temporal Consistent Video Stylization via Synchronized Multi-Frame Diffusion

    Authors: Minshan Xie, Hanyuan Liu, Chengze Li, Tien-Tsin Wong

    Abstract: Text-guided video-to-video stylization transforms the visual appearance of a source video to a different appearance guided on textual prompts. Existing text-guided image diffusion models can be extended for stylized video synthesis. However, they struggle to generate videos with both highly detailed appearance and temporal consistency. In this paper, we propose a synchronized multi-frame diffusion…

    Submitted 24 November, 2023; originally announced November 2023.

    Comments: 11 pages, 11 figures

  11. arXiv:2311.12891  [pdf, other]

    cs.CV

    Text-Guided Texturing by Synchronized Multi-View Diffusion

    Authors: Yuxin Liu, Minshan Xie, Hanyuan Liu, Tien-Tsin Wong

    Abstract: This paper introduces a novel approach to synthesize texture to dress up a given 3D object, given a text prompt. Based on the pretrained text-to-image (T2I) diffusion model, existing methods usually employ a project-and-inpaint approach, in which a view of the given object is first generated and warped to another view for inpainting. But it tends to generate inconsistent texture due to the asynchr…

    Submitted 21 November, 2023; originally announced November 2023.

  12. arXiv:2310.12190  [pdf, other]

    cs.CV

    DynamiCrafter: Animating Open-domain Images with Video Diffusion Priors

    Authors: Jinbo Xing, Menghan Xia, Yong Zhang, Haoxin Chen, Wangbo Yu, Hanyuan Liu, Xintao Wang, Tien-Tsin Wong, Ying Shan

    Abstract: Animating a still image offers an engaging visual experience. Traditional image animation techniques mainly focus on animating natural scenes with stochastic dynamics (e.g. clouds and fluid) or domain-specific motions (e.g. human hair or body motions), and thus limits their applicability to more general visual content. To overcome this limitation, we explore the synthesis of dynamic content for op…

    Submitted 27 November, 2023; v1 submitted 18 October, 2023; originally announced October 2023.

    Comments: Project page: https://doubiiu.github.io/projects/DynamiCrafter

  13. arXiv:2310.04992  [pdf, other]

    eess.IV cs.CV

    VisionFM: a Multi-Modal Multi-Task Vision Foundation Model for Generalist Ophthalmic Artificial Intelligence

    Authors: Jianing Qiu, Jian Wu, Hao Wei, Peilun Shi, Minqing Zhang, Yunyun Sun, Lin Li, Hanruo Liu, Hongyi Liu, Simeng Hou, Yuyang Zhao, Xuehui Shi, Junfang Xian, Xiaoxia Qu, Sirui Zhu, Lijie Pan, Xiaoniao Chen, Xiaojia Zhang, Shuai Jiang, Kebing Wang, Chenlong Yang, Mingqiang Chen, Sujie Fan, Jianhua Hu, Aiguo Lv, et al. (17 additional authors not shown)

    Abstract: We present VisionFM, a foundation model pre-trained with 3.4 million ophthalmic images from 560,457 individuals, covering a broad range of ophthalmic diseases, modalities, imaging devices, and demography. After pre-training, VisionFM provides a foundation to foster multiple ophthalmic artificial intelligence (AI) applications, such as disease screening and diagnosis, disease prognosis, subclassifi…

    Submitted 7 October, 2023; originally announced October 2023.

  14. arXiv:2310.03884  [pdf, other]

    cs.IT cs.LG eess.SP math.DG stat.ML

    Information Geometry for the Working Information Theorist

    Authors: Kumar Vijay Mishra, M. Ashok Kumar, Ting-Kam Leonard Wong

    Abstract: Information geometry is a study of statistical manifolds, that is, spaces of probability distributions from a geometric perspective. Its classical information-theoretic applications relate to statistical concepts such as Fisher information, sufficient statistics, and efficient estimators. Today, information geometry has emerged as an interdisciplinary field that finds applications in diverse areas…

    Submitted 5 October, 2023; originally announced October 2023.

    Comments: 12 pages, 3 figures, 1 table

  15. arXiv:2310.01081  [pdf, other]

    cs.CR

    Unmasking Role-Play Attack Strategies in Exploiting Decentralized Finance (DeFi) Systems

    Authors: Weilin Li, Zhun Wang, Chenyu Li, Heying Chen, Taiyu Wong, Pengyu Sun, Yufei Yu, Chao Zhang

    Abstract: The rapid growth and adoption of decentralized finance (DeFi) systems have been accompanied by various threats, notably those emerging from vulnerabilities in their intricate design. In our work, we introduce and define an attack strategy termed as Role-Play Attack, in which the attacker acts as multiple roles concurrently to exploit the DeFi system and cause substantial financial losses. We provi…

    Submitted 2 October, 2023; originally announced October 2023.

  16. arXiv:2309.03509  [pdf, other]

    cs.CV

    BroadCAM: Outcome-agnostic Class Activation Mapping for Small-scale Weakly Supervised Applications

    Authors: Jiatai Lin, Guoqiang Han, Xuemiao Xu, Changhong Liang, Tien-Tsin Wong, C. L. Philip Chen, Zaiyi Liu, Chu Han

    Abstract: Class activation mapping (CAM), a visualization technique for interpreting deep learning models, is now commonly used for weakly supervised semantic segmentation (WSSS) and object localization (WSOL). It is the weighted aggregation of the feature maps by activating the high class-relevance ones. Current CAM methods achieve it relying on the training outcomes, such as predicted scores (forward info…

    Submitted 7 September, 2023; originally announced September 2023.

  17. arXiv:2308.12642  [pdf, other]

    cs.CV

    Tag-Based Annotation for Avatar Face Creation

    Authors: An Ngo, Daniel Phelps, Derrick Lai, Thanyared Wong, Lucas Mathias, Anish Shivamurthy, Mustafa Ajmal, Minghao Liu, James Davis

    Abstract: Currently, digital avatars can be created manually using human images as reference. Systems such as Bitmoji are excellent producers of detailed avatar designs, with hundreds of choices for customization. A supervised learning model could be trained to generate avatars automatically, but the hundreds of possible options create difficulty in securing non-noisy data to train a model. As a solution, w…

    Submitted 24 August, 2023; originally announced August 2023.

    Comments: 9 pages, 5 figures, 18 tables

  18. arXiv:2308.07767  [pdf, other]

    eess.AS cs.SD

    Preliminary investigation of the short-term in situ performance of an automatic masker selection system

    Authors: Bhan Lam, Zhen-Ting Ong, Kenneth Ooi, Wen-Hui Ong, Trevor Wong, Karn N. Watcharasupat, Woon-Seng Gan

    Abstract: Soundscape augmentation or "masking" introduces wanted sounds into the acoustic environment to improve acoustic comfort. Usually, the masker selection and playback strategies are either arbitrary or based on simple rules (e.g. -3 dBA), which may lead to sub-optimal increment or even reduction in acoustic comfort for dynamic acoustic environments. To reduce ambiguity in the selection of maskers, an…

    Submitted 15 August, 2023; originally announced August 2023.

    Comments: paper submitted to the 52nd International Congress and Exposition on Noise Control Engineering held in Chiba, Greater Tokyo, Japan, on 20-23 August 2023 (Inter-Noise 2023)

    ACM Class: J.2; J.4

  19. Taming Reversible Halftoning via Predictive Luminance

    Authors: Cheuk-Kit Lau, Menghan Xia, Tien-Tsin Wong

    Abstract: Traditional halftoning usually drops colors when dithering images with binary dots, which makes it difficult to recover the original color information. We proposed a novel halftoning technique that converts a color image into a binary halftone with full restorability to its original version. Our novel base halftoning technique consists of two convolutional neural networks (CNNs) to produce the rev…

    Submitted 7 February, 2024; v1 submitted 14 June, 2023; originally announced June 2023.

    Comments: published in IEEE Transactions on Visualization and Computer Graphics

  20. arXiv:2306.04114  [pdf, other]

    cs.CV eess.IV

    Manga Rescreening with Interpretable Screentone Representation

    Authors: Minshan Xie, Chengze Li, Tien-Tsin Wong

    Abstract: The process of adapting or repurposing manga pages is a time-consuming task that requires manga artists to manually work on every single screentone region and apply new patterns to create novel screentones across multiple panels. To address this issue, we propose an automatic manga rescreening pipeline that aims to minimize the human effort involved in manga adaptation. Our pipeline automatically…

    Submitted 6 June, 2023; originally announced June 2023.

    Comments: 10 pages, 11 figures

  21. arXiv:2306.01732  [pdf, other]

    cs.CV cs.AI cs.GR

    Video Colorization with Pre-trained Text-to-Image Diffusion Models

    Authors: Hanyuan Liu, Minshan Xie, Jinbo Xing, Chengze Li, Tien-Tsin Wong

    Abstract: Video colorization is a challenging task that involves inferring plausible and temporally consistent colors for grayscale frames. In this paper, we present ColorDiffuser, an adaptation of a pre-trained text-to-image latent diffusion model for video colorization. With the proposed adapter-based approach, we repropose the pre-trained text-to-image model to accept input grayscale video frames, with t…

    Submitted 2 June, 2023; originally announced June 2023.

    Comments: project page: https://colordiffuser.github.io/

  22. arXiv:2306.00943  [pdf, other]

    cs.CV

    Make-Your-Video: Customized Video Generation Using Textual and Structural Guidance

    Authors: Jinbo Xing, Menghan Xia, Yuxin Liu, Yuechen Zhang, Yong Zhang, Yingqing He, Hanyuan Liu, Haoxin Chen, Xiaodong Cun, Xintao Wang, Ying Shan, Tien-Tsin Wong

    Abstract: Creating a vivid video from the event or scenario in our imagination is a truly fascinating experience. Recent advancements in text-to-video synthesis have unveiled the potential to achieve this with prompts only. While text is convenient in conveying the overall scene context, it may be insufficient to control precisely. In this paper, we explore customized video generation by utilizing text as c…

    Submitted 1 June, 2023; originally announced June 2023.

    Comments: 13 pages, 8 figures. Project page: https://doubiiu.github.io/projects/Make-Your-Video/

  23. arXiv:2305.17193  [pdf]

    q-bio.SC cs.AI cs.CV cs.LG physics.bio-ph q-bio.QM

    AI-based analysis of super-resolution microscopy: Biological discovery in the absence of ground truth

    Authors: Ivan R. Nabi, Ben Cardoen, Ismail M. Khater, Guang Gao, Timothy H. Wong, Ghassan Hamarneh

    Abstract: Super-resolution microscopy, or nanoscopy, enables the use of fluorescent-based molecular localization tools to study molecular structure at the nanoscale level in the intact cell, bridging the mesoscale gap to classical structural biology methodologies. Analysis of super-resolution data by artificial intelligence (AI), such as machine learning, offers tremendous potential for discovery of new bio…

    Submitted 27 May, 2024; v1 submitted 26 May, 2023; originally announced May 2023.

    Comments: 26 pages, 4 figures

  24. arXiv:2304.11105  [pdf, other]

    cs.CV cs.GR

    Improved Diffusion-based Image Colorization via Piggybacked Models

    Authors: Hanyuan Liu, Jinbo Xing, Minshan Xie, Chengze Li, Tien-Tsin Wong

    Abstract: Image colorization has been attracting the research interests of the community for decades. However, existing methods still struggle to provide satisfactory colorized results given grayscale images due to a lack of human-like global understanding of colors. Recently, large-scale Text-to-Image (T2I) models have been exploited to transfer the semantic information from the text prompts to the image d…

    Submitted 21 April, 2023; originally announced April 2023.

    Comments: project page: https://piggyback-color.github.io/

  25. arXiv:2303.16117  [pdf, ps, other]

    q-fin.ST cs.LG

    Feature Engineering Methods on Multivariate Time-Series Data for Financial Data Science Competitions

    Authors: Thomas Wong, Mauricio Barahona

    Abstract: This paper is a work in progress. We are looking for collaborators to provide us financial datasets in Equity/Futures market to conduct more bench-marking studies. The authors have papers employing similar methods applied on the Numerai dataset, which is freely available but obfuscated. We apply different feature engineering methods for time-series to US market price data. The predictive power o…

    Submitted 18 April, 2023; v1 submitted 25 March, 2023; originally announced March 2023.

    Comments: arXiv admin note: substantial text overlap with arXiv:2303.07925

  26. arXiv:2303.07925  [pdf, other]

    cs.LG q-fin.MF

    Deep incremental learning models for financial temporal tabular datasets with distribution shifts

    Authors: Thomas Wong, Mauricio Barahona

    Abstract: We present a robust deep incremental learning framework for regression tasks on financial temporal tabular datasets which is built upon the incremental use of commonly available tabular and time series prediction models to adapt to distributional shifts typical of financial datasets. The framework uses a simple basic building block (decision trees) to build self-similar models of any required comp…

    Submitted 10 October, 2023; v1 submitted 14 March, 2023; originally announced March 2023.

  27. arXiv:2301.02379  [pdf, other]

    cs.CV

    CodeTalker: Speech-Driven 3D Facial Animation with Discrete Motion Prior

    Authors: Jinbo Xing, Menghan Xia, Yuechen Zhang, Xiaodong Cun, Jue Wang, Tien-Tsin Wong

    Abstract: Speech-driven 3D facial animation has been widely studied, yet there is still a gap to achieving realism and vividness due to the highly ill-posed nature and scarcity of audio-visual data. Existing works typically formulate the cross-modal mapping into a regression task, which suffers from the regression-to-mean problem leading to over-smoothed facial motions. In this paper, we propose to cast spe…

    Submitted 3 April, 2023; v1 submitted 6 January, 2023; originally announced January 2023.

    Comments: CVPR2023 Camera-Ready. Project Page: https://doubiiu.github.io/projects/codetalker/, Code: https://github.com/Doubiiu/CodeTalker

  28. arXiv:2301.01841  [pdf]

    cs.CV

    Classification of Single Tree Decay Stages from Combined Airborne LiDAR Data and CIR Imagery

    Authors: Tsz Chung Wong, Abubakar Sani-Mohammed, Jinhong Wang, Puzuo Wang, Wei Yao, Marco Heurich

    Abstract: Understanding forest health is of great importance for the conservation of the integrity of forest ecosystems. In this regard, evaluating the amount and quality of dead wood is of utmost interest as they are favorable indicators of biodiversity. Apparently, remote sensing-based machine learning techniques have proven to be more efficient and sustainable with unprecedented accuracy in forest invent…

    Submitted 21 December, 2023; v1 submitted 4 January, 2023; originally announced January 2023.

  29. arXiv:2301.00790  [pdf, other]

    q-fin.CP cs.CE cs.LG

    Online learning techniques for prediction of temporal tabular datasets with regime changes

    Authors: Thomas Wong, Mauricio Barahona

    Abstract: The application of deep learning to non-stationary temporal datasets can lead to overfitted models that underperform under regime changes. In this work, we propose a modular machine learning pipeline for ranking predictions on temporal panel datasets which is robust under regime changes. The modularity of the pipeline allows the use of different models, including Gradient Boosting Decision Trees (…

    Submitted 10 August, 2023; v1 submitted 30 December, 2022; originally announced January 2023.

  30. arXiv:2207.12899  [pdf, other]

    eess.AS cs.SD

    Assessment of a cost-effective headphone calibration procedure for soundscape evaluations

    Authors: Bhan Lam, Kenneth Ooi, Zhen-Ting Ong, Karn N. Watcharasupat, Trevor Wong, Woon-Seng Gan

    Abstract: To increase the availability and adoption of the soundscape standard, a low-cost calibration procedure for reproduction of audio stimuli over headphones was proposed as part of the global "Soundscape Attributes Translation Project" (SATP) for validating ISO/TS 12913-2:2018 perceived affective quality (PAQ) attribute translations. A previous preliminary study revealed significant deviations from…

    Submitted 24 July, 2022; originally announced July 2022.

    Comments: For 24th International Congress on Acoustics

    Journal ref: in Proc. 24th Int. Congr. Acoust., 2022, pp. 1-8

  31. arXiv:2207.07839  [pdf, other]

    cs.CG

    On Non-Negative Quadratic Programming in Geometric Optimization

    Authors: Siu-Wing Cheng, Man Ting Wong

    Abstract: We present experimental and theoretical results on a method that applies a numerical solver iteratively to solve several non-negative quadratic programming problems in geometric optimization. The method gains efficiency by exploiting the potential sparsity of the intermediate solutions. We implemented the method to call quadprog of MATLAB iteratively. In comparison with a single call of quadprog,…

    Submitted 19 November, 2023; v1 submitted 16 July, 2022; originally announced July 2022.

  32. arXiv:2206.15445  [pdf, other]

    eess.IV cs.CV

    Asymmetry Disentanglement Network for Interpretable Acute Ischemic Stroke Infarct Segmentation in Non-Contrast CT Scans

    Authors: Haomiao Ni, Yuan Xue, Kelvin Wong, John Volpi, Stephen T. C. Wong, James Z. Wang, Xiaolei Huang

    Abstract: Accurate infarct segmentation in non-contrast CT (NCCT) images is a crucial step toward computer-aided acute ischemic stroke (AIS) assessment. In clinical practice, bilateral symmetric comparison of brain hemispheres is usually used to locate pathological abnormalities. Recent research has explored asymmetries to assist with AIS segmentation. However, most previous symmetry-based work mixed differ…

    Submitted 30 June, 2022; originally announced June 2022.

    Comments: MICCAI 2022

  33. arXiv:2206.01741  [pdf, other]

    eess.IV cs.CV

    Patcher: Patch Transformers with Mixture of Experts for Precise Medical Image Segmentation

    Authors: Yanglan Ou, Ye Yuan, Xiaolei Huang, Stephen T. C. Wong, John Volpi, James Z. Wang, Kelvin Wong

    Abstract: We present a new encoder-decoder Vision Transformer architecture, Patcher, for medical image segmentation. Unlike standard Vision Transformers, it employs Patcher blocks that segment an image into large patches, each of which is further divided into small patches. Transformers are applied to the small patches within a large patch, which constrains the receptive field of each pixel. We intentionall…

    Submitted 29 May, 2023; v1 submitted 3 June, 2022; originally announced June 2022.

    Comments: MICCAI 2022

  34. arXiv:2205.04728  [pdf, other]

    eess.AS cs.SD

    Preliminary assessment of a cost-effective headphone calibration procedure for soundscape evaluations

    Authors: Bhan Lam, Kenneth Ooi, Karn N. Watcharasupat, Zhen-Ting Ong, Yun-Ting Lau, Trevor Wong, Woon-Seng Gan

    Abstract: The introduction of ISO 12913-2:2018 has provided a framework for standardized data collection and reporting procedures for soundscape practitioners. A strong emphasis was placed on the use of calibrated head and torso simulators (HATS) for binaural audio capture to obtain an accurate subjective impression and acoustic measure of the soundscape under evaluation. To auralise the binaural recordings…

    Submitted 10 May, 2022; originally announced May 2022.

    Comments: Submitted to the 28th International Congress on Sound and Vibration

  35. arXiv:2204.13890  [pdf, other]

    eess.AS cs.SD eess.SY

    Deployment of an IoT System for Adaptive In-Situ Soundscape Augmentation

    Authors: Trevor Wong, Karn N. Watcharasupat, Bhan Lam, Kenneth Ooi, Zhen-Ting Ong, Furi Andi Karnapi, Woon-Seng Gan

    Abstract: Soundscape augmentation is an emerging approach for noise mitigation by introducing additional sounds known as "maskers" to increase acoustic comfort. Traditionally, the choice of maskers is often predicated on expert guidance or post-hoc analysis which can be time-consuming and sometimes arbitrary. Moreover, this often results in a static set of maskers that are inflexible to the dynamic nature o…

    Submitted 29 April, 2022; originally announced April 2022.

    Comments: To be presented at the 51st International Congress and Exposition on Noise Control Engineering

    Journal ref: INTER-NOISE and NOISE-CON Congress and Conference Proceedings, Feb. 2022, vol. 265, no. 5, pp. 2013-2021

  36. arXiv:2204.13883  [pdf, other]

    eess.AS cs.LG cs.SD

    Autonomous In-Situ Soundscape Augmentation via Joint Selection of Masker and Gain

    Authors: Karn N. Watcharasupat, Kenneth Ooi, Bhan Lam, Trevor Wong, Zhen-Ting Ong, Woon-Seng Gan

    Abstract: The selection of maskers and playback gain levels in a soundscape augmentation system is crucial to its effectiveness in improving the overall acoustic comfort of a given environment. Traditionally, the selection of appropriate maskers and gain levels has been informed by expert opinion, which may not be representative of the target population, or by listening tests, which can be time-consuming and l…

    Submitted 23 July, 2022; v1 submitted 29 April, 2022; originally announced April 2022.

    Comments: Accepted to IEEE Signal Processing Letters. (c) 2022 IEEE

    Journal ref: IEEE Signal Processing Letters, Vol. 29, pp. 1749 - 1753, 2022

  37. arXiv:2203.09860  [pdf, other]

    eess.IV cs.CV

    Pseudo Bias-Balanced Learning for Debiased Chest X-ray Classification

    Authors: Luyang Luo, Dunyuan Xu, Hao Chen, Tien-Tsin Wong, Pheng-Ann Heng

    Abstract: Deep learning models were frequently reported to learn from shortcuts like dataset biases. As deep learning is playing an increasingly important role in the modern healthcare system, it is of great need to combat shortcut learning in medical data as well as develop unbiased and trustworthy models. In this paper, we study the problem of developing debiased chest X-ray diagnosis models from the bias…

    Submitted 4 August, 2022; v1 submitted 18 March, 2022; originally announced March 2022.

    Comments: To appear in MICCAI 2022. Code available at https://github.com/LLYXC/PBBL

  38. arXiv:2203.03396  [pdf, other]

    cs.CV

    Screentone-Preserved Manga Retargeting

    Authors: Minshan Xie, Menghan Xia, Xueting Liu, Tien-Tsin Wong

    Abstract: As a popular comic style, manga offers a unique impression by utilizing a rich set of bitonal patterns, or screentones, for illustration. However, screentones can easily be contaminated with visual-unpleasant aliasing and/or blurriness after resampling, which harms its visualization on displays of diverse resolutions. To address this problem, we propose the first manga retargeting method that synt…

    Submitted 7 March, 2022; originally announced March 2022.

    Comments: 10 pages, 13 figures

  39. arXiv:2202.13577  [pdf, other]

    cs.CV

    Point Set Self-Embedding

    Authors: Ruihui Li, Xianzhi Li, Tien-Tsin Wong, Chi-Wing Fu

    Abstract: This work presents an innovative method for point set self-embedding, that encodes the structural information of a dense point set into its sparser version in a visual but imperceptible form. The self-embedded point set can function as the ordinary downsampled one and be visualized efficiently on mobile devices. Particularly, we can leverage the self-embedded information to fully restore the origi…

    Submitted 28 February, 2022; originally announced February 2022.

    Comments: Accepted by IEEE Transactions on Visualization and Computer Graphics (IEEE TVCG), 2022. All resources can be found at https://liruihui.github.io/

  40. arXiv:2202.06242  [pdf, other]

    cs.LG cs.CR math.OC

    Beyond NaN: Resiliency of Optimization Layers in The Face of Infeasibility

    Authors: Wai Tuck Wong, Sarah Kinsey, Ramesha Karunasena, Thanh Nguyen, Arunesh Sinha

    Abstract: Prior work has successfully incorporated optimization layers as the last layer in neural networks for various problems, thereby allowing joint learning and planning in one neural network forward pass. In this work, we identify a weakness in such a set-up where inputs to the optimization layer lead to undefined output of the neural network. Such undefined decision outputs can lead to possible catas…

    Submitted 4 February, 2023; v1 submitted 13 February, 2022; originally announced February 2022.

    Comments: To appear in AAAI-23 Special Track on Safe and Robust AI

  41. arXiv:2201.12576  [pdf, other]

    cs.CV eess.IV

    Scale-arbitrary Invertible Image Downscaling

    Authors: Jinbo Xing, Wenbo Hu, Tien-Tsin Wong

    Abstract: Conventional social media platforms usually downscale the HR images to restrict their resolution to a specific size for saving transmission/storage cost, which leads to the super-resolution (SR) being highly ill-posed. Recent invertible image downscaling methods jointly model the downscaling/upscaling problems and achieve significant improvements. However, they only consider fixed integer scale fa…

    Submitted 9 March, 2022; v1 submitted 29 January, 2022; originally announced January 2022.

  42. arXiv:2110.12596  [pdf, other]

    cs.HC

    GeoSneakPique: Visual Autocompletion for Geospatial Queries

    Authors: Vidya Setlur, Sarah Battersby, Tracy Wong

    Abstract: How many crimes occurred in the city center? And exactly which part of town is the 'city center'? While location is at the heart of many data questions, geographic location can be difficult to specify in natural language (NL) queries. This is especially true when working with fuzzy cognitive regions or regions that may be defined based on data distributions instead of absolute administrative locat…

    Submitted 24 October, 2021; originally announced October 2021.

    Comments: 5 pages (4 + 1 page references), two figures

    Journal ref: IEEE VIS conference 2021 (TVCG journal)

  43. arXiv:2110.04491  [pdf, other]

    eess.IV cs.CV

    Invertible Tone Mapping with Selectable Styles

    Authors: Zhuming Zhang, Menghan Xia, Xueting Liu, Chengze Li, Tien-Tsin Wong

    Abstract: Although digital cameras can acquire high-dynamic range (HDR) images, the captured HDR information is mostly quantized to low-dynamic range (LDR) images for display compatibility and compact storage. In this paper, we propose an invertible tone mapping method that converts the multi-exposure HDR to a true LDR (8-bit per color channel) and reserves the capability to accurately restore the original…

    Submitted 9 October, 2021; originally announced October 2021.

  44. arXiv:2109.13460  [pdf, other]

    cs.CG

    Self-Improving Voronoi Construction for a Hidden Mixture of Product Distributions

    Authors: Siu-Wing Cheng, Man Ting Wong

    Abstract: We propose a self-improving algorithm for computing Voronoi diagrams under a given convex distance function with constant description complexity. The $n$ input points are drawn from a hidden mixture of product distributions; we are only given an upper bound $m = o(\sqrt{n})$ on the number of distributions in the mixture, and the property that for each distribution, an input instance is drawn from…

    Submitted 24 October, 2021; v1 submitted 27 September, 2021; originally announced September 2021.

  45. arXiv:2109.12065  [pdf, other]

    cs.CV cs.AI

    DeepStroke: An Efficient Stroke Screening Framework for Emergency Rooms with Multimodal Adversarial Deep Learning

    Authors: Tongan Cai, Haomiao Ni, Mingli Yu, Xiaolei Huang, Kelvin Wong, John Volpi, James Z. Wang, Stephen T. C. Wong

    Abstract: In an emergency room (ER) setting, stroke triage or screening is a common challenge. A quick CT is usually done instead of MRI due to MRI's slow throughput and high cost. Clinical tests are commonly referred to during the process, but the misdiagnosis rate remains high. We propose a novel multimodal deep learning framework, DeepStroke, to achieve computer-aided stroke presence assessment by recogn…

    Submitted 27 June, 2022; v1 submitted 24 September, 2021; originally announced September 2021.

  46. arXiv:2109.08311  [pdf, other]

    eess.IV cs.CV cs.LG

    Adaptive Hierarchical Dual Consistency for Semi-Supervised Left Atrium Segmentation on Cross-Domain Data

    Authors: Jun Chen, Heye Zhang, Raad Mohiaddin, Tom Wong, David Firmin, Jennifer Keegan, Guang Yang

    Abstract: Semi-supervised learning provides great significance in left atrium (LA) segmentation model learning with insufficient labelled data. Generalising semi-supervised learning to cross-domain data is of high importance to further improve model robustness. However, the widely existing distribution difference and sample mismatch between different data domains hinder the generalisation of semi-supervised…

    Submitted 20 September, 2021; v1 submitted 16 September, 2021; originally announced September 2021.

  47. Multimodal Breast Lesion Classification Using Cross-Attention Deep Networks

    Authors: Hung Q. Vo, Pengyu Yuan, Tiancheng He, Stephen T. C. Wong, Hien V. Nguyen

    Abstract: Accurate breast lesion risk estimation can significantly reduce unnecessary biopsies and help doctors decide optimal treatment plans. Most existing computer-aided systems rely solely on mammogram features to classify breast lesions. While this approach is convenient, it does not fully exploit useful information in clinical reports to achieve the optimal performance. Would clinical features signifi…

    Submitted 21 August, 2021; originally announced August 2021.

  48. arXiv:2107.11925  [pdf, other]

    math.PR cs.IT math.ST

    Tsallis and Rényi deformations linked via a new $λ$-duality

    Authors: Ting-Kam Leonard Wong, Jun Zhang

    Abstract: Tsallis and Rényi entropies, which are monotone transformations of each other, are deformations of the celebrated Shannon entropy. Maximization of these deformed entropies, under suitable constraints, leads to the $q$-exponential family which has applications in non-extensive statistical physics, information theory and statistics. In previous information-geometric studies, the $q$-exponential fami…

    Submitted 12 January, 2022; v1 submitted 25 July, 2021; originally announced July 2021.

    Comments: 41 pages, 7 figures. Revised

  49. arXiv:2107.07797  [pdf, other]

    cs.CV

    Conditional Directed Graph Convolution for 3D Human Pose Estimation

    Authors: Wenbo Hu, Changgong Zhang, Fangneng Zhan, Lei Zhang, Tien-Tsin Wong

    Abstract: Graph convolutional networks have significantly improved 3D human pose estimation by representing the human skeleton as an undirected graph. However, this representation fails to reflect the articulated characteristic of human skeletons as the hierarchical orders among the joints are not explicitly presented. In this paper, we propose to represent the human skeleton as a directed graph with the jo…

    Submitted 4 August, 2021; v1 submitted 16 July, 2021; originally announced July 2021.

    Journal ref: ACM International Conference on Multimedia (ACM MM), 2021

  50. arXiv:2106.12041  [pdf, other]

    physics.ao-ph cs.LG physics.geo-ph

    Analysis of the Evolution of Parametric Drivers of High-End Sea-Level Hazards

    Authors: Alana Hough, Tony E. Wong

    Abstract: Climate models are critical tools for developing strategies to manage the risks posed by sea-level rise to coastal communities. While these models are necessary for understanding climate risks, there is a level of uncertainty inherent in each parameter in the models. This model parametric uncertainty leads to uncertainty in future climate risks. Consequently, there is a need to understand how thos…

    Submitted 10 June, 2021; originally announced June 2021.