
Showing 1–50 of 250 results for author: Tan, W

  1. arXiv:2406.10447  [pdf, other]

    cs.CV

    The BabyView dataset: High-resolution egocentric videos of infants' and young children's everyday experiences

    Authors: Bria Long, Violet Xiang, Stefan Stojanov, Robert Z. Sparks, Zi Yin, Grace E. Keene, Alvin W. M. Tan, Steven Y. Feng, Chengxu Zhuang, Virginia A. Marchman, Daniel L. K. Yamins, Michael C. Frank

    Abstract: Human children far exceed modern machine learning algorithms in their sample efficiency, achieving high performance in key domains with much less data than current models. This "data gap" is a key challenge both for building intelligent artificial systems and for understanding human development. Egocentric video capturing children's experience -- their "training data" -- is a key ingredient fo…

    Submitted 14 June, 2024; originally announced June 2024.

    Comments: 9 pages, 2 figures, 4 tables and SI. Submitted to NeurIPS Datasets and Benchmarks

  2. arXiv:2406.10215  [pdf, other]

    cs.CL cs.LG

    DevBench: A multimodal developmental benchmark for language learning

    Authors: Alvin Wei Ming Tan, Sunny Yu, Bria Long, Wanjing Anya Ma, Tonya Murray, Rebecca D. Silverman, Jason D. Yeatman, Michael C. Frank

    Abstract: How (dis)similar are the learning trajectories of vision-language models and children? Recent modeling work has attempted to understand the gap between models' and humans' data efficiency by constructing models trained on less data, especially multimodal naturalistic data. However, such models are often evaluated on adult-level benchmarks, with limited breadth in language abilities tested, and wit…

    Submitted 14 June, 2024; originally announced June 2024.

  3. arXiv:2406.07971  [pdf, other]

    cs.CL cs.AI cs.LG

    It Takes Two: On the Seamlessness between Reward and Policy Model in RLHF

    Authors: Taiming Lu, Lingfeng Shen, Xinyu Yang, Weiting Tan, Beidi Chen, Huaxiu Yao

    Abstract: Reinforcement Learning from Human Feedback (RLHF) involves training policy models (PMs) and reward models (RMs) to align language models with human preferences. Instead of focusing solely on PMs and RMs independently, we propose to examine their interactions during fine-tuning, introducing the concept of seamlessness. Our study starts with observing the saturation phenomenon, where continual impro…

    Submitted 13 June, 2024; v1 submitted 12 June, 2024; originally announced June 2024.

  4. DHR+S: Distributed Hybrid Rendering with Realistic Real-time Shadows for Interactive Thin Client Metaverse and Game Applications

    Authors: Yu Wei Tan, Siang Ern Low, Jonas Chow, Javon Teo, Anand Bhojan

    Abstract: Distributed hybrid rendering (DHR) is a real-time rendering approach that incorporates cloud-based ray tracing with locally rasterized graphics for interactive thin client metaverse and game applications. With cloud assistance, DHR can generate high-fidelity ray-traced graphics contents remotely and deliver them to thin clients with low graphics capability, including standalone extended reality de…

    Submitted 11 June, 2024; originally announced June 2024.

    MSC Class: 68U05 ACM Class: I.3

  5. arXiv:2405.13274  [pdf, other]

    cs.CL

    DiffNorm: Self-Supervised Normalization for Non-autoregressive Speech-to-speech Translation

    Authors: Weiting Tan, Jingyu Zhang, Lingfeng Shen, Daniel Khashabi, Philipp Koehn

    Abstract: Non-autoregressive Transformers (NATs) are recently applied in direct speech-to-speech translation systems, which convert speech across different languages without intermediate text data. Although NATs generate high-quality outputs and offer faster inference than autoregressive models, they tend to produce incoherent and repetitive results due to complex data distribution (e.g., acoustic and lingu…

    Submitted 21 May, 2024; originally announced May 2024.

  6. arXiv:2405.04940  [pdf, other]

    cs.CV

    Harnessing the Power of MLLMs for Transferable Text-to-Image Person ReID

    Authors: Wentao Tan, Changxing Ding, Jiayu Jiang, Fei Wang, Yibing Zhan, Dapeng Tao

    Abstract: Text-to-image person re-identification (ReID) retrieves pedestrian images according to textual descriptions. Manually annotating textual descriptions is time-consuming, restricting the scale of existing datasets and therefore the generalization ability of ReID models. As a result, we study the transferable text-to-image ReID problem, where we train a model on our proposed large-scale database and…

    Submitted 30 June, 2024; v1 submitted 8 May, 2024; originally announced May 2024.

    Comments: CVPR 2024

  7. arXiv:2405.01881  [pdf]

    q-fin.RM cs.LG

    Explainable Risk Classification in Financial Reports

    Authors: Xue Wen Tan, Stanley Kok

    Abstract: Every publicly traded company in the US is required to file an annual 10-K financial report, which contains a wealth of information about the company. In this paper, we propose an explainable deep-learning model, called FinBERT-XRC, that takes a 10-K report as input, and automatically assesses the post-event return volatility risk of its associated company. In contrast to previous systems, our pro…

    Submitted 6 May, 2024; v1 submitted 3 May, 2024; originally announced May 2024.

    Comments: ICIS 2023 Proceedings. 3. https://aisel.aisnet.org/icis2023/blockchain/blockchain/3

  8. arXiv:2405.00291  [pdf, other]

    cs.CL cs.AI cs.HC

    How Can I Improve? Using GPT to Highlight the Desired and Undesired Parts of Open-ended Responses

    Authors: Jionghao Lin, Eason Chen, Zeifei Han, Ashish Gurung, Danielle R. Thomas, Wei Tan, Ngoc Dang Nguyen, Kenneth R. Koedinger

    Abstract: Automated explanatory feedback systems play a crucial role in facilitating learning for a large cohort of learners by offering feedback that incorporates explanations, significantly enhancing the learning process. However, delivering such explanatory feedback in real-time poses challenges, particularly when high classification accuracy for domain-specific, nuanced responses is essential. Our study…

    Submitted 30 April, 2024; originally announced May 2024.

    Comments: 11 pages, full research paper, EDM 2024

  9. arXiv:2404.09814  [pdf, other]

    cs.IT

    A Novel HARQ-CC Assisted SCMA Scheme

    Authors: Man Wang, Zheng Shi, Yunfei Li, Xianda Wu, Weiqiang Tan, Xinrong Ye

    Abstract: This letter proposes a novel hybrid automatic repeat request with chase combining assisted sparse code multiple access (HARQ-CC-SCMA) scheme. Depending on whether the same superimposed packets are retransmitted, synchronous and asynchronous modes are considered for retransmissions. Moreover, factor graph aggregation (FGA) and log-likelihood ratio combination (LLRC) are proposed for multi-user detec…

    Submitted 15 April, 2024; originally announced April 2024.

  10. arXiv:2404.04244  [pdf, other]

    cs.CV

    Fast Diffeomorphic Image Registration using Patch based Fully Convolutional Networks

    Authors: Jiong Wu, Shuang Zhou, Li Lin, Xin Wang, Wenxue Tan

    Abstract: Diffeomorphic image registration is a fundamental step in medical image analysis, owing to its capability to ensure the invertibility of transformations and preservation of topology. Currently, unsupervised learning-based registration techniques primarily extract features at the image level, potentially limiting their efficacy. This paper proposes a novel unsupervised learning-based fully convolut…

    Submitted 3 May, 2024; v1 submitted 5 April, 2024; originally announced April 2024.

  11. arXiv:2403.14890  [pdf]

    cs.SI

    Unraveling Contagion Origins: Optimal Estimation through Maximum-Likelihood and Starlike Tree Approximation in Markovian Spreading Models

    Authors: Pei-Duo Yu, Chee Wei Tan, Liang Zheng, Chao Zhao

    Abstract: Identifying the source of epidemic-like spread in networks is crucial for tasks like removing internet viruses or finding the rumor source in online social networks. The challenge lies in tracing the source from a snapshot observation of infected nodes. How do we accurately pinpoint the source? Utilizing snapshot data, we apply a probabilistic approach, focusing on the graph boundary and the obser…

    Submitted 21 March, 2024; originally announced March 2024.

  12. arXiv:2403.07312  [pdf, other]

    cs.RO

    Multi-task Manipulation Policy Modeling with Visuomotor Latent Diffusion

    Authors: Wenhui Tan, Bei Liu, Junbo Zhang, Ruihua Song, Jianlong Fu

    Abstract: Modeling a generalized visuomotor policy has been a longstanding challenge for both computer vision and robotics communities. Existing approaches often fail to efficiently leverage cross-dataset resources or rely on heavy Vision-Language models, which require substantial computational resources, thereby limiting their multi-task performance and application potential. In this paper, we introduce a…

    Submitted 1 April, 2024; v1 submitted 12 March, 2024; originally announced March 2024.

  13. arXiv:2403.06400  [pdf, other]

    cs.CV

    DivCon: Divide and Conquer for Progressive Text-to-Image Generation

    Authors: Yuhao Jia, Wenhan Tan

    Abstract: Diffusion-driven text-to-image (T2I) generation has achieved remarkable advancements. To further improve T2I models' capability in numerical and spatial reasoning, the layout is employed as an intermedium to bridge large language models and layout-based diffusion models. However, these methods still struggle with generating images from textual prompts with multiple objects and complicated spatial…

    Submitted 10 March, 2024; originally announced March 2024.

  14. arXiv:2403.03186  [pdf, other]

    cs.AI

    Cradle: Empowering Foundation Agents Towards General Computer Control

    Authors: Weihao Tan, Wentao Zhang, Xinrun Xu, Haochong Xia, Ziluo Ding, Boyu Li, Bohan Zhou, Junpeng Yue, Jiechuan Jiang, Yewen Li, Ruyi An, Molei Qin, Chuqiao Zong, Longtao Zheng, Yujie Wu, Xiaoqiang Chai, Yifei Bi, Tianbao Xie, Pengjie Gu, Xiyun Li, Ceyao Zhang, Long Tian, Chaojie Wang, Xinrun Wang, Börje F. Karlsson, et al. (3 additional authors not shown)

    Abstract: Despite the success in specific scenarios, existing foundation agents still struggle to generalize across various virtual scenarios, mainly due to the dramatically different encapsulations of environments with manually designed observation and action spaces. To handle this issue, we propose the General Computer Control (GCC) setting to restrict foundation agents to interact with software through t…

    Submitted 2 July, 2024; v1 submitted 5 March, 2024; originally announced March 2024.

  15. arXiv:2402.13575  [pdf, other]

    cs.CV cs.AI

    Flexible Physical Camouflage Generation Based on a Differential Approach

    Authors: Yang Li, Wenyi Tan, Chenxing Zhao, Shuangju Zhou, Xinkai Liang, Quan Pan

    Abstract: This study introduces a novel approach to neural rendering, specifically tailored for adversarial camouflage, within an extensive 3D rendering framework. Our method, named FPA, goes beyond traditional techniques by faithfully simulating lighting conditions and material variations, ensuring a nuanced and realistic representation of textures on a 3D target. To achieve this, we employ a generative ap…

    Submitted 21 February, 2024; originally announced February 2024.

  16. arXiv:2402.01172  [pdf, other]

    cs.CL cs.SD eess.AS

    Streaming Sequence Transduction through Dynamic Compression

    Authors: Weiting Tan, Yunmo Chen, Tongfei Chen, Guanghui Qin, Haoran Xu, Heidi C. Zhang, Benjamin Van Durme, Philipp Koehn

    Abstract: We introduce STAR (Stream Transduction with Anchor Representations), a novel Transformer-based model designed for efficient sequence-to-sequence transduction over streams. STAR dynamically segments input streams to create compressed anchor representations, achieving nearly lossless compression (12x) in Automatic Speech Recognition (ASR) and outperforming existing methods. Moreover, STAR demonstrat…

    Submitted 2 February, 2024; originally announced February 2024.

  17. arXiv:2401.17542  [pdf, other]

    cs.LG cs.AI cs.CV

    A Medical Data-Effective Learning Benchmark for Highly Efficient Pre-training of Foundation Models

    Authors: Wenxuan Yang, Weimin Tan, Yuqi Sun, Bo Yan

    Abstract: Foundation models, pre-trained on massive datasets, have achieved unprecedented generalizability. However, is it truly necessary to involve such vast amounts of data in pre-training, consuming extensive computational resources? This paper introduces data-effective learning, aiming to use data in the most impactful way to pre-train foundation models. This involves strategies that focus on data qual…

    Submitted 15 April, 2024; v1 submitted 30 January, 2024; originally announced January 2024.

  18. arXiv:2401.15814  [pdf, other]

    cs.LG

    OntoMedRec: Logically-Pretrained Model-Agnostic Ontology Encoders for Medication Recommendation

    Authors: Weicong Tan, Weiqing Wang, Xin Zhou, Wray Buntine, Gordon Bingham, Hongzhi Yin

    Abstract: Most existing medication recommendation models learn representations for medical concepts based on electronic health records (EHRs) and make recommendations with learnt representations. However, most medications appear in the dataset for limited times, resulting in insufficient learning of their representations. Medical ontologies are the hierarchical classification systems for medical terms where…

    Submitted 14 February, 2024; v1 submitted 28 January, 2024; originally announced January 2024.

  19. arXiv:2401.14151  [pdf, other]

    cs.LG cs.AI cs.CL

    True Knowledge Comes from Practice: Aligning LLMs with Embodied Environments via Reinforcement Learning

    Authors: Weihao Tan, Wentao Zhang, Shanqi Liu, Longtao Zheng, Xinrun Wang, Bo An

    Abstract: Despite the impressive performance across numerous tasks, large language models (LLMs) often fail in solving simple decision-making tasks due to the misalignment of the knowledge in LLMs with environments. On the contrary, reinforcement learning (RL) agents learn policies from scratch, which makes them always align with environments but difficult to incorporate prior knowledge for efficient explor…

    Submitted 10 March, 2024; v1 submitted 25 January, 2024; originally announced January 2024.

    Comments: Accepted by ICLR2024

  20. arXiv:2401.13136  [pdf, other]

    cs.CL cs.AI

    The Language Barrier: Dissecting Safety Challenges of LLMs in Multilingual Contexts

    Authors: Lingfeng Shen, Weiting Tan, Sihao Chen, Yunmo Chen, Jingyu Zhang, Haoran Xu, Boyuan Zheng, Philipp Koehn, Daniel Khashabi

    Abstract: As the influence of large language models (LLMs) spans across global communities, their safety challenges in multilingual settings become paramount for alignment research. This paper examines the variations in safety challenges faced by LLMs across different languages and discusses approaches to alleviating such concerns. By comparing how state-of-the-art LLMs respond to the same set of malicious…

    Submitted 23 January, 2024; originally announced January 2024.

  21. arXiv:2401.08703  [pdf, other]

    cs.LG

    Decoupled Prototype Learning for Reliable Test-Time Adaptation

    Authors: Guowei Wang, Changxing Ding, Wentao Tan, Mingkui Tan

    Abstract: Test-time adaptation (TTA) is a task that continually adapts a pre-trained source model to the target domain during inference. One popular approach involves fine-tuning the model with cross-entropy loss according to estimated pseudo-labels. However, its performance is significantly affected by noisy pseudo-labels. This study reveals that minimizing the classification error of each sample causes the cr…

    Submitted 25 January, 2024; v1 submitted 14 January, 2024; originally announced January 2024.

    Comments: 12 pages, 5 figures

  22. arXiv:2401.08417  [pdf, other]

    cs.CL

    Contrastive Preference Optimization: Pushing the Boundaries of LLM Performance in Machine Translation

    Authors: Haoran Xu, Amr Sharaf, Yunmo Chen, Weiting Tan, Lingfeng Shen, Benjamin Van Durme, Kenton Murray, Young Jin Kim

    Abstract: Moderate-sized large language models (LLMs) -- those with 7B or 13B parameters -- exhibit promising machine translation (MT) performance. However, even the top-performing 13B LLM-based translation models, like ALMA, do not match the performance of state-of-the-art conventional encoder-decoder translation models or larger-scale LLMs such as GPT-4. In this study, we bridge this performance gap. We…

    Submitted 2 June, 2024; v1 submitted 16 January, 2024; originally announced January 2024.

    Comments: Accepted at ICML 2024

  23. arXiv:2401.08216  [pdf, other]

    cs.CR cs.LG

    Towards Efficient and Certified Recovery from Poisoning Attacks in Federated Learning

    Authors: Yu Jiang, Jiyuan Shen, Ziyao Liu, Chee Wei Tan, Kwok-Yan Lam

    Abstract: Federated learning (FL) is vulnerable to poisoning attacks, where malicious clients manipulate their updates to affect the global model. Although various methods exist for detecting those clients in FL, identifying malicious clients requires sufficient model updates, and hence by the time malicious clients are detected, FL models have been already poisoned. Thus, a method is needed to recover an a…

    Submitted 19 January, 2024; v1 submitted 16 January, 2024; originally announced January 2024.

  24. arXiv:2401.07395  [pdf, other]

    cs.LG cs.AI

    Harnessing the Power of Beta Scoring in Deep Active Learning for Multi-Label Text Classification

    Authors: Wei Tan, Ngoc Dang Nguyen, Lan Du, Wray Buntine

    Abstract: Within the scope of natural language processing, the domain of multi-label text classification is uniquely challenging due to its expansive and uneven label distribution. The complexity deepens due to the demand for an extensive set of annotated data for training an advanced deep learning model, especially in specialized fields where the labeling task can be labor-intensive and often requires doma…

    Submitted 14 January, 2024; originally announced January 2024.

    Comments: 7 pages AAAI 2024

  25. arXiv:2312.16907  [pdf, other]

    cs.CV

    DOEPatch: Dynamically Optimized Ensemble Model for Adversarial Patches Generation

    Authors: Wenyi Tan, Yang Li, Chenxing Zhao, Zhunga Liu, Quan Pan

    Abstract: Object detection is a fundamental task in various applications ranging from autonomous driving to intelligent security systems. However, recognition of a person can be hindered when their clothing is decorated with carefully designed graffiti patterns, leading to the failure of object detection. To achieve greater attack potential against unknown black-box models, adversarial patches capable of af…

    Submitted 28 December, 2023; originally announced December 2023.

  26. arXiv:2312.13614  [pdf, other]

    cs.LG cs.CL

    Structure-Aware Path Inference for Neural Finite State Transducers

    Authors: Weiting Tan, Chu-cheng Lin, Jason Eisner

    Abstract: Neural finite-state transducers (NFSTs) form an expressive family of neurosymbolic sequence transduction models. An NFST models each string pair as having been generated by a latent path in a finite-state transducer. As they are deep generative models, both training and inference of NFSTs require inference networks that approximate posterior distributions over such latent variables. In this paper,…

    Submitted 21 December, 2023; originally announced December 2023.

    Comments: In Proceedings of ICBINB Workshop at NeurIPS 2023

  27. arXiv:2312.10890  [pdf, other]

    cs.CV cs.GR

    Low-latency Space-time Supersampling for Real-time Rendering

    Authors: Ruian He, Shili Zhou, Yuqi Sun, Ri Cheng, Weimin Tan, Bo Yan

    Abstract: With the rise of real-time rendering and the evolution of display devices, there is a growing demand for post-processing methods that offer high-resolution content in a high frame rate. Existing techniques often suffer from quality and latency issues due to the disjointed treatment of frame supersampling and extrapolation. In this paper, we recognize the shared context and mechanisms between frame…

    Submitted 17 December, 2023; originally announced December 2023.

    Comments: Accepted to AAAI 2024

  28. Bayesian Estimate of Mean Proper Scores for Diversity-Enhanced Active Learning

    Authors: Wei Tan, Lan Du, Wray Buntine

    Abstract: The effectiveness of active learning largely depends on the sampling efficiency of the acquisition function. Expected Loss Reduction (ELR) focuses on a Bayesian estimate of the reduction in classification error, and more general costs fit in the same framework. We propose Bayesian Estimate of Mean Proper Scores (BEMPS) to estimate the increase in strictly proper scores such as log probability or n…

    Submitted 15 December, 2023; originally announced December 2023.

    Comments: 16 pages, TPAMI. arXiv admin note: text overlap with arXiv:2110.14171

    Journal ref: TPAMI, 2023

  29. arXiv:2312.07180  [pdf, other]

    cs.CV

    Context-Aware Iteration Policy Network for Efficient Optical Flow Estimation

    Authors: Ri Cheng, Ruian He, Xuhao Jiang, Shili Zhou, Weimin Tan, Bo Yan

    Abstract: Existing recurrent optical flow estimation networks are computationally expensive since they use a fixed large number of iterations to update the flow field for each sample. An efficient network should skip iterations when the flow improvement is limited. In this paper, we develop a Context-Aware Iteration Policy Network for efficient optical flow estimation, which determines the optimal number of…

    Submitted 5 January, 2024; v1 submitted 12 December, 2023; originally announced December 2023.

    Comments: 2024, Association for the Advancement of Artificial Intelligence

  30. arXiv:2312.03998  [pdf, other]

    cs.LG

    Series2Vec: Similarity-based Self-supervised Representation Learning for Time Series Classification

    Authors: Navid Mohammadi Foumani, Chang Wei Tan, Geoffrey I. Webb, Hamid Rezatofighi, Mahsa Salehi

    Abstract: We argue that time series analysis is fundamentally different in nature to either vision or natural language processing with respect to the forms of meaningful self-supervised learning tasks that can be defined. Motivated by this insight, we introduce a novel approach called Series2Vec for self-supervised representation learning. Unlike other self-supervised methods in time series, which…

    Submitted 12 December, 2023; v1 submitted 6 December, 2023; originally announced December 2023.

  31. arXiv:2311.14708  [pdf, other]

    cs.CY cs.AI cs.CL cs.HC

    Large Language Model-Driven Classroom Flipping: Empowering Student-Centric Peer Questioning with Flipped Interaction

    Authors: Chee Wei Tan

    Abstract: Reciprocal questioning is essential for effective teaching and learning, fostering active engagement and deeper understanding through collaborative interactions, especially in large classrooms. Can large language models (LLMs), such as OpenAI's GPT (Generative Pre-trained Transformer) series, assist in this? This paper investigates a pedagogical approach of classroom flipping based on flipped intera…

    Submitted 14 November, 2023; originally announced November 2023.

    Comments: Submitted

  32. arXiv:2311.12315  [pdf, other]

    cs.CL

    AcademicGPT: Empowering Academic Research

    Authors: Shufa Wei, Xiaolong Xu, Xianbiao Qi, Xi Yin, Jun Xia, Jingyi Ren, Peijun Tang, Yuxiang Zhong, Yihao Chen, Xiaoqin Ren, Yuxin Liang, Liankai Huang, Kai Xie, Weikang Gui, Wei Tan, Shuanglong Sun, Yongquan Hu, Qinxian Liu, Nanjin Li, Chihao Dai, Lihua Wang, Xiaohui Liu, Lei Zhang, Yutao Xie

    Abstract: Large Language Models (LLMs) have demonstrated exceptional capabilities across various natural language processing tasks. Yet, many of these advanced LLMs are tailored for broad, general-purpose applications. In this technical report, we introduce AcademicGPT, designed specifically to empower academic research. AcademicGPT is a continual training model derived from LLaMA2-70B. Our training corpus…

    Submitted 20 November, 2023; originally announced November 2023.

    Comments: Technical Report. arXiv admin note: text overlap with arXiv:2310.12081, arXiv:2310.10053 by other authors

  33. arXiv:2311.05707  [pdf, other]

    cs.CV

    FMViT: A multiple-frequency mixing Vision Transformer

    Authors: Wei Tan, Yifeng Geng, Xuansong Xie

    Abstract: The transformer model has gained widespread adoption in computer vision tasks in recent times. However, due to the quadratic time and memory complexity of self-attention, which is proportional to the number of input tokens, most existing Vision Transformers (ViTs) encounter challenges in achieving efficient performance in practical industrial deployment scenarios, such as TensorRT and CoreML, wher…

    Submitted 9 November, 2023; originally announced November 2023.

  34. arXiv:2311.04918  [pdf, other]

    cs.CL cs.LG

    Low-Resource Named Entity Recognition: Can One-vs-All AUC Maximization Help?

    Authors: Ngoc Dang Nguyen, Wei Tan, Lan Du, Wray Buntine, Richard Beare, Changyou Chen

    Abstract: Named entity recognition (NER), a task that identifies and categorizes named entities such as persons or organizations from text, is traditionally framed as a multi-class classification problem. However, this approach often overlooks the issues of imbalanced label distributions, particularly in low-resource settings, which is common in certain NER contexts, like biomedical NER (bioNER). To address…

    Submitted 2 November, 2023; originally announced November 2023.

    Comments: 6 pages, 3 figures, ICDM 2023

  35. arXiv:2311.02310  [pdf, other]

    cs.CL

    Narrowing the Gap between Zero- and Few-shot Machine Translation by Matching Styles

    Authors: Weiting Tan, Haoran Xu, Lingfeng Shen, Shuyue Stella Li, Kenton Murray, Philipp Koehn, Benjamin Van Durme, Yunmo Chen

    Abstract: Large language models trained primarily in a monolingual setting have demonstrated their ability to generalize to machine translation using zero- and few-shot examples with in-context learning. However, even though zero-shot translations are relatively good, there remains a discernible gap comparing their performance with the few-shot setting. In this paper, we investigate the factors contributing…

    Submitted 3 November, 2023; originally announced November 2023.

  36. arXiv:2311.00906  [pdf, other]

    cs.CL cs.LG

    Re-weighting Tokens: A Simple and Effective Active Learning Strategy for Named Entity Recognition

    Authors: Haocheng Luo, Wei Tan, Ngoc Dang Nguyen, Lan Du

    Abstract: Active learning, a widely adopted technique for enhancing machine learning models in text and image classification tasks with limited annotation resources, has received relatively little attention in the domain of Named Entity Recognition (NER). The challenge of data imbalance in NER has hindered the effectiveness of active learning, as sequence labellers lack sufficient learning signals. To addre…

    Submitted 1 November, 2023; originally announced November 2023.

  37. KyberMat: Efficient Accelerator for Matrix-Vector Polynomial Multiplication in CRYSTALS-Kyber Scheme via NTT and Polyphase Decomposition

    Authors: Weihang Tan, Yingjie Lao, Keshab K. Parhi

    Abstract: CRYSTALS-Kyber (Kyber) is one of the post-quantum cryptography (PQC) key-encapsulation mechanism (KEM) schemes selected during the standardization process. This paper addresses optimization for Kyber architecture with respect to latency and throughput constraints. Specifically, matrix-vector multiplication and number theoretic transform (NTT)-based polynomial multiplication are critical operations…

    Submitted 6 October, 2023; originally announced October 2023.

    Comments: Proc. 2023 IEEE/ACM International Conference on Computer Aided Design (ICCAD), San Francisco, CA, Oct. 29 - Nov. 2, 2023

    Journal ref: 2023 IEEE/ACM International Conference on Computer Aided Design (ICCAD)

  38. arXiv:2309.12218  [pdf, other]

    cs.IR cs.LG

    SR-PredictAO: Session-based Recommendation with High-Capability Predictor Add-On

    Authors: Ruida Wang, Raymond Chi-Wing Wong, Weile Tan

    Abstract: Session-based recommendation, aiming at making the prediction of the user's next item click based on the information in a single session only even in the presence of some random user's behavior, is a complex problem. This complex problem requires a high-capability model of predicting the user's next action. Most (if not all) existing models follow the encoder-predictor paradigm where all studies f…

    Submitted 20 September, 2023; originally announced September 2023.

  39. arXiv:2309.11039  [pdf, other]

    cs.LG cs.AI cs.DC

    Federated Learning in Intelligent Transportation Systems: Recent Applications and Open Problems

    Authors: Shiying Zhang, Jun Li, Long Shi, Ming Ding, Dinh C. Nguyen, Wuzheng Tan, Jian Weng, Zhu Han

    Abstract: Intelligent transportation systems (ITSs) have been fueled by the rapid development of communication technologies, sensor technologies, and the Internet of Things (IoT). Nonetheless, due to the dynamic characteristics of the vehicle networks, it is rather challenging to make timely and accurate decisions of vehicle behaviors. Moreover, in the presence of mobile wireless communications, the privacy…

    Submitted 19 September, 2023; originally announced September 2023.

  40. arXiv:2309.08273  [pdf, other]

    cs.CV

    A Generative Framework for Self-Supervised Facial Representation Learning

    Authors: Ruian He, Zhen Xing, Weimin Tan, Bo Yan

    Abstract: Self-supervised representation learning has gained increasing attention for strong generalization ability without relying on paired datasets. However, it has not been explored sufficiently for facial representation. Self-supervised facial representation learning remains unsolved due to the coupling of facial identities, expressions, and external factors like pose and light. Prior methods primarily…

    Submitted 22 May, 2024; v1 submitted 15 September, 2023; originally announced September 2023.

  41. arXiv:2308.07748  [pdf, other]

    cs.CV cs.AI cs.LG cs.RO

    Exploiting Sparsity in Automotive Radar Object Detection Networks

    Authors: Marius Lippke, Maurice Quach, Sascha Braun, Daniel Köhler, Michael Ulrich, Bastian Bischoff, Wei Yap Tan

    Abstract: Having precise perception of the environment is crucial for ensuring the secure and reliable functioning of autonomous driving systems. Radar object detection networks are one fundamental part of such systems. CNN-based object detectors showed good performance in this context, but they require large compute resources. This paper investigates sparse convolutional object detection networks, which co…

    Submitted 15 August, 2023; originally announced August 2023.

  42. arXiv:2308.01568  [pdf, other]

    cs.CV

    MVFlow: Deep Optical Flow Estimation of Compressed Videos with Motion Vector Prior

    Authors: Shili Zhou, Xuhao Jiang, Weimin Tan, Ruian He, Bo Yan

    Abstract: In recent years, many deep learning-based methods have been proposed to tackle the problem of optical flow estimation and achieved promising results. However, they hardly consider that most videos are compressed and thus ignore the pre-computed information in compressed video streams. Motion vectors, one of the compression information, record the motion of the video frames. They can be directly ex…

    Submitted 4 August, 2023; v1 submitted 3 August, 2023; originally announced August 2023.

    Comments: Accepted by ACM MM 2023

  43. arXiv:2307.16586  [pdf, other]

    cs.CV

    SAMFlow: Eliminating Any Fragmentation in Optical Flow with Segment Anything Model

    Authors: Shili Zhou, Ruian He, Weimin Tan, Bo Yan

    Abstract: Optical Flow Estimation aims to find the 2D dense motion field between two frames. Due to the limitation of model structures and training datasets, existing methods often rely too much on local clues and ignore the integrity of objects, resulting in fragmented motion estimation. Through theoretical analysis, we find the pre-trained large vision models are helpful in optical flow estimation, and we…

    Submitted 21 December, 2023; v1 submitted 31 July, 2023; originally announced July 2023.

    Comments: Accepted by AAAI 2024

  44. Uncertainty-Guided Spatial Pruning Architecture for Efficient Frame Interpolation

    Authors: Ri Cheng, Xuhao Jiang, Ruian He, Shili Zhou, Weimin Tan, Bo Yan

    Abstract: The video frame interpolation (VFI) model applies the convolution operation to all locations, leading to redundant computations in regions with easy motion. We can use dynamic spatial pruning method to skip redundant computation, but this method cannot properly identify easy regions in VFI tasks without supervision. In this paper, we develop an Uncertainty-Guided Spatial Pruning (UGSP) architectur…

    Submitted 27 October, 2023; v1 submitted 31 July, 2023; originally announced July 2023.

    Comments: ACM Multimedia 2023

  45. arXiv:2307.14349  [pdf, other]

    cs.SE cs.AI

    Copilot for Xcode: Exploring AI-Assisted Programming by Prompting Cloud-based Large Language Models

    Authors: Chee Wei Tan, Shangxin Guo, Man Fai Wong, Ching Nam Hang

    Abstract: This paper presents an AI-assisted programming tool called Copilot for Xcode for program composition and design to support human software developers. By seamlessly integrating cloud-based Large Language Models (LLM) with Apple's local development environment, Xcode, this tool enhances productivity and unleashes creativity for software development in Apple software ecosystem (e.g., iOS apps, macOS)…

    Submitted 8 July, 2023; originally announced July 2023.

  46. arXiv:2307.13716  [pdf, other]

    cs.LG cs.AI

    FedDRL: A Trustworthy Federated Learning Model Fusion Method Based on Staged Reinforcement Learning

    Authors: Leiming Chen, Weishan Zhang, Cihao Dong, Sibo Qiao, Ziling Huang, Yuming Nie, Zhaoxiang Hou, Chee Wei Tan

    Abstract: Traditional federated learning uses the number of samples to calculate the weights of each client model and uses this fixed weight value to fusion the global model. However, in practical scenarios, each client's device and data heterogeneity leads to differences in the quality of each client's model. Thus the contribution to the global model is not wholly determined by the sample size. In addition…

    Submitted 19 March, 2024; v1 submitted 25 July, 2023; originally announced July 2023.

  47. arXiv:2307.04291  [pdf, other]

    cs.SE

    Wait, wasn't that code here before? Detecting Outdated Software Documentation

    Authors: Wen Siang Tan, Markus Wagner, Christoph Treude

    Abstract: Encountering outdated documentation is not a rare occurrence for developers and users in the software engineering community. To ensure that software documentation is up-to-date, developers often have to manually check whether the documentation needs to be updated whenever changes are made to the source code. In our previous work, we proposed an approach to automatically detect outdated code elemen…

    Submitted 9 July, 2023; originally announced July 2023.

  48. Contagion Source Detection in Epidemic and Infodemic Outbreaks: Mathematical Analysis and Network Algorithms

    Authors: Chee Wei Tan, Pei-Duo Yu

    Abstract: This monograph provides an overview of the mathematical theories and computational algorithm design for contagion source detection in large networks. By leveraging network centrality as a tool for statistical inference, we can accurately identify the source of contagions, trace their spread, and predict future trajectories. This approach provides fundamental insights into surveillance capability a…

    Submitted 8 July, 2023; originally announced July 2023.

    Comments: Suggested Citation: Chee Wei Tan and Pei-Duo Yu (2023), "Contagion Source Detection in Epidemic and Infodemic Outbreaks: Mathematical Analysis and Network Algorithms", Foundations and Trends in Networking: Vol. 13: No. 2-3, pp 107-251. http://dx.doi.org/10.1561/1300000068

  49. arXiv:2307.02503  [pdf, other]

    cs.SE cs.AI cs.CL

    Natural Language Generation and Understanding of Big Code for AI-Assisted Programming: A Review

    Authors: Man Fai Wong, Shangxin Guo, Ching Nam Hang, Siu Wai Ho, Chee Wei Tan

    Abstract: This paper provides a comprehensive review of the literature concerning the utilization of Natural Language Processing (NLP) techniques, with a particular focus on transformer-based large language models (LLMs) trained using Big Code, within the domain of AI-assisted programming tasks. LLMs, augmented with software naturalness, have played a crucial role in facilitating AI-assisted programming app…

    Submitted 4 July, 2023; originally announced July 2023.

    Journal ref: Entropy (2023), 25(6), 888

  50. arXiv:2307.00773  [pdf, other]

    cs.CV cs.AI

    DifFSS: Diffusion Model for Few-Shot Semantic Segmentation

    Authors: Weimin Tan, Siyuan Chen, Bo Yan

    Abstract: Diffusion models have demonstrated excellent performance in image generation. Although various few-shot semantic segmentation (FSS) models with different network structures have been proposed, performance improvement has reached a bottleneck. This paper presents the first work to leverage the diffusion model for FSS task, called DifFSS. DifFSS, a novel FSS paradigm, can further improve the perform…

    Submitted 11 October, 2023; v1 submitted 3 July, 2023; originally announced July 2023.

    Comments: Code is available at https://github.com/TrinitialChan/DifFSS