Showing 1–50 of 269 results for author: Tan, S

Searching in archive cs.
  1. arXiv:2406.19958  [pdf, other]

    stat.ML cs.LG math.ST

    The Computational Curse of Big Data for Bayesian Additive Regression Trees: A Hitting Time Analysis

    Authors: Yan Shuo Tan, Omer Ronen, Theo Saarinen, Bin Yu

    Abstract: Bayesian Additive Regression Trees (BART) is a popular Bayesian non-parametric regression model that is commonly used in causal inference and beyond. Its strong predictive performance is supported by theoretical guarantees that its posterior distribution concentrates around the true regression function at optimal rates under various data generative settings and for appropriate prior choices. In th…

    Submitted 28 June, 2024; originally announced June 2024.

    MSC Class: 62G08; 65C40

  2. arXiv:2406.13124  [pdf, other]

    cs.CL

    Learning to Generate Answers with Citations via Factual Consistency Models

    Authors: Rami Aly, Zhiqiang Tang, Samson Tan, George Karypis

    Abstract: Large Language Models (LLMs) frequently hallucinate, impeding their reliability in mission-critical situations. One approach to address this issue is to provide citations to relevant sources alongside generated content, enhancing the verifiability of generations. However, citing passages accurately in answers remains a substantial challenge. This paper proposes a weakly-supervised fine-tuning meth…

    Submitted 18 June, 2024; originally announced June 2024.

    Comments: Accepted to ACL 2024. Code release will follow

  3. arXiv:2406.12800  [pdf, other]

    cs.CR

    Supporting Human Raters with the Detection of Harmful Content using Large Language Models

    Authors: Kurt Thomas, Patrick Gage Kelley, David Tao, Sarah Meiklejohn, Owen Vallis, Shunwen Tan, Blaž Bratanič, Felipe Tiengo Ferreira, Vijay Kumar Eranti, Elie Bursztein

    Abstract: In this paper, we explore the feasibility of leveraging large language models (LLMs) to automate or otherwise assist human raters with identifying harmful content including hate speech, harassment, violent extremism, and election misinformation. Using a dataset of 50,000 comments, we demonstrate that LLMs can achieve 90% accuracy when compared to human verdicts. We explore how to best leverage the…

    Submitted 18 June, 2024; originally announced June 2024.

  4. arXiv:2406.12649  [pdf, other]

    cs.LG cs.AI cs.CV stat.ML

    Probabilistic Conceptual Explainers: Trustworthy Conceptual Explanations for Vision Foundation Models

    Authors: Hengyi Wang, Shiwei Tan, Hao Wang

    Abstract: Vision transformers (ViTs) have emerged as a significant area of focus, particularly for their capacity to be jointly trained with large language models and to serve as robust vision foundation models. Yet, the development of trustworthy explanation methods for ViTs has lagged, particularly in the context of post-hoc interpretations of ViT predictions. Existing sub-image selection approaches, such…

    Submitted 18 June, 2024; v1 submitted 18 June, 2024; originally announced June 2024.

    Comments: Accepted at ICML 2024

  5. arXiv:2406.12313  [pdf]

    cs.DB

    A framework for developing a knowledge management platform

    Authors: Marie Lisandra Zepeda Mendoza, Sonali Agarwal, James A. Blackshaw, Vanesa Bol, Audrey Fazzi, Filippo Fiorini, Amy Louise Foreman, Nancy George, Brett R. Johnson, Brian Martin, Dave McComb, Euphemia Mutasa-Gottgens, Helen Parkinson, Martin Romacker, Rolf Russell, Valérien Ségard, Shawn Zheng Kai Tan, Wei Kheng Teh, F. P. Winstanley, Benedict Wong, Adrian M. Smith

    Abstract: Knowledge management (KM) involves collecting, organizing, storing, and disseminating information to improve decision-making, innovation, and performance. Implementing KM at scale has become essential for organizations to effectively leverage vast accessible data. This paper is a compilation of concepts that emerged from KM workshops hosted by EMBL-EBI, attended by SMEs and industry. We provide gu…

    Submitted 18 June, 2024; originally announced June 2024.

    Comments: 18 pages, 1 figure

  6. arXiv:2406.11230  [pdf, other]

    cs.LG cs.AI cs.CL cs.CV

    Multimodal Needle in a Haystack: Benchmarking Long-Context Capability of Multimodal Large Language Models

    Authors: Hengyi Wang, Haizhou Shi, Shiwei Tan, Weiyi Qin, Wenyuan Wang, Tunyu Zhang, Akshay Nambi, Tanuja Ganu, Hao Wang

    Abstract: Multimodal Large Language Models (MLLMs) have shown significant promise in various applications, leading to broad interest from researchers and practitioners alike. However, a comprehensive evaluation of their long-context capabilities remains underexplored. To address these gaps, we introduce the MultiModal Needle-in-a-haystack (MMNeedle) benchmark, specifically designed to assess the long-contex…

    Submitted 17 June, 2024; originally announced June 2024.

  7. arXiv:2406.10290  [pdf, other]

    cs.CL cs.AI cs.LG

    MobileAIBench: Benchmarking LLMs and LMMs for On-Device Use Cases

    Authors: Rithesh Murthy, Liangwei Yang, Juntao Tan, Tulika Manoj Awalgaonkar, Yilun Zhou, Shelby Heinecke, Sachin Desai, Jason Wu, Ran Xu, Sarah Tan, Jianguo Zhang, Zhiwei Liu, Shirley Kokane, Zuxin Liu, Ming Zhu, Huan Wang, Caiming Xiong, Silvio Savarese

    Abstract: The deployment of Large Language Models (LLMs) and Large Multimodal Models (LMMs) on mobile devices has gained significant attention due to the benefits of enhanced privacy, stability, and personalization. However, the hardware constraints of mobile devices necessitate the use of models with fewer parameters and model compression techniques like quantization. Currently, there is limited understand…

    Submitted 12 June, 2024; originally announced June 2024.

  8. arXiv:2406.07866  [pdf, other]

    cs.LG math.OC

    Asymptotically Optimal Regret for Black-Box Predict-then-Optimize

    Authors: Samuel Tan, Peter I. Frazier

    Abstract: We consider the predict-then-optimize paradigm for decision-making in which a practitioner (1) trains a supervised learning model on historical data of decisions, contexts, and rewards, and then (2) uses the resulting model to make future binary decisions for new contexts by finding the decision that maximizes the model's predicted reward. This approach is common in industry. Past analysis assumes…

    Submitted 12 June, 2024; originally announced June 2024.

    Comments: 15 pages, 2 figures, 3 tables

  9. arXiv:2405.16003  [pdf, other]

    cs.AI cs.CY cs.LG

    Disentangling Heterogeneous Knowledge Concept Embedding for Cognitive Diagnosis on Untested Knowledge

    Authors: Kui Xiao, Runtian Xing, Miao Zhang, Shunfeng Tan, Ziming Wang, Xiaolian Zhu

    Abstract: Cognitive diagnosis is a fundamental and critical task in learning assessment, which aims to infer students' proficiency on knowledge concepts from their response logs. Current works assume each knowledge concept will certainly be tested and covered by multiple exercises. However, whether online or offline courses, it's hardly feasible to completely cover all knowledge concepts in several exercise…

    Submitted 24 May, 2024; originally announced May 2024.

  10. arXiv:2405.14782  [pdf, other]

    cs.CL

    Lessons from the Trenches on Reproducible Evaluation of Language Models

    Authors: Stella Biderman, Hailey Schoelkopf, Lintang Sutawika, Leo Gao, Jonathan Tow, Baber Abbasi, Alham Fikri Aji, Pawan Sasanka Ammanamanchi, Sidney Black, Jordan Clive, Anthony DiPofi, Julen Etxaniz, Benjamin Fattori, Jessica Zosa Forde, Charles Foster, Jeffrey Hsu, Mimansa Jaiswal, Wilson Y. Lee, Haonan Li, Charles Lovering, Niklas Muennighoff, Ellie Pavlick, Jason Phang, Aviya Skowron, Samson Tan, et al. (5 additional authors not shown)

    Abstract: Effective evaluation of language models remains an open challenge in NLP. Researchers and engineers face methodological issues such as the sensitivity of models to evaluation setup, difficulty of proper comparisons across methods, and the lack of reproducibility and transparency. In this paper we draw on three years of experience in evaluating large language models to provide guidance and lessons…

    Submitted 29 May, 2024; v1 submitted 23 May, 2024; originally announced May 2024.

  11. arXiv:2405.12462   

    cs.LG cs.AI

    Boosting X-formers with Structured Matrix for Long Sequence Time Series Forecasting

    Authors: Zhicheng Zhang, Yong Wang, Shaoqi Tan, Bowei Xia, Yujie Luo

    Abstract: Transformer-based models for long sequence time series forecasting (LSTF) problems have gained significant attention due to their exceptional forecasting precision. As the cornerstone of these models, the self-attention mechanism poses a challenge to efficient training and inference due to its quadratic time complexity. In this article, we propose a novel architectural design for Transformer-based…

    Submitted 22 May, 2024; v1 submitted 20 May, 2024; originally announced May 2024.

    Comments: We believe this work is premature and requires further study

  12. arXiv:2405.05413  [pdf]

    cs.DB

    Digital Evolution: Novo Nordisk's Shift to Ontology-Based Data Management

    Authors: Shawn Zheng Kai Tan, Shounak Baksi, Thomas Gade Bjerregaard, Preethi Elangovan, Thrishna Kuttikattu Gopalakrishnan, Darko Hric, Joffrey Joumaa, Beidi Li, Kashif Rabbani, Santhosh Kannan Venkatesan, Joshua Daniel Valdez, Saritha Vettikunnel Kuriakose

    Abstract: Biomedical data is growing exponentially, and managing it is increasingly challenging. While Findable, Accessible, Interoperable and Reusable (FAIR) data principles provide guidance, their adoption has proven difficult, especially in larger enterprises like pharmaceutical companies. In this manuscript, we describe how we leverage an Ontology-Based Data Management (OBDM) strategy for digital transf…

    Submitted 10 May, 2024; v1 submitted 8 May, 2024; originally announced May 2024.

    Comments: 14 pages, 2 figures

  13. arXiv:2405.02213  [pdf, other]

    cs.SE cs.AI cs.LG

    Automatic Programming: Large Language Models and Beyond

    Authors: Michael R. Lyu, Baishakhi Ray, Abhik Roychoudhury, Shin Hwei Tan, Patanamon Thongtanunam

    Abstract: Automatic programming has seen increasing popularity due to the emergence of tools like GitHub Copilot which rely on Large Language Models (LLMs). At the same time, automatically generated code faces challenges during deployment due to concerns around quality and trust. In this article, we study automated coding in a general sense and study the concerns around code quality, security and related is…

    Submitted 15 May, 2024; v1 submitted 3 May, 2024; originally announced May 2024.

  14. arXiv:2405.01350  [pdf, other]

    cs.LG cs.SI

    Community-Invariant Graph Contrastive Learning

    Authors: Shiyin Tan, Dongyuan Li, Renhe Jiang, Ying Zhang, Manabu Okumura

    Abstract: Graph augmentation has received great attention in recent years for graph contrastive learning (GCL) to learn well-generalized node/graph representations. However, mainstream GCL methods often favor randomly disrupting graphs for augmentation, which shows limited generalization and inevitably leads to the corruption of high-level graph information, i.e., the graph community. Moreover, current know…

    Submitted 2 May, 2024; originally announced May 2024.

    Comments: This paper is accepted by ICML-2024

  15. arXiv:2404.17126  [pdf, other]

    cs.LG cs.AI eess.IV physics.med-ph

    Deep Evidential Learning for Dose Prediction

    Authors: Hai Siong Tan, Kuancheng Wang, Rafe Mcbeth

    Abstract: In this work, we present a novel application of an uncertainty-quantification framework called Deep Evidential Learning in the domain of radiotherapy dose prediction. Using medical images of the Open Knowledge-Based Planning Challenge dataset, we found that this model can be effectively harnessed to yield uncertainty estimates that inherited correlations with prediction errors upon completion of n…

    Submitted 25 April, 2024; originally announced April 2024.

    Comments: 24 pages, 8 figures

  16. arXiv:2404.15163  [pdf, other]

    cs.CV eess.IV

    Adaptive Mixed-Scale Feature Fusion Network for Blind AI-Generated Image Quality Assessment

    Authors: Tianwei Zhou, Songbai Tan, Wei Zhou, Yu Luo, Yuan-Gen Wang, Guanghui Yue

    Abstract: With the increasing maturity of the text-to-image and image-to-image generative models, AI-generated images (AGIs) have shown great application potential in advertisement, entertainment, education, social media, etc. Although remarkable advancements have been achieved in generative models, very few efforts have been paid to design relevant quality assessment models. In this paper, we propose a nov…

    Submitted 23 April, 2024; originally announced April 2024.

    Comments: IEEE Transactions on Broadcasting (TBC)

  17. arXiv:2404.11201  [pdf, other]

    cs.CL

    Neuron Specialization: Leveraging intrinsic task modularity for multilingual machine translation

    Authors: Shaomu Tan, Di Wu, Christof Monz

    Abstract: Training a unified multilingual model promotes knowledge transfer but inevitably introduces negative interference. Language-specific modeling methods show promise in reducing interference. However, they often rely on heuristics to distribute capacity and struggle to foster cross-lingual transfer via isolated modules. In this paper, we explore intrinsic task modularity within multilingual networks…

    Submitted 17 April, 2024; originally announced April 2024.

  18. arXiv:2404.08877  [pdf, other]

    cs.SE cs.CL cs.LG

    Aligning LLMs for FL-free Program Repair

    Authors: Junjielong Xu, Ying Fu, Shin Hwei Tan, Pinjia He

    Abstract: Large language models (LLMs) have achieved decent results on automated program repair (APR). However, the next token prediction training objective of decoder-only LLMs (e.g., GPT-4) is misaligned with the masked span prediction objective of current infilling-style methods, which impedes LLMs from fully leveraging pre-trained knowledge for program repair. In addition, while some LLMs are capable of…

    Submitted 12 April, 2024; originally announced April 2024.

  19. arXiv:2404.07979  [pdf, other]

    cs.CL cs.AI cs.LG

    LLoCO: Learning Long Contexts Offline

    Authors: Sijun Tan, Xiuyu Li, Shishir Patil, Ziyang Wu, Tianjun Zhang, Kurt Keutzer, Joseph E. Gonzalez, Raluca Ada Popa

    Abstract: Processing long contexts remains a challenge for large language models (LLMs) due to the quadratic computational and memory overhead of the self-attention mechanism and the substantial KV cache sizes during generation. We propose a novel approach to address this problem by learning contexts offline through context compression and in-domain parameter-efficient finetuning. Our method enables an LLM…

    Submitted 11 April, 2024; originally announced April 2024.

    Comments: The first two authors contributed equally to this work

  20. arXiv:2404.01647  [pdf, other]

    cs.CV

    EDTalk: Efficient Disentanglement for Emotional Talking Head Synthesis

    Authors: Shuai Tan, Bin Ji, Mengxiao Bi, Ye Pan

    Abstract: Achieving disentangled control over multiple facial motions and accommodating diverse input modalities greatly enhances the application and entertainment of the talking head generation. This necessitates a deep exploration of the decoupling space for facial features, ensuring that they a) operate independently without mutual interference and b) can be preserved to share with different modal input,…

    Submitted 2 April, 2024; originally announced April 2024.

    Comments: 22 pages, 15 figures

  21. arXiv:2403.15132  [pdf, other]

    cs.CV eess.IV

    Transfer CLIP for Generalizable Image Denoising

    Authors: Jun Cheng, Dong Liang, Shan Tan

    Abstract: Image denoising is a fundamental task in computer vision. While prevailing deep learning-based supervised and self-supervised methods have excelled in eliminating in-distribution noise, their susceptibility to out-of-distribution (OOD) noise remains a significant challenge. The recent emergence of contrastive language-image pre-training (CLIP) model has showcased exceptional capabilities in open-w…

    Submitted 22 March, 2024; originally announced March 2024.

    Comments: Accepted by CVPR2024

  22. arXiv:2403.08245  [pdf, other]

    cs.LG cs.DC

    Scattered Mixture-of-Experts Implementation

    Authors: Shawn Tan, Yikang Shen, Rameswar Panda, Aaron Courville

    Abstract: We present ScatterMoE, an implementation of Sparse Mixture-of-Experts (SMoE) on GPUs. ScatterMoE builds upon existing implementations, and overcoming some of the limitations to improve inference and training speed, and memory footprint. This implementation achieves this by avoiding padding and making excessive copies of the input. We introduce ParallelLinear, the main component we use to build our…

    Submitted 13 March, 2024; originally announced March 2024.

  23. arXiv:2403.06375  [pdf, other]

    cs.CV

    FlowVQTalker: High-Quality Emotional Talking Face Generation through Normalizing Flow and Quantization

    Authors: Shuai Tan, Bin Ji, Ye Pan

    Abstract: Generating emotional talking faces is a practical yet challenging endeavor. To create a lifelike avatar, we draw upon two critical insights from a human perspective: 1) The connection between audio and the non-deterministic facial dynamics, encompassing expressions, blinks, poses, should exhibit synchronous and one-to-many mapping. 2) Vibrant expressions are often accompanied by emotion-aware high…

    Submitted 22 April, 2024; v1 submitted 10 March, 2024; originally announced March 2024.

    Comments: 11 pages, 11 figures, conference

  24. arXiv:2403.06365  [pdf, other]

    cs.CV

    Style2Talker: High-Resolution Talking Head Generation with Emotion Style and Art Style

    Authors: Shuai Tan, Bin Ji, Ye Pan

    Abstract: Although automatically animating audio-driven talking heads has recently received growing interest, previous efforts have mainly concentrated on achieving lip synchronization with the audio, neglecting two crucial elements for generating expressive videos: emotion style and art style. In this paper, we present an innovative audio-driven talking face generation method called Style2Talker. It involv…

    Submitted 11 March, 2024; v1 submitted 10 March, 2024; originally announced March 2024.

    Comments: 9 pages, 5 figures, conference

  25. arXiv:2403.06363  [pdf, other]

    cs.CV

    Say Anything with Any Style

    Authors: Shuai Tan, Bin Ji, Yu Ding, Ye Pan

    Abstract: Generating stylized talking head with diverse head motions is crucial for achieving natural-looking videos but still remains challenging. Previous works either adopt a regressive method to capture the speaking style, resulting in a coarse style that is averaged across all training data, or employ a universal network to synthesize videos with different styles which causes suboptimal performance. To…

    Submitted 12 March, 2024; v1 submitted 10 March, 2024; originally announced March 2024.

    Comments: 9 pages, 5 figures, conference

  26. arXiv:2403.04133  [pdf, other]

    cs.CV cs.RO

    Towards learning-based planning: The nuPlan benchmark for real-world autonomous driving

    Authors: Napat Karnchanachari, Dimitris Geromichalos, Kok Seang Tan, Nanxiang Li, Christopher Eriksen, Shakiba Yaghoubi, Noushin Mehdipour, Gianmarco Bernasconi, Whye Kit Fong, Yiluan Guo, Holger Caesar

    Abstract: Machine Learning (ML) has replaced traditional handcrafted methods for perception and prediction in autonomous vehicles. Yet for the equally important planning task, the adoption of ML-based techniques is slow. We present nuPlan, the world's first real-world autonomous driving dataset, and benchmark. The benchmark is designed to test the ability of ML-based planners to handle diverse driving situa…

    Submitted 6 March, 2024; originally announced March 2024.

    Comments: ICRA 2024 camera ready incl. supplementary material

  27. arXiv:2403.01229  [pdf, other]

    cs.CV cs.AI cs.LG eess.SP

    REWIND Dataset: Privacy-preserving Speaking Status Segmentation from Multimodal Body Movement Signals in the Wild

    Authors: Jose Vargas Quiros, Chirag Raman, Stephanie Tan, Ekin Gedik, Laura Cabrera-Quiros, Hayley Hung

    Abstract: Recognizing speaking in humans is a central task towards understanding social interactions. Ideally, speaking would be detected from individual voice recordings, as done previously for meeting scenarios. However, individual voice recordings are hard to obtain in the wild, especially in crowded mingling scenarios due to cost, logistics, and privacy concerns. As an alternative, machine learning mode…

    Submitted 2 March, 2024; originally announced March 2024.

  28. arXiv:2402.18600  [pdf]

    eess.IV cs.AI q-bio.TO

    Artificial Intelligence and Diabetes Mellitus: An Inside Look Through the Retina

    Authors: Yasin Sadeghi Bazargani, Majid Mirzaei, Navid Sobhi, Mirsaeed Abdollahi, Ali Jafarizadeh, Siamak Pedrammehr, Roohallah Alizadehsani, Ru San Tan, Sheikh Mohammed Shariful Islam, U. Rajendra Acharya

    Abstract: Diabetes mellitus (DM) predisposes patients to vascular complications. Retinal images and vasculature reflect the body's micro- and macrovascular health. They can be used to diagnose DM complications, including diabetic retinopathy (DR), neuropathy, nephropathy, and atherosclerotic cardiovascular disease, as well as forecast the risk of cardiovascular events. Artificial intelligence (AI)-enabled s…

    Submitted 27 February, 2024; originally announced February 2024.

    Comments: 44 Pages, 6 figures, 1 table, 166 references

    ACM Class: J.3.2; J.3.3

  29. arXiv:2402.18592  [pdf, other]

    cs.AR cs.PF

    A$^3$PIM: An Automated, Analytic and Accurate Processing-in-Memory Offloader

    Authors: Qingcai Jiang, Shaojie Tan, Junshi Chen, Hong An

    Abstract: The performance gap between memory and processor has grown rapidly. Consequently, the energy and wall-clock time costs associated with moving data between the CPU and main memory predominate the overall computational cost. The Processing-in-Memory (PIM) paradigm emerges as a promising architecture that mitigates the need for extensive data movements by strategically positioning computing units pro…

    Submitted 22 February, 2024; originally announced February 2024.

    Comments: 6 pages, 4 figures, accepted for presentation at Design, Automation and Test in Europe Conference | The European Event for Electronic System Design & Test (DATE 2024), conference to be held in March 2024

  30. arXiv:2402.17509  [pdf, other]

    cs.CL

    Extreme Miscalibration and the Illusion of Adversarial Robustness

    Authors: Vyas Raina, Samson Tan, Volkan Cevher, Aditya Rawal, Sheng Zha, George Karypis

    Abstract: Deep learning-based Natural Language Processing (NLP) models are vulnerable to adversarial attacks, where small perturbations can cause a model to misclassify. Adversarial Training (AT) is often used to increase model robustness. However, we have discovered an intriguing phenomenon: deliberately or accidentally miscalibrating models masks gradients in a way that interferes with adversarial attack…

    Submitted 30 May, 2024; v1 submitted 27 February, 2024; originally announced February 2024.

  31. arXiv:2402.14366  [pdf, other]

    cs.SE

    Understanding and Detecting Annotation-Induced Faults of Static Analyzers

    Authors: Huaien Zhang, Yu Pei, Shuyun Liang, Shin Hwei Tan

    Abstract: Static analyzers can reason about the properties and behaviors of programs and detect various issues without executing them. Hence, they should extract the necessary information to understand the analyzed program well. Annotation has been a widely used feature for different purposes in Java since the introduction of Java 5. Annotations can change program structures and convey semantics information…

    Submitted 22 February, 2024; originally announced February 2024.

    Comments: 23 pages, 16 figures

  32. arXiv:2402.10551  [pdf, other]

    cs.LG q-bio.QM

    Personalised Drug Identifier for Cancer Treatment with Transformers using Auxiliary Information

    Authors: Aishwarya Jayagopal, Hansheng Xue, Ziyang He, Robert J. Walsh, Krishna Kumar Hariprasannan, David Shao Peng Tan, Tuan Zea Tan, Jason J. Pitt, Anand D. Jeyasekharan, Vaibhav Rajan

    Abstract: Cancer remains a global challenge due to its growing clinical and economic burden. Its uniquely personal manifestation, which makes treatment difficult, has fuelled the quest for personalized treatment strategies. Thus, genomic profiling is increasingly becoming part of clinical diagnostic panels. Effective use of such panels requires accurate drug response prediction (DRP) models, which are chall…

    Submitted 16 February, 2024; originally announced February 2024.

  33. arXiv:2402.02478  [pdf, other]

    cs.LG cs.AI

    Why are hyperbolic neural networks effective? A study on hierarchical representation capability

    Authors: Shicheng Tan, Huanjing Zhao, Shu Zhao, Yanping Zhang

    Abstract: Hyperbolic Neural Networks (HNNs), operating in hyperbolic space, have been widely applied in recent years, motivated by the existence of an optimal embedding in hyperbolic space that can preserve data hierarchical relationships (termed Hierarchical Representation Capability, HRC) more accurately than Euclidean space. However, there is no evidence to suggest that HNNs can achieve this theoretical…

    Submitted 4 February, 2024; originally announced February 2024.

  34. arXiv:2401.15234  [pdf, other]

    cs.SE

    Moving beyond Deletions: Program Simplification via Diverse Program Transformations

    Authors: Haibo Wang, Zezhong Xing, Zheng Wang, Chengnian Sun, Shin Hwei Tan

    Abstract: To reduce the complexity of software, Developers manually simplify program (known as developer-induced program simplification in this paper) to reduce its code size yet preserving its functionality but manual simplification is time-consuming and error-prone. To reduce manual effort, rule-based approaches (e.g., refactoring) and deletion-based approaches (e.g., delta debugging) can be potentially a…

    Submitted 26 January, 2024; originally announced January 2024.

  35. arXiv:2401.13587  [pdf, other]

    cs.IT eess.SP

    Deep Learning Based Adaptive Joint mmWave Beam Alignment

    Authors: Daniel Tandler, Marc Gauger, Ahmet Serdar Tan, Sebastian Dörner, Stephan ten Brink

    Abstract: The challenging propagation environment, combined with the hardware limitations of mmWave systems, gives rise to the need for accurate initial access beam alignment strategies with low latency and high achievable beamforming gain. Much of the recent work in this area either focuses on one-sided beam alignment, or, joint beam alignment methods where both sides of the link perform a sequence of fixe…

    Submitted 24 January, 2024; originally announced January 2024.

  36. arXiv:2401.12413  [pdf, other]

    cs.CL cs.LG

    How Far Can 100 Samples Go? Unlocking Overall Zero-Shot Multilingual Translation via Tiny Multi-Parallel Data

    Authors: Di Wu, Shaomu Tan, Yan Meng, David Stap, Christof Monz

    Abstract: Zero-shot translation aims to translate between language pairs not seen during training in Multilingual Machine Translation (MMT) and is largely considered an open problem. A common, albeit resource-consuming, solution is to add as many related translation directions as possible to the training corpus. In this paper, we show that for an English-centric model, surprisingly large zero-shot improveme…

    Submitted 26 February, 2024; v1 submitted 22 January, 2024; originally announced January 2024.

    Comments: 15 pages, 5 figures

  37. arXiv:2312.16828  [pdf, other]

    cs.IR

    GUITAR: Gradient Pruning toward Fast Neural Ranking

    Authors: Weijie Zhao, Shulong Tan, Ping Li

    Abstract: With the continuous popularity of deep learning and representation learning, fast vector search becomes a vital task in various ranking/retrieval based applications, say recommendation, ads ranking and question answering. Neural network based ranking is widely adopted due to its powerful capacity in modeling complex relationships, such as between users and items, questions and answers. However, it…

    Submitted 28 December, 2023; originally announced December 2023.

  38. arXiv:2312.13382  [pdf, ps, other]

    cs.CL cs.AI cs.PL

    DSPy Assertions: Computational Constraints for Self-Refining Language Model Pipelines

    Authors: Arnav Singhvi, Manish Shetty, Shangyin Tan, Christopher Potts, Koushik Sen, Matei Zaharia, Omar Khattab

    Abstract: Chaining language model (LM) calls as composable modules is fueling a new way of programming, but ensuring LMs adhere to important constraints requires heuristic "prompt engineering". We introduce LM Assertions, a programming construct for expressing computational constraints that LMs should satisfy. We integrate our constructs into the recent DSPy programming model for LMs, and present new strate…

    Submitted 2 February, 2024; v1 submitted 20 December, 2023; originally announced December 2023.

    Comments: Arnav*, Manish*, Shangyin* contributed equally to this work

  39. LPR: Large Language Models-Aided Program Reduction

    Authors: Mengxiao Zhang, Yongqiang Tian, Zhenyang Xu, Yiwen Dong, Shin Hwei Tan, Chengnian Sun

    Abstract: Program reduction is a prevalent technique to facilitate compilers' debugging by automatically minimizing bug-triggering programs. Existing program reduction techniques are either generic across languages (e.g., Perses and Vulcan) or specifically customized for one certain language by employing language-specific features, like C-Reduce. However, striking the balance between generality across multi…

    Submitted 11 May, 2024; v1 submitted 20 December, 2023; originally announced December 2023.

    Comments: Accepted by ISSTA'24. This is the preprint version

  40. arXiv:2312.11391  [pdf, other]

    cs.AI cs.GT cs.LG

    FedCompetitors: Harmonious Collaboration in Federated Learning with Competing Participants

    Authors: Shanli Tan, Hao Cheng, Xiaohu Wu, Han Yu, Tiantian He, Yew-Soon Ong, Chongjun Wang, Xiaofeng Tao

    Abstract: Federated learning (FL) provides a privacy-preserving approach for collaborative training of machine learning models. Given the potential data heterogeneity, it is crucial to select appropriate collaborators for each FL participant (FL-PT) based on data complementarity. Recent studies have addressed this challenge. Similarly, it is imperative to consider the inter-individual relationships among FL…

    Submitted 18 December, 2023; originally announced December 2023.

    Comments: Accepted to AAAI-2024

  41. Exploring UMAP in hybrid models of entropy-based and representativeness sampling for active learning in biomedical segmentation

    Authors: H. S. Tan, Kuancheng Wang, Rafe Mcbeth

    Abstract: In this work, we study various hybrid models of entropy-based and representativeness sampling techniques in the context of active learning in medical segmentation, in particular examining the role of UMAP (Uniform Manifold Approximation and Projection) as a technique for capturing representativeness. Although UMAP has been shown viable as a general purpose dimension reduction method in diverse are…

    Submitted 27 May, 2024; v1 submitted 16 December, 2023; originally announced December 2023.

    Comments: 25 pages, 6 figures

    Journal ref: Computers in Biology and Medicine, vol. 176, June 2024, 108605

  42. arXiv:2312.08764  [pdf, other]

    cs.CV

    CattleEyeView: A Multi-task Top-down View Cattle Dataset for Smarter Precision Livestock Farming

    Authors: Kian Eng Ong, Sivaji Retta, Ramarajulu Srinivasan, Shawn Tan, Jun Liu

    Abstract: Cattle farming is one of the important and profitable agricultural industries. Employing intelligent automated precision livestock farming systems that can count animals, track the animals and their poses will raise productivity and significantly reduce the heavy burden on its already limited labor pool. To achieve such intelligent systems, a large cattle video dataset is essential in developing a…

    Submitted 14 December, 2023; originally announced December 2023.

    Comments: Published at VCIP 2023. Dataset and code available at https://github.com/AnimalEyeQ/CattleEyeView

  43. arXiv:2312.08317  [pdf, other]

    cs.CR cs.AI

    Prompt Engineering-assisted Malware Dynamic Analysis Using GPT-4

    Authors: Pei Yan, Shunquan Tan, Miaohui Wang, Jiwu Huang

    Abstract: Dynamic analysis methods effectively identify shelled, wrapped, or obfuscated malware, thereby preventing them from invading computers. As a significant representation of dynamic malware behavior, the API (Application Programming Interface) sequence, comprised of consecutive API calls, has progressively become the dominant feature of dynamic analysis methods. Though there have been numerous deep l…

    Submitted 13 December, 2023; originally announced December 2023.

  44. arXiv:2312.05778  [pdf, other]

    cs.SE

    Guiding ChatGPT to Fix Web UI Tests via Explanation-Consistency Checking

    Authors: Zhuolin Xu, Qiushi Li, Shin Hwei Tan

    Abstract: The rapid evolution of Web UI incurs time and effort in maintaining UI tests. Existing techniques in Web UI test repair focus on finding the target elements on the new web page that match the old ones so that the corresponding broken statements can be repaired. We present the first study that investigates the feasibility of using prior Web UI repair techniques for initial local matching and then u…

    Submitted 26 January, 2024; v1 submitted 10 December, 2023; originally announced December 2023.

  45. arXiv:2312.04712  [pdf, other]

    cs.LG

    Error Discovery by Clustering Influence Embeddings

    Authors: Fulton Wang, Julius Adebayo, Sarah Tan, Diego Garcia-Olano, Narine Kokhlikyan

    Abstract: We present a method for identifying groups of test examples -- slices -- on which a model under-performs, a task now known as slice discovery. We formalize coherence -- a requirement that erroneous predictions, within a slice, should be wrong for the same reason -- as a key property that any slice discovery method should satisfy. We then use influence functions to derive a new slice discovery meth…

    Submitted 7 December, 2023; originally announced December 2023.

    Comments: NeurIPS 2023 conference paper

  46. arXiv:2311.06572  [pdf, other]

    eess.IV cs.CV

    Swin UNETR++: Advancing Transformer-Based Dense Dose Prediction Towards Fully Automated Radiation Oncology Treatments

    Authors: Kuancheng Wang, Hai Siong Tan, Rafe Mcbeth

    Abstract: The field of Radiation Oncology is uniquely positioned to benefit from the use of artificial intelligence to fully automate the creation of radiation treatment plans for cancer therapy. This time-consuming and specialized task combines patient imaging with organ and tumor segmentation to generate a 3D radiation dose distribution to meet clinical treatment goals, similar to voxel-level dense predic…

    Submitted 17 March, 2024; v1 submitted 11 November, 2023; originally announced November 2023.

    Comments: Extended Abstract presented at Machine Learning for Health (ML4H) symposium 2023, December 10th, 2023, New Orleans, United States, 16 pages

  47. arXiv:2311.06552  [pdf, other]

    eess.IV cs.CV cs.LG

    Stain Consistency Learning: Handling Stain Variation for Automatic Digital Pathology Segmentation

    Authors: Michael Yeung, Todd Watts, Sean YW Tan, Pedro F. Ferreira, Andrew D. Scott, Sonia Nielles-Vallespin, Guang Yang

    Abstract: Stain variation is a unique challenge associated with automated analysis of digital pathology. Numerous methods have been developed to improve the robustness of machine learning methods to stain variation, but comparative studies have demonstrated limited benefits to performance. Moreover, methods to handle stain variation were largely developed for H&E stained data, with evaluation generally limi…

    Submitted 11 November, 2023; originally announced November 2023.

  48. arXiv:2310.10385  [pdf, other]

    cs.CL cs.LG

    Towards a Better Understanding of Variations in Zero-Shot Neural Machine Translation Performance

    Authors: Shaomu Tan, Christof Monz

    Abstract: Multilingual Neural Machine Translation (MNMT) facilitates knowledge sharing but often suffers from poor zero-shot (ZS) translation qualities. While prior work has explored the causes of overall low ZS performance, our work introduces a fresh perspective: the presence of high variations in ZS performance. This suggests that MNMT does not uniformly exhibit poor ZS capability; instead, certain trans…

    Submitted 31 October, 2023; v1 submitted 16 October, 2023; originally announced October 2023.

    Comments: This paper is accepted by the EMNLP 2023 Main Conference

  49. arXiv:2310.10036  [pdf, other]

    cs.CV cs.MM

    Evading Detection Actively: Toward Anti-Forensics against Forgery Localization

    Authors: Long Zhuo, Shenghai Luo, Shunquan Tan, Han Chen, Bin Li, Jiwu Huang

    Abstract: Anti-forensics seeks to eliminate or conceal traces of tampering artifacts. Typically, anti-forensic methods are designed to deceive binary detectors and persuade them to misjudge the authenticity of an image. However, to the best of our knowledge, no attempts have been made to deceive forgery detectors at the pixel level and mis-locate forged regions. Traditional adversarial attack methods cannot…

    Submitted 15 October, 2023; originally announced October 2023.

  50. arXiv:2310.09946  [pdf, other]

    cs.CL cs.LG

    UvA-MT's Participation in the WMT23 General Translation Shared Task

    Authors: Di Wu, Shaomu Tan, David Stap, Ali Araabi, Christof Monz

    Abstract: This paper describes the UvA-MT's submission to the WMT 2023 shared task on general machine translation. We participate in the constrained track in two directions: English <-> Hebrew. In this competition, we show that by using one model to handle bidirectional tasks, as a minimal setting of Multilingual Machine Translation (MMT), it is possible to achieve comparable results with that of traditiona…

    Submitted 15 October, 2023; originally announced October 2023.

    Comments: This paper has been accepted by the WMT2023 Conference