Skip to main content

Showing 1–50 of 523 results for author: Tan, Y

Searching in archive cs. Search in all archives.
.
  1. arXiv:2406.19958  [pdf, other

    stat.ML cs.LG math.ST

    The Computational Curse of Big Data for Bayesian Additive Regression Trees: A Hitting Time Analysis

    Authors: Yan Shuo Tan, Omer Ronen, Theo Saarinen, Bin Yu

    Abstract: Bayesian Additive Regression Trees (BART) is a popular Bayesian non-parametric regression model that is commonly used in causal inference and beyond. Its strong predictive performance is supported by theoretical guarantees that its posterior distribution concentrates around the true regression function at optimal rates under various data generative settings and for appropriate prior choices. In th… ▽ More

    Submitted 28 June, 2024; originally announced June 2024.

    MSC Class: 62G08; 65C40

  2. arXiv:2406.19755  [pdf, other

    q-bio.QM cs.AI

    Protein Representation Learning with Sequence Information Embedding: Does it Always Lead to a Better Performance?

    Authors: Yang Tan, Lirong Zheng, Bozitao Zhong, Liang Hong, Bingxin Zhou

    Abstract: Deep learning has become a crucial tool in studying proteins. While the significance of modeling protein structure has been discussed extensively in the literature, amino acid types are typically included in the input as a default operation for many inference tasks. This study demonstrates with structure alignment task that embedding amino acid types in some cases may not help a deep learning mode… ▽ More

    Submitted 28 June, 2024; originally announced June 2024.

    Comments: 8 pages, 4 figures

  3. arXiv:2406.16671  [pdf, other

    cs.RO

    STAR: Swarm Technology for Aerial Robotics Research

    Authors: Jimmy Chiun, Yan Rui Tan, Yuhong Cao, John Tan, Guillaume Sartoretti

    Abstract: In recent years, the field of aerial robotics has witnessed significant progress, finding applications in diverse domains, including post-disaster search and rescue operations. Despite these strides, the prohibitive acquisition costs associated with deploying physical multi-UAV systems have posed challenges, impeding their widespread utilization in research endeavors. To overcome these challenges,… ▽ More

    Submitted 24 June, 2024; originally announced June 2024.

  4. arXiv:2406.13434  [pdf, other

    cs.RO

    Tactile Aware Dynamic Obstacle Avoidance in Crowded Environment with Deep Reinforcement Learning

    Authors: Yung Chuen Ng, Qi Wen, Lim, Chun Ye Tan, Zhen Hao Gan, Meng Yee, Chuah

    Abstract: Mobile robots operating in crowded environments require the ability to navigate among humans and surrounding obstacles efficiently while adhering to safety standards and socially compliant mannerisms. This scale of the robot navigation problem may be classified as both a local path planning and trajectory optimization problem. This work presents an array of force sensors that act as a tactile laye… ▽ More

    Submitted 19 June, 2024; originally announced June 2024.

  5. arXiv:2406.12835  [pdf, other

    cs.LG cs.AI cs.IR cs.SI

    Influence Maximization via Graph Neural Bandits

    Authors: Yuting Feng, Vincent Y. F. Tan, Bogdan Cautis

    Abstract: We consider a ubiquitous scenario in the study of Influence Maximization (IM), in which there is limited knowledge about the topology of the diffusion network. We set the IM problem in a multi-round diffusion campaign, aiming to maximize the number of distinct users that are influenced. Leveraging the capability of bandit algorithms to effectively balance the objectives of exploration and exploita… ▽ More

    Submitted 18 June, 2024; originally announced June 2024.

    Comments: To appear at the 2024 ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD)

  6. arXiv:2406.12241  [pdf, other

    cs.LG cs.AI

    More Efficient Randomized Exploration for Reinforcement Learning via Approximate Sampling

    Authors: Haque Ishfaq, Yixin Tan, Yu Yang, Qingfeng Lan, Jianfeng Lu, A. Rupam Mahmood, Doina Precup, Pan Xu

    Abstract: Thompson sampling (TS) is one of the most popular exploration techniques in reinforcement learning (RL). However, most TS algorithms with theoretical guarantees are difficult to implement and not generalizable to Deep RL. While the emerging approximate sampling-based exploration schemes are promising, most existing algorithms are specific to linear Markov Decision Processes (MDP) with suboptimal r… ▽ More

    Submitted 17 June, 2024; originally announced June 2024.

    Comments: First two authors contributed equally. Accepted to the Reinforcement Learning Conference (RLC) 2024

  7. arXiv:2406.12205  [pdf, other

    cs.LG cs.AI cs.IT math.ST stat.ML

    Order-Optimal Instance-Dependent Bounds for Offline Reinforcement Learning with Preference Feedback

    Authors: Zhirui Chen, Vincent Y. F. Tan

    Abstract: We consider offline reinforcement learning (RL) with preference feedback in which the implicit reward is a linear function of an unknown parameter. Given an offline dataset, our objective consists in ascertaining the optimal action for each state, with the ultimate goal of minimizing the {\em simple regret}. We propose an algorithm, \underline{RL} with \underline{L}ocally \underline{O}ptimal \unde… ▽ More

    Submitted 17 June, 2024; originally announced June 2024.

    Comments: Accepted to Models of Human Feedback for AI Alignment Workshop, ICML 2024

  8. arXiv:2406.11414  [pdf, other

    cs.LO cs.AI

    Formally Certified Approximate Model Counting

    Authors: Yong Kiam Tan, Jiong Yang, Mate Soos, Magnus O. Myreen, Kuldeep S. Meel

    Abstract: Approximate model counting is the task of approximating the number of solutions to an input Boolean formula. The state-of-the-art approximate model counter for formulas in conjunctive normal form (CNF), ApproxMC, provides a scalable means of obtaining model counts with probably approximately correct (PAC)-style guarantees. Nevertheless, the validity of ApproxMC's approximation relies on a careful… ▽ More

    Submitted 18 June, 2024; v1 submitted 17 June, 2024; originally announced June 2024.

    Comments: The extended version, including the appendix, of the paper to be published in CAV24. The associated artifact is available at https://doi.org/10.5281/zenodo.10948839

  9. arXiv:2406.10100  [pdf, other

    cs.CV cs.AI

    SkySenseGPT: A Fine-Grained Instruction Tuning Dataset and Model for Remote Sensing Vision-Language Understanding

    Authors: Junwei Luo, Zhen Pang, Yongjun Zhang, Tingzhu Wang, Linlin Wang, Bo Dang, Jiangwei Lao, Jian Wang, **gdong Chen, Yihua Tan, Yansheng Li

    Abstract: Remote Sensing Large Multi-Modal Models (RSLMMs) are develo** rapidly and showcase significant capabilities in remote sensing imagery (RSI) comprehension. However, due to the limitations of existing datasets, RSLMMs have shortcomings in understanding the rich semantic relations among objects in complex remote sensing scenes. To unlock RSLMMs' complex comprehension ability, we propose a large-sca… ▽ More

    Submitted 14 June, 2024; originally announced June 2024.

    Comments: 30 pages, 5 figures, 19 tables, dataset and code see https://github.com/Luo-Z13/SkySenseGPT

  10. arXiv:2406.09121  [pdf, other

    cs.CV

    MMRel: A Relation Understanding Dataset and Benchmark in the MLLM Era

    Authors: Jiahao Nie, Gongjie Zhang, Wenbin An, Yap-Peng Tan, Alex C. Kot, Shijian Lu

    Abstract: Despite the recent advancements in Multi-modal Large Language Models (MLLMs), understanding inter-object relations, i.e., interactions or associations between distinct objects, remains a major challenge for such models. This issue significantly hinders their advanced reasoning capabilities and is primarily due to the lack of large-scale, high-quality, and diverse multi-modal data essential for tra… ▽ More

    Submitted 13 June, 2024; originally announced June 2024.

  11. arXiv:2406.08443  [pdf, other

    cs.CV cs.LG

    Transformation-Dependent Adversarial Attacks

    Authors: Yaoteng Tan, Zikui Cai, M. Salman Asif

    Abstract: We introduce transformation-dependent adversarial attacks, a new class of threats where a single additive perturbation can trigger diverse, controllable mis-predictions by systematically transforming the input (e.g., scaling, blurring, compression). Unlike traditional attacks with static effects, our perturbations embed metamorphic properties to enable different adversarial attacks as a function o… ▽ More

    Submitted 12 June, 2024; originally announced June 2024.

  12. DHR+S: Distributed Hybrid Rendering with Realistic Real-time Shadows for Interactive Thin Client Metaverse and Game Applications

    Authors: Yu Wei Tan, Siang Ern Low, Jonas Chow, Javon Teo, Anand Bhojan

    Abstract: Distributed hybrid rendering (DHR) is a real-time rendering approach that incorporates cloud-based ray tracing with locally rasterized graphics for interactive thin client metaverse and game applications. With cloud assistance, DHR can generate high-fidelity ray-traced graphics contents remotely and deliver them to thin clients with low graphics capability, including standalone extended reality de… ▽ More

    Submitted 11 June, 2024; originally announced June 2024.

    MSC Class: 68U05 ACM Class: I.3

  13. arXiv:2406.06925  [pdf, other

    cs.LG cs.IR

    Non-autoregressive Personalized Bundle Generation

    Authors: Wenchuan Yang, Cheng Yang, Jichao Li, Yue** Tan, Xin Lu, Chuan Shi

    Abstract: The personalized bundle generation problem, which aims to create a preferred bundle for user from numerous candidate items, receives increasing attention in recommendation. However, existing works ignore the order-invariant nature of the bundle and adopt sequential modeling methods as the solution, which might introduce inductive bias and cause a large latency in prediction. To address this proble… ▽ More

    Submitted 10 June, 2024; originally announced June 2024.

    Comments: Submitted to Information Processing & Management

  14. arXiv:2406.04881  [pdf, other

    cs.IT eess.SP

    MIMO Capacity Analysis and Channel Estimation for Electromagnetic Information Theory

    Authors: Jieao Zhu, Vincent Y. F. Tan, Linglong Dai

    Abstract: Electromagnetic information theory (EIT) is an interdisciplinary subject that serves to integrate deterministic electromagnetic theory with stochastic Shannon's information theory. Existing EIT analysis operates in the continuous space domain, which is not aligned with the practical algorithms working in the discrete space domain. This mismatch leads to a significant difficulty in application of E… ▽ More

    Submitted 7 June, 2024; originally announced June 2024.

    Comments: Submitted to the IEEE TWC. In this paper, we established the discrete-continuous correspondence for electromagnetic information theory (EIT), thus enabling analytical tools in the continuous space domain to be applied to discrete space MIMO architectures. Simulation codes will be provided at http://oa.ee.tsinghua.edu.cn/dailinglong/publications/publications.html

  15. arXiv:2406.00812  [pdf, other

    stat.ML cs.LG

    Covariance-Adaptive Sequential Black-box Optimization for Diffusion Targeted Generation

    Authors: Yueming Lyu, Kim Yong Tan, Yew Soon Ong, Ivor W. Tsang

    Abstract: Diffusion models have demonstrated great potential in generating high-quality content for images, natural language, protein domains, etc. However, how to perform user-preferred targeted generation via diffusion models with only black-box target scores of users remains challenging. To address this issue, we first formulate the fine-tuning of the targeted reserve-time stochastic differential equatio… ▽ More

    Submitted 8 June, 2024; v1 submitted 2 June, 2024; originally announced June 2024.

  16. arXiv:2405.20892  [pdf, other

    cs.CV cs.AI

    MALT: Multi-scale Action Learning Transformer for Online Action Detection

    Authors: Zhipeng Yang, Ruoyu Wang, Yang Tan, Abstract: Online action detection (OAD) aims to identify ongoing actions from streaming video in real-time, without access to future frames. Since these actions manifest at varying scales of granularity, ranging from coarse to fine, projecting an entire set of action frames to a single latent encoding may result in a lack of local information, necessitating the acquisition of action features across multiple… ▽ More

    Submitted 31 May, 2024; originally announced May 2024.

    Comments: 8 pages, 3 figures

  17. arXiv:2405.20032  [pdf, other

    cs.NI cs.AI cs.MM

    Promptus: Can Prompts Streaming Replace Video Streaming with Stable Diffusion

    Authors: Jiangkai Wu, Liming Liu, Yunpeng Tan, Junlin Hao, Xinggong Zhang

    Abstract: With the exponential growth of video traffic, traditional video streaming systems are approaching their limits in compression efficiency and communication capacity. To further reduce bitrate while maintaining quality, we propose Promptus, a disruptive novel system that streaming prompts instead of video content with Stable Diffusion, which converts video frames into a series of "prompts" for deliv… ▽ More

    Submitted 30 May, 2024; originally announced May 2024.

  18. arXiv:2405.17773  [pdf, other

    cs.CV

    Towards a Generalist and Blind RGB-X Tracker

    Authors: Yuedong Tan, Zongwei Wu, Yuqian Fu, Zhuyun Zhou, Guolei Sun, Chao Ma, Danda Pani Paudel, Luc Van Gool, Radu Timofte

    Abstract: With the emergence of a single large model capable of successfully solving a multitude of tasks in NLP, there has been growing research interest in achieving similar goals in computer vision. On the one hand, most of these generic models, referred to as generalist vision models, aim at producing unified outputs serving different tasks. On the other hand, some existing models aim to combine differe… ▽ More

    Submitted 27 May, 2024; originally announced May 2024.

  19. arXiv:2405.16114  [pdf, other

    cs.AI cs.CV cs.LG

    Multi-scale Quaternion CNN and BiGRU with Cross Self-attention Feature Fusion for Fault Diagnosis of Bearing

    Authors: Huanbai Liu, Fanlong Zhang, Yin Tan, Lian Huang, Yan Li, Guoheng Huang, Shenghong Luo, An Zeng

    Abstract: In recent years, deep learning has led to significant advances in bearing fault diagnosis (FD). Most techniques aim to achieve greater accuracy. However, they are sensitive to noise and lack robustness, resulting in insufficient domain adaptation and anti-noise ability. The comparison of studies reveals that giving equal attention to all features does not differentiate their significance. In this… ▽ More

    Submitted 25 May, 2024; originally announced May 2024.

  20. arXiv:2405.15256  [pdf, other

    cs.LG

    FTMixer: Frequency and Time Domain Representations Fusion for Time Series Modeling

    Authors: Zhengnan Li, Yunxiao Qin, Xilong Cheng, Yuting Tan

    Abstract: Time series data can be represented in both the time and frequency domains, with the time domain emphasizing local dependencies and the frequency domain highlighting global dependencies. To harness the strengths of both domains in capturing local and global dependencies, we propose the Frequency and Time Domain Mixer (FTMixer). To exploit the global characteristics of the frequency domain, we intr… ▽ More

    Submitted 24 May, 2024; originally announced May 2024.

  21. arXiv:2405.15200  [pdf, other

    cs.LG cs.IT

    Indexed Minimum Empirical Divergence-Based Algorithms for Linear Bandits

    Authors: Jie Bian, Vincent Y. F. Tan

    Abstract: The Indexed Minimum Empirical Divergence (IMED) algorithm is a highly effective approach that offers a stronger theoretical guarantee of the asymptotic optimality compared to the Kullback--Leibler Upper Confidence Bound (KL-UCB) algorithm for the multi-armed bandit problem. Additionally, it has been observed to empirically outperform UCB-based algorithms and Thompson Sampling. Despite its effectiv… ▽ More

    Submitted 24 May, 2024; originally announced May 2024.

    Comments: Accepted to the Transactions on Machine Learning Research (TMLR)

  22. arXiv:2405.15032  [pdf, other

    cs.CL

    Aya 23: Open Weight Releases to Further Multilingual Progress

    Authors: Viraat Aryabumi, John Dang, Dwarak Talupuru, Saurabh Dash, David Cairuz, Hangyu Lin, Bharat Venkitesh, Madeline Smith, Jon Ander Campos, Yi Chern Tan, Kelly Marchisio, Max Bartolo, Sebastian Ruder, Acyr Locatelli, Julia Kreutzer, Nick Frosst, Aidan Gomez, Phil Blunsom, Marzieh Fadaee, Ahmet Üstün, Sara Hooker

    Abstract: This technical report introduces Aya 23, a family of multilingual language models. Aya 23 builds on the recent release of the Aya model (Üstün et al., 2024), focusing on pairing a highly performant pre-trained model with the recently released Aya collection (Singh et al., 2024). The result is a powerful multilingual large language model serving 23 languages, expanding state-of-art language modelin… ▽ More

    Submitted 31 May, 2024; v1 submitted 23 May, 2024; originally announced May 2024.

  23. arXiv:2405.13722  [pdf, other

    cs.CV

    InstaDrag: Lightning Fast and Accurate Drag-based Image Editing Emerging from Videos

    Authors: Yujun Shi, Jun Hao Liew, Hanshu Yan, Vincent Y. F. Tan, Jiashi Feng

    Abstract: Accuracy and speed are critical in image editing tasks. Pan et al. introduced a drag-based image editing framework that achieves pixel-level control using Generative Adversarial Networks (GANs). A flurry of subsequent studies enhanced this framework's generality by leveraging large-scale diffusion models. However, these methods often suffer from inordinately long processing times (exceeding 1 minu… ▽ More

    Submitted 22 May, 2024; originally announced May 2024.

    Comments: Project page: https://instadrag.github.io/

  24. arXiv:2405.13581  [pdf, other

    cs.CV cs.AI

    Safety Alignment for Vision Language Models

    Authors: Zhendong Liu, Yuanbi Nie, Yingshui Tan, Xiangyu Yue, Qiushi Cui, Chongjun Wang, Xiaoyong Zhu, Bo Zheng

    Abstract: Benefiting from the powerful capabilities of Large Language Models (LLMs), pre-trained visual encoder models connected to an LLMs can realize Vision Language Models (VLMs). However, existing research shows that the visual modality of VLMs is vulnerable, with attackers easily bypassing LLMs' safety alignment through visual modality features to launch attacks. To address this issue, we enhance the e… ▽ More

    Submitted 22 May, 2024; originally announced May 2024.

    Comments: 23 pages, 15 figures

  25. arXiv:2405.12213  [pdf, other

    cs.RO cs.LG

    Octo: An Open-Source Generalist Robot Policy

    Authors: Octo Model Team, Dibya Ghosh, Homer Walke, Karl Pertsch, Kevin Black, Oier Mees, Sudeep Dasari, Joey Hejna, Tobias Kreiman, Charles Xu, Jianlan Luo, You Liang Tan, Lawrence Yunliang Chen, Pannag Sanketi, Quan Vuong, Ted Xiao, Dorsa Sadigh, Chelsea Finn, Sergey Levine

    Abstract: Large policies pretrained on diverse robot datasets have the potential to transform robotic learning: instead of training new policies from scratch, such generalist robot policies may be finetuned with only a little in-domain data, yet generalize broadly. However, to be widely applicable across a range of robotic learning scenarios, environments, and tasks, such policies need to handle diverse sen… ▽ More

    Submitted 26 May, 2024; v1 submitted 20 May, 2024; originally announced May 2024.

    Comments: Project website: https://octo-models.github.io

  26. arXiv:2405.05616  [pdf, other

    cs.CL cs.AI

    G-SAP: Graph-based Structure-Aware Prompt Learning over Heterogeneous Knowledge for Commonsense Reasoning

    Authors: Ruiting Dai, Yuqiao Tan, Lisi Mo, Shuang Liang, Guohao Huo, Jiayi Luo, Yao Cheng

    Abstract: Commonsense question answering has demonstrated considerable potential across various applications like assistants and social robots. Although fully fine-tuned pre-trained Language Models(LM) have achieved remarkable performance in commonsense reasoning, their tendency to excessively prioritize textual information hampers the precise transfer of structural knowledge and undermines interpretability… ▽ More

    Submitted 9 May, 2024; originally announced May 2024.

  27. arXiv:2405.04795  [pdf, other

    cs.LG

    Variational Schrödinger Diffusion Models

    Authors: Wei Deng, Weijian Luo, Yixin Tan, Marin Biloš, Yu Chen, Yuriy Nevmyvaka, Ricky T. Q. Chen

    Abstract: Schrödinger bridge (SB) has emerged as the go-to method for optimizing transportation plans in diffusion models. However, SB requires estimating the intractable forward score functions, inevitably resulting in the costly implicit training loss based on simulated trajectories. To improve the scalability while preserving efficient transportation plans, we leverage variational inference to linearize… ▽ More

    Submitted 19 June, 2024; v1 submitted 8 May, 2024; originally announced May 2024.

    Comments: ICML 2024

  28. arXiv:2405.04434  [pdf, other

    cs.CL cs.AI

    DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model

    Authors: DeepSeek-AI, Aixin Liu, Bei Feng, Bin Wang, Bingxuan Wang, Bo Liu, Chenggang Zhao, Chengqi Dengr, Chong Ruan, Damai Dai, Daya Guo, Dejian Yang, Deli Chen, Dongjie Ji, Erhang Li, Fangyun Lin, Fuli Luo, Guangbo Hao, Guanting Chen, Guowei Li, H. Zhang, Hanwei Xu, Hao Yang, Haowei Zhang, Honghui Ding , et al. (132 additional authors not shown)

    Abstract: We present DeepSeek-V2, a strong Mixture-of-Experts (MoE) language model characterized by economical training and efficient inference. It comprises 236B total parameters, of which 21B are activated for each token, and supports a context length of 128K tokens. DeepSeek-V2 adopts innovative architectures including Multi-head Latent Attention (MLA) and DeepSeekMoE. MLA guarantees efficient inference… ▽ More

    Submitted 19 June, 2024; v1 submitted 7 May, 2024; originally announced May 2024.

  29. arXiv:2405.01460  [pdf, other

    cs.CR cs.AI cs.CV cs.LG

    Purify Unlearnable Examples via Rate-Constrained Variational Autoencoders

    Authors: Yi Yu, Yufei Wang, Song Xia, Wenhan Yang, Shijian Lu, Yap-Peng Tan, Alex C. Kot

    Abstract: Unlearnable examples (UEs) seek to maximize testing error by making subtle modifications to training examples that are correctly labeled. Defenses against these poisoning attacks can be categorized based on whether specific interventions are adopted during training. The first approach is training-time defense, such as adversarial training, which can mitigate poisoning effects but is computationall… ▽ More

    Submitted 6 May, 2024; v1 submitted 2 May, 2024; originally announced May 2024.

    Comments: Accepted by ICML 2024

  30. arXiv:2404.17316  [pdf, other

    cs.AI

    Certified MaxSAT Preprocessing

    Authors: Hannes Ihalainen, Andy Oertel, Yong Kiam Tan, Jeremias Berg, Matti Järvisalo, Jakob Nordström

    Abstract: Building on the progress in Boolean satisfiability (SAT) solving over the last decades, maximum satisfiability (MaxSAT) has become a viable approach for solving NP-hard optimization problems, but ensuring correctness of MaxSAT solvers has remained an important concern. For SAT, this is largely a solved problem thanks to the use of proof logging, meaning that solvers emit machine-verifiable proofs… ▽ More

    Submitted 26 April, 2024; originally announced April 2024.

  31. arXiv:2404.15294  [pdf

    eess.SP cs.LG

    Multimodal Physical Fitness Monitoring (PFM) Framework Based on TimeMAE-PFM in Wearable Scenarios

    Authors: Junjie Zhang, Zheming Zhang, Huachen Xiang, Yangquan Tan, Linnan Huo, Fengyi Wang

    Abstract: Physical function monitoring (PFM) plays a crucial role in healthcare especially for the elderly. Traditional assessment methods such as the Short Physical Performance Battery (SPPB) have failed to capture the full dynamic characteristics of physical function. Wearable sensors such as smart wristbands offer a promising solution to this issue. However, challenges exist, such as the computational co… ▽ More

    Submitted 25 March, 2024; originally announced April 2024.

    Comments: 5 pages, 6 figures

  32. arXiv:2404.14850  [pdf, other

    cs.CL cs.LG q-bio.BM

    Simple, Efficient and Scalable Structure-aware Adapter Boosts Protein Language Models

    Authors: Yang Tan, Mingchen Li, Bingxin Zhou, Bozitao Zhong, Lirong Zheng, Pan Tan, Ziyi Zhou, Huiqun Yu, Guisheng Fan, Liang Hong

    Abstract: Fine-tuning Pre-trained protein language models (PLMs) has emerged as a prominent strategy for enhancing downstream prediction tasks, often outperforming traditional supervised learning approaches. As a widely applied powerful technique in natural language processing, employing Parameter-Efficient Fine-Tuning techniques could potentially enhance the performance of PLMs. However, the direct transfe… ▽ More

    Submitted 23 April, 2024; originally announced April 2024.

    Comments: 30 pages, 4 figures, 8 tables

  33. arXiv:2404.14824  [pdf, other

    cs.SE

    Automated Commit Message Generation with Large Language Models: An Empirical Study and Beyond

    Authors: Pengyu Xue, Linhao Wu, Zhongxing Yu, Zhi **, Zhen Yang, Xinyi Li, Zhenyu Yang, Yue Tan

    Abstract: Commit Message Generation (CMG) approaches aim to automatically generate commit messages based on given code diffs, which facilitate collaboration among developers and play a critical role in Open-Source Software (OSS). Very recently, Large Language Models (LLMs) have demonstrated extensive applicability in diverse code-related task. But few studies systematically explored their effectiveness usin… ▽ More

    Submitted 23 April, 2024; originally announced April 2024.

  34. arXiv:2404.05445  [pdf, other

    stat.ME cs.LG stat.CO

    Unsupervised Training of Convex Regularizers using Maximum Likelihood Estimation

    Authors: Hong Ye Tan, Ziruo Cai, Marcelo Pereyra, Subhadip Mukherjee, Junqi Tang, Carola-Bibiane Schönlieb

    Abstract: Unsupervised learning is a training approach in the situation where ground truth data is unavailable, such as inverse imaging problems. We present an unsupervised Bayesian training approach to learning convex neural network regularizers using a fixed noisy dataset, based on a dual Markov chain estimation method. Compared to classical supervised adversarial regularization methods, where there is ac… ▽ More

    Submitted 8 April, 2024; originally announced April 2024.

    MSC Class: 62C12; 62F15; 65C40; 65J22

  35. arXiv:2404.03181  [pdf, other

    cs.CV

    MonoCD: Monocular 3D Object Detection with Complementary Depths

    Authors: Longfei Yan, Pei Yan, Shengzhou Xiong, Xuanyu Xiang, Yihua Tan

    Abstract: Monocular 3D object detection has attracted widespread attention due to its potential to accurately obtain object 3D localization from a single image at a low cost. Depth estimation is an essential but challenging subtask of monocular 3D object detection due to the ill-posedness of 2D to 3D map**. Many methods explore multiple local depth clues such as object heights and keypoints and then formu… ▽ More

    Submitted 3 April, 2024; originally announced April 2024.

    Comments: Accepted to CVPR 2024

  36. Adversarial Combinatorial Bandits with Switching Costs

    Authors: Yanyan Dong, Vincent Y. F. Tan

    Abstract: We study the problem of adversarial combinatorial bandit with a switching cost $λ$ for a switch of each selected arm in each round, considering both the bandit feedback and semi-bandit feedback settings. In the oblivious adversarial case with $K$ base arms and time horizon $T$, we derive lower bounds for the minimax regret and design algorithms to approach them. To prove these lower bounds, we des… ▽ More

    Submitted 2 April, 2024; originally announced April 2024.

    Comments: The work has been accepted in IEEE Transactions on Information Theory. https://ieeexplore.ieee.org/document/10487974

  37. arXiv:2403.19979  [pdf, other

    cs.CV cs.AI cs.LG

    Semantically-Shifted Incremental Adapter-Tuning is A Continual ViTransformer

    Authors: Yuwen Tan, Qinhao Zhou, Xiang Xiang, Ke Wang, Yuchuan Wu, Yongbin Li

    Abstract: Class-incremental learning (CIL) aims to enable models to continuously learn new classes while overcoming catastrophic forgetting. The introduction of pre-trained models has brought new tuning paradigms to CIL. In this paper, we revisit different parameter-efficient tuning (PET) methods within the context of continual learning. We observe that adapter tuning demonstrates superiority over prompt-ba… ▽ More

    Submitted 29 March, 2024; originally announced March 2024.

    Comments: To appear at CVPR 2024

  38. arXiv:2403.19591  [pdf, other

    cs.LG cs.AR cs.NE

    Genetic Quantization-Aware Approximation for Non-Linear Operations in Transformers

    Authors: **cheng Dong, Yonghao Tan, Dong Zhang, Tianwei Ni, Xuejiao Liu, Yu Liu, Peng Luo, Luhong Liang, Shih-Yang Liu, Xijie Huang, Huaiyu Zhu, Yun Pan, Fengwei An, Kwang-Ting Cheng

    Abstract: Non-linear functions are prevalent in Transformers and their lightweight variants, incurring substantial and frequently underestimated hardware costs. Previous state-of-the-art works optimize these operations by piece-wise linear approximation and store the parameters in look-up tables (LUT), but most of them require unfriendly high-precision arithmetics such as FP/INT 32 and lack consideration of… ▽ More

    Submitted 29 March, 2024; v1 submitted 28 March, 2024; originally announced March 2024.

    Comments: 61st ACM/IEEE Design Automation Conference (DAC) 2024

  39. arXiv:2403.16398  [pdf, other

    cs.LG cs.AI

    Rethinking the Representation in Federated Unsupervised Learning with Non-IID Data

    Authors: Xinting Liao, Weiming Liu, Chaochao Chen, Pengyang Zhou, Fengyuan Yu, Huabin Zhu, Binhui Yao, Tao Wang, Xiaolin Zheng, Yanchao Tan

    Abstract: Federated learning achieves effective performance in modeling decentralized data. In practice, client data are not well-labeled, which makes it potential for federated unsupervised learning (FUSL) with non-IID data. However, the performance of existing FUSL methods suffers from insufficient representations, i.e., (1) representation collapse entanglement among local and global models, and (2) incon… ▽ More

    Submitted 24 March, 2024; originally announced March 2024.

    Comments: CVPR 2024

  40. arXiv:2403.13714  [pdf, other

    cs.RO cs.CV

    DBA-Fusion: Tightly Integrating Deep Dense Visual Bundle Adjustment with Multiple Sensors for Large-Scale Localization and Map**

    Authors: Yuxuan Zhou, Xingxing Li, Shengyu Li, Xuanbin Wang, Shaoquan Feng, Yuxuan Tan

    Abstract: Visual simultaneous localization and map** (VSLAM) has broad applications, with state-of-the-art methods leveraging deep neural networks for better robustness and applicability. However, there is a lack of research in fusing these learning-based methods with multi-sensor information, which could be indispensable to push related applications to large-scale and complex scenarios. In this paper, we… ▽ More

    Submitted 20 March, 2024; originally announced March 2024.

  41. arXiv:2403.05530  [pdf, other

    cs.CL cs.AI

    Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context

    Authors: Gemini Team, Petko Georgiev, Ving Ian Lei, Ryan Burnell, Libin Bai, Anmol Gulati, Garrett Tanzer, Damien Vincent, Zhufeng Pan, Shibo Wang, Soroosh Mariooryad, Yifan Ding, Xinyang Geng, Fred Alcober, Roy Frostig, Mark Omernick, Lexi Walker, Cosmin Paduraru, Christina Sorokin, Andrea Tacchetti, Colin Gaffney, Samira Daruki, Olcan Sercinoglu, Zach Gleicher, Juliette Love , et al. (1092 additional authors not shown)

    Abstract: In this report, we introduce the Gemini 1.5 family of models, representing the next generation of highly compute-efficient multimodal models capable of recalling and reasoning over fine-grained information from millions of tokens of context, including multiple long documents and hours of video and audio. The family includes two new models: (1) an updated Gemini 1.5 Pro, which exceeds the February… ▽ More

    Submitted 14 June, 2024; v1 submitted 8 March, 2024; originally announced March 2024.

  42. arXiv:2403.04780  [pdf, other

    cs.CL cs.AI

    MuseGraph: Graph-oriented Instruction Tuning of Large Language Models for Generic Graph Mining

    Authors: Yanchao Tan, Hang Lv, Xinyi Huang, Jiawei Zhang, Shi** Wang, Carl Yang

    Abstract: Graphs with abundant attributes are essential in modeling interconnected entities and improving predictions in various real-world applications. Traditional Graph Neural Networks (GNNs), which are commonly used for modeling attributed graphs, need to be re-trained every time when applied to different graph tasks and datasets. Although the emergence of Large Language Models (LLMs) has introduced a n… ▽ More

    Submitted 13 March, 2024; v1 submitted 2 March, 2024; originally announced March 2024.

  43. arXiv:2403.01754  [pdf, other

    cs.CL

    Derivative-Free Optimization for Low-Rank Adaptation in Large Language Models

    Authors: Feihu **, Yin Liu, Ying Tan

    Abstract: Parameter-efficient tuning methods such as LoRA could achieve comparable performance to model tuning by tuning a small portion of the parameters. However, substantial computational resources are still required, as this process involves calculating gradients and performing back-propagation throughout the model. Much effort has recently been devoted to utilizing the derivative-free optimization meth… ▽ More

    Submitted 4 March, 2024; originally announced March 2024.

    Comments: 14 pages, 4 figures, 5 tables

  44. arXiv:2402.19421  [pdf, other

    cs.IR cs.AI econ.GN

    Crafting Knowledge: Exploring the Creative Mechanisms of Chat-Based Search Engines

    Authors: Lijia Ma, Xingchen Xu, Yong Tan

    Abstract: In the domain of digital information dissemination, search engines act as pivotal conduits linking information seekers with providers. The advent of chat-based search engines utilizing Large Language Models (LLMs) and Retrieval Augmented Generation (RAG), exemplified by Bing Chat, marks an evolutionary leap in the search ecosystem. They demonstrate metacognitive abilities in interpreting web infor… ▽ More

    Submitted 29 February, 2024; originally announced February 2024.

    Comments: 38 pages, 2 figures, 7 tables

    ACM Class: J.4

  45. arXiv:2402.19054  [pdf, other

    cs.CR cs.AI

    RobWE: Robust Watermark Embedding for Personalized Federated Learning Model Ownership Protection

    Authors: Yang Xu, Yunlin Tan, Cheng Zhang, Kai Chi, Peng Sun, Wenyuan Yang, Ju Ren, Hongbo Jiang, Yaoxue Zhang

    Abstract: Embedding watermarks into models has been widely used to protect model ownership in federated learning (FL). However, existing methods are inadequate for protecting the ownership of personalized models acquired by clients in personalized FL (PFL). This is due to the aggregation of the global model in PFL, resulting in conflicts over clients' private watermarks. Moreover, malicious clients may tamp… ▽ More

    Submitted 29 February, 2024; originally announced February 2024.

  46. arXiv:2402.18292  [pdf, other

    cs.CV cs.AI cs.LG

    FSL-Rectifier: Rectify Outliers in Few-Shot Learning via Test-Time Augmentation

    Authors: Yunwei Bai, Ying Kiat Tan, Tsuhan Chen

    Abstract: Few-shot-learning (FSL) commonly requires a model to identify images (queries) that belong to classes unseen during training, based on a few labelled samples of the new classes (support set) as reference. As the test classes are novel, FSL is challenging with high generalization error with respect to the novel classes, where outliers query or support image during inference exacerbate the error fur… ▽ More

    Submitted 16 May, 2024; v1 submitted 28 February, 2024; originally announced February 2024.

  47. arXiv:2402.15127  [pdf, other

    cs.LG cs.IT stat.ML

    Multi-Armed Bandits with Abstention

    Authors: Junwen Yang, Tianyuan **, Vincent Y. F. Tan

    Abstract: We introduce a novel extension of the canonical multi-armed bandit problem that incorporates an additional strategic element: abstention. In this enhanced framework, the agent is not only tasked with selecting an arm at each time step, but also has the option to abstain from accepting the stochastic instantaneous reward before observing it. When opting for abstention, the agent either suffers a fi… ▽ More

    Submitted 23 February, 2024; originally announced February 2024.

    Comments: Preprint

  48. arXiv:2402.09421  [pdf, other

    eess.SP cs.LG

    EEG Based Generative Depression Discriminator

    Authors: Ziming Mao, Hao wu, Yongxi Tan, Yuhe **

    Abstract: Depression is a very common but serious mood disorder.In this paper, We built a generative detection network(GDN) in accordance with three physiological laws. Our aim is that we expect the neural network to learn the relevant brain activity based on the EEG signal and, at the same time, to regenerate the target electrode signal based on the brain activity. We trained two generators, the first one… ▽ More

    Submitted 19 January, 2024; originally announced February 2024.

  49. arXiv:2402.06506  [pdf, other

    cs.CV cs.AI cs.LG

    Classifying point clouds at the facade-level using geometric features and deep learning networks

    Authors: Yue Tan, Olaf Wysocki, Ludwig Hoegner, Uwe Stilla

    Abstract: 3D building models with facade details are playing an important role in many applications now. Classifying point clouds at facade-level is key to create such digital replicas of the real world. However, few studies have focused on such detailed classification with deep neural networks. We propose a method fusing geometric features with deep learning networks for point cloud classification at facad… ▽ More

    Submitted 9 February, 2024; originally announced February 2024.

    Comments: Accepted to the Recent Advances in 3D Geoinformation Science, Proceedings of the 18th 3D GeoInfo Conference 2023

  50. arXiv:2402.05376  [pdf, other

    cs.CL

    Zero-Shot Chain-of-Thought Reasoning Guided by Evolutionary Algorithms in Large Language Models

    Authors: Feihu **, Yifan Liu, Ying Tan

    Abstract: Large Language Models (LLMs) have demonstrated remarkable performance across diverse tasks and exhibited impressive reasoning abilities by applying zero-shot Chain-of-Thought (CoT) prompting. However, due to the evolving nature of sentence prefixes during the pre-training phase, existing zero-shot CoT prompting methods that employ identical CoT prompting across all task instances may not be optima… ▽ More

    Submitted 7 February, 2024; originally announced February 2024.

    Comments: 17 pages, 5 figures, 16 tables