Showing 1–50 of 638 results for author: Chen, R

  1. arXiv:2407.02028  [pdf, other]

    cs.CL cs.AI cs.IR cs.LG

    Why does in-context learning fail sometimes? Evaluating in-context learning on open and closed questions

    Authors: Xiang Li, Haoran Tang, Siyu Chen, Ziwei Wang, Ryan Chen, Marcin Abram

    Abstract: We measure the performance of in-context learning as a function of task novelty and difficulty for open and closed questions. For that purpose, we created a novel benchmark consisting of hard scientific questions, each paired with a context of various relevancy. We show that counter-intuitively, a context that is more aligned with the topic does not always help more than a less relevant context. T…

    Submitted 2 July, 2024; originally announced July 2024.

    Comments: 8 pages plus references, 4 main figures, 6 pages of supplementary material

  2. arXiv:2407.00848  [pdf, other]

    cs.RO

    Ego-to-Exo: Interfacing Third Person Visuals from Egocentric Views in Real-time for Improved ROV Teleoperation

    Authors: Adnan Abdullah, Ruo Chen, Ioannis Rekleitis, Md Jahidul Islam

    Abstract: Underwater ROVs (Remotely Operated Vehicles) are unmanned submersible vehicles designed for exploring and operating in the depths of the ocean. Despite using high-end cameras, typical teleoperation engines based on first-person (egocentric) views limit a surface operator's ability to maneuver and navigate the ROV in complex deep-water missions. In this paper, we present an interactive teleoperatio…

    Submitted 30 June, 2024; originally announced July 2024.

    Comments: V1, 8 pages

  3. arXiv:2407.00224  [pdf, other]

    cs.CV stat.AP

    Multimodal Prototyping for cancer survival prediction

    Authors: Andrew H. Song, Richard J. Chen, Guillaume Jaume, Anurag J. Vaidya, Alexander S. Baras, Faisal Mahmood

    Abstract: Multimodal survival methods combining gigapixel histology whole-slide images (WSIs) and transcriptomic profiles are particularly promising for patient prognostication and stratification. Current approaches involve tokenizing the WSIs into smaller patches (>10,000 patches) and transcriptomics into gene groups, which are then integrated using a Transformer for predicting outcomes. However, this proc…

    Submitted 28 June, 2024; originally announced July 2024.

    Comments: ICML 2024

  4. arXiv:2406.18944  [pdf, other]

    cs.CV cs.AI cs.CR

    Investigating and Defending Shortcut Learning in Personalized Diffusion Models

    Authors: Yixin Liu, Ruoxi Chen, Lichao Sun

    Abstract: Personalized diffusion models have gained popularity for adapting pre-trained text-to-image models to generate images of specific topics with only a few images. However, recent studies find that these models are vulnerable to minor adversarial perturbation, and the fine-tuning performance is largely degraded on corrupted datasets. Such characteristics are further exploited to craft protective pert…

    Submitted 27 June, 2024; originally announced June 2024.

    Comments: Preprint

  5. arXiv:2406.18536  [pdf, other]

    eess.SY cs.AI cs.AR

    Reliable Interval Prediction of Minimum Operating Voltage Based on On-chip Monitors via Conformalized Quantile Regression

    Authors: Yuxuan Yin, Xiaoxiao Wang, Rebecca Chen, Chen He, Peng Li

    Abstract: Predicting the minimum operating voltage ($V_{min}$) of chips is one of the important techniques for improving the manufacturing testing flow, as well as ensuring the long-term reliability and safety of in-field systems. Current $V_{min}$ prediction methods often provide only point estimates, necessitating additional techniques for constructing prediction confidence intervals to cover uncertaintie…

    Submitted 3 May, 2024; originally announced June 2024.

    Comments: Accepted by DATE 2024. Camera-ready version

  6. arXiv:2406.16192  [pdf, other]

    cs.CV

    HEST-1k: A Dataset for Spatial Transcriptomics and Histology Image Analysis

    Authors: Guillaume Jaume, Paul Doucet, Andrew H. Song, Ming Y. Lu, Cristina Almagro-Pérez, Sophia J. Wagner, Anurag J. Vaidya, Richard J. Chen, Drew F. K. Williamson, Ahrong Kim, Faisal Mahmood

    Abstract: Spatial transcriptomics (ST) enables interrogating the molecular composition of tissue with ever-increasing resolution, depth, and sensitivity. However, costs, rapidly evolving technology, and lack of standards have constrained computational methods in ST to narrow tasks and small cohorts. In addition, the underlying tissue morphology as reflected by H&E-stained whole slide images (WSIs) encodes r…

    Submitted 23 June, 2024; originally announced June 2024.

    Comments: Under review

  7. arXiv:2406.15741  [pdf, other]

    cs.CL cs.AI cs.LG

    Ladder: A Model-Agnostic Framework Boosting LLM-based Machine Translation to the Next Level

    Authors: Zhaopeng Feng, Ruizhe Chen, Yan Zhang, Zijie Meng, Zuozhu Liu

    Abstract: General-purpose Large Language Models (LLMs) like GPT-4 have achieved remarkable advancements in machine translation (MT) by leveraging extensive web content. On the other hand, translation-specific LLMs are built by pre-training on domain-specific monolingual corpora and fine-tuning with human-annotated translation data. Despite the superior performance, these methods either demand an unprecedent…

    Submitted 22 June, 2024; originally announced June 2024.

    Comments: Our code is available at https://github.com/fzp0424/Ladder

  8. arXiv:2406.15222  [pdf]

    eess.IV cs.AI cs.CV

    Rapid and Accurate Diagnosis of Acute Aortic Syndrome using Non-contrast CT: A Large-scale, Retrospective, Multi-center and AI-based Study

    Authors: Yujian Hu, Yilang Xiang, Yan-Jie Zhou, Yangyan He, Shifeng Yang, Xiaolong Du, Chunlan Den, Youyao Xu, Gaofeng Wang, Zhengyao Ding, Jingyong Huang, Wenjun Zhao, Xuejun Wu, Donglin Li, Qianqian Zhu, Zhenjiang Li, Chenyang Qiu, Ziheng Wu, Yunjun He, Chen Tian, Yihui Qiu, Zuodong Lin, Xiaolong Zhang, Yuan He, Zhenpeng Yuan, et al. (15 additional authors not shown)

    Abstract: Chest pain symptoms are highly prevalent in emergency departments (EDs), where acute aortic syndrome (AAS) is a catastrophic cardiovascular emergency with a high fatality rate, especially when timely and accurate treatment is not administered. However, current triage practices in the ED can cause up to approximately half of patients with AAS to have an initially missed diagnosis or be misdiagnosed…

    Submitted 24 June, 2024; v1 submitted 13 June, 2024; originally announced June 2024.

    Comments: under peer review

  9. arXiv:2406.11418  [pdf, other]

    cs.CL

    BAMBINO-LM: (Bilingual-)Human-Inspired Continual Pretraining of BabyLM

    Authors: Zhewen Shen, Aditya Joshi, Ruey-Cheng Chen

    Abstract: Children from bilingual backgrounds benefit from interactions with parents and teachers to re-acquire their heritage language. In this paper, we investigate how this insight from behavioral study can be incorporated into the learning of small-scale language models. We introduce BAMBINO-LM, a continual pretraining strategy for BabyLM that uses a novel combination of alternation and PPO-based perple…

    Submitted 17 June, 2024; originally announced June 2024.

    Comments: Short paper; Under review

  10. arXiv:2406.11175  [pdf, other]

    cs.SD eess.AS

    SMRU: Split-and-Merge Recurrent-based UNet for Acoustic Echo Cancellation and Noise Suppression

    Authors: Zhihang Sun, Andong Li, Rilin Chen, Hao Zhang, Meng Yu, Yi Zhou, Dong Yu

    Abstract: The proliferation of deep neural networks has spawned the rapid development of acoustic echo cancellation and noise suppression, and plenty of prior arts have been proposed, which yield promising performance. Nevertheless, they rarely consider the deployment generality in different processing scenarios, such as edge devices, and cloud processing. To this end, this paper proposes a general model, t…

    Submitted 16 June, 2024; originally announced June 2024.

  11. arXiv:2406.10813  [pdf, other]

    cs.CL

    Self-Evolution Fine-Tuning for Policy Optimization

    Authors: Ruijun Chen, Jiehao Liang, Shiping Gao, Fanqi Wan, Xiaojun Quan

    Abstract: The alignment of large language models (LLMs) is crucial not only for unlocking their potential in specific tasks but also for ensuring that responses meet human expectations and adhere to safety and ethical principles. Current alignment methodologies face considerable challenges. For instance, supervised fine-tuning (SFT) requires extensive, high-quality annotated samples, while reinforcement lea…

    Submitted 16 June, 2024; originally announced June 2024.

  12. arXiv:2406.10594  [pdf, other]

    cs.CL

    BlockPruner: Fine-grained Pruning for Large Language Models

    Authors: Longguang Zhong, Fanqi Wan, Ruijun Chen, Xiaojun Quan, Liangzhi Li

    Abstract: With the rapid growth in the size and complexity of large language models (LLMs), the costs associated with their training and inference have escalated significantly. Research indicates that certain layers in LLMs harbor substantial redundancy, and pruning these layers has minimal impact on the overall performance. While various layer pruning methods have been developed based on this insight, they…

    Submitted 20 June, 2024; v1 submitted 15 June, 2024; originally announced June 2024.

  13. arXiv:2406.07432  [pdf, other]

    cs.IR

    Matryoshka Representation Learning for Recommendation

    Authors: Riwei Lai, Li Chen, Weixin Chen, Rui Chen

    Abstract: Representation learning is essential for deep-neural-network-based recommender systems to capture user preferences and item features within fixed-dimensional user and item vectors. Unlike existing representation learning methods that either treat each user preference and item feature uniformly or categorize them into discrete clusters, we argue that in the real world, user preferences and item fea…

    Submitted 11 June, 2024; originally announced June 2024.

  14. arXiv:2406.07176  [pdf, other]

    cs.CV

    RAD: A Comprehensive Dataset for Benchmarking the Robustness of Image Anomaly Detection

    Authors: Yuqi Cheng, Yunkang Cao, Rui Chen, Weiming Shen

    Abstract: Robustness against noisy imaging is crucial for practical image anomaly detection systems. This study introduces a Robust Anomaly Detection (RAD) dataset with free views, uneven illuminations, and blurry collections to systematically evaluate the robustness of current anomaly detection methods. Specifically, RAD aims to identify foreign objects on working platforms as anomalies. The collection pro…

    Submitted 11 June, 2024; originally announced June 2024.

    Comments: 6 pages, 5 figures

  15. arXiv:2406.06542  [pdf, other]

    cs.AR cs.LG

    vMCU: Coordinated Memory Management and Kernel Optimization for DNN Inference on MCUs

    Authors: Size Zheng, Renze Chen, Meng Li, Zihao Ye, Luis Ceze, Yun Liang

    Abstract: IoT devices based on microcontroller units (MCU) provide ultra-low power consumption and ubiquitous computation for near-sensor deep learning models (DNN). However, the memory of MCU is usually 2-3 orders of magnitude smaller than mobile devices, which makes it challenging to map DNNs onto MCUs. Previous work separates memory management and kernel implementation for MCU and relies on coarse-graine…

    Submitted 1 May, 2024; originally announced June 2024.

  16. arXiv:2406.05325  [pdf, other]

    eess.AS cs.SD

    LDM-SVC: Latent Diffusion Model Based Zero-Shot Any-to-Any Singing Voice Conversion with Singer Guidance

    Authors: Shihao Chen, Yu Gu, Jie Zhang, Na Li, Rilin Chen, Liping Chen, Lirong Dai

    Abstract: Any-to-any singing voice conversion (SVC) is an interesting audio editing technique, aiming to convert the singing voice of one singer into that of another, given only a few seconds of singing data. However, during the conversion process, the issue of timbre leakage is inevitable: the converted singing voice still sounds like the original singer's voice. To tackle this, we propose a latent diffusi…

    Submitted 7 June, 2024; originally announced June 2024.

    Comments: Accepted by Interspeech 2024

  17. arXiv:2406.04713  [pdf, other]

    cs.LG cond-mat.mtrl-sci cs.AI physics.comp-ph stat.ML

    FlowMM: Generating Materials with Riemannian Flow Matching

    Authors: Benjamin Kurt Miller, Ricky T. Q. Chen, Anuroop Sriram, Brandon M Wood

    Abstract: Crystalline materials are a fundamental component in next-generation technologies, yet modeling their distribution presents unique computational challenges. Of the plausible arrangements of atoms in a periodic lattice only a vanishingly small percentage are thermodynamically stable, which is a key indicator of the materials that can be experimentally realized. Two fundamental tasks in this area ar…

    Submitted 7 June, 2024; originally announced June 2024.

    Comments: https://github.com/facebookresearch/flowmm

    Journal ref: ICML 2024

  18. arXiv:2406.04531  [pdf, other]

    cs.SE

    TESTEVAL: Benchmarking Large Language Models for Test Case Generation

    Authors: Wenhan Wang, Chenyuan Yang, Zhijie Wang, Yuheng Huang, Zhaoyang Chu, Da Song, Lingming Zhang, An Ran Chen, Lei Ma

    Abstract: Testing plays a crucial role in the software development cycle, enabling the detection of bugs, vulnerabilities, and other undesirable behaviors. To perform software testing, testers need to write code snippets that execute the program under test. Recently, researchers have recognized the potential of large language models (LLMs) in software testing. However, there remains a lack of fair compariso…

    Submitted 6 June, 2024; originally announced June 2024.

  19. arXiv:2406.03882  [pdf, other]

    cs.CL cs.SD eess.AS

    Spontaneous Speech-Based Suicide Risk Detection Using Whisper and Large Language Models

    Authors: Ziyun Cui, Chang Lei, Wen Wu, Yinan Duan, Diyang Qu, Ji Wu, Runsen Chen, Chao Zhang

    Abstract: The early detection of suicide risk is important since it enables the intervention to prevent potential suicide attempts. This paper studies the automatic detection of suicide risk based on spontaneous speech from adolescents, and collects a Mandarin dataset with 15 hours of suicide speech from more than a thousand adolescents aged from ten to eighteen for our experiments. To leverage the diverse…

    Submitted 6 June, 2024; originally announced June 2024.

    Comments: Accepted by Interspeech 2024

  20. arXiv:2406.02603  [pdf, other]

    cs.CR cs.LG

    Distortion-free Watermarks are not Truly Distortion-free under Watermark Key Collisions

    Authors: Yihan Wu, Ruibo Chen, Zhengmian Hu, Yanshuo Chen, Junfeng Guo, Hongyang Zhang, Heng Huang

    Abstract: Language model (LM) watermarking techniques inject a statistical signal into LM-generated content by substituting the random sampling process with pseudo-random sampling, using watermark keys as the random seed. Among these statistical watermarking approaches, distortion-free watermarks are particularly crucial because they embed watermarks into LM-generated content without compromising generation…

    Submitted 2 June, 2024; originally announced June 2024.

  21. arXiv:2406.01993  [pdf]

    eess.IV cs.CV

    Choroidal Vessel Segmentation on Indocyanine Green Angiography Images via Human-in-the-Loop Labeling

    Authors: Ruoyu Chen, Ziwei Zhao, Mayinuer Yusufu, Xianwen Shang, Danli Shi, Mingguang He

    Abstract: Human-in-the-loop (HITL) strategy has been recently introduced into the field of medical image processing. Indocyanine green angiography (ICGA) stands as a well-established examination for visualizing choroidal vasculature and detecting chorioretinal diseases. However, the intricate nature of choroidal vascular networks makes large-scale manual segmentation of ICGA images challenging. Thus, the st…

    Submitted 4 June, 2024; originally announced June 2024.

    Comments: 25 pages, 4 figures

  22. arXiv:2406.00584  [pdf, other]

    cs.DB cs.AI

    A Blueprint Architecture of Compound AI Systems for Enterprise

    Authors: Eser Kandogan, Sajjadur Rahman, Nikita Bhutani, Dan Zhang, Rafael Li Chen, Kushan Mitra, Sairam Gurajada, Pouya Pezeshkpour, Hayate Iso, Yanlin Feng, Hannah Kim, Chen Shen, ** Wang, Estevam Hruschka

    Abstract: Large Language Models (LLMs) have showcased remarkable capabilities surpassing conventional NLP challenges, creating opportunities for use in production use cases. Towards this goal, there is a notable shift to building compound AI systems, wherein LLMs are integrated into an expansive software infrastructure with many components like models, retrievers, databases and tools. In this paper, we intr…

    Submitted 1 June, 2024; originally announced June 2024.

    Comments: Compound AI Systems Workshop at the Data+AI Summit 2024

  23. arXiv:2406.00288  [pdf, other]

    cs.LG stat.ML

    Neural Optimal Transport with Lagrangian Costs

    Authors: Aram-Alexandre Pooladian, Carles Domingo-Enrich, Ricky T. Q. Chen, Brandon Amos

    Abstract: We investigate the optimal transport problem between probability measures when the underlying cost function is understood to satisfy a least action principle, also known as a Lagrangian cost. These generalizations are useful when connecting observations from a physical system where the transport dynamics are influenced by the geometry of the system, such as obstacles (e.g., incorporating barrier f…

    Submitted 31 May, 2024; originally announced June 2024.

    Comments: UAI 2024

  24. arXiv:2405.17814  [pdf, other]

    cs.CV cs.AI

    FAIntbench: A Holistic and Precise Benchmark for Bias Evaluation in Text-to-Image Models

    Authors: Hanjun Luo, Ziye Deng, Ruizhe Chen, Zuozhu Liu

    Abstract: The rapid development and reduced barriers to entry for Text-to-Image (T2I) models have raised concerns about the biases in their outputs, but existing research lacks a holistic definition and evaluation framework of biases, limiting the enhancement of debiasing techniques. To address this issue, we introduce FAIntbench, a holistic and precise benchmark for biases in T2I models. In contrast to exi…

    Submitted 8 June, 2024; v1 submitted 28 May, 2024; originally announced May 2024.

  25. arXiv:2405.17755  [pdf, other]

    cs.CL cs.AI

    XL3M: A Training-free Framework for LLM Length Extension Based on Segment-wise Inference

    Authors: Shengnan Wang, Youhui Bai, Lin Zhang, **yi Zhou, Shixiong Zhao, Gong Zhang, Sen Wang, Renhai Chen, Hua Xu, Hongwei Sun

    Abstract: Length generalization failure problem, namely the large language model (LLM) fails to generalize to texts longer than its maximum training length, greatly restricts the application of LLM in the scenarios with streaming long inputs. To address this problem, the existing methods either require substantial costs or introduce precision loss. In this paper, we empirically find that the accuracy of the…

    Submitted 27 May, 2024; originally announced May 2024.

    Comments: 11 pages, 5 figures

  26. arXiv:2405.14979  [pdf, other]

    cs.GR cs.CV

    CraftsMan: High-fidelity Mesh Generation with 3D Native Generation and Interactive Geometry Refiner

    Authors: Weiyu Li, Jiarui Liu, Rui Chen, Yixun Liang, Xuelin Chen, Ping Tan, Xiaoxiao Long

    Abstract: We present a novel generative 3D modeling system, coined CraftsMan, which can generate high-fidelity 3D geometries with highly varied shapes, regular mesh topologies, and detailed surfaces, and, notably, allows for refining the geometry in an interactive manner. Despite the significant advancements in 3D generation, existing methods still struggle with lengthy optimization processes, irregular mes…

    Submitted 23 May, 2024; originally announced May 2024.

    Comments: HomePage: https://craftsman3d.github.io/, Code: https://github.com/wyysf-98/CraftsMan

  27. arXiv:2405.14870  [pdf, other]

    cs.CV cs.RO

    An Empirical Study of Training State-of-the-Art LiDAR Segmentation Models

    Authors: Jiahao Sun, Chunmei Qing, Xiang Xu, Lingdong Kong, Youquan Liu, Li Li, Chenming Zhu, Jingwei Zhang, Zeqi Xiao, Runnan Chen, Tai Wang, Wenwei Zhang, Kai Chen

    Abstract: In the rapidly evolving field of autonomous driving, precise segmentation of LiDAR data is crucial for understanding complex 3D environments. Traditional approaches often rely on disparate, standalone codebases, hindering unified advancements and fair benchmarking across models. To address these challenges, we introduce MMDetection3D-lidarseg, a comprehensive toolbox designed for the efficient tra…

    Submitted 30 May, 2024; v1 submitted 23 May, 2024; originally announced May 2024.

    Comments: Preprint; 17 pages, 4 figures, 7 tables; Code at https://github.com/open-mmlab/mmdetection3d

  28. arXiv:2405.14198  [pdf, other]

    cs.MA

    Enabling Sustainable Freight Forwarding Network via Collaborative Games

    Authors: Pang-Jin Tan, Shih-Fen Cheng, Richard Chen

    Abstract: Freight forwarding plays a crucial role in facilitating global trade and logistics. However, as the freight forwarding market is extremely fragmented, freight forwarders often face the issue of not being able to fill the available shipping capacity. This recurrent issue motivates the creation of various freight forwarding networks that aim at exchanging capacities and demands so that the resource…

    Submitted 23 May, 2024; originally announced May 2024.

    Comments: Accepted to the 33rd International Joint Conference on Artificial Intelligence (IJCAI-24)

  29. arXiv:2405.14090  [pdf, other]

    cs.LG math.OC

    Actively Learning Combinatorial Optimization Using a Membership Oracle

    Authors: Rosario Messana, Rui Chen, Andrea Lodi

    Abstract: We consider solving a combinatorial optimization problem with an unknown linear constraint using a membership oracle that, given a solution, determines whether it is feasible or infeasible with absolute certainty. The goal of the decision maker is to find the best possible solution subject to a budget on the number of oracle calls. Inspired by active learning based on Support Vector Machines (SVMs…

    Submitted 22 May, 2024; originally announced May 2024.

  30. arXiv:2405.12079  [pdf, other]

    cs.DC cs.OS

    PARALLELGPUOS: A Concurrent OS-level GPU Checkpoint and Restore System using Validated Speculation

    Authors: Zhuobin Huang, Xingda Wei, Yingyi Hao, Rong Chen, Mingcong Han, Jinyu Gu, Haibo Chen

    Abstract: Checkpointing (C) and restoring (R) are key components for GPU tasks. POS is an OS-level GPU C/R system: It can transparently checkpoint or restore processes that use the GPU, without requiring any cooperation from the application, a key feature required by modern systems like the cloud. Moreover, POS is the first OS-level C/R system that can concurrently execute C/R with the application execution…

    Submitted 20 May, 2024; originally announced May 2024.

  31. arXiv:2405.11643  [pdf, other]

    cs.CV cs.LG stat.AP

    Morphological Prototyping for Unsupervised Slide Representation Learning in Computational Pathology

    Authors: Andrew H. Song, Richard J. Chen, Tong Ding, Drew F. K. Williamson, Guillaume Jaume, Faisal Mahmood

    Abstract: Representation learning of pathology whole-slide images (WSIs) has primarily relied on weak supervision with Multiple Instance Learning (MIL). However, the slide representations resulting from this approach are highly tailored to specific clinical tasks, which limits their expressivity and generalization, particularly in scenarios with limited data. Instead, we hypothesize that morphologi…

    Submitted 19 May, 2024; originally announced May 2024.

    Comments: CVPR 2024

  32. arXiv:2405.11618  [pdf, other]

    cs.CV cs.AI

    Transcriptomics-guided Slide Representation Learning in Computational Pathology

    Authors: Guillaume Jaume, Lukas Oldenburg, Anurag Vaidya, Richard J. Chen, Drew F. K. Williamson, Thomas Peeters, Andrew H. Song, Faisal Mahmood

    Abstract: Self-supervised learning (SSL) has been successful in building patch embeddings of small histology images (e.g., 224x224 pixels), but scaling these models to learn slide embeddings from the entirety of giga-pixel whole-slide images (WSIs) remains challenging. Here, we leverage complementary information from gene expression profiles to guide slide representation learning using multimodal pre-traini…

    Submitted 19 May, 2024; originally announced May 2024.

    Comments: CVPR'24, Oral

  33. arXiv:2405.11380  [pdf, other]

    cs.RO cs.AI eess.SY

    Meta-Control: Automatic Model-based Control Synthesis for Heterogeneous Robot Skills

    Authors: Tianhao Wei, Liqian Ma, Rui Chen, Weiye Zhao, Changliu Liu

    Abstract: The requirements for real-world manipulation tasks are diverse and often conflicting; some tasks require precise motion while others require force compliance; some tasks require avoidance of certain regions, while others require convergence to certain states. Satisfying these varied requirements with a fixed state-action representation and control strategy is challenging, impeding the development…

    Submitted 7 June, 2024; v1 submitted 18 May, 2024; originally announced May 2024.

  34. arXiv:2405.11226  [pdf, ps, other]

    cs.LG

    The Power of Active Multi-Task Learning in Reinforcement Learning from Human Feedback

    Authors: Ruitao Chen, Liwei Wang

    Abstract: Reinforcement learning from human feedback (RLHF) has contributed to performance improvements in large language models. To tackle its reliance on substantial amounts of human-labeled data, a successful approach is multi-task representation learning, which involves learning a high-quality, low-dimensional representation from a wide range of source tasks. In this paper, we formulate RLHF as the cont…

    Submitted 18 May, 2024; originally announced May 2024.

  35. arXiv:2405.10989  [pdf, other]

    cs.LG cs.AI cs.CL cs.CR

    Learnable Privacy Neurons Localization in Language Models

    Authors: Ruizhe Chen, Tianxiang Hu, Yang Feng, Zuozhu Liu

    Abstract: Concerns regarding Large Language Models (LLMs) to memorize and disclose private information, particularly Personally Identifiable Information (PII), become prominent within the community. Many efforts have been made to mitigate the privacy risks. However, the mechanism through which LLMs memorize PII remains poorly understood. To bridge this gap, we introduce a pioneering method for pinpointing P…

    Submitted 16 May, 2024; originally announced May 2024.

    Comments: ACL 2024 main conference

  36. arXiv:2405.09585  [pdf, other]

    cs.LG cs.AI

    An Embarrassingly Simple Approach to Enhance Transformer Performance in Genomic Selection for Crop Breeding

    Authors: Renqi Chen, Wenwei Han, Haohao Zhang, Haoyang Su, Zhefan Wang, Xiaolei Liu, Hao Jiang, Wanli Ouyang, Nanqing Dong

    Abstract: Genomic selection (GS), as a critical crop breeding strategy, plays a key role in enhancing food production and addressing the global hunger crisis. The predominant approaches in GS currently revolve around employing statistical methods for prediction. However, statistical methods often come with two main limitations: strong statistical priors and linear assumptions. A recent trend is to capture t…

    Submitted 24 June, 2024; v1 submitted 15 May, 2024; originally announced May 2024.

    Comments: Accepted by IJCAI2024. Code is available at https://github.com/RenqiChen/Genomic-Selection

  37. arXiv:2405.09341  [pdf, other]

    cs.CL cs.AI

    Large Language Model Bias Mitigation from the Perspective of Knowledge Editing

    Authors: Ruizhe Chen, Yichen Li, Zikai Xiao, Zuozhu Liu

    Abstract: Existing debiasing methods inevitably make unreasonable or undesired predictions as they are designated and evaluated to achieve parity across different social groups but leave aside individual facts, resulting in modified existing knowledge. In this paper, we first establish a new bias mitigation benchmark BiasKE leveraging existing and additional constructed datasets, which systematically assess…

    Submitted 29 June, 2024; v1 submitted 15 May, 2024; originally announced May 2024.

  38. arXiv:2405.04795  [pdf, other]

    cs.LG

    Variational Schrödinger Diffusion Models

    Authors: Wei Deng, Weijian Luo, Yixin Tan, Marin Biloš, Yu Chen, Yuriy Nevmyvaka, Ricky T. Q. Chen

    Abstract: Schrödinger bridge (SB) has emerged as the go-to method for optimizing transportation plans in diffusion models. However, SB requires estimating the intractable forward score functions, inevitably resulting in the costly implicit training loss based on simulated trajectories. To improve the scalability while preserving efficient transportation plans, we leverage variational inference to linearize…

    Submitted 19 June, 2024; v1 submitted 8 May, 2024; originally announced May 2024.

    Comments: ICML 2024

  39. arXiv:2405.04434  [pdf, other]

    cs.CL cs.AI

    DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model

    Authors: DeepSeek-AI, Aixin Liu, Bei Feng, Bin Wang, Bingxuan Wang, Bo Liu, Chenggang Zhao, Chengqi Dengr, Chong Ruan, Damai Dai, Daya Guo, Dejian Yang, Deli Chen, Dongjie Ji, Erhang Li, Fangyun Lin, Fuli Luo, Guangbo Hao, Guanting Chen, Guowei Li, H. Zhang, Hanwei Xu, Hao Yang, Haowei Zhang, Honghui Ding, et al. (132 additional authors not shown)

    Abstract: We present DeepSeek-V2, a strong Mixture-of-Experts (MoE) language model characterized by economical training and efficient inference. It comprises 236B total parameters, of which 21B are activated for each token, and supports a context length of 128K tokens. DeepSeek-V2 adopts innovative architectures including Multi-head Latent Attention (MLA) and DeepSeekMoE. MLA guarantees efficient inference…

    Submitted 19 June, 2024; v1 submitted 7 May, 2024; originally announced May 2024.

  40. arXiv:2405.03267  [pdf, other]

    cs.DC cs.DB cs.IR

    Characterizing the Dilemma of Performance and Index Size in Billion-Scale Vector Search and Breaking It with Second-Tier Memory

    Authors: Rongxin Cheng, Yifan Peng, Xingda Wei, Hongrui Xie, Rong Chen, Sijie Shen, Haibo Chen

    Abstract: Vector searches on large-scale datasets are critical to modern online services like web search and RAG, which necessitates storing the datasets and their index on secondary storage like SSD. In this paper, we are the first to characterize the trade-off of performance and index size in existing SSD-based graph and cluster indexes: to improve throughput by 5.7$\times$ and 1.7$\times$, these indexes…

    Submitted 7 May, 2024; v1 submitted 6 May, 2024; originally announced May 2024.

  41. arXiv:2405.01538  [pdf, other]

    cs.CV cs.LG cs.RO

    Multi-Space Alignments Towards Universal LiDAR Segmentation

    Authors: Youquan Liu, Lingdong Kong, Xiaoyang Wu, Runnan Chen, Xin Li, Liang Pan, Ziwei Liu, Yuexin Ma

    Abstract: A unified and versatile LiDAR segmentation model with strong robustness and generalizability is desirable for safe autonomous driving perception. This work presents M3Net, a one-of-a-kind framework for fulfilling multi-task, multi-dataset, multi-modality LiDAR segmentation in a universal manner using just a single set of parameters. To better exploit data volume and diversity, we first combine lar…

    Submitted 2 May, 2024; originally announced May 2024.

    Comments: CVPR 2024; 33 pages, 14 figures, 14 tables; Code at https://github.com/youquanl/M3Net

  42. arXiv:2405.00565  [pdf, other]

    cs.SE

    Leveraging Stack Traces for Spectrum-based Fault Localization in the Absence of Failing Tests

    Authors: Lorena Barreto Simedo Pacheco, An Ran Chen, **qiu Yang, Tse-Hsun Chen

    Abstract: Bug fixing is a crucial task in software maintenance to hold user trust. Although various automated fault localization techniques exist, they often require specific conditions to be effective. For example, Spectrum-Based Fault Localization (SBFL) techniques need at least one failing test to identify bugs, which may not always be available. Bug reports, particularly those with stack traces, provide…

    Submitted 1 May, 2024; originally announced May 2024.

  43. LLMParser: An Exploratory Study on Using Large Language Models for Log Parsing

    Authors: Zeyang Ma, An Ran Chen, Dong Jae Kim, Tse-Hsun Chen, Shaowei Wang

    Abstract: Logs provide important runtime information in modern software development. Log parsing, the first step in many log-based analyses, involves extracting structured information from unstructured log data. Traditional log parsers face challenges in accurately parsing logs due to the diversity of log formats, which directly impacts the performance of downstream log-analysis tasks. In this paper,…

    Submitted 27 April, 2024; originally announced April 2024.

  44. arXiv:2404.17302  [pdf, other]

    cs.RO cs.AI cs.CV

    Part-Guided 3D RL for Sim2Real Articulated Object Manipulation

    Authors: Pengwei Xie, Rui Chen, Siang Chen, Yuzhe Qin, Fanbo Xiang, Tianyu Sun, **g Xu, Gui** Wang, Hao Su

    Abstract: Manipulating unseen articulated objects through visual feedback is a critical but challenging task for real robots. Existing learning-based solutions mainly focus on visual affordance learning or other pre-trained visual models to guide manipulation policies, which face challenges for novel instances in real-world scenarios. In this paper, we propose a novel part-guided 3D RL framework, which can…

    Submitted 26 April, 2024; originally announced April 2024.

    Comments: 9 pages

  45. arXiv:2404.16687  [pdf, other]

    cs.CV

    NTIRE 2024 Quality Assessment of AI-Generated Content Challenge

    Authors: Xiaohong Liu, Xiongkuo Min, Guangtao Zhai, Chunyi Li, Tengchuan Kou, Wei Sun, Haoning Wu, Yixuan Gao, Yuqin Cao, Zicheng Zhang, Xiele Wu, Radu Timofte, Fei Peng, Huiyuan Fu, Anlong Ming, Chuanming Wang, Huadong Ma, Shuai He, Zifei Dou, Shu Chen, Huacong Zhang, Haiyi Xie, Chengwei Wang, Baoying Chen, Jishen Zeng, et al. (89 additional authors not shown)

    Abstract: This paper reports on the NTIRE 2024 Quality Assessment of AI-Generated Content Challenge, which will be held in conjunction with the New Trends in Image Restoration and Enhancement Workshop (NTIRE) at CVPR 2024. This challenge is to address a major challenge in the field of image and video processing, namely, Image Quality Assessment (IQA) and Video Quality Assessment (VQA) for AI-Generated Conte…

    Submitted 7 May, 2024; v1 submitted 25 April, 2024; originally announced April 2024.

  46. arXiv:2404.16067  [pdf, other]

    cs.HC cs.AI

    Layout2Rendering: AI-aided Greenspace design

    Authors: Ran Chen, Zeke Lian, Yueheng He, Xiao Ling, Fuyu Yang, Xueqi Yao, Xingjian Yi, **g Zhao

    Abstract: In traditional human living environment landscape design, the establishment of three-dimensional models is an essential step for designers to intuitively present the spatial relationships of design elements, as well as a foundation for conducting landscape analysis on the site. Rapidly and effectively generating beautiful and realistic landscape spaces is a significant challenge faced by designers…

    Submitted 21 April, 2024; originally announced April 2024.

    Comments: 14 pages, 8 figures

  47. arXiv:2404.16006  [pdf, other]

    cs.CV

    MMT-Bench: A Comprehensive Multimodal Benchmark for Evaluating Large Vision-Language Models Towards Multitask AGI

    Authors: Kaining Ying, Fanqing Meng, ** Wang, Zhiqian Li, Han Lin, Yue Yang, Hao Zhang, Wenbo Zhang, Yuqi Lin, Shuo Liu, Jiayi Lei, Quanfeng Lu, Runjian Chen, Peng Xu, Renrui Zhang, Haozhe Zhang, Peng Gao, Yali Wang, Yu Qiao, ** Luo, Kaipeng Zhang, Wenqi Shao

    Abstract: Large Vision-Language Models (LVLMs) show significant strides in general-purpose multimodal applications such as visual dialogue and embodied navigation. However, existing multimodal evaluation benchmarks cover a limited number of multimodal tasks testing rudimentary capabilities, falling short in tracking LVLM development. In this study, we present MMT-Bench, a comprehensive benchmark designed to…

    Submitted 24 April, 2024; originally announced April 2024.

    Comments: 77 pages, 41 figures

  48. arXiv:2404.15121  [pdf, other]

    cs.GR cs.AI cs.CV

    Taming Diffusion Probabilistic Models for Character Control

    Authors: Rui Chen, Mingyi Shi, Shaoli Huang, ** Tan, Taku Komura, Xuelin Chen

    Abstract: We present a novel character control framework that effectively utilizes motion diffusion probabilistic models to generate high-quality and diverse character animations, responding in real-time to a variety of dynamic user-supplied control signals. At the heart of our method lies a transformer-based Conditional Autoregressive Motion Diffusion Model (CAMDM), which takes as input the character's his…

    Submitted 23 April, 2024; originally announced April 2024.

    Comments: Accepted by SIGGRAPH 2024 (Conference Track). Project page and source codes: https://aiganimation.github.io/CAMDM/

  49. arXiv:2404.14710  [pdf, other]

    cs.SE

    Challenges of Using Pre-trained Models: the Practitioners' Perspective

    Authors: Xin Tan, Taichuan Li, Ruohe Chen, Fang Liu, Li Zhang

    Abstract: The challenges associated with using pre-trained models (PTMs) have not been specifically investigated, which hampers their effective utilization. To address this knowledge gap, we collected and analyzed a dataset of 5,896 PTM-related questions on Stack Overflow. We first analyze the popularity and difficulty trends of PTM-related questions. We find that PTM-related questions are becoming more and…

    Submitted 1 May, 2024; v1 submitted 22 April, 2024; originally announced April 2024.

    Comments: SANER 2024

  50. arXiv:2404.12728  [pdf, other]

    cs.CL

    Relevant or Random: Can LLMs Truly Perform Analogical Reasoning?

    Authors: Chengwei Qin, Wenhan Xia, Tan Wang, Fangkai Jiao, Yuchen Hu, Bosheng Ding, Ruirui Chen, Shafiq Joty

    Abstract: Analogical reasoning is a unique ability of humans to address unfamiliar challenges by transferring strategies from relevant past experiences. One key finding in psychology is that compared with irrelevant past experiences, recalling relevant ones can help humans better handle new tasks. Coincidentally, the NLP community has also recently found that self-generating relevant examples in the context…

    Submitted 23 June, 2024; v1 submitted 19 April, 2024; originally announced April 2024.