Showing 1–50 of 200 results for author: Jia, R

  1. arXiv:2406.17864  [pdf, other]

    cs.CY cs.AI

    AI Risk Categorization Decoded (AIR 2024): From Government Regulations to Corporate Policies

    Authors: Yi Zeng, Kevin Klyman, Andy Zhou, Yu Yang, Minzhou Pan, Ruoxi Jia, Dawn Song, Percy Liang, Bo Li

    Abstract: We present a comprehensive AI risk taxonomy derived from eight government policies from the European Union, United States, and China and 16 company policies worldwide, making a significant step towards establishing a unified language for generative AI safety evaluation. We identify 314 unique risk categories organized into a four-tiered taxonomy. At the highest level, this taxonomy encompasses Sys…

    Submitted 25 June, 2024; originally announced June 2024.

  2. arXiv:2406.17274  [pdf, other]

    cs.CL cs.LG

    Can We Trust the Performance Evaluation of Uncertainty Estimation Methods in Text Summarization?

    Authors: Jianfeng He, Runing Yang, Linlin Yu, Changbin Li, Ruoxi Jia, Feng Chen, Ming Jin, Chang-Tien Lu

    Abstract: Text summarization, a key natural language generation (NLG) task, is vital in various domains. However, the high cost of inaccurate summaries in risk-critical applications, particularly those involving human-in-the-loop decision-making, raises concerns about the reliability of uncertainty estimation on text summarization (UE-TS) evaluation methods. This concern stems from the dependency of uncerta…

    Submitted 25 June, 2024; originally announced June 2024.

    Comments: 63 pages, 41 figures, 11 tables

  3. arXiv:2406.17092  [pdf, other]

    cs.CR cs.AI

    BEEAR: Embedding-based Adversarial Removal of Safety Backdoors in Instruction-tuned Language Models

    Authors: Yi Zeng, Weiyu Sun, Tran Ngoc Huynh, Dawn Song, Bo Li, Ruoxi Jia

    Abstract: Safety backdoor attacks in large language models (LLMs) enable the stealthy triggering of unsafe behaviors while evading detection during normal interactions. The high dimensionality of potential triggers in the token space and the diverse range of malicious behaviors make this a critical challenge. We present BEEAR, a mitigation approach leveraging the insight that backdoor triggers induce relati…

    Submitted 24 June, 2024; originally announced June 2024.

  4. arXiv:2406.16943  [pdf, other]

    eess.SP cs.AI cs.HC cs.LG

    EarDA: Towards Accurate and Data-Efficient Earable Activity Sensing

    Authors: Shengzhe Lyu, Yongliang Chen, Di Duan, Renqi Jia, Weitao Xu

    Abstract: In the realm of smart sensing with the Internet of Things, earable devices are empowered with the capability of multi-modality sensing and intelligence of context-aware computing, leading to its wide usage in Human Activity Recognition (HAR). Nonetheless, unlike the movements captured by Inertial Measurement Unit (IMU) sensors placed on the upper or lower body, those motion signals obtained from e…

    Submitted 18 June, 2024; originally announced June 2024.

    Comments: accepted by 2024 IEEE Coupling of Sensing & Computing in AIoT Systems (CSCAIoT)

  5. arXiv:2406.14598  [pdf, other]

    cs.AI

    SORRY-Bench: Systematically Evaluating Large Language Model Safety Refusal Behaviors

    Authors: Tinghao Xie, Xiangyu Qi, Yi Zeng, Yangsibo Huang, Udari Madhushani Sehwag, Kaixuan Huang, Luxi He, Boyi Wei, Dacheng Li, Ying Sheng, Ruoxi Jia, Bo Li, Kai Li, Danqi Chen, Peter Henderson, Prateek Mittal

    Abstract: Evaluating aligned large language models' (LLMs) ability to recognize and reject unsafe user requests is crucial for safe, policy-compliant deployments. Existing evaluation efforts, however, face three limitations that we address with SORRY-Bench, our proposed benchmark. First, existing methods often use coarse-grained taxonomies of unsafe topics, and are over-representing some fine-grained topics…

    Submitted 20 June, 2024; originally announced June 2024.

  6. arXiv:2406.13131  [pdf, other]

    cs.CL

    When Parts are Greater Than Sums: Individual LLM Components Can Outperform Full Models

    Authors: Ting-Yun Chang, Jesse Thomason, Robin Jia

    Abstract: This paper studies in-context learning (ICL) by decomposing the output of large language models into the individual contributions of attention heads and MLPs (components). We observe curious components: good-performing ones that individually do well on a classification task, even when the model performs poorly; bad-performing ones that do much worse than chance; and label-biased components that al…

    Submitted 24 June, 2024; v1 submitted 18 June, 2024; originally announced June 2024.

    Comments: fix typos and citations; appendix

  7. arXiv:2406.11011  [pdf, other]

    cs.LG cs.CL stat.ML

    Data Shapley in One Training Run

    Authors: Jiachen T. Wang, Prateek Mittal, Dawn Song, Ruoxi Jia

    Abstract: Data Shapley provides a principled framework for attributing data's contribution within machine learning contexts. However, existing approaches require re-training models on different data subsets, which is computationally intensive, foreclosing their application to large-scale models. Furthermore, they produce the same attribution score for any models produced by running the learning algorithm, m…

    Submitted 29 June, 2024; v1 submitted 16 June, 2024; originally announced June 2024.

  8. arXiv:2406.07029  [pdf, other]

    cs.LG

    Fairness-Aware Meta-Learning via Nash Bargaining

    Authors: Yi Zeng, Xuelin Yang, Li Chen, Cristian Canton Ferrer, Ming Jin, Michael I. Jordan, Ruoxi Jia

    Abstract: To address issues of group-level fairness in machine learning, it is natural to adjust model parameters based on specific fairness objectives over a sensitive-attributed validation set. Such an adjustment procedure can be cast within a meta-learning framework. However, naive integration of fairness goals via meta-learning can cause hypergradient conflicts for subgroups, resulting in unstable conve…

    Submitted 11 June, 2024; originally announced June 2024.

  9. arXiv:2406.03720  [pdf, other]

    cs.CV cs.MM

    JIGMARK: A Black-Box Approach for Enhancing Image Watermarks against Diffusion Model Edits

    Authors: Minzhou Pan, Yi Zeng, Xue Lin, Ning Yu, Cho-Jui Hsieh, Peter Henderson, Ruoxi Jia

    Abstract: In this study, we investigate the vulnerability of image watermarks to diffusion-model-based image editing, a challenge exacerbated by the computational cost of accessing gradient information and the closed-source nature of many diffusion models. To address this issue, we introduce JIGMARK. This first-of-its-kind watermarking technique enhances robustness through contrastive learning with pairs of…

    Submitted 5 June, 2024; originally announced June 2024.

  10. arXiv:2406.03445  [pdf, other]

    cs.LG cs.CL

    Pre-trained Large Language Models Use Fourier Features to Compute Addition

    Authors: Tianyi Zhou, Deqing Fu, Vatsal Sharan, Robin Jia

    Abstract: Pre-trained large language models (LLMs) exhibit impressive mathematical reasoning capabilities, yet how they compute basic arithmetic, such as addition, remains unclear. This paper shows that pre-trained LLMs add numbers using Fourier features -- dimensions in the hidden state that represent numbers via a set of features sparse in the frequency domain. Within the model, MLP and attention layers u…

    Submitted 5 June, 2024; originally announced June 2024.

  11. arXiv:2406.02791  [pdf, other]

    cs.AI cs.CL cs.RO

    Language Models can Infer Action Semantics for Classical Planners from Environment Feedback

    Authors: Wang Zhu, Ishika Singh, Robin Jia, Jesse Thomason

    Abstract: Classical planning approaches guarantee finding a set of actions that can achieve a given goal state when possible, but require an expert to specify logical action semantics that govern the dynamics of the environment. Researchers have shown that Large Language Models (LLMs) can be used to directly infer planning steps based on commonsense knowledge and minimal domain information alone, but such p…

    Submitted 4 June, 2024; originally announced June 2024.

  12. arXiv:2405.20774  [pdf, other]

    cs.CR cs.AI

    Exploring Backdoor Attacks against Large Language Model-based Decision Making

    Authors: Ruochen Jiao, Shaoyuan Xie, Justin Yue, Takami Sato, Lixu Wang, Yixuan Wang, Qi Alfred Chen, Qi Zhu

    Abstract: Large Language Models (LLMs) have shown significant promise in decision-making tasks when fine-tuned on specific applications, leveraging their inherent common sense and reasoning abilities learned from vast amounts of data. However, these systems are exposed to substantial safety and security risks during the fine-tuning phase. In this work, we propose the first comprehensive framework for Backdo…

    Submitted 27 May, 2024; originally announced May 2024.

    Comments: 27 pages, including main paper, references, and appendix

  13. arXiv:2405.19524  [pdf, other]

    cs.CR cs.AI

    AI Risk Management Should Incorporate Both Safety and Security

    Authors: Xiangyu Qi, Yangsibo Huang, Yi Zeng, Edoardo Debenedetti, Jonas Geiping, Luxi He, Kaixuan Huang, Udari Madhushani, Vikash Sehwag, Weijia Shi, Boyi Wei, Tinghao Xie, Danqi Chen, Pin-Yu Chen, Jeffrey Ding, Ruoxi Jia, Jiaqi Ma, Arvind Narayanan, Weijie J Su, Mengdi Wang, Chaowei Xiao, Bo Li, Dawn Song, Peter Henderson, Prateek Mittal

    Abstract: The exposure of security vulnerabilities in safety-aligned language models, e.g., susceptibility to adversarial attacks, has shed light on the intricate interplay between AI safety and AI security. Although the two disciplines now come together under the overarching goal of AI risk management, they have historically evolved separately, giving rise to differing perspectives. Therefore, in this pape…

    Submitted 29 May, 2024; originally announced May 2024.

  14. arXiv:2405.15374  [pdf, other]

    cs.IR cs.AI cs.CL

    Leveraging Large Language Models for Semantic Query Processing in a Scholarly Knowledge Graph

    Authors: Runsong Jia, Bowen Zhang, Sergio J. Rodríguez Méndez, Pouya G. Omran

    Abstract: The proposed research aims to develop an innovative semantic query processing system that enables users to obtain comprehensive information about research works produced by Computer Science (CS) researchers at the Australian National University (ANU). The system integrates Large Language Models (LLMs) with the ANU Scholarly Knowledge Graph (ASKG), a structured repository of all research-related ar…

    Submitted 24 May, 2024; originally announced May 2024.

    Comments: for the associated repository, see http://w3id.org/kgcp/KGQP

    ACM Class: H.3.3; I.2.4; I.7.5; I.2.7

  15. arXiv:2405.12933  [pdf, other]

    cs.CL cs.AI cs.LG

    Skin-in-the-Game: Decision Making via Multi-Stakeholder Alignment in LLMs

    Authors: Bilgehan Sel, Priya Shanmugasundaram, Mohammad Kachuee, Kun Zhou, Ruoxi Jia, Ming Jin

    Abstract: Large Language Models (LLMs) have shown remarkable capabilities in tasks such as summarization, arithmetic reasoning, and question answering. However, they encounter significant challenges in the domain of moral reasoning and ethical decision-making, especially in complex scenarios with multiple stakeholders. This paper introduces the Skin-in-the-Game (SKIG) framework, aimed at enhancing moral rea…

    Submitted 2 June, 2024; v1 submitted 21 May, 2024; originally announced May 2024.

    Comments: ACL 2024, long paper

  16. arXiv:2405.03875  [pdf, other]

    cs.LG stat.ML

    Rethinking Data Shapley for Data Selection Tasks: Misleads and Merits

    Authors: Jiachen T. Wang, Tianji Yang, James Zou, Yongchan Kwon, Ruoxi Jia

    Abstract: Data Shapley provides a principled approach to data valuation and plays a crucial role in data-centric machine learning (ML) research. Data selection is considered a standard application of Data Shapley. However, its data selection performance has shown to be inconsistent across settings in the literature. This study aims to deepen our understanding of this phenomenon. We introduce a hypothesis te…

    Submitted 6 May, 2024; originally announced May 2024.

    Comments: ICML 2024

  17. arXiv:2405.02989  [pdf, other]

    cs.CR eess.SY

    Defense against Joint Poison and Evasion Attacks: A Case Study of DERMS

    Authors: Zain ul Abdeen, Padmaksha Roy, Ahmad Al-Tawaha, Rouxi Jia, Laura Freeman, Peter Beling, Chen-Ching Liu, Alberto Sangiovanni-Vincentelli, Ming Jin

    Abstract: There is an upward trend of deploying distributed energy resource management systems (DERMS) to control modern power grids. However, DERMS controller communication lines are vulnerable to cyberattacks that could potentially impact operational reliability. While a data-driven intrusion detection system (IDS) can potentially thwart attacks during deployment, also known as the evasion attack, the tra…

    Submitted 5 May, 2024; originally announced May 2024.

  18. arXiv:2405.02774  [pdf, other]

    cs.LG cs.AI cs.CL

    Get more for less: Principled Data Selection for Warming Up Fine-Tuning in LLMs

    Authors: Feiyang Kang, Hoang Anh Just, Yifan Sun, Himanshu Jahagirdar, Yuanzhi Zhang, Rongxing Du, Anit Kumar Sahu, Ruoxi Jia

    Abstract: This work focuses on leveraging and selecting from vast, unlabeled, open data to pre-fine-tune a pre-trained language model. The goal is to minimize the need for costly domain-specific data for subsequent fine-tuning while achieving desired performance levels. While many data selection algorithms have been designed for small-scale applications, rendering them unsuitable for our context, some emerg…

    Submitted 4 May, 2024; originally announced May 2024.

    Comments: Published as a conference paper at ICLR 2024

  19. arXiv:2404.15157  [pdf, other]

    cs.CL cs.AI

    FASTTRACK: Fast and Accurate Fact Tracing for LLMs

    Authors: Si Chen, Feiyang Kang, Ning Yu, Ruoxi Jia

    Abstract: Fact tracing seeks to identify specific training examples that serve as the knowledge source for a given query. Existing approaches to fact tracing rely on assessing the similarity between each training sample and the query along a certain dimension, such as lexical similarity, gradient, or embedding space. However, these methods fall short of effectively distinguishing between samples that are me…

    Submitted 21 April, 2024; originally announced April 2024.

  20. arXiv:2404.02235  [pdf, other]

    cs.LG cs.AI

    Is Exploration All You Need? Effective Exploration Characteristics for Transfer in Reinforcement Learning

    Authors: Jonathan C. Balloch, Rishav Bhagat, Geigh Zollicoffer, Ruoran Jia, Julia Kim, Mark O. Riedl

    Abstract: In deep reinforcement learning (RL) research, there has been a concerted effort to design more efficient and productive exploration methods while solving sparse-reward problems. These exploration methods often share common principles (e.g., improving diversity) and implementation details (e.g., intrinsic reward). Prior work found that non-stationary Markov decision processes (MDPs) require explora…

    Submitted 2 April, 2024; originally announced April 2024.

  21. arXiv:2404.01266  [pdf, other]

    cs.AI cs.CL

    IsoBench: Benchmarking Multimodal Foundation Models on Isomorphic Representations

    Authors: Deqing Fu, Ghazal Khalighinejad, Ollie Liu, Bhuwan Dhingra, Dani Yogatama, Robin Jia, Willie Neiswanger

    Abstract: Current foundation models exhibit impressive capabilities when prompted either with text only or with both image and text inputs. But do their capabilities change depending on the input modality? In this work, we propose $\textbf{IsoBench}$, a benchmark dataset containing problems from four major areas: math, science, algorithms, and games. Each example is presented with multiple…

    Submitted 2 April, 2024; v1 submitted 1 April, 2024; originally announced April 2024.

  22. arXiv:2403.16560  [pdf, other]

    cs.RO

    Active Admittance Control with Iterative Learning for General-Purpose Contact-Rich Manipulation

    Authors: Bo Zhou, Yuyao Sun, Wenbo Liu, Ruixuan Jiao, Fang Fang, Shihua Li

    Abstract: Force interaction is inevitable when robots face multiple operation scenarios. How to make the robot competent in force control for generalized operations such as multi-tasks still remains a challenging problem. Aiming at the reproducibility of interaction tasks and the lack of a generalized force control framework for multi-task scenarios, this paper proposes a novel hybrid control framework base…

    Submitted 25 March, 2024; originally announced March 2024.

  23. arXiv:2403.13031  [pdf, other]

    cs.CR cs.AI cs.CL cs.LG

    RigorLLM: Resilient Guardrails for Large Language Models against Undesired Content

    Authors: Zhuowen Yuan, Zidi Xiong, Yi Zeng, Ning Yu, Ruoxi Jia, Dawn Song, Bo Li

    Abstract: Recent advancements in Large Language Models (LLMs) have showcased remarkable capabilities across various tasks in different domains. However, the emergence of biases and the potential for generating harmful content in LLMs, particularly under malicious inputs, pose significant challenges. Current mitigation strategies, while effective, are not resilient under adversarial attacks. This paper intro…

    Submitted 19 March, 2024; originally announced March 2024.

  24. arXiv:2403.10499  [pdf, other]

    cs.LG cs.AI cs.CL cs.CV

    Benchmarking Zero-Shot Robustness of Multimodal Foundation Models: A Pilot Study

    Authors: Chenguang Wang, Ruoxi Jia, Xin Liu, Dawn Song

    Abstract: Pre-training image representations from the raw text about images enables zero-shot vision transfer to downstream tasks. Through pre-training on millions of samples collected from the internet, multimodal foundation models, such as CLIP, produce state-of-the-art zero-shot results that often reach competitiveness with fully supervised methods without the need for task-specific training. Besides the…

    Submitted 15 March, 2024; originally announced March 2024.

  25. arXiv:2403.04893  [pdf, other]

    cs.AI

    A Safe Harbor for AI Evaluation and Red Teaming

    Authors: Shayne Longpre, Sayash Kapoor, Kevin Klyman, Ashwin Ramaswami, Rishi Bommasani, Borhane Blili-Hamelin, Yangsibo Huang, Aviya Skowron, Zheng-Xin Yong, Suhas Kotha, Yi Zeng, Weiyan Shi, Xianjun Yang, Reid Southen, Alexander Robey, Patrick Chao, Diyi Yang, Ruoxi Jia, Daniel Kang, Sandy Pentland, Arvind Narayanan, Percy Liang, Peter Henderson

    Abstract: Independent evaluation and red teaming are critical for identifying the risks posed by generative AI systems. However, the terms of service and enforcement strategies used by prominent AI companies to deter model misuse have disincentives on good faith safety evaluations. This causes some researchers to fear that conducting such research or releasing their findings will result in account suspensio…

    Submitted 7 March, 2024; originally announced March 2024.

  26. arXiv:2403.02794  [pdf]

    cs.IR cs.AI cs.LG

    A Distance Metric Learning Model Based On Variational Information Bottleneck

    Authors: YaoDan Zhang, Zidong Wang, Ru Jia, Ru Li

    Abstract: In recent years, personalized recommendation technology has flourished and become one of the hot research directions. The matrix factorization model and the metric learning model which proposed successively have been widely studied and applied. The latter uses the Euclidean distance instead of the dot product used by the former to measure the latent space vector. While avoiding the shortcomings of…

    Submitted 5 March, 2024; originally announced March 2024.

  27. arXiv:2403.00485  [pdf, other]

    cs.LG

    A Survey of Geometric Graph Neural Networks: Data Structures, Models and Applications

    Authors: Jiaqi Han, Jiacheng Cen, Liming Wu, Zongzhao Li, Xiangzhe Kong, Rui Jiao, Ziyang Yu, Tingyang Xu, Fandi Wu, Zihe Wang, Hongteng Xu, Zhewei Wei, Yang Liu, Yu Rong, Wenbing Huang

    Abstract: Geometric graph is a special kind of graph with geometric features, which is vital to model many scientific problems. Unlike generic graphs, geometric graphs often exhibit physical symmetries of translations, rotations, and reflections, making them ineffectively processed by current Graph Neural Networks (GNNs). To tackle this issue, researchers proposed a variety of Geometric Graph Neural Network…

    Submitted 1 March, 2024; originally announced March 2024.

  28. arXiv:2402.12714  [pdf, other]

    cs.LG physics.chem-ph

    Equivariant Pretrained Transformer for Unified Geometric Learning on Multi-Domain 3D Molecules

    Authors: Rui Jiao, Xiangzhe Kong, Ziyang Yu, Wenbing Huang, Yang Liu

    Abstract: Pretraining on a large number of unlabeled 3D molecules has showcased superiority in various scientific applications. However, prior efforts typically focus on pretraining models on a specific domain, either proteins or small molecules, missing the opportunity to leverage the cross-domain knowledge. To mitigate this gap, we introduce Equivariant Pretrained Transformer (EPT), a novel pretraining fr…

    Submitted 19 February, 2024; originally announced February 2024.

  29. arXiv:2402.10892  [pdf, other]

    cs.CR cs.CL cs.LG

    Proving membership in LLM pretraining data via data watermarks

    Authors: Johnny Tian-Zheng Wei, Ryan Yixiang Wang, Robin Jia

    Abstract: Detecting whether copyright holders' works were used in LLM pretraining is poised to be an important problem. This work proposes using data watermarks to enable principled detection with only black-box model access, provided that the rightholder contributed multiple training documents and watermarked them before public release. By applying a randomly sampled data watermark, detection can be framed…

    Submitted 10 June, 2024; v1 submitted 16 February, 2024; originally announced February 2024.

  30. arXiv:2402.08922  [pdf, other]

    cs.LG stat.ML

    The Mirrored Influence Hypothesis: Efficient Data Influence Estimation by Harnessing Forward Passes

    Authors: Myeongseob Ko, Feiyang Kang, Weiyan Shi, Ming Jin, Zhou Yu, Ruoxi Jia

    Abstract: Large-scale black-box models have become ubiquitous across numerous applications. Understanding the influence of individual training data sources on predictions made by these models is crucial for improving their trustworthiness. Current influence estimation techniques involve computing gradients for every training point or repeated training on different subsets. These approaches face obvious comp…

    Submitted 19 June, 2024; v1 submitted 13 February, 2024; originally announced February 2024.

    Journal ref: The IEEE/CVF Conference on Computer Vision and Pattern Recognition 2024

  31. arXiv:2402.03992  [pdf, other]

    cs.LG cond-mat.mtrl-sci

    Space Group Constrained Crystal Generation

    Authors: Rui Jiao, Wenbing Huang, Yu Liu, Deli Zhao, Yang Liu

    Abstract: Crystals are the foundation of numerous scientific and industrial applications. While various learning-based approaches have been proposed for crystal generation, existing methods seldom consider the space group constraint which is crucial in describing the geometry of crystals and closely relevant to many desirable properties. However, considering space group constraint is challenging owing to it…

    Submitted 8 April, 2024; v1 submitted 6 February, 2024; originally announced February 2024.

    Comments: ICLR 2024 poster

  32. arXiv:2401.13192  [pdf]

    cs.AI cond-mat.mtrl-sci cs.LG physics.comp-ph

    Generative Design of Crystal Structures by Point Cloud Representations and Diffusion Model

    Authors: Zhelin Li, Rami Mrad, Runxian Jiao, Guan Huang, Jun Shan, Shibing Chu, Yuanping Chen

    Abstract: Efficiently generating energetically stable crystal structures has long been a challenge in material design, primarily due to the immense arrangement of atoms in a crystal lattice. To facilitate the discovery of stable material, we present a framework for the generation of synthesizable materials, leveraging a point cloud representation to encode intricate structural information. At the heart of t…

    Submitted 30 January, 2024; v1 submitted 23 January, 2024; originally announced January 2024.

    Comments: I have submitted to a journal

  33. arXiv:2401.11103  [pdf, other]

    cs.DS cs.LG stat.ML

    Efficient Data Shapley for Weighted Nearest Neighbor Algorithms

    Authors: Jiachen T. Wang, Prateek Mittal, Ruoxi Jia

    Abstract: This work aims to address an open problem in data valuation literature concerning the efficient computation of Data Shapley for weighted $K$ nearest neighbor algorithm (WKNN-Shapley). By considering the accuracy of hard-label KNN with discretized weights as the utility function, we reframe the computation of WKNN-Shapley into a counting problem and introduce a quadratic-time algorithm, presenting…

    Submitted 19 January, 2024; originally announced January 2024.

    Comments: AISTATS 2024 Oral

  34. arXiv:2401.06373  [pdf, other]

    cs.CL cs.AI

    How Johnny Can Persuade LLMs to Jailbreak Them: Rethinking Persuasion to Challenge AI Safety by Humanizing LLMs

    Authors: Yi Zeng, Hongpeng Lin, Jingwen Zhang, Diyi Yang, Ruoxi Jia, Weiyan Shi

    Abstract: Most traditional AI safety research has approached AI models as machines and centered on algorithm-focused attacks developed by security experts. As large language models (LLMs) become increasingly common and competent, non-expert users can also impose risks during daily interactions. This paper introduces a new perspective to jailbreak LLMs as human-like communicators, to explore this overlooked…

    Submitted 23 January, 2024; v1 submitted 12 January, 2024; originally announced January 2024.

    Comments: 14 pages of the main text, qualitative examples of jailbreaks may be harmful in nature

  35. arXiv:2401.03495  [pdf, other]

    eess.IV cs.CV

    Segment Anything Model for Medical Image Segmentation: Current Applications and Future Directions

    Authors: Yichi Zhang, Zhenrong Shen, Rushi Jiao

    Abstract: Due to the inherent flexibility of prompting, foundation models have emerged as the predominant force in the fields of natural language processing and computer vision. The recent introduction of the Segment Anything Model (SAM) signifies a noteworthy expansion of the prompt-driven paradigm into the domain of image segmentation, thereby introducing a plethora of previously unexplored capabilities.…

    Submitted 7 January, 2024; originally announced January 2024.

  36. arXiv:2312.13671  [pdf, other]

    cs.CL cs.LG

    Text2Analysis: A Benchmark of Table Question Answering with Advanced Data Analysis and Unclear Queries

    Authors: Xinyi He, Mengyu Zhou, Xinrun Xu, Xiaojun Ma, Rui Ding, Lun Du, Yan Gao, Ran Jia, Xu Chen, Shi Han, Zejian Yuan, Dongmei Zhang

    Abstract: Tabular data analysis is crucial in various fields, and large language models show promise in this area. However, current research mostly focuses on rudimentary tasks like Text2SQL and TableQA, neglecting advanced analysis like forecasting and chart generation. To address this gap, we developed the Text2Analysis benchmark, incorporating advanced analysis tasks that go beyond the SQL-compatible ope…

    Submitted 21 December, 2023; originally announced December 2023.

    Comments: Accepted by AAAI'2024

  37. arXiv:2312.07514  [pdf, other]

    cs.RO

    Integrated and Lightweight Design of Electro-hydraulic Ankle Prosthesis

    Authors: Yi Wei, Xingjian Wang, Xinyu Tian, Shaoping Wang, Rujun Jia

    Abstract: For lower limb amputees, an active ankle joint prosthesis can provide basic mobility functions. This study focuses on an ankle joint prosthesis system based on the principle of electric-hydraulic actuation. By analyzing the characteristics of human gait cycles and the mechanics of ankle joint movement, a lightweight and integrated ankle joint prosthesis is designed, considering the requirements fo…

    Submitted 12 December, 2023; originally announced December 2023.

    Comments: 8 pages, 21 figures, conference

  38. arXiv:2312.03205  [pdf, other]

    cs.CR

    Who Leaked the Model? Tracking IP Infringers in Accountable Federated Learning

    Authors: Shuyang Yu, Junyuan Hong, Yi Zeng, Fei Wang, Ruoxi Jia, Jiayu Zhou

    Abstract: Federated learning (FL) emerges as an effective collaborative learning framework to coordinate data and computation resources from massive and distributed clients in training. Such collaboration results in non-trivial intellectual property (IP) represented by the model parameters that should be protected and shared by the whole party rather than an individual user. Meanwhile, the distributed natur…

    Submitted 5 December, 2023; originally announced December 2023.

  39. arXiv:2312.00812  [pdf, other]

    cs.AI cs.LG eess.SY

    Empowering Autonomous Driving with Large Language Models: A Safety Perspective

    Authors: Yixuan Wang, Ruochen Jiao, Sinong Simon Zhan, Chengtian Lang, Chao Huang, Zhaoran Wang, Zhuoran Yang, Qi Zhu

    Abstract: Autonomous Driving (AD) encounters significant safety hurdles in long-tail unforeseen driving scenarios, largely stemming from the non-interpretability and poor generalization of the deep neural networks within the AD system, particularly in out-of-distribution and uncertain data. To this end, this paper explores the integration of Large Language Models (LLMs) into AD systems, leveraging their rob…

    Submitted 22 March, 2024; v1 submitted 27 November, 2023; originally announced December 2023.

    Comments: Accepted to LLMAgent workshop @ICLR2024

  40. arXiv:2311.18164  [pdf, other]

    q-fin.GN cs.CE

    The Paradox Of Just-in-Time Liquidity in Decentralized Exchanges: More Providers Can Sometimes Mean Less Liquidity

    Authors: Agostino Capponi, Ruizhe Jia, Brian Zhu

    Abstract: We study Just-in-time (JIT) liquidity provision in blockchain-based decentralized exchanges. A JIT liquidity provider (LP) monitors pending swap orders in public mempools of blockchains to sandwich orders of their choice with liquidity, depositing right before and withdrawing right after the order. Our game-theoretic model with asymmetrically informed agents reveals that a JIT LP's presence does n…

    Submitted 15 February, 2024; v1 submitted 29 November, 2023; originally announced November 2023.

  41. arXiv:2311.17280  [pdf, other]

    cs.CL cs.CV

    Does VLN Pretraining Work with Nonsensical or Irrelevant Instructions?

    Authors: Wang Zhu, Ishika Singh, Yuan Huang, Robin Jia, Jesse Thomason

    Abstract: Data augmentation via back-translation is common when pretraining Vision-and-Language Navigation (VLN) models, even though the generated instructions are noisy. But: does that noise matter? We find that nonsensical or irrelevant language instructions during pretraining can have little effect on downstream performance for both HAMT and VLN-BERT on R2R, and is still better than only using clean, hum…

    Submitted 23 December, 2023; v1 submitted 28 November, 2023; originally announced November 2023.

    Comments: Accepted by O-DRUM @ CVPR 2023

  42. arXiv:2311.13712  [pdf, other]

    cs.AI

    Data Acquisition: A New Frontier in Data-centric AI

    Authors: Lingjiao Chen, Bilge Acun, Newsha Ardalani, Yifan Sun, Feiyang Kang, Hanrui Lyu, Yongchan Kwon, Ruoxi Jia, Carole-Jean Wu, Matei Zaharia, James Zou

    Abstract: As Machine Learning (ML) systems continue to grow, the demand for relevant and comprehensive datasets becomes imperative. There is limited study on the challenges of data acquisition due to ad-hoc processes and lack of consistent methodologies. We first present an investigation of current data marketplaces, revealing lack of platforms offering detailed information about datasets, transparent prici…

    Submitted 22 November, 2023; originally announced November 2023.

  43. arXiv:2311.09612  [pdf, other]

    cs.CV cs.CL

    Efficient End-to-End Visual Document Understanding with Rationale Distillation

    Authors: Wang Zhu, Alekh Agarwal, Mandar Joshi, Robin Jia, Jesse Thomason, Kristina Toutanova

    Abstract: Understanding visually situated language requires interpreting complex layouts of textual and visual elements. Pre-processing tools, such as optical character recognition (OCR), can map document image inputs to textual tokens, then large language models (LLMs) can reason over text. However, such methods have high computational and engineering complexity. Can small pretrained image-to-text models a…

    Submitted 1 April, 2024; v1 submitted 16 November, 2023; originally announced November 2023.

    Comments: Accepted by NAACL 2024

  44. arXiv:2311.09060  [pdf, other]

    cs.CL

    Do Localization Methods Actually Localize Memorized Data in LLMs? A Tale of Two Benchmarks

    Authors: Ting-Yun Chang, Jesse Thomason, Robin Jia

    Abstract: The concept of localization in LLMs is often mentioned in prior work; however, methods for localization have never been systematically and directly evaluated. We propose two complementary benchmarks that evaluate the ability of localization methods to pinpoint LLM components responsible for memorized data. In our INJ benchmark, we actively inject a piece of new information into a small subset of L…

    Submitted 2 April, 2024; v1 submitted 15 November, 2023; originally announced November 2023.

    Comments: Accepted by NAACL 2024

  45. arXiv:2311.02227  [pdf, other]

    cs.LG cs.AI eess.SY

    State-Wise Safe Reinforcement Learning With Pixel Observations

    Authors: Simon Sinong Zhan, Yixuan Wang, Qingyuan Wu, Ruochen Jiao, Chao Huang, Qi Zhu

    Abstract: In the context of safe exploration, Reinforcement Learning (RL) has long grappled with the challenges of balancing the tradeoff between maximizing rewards and minimizing safety violations, particularly in complex environments with contact-rich or non-smooth dynamics, and when dealing with high-dimensional pixel observations. Furthermore, incorporating state-wise safety constraints in the explorati…

    Submitted 11 December, 2023; v1 submitted 3 November, 2023; originally announced November 2023.

    Comments: 10 pages, 5 figures

  46. arXiv:2310.17168  [pdf, other]

    cs.LG stat.ML

    Learning an Inventory Control Policy with General Inventory Arrival Dynamics

    Authors: Sohrab Andaz, Carson Eisenach, Dhruv Madeka, Kari Torkkola, Randy Jia, Dean Foster, Sham Kakade

    Abstract: In this paper we address the problem of learning and backtesting inventory control policies in the presence of general arrival dynamics -- which we term as a quantity-over-time arrivals model (QOT). We also allow for order quantities to be modified as a post-processing step to meet vendor constraints such as order minimum and batch size constraints -- a common practice in real supply chains. To th…

    Submitted 21 January, 2024; v1 submitted 26 October, 2023; originally announced October 2023.

  47. arXiv:2310.17086  [pdf, other]

    cs.LG cs.AI cs.CL

    Transformers Learn Higher-Order Optimization Methods for In-Context Learning: A Study with Linear Models

    Authors: Deqing Fu, Tian-Qi Chen, Robin Jia, Vatsal Sharan

    Abstract: Transformers excel at in-context learning (ICL) -- learning from demonstrations without parameter updates -- but how they do so remains a mystery. Recent work suggests that Transformers may internally run Gradient Descent (GD), a first-order optimization method, to perform ICL. In this paper, we instead demonstrate that Transformers learn to approximate higher-order optimization methods for ICL. F…

    Submitted 31 May, 2024; v1 submitted 25 October, 2023; originally announced October 2023.

  48. arXiv:2310.17044  [pdf, other]

    cs.LG

    Learning to Rank for Active Learning via Multi-Task Bilevel Optimization

    Authors: Zixin Ding, Si Chen, Ruoxi Jia, Yuxin Chen

    Abstract: Active learning is a promising paradigm to reduce the labeling cost by strategically requesting labels to improve model performance. However, existing active learning methods often rely on expensive acquisition function to compute, extensive modeling retraining and multiple rounds of interaction with annotators. To address these limitations, we propose a novel approach for active learning, which a…

    Submitted 25 October, 2023; originally announced October 2023.

  49. arXiv:2310.16096  [pdf, ps, other]

    stat.ML cs.LG

    Contextual Bandits for Evaluating and Improving Inventory Control Policies

    Authors: Dean Foster, Randy Jia, Dhruv Madeka

    Abstract: Solutions to address the periodic review inventory control problem with nonstationary random demand, lost sales, and stochastic vendor lead times typically involve making strong assumptions on the dynamics for either approximation or simulation, and applying methods such as optimization, dynamic programming, or reinforcement learning. Therefore, it is important to analyze and evaluate any inventor…

    Submitted 24 October, 2023; originally announced October 2023.

  50. arXiv:2310.07220  [pdf, other]

    cs.LG

    COPlanner: Plan to Roll Out Conservatively but to Explore Optimistically for Model-Based RL

    Authors: Xiyao Wang, Ruijie Zheng, Yanchao Sun, Ruonan Jia, Wichayaporn Wongkamjan, Huazhe Xu, Furong Huang

    Abstract: Dyna-style model-based reinforcement learning contains two phases: model rollouts to generate sample for policy learning and real environment exploration using current policy for dynamics model learning. However, due to the complex real-world environment, it is inevitable to learn an imperfect dynamics model with model prediction error, which can further mislead policy learning and result in sub-o…

    Submitted 29 December, 2023; v1 submitted 11 October, 2023; originally announced October 2023.

    Comments: 22 pages, 17 figures