Showing 1–4 of 4 results for author: Ran, D

  1. arXiv:2406.09321  [pdf, other]

    cs.CR cs.AI cs.CL

    JailbreakEval: An Integrated Toolkit for Evaluating Jailbreak Attempts Against Large Language Models

    Authors: Delong Ran, Jinyuan Liu, Yichen Gong, Jingyi Zheng, Xinlei He, Tianshuo Cong, Anyu Wang

    Abstract: Jailbreak attacks aim to induce Large Language Models (LLMs) to generate harmful responses for forbidden instructions, presenting severe misuse threats to LLMs. Up to now, research into jailbreak attacks and defenses is emerging, however, there is (surprisingly) no consensus on how to evaluate whether a jailbreak attempt is successful. In other words, the methods to assess the harmfulness of an LL…

    Submitted 13 June, 2024; originally announced June 2024.

    Comments: Our code is available at https://github.com/ThuCCSLab/JailbreakEval

  2. arXiv:2404.05188  [pdf, other]

    cs.CR cs.AI cs.CL

    Have You Merged My Model? On The Robustness of Large Language Model IP Protection Methods Against Model Merging

    Authors: Tianshuo Cong, Delong Ran, Zesen Liu, Xinlei He, Jinyuan Liu, Yichen Gong, Qi Li, Anyu Wang, Xiaoyun Wang

    Abstract: Model merging is a promising lightweight model empowerment technique that does not rely on expensive computing devices (e.g., GPUs) or require the collection of specific training data. Instead, it involves editing different upstream model parameters to absorb their downstream task capabilities. However, uncertified model merging can infringe upon the Intellectual Property (IP) rights of the origin…

    Submitted 8 April, 2024; originally announced April 2024.

    Comments: Technical Report

  3. arXiv:2311.05608  [pdf, other]

    cs.CR cs.AI cs.CL

    FigStep: Jailbreaking Large Vision-language Models via Typographic Visual Prompts

    Authors: Yichen Gong, Delong Ran, Jinyuan Liu, Conglei Wang, Tianshuo Cong, Anyu Wang, Sisi Duan, Xiaoyun Wang

    Abstract: Ensuring the safety of artificial intelligence-generated content (AIGC) is a longstanding topic in the artificial intelligence (AI) community, and the safety concerns associated with Large Language Models (LLMs) have been widely investigated. Recently, large vision-language models (VLMs) represent an unprecedented revolution, as they are built upon LLMs but can incorporate additional modalities (e…

    Submitted 13 December, 2023; v1 submitted 9 November, 2023; originally announced November 2023.

    Comments: Technical Report

  4. CoderEval: A Benchmark of Pragmatic Code Generation with Generative Pre-trained Models

    Authors: Hao Yu, Bo Shen, Dezhi Ran, Jiaxin Zhang, Qi Zhang, Yuchi Ma, Guangtai Liang, Ying Li, Qianxiang Wang, Tao Xie

    Abstract: Code generation models based on the pre-training and fine-tuning paradigm have been increasingly attempted by both academia and industry, resulting in well-known industrial models such as Codex, CodeGen, and PanGu-Coder. To evaluate the effectiveness of these models, multiple existing benchmarks are proposed, including only cases of generating a standalone function, i.e., a function that may invok…

    Submitted 23 February, 2024; v1 submitted 1 February, 2023; originally announced February 2023.

    Journal ref: ICSE (2024)