Showing 1–50 of 122 results for author: Pang, L

  1. arXiv:2406.12921  [pdf, other]

    cs.LG

    WindowMixer: Intra-Window and Inter-Window Modeling for Time Series Forecasting

    Authors: Quangao Liu, Ruiqi Li, Maowei Jiang, Wei Yang, Chen Liang, LongLong Pang, Zhuozhang Zou

    Abstract: Time series forecasting (TSF) is crucial in fields like economic forecasting, weather prediction, traffic flow analysis, and public health surveillance. Real-world time series data often include noise, outliers, and missing values, making accurate forecasting challenging. Traditional methods model point-to-point relationships, which limits their ability to capture complex temporal patterns and inc…

    Submitted 14 June, 2024; originally announced June 2024.

  2. arXiv:2406.06891  [pdf, other]

    cs.LG cs.AI

    Tokenize features, enhancing tables: the FT-TABPFN model for tabular classification

    Authors: Quangao Liu, Wei Yang, Chen Liang, Longlong Pang, Zhuozhang Zou

    Abstract: Traditional methods for tabular classification usually rely on supervised learning from scratch, which requires extensive training data to determine model parameters. However, a novel approach called Prior-Data Fitted Networks (TabPFN) has changed this paradigm. TabPFN uses a 12-layer transformer trained on large synthetic datasets to learn universal tabular representations. This method enables fa…

    Submitted 10 June, 2024; originally announced June 2024.

  3. arXiv:2406.06374  [pdf, other]

    cs.RO cs.CV

    Multicam-SLAM: Non-overlapping Multi-camera SLAM for Indirect Visual Localization and Navigation

    Authors: Shenghao Li, Luchao Pang, Xianglong Hu

    Abstract: This paper presents a novel approach to visual simultaneous localization and mapping (SLAM) using multiple RGB-D cameras. The proposed method, Multicam-SLAM, significantly enhances the robustness and accuracy of SLAM systems by capturing more comprehensive spatial information from various perspectives. This method enables the accurate determination of pose relationships among multiple cameras with…

    Submitted 23 June, 2024; v1 submitted 10 June, 2024; originally announced June 2024.

  4. arXiv:2406.05000  [pdf, other]

    cs.CV

    AttnDreamBooth: Towards Text-Aligned Personalized Text-to-Image Generation

    Authors: Lianyu Pang, Jian Yin, Baoquan Zhao, Feize Wu, Fu Lee Wang, Qing Li, Xudong Mao

    Abstract: Recent advances in text-to-image models have enabled high-quality personalized image synthesis of user-provided concepts with flexible textual control. In this work, we analyze the limitations of two primary techniques in text-to-image personalization: Textual Inversion and DreamBooth. When integrating the learned concept into new prompts, Textual Inversion tends to overfit the concept, while Drea…

    Submitted 7 June, 2024; originally announced June 2024.

  5. arXiv:2406.00944  [pdf, other]

    cs.CL cs.AI cs.IR

    Unveil the Duality of Retrieval-Augmented Generation: Theoretical Analysis and Practical Solution

    Authors: Shicheng Xu, Liang Pang, Huawei Shen, Xueqi Cheng

    Abstract: Retrieval-augmented generation (RAG) utilizes retrieved texts to enhance large language models (LLMs). However, studies show that RAG is not consistently effective and can even mislead LLMs due to noisy or incorrect retrieved texts. This suggests that RAG possesses a duality including both benefit and detriment. Although many existing methods attempt to address this issue, they lack a theoretical…

    Submitted 2 June, 2024; originally announced June 2024.

    Comments: 23 pages

  6. arXiv:2405.17998  [pdf, other]

    cs.IR cs.AI cs.CL

    Source Echo Chamber: Exploring the Escalation of Source Bias in User, Data, and Recommender System Feedback Loop

    Authors: Yuqi Zhou, Sunhao Dai, Liang Pang, Gang Wang, Zhenhua Dong, Jun Xu, Ji-Rong Wen

    Abstract: Recently, researchers have uncovered that neural retrieval models prefer AI-generated content (AIGC), called source bias. Compared to active search behavior, recommendation represents another important means of information acquisition, where users are more prone to source bias. Furthermore, delving into the recommendation scenario, as AIGC becomes integrated within the feedback loop involving user…

    Submitted 28 May, 2024; originally announced May 2024.

  7. arXiv:2405.16546  [pdf, other]

    cs.IR cs.CL

    Cocktail: A Comprehensive Information Retrieval Benchmark with LLM-Generated Documents Integration

    Authors: Sunhao Dai, Weihao Liu, Yuqi Zhou, Liang Pang, Rongju Ruan, Gang Wang, Zhenhua Dong, Jun Xu, Ji-Rong Wen

    Abstract: The proliferation of Large Language Models (LLMs) has led to an influx of AI-generated content (AIGC) on the internet, transforming the corpus of Information Retrieval (IR) systems from solely human-written to a coexistence with LLM-generated content. The impact of this surge in AIGC on IR systems remains an open question, with the primary challenge being the lack of a dedicated benchmark for rese…

    Submitted 2 July, 2024; v1 submitted 26 May, 2024; originally announced May 2024.

    Comments: Accepted by Findings of ACL 2024; Datasets Link: https://huggingface.co/IR-Cocktail

  8. arXiv:2405.15349  [pdf, other]

    cs.CL

    UnKE: Unstructured Knowledge Editing in Large Language Models

    Authors: Jingcheng Deng, Zihao Wei, Liang Pang, Hanxing Ding, Huawei Shen, Xueqi Cheng

    Abstract: Recent knowledge editing methods have primarily focused on modifying structured knowledge in large language models, heavily relying on the assumption that structured knowledge is stored as key-value pairs locally in MLP layers or specific neurons. However, this task setting overlooks the fact that a significant portion of real-world knowledge is stored in an unstructured format, characterized by l…

    Submitted 24 May, 2024; originally announced May 2024.

  9. arXiv:2405.01353  [pdf, other]

    cs.CV

    Sparse multi-view hand-object reconstruction for unseen environments

    Authors: Yik Lung Pang, Changjae Oh, Andrea Cavallaro

    Abstract: Recent works in hand-object reconstruction mainly focus on the single-view and dense multi-view settings. On the one hand, single-view methods can leverage learned shape priors to generalise to unseen objects but are prone to inaccuracies due to occlusions. On the other hand, dense multi-view methods are very accurate but cannot easily adapt to unseen objects without further data collection. In co…

    Submitted 2 May, 2024; originally announced May 2024.

    Comments: Camera-ready version. Paper accepted to CVPRW 2024. 8 pages, 7 figures, 1 table

  10. arXiv:2405.00987  [pdf, other]

    cs.LG

    S$^2$AC: Energy-Based Reinforcement Learning with Stein Soft Actor Critic

    Authors: Safa Messaoud, Billel Mokeddem, Zhenghai Xue, Linsey Pang, Bo An, Haipeng Chen, Sanjay Chawla

    Abstract: Learning expressive stochastic policies instead of deterministic ones has been proposed to achieve better stability, sample complexity, and robustness. Notably, in Maximum Entropy Reinforcement Learning (MaxEnt RL), the policy is modeled as an expressive Energy-Based Model (EBM) over the Q-values. However, this formulation requires the estimation of the entropy of such EBMs, which is an open probl…

    Submitted 1 May, 2024; originally announced May 2024.

    Comments: Accepted for publication at ICLR 2024

  11. arXiv:2404.17826  [pdf, other]

    cs.IR

    A Taxation Perspective for Fair Re-ranking

    Authors: Chen Xu, Xiaopeng Ye, Wenjie Wang, Liang Pang, Jun Xu, Tat-Seng Chua

    Abstract: Fair re-ranking aims to redistribute ranking slots among items more equitably to ensure responsibility and ethics. The exploration of redistribution problems has a long history in economics, offering valuable insights for conceptualizing fair re-ranking as a taxation process. Such a formulation provides us with a fresh perspective to re-examine fair re-ranking and inspire the development of new me…

    Submitted 27 April, 2024; originally announced April 2024.

    Comments: Accepted in SIGIR 2024

  12. arXiv:2404.16924  [pdf, other]

    cs.IR cs.CL

    A Survey of Generative Search and Recommendation in the Era of Large Language Models

    Authors: Yongqi Li, Xinyu Lin, Wenjie Wang, Fuli Feng, Liang Pang, Wenjie Li, Liqiang Nie, Xiangnan He, Tat-Seng Chua

    Abstract: With the information explosion on the Web, search and recommendation are foundational infrastructures to satisfying users' information needs. As the two sides of the same coin, both revolve around the same core research problem, matching queries with documents or users with items. In the recent few decades, search and recommendation have experienced synchronous technological paradigm shifts, inclu…

    Submitted 25 April, 2024; originally announced April 2024.

  13. arXiv:2404.11457  [pdf, other]

    cs.IR cs.AI cs.CL

    Unifying Bias and Unfairness in Information Retrieval: A Survey of Challenges and Opportunities with Large Language Models

    Authors: Sunhao Dai, Chen Xu, Shicheng Xu, Liang Pang, Zhenhua Dong, Jun Xu

    Abstract: With the rapid advancement of large language models (LLMs), information retrieval (IR) systems, such as search engines and recommender systems, have undergone a significant paradigm shift. This evolution, while heralding new opportunities, introduces emerging challenges, particularly in terms of biases and unfairness, which may threaten the information ecosystem. In this paper, we present a compre…

    Submitted 17 April, 2024; originally announced April 2024.

  14. arXiv:2404.11129  [pdf, other]

    cs.CV

    Fact: Teaching MLLMs with Faithful, Concise and Transferable Rationales

    Authors: Minghe Gao, Shuang Chen, Liang Pang, Yuan Yao, Jisheng Dang, Wenqiao Zhang, Juncheng Li, Siliang Tang, Yueting Zhuang, Tat-Seng Chua

    Abstract: The remarkable performance of Multimodal Large Language Models (MLLMs) has unequivocally demonstrated their proficient understanding capabilities in handling a wide array of visual tasks. Nevertheless, the opaque nature of their black-box reasoning processes persists as an enigma, rendering them uninterpretable and struggling with hallucination. Their ability to execute intricate compositional rea…

    Submitted 17 April, 2024; originally announced April 2024.

  15. arXiv:2404.09043  [pdf, other]

    cs.CL

    Do LLMs Play Dice? Exploring Probability Distribution Sampling in Large Language Models for Behavioral Simulation

    Authors: Jia Gu, Liang Pang, Huawei Shen, Xueqi Cheng

    Abstract: With the rapid advancement of large language models (LLMs) for handling complex language tasks, an increasing number of studies are employing LLMs as agents to emulate the sequential decision-making processes of humans often represented as Markov decision-making processes (MDPs). The actions in MDPs adhere to specific probability distributions and require iterative sampling. This arouses curiosity…

    Submitted 18 June, 2024; v1 submitted 13 April, 2024; originally announced April 2024.

  16. arXiv:2404.04990  [pdf, other]

    cs.CL

    MLaKE: Multilingual Knowledge Editing Benchmark for Large Language Models

    Authors: Zihao Wei, Jingcheng Deng, Liang Pang, Hanxing Ding, Huawei Shen, Xueqi Cheng

    Abstract: The extensive utilization of large language models (LLMs) underscores the crucial necessity for precise and contemporary knowledge embedded within their intrinsic parameters. Existing research on knowledge editing primarily concentrates on monolingual scenarios, neglecting the complexities presented by multilingual contexts and multi-hop reasoning. To address these challenges, our study introduces…

    Submitted 7 April, 2024; originally announced April 2024.

  17. arXiv:2403.19275  [pdf, other]

    cs.CL cs.AI

    Knowledge Boundary and Persona Dynamic Shape A Better Social Media Agent

    Authors: Junkai Zhou, Liang Pang, Ya Jing, Jia Gu, Huawei Shen, Xueqi Cheng

    Abstract: Constructing personalized and anthropomorphic agents holds significant importance in the simulation of social networks. However, there are still two key problems in existing works: the agent possesses world knowledge that does not belong to its personas, and it cannot eliminate the interference of diverse persona information on current actions, which reduces the personalization and anthropomorphis…

    Submitted 2 April, 2024; v1 submitted 28 March, 2024; originally announced March 2024.

  18. arXiv:2403.17155  [pdf, other]

    cs.CL cs.CR

    Task-Agnostic Detector for Insertion-Based Backdoor Attacks

    Authors: Weimin Lyu, Xiao Lin, Songzhu Zheng, Lu Pang, Haibin Ling, Susmit Jha, Chao Chen

    Abstract: Textual backdoor attacks pose significant security threats. Current detection approaches, typically relying on intermediate feature representation or reconstructing potential triggers, are task-specific and less effective beyond sentence classification, struggling with tasks like question answering and named entity recognition. We introduce TABDet (Task-Agnostic Backdoor Detector), a pioneering ta…

    Submitted 25 March, 2024; originally announced March 2024.

    Comments: Findings of NAACL 2024

  19. arXiv:2403.10340  [pdf, other]

    cs.CV cs.RO

    Thermal-NeRF: Neural Radiance Fields from an Infrared Camera

    Authors: Tianxiang Ye, Qi Wu, Junyuan Deng, Guoqing Liu, Liu Liu, Songpengcheng Xia, Liang Pang, Wenxian Yu, Ling Pei

    Abstract: In recent years, Neural Radiance Fields (NeRFs) have demonstrated significant potential in encoding highly-detailed 3D geometry and environmental appearance, positioning themselves as a promising alternative to traditional explicit representation for 3D scene reconstruction. However, the predominant reliance on RGB imaging presupposes ideal lighting conditions: a premise frequently unmet in roboti…

    Submitted 15 March, 2024; originally announced March 2024.

  20. arXiv:2403.07805  [pdf, other]

    cs.CL cs.AI

    Beyond Memorization: The Challenge of Random Memory Access in Language Models

    Authors: Tongyao Zhu, Qian Liu, Liang Pang, Zhengbao Jiang, Min-Yen Kan, Min Lin

    Abstract: Recent developments in Language Models (LMs) have shown their effectiveness in NLP tasks, particularly in knowledge-intensive tasks. However, the mechanisms underlying knowledge storage and memory access within their parameters remain elusive. In this paper, we investigate whether a generative LM (e.g., GPT-2) is able to access its memory sequentially or randomly. Through carefully-designed synthe…

    Submitted 13 March, 2024; v1 submitted 12 March, 2024; originally announced March 2024.

    Comments: 8 pages, 4 figures; fixed typos

  21. arXiv:2403.06013  [pdf, other]

    cs.LG cs.CV

    Are Classification Robustness and Explanation Robustness Really Strongly Correlated? An Analysis Through Input Loss Landscape

    Authors: Tiejin Chen, Wenwang Huang, Linsey Pang, Dongsheng Luo, Hua Wei

    Abstract: This paper delves into the critical area of deep learning robustness, challenging the conventional belief that classification robustness and explanation robustness in image classification systems are inherently correlated. Through a novel evaluation approach leveraging clustering for efficient assessment of explanation robustness, we demonstrate that enhancing explanation robustness does not neces…

    Submitted 9 March, 2024; originally announced March 2024.

  22. arXiv:2403.04260  [pdf, other]

    cs.IR cs.CL cs.LG

    Can Small Language Models be Good Reasoners for Sequential Recommendation?

    Authors: Yuling Wang, Changxin Tian, Binbin Hu, Yanhua Yu, Ziqi Liu, Zhiqiang Zhang, Jun Zhou, Liang Pang, Xiao Wang

    Abstract: Large language models (LLMs) open up new horizons for sequential recommendations, owing to their remarkable language comprehension and generation capabilities. However, there are still numerous challenges that should be addressed to successfully implement sequential recommendations empowered by LLMs. Firstly, user behavior patterns are often complex, and relying solely on one-step reasoning from L…

    Submitted 28 March, 2024; v1 submitted 7 March, 2024; originally announced March 2024.

    Comments: Accepted by TheWebConf (WWW) 2024

  23. arXiv:2402.18150  [pdf, other]

    cs.CL cs.AI cs.IR

    Unsupervised Information Refinement Training of Large Language Models for Retrieval-Augmented Generation

    Authors: Shicheng Xu, Liang Pang, Mo Yu, Fandong Meng, Huawei Shen, Xueqi Cheng, Jie Zhou

    Abstract: Retrieval-augmented generation (RAG) enhances large language models (LLMs) by incorporating additional information from retrieval. However, studies have shown that LLMs still face challenges in effectively using the retrieved information, even ignoring it or being misled by it. The key reason is that the training of LLMs does not clearly make LLMs learn how to utilize input retrieved texts with va…

    Submitted 11 June, 2024; v1 submitted 28 February, 2024; originally announced February 2024.

    Comments: ACL 2024 Main

  24. arXiv:2402.15865  [pdf, other]

    cs.CV eess.IV

    HIR-Diff: Unsupervised Hyperspectral Image Restoration Via Improved Diffusion Models

    Authors: Li Pang, Xiangyu Rui, Long Cui, Hongzhong Wang, Deyu Meng, Xiangyong Cao

    Abstract: Hyperspectral image (HSI) restoration aims at recovering clean images from degraded observations and plays a vital role in downstream tasks. Existing model-based methods have limitations in accurately modeling the complex image characteristics with handcraft priors, and deep learning-based methods suffer from poor generalization ability. To alleviate these issues, this paper proposes an unsupervis…

    Submitted 24 February, 2024; originally announced February 2024.

  25. arXiv:2402.15183  [pdf, other]

    cs.LG cs.AI

    GraphEdit: Large Language Models for Graph Structure Learning

    Authors: Zirui Guo, Lianghao Xia, Yanhua Yu, Yuling Wang, Zixuan Yang, Wei Wei, Liang Pang, Tat-Seng Chua, Chao Huang

    Abstract: Graph Structure Learning (GSL) focuses on capturing intrinsic dependencies and interactions among nodes in graph-structured data by generating novel graph structures. Graph Neural Networks (GNNs) have emerged as promising GSL solutions, utilizing recursive message passing to encode node-wise inter-dependencies. However, many existing GSL methods heavily depend on explicit graph structural informat…

    Submitted 5 March, 2024; v1 submitted 23 February, 2024; originally announced February 2024.

  26. arXiv:2402.14272  [pdf, other]

    cs.CL

    Qsnail: A Questionnaire Dataset for Sequential Question Generation

    Authors: Yan Lei, Liang Pang, Yuanzhuo Wang, Huawei Shen, Xueqi Cheng

    Abstract: The questionnaire is a professional research methodology used for both qualitative and quantitative analysis of human opinions, preferences, attitudes, and behaviors. However, designing and evaluating questionnaires demands significant effort due to their intricate and complex structure. Questionnaires entail a series of questions that must conform to intricate constraints involving the questions,…

    Submitted 21 February, 2024; originally announced February 2024.

    Comments: Accepted to the LREC-COLING 2024

  27. arXiv:2402.13576  [pdf, other]

    cs.CV cs.IR

    Improving Video Corpus Moment Retrieval with Partial Relevance Enhancement

    Authors: Danyang Hou, Liang Pang, Huawei Shen, Xueqi Cheng

    Abstract: Video Corpus Moment Retrieval (VCMR) is a new video retrieval task aimed at retrieving a relevant moment from a large corpus of untrimmed videos using a text query. The relevance between the video and query is partial, mainly evident in two aspects: (1) Scope: The untrimmed video contains many frames, but not all are relevant to the query. Strong relevance is typically observed only within the rel…

    Submitted 23 April, 2024; v1 submitted 21 February, 2024; originally announced February 2024.

    Comments: camera-ready version of ACM ICMR 2024

  28. arXiv:2402.13566  [pdf, other]

    cs.CV cs.IR

    Event-aware Video Corpus Moment Retrieval

    Authors: Danyang Hou, Liang Pang, Huawei Shen, Xueqi Cheng

    Abstract: Video Corpus Moment Retrieval (VCMR) is a practical video retrieval task focused on identifying a specific moment within a vast corpus of untrimmed videos using the natural language query. Existing methods for VCMR typically rely on frame-aware video retrieval, calculating similarities between the query and video frames to rank videos based on maximum frame similarity. However, this approach overlo…

    Submitted 21 February, 2024; originally announced February 2024.

    Comments: 11 pages, 5 figures, 9 tables

  29. arXiv:2402.13048  [pdf, other]

    cs.CL

    Stable Knowledge Editing in Large Language Models

    Authors: Zihao Wei, Liang Pang, Hanxing Ding, Jingcheng Deng, Huawei Shen, Xueqi Cheng

    Abstract: Efficient knowledge editing of large language models is crucial for replacing obsolete information or incorporating specialized knowledge on a large scale. However, previous methods implicitly assume that knowledge is localized and isolated within the model, an assumption that oversimplifies the interconnected nature of model knowledge. The premise of localization results in an incomplete knowledg…

    Submitted 20 February, 2024; originally announced February 2024.

  30. arXiv:2402.10612  [pdf, other]

    cs.CL

    Retrieve Only When It Needs: Adaptive Retrieval Augmentation for Hallucination Mitigation in Large Language Models

    Authors: Hanxing Ding, Liang Pang, Zihao Wei, Huawei Shen, Xueqi Cheng

    Abstract: Hallucinations pose a significant challenge for the practical implementation of large language models (LLMs). The utilization of parametric knowledge in generating factual content is constrained by the limited knowledge of LLMs, potentially resulting in internal hallucinations. While incorporating external information can help fill knowledge gaps, it also introduces the risk of irrelevant informat…

    Submitted 16 February, 2024; originally announced February 2024.

  31. arXiv:2402.02764  [pdf, other]

    cs.IR cs.AI cs.CL

    List-aware Reranking-Truncation Joint Model for Search and Retrieval-augmented Generation

    Authors: Shicheng Xu, Liang Pang, Jun Xu, Huawei Shen, Xueqi Cheng

    Abstract: The results of information retrieval (IR) are usually presented in the form of a ranked list of candidate documents, such as web search for humans and retrieval-augmented generation for large language models (LLMs). List-aware retrieval aims to capture the list-level contextual features to return a better list, mainly including reranking and truncation. Reranking finely re-scores the documents in…

    Submitted 5 February, 2024; originally announced February 2024.

    Comments: Accepted by WWW 2024

  32. arXiv:2312.15905  [pdf, other]

    cs.CV

    Cross Initialization for Personalized Text-to-Image Generation

    Authors: Lianyu Pang, Jian Yin, Haoran Xie, Qiping Wang, Qing Li, Xudong Mao

    Abstract: Recently, there has been a surge in face personalization techniques, benefiting from the advanced capabilities of pretrained text-to-image diffusion models. Among these, a notable method is Textual Inversion, which generates personalized images by inverting given images into textual embeddings. However, methods based on Textual Inversion still struggle with balancing the trade-off between reconstr…

    Submitted 26 December, 2023; originally announced December 2023.

  33. arXiv:2312.01052  [pdf, other]

    cs.IR cs.CL

    SCTc-TE: A Comprehensive Formulation and Benchmark for Temporal Event Forecasting

    Authors: Yunshan Ma, Chenchen Ye, Zijian Wu, Xiang Wang, Yixin Cao, Liang Pang, Tat-Seng Chua

    Abstract: Temporal complex event forecasting aims to predict the future events given the observed events from history. Most formulations of temporal complex event are unstructured or without extensive temporal information, resulting in inferior representations and limited forecasting capabilities. To bridge these gaps, we innovatively introduce the formulation of Structured, Complex, and Time-complete tempo…

    Submitted 3 April, 2024; v1 submitted 2 December, 2023; originally announced December 2023.

    Comments: pre-print, 6 figures, 7 tables

    ACM Class: H.3.0

  34. arXiv:2311.14084  [pdf, other]

    cs.IR cs.AI cs.CV

    Invisible Relevance Bias: Text-Image Retrieval Models Prefer AI-Generated Images

    Authors: Shicheng Xu, Danyang Hou, Liang Pang, Jingcheng Deng, Jun Xu, Huawei Shen, Xueqi Cheng

    Abstract: With the advancement of generation models, AI-generated content (AIGC) is becoming more realistic, flooding the Internet. A recent study suggests that this phenomenon causes source bias in text retrieval for web search. Specifically, neural retrieval models tend to rank generated texts higher than human-written texts. In this paper, we extend the study of this bias to cross-modal retrieval. Firstl…

    Submitted 26 May, 2024; v1 submitted 23 November, 2023; originally announced November 2023.

    Comments: Accepted by SIGIR 2024

  35. arXiv:2311.13614  [pdf, other]

    cs.CV cs.AI

    HalluciDoctor: Mitigating Hallucinatory Toxicity in Visual Instruction Data

    Authors: Qifan Yu, Juncheng Li, Longhui Wei, Liang Pang, Wentao Ye, Bosheng Qin, Siliang Tang, Qi Tian, Yueting Zhuang

    Abstract: Multi-modal Large Language Models (MLLMs) tuned on machine-generated instruction-following data have demonstrated remarkable performance in various multi-modal understanding and generation tasks. However, the hallucinations inherent in machine-generated data, which could lead to hallucinatory outputs in MLLMs, remain under-explored. This work aims to investigate various hallucinations (i.e., objec…

    Submitted 24 March, 2024; v1 submitted 21 November, 2023; originally announced November 2023.

    Comments: Accepted by CVPR 2024

  36. arXiv:2311.12890  [pdf, other]

    cs.CV

    De-fine: Decomposing and Refining Visual Programs with Auto-Feedback

    Authors: Minghe Gao, Juncheng Li, Hao Fei, Liang Pang, Wei Ji, Guoming Wang, Wenqiao Zhang, Siliang Tang, Yueting Zhuang

    Abstract: Visual programming, a modular and generalizable paradigm, integrates different modules and Python operators to solve various vision-language tasks. Unlike end-to-end models that need task-specific data, it advances in performing visual processing and reasoning in an unsupervised manner. Current visual programming methods generate programs in a single pass for each task where the ability to evaluat…

    Submitted 25 November, 2023; v1 submitted 21 November, 2023; originally announced November 2023.

  37. arXiv:2311.07445  [pdf, other]

    cs.CL cs.AI

    Think Before You Speak: Cultivating Communication Skills of Large Language Models via Inner Monologue

    Authors: Junkai Zhou, Liang Pang, Huawei Shen, Xueqi Cheng

    Abstract: The emergence of large language models (LLMs) further improves the capabilities of open-domain dialogue systems and can generate fluent, coherent, and diverse responses. However, LLMs still lack a crucial ability: communication skills. This limitation renders them more like information seeking tools rather than anthropomorphic chatbots. Communication skills, such as topic transition, proactively a…

    Submitted 15 March, 2024; v1 submitted 13 November, 2023; originally announced November 2023.

    Comments: Accepted by NAACL 2024 Findings

  38. arXiv:2311.07054  [pdf, other]

    cs.IR

    Do LLMs Implicitly Exhibit User Discrimination in Recommendation? An Empirical Study

    Authors: Chen Xu, Wenjie Wang, Yuxin Li, Liang Pang, Jun Xu, Tat-Seng Chua

    Abstract: Recently, Large Language Models (LLMs) have enhanced user interaction, enabling seamless information retrieval and recommendations. However, concerns emerge as these LLMs have shown tendencies to display discrimination related to users' sensitive characteristics (such as gender), leading to explicit user unfairness. Furthermore, our analysis uncovers a more discreet variant of bias in LLMs, define…

    Submitted 12 November, 2023; originally announced November 2023.

    Comments: No

  39. arXiv:2311.04477  [pdf, other]

    cs.RO

    PLV-IEKF: Consistent Visual-Inertial Odometry using Points, Lines, and Vanishing Points

    Authors: Tong Hua, Tao Li, Liang Pang, Guoqing Liu, Wencheng Xuanyuan, Chang Shu, Ling Pei

    Abstract: In this paper, we propose an Invariant Extended Kalman Filter (IEKF) based Visual-Inertial Odometry (VIO) using multiple features in man-made environments. Conventional EKF-based VIO usually suffers from system inconsistency and angular drift that naturally occurs in feature-based methods. However, in man-made environments, notable structural regularities, such as lines and vanishing points, offer…

    Submitted 8 November, 2023; originally announced November 2023.

    Comments: ROBIO 2023

  40. arXiv:2311.01666  [pdf, other]

    cs.IR cs.CL

    Plot Retrieval as an Assessment of Abstract Semantic Association

    Authors: Shicheng Xu, Liang Pang, Jiangnan Li, Mo Yu, Fandong Meng, Huawei Shen, Xueqi Cheng, Jie Zhou

    Abstract: Retrieving relevant plots from the book for a query is a critical task, which can improve the reading experience and efficiency of readers. Readers usually only give an abstract and vague description as the query based on their own understanding, summaries, or speculations of the plot, which requires the retrieval model to have a strong ability to estimate the abstract semantic associations betwee…

    Submitted 2 November, 2023; originally announced November 2023.

  41. arXiv:2310.20501  [pdf, other]

    cs.IR cs.AI cs.CL

    LLMs may Dominate Information Access: Neural Retrievers are Biased Towards LLM-Generated Texts

    Authors: Sunhao Dai, Yuqi Zhou, Liang Pang, Weihao Liu, Xiaolin Hu, Yong Liu, Xiao Zhang, Gang Wang, Jun Xu

    Abstract: Recently, the emergence of large language models (LLMs) has revolutionized the paradigm of information retrieval (IR) applications, especially in web search. With their remarkable capabilities in generating human-like texts, LLMs have created enormous texts on the Internet. As a result, IR systems in the LLMs era are facing a new challenge: the indexed documents now are not only written by human b…

    Submitted 14 January, 2024; v1 submitted 31 October, 2023; originally announced October 2023.

  42. arXiv:2310.14480  [pdf, other]

    cs.LG

    Attention-Enhancing Backdoor Attacks Against BERT-based Models

    Authors: Weimin Lyu, Songzhu Zheng, Lu Pang, Haibin Ling, Chao Chen

    Abstract: Recent studies have revealed that Backdoor Attacks can threaten the safety of natural language processing (NLP) models. Investigating the strategies of backdoor attacks will help to understand the model's vulnerability. Most existing textual backdoor attacks focus on generating stealthy triggers or modifying model weights. In this paper, we directly target the interior structure of neural…

    Submitted 24 October, 2023; v1 submitted 22 October, 2023; originally announced October 2023.

    Comments: Findings of EMNLP 2023

  43. arXiv:2310.10567  [pdf, other]

    cs.CL

    RegaVAE: A Retrieval-Augmented Gaussian Mixture Variational Auto-Encoder for Language Modeling

    Authors: Jingcheng Deng, Liang Pang, Huawei Shen, Xueqi Cheng

    Abstract: Retrieval-augmented language models show promise in addressing issues like outdated information and hallucinations in language models (LMs). However, current research faces two main problems: 1) determining what information to retrieve, and 2) effectively combining retrieved information during generation. We argue that valuable retrieved information should not only be related to the current source…

    Submitted 23 October, 2023; v1 submitted 16 October, 2023; originally announced October 2023.

    Comments: Accepted to the Findings of EMNLP 2023

  44. arXiv:2310.08943  [pdf, other]

    cs.CL

    Multi-level Adaptive Contrastive Learning for Knowledge Internalization in Dialogue Generation

    Authors: Chenxu Yang, Zheng Lin, Lanrui Wang, Chong Tian, Liang Pang, Jiangnan Li, Qirong Ho, Yanan Cao, Weiping Wang

    Abstract: Knowledge-grounded dialogue generation aims to mitigate the issue of text degeneration by incorporating external knowledge to supplement the context. However, the model often fails to internalize this information into responses in a human-like manner. Instead, it simply inserts segments of the provided knowledge into generic responses. As a result, the generated responses tend to be tedious, incoh…

    Submitted 17 October, 2023; v1 submitted 13 October, 2023; originally announced October 2023.

    Comments: Accepted by EMNLP 2023

  45. arXiv:2306.17797  [pdf, other]

    cs.CV eess.IV

    HIDFlowNet: A Flow-Based Deep Network for Hyperspectral Image Denoising

    Authors: Li Pang, Weizhen Gu, Xiangyong Cao, Xiangyu Rui, Jiangjun Peng, Shuang Xu, Gang Yang, Deyu Meng

    Abstract: Hyperspectral image (HSI) denoising is essentially ill-posed since a noisy HSI can be degraded from multiple clean HSIs. However, current deep learning-based approaches ignore this fact and restore the clean image with deterministic mapping (i.e., the network receives a noisy HSI and outputs a clean HSI). To alleviate this issue, this paper proposes a flow-based HSI denoising network (HIDFlowNet)…

    Submitted 20 June, 2023; originally announced June 2023.

    Comments: 10 pages, 8 figures

  46. arXiv:2305.15004  [pdf, other]

    cs.CL

    LLMDet: A Third Party Large Language Models Generated Text Detection Tool

    Authors: Kangxi Wu, Liang Pang, Huawei Shen, Xueqi Cheng, Tat-Seng Chua

    Abstract: Generated texts from large language models (LLMs) are remarkably close to high-quality human-authored text, raising concerns about their potential misuse in spreading false information and academic misconduct. Consequently, there is an urgent need for a highly practical detection tool capable of accurately identifying the source of a given text. However, existing detection tools typically rely on…

    Submitted 3 November, 2023; v1 submitted 24 May, 2023; originally announced May 2023.

    Comments: Accepted to the Findings of EMNLP 2023

  47. arXiv:2305.12785  [pdf, other]

    cs.CL

    MacLaSa: Multi-Aspect Controllable Text Generation via Efficient Sampling from Compact Latent Space

    Authors: Hanxing Ding, Liang Pang, Zihao Wei, Huawei Shen, Xueqi Cheng, Tat-Seng Chua

    Abstract: Multi-aspect controllable text generation aims to generate fluent sentences that possess multiple desired attributes simultaneously. Traditional methods either combine many operators in the decoding stage, often with costly iteration or search in the discrete text space, or train separate controllers for each aspect, resulting in a degeneration of text quality due to the discrepancy between differ…

    Submitted 17 October, 2023; v1 submitted 22 May, 2023; originally announced May 2023.

    Comments: Accepted to the Findings of EMNLP 2023

  48. arXiv:2305.11130  [pdf, other]

    cs.AI

    SimOAP: Improve Coherence and Consistency in Persona-based Dialogue Generation via Over-sampling and Post-evaluation

    Authors: Junkai Zhou, Liang Pang, Huawei Shen, Xueqi Cheng

    Abstract: Language models trained on large-scale corpora can generate remarkably fluent results in open-domain dialogue. However, for the persona-based dialogue generation task, consistency and coherence are also key factors, which are great challenges for language models. Existing works mainly focus on valuable data filtering, model structure modifying, or objective function designing, while their improvem…

    Submitted 20 May, 2023; v1 submitted 18 May, 2023; originally announced May 2023.

    Comments: Accepted by ACL 2023 Main

  49. arXiv:2305.11052  [pdf, other]

    cs.IR cs.CL

    BERM: Training the Balanced and Extractable Representation for Matching to Improve Generalization Ability of Dense Retrieval

    Authors: Shicheng Xu, Liang Pang, Huawei Shen, Xueqi Cheng

    Abstract: Dense retrieval has shown promise in the first-stage retrieval process when trained on in-domain labeled datasets. However, previous studies have found that dense retrieval is hard to generalize to unseen domains due to its weak modeling of domain-invariant and interpretable feature (i.e., matching signal between two texts, which is the essence of information retrieval). In this paper, we propose…

    Submitted 18 May, 2023; originally announced May 2023.

    Comments: Accepted by ACL 2023 Main

  50. arXiv:2305.10925  [pdf, other]

    cs.CV eess.IV

    Unsupervised Hyperspectral Pansharpening via Low-rank Diffusion Model

    Authors: Xiangyu Rui, Xiangyong Cao, Li Pang, Zeyu Zhu, Zongsheng Yue, Deyu Meng

    Abstract: Hyperspectral pansharpening is a process of merging a high-resolution panchromatic (PAN) image and a low-resolution hyperspectral (LRHS) image to create a single high-resolution hyperspectral (HRHS) image. Existing Bayesian-based HS pansharpening methods require designing handcraft image prior to characterize the image features, and deep learning-based HS pansharpening methods usually require a la…

    Submitted 19 November, 2023; v1 submitted 18 May, 2023; originally announced May 2023.