Showing 1–50 of 216 results for author: Tang, R

Searching in archive cs.
  1. arXiv:2407.01245  [pdf, other]

    cs.AI cs.CY

    SINKT: A Structure-Aware Inductive Knowledge Tracing Model with Large Language Model

    Authors: Lingyue Fu, Hao Guan, Kounianhua Du, Jianghao Lin, Wei Xia, Weinan Zhang, Ruiming Tang, Yasheng Wang, Yong Yu

    Abstract: Knowledge Tracing (KT) aims to determine whether students will respond correctly to the next question, which is a crucial task in intelligent tutoring systems (ITS). In educational KT scenarios, transductive ID-based methods often face severe data sparsity and cold start problems, where interactions between individual students and questions are sparse, and new questions and concepts consistently a…

    Submitted 1 July, 2024; originally announced July 2024.

  2. arXiv:2406.18825  [pdf, other]

    cs.IR

    ELCoRec: Enhance Language Understanding with Co-Propagation of Numerical and Categorical Features for Recommendation

    Authors: Jizheng Chen, Kounianhua Du, Jianghao Lin, Bo Chen, Ruiming Tang, Weinan Zhang

    Abstract: Large language models have been flourishing in the natural language processing (NLP) domain, and their potential for recommendation has been paid much attention to. Despite the intelligence shown by the recommendation-oriented finetuned models, LLMs struggle to fully understand the user behavior patterns due to their innate weakness in interpreting numerical features and the overhead for long cont…

    Submitted 26 June, 2024; originally announced June 2024.

  3. arXiv:2406.13048  [pdf, other]

    cs.CV

    Head Pose Estimation and 3D Neural Surface Reconstruction via Monocular Camera in situ for Navigation and Safe Insertion into Natural Openings

    Authors: Ruijie Tang, Beilei Cui, Hongliang Ren

    Abstract: As the significance of simulation in medical care and intervention continues to grow, it is anticipated that a simplified and low-cost platform can be set up to execute personalized diagnoses and treatments. 3D Slicer can not only perform medical image analysis and visualization but can also provide surgical navigation and surgical planning functions. In this paper, we have chosen 3D Slicer as our…

    Submitted 18 June, 2024; originally announced June 2024.

    Comments: Accepted by ICBIR 2024

  4. arXiv:2406.12529  [pdf, other]

    cs.IR cs.AI

    LLM4MSR: An LLM-Enhanced Paradigm for Multi-Scenario Recommendation

    Authors: Yuhao Wang, Yichao Wang, Zichuan Fu, Xiangyang Li, Xiangyu Zhao, Huifeng Guo, Ruiming Tang

    Abstract: As the demand for more personalized recommendation grows and a dramatic boom in commercial scenarios arises, the study on multi-scenario recommendation (MSR) has attracted much attention, which uses the data from all scenarios to simultaneously improve their recommendation performance. However, existing methods tend to integrate insufficient scenario knowledge and neglect learning personalized cro…

    Submitted 18 June, 2024; originally announced June 2024.

  5. arXiv:2406.12433  [pdf, other]

    cs.IR

    LLM-enhanced Reranking in Recommender Systems

    Authors: Jingtong Gao, Bo Chen, Xiangyu Zhao, Weiwen Liu, Xiangyang Li, Yichao Wang, Zijian Zhang, Wanyu Wang, Yuyang Ye, Shanru Lin, Huifeng Guo, Ruiming Tang

    Abstract: Reranking is a critical component in recommender systems, playing an essential role in refining the output of recommendation algorithms. Traditional reranking models have focused predominantly on accuracy, but modern applications demand consideration of additional criteria such as diversity and fairness. Existing reranking approaches often fail to harmonize these diverse criteria effectively at th…

    Submitted 20 June, 2024; v1 submitted 18 June, 2024; originally announced June 2024.

  6. arXiv:2406.11030  [pdf, other]

    cs.CL

    FoodieQA: A Multimodal Dataset for Fine-Grained Understanding of Chinese Food Culture

    Authors: Wenyan Li, Xinyu Zhang, Jiaang Li, Qiwei Peng, Raphael Tang, Li Zhou, Weijia Zhang, Guimin Hu, Yifei Yuan, Anders Søgaard, Daniel Hershcovich, Desmond Elliott

    Abstract: Food is a rich and varied dimension of cultural heritage, crucial to both individuals and social groups. To bridge the gap in the literature on the often-overlooked regional diversity in this domain, we introduce FoodieQA, a manually curated, fine-grained image-text dataset capturing the intricate features of food cultures across various regions in China. We evaluate vision-language Models (VLMs)…

    Submitted 16 June, 2024; originally announced June 2024.

  7. arXiv:2406.08482  [pdf, other]

    cs.CV cs.CL

    Words Worth a Thousand Pictures: Measuring and Understanding Perceptual Variability in Text-to-Image Generation

    Authors: Raphael Tang, Xinyu Zhang, Lixinyu Xu, Yao Lu, Wenyan Li, Pontus Stenetorp, Jimmy Lin, Ferhan Ture

    Abstract: Diffusion models are the state of the art in text-to-image generation, but their perceptual variability remains understudied. In this paper, we examine how prompts affect image variability in black-box diffusion-based models. We propose W1KP, a human-calibrated measure of variability in a set of images, bootstrapped from existing image-pair perceptual distances. Current datasets do not cover recen…

    Submitted 12 June, 2024; originally announced June 2024.

    Comments: 13 pages, 11 figures

  8. arXiv:2406.06602  [pdf]

    cs.LG eess.SY stat.AP

    Modeling of New Energy Vehicles' Impact on Urban Ecology Focusing on Behavior

    Authors: Run-Xuan Tang

    Abstract: The surging demand for new energy vehicles is driven by the imperative to conserve energy, reduce emissions, and enhance the ecological ambiance. By conducting behavioral analysis and mining usage patterns of new energy vehicles, particular patterns can be identified. For instance, overloading the battery, operating with low battery power, and driving at excessive speeds can all detrimentally affe…

    Submitted 6 June, 2024; originally announced June 2024.

    Comments: 13 pages

  9. arXiv:2406.02368  [pdf, other]

    cs.IR cs.CL

    Large Language Models Make Sample-Efficient Recommender Systems

    Authors: Jianghao Lin, Xinyi Dai, Rong Shan, Bo Chen, Ruiming Tang, Yong Yu, Weinan Zhang

    Abstract: Large language models (LLMs) have achieved remarkable progress in the field of natural language processing (NLP), demonstrating remarkable abilities in producing text that resembles human language for various tasks. This opens up new opportunities for employing them in recommender systems (RSs). In this paper, we specifically examine the sample efficiency of LLM-enhanced recommender systems, which…

    Submitted 4 June, 2024; originally announced June 2024.

    Comments: Accepted by Frontier of Computer Science

  10. arXiv:2406.02265  [pdf, other]

    cs.CV cs.CL

    Understanding Retrieval Robustness for Retrieval-Augmented Image Captioning

    Authors: Wenyan Li, Jiaang Li, Rita Ramos, Raphael Tang, Desmond Elliott

    Abstract: Recent advances in retrieval-augmented models for image captioning highlight the benefit of retrieving related captions for efficient, lightweight models with strong domain-transfer capabilities. While these models demonstrate the success of retrieval augmentation, retrieval models are still far from perfect in practice: the retrieved information can sometimes mislead the model, resulting in incor…

    Submitted 6 June, 2024; v1 submitted 4 June, 2024; originally announced June 2024.

    Comments: 9 pages, long paper at ACL 2024

  11. arXiv:2406.00012  [pdf, other]

    cs.IR cs.AI

    Extracting Essential and Disentangled Knowledge for Recommendation Enhancement

    Authors: Kounianhua Du, Jizheng Chen, Jianghao Lin, Menghui Zhu, Bo Chen, Shuai Li, Ruiming Tang

    Abstract: Recommender models play a vital role in various industrial scenarios, while often faced with the catastrophic forgetting problem caused by the fast shifting data distribution, e.g., the evolving user interests, click signals fluctuation during sales promotions, etc. To alleviate this problem, a common approach is to reuse knowledge from the historical data. However, preserving the vast and fast-ac…

    Submitted 20 May, 2024; originally announced June 2024.

  12. arXiv:2406.00011  [pdf, other]

    cs.IR cs.AI

    DisCo: Towards Harmonious Disentanglement and Collaboration between Tabular and Semantic Space for Recommendation

    Authors: Kounianhua Du, Jizheng Chen, Jianghao Lin, Yunjia Xi, Hangyu Wang, Xinyi Dai, Bo Chen, Ruiming Tang, Weinan Zhang

    Abstract: Recommender systems play important roles in various applications such as e-commerce, social media, etc. Conventional recommendation methods usually model the collaborative signals within the tabular representation space. Despite the personalization modeling and the efficiency, the latent semantic dependencies are omitted. Methods that introduce semantics into recommendation then emerge, injecting…

    Submitted 4 June, 2024; v1 submitted 20 May, 2024; originally announced June 2024.

  13. arXiv:2405.19010  [pdf, other]

    cs.CL cs.AI cs.IR

    Evaluating the External and Parametric Knowledge Fusion of Large Language Models

    Authors: Hao Zhang, Yuyang Zhang, Xiaoguang Li, Wenxuan Shi, Haonan Xu, Huanshuo Liu, Yasheng Wang, Lifeng Shang, Qun Liu, Yong Liu, Ruiming Tang

    Abstract: Integrating external knowledge into large language models (LLMs) presents a promising solution to overcome the limitations imposed by their antiquated and static parametric memory. Prior studies, however, have tended to over-reliance on external knowledge, underestimating the valuable contributions of an LLMs' intrinsic parametric knowledge. The efficacy of LLMs in blending external and parametric…

    Submitted 29 May, 2024; originally announced May 2024.

    Comments: 15 pages, 3 figures, 3 tables

  14. arXiv:2405.12892  [pdf, other]

    cs.IR cs.LG

    Retrievable Domain-Sensitive Feature Memory for Multi-Domain Recommendation

    Authors: Yuang Zhao, Zhaocheng Du, Qinglin Jia, Linxuan Zhang, Zhenhua Dong, Ruiming Tang

    Abstract: With the increase in the business scale and number of domains in online advertising, multi-domain ad recommendation has become a mainstream solution in the industry. The core of multi-domain recommendation is effectively modeling the commonalities and distinctions among domains. Existing works are dedicated to designing model architectures for implicit multi-domain modeling while overlooking an in…

    Submitted 21 May, 2024; originally announced May 2024.

  15. arXiv:2405.12442  [pdf, other]

    cs.IR cs.AI

    Learning Structure and Knowledge Aware Representation with Large Language Models for Concept Recommendation

    Authors: Qingyao Li, Wei Xia, Kounianhua Du, Qiji Zhang, Weinan Zhang, Ruiming Tang, Yong Yu

    Abstract: Concept recommendation aims to suggest the next concept for learners to study based on their knowledge states and the human knowledge system. While knowledge states can be predicted using knowledge tracing models, previous approaches have not effectively integrated the human knowledge system into the process of designing these educational models. In the era of rapidly evolving Large Language Model…

    Submitted 20 May, 2024; originally announced May 2024.

    Comments: 11 pages, 8 figures

  16. arXiv:2405.10596  [pdf, other]

    cs.IR

    CELA: Cost-Efficient Language Model Alignment for CTR Prediction

    Authors: Xingmei Wang, Weiwen Liu, Xiaolong Chen, Qi Liu, Xu Huang, Defu Lian, Xiangyang Li, Yasheng Wang, Zhenhua Dong, Ruiming Tang

    Abstract: Click-Through Rate (CTR) prediction holds a paramount position in recommender systems. The prevailing ID-based paradigm underperforms in cold-start scenarios due to the skewed distribution of feature frequency. Additionally, the utilization of a single modality fails to exploit the knowledge contained within textual features. Recent efforts have sought to mitigate these challenges by integrating P…

    Submitted 17 June, 2024; v1 submitted 17 May, 2024; originally announced May 2024.

    Comments: 10 pages, 5 figures

    MSC Class: 68T07

  17. arXiv:2405.02355  [pdf, other]

    cs.SE cs.AI

    CodeGRAG: Extracting Composed Syntax Graphs for Retrieval Augmented Cross-Lingual Code Generation

    Authors: Kounianhua Du, Renting Rui, Huacan Chai, Lingyue Fu, Wei Xia, Yasheng Wang, Ruiming Tang, Yong Yu, Weinan Zhang

    Abstract: Utilizing large language models to generate codes has shown promising meaning in software development revolution. Despite the intelligence shown by the general large language models, their specificity in code generation can still be improved due to the syntactic gap and mismatched vocabulary existing among natural language and different programming languages. In addition, programming languages are…

    Submitted 2 May, 2024; originally announced May 2024.

  18. "Ask Me Anything": How Comcast Uses LLMs to Assist Agents in Real Time

    Authors: Scott Rome, Tianwen Chen, Raphael Tang, Luwei Zhou, Ferhan Ture

    Abstract: Customer service is how companies interface with their customers. It can contribute heavily towards the overall customer satisfaction. However, high-quality service can become expensive, creating an incentive to make it as cost efficient as possible and prompting most companies to utilize AI-powered assistants, or "chat bots". On the other hand, human-to-human interaction is still desired by custo…

    Submitted 6 May, 2024; v1 submitted 1 May, 2024; originally announced May 2024.

  19. arXiv:2404.18304  [pdf, other]

    cs.IR cs.AI

    Retrieval-Oriented Knowledge for Click-Through Rate Prediction

    Authors: Huanshuo Liu, Bo Chen, Menghui Zhu, Jianghao Lin, Jiarui Qin, Yang Yang, Hao Zhang, Ruiming Tang

    Abstract: Click-through rate (CTR) prediction plays an important role in personalized recommendations. Recently, sample-level retrieval-based models (e.g., RIM) have achieved remarkable performance by retrieving and aggregating relevant samples. However, their inefficiency at the inference stage makes them impractical for industrial applications. To overcome this issue, this paper proposes a universal plug-…

    Submitted 28 April, 2024; originally announced April 2024.

  20. arXiv:2404.09578  [pdf, other]

    cs.IR

    Recall-Augmented Ranking: Enhancing Click-Through Rate Prediction Accuracy with Cross-Stage Data

    Authors: Junjie Huang, Guohao Cai, Jieming Zhu, Zhenhua Dong, Ruiming Tang, Weinan Zhang, Yong Yu

    Abstract: Click-through rate (CTR) prediction plays an indispensable role in online platforms. Numerous models have been proposed to capture users' shifting preferences by leveraging user behavior sequences. However, these historical sequences often suffer from severe homogeneity and scarcity compared to the extensive item pool. Relying solely on such sequences for user representations is inherently restric…

    Submitted 15 April, 2024; originally announced April 2024.

    Comments: 4 pages, accepted by WWW 2024 Short Track

  21. arXiv:2404.07581  [pdf, other]

    cs.IR

    M-scan: A Multi-Scenario Causal-driven Adaptive Network for Recommendation

    Authors: Jiachen Zhu, Yichao Wang, Jianghao Lin, Jiarui Qin, Ruiming Tang, Weinan Zhang, Yong Yu

    Abstract: We primarily focus on the field of multi-scenario recommendation, which poses a significant challenge in effectively leveraging data from different scenarios to enhance predictions in scenarios with limited data. Current mainstream efforts mainly center around innovative model network architectures, with the aim of enabling the network to implicitly acquire knowledge from diverse scenarios. Howeve…

    Submitted 14 April, 2024; v1 submitted 11 April, 2024; originally announced April 2024.

    Comments: This paper has been accepted by WWW'24

  22. arXiv:2404.07456  [pdf, other]

    cs.AI cs.MA

    WESE: Weak Exploration to Strong Exploitation for LLM Agents

    Authors: Xu Huang, Weiwen Liu, Xiaolong Chen, Xingmei Wang, Defu Lian, Yasheng Wang, Ruiming Tang, Enhong Chen

    Abstract: Recently, large language models (LLMs) have demonstrated remarkable potential as an intelligent agent. However, existing researches mainly focus on enhancing the agent's reasoning or decision-making abilities through well-designed prompt engineering or task-specific fine-tuning, ignoring the procedure of exploration and exploitation. When addressing complex tasks within open-world interactive envi…

    Submitted 10 April, 2024; originally announced April 2024.

  23. arXiv:2404.03881  [pdf, other]

    cs.CL

    A Bi-consolidating Model for Joint Relational Triple Extraction

    Authors: Xiaocheng Luo, Yanping Chen, Ruixue Tang, Ruizhang Huang, Yongbin Qin

    Abstract: Current methods to extract relational triples directly make a prediction based on a possible entity pair in a raw sentence without depending on entity recognition. The task suffers from a serious semantic overlapping problem, in which several relation triples may share one or two entities in a sentence. It is weak to learn discriminative semantic features relevant to a relation triple. In this pap…

    Submitted 5 April, 2024; originally announced April 2024.

  24. arXiv:2404.00702  [pdf, other]

    cs.IR

    Tired of Plugins? Large Language Models Can Be End-To-End Recommenders

    Authors: Wenlin Zhang, Chuhan Wu, Xiangyang Li, Yuhao Wang, Kuicai Dong, Yichao Wang, Xinyi Dai, Xiangyu Zhao, Huifeng Guo, Ruiming Tang

    Abstract: Recommender systems aim to predict user interest based on historical behavioral data. They are mainly designed in sequential pipelines, requiring lots of data to train different sub-systems, and are hard to scale to new domains. Recently, Large Language Models (LLMs) have demonstrated remarkable generalized capabilities, enabling a singular model to tackle diverse recommendation tasks across vario…

    Submitted 7 April, 2024; v1 submitted 31 March, 2024; originally announced April 2024.

  25. arXiv:2403.16378  [pdf, other]

    cs.IR

    Play to Your Strengths: Collaborative Intelligence of Conventional Recommender Models and Large Language Models

    Authors: Yunjia Xi, Weiwen Liu, Jianghao Lin, Chuhan Wu, Bo Chen, Ruiming Tang, Weinan Zhang, Yong Yu

    Abstract: The rise of large language models (LLMs) has opened new opportunities in Recommender Systems (RSs) by enhancing user behavior modeling and content understanding. However, current approaches that integrate LLMs into RSs solely utilize either LLM or conventional recommender model (CRM) to generate final recommendations, without considering which data segments LLM or CRM excel in. To fill in this gap…

    Submitted 24 March, 2024; originally announced March 2024.

  26. arXiv:2403.12660  [pdf, other]

    cs.IR cs.AI

    ERASE: Benchmarking Feature Selection Methods for Deep Recommender Systems

    Authors: Pengyue Jia, Yejing Wang, Zhaocheng Du, Xiangyu Zhao, Yichao Wang, Bo Chen, Wanyu Wang, Huifeng Guo, Ruiming Tang

    Abstract: Deep Recommender Systems (DRS) are increasingly dependent on a large number of feature fields for more precise recommendations. Effective feature selection methods are consequently becoming critical for further enhancing the accuracy and optimizing storage efficiencies to align with the deployment demands. This research area, particularly in the context of DRS, is nascent and faces three core chal…

    Submitted 19 June, 2024; v1 submitted 19 March, 2024; originally announced March 2024.

    Comments: Accepted to KDD 2024

  27. arXiv:2403.05146  [pdf, other]

    cs.CV

    Motion-Guided Dual-Camera Tracker for Low-Cost Skill Evaluation of Gastric Endoscopy

    Authors: Yuelin Zhang, Wanquan Yan, Kim Yan, Chun Ping Lam, Yufu Qiu, Pengyu Zheng, Raymond Shing-Yan Tang, Shing Shin Cheng

    Abstract: Gastric simulators with objective educational feedback have been proven useful for endoscopy training. Existing electronic simulators with feedback are however not commonly adopted due to their high cost. In this work, a motion-guided dual-camera tracker is proposed to provide reliable endoscope tip position feedback at a low cost inside a mechanical simulator for endoscopy skill evaluation, tackl…

    Submitted 20 April, 2024; v1 submitted 8 March, 2024; originally announced March 2024.

  28. arXiv:2403.04299  [pdf, other]

    cs.RO cs.AI

    LitSim: A Conflict-aware Policy for Long-term Interactive Traffic Simulation

    Authors: Haojie Xin, Xiaodong Zhang, Renzhi Tang, Songyang Yan, Qianrui Zhao, Chunze Yang, Wen Cui, Zijiang Yang

    Abstract: Simulation is pivotal in evaluating the performance of autonomous driving systems due to the advantages of high efficiency and low cost compared to on-road testing. Bridging the gap between simulation and the real world requires realistic agent behaviors. However, the existing works have the following shortcomings in achieving this goal: (1) log replay offers realistic scenarios but often leads to…

    Submitted 1 May, 2024; v1 submitted 7 March, 2024; originally announced March 2024.

    Comments: 9 pages, 6 figures, under review

  29. arXiv:2403.03536  [pdf, other]

    cs.IR cs.AI

    Towards Efficient and Effective Unlearning of Large Language Models for Recommendation

    Authors: Hangyu Wang, Jianghao Lin, Bo Chen, Yang Yang, Ruiming Tang, Weinan Zhang, Yong Yu

    Abstract: The significant advancements in large language models (LLMs) give rise to a promising research direction, i.e., leveraging LLMs as recommenders (LLMRec). The efficacy of LLMRec arises from the open-world knowledge and reasoning capabilities inherent in LLMs. LLMRec acquires the recommendation capabilities through instruction tuning based on user interaction data. However, in order to protect user…

    Submitted 30 June, 2024; v1 submitted 6 March, 2024; originally announced March 2024.

    Comments: Accepted by Frontier of Computer Science

  30. arXiv:2403.00108  [pdf, other]

    cs.CR cs.AI cs.CL

    LoRA-as-an-Attack! Piercing LLM Safety Under The Share-and-Play Scenario

    Authors: Hongyi Liu, Zirui Liu, Ruixiang Tang, Jiayi Yuan, Shaochen Zhong, Yu-Neng Chuang, Li Li, Rui Chen, Xia Hu

    Abstract: Fine-tuning LLMs is crucial to enhancing their task-specific performance and ensuring model behaviors are aligned with human preferences. Among various fine-tuning methods, LoRA is popular for its efficiency and ease to use, allowing end-users to easily post and adopt lightweight LoRA modules on open-source platforms to tailor their model for different customization. However, such a handy share-an…

    Submitted 29 February, 2024; originally announced March 2024.

  31. arXiv:2402.11905  [pdf, other]

    cs.CL

    Learning to Edit: Aligning LLMs with Knowledge Editing

    Authors: Yuxin Jiang, Yufei Wang, Chuhan Wu, Wanjun Zhong, Xingshan Zeng, Jiahui Gao, Liangyou Li, Xin Jiang, Lifeng Shang, Ruiming Tang, Qun Liu, Wei Wang

    Abstract: Knowledge editing techniques, aiming to efficiently modify a minor proportion of knowledge in large language models (LLMs) without negatively impacting performance across other inputs, have garnered widespread attention. However, existing methods predominantly rely on memorizing the updated knowledge, impeding LLMs from effectively combining the new knowledge with their inherent knowledge when ans…

    Submitted 5 June, 2024; v1 submitted 19 February, 2024; originally announced February 2024.

    Comments: 17 pages, 8 figures, 9 tables. ACL 2024 main camera-ready version

  32. arXiv:2402.09764  [pdf, other]

    cs.AI

    Aligning Crowd Feedback via Distributional Preference Reward Modeling

    Authors: Dexun Li, Cong Zhang, Kuicai Dong, Derrick Goh Xin Deik, Ruiming Tang, Yong Liu

    Abstract: Deep Reinforcement Learning is widely used for aligning Large Language Models (LLM) with human preference. However, the conventional reward modelling is predominantly dependent on human annotations provided by a select cohort of individuals. Such dependence may unintentionally result in skewed models that reflect the inclinations of these annotators, thereby failing to adequately represent the wid…

    Submitted 30 May, 2024; v1 submitted 15 February, 2024; originally announced February 2024.

  33. arXiv:2402.04678  [pdf, other]

    cs.CL cs.AI cs.LG

    FaithLM: Towards Faithful Explanations for Large Language Models

    Authors: Yu-Neng Chuang, Guanchu Wang, Chia-Yuan Chang, Ruixiang Tang, Shaochen Zhong, Fan Yang, Mengnan Du, Xuanting Cai, Xia Hu

    Abstract: Large Language Models (LLMs) have become proficient in addressing complex tasks by leveraging their extensive internal knowledge and reasoning capabilities. However, the black-box nature of these models complicates the task of explaining their decision-making processes. While recent advancements demonstrate the potential of leveraging LLMs to self-explain their predictions through natural language…

    Submitted 26 June, 2024; v1 submitted 7 February, 2024; originally announced February 2024.

  34. arXiv:2402.02716  [pdf, other]

    cs.AI cs.CL cs.LG

    Understanding the planning of LLM agents: A survey

    Authors: Xu Huang, Weiwen Liu, Xiaolong Chen, Xingmei Wang, Hao Wang, Defu Lian, Yasheng Wang, Ruiming Tang, Enhong Chen

    Abstract: As Large Language Models (LLMs) have shown significant intelligence, the progress to leverage LLMs as planning modules of autonomous agents has attracted more attention. This survey provides the first systematic view of LLM-based agents planning, covering recent works aiming to improve planning ability. We provide a taxonomy of existing works on LLM-Agent planning, which can be categorized into Ta…

    Submitted 4 February, 2024; originally announced February 2024.

    Comments: 9 pages, 2 tables, 2 figures

  35. arXiv:2401.16072  [pdf, other]

    cs.ET physics.optics

    Symmetric silicon microring resonator optical crossbar array for accelerated inference and training in deep learning

    Authors: Rui Tang, Shuhei Ohno, Ken Tanizawa, Kazuhiro Ikeda, Makoto Okano, Kasidit Toprasertpong, Shinichi Takagi, Mitsuru Takenaka

    Abstract: Photonic integrated circuits are emerging as a promising platform for accelerating matrix multiplications in deep learning, leveraging the inherent parallel nature of light. Although various schemes have been proposed and demonstrated to realize such photonic matrix accelerators, the in-situ training of artificial neural networks using photonic accelerators remains challenging due to the difficult…

    Submitted 1 June, 2024; v1 submitted 29 January, 2024; originally announced January 2024.

    Journal ref: Photonics Research, 2024

  36. arXiv:2401.11478  [pdf, other]

    cs.IR

    D2K: Turning Historical Data into Retrievable Knowledge for Recommender Systems

    Authors: Jiarui Qin, Weiwen Liu, Ruiming Tang, Weinan Zhang, Yong Yu

    Abstract: A vast amount of user behavior data is constantly accumulating on today's large recommendation platforms, recording users' various interests and tastes. Preserving knowledge from the old data while new data continually arrives is a vital problem for recommender systems. Existing approaches generally seek to save the knowledge implicitly in the model parameters. However, such a parameter-centric ap…

    Submitted 22 January, 2024; v1 submitted 21 January, 2024; originally announced January 2024.

    Comments: 12 pages, 7 figures

  37. arXiv:2401.08664  [pdf, other]

    cs.AI cs.CL

    Adapting Large Language Models for Education: Foundational Capabilities, Potentials, and Challenges

    Authors: Qingyao Li, Lingyue Fu, Weiming Zhang, Xianyu Chen, Jingwei Yu, Wei Xia, Weinan Zhang, Ruiming Tang, Yong Yu

    Abstract: Online education platforms, leveraging the internet to distribute education resources, seek to provide convenient education but often fall short in real-time communication with students. They often struggle to address the diverse obstacles students encounter throughout their learning journey. Solving the problems encountered by students poses a significant challenge for traditional deep learning m…

    Submitted 26 April, 2024; v1 submitted 27 December, 2023; originally announced January 2024.

    Comments: 31 pages, 5 figures, 1 table

  38. arXiv:2312.16621  [pdf, ps, other]

    cs.IT eess.SP

    Dual-Functional Artificial Noise (DFAN) Aided Robust Covert Communications in Integrated Sensing and Communications

    Authors: Runzhe Tang, Long Yang, Lv Lu, Zheng Zhang, Yuanwei Liu, Jian Chen

    Abstract: This paper investigates covert communications in an integrated sensing and communications system, where a dual-functional base station (called Alice) covertly transmits signals to a covert user (called Bob) while sensing multiple targets, with one of them acting as a potential watcher (called Willie) and maliciously eavesdropping on legitimate communications. To shelter the covert communications,…

    Submitted 27 December, 2023; originally announced December 2023.

    Comments: 13 pages, 11 figures

  39. arXiv:2312.10743  [pdf, other]

    cs.IR

    A Unified Framework for Multi-Domain CTR Prediction via Large Language Models

    Authors: Zichuan Fu, Xiangyang Li, Chuhan Wu, Yichao Wang, Kuicai Dong, Xiangyu Zhao, Mengchen Zhao, Huifeng Guo, Ruiming Tang

    Abstract: Click-Through Rate (CTR) prediction is a crucial task in online recommendation platforms as it involves estimating the probability of user engagement with advertisements or items by clicking on them. Given the availability of various services like online shopping, ride-sharing, food delivery, and professional services on commercial platforms, recommendation systems in these platforms are required…

    Submitted 23 February, 2024; v1 submitted 17 December, 2023; originally announced December 2023.

    Comments: submitted to TOIS

  40. arXiv:2312.02969  [pdf, other]

    cs.CL cs.IR

    Rank-without-GPT: Building GPT-Independent Listwise Rerankers on Open-Source Large Language Models

    Authors: Xinyu Zhang, Sebastian Hofstätter, Patrick Lewis, Raphael Tang, Jimmy Lin

    Abstract: Listwise rerankers based on large language models (LLM) are the zero-shot state-of-the-art. However, current works in this direction all depend on the GPT models, making it a single point of failure in scientific reproducibility. Moreover, it raises the concern that the current research findings only hold for GPT models but not LLM in general. In this work, we lift this pre-condition and build for…

    Submitted 5 December, 2023; originally announced December 2023.

  41. arXiv:2312.02790  [pdf, other]

    cs.SI

    A Low-cost, High-impact Node Injection Approach for Attacking Social Network Alignment

    Authors: Shuyu Jiang, Yunxiang Qiu, Xian Mo, Rui Tang, Wei Wang

    Abstract: Social network alignment (SNA) holds significant importance for various downstream applications, prompting numerous professionals to develop and share SNA tools. Unfortunately, these tools can be exploited by malicious actors to integrate sensitive user information, posing cybersecurity risks. While many researchers have explored attacking SNA (ASNA) through a network modification attack way, prac…

    Submitted 5 December, 2023; originally announced December 2023.

  42. arXiv:2311.18812  [pdf, other]

    cs.CL

    What Do Llamas Really Think? Revealing Preference Biases in Language Model Representations

    Authors: Raphael Tang, Xinyu Zhang, Jimmy Lin, Ferhan Ture

    Abstract: Do large language models (LLMs) exhibit sociodemographic biases, even when they decline to respond? To bypass their refusal to "speak," we study this research question by probing contextualized embeddings and exploring whether this bias is encoded in its latent representations. We propose a logistic Bradley-Terry probe which predicts word pair preferences of LLMs from the words' hidden vectors. We…

    Submitted 30 November, 2023; originally announced November 2023.

    Comments: 10 pages, 5 figures

  43. arXiv:2311.18213  [pdf, other]

    cs.IR cs.AI

    Beyond Two-Tower Matching: Learning Sparse Retrievable Cross-Interactions for Recommendation

    Authors: Liangcai Su, Fan Yan, Jieming Zhu, Xi Xiao, Haoyi Duan, Zhou Zhao, Zhenhua Dong, Ruiming Tang

    Abstract: Two-tower models are a prevalent matching framework for recommendation, which have been widely deployed in industrial applications. The success of two-tower matching attributes to its efficiency in retrieval among a large number of items, since the item tower can be precomputed and used for fast Approximate Nearest Neighbor (ANN) search. However, it suffers two main challenges, including limited f…

    Submitted 29 November, 2023; originally announced November 2023.

    Comments: Accepted by SIGIR 2023. Code will be available at https://reczoo.github.io/SparCode

  44. arXiv:2311.09569  [pdf, other]

    cs.CL cs.AI

    Strings from the Library of Babel: Random Sampling as a Strong Baseline for Prompt Optimisation

    Authors: Yao Lu, Jiayi Wang, Raphael Tang, Sebastian Riedel, Pontus Stenetorp

    Abstract: Recent prompt optimisation approaches use the generative nature of language models to produce prompts -- even rivaling the performance of human-curated prompts. In this paper, we demonstrate that randomly sampling tokens from the model vocabulary as "separators" can be as effective as language models for prompt-style text classification. Our experiments show that random separators are competitiv…

    Submitted 17 April, 2024; v1 submitted 16 November, 2023; originally announced November 2023.

    Comments: Accepted to NAACL 2024. The code is publicly available at https://github.com/yaolu/random-prompt

  45. arXiv:2311.03526  [pdf, other]

    cs.IR

    Towards Automated Negative Sampling in Implicit Recommendation

    Authors: Fuyuan Lyu, Yaochen Hu, Xing Tang, Yingxue Zhang, Ruiming Tang, Xue Liu

    Abstract: Negative sampling methods are vital in implicit recommendation models as they allow us to obtain negative instances from massive unlabeled data. Most existing approaches focus on sampling hard negative samples in various ways. These studies are orthogonal to the recommendation model and implicit datasets. However, such an idea contradicts the common belief in AutoML that the model and dataset shou…

    Submitted 6 November, 2023; originally announced November 2023.

  46. APGL4SR: A Generic Framework with Adaptive and Personalized Global Collaborative Information in Sequential Recommendation

    Authors: Mingjia Yin, Hao Wang, Xiang Xu, Likang Wu, Sirui Zhao, Wei Guo, Yong Liu, Ruiming Tang, Defu Lian, Enhong Chen

    Abstract: The sequential recommendation system has been widely studied for its promising effectiveness in capturing dynamic preferences buried in users' sequential behaviors. Despite the considerable achievements, existing methods usually focus on intra-sequence modeling while overlooking exploiting global collaborative information by inter-sequence modeling, resulting in inferior recommendation performance…

    Submitted 5 November, 2023; originally announced November 2023.

  47. arXiv:2310.19453  [pdf, other]

    cs.IR cs.AI

    FLIP: Towards Fine-grained Alignment between ID-based Models and Pretrained Language Models for CTR Prediction

    Authors: Hangyu Wang, Jianghao Lin, Xiangyang Li, Bo Chen, Chenxu Zhu, Ruiming Tang, Weinan Zhang, Yong Yu

    Abstract: Click-through rate (CTR) prediction plays as a core function module in various personalized online services. The traditional ID-based models for CTR prediction take as inputs the one-hot encoded ID features of tabular modality, which capture the collaborative signals via feature interaction modeling. But the one-hot encoding discards the semantic information conceived in the original feature texts…

    Submitted 7 May, 2024; v1 submitted 30 October, 2023; originally announced October 2023.

    Comments: Under Review

  48. arXiv:2310.18633  [pdf, other]

    cs.LG cs.AI cs.CL

    Setting the Trap: Capturing and Defeating Backdoors in Pretrained Language Models through Honeypots

    Authors: Ruixiang Tang, Jiayi Yuan, Yiming Li, Zirui Liu, Rui Chen, Xia Hu

    Abstract: In the field of natural language processing, the prevalent approach involves fine-tuning pretrained language models (PLMs) using local samples. Recent research has exposed the susceptibility of PLMs to backdoor attacks, wherein the adversaries can embed malicious prediction behaviors by manipulating a few training samples. In this study, our objective is to develop a backdoor-resistant tuning proc…

    Submitted 28 October, 2023; originally announced October 2023.

  49. arXiv:2310.18286  [pdf, other]

    cs.LG stat.AP stat.ML

    Optimal Transport for Treatment Effect Estimation

    Authors: Hao Wang, Zhichao Chen, Jiajun Fan, Haoxuan Li, Tianqiao Liu, Weiming Liu, Quanyu Dai, Yichao Wang, Zhenhua Dong, Ruiming Tang

    Abstract: Estimating conditional average treatment effect from observational data is highly challenging due to the existence of treatment selection bias. Prevalent methods mitigate this issue by aligning distributions of different treatment groups in the latent space. However, there are two critical problems that these methods fail to address: (1) mini-batch sampling effects (MSE), which causes misalignment…

    Submitted 27 October, 2023; originally announced October 2023.

    Comments: Accepted as NeurIPS 2023 Poster

  50. arXiv:2310.13291  [pdf, other]

    cs.CL cs.AI cs.LG

    Assessing Privacy Risks in Language Models: A Case Study on Summarization Tasks

    Authors: Ruixiang Tang, Gord Lueck, Rodolfo Quispe, Huseyin A Inan, Janardhan Kulkarni, Xia Hu

    Abstract: Large language models have revolutionized the field of NLP by achieving state-of-the-art performance on various tasks. However, there is a concern that these models may disclose information in the training data. In this study, we focus on the summarization task and investigate the membership inference (MI) attack: given a sample and black-box access to a model's API, it is possible to determine if…

    Submitted 20 October, 2023; originally announced October 2023.