Showing 1–50 of 90 results for author: Hua, W

Searching in archive cs.
  1. arXiv:2407.01016  [pdf, other]

    cs.CV

    SOOD++: Leveraging Unlabeled Data to Boost Oriented Object Detection

    Authors: Dingkang Liang, Wei Hua, Chunsheng Shi, Zhikang Zou, Xiaoqing Ye, Xiang Bai

    Abstract: Semi-supervised object detection (SSOD), leveraging unlabeled data to boost object detectors, has become a hot topic recently. However, existing SSOD approaches mainly focus on horizontal objects, leaving multi-oriented objects common in aerial images unexplored. At the same time, the annotation cost of multi-oriented objects is significantly higher than that of their horizontal counterparts. Ther…

    Submitted 1 July, 2024; originally announced July 2024.

  2. arXiv:2406.14711  [pdf, other]

    cs.CL cs.AI cs.MA

    MultiAgent Collaboration Attack: Investigating Adversarial Attacks in Large Language Model Collaborations via Debate

    Authors: Alfonso Amayuelas, Xianjun Yang, Antonis Antoniades, Wenyue Hua, Liangming Pan, William Wang

    Abstract: Large Language Models (LLMs) have shown exceptional results on current benchmarks when working individually. The advancement in their capabilities, along with a reduction in parameter size and inference times, has facilitated the use of these models as agents, enabling interactions among multiple models to execute complex tasks. Such collaborations offer several advantages, including the use of sp…

    Submitted 26 June, 2024; v1 submitted 20 June, 2024; originally announced June 2024.

  3. arXiv:2406.04428  [pdf, other]

    cs.CL cs.AI

    MoralBench: Moral Evaluation of LLMs

    Authors: Jianchao Ji, Yutong Chen, Mingyu Jin, Wujiang Xu, Wenyue Hua, Yongfeng Zhang

    Abstract: In the rapidly evolving field of artificial intelligence, large language models (LLMs) have emerged as powerful tools for a myriad of applications, from natural language processing to decision-making support systems. However, as these models become increasingly integrated into societal frameworks, the imperative to ensure they operate within ethical and moral boundaries has never been more critica…

    Submitted 6 June, 2024; originally announced June 2024.

  4. arXiv:2406.02787  [pdf, other]

    cs.CL cs.AI cs.LG

    Disentangling Logic: The Role of Context in Large Language Model Reasoning Capabilities

    Authors: Wenyue Hua, Kaijie Zhu, Lingyao Li, Lizhou Fan, Shuhang Lin, Mingyu Jin, Haochen Xue, Zelong Li, Jindong Wang, Yongfeng Zhang

    Abstract: This study intends to systematically disentangle pure logic reasoning and text understanding by investigating the contrast across abstract and contextualized logical problems from a comprehensive set of domains. We explore whether LLMs demonstrate genuine reasoning capabilities across various domains when the underlying logical structure remains constant. We focus on two main questions (1) Can abs…

    Submitted 4 June, 2024; originally announced June 2024.

    Comments: 22 pages, 9 figures

  5. arXiv:2405.16806  [pdf, other]

    cs.CL cs.AI

    Entity Alignment with Noisy Annotations from Large Language Models

    Authors: Shengyuan Chen, Qinggang Zhang, Junnan Dong, Wen Hua, Qing Li, Xiao Huang

    Abstract: Entity alignment (EA) aims to merge two knowledge graphs (KGs) by identifying equivalent entity pairs. While existing methods heavily rely on human-generated labels, it is prohibitively expensive to incorporate cross-domain experts for annotation in real-world scenarios. The advent of Large Language Models (LLMs) presents new avenues for automating EA with annotations, inspired by their comprehens…

    Submitted 28 May, 2024; v1 submitted 26 May, 2024; originally announced May 2024.

  6. arXiv:2405.03066  [pdf]

    cs.ET

    A scoping review of using Large Language Models (LLMs) to investigate Electronic Health Records (EHRs)

    Authors: Lingyao Li, Jiayan Zhou, Zhenxiang Gao, Wenyue Hua, Lizhou Fan, Huizi Yu, Loni Hagen, Yongfeng Zhang, Themistocles L. Assimes, Libby Hemphill, Siyuan Ma

    Abstract: Electronic Health Records (EHRs) play an important role in the healthcare system. However, their complexity and vast volume pose significant challenges to data interpretation and analysis. Recent advancements in Artificial Intelligence (AI), particularly the development of Large Language Models (LLMs), open up new opportunities for researchers in this domain. Although prior studies have demonstrat…

    Submitted 22 May, 2024; v1 submitted 5 May, 2024; originally announced May 2024.

  7. arXiv:2404.15532  [pdf, other]

    cs.HC cs.AI cs.CL cs.CV cs.MA

    BattleAgent: Multi-modal Dynamic Emulation on Historical Battles to Complement Historical Analysis

    Authors: Shuhang Lin, Wenyue Hua, Lingyao Li, Che-Jui Chang, Lizhou Fan, Jianchao Ji, Hang Hua, Mingyu Jin, Jiebo Luo, Yongfeng Zhang

    Abstract: This paper presents BattleAgent, an emulation system that combines the Large Vision-Language Model and Multi-agent System. This novel system aims to simulate complex dynamic interactions among multiple agents, as well as between agents and their environments, over a period of time. It emulates both the decision-making processes of leaders and the viewpoints of ordinary participants, such as soldie…

    Submitted 23 April, 2024; originally announced April 2024.

    Comments: 26 pages, 14 figures. The data and code for this project are accessible at https://github.com/agiresearch/battleagent

  8. arXiv:2404.07066  [pdf, other]

    cs.CL cs.AI cs.LG

    Exploring Concept Depth: How Large Language Models Acquire Knowledge at Different Layers?

    Authors: Mingyu Jin, Qinkai Yu, Jingyuan Huang, Qingcheng Zeng, Zhenting Wang, Wenyue Hua, Haiyan Zhao, Kai Mei, Yanda Meng, Kaize Ding, Fan Yang, Mengnan Du, Yongfeng Zhang

    Abstract: Large language models (LLMs) have shown remarkable performances across a wide range of tasks. However, the mechanisms by which these models encode tasks of varying complexities remain poorly understood. In this paper, we explore the hypothesis that LLMs process concepts of varying complexities in different layers, introducing the idea of "Concept Depth" to suggest that more complex concepts are ty…

    Submitted 30 April, 2024; v1 submitted 10 April, 2024; originally announced April 2024.

    Comments: 12 pages

  9. arXiv:2404.01064  [pdf, other]

    cs.CV

    Roadside Monocular 3D Detection via 2D Detection Prompting

    Authors: Yechi Ma, Shuoquan Wei, Churun Zhang, Wei Hua, Yanan Li, Shu Kong

    Abstract: The problem of roadside monocular 3D detection requires detecting objects of interested classes in a 2D RGB frame and predicting their 3D information such as locations in bird's-eye-view (BEV). It has broad applications in traffic control, vehicle-vehicle communication, and vehicle-infrastructure cooperative perception. To approach this problem, we present a novel and simple method by prompting th…

    Submitted 4 April, 2024; v1 submitted 1 April, 2024; originally announced April 2024.

  10. arXiv:2403.19021  [pdf, other]

    cs.IR cs.AI cs.CL cs.LG

    IDGenRec: LLM-RecSys Alignment with Textual ID Learning

    Authors: Juntao Tan, Shuyuan Xu, Wenyue Hua, Yingqiang Ge, Zelong Li, Yongfeng Zhang

    Abstract: Generative recommendation based on Large Language Models (LLMs) have transformed the traditional ranking-based recommendation style into a text-to-text generation paradigm. However, in contrast to standard NLP tasks that inherently operate on human vocabulary, current research in generative recommendations struggles to effectively encode recommendation items within the text-to-text framework using…

    Submitted 17 May, 2024; v1 submitted 27 March, 2024; originally announced March 2024.

    Comments: Accepted in SIGIR 2024

  11. arXiv:2403.16303  [pdf]

    cs.DL cs.AI cs.CL cs.SI

    Large Language Models in Biomedical and Health Informatics: A Bibliometric Review

    Authors: Huizi Yu, Lizhou Fan, Lingyao Li, Jiayan Zhou, Zihui Ma, Lu Xian, Wenyue Hua, Sijia He, Mingyu Jin, Yongfeng Zhang, Ashvin Gandhi, Xin Ma

    Abstract: Large Language Models (LLMs) have rapidly become important tools in Biomedical and Health Informatics (BHI), enabling new ways to analyze data, treat patients, and conduct research. This bibliometric review aims to provide a panoramic view of how LLMs have been used in BHI by examining research articles and collaboration networks from 2022 to 2023. It further explores how LLMs can improve Natural…

    Submitted 23 April, 2024; v1 submitted 24 March, 2024; originally announced March 2024.

    Comments: 51 pages, 7 figures, 4 tables

  12. arXiv:2403.09439  [pdf, other]

    cs.CV cs.AI

    3D-SceneDreamer: Text-Driven 3D-Consistent Scene Generation

    Authors: Frank Zhang, Yibo Zhang, Quan Zheng, Rui Ma, Wei Hua, Hujun Bao, Weiwei Xu, Changqing Zou

    Abstract: Text-driven 3D scene generation techniques have made rapid progress in recent years. Their success is mainly attributed to using existing generative models to iteratively perform image warping and inpainting to generate 3D scenes. However, these methods heavily rely on the outputs of existing models, leading to error accumulation in geometry and appearance that prevent the models from being used i…

    Submitted 14 March, 2024; originally announced March 2024.

    Comments: 11 pages, 7 figures

  13. arXiv:2403.01777  [pdf, other]

    cs.CL cs.CV

    NPHardEval4V: A Dynamic Reasoning Benchmark of Multimodal Large Language Models

    Authors: Lizhou Fan, Wenyue Hua, Xiang Li, Kaijie Zhu, Mingyu Jin, Lingyao Li, Haoyang Ling, Jinkui Chi, Jindong Wang, Xin Ma, Yongfeng Zhang

    Abstract: Understanding the reasoning capabilities of Multimodal Large Language Models (MLLMs) is an important area of research. In this study, we introduce a dynamic benchmark, NPHardEval4V, aimed at addressing the existing gaps in evaluating the pure reasoning abilities of MLLMs. Our benchmark aims to provide a venue to disentangle the effect of various factors such as image recognition and instruction fo…

    Submitted 5 March, 2024; v1 submitted 4 March, 2024; originally announced March 2024.

    Comments: 16 pages, 10 figures, 2 tables

  14. arXiv:2402.13184  [pdf, other]

    cs.CL

    What if LLMs Have Different World Views: Simulating Alien Civilizations with LLM-based Agents

    Authors: Mingyu Jin, Beichen Wang, Zhaoqian Xue, Suiyuan Zhu, Wenyue Hua, Hua Tang, Kai Mei, Mengnan Du, Yongfeng Zhang

    Abstract: In this study, we introduce "CosmoAgent," an innovative artificial intelligence framework utilizing Large Language Models (LLMs) to simulate complex interactions between human and extraterrestrial civilizations, with a special emphasis on Stephen Hawking's cautionary advice about not sending radio signals haphazardly into the universe. The goal is to assess the feasibility of peaceful coexistence…

    Submitted 20 February, 2024; v1 submitted 20 February, 2024; originally announced February 2024.

  15. arXiv:2402.05868  [pdf, other]

    cs.CL cs.AI cs.CR cs.IR cs.LG

    EmojiCrypt: Prompt Encryption for Secure Communication with Large Language Models

    Authors: Guo Lin, Wenyue Hua, Yongfeng Zhang

    Abstract: Cloud-based large language models (LLMs) such as ChatGPT have increasingly become integral to daily operations, serving as vital tools across various applications. While these models offer substantial benefits in terms of accessibility and functionality, they also introduce significant privacy concerns: the transmission and storage of user data in cloud infrastructures pose substantial risks of da…

    Submitted 12 February, 2024; v1 submitted 8 February, 2024; originally announced February 2024.

    Comments: 12 pages, 4 figures, 2 tables, comments and suggestions are welcome

  16. arXiv:2402.01586  [pdf, other]

    cs.CL cs.AI cs.LG cs.MA

    TrustAgent: Towards Safe and Trustworthy LLM-based Agents through Agent Constitution

    Authors: Wenyue Hua, Xianjun Yang, Zelong Li, Wei Cheng, Yongfeng Zhang

    Abstract: The emergence of LLM-based agents has garnered considerable attention, yet their trustworthiness remains an under-explored area. As agents can directly interact with the physical environment, their reliability and safety is critical. This paper presents an Agent-Constitution-based agent framework, TrustAgent, an initial investigation into improving the safety dimension of trustworthiness in LLM-ba…

    Submitted 17 February, 2024; v1 submitted 2 February, 2024; originally announced February 2024.

    Comments: 16 pages, 3 figures, 5 tables, comments and suggestions are welcome

  17. arXiv:2402.00798  [pdf, other]

    cs.LG cs.AI cs.CL cs.FL

    Formal-LLM: Integrating Formal Language and Natural Language for Controllable LLM-based Agents

    Authors: Zelong Li, Wenyue Hua, Hao Wang, He Zhu, Yongfeng Zhang

    Abstract: Recent advancements on Large Language Models (LLMs) enable AI Agents to automatically generate and execute multi-step plans to solve complex tasks. However, since LLM's content generation process is hardly controllable, current LLM-based agents frequently generate invalid or non-executable plans, which jeopardizes the performance of the generated plans and corrupts users' trust in LLM-based agents…

    Submitted 18 June, 2024; v1 submitted 1 February, 2024; originally announced February 2024.

    Comments: 21 pages, 6 figures; comments and suggestions are welcome

  18. arXiv:2402.00746  [pdf, other]

    cs.CL

    Health-LLM: Personalized Retrieval-Augmented Disease Prediction System

    Authors: Mingyu Jin, Qinkai Yu, Dong Shu, Chong Zhang, Lizhou Fan, Wenyue Hua, Suiyuan Zhu, Yanda Meng, Zhenting Wang, Mengnan Du, Yongfeng Zhang

    Abstract: Recent advancements in artificial intelligence (AI), especially large language models (LLMs), have significantly advanced healthcare applications and demonstrated potentials in intelligent medical treatment. However, there are conspicuous challenges such as vast data volumes and inconsistent symptom characterization standards, preventing full integration of healthcare AI systems with individual pa…

    Submitted 19 March, 2024; v1 submitted 1 February, 2024; originally announced February 2024.

  19. arXiv:2402.00284  [pdf, other]

    cs.IR cs.AI cs.LG

    PAP-REC: Personalized Automatic Prompt for Recommendation Language Model

    Authors: Zelong Li, Jianchao Ji, Yingqiang Ge, Wenyue Hua, Yongfeng Zhang

    Abstract: Recently emerged prompt-based Recommendation Language Models (RLM) can solve multiple recommendation tasks uniformly. The RLMs make full use of the inherited knowledge learned from the abundant pre-training data to solve the downstream recommendation tasks by prompts, without introducing additional parameters or network training. However, handcrafted prompts require significant expertise and human…

    Submitted 31 January, 2024; originally announced February 2024.

  20. arXiv:2401.17585  [pdf, other]

    cs.CL cs.AI cs.LG stat.ME

    Propagation and Pitfalls: Reasoning-based Assessment of Knowledge Editing through Counterfactual Tasks

    Authors: Wenyue Hua, Jiang Guo, Mingwen Dong, Henghui Zhu, Patrick Ng, Zhiguo Wang

    Abstract: Current approaches of knowledge editing struggle to effectively propagate updates to interconnected facts. In this work, we delve into the barriers that hinder the appropriate propagation of updated knowledge within these models for accurate reasoning. To support our analysis, we introduce a novel reasoning-based benchmark -- ReCoE (Reasoning-based Counterfactual Editing dataset) -- which covers s…

    Submitted 30 January, 2024; originally announced January 2024.

    Comments: 22 pages, 14 figures, 5 tables

  21. arXiv:2401.04925  [pdf, other]

    cs.CL cs.AI

    The Impact of Reasoning Step Length on Large Language Models

    Authors: Mingyu Jin, Qinkai Yu, Dong Shu, Haiyan Zhao, Wenyue Hua, Yanda Meng, Yongfeng Zhang, Mengnan Du

    Abstract: Chain of Thought (CoT) is significant in improving the reasoning abilities of large language models (LLMs). However, the correlation between the effectiveness of CoT and the length of reasoning steps in prompts remains largely unknown. To shed light on this, we have conducted several empirical experiments to explore the relations. Specifically, we design experiments that expand and compress the ra…

    Submitted 22 June, 2024; v1 submitted 9 January, 2024; originally announced January 2024.

    Comments: Findings of ACL 2024

  22. arXiv:2401.04361  [pdf, other]

    cs.CL cs.AI

    Improving the Robustness of Knowledge-Grounded Dialogue via Contrastive Learning

    Authors: Jiaan Wang, Jianfeng Qu, Kexin Wang, Zhixu Li, Wen Hua, Ximing Li, An Liu

    Abstract: Knowledge-grounded dialogue (KGD) learns to generate an informative response based on a given dialogue context and external knowledge (e.g., knowledge graphs; KGs). Recently, the emergence of large language models (LLMs) and pre-training techniques has brought great success to knowledge-grounded dialogue. However, when building KGD systems in real applications, there are various real-world…

    Submitted 9 January, 2024; originally announced January 2024.

    Comments: Accepted by AAAI 2024

  23. arXiv:2312.14890  [pdf, other]

    cs.AI cs.CC cs.CL cs.LG

    NPHardEval: Dynamic Benchmark on Reasoning Ability of Large Language Models via Complexity Classes

    Authors: Lizhou Fan, Wenyue Hua, Lingyao Li, Haoyang Ling, Yongfeng Zhang

    Abstract: Complex reasoning ability is one of the most important features of current LLMs, which has also been leveraged to play an integral role in complex decision-making tasks. Therefore, the investigation into the reasoning capabilities of Large Language Models (LLMs) is critical: numerous benchmarks have been established to assess the reasoning abilities of LLMs. However, current benchmarks are inadequ…

    Submitted 12 February, 2024; v1 submitted 22 December, 2023; originally announced December 2023.

    Comments: 23 pages, 7 figures, 2 tables

  24. arXiv:2312.10986  [pdf, other]

    cs.CV cs.RO

    Long-Tailed 3D Detection via 2D Late Fusion

    Authors: Yechi Ma, Neehar Peri, Shuoquan Wei, Wei Hua, Deva Ramanan, Yanan Li, Shu Kong

    Abstract: Long-Tailed 3D Object Detection (LT3D) addresses the problem of accurately detecting objects from both common and rare classes. Contemporary multi-modal detectors achieve low AP on rare-classes (e.g., CMT only achieves 9.4 AP on stroller), presumably because training detectors end-to-end with significant class imbalance is challenging. To address this limitation, we delve into a simple late-fusion…

    Submitted 14 June, 2024; v1 submitted 18 December, 2023; originally announced December 2023.

  25. arXiv:2312.03815  [pdf, other]

    cs.OS cs.AI cs.CL cs.LG

    LLM as OS, Agents as Apps: Envisioning AIOS, Agents and the AIOS-Agent Ecosystem

    Authors: Yingqiang Ge, Yujie Ren, Wenyue Hua, Shuyuan Xu, Juntao Tan, Yongfeng Zhang

    Abstract: This paper envisions a revolutionary AIOS-Agent ecosystem, where Large Language Model (LLM) serves as the (Artificial) Intelligent Operating System (IOS, or AIOS)--an operating system "with soul". Upon this foundation, a diverse range of LLM-based AI Agent Applications (Agents, or AAPs) are developed, enriching the AIOS-Agent ecosystem and signaling a paradigm shift from the traditional OS-APP eco…

    Submitted 9 December, 2023; v1 submitted 6 December, 2023; originally announced December 2023.

    Comments: 35 pages, 4 figures

  26. arXiv:2312.03022  [pdf, other]

    cs.AI cs.CL cs.LG

    Beyond Isolation: Multi-Agent Synergy for Improving Knowledge Graph Construction

    Authors: Hongbin Ye, Honghao Gui, Aijia Zhang, Tong Liu, Wei Hua, Weiqiang Jia

    Abstract: Knowledge graph construction (KGC) is a multifaceted undertaking involving the extraction of entities, relations, and events. Traditionally, large language models (LLMs) have been viewed as solitary task-solving agents in this complex landscape. However, this paper challenges this paradigm by introducing a novel framework, CooperKGC. Departing from the conventional approach, CooperKGC establishes…

    Submitted 29 December, 2023; v1 submitted 5 December, 2023; originally announced December 2023.

    Comments: work in progress; 12 pages

  27. arXiv:2311.17227  [pdf, other]

    cs.AI cs.CL cs.CY

    War and Peace (WarAgent): Large Language Model-based Multi-Agent Simulation of World Wars

    Authors: Wenyue Hua, Lizhou Fan, Lingyao Li, Kai Mei, Jianchao Ji, Yingqiang Ge, Libby Hemphill, Yongfeng Zhang

    Abstract: Can we avoid wars at the crossroads of history? This question has been pursued by individuals, scholars, policymakers, and organizations throughout human history. In this research, we attempt to answer the question based on the recent advances of Artificial Intelligence (AI) and Large Language Models (LLMs). We propose WarAgent, an LLM-powered multi-agent AI system, to simulate the partic…

    Submitted 30 January, 2024; v1 submitted 28 November, 2023; originally announced November 2023.

    Comments: 47 pages, 9 figures, 5 tables

  28. arXiv:2311.11825  [pdf, other]

    cs.CV cs.GR

    Holistic Inverse Rendering of Complex Facade via Aerial 3D Scanning

    Authors: Zixuan Xie, Rengan Xie, Rong Li, Kai Huang, Pengju Qiao, Jingsen Zhu, Xu Yin, Qi Ye, Wei Hua, Yuchi Huo, Hujun Bao

    Abstract: In this work, we use multi-view aerial images to reconstruct the geometry, lighting, and material of facades using neural signed distance fields (SDFs). Without the requirement of complex equipment, our method only takes simple RGB images captured by a drone as inputs to enable physically based and photorealistic novel-view rendering, relighting, and editing. However, a real-world facade usually h…

    Submitted 8 April, 2024; v1 submitted 20 November, 2023; originally announced November 2023.

  29. arXiv:2309.13235  [pdf, other]

    cs.CV

    M$^3$CS: Multi-Target Masked Point Modeling with Learnable Codebook and Siamese Decoders

    Authors: Qibo Qiu, Honghui Yang, Wenxiao Wang, Shun Zhang, Haiming Gao, Haochao Ying, Wei Hua, Xiaofei He

    Abstract: Masked point modeling has become a promising scheme of self-supervised pre-training for point clouds. Existing methods reconstruct either the original points or related features as the objective of pre-training. However, considering the diversity of downstream tasks, it is necessary for the model to have both low- and high-level representation modeling capabilities to capture geometric details and…

    Submitted 22 September, 2023; originally announced September 2023.

  30. arXiv:2309.06794  [pdf, other]

    cs.CL cs.AI cs.LG

    Cognitive Mirage: A Review of Hallucinations in Large Language Models

    Authors: Hongbin Ye, Tong Liu, Aijia Zhang, Wei Hua, Weiqiang Jia

    Abstract: As large language models continue to develop in the field of AI, text generation systems are susceptible to a worrisome phenomenon known as hallucination. In this study, we summarize recent compelling insights into hallucinations in LLMs. We present a novel taxonomy of hallucinations from various text generation tasks, thus provide theoretical insights, detection methods and improvement approaches…

    Submitted 13 September, 2023; originally announced September 2023.

    Comments: work in progress; 21 pages

  31. arXiv:2307.00457  [pdf, other]

    cs.IR cs.AI cs.CL cs.LG

    GenRec: Large Language Model for Generative Recommendation

    Authors: Jianchao Ji, Zelong Li, Shuyuan Xu, Wenyue Hua, Yingqiang Ge, Juntao Tan, Yongfeng Zhang

    Abstract: In recent years, large language models (LLM) have emerged as powerful tools for diverse natural language processing tasks. However, their potential for recommender systems under the generative recommendation paradigm remains relatively unexplored. This paper presents an innovative approach to recommendation systems using large language models (LLMs) based on text data. In this paper, we present a…

    Submitted 4 July, 2023; v1 submitted 1 July, 2023; originally announced July 2023.

  32. arXiv:2306.12317  [pdf, other]

    cs.CL

    Iterated Piecewise Affine (IPA) Approximation for Language Modeling

    Authors: Davood Shamsi, Wen-yu Hua, Brian Williams

    Abstract: In this work, we demonstrate the application of a first-order Taylor expansion to approximate a generic function $F: R^{n \times m} \to R^{n \times m}$ and utilize it in language modeling. To enhance the basic Taylor expansion, we introduce iteration and piecewise modeling, leading us to name the algorithm the Iterative Piecewise Affine (IPA) approximation. The final algorithm exhibits interesting…

    Submitted 1 November, 2023; v1 submitted 21 June, 2023; originally announced June 2023.

  33. arXiv:2306.11134  [pdf, other]

    cs.IR

    OpenP5: An Open-Source Platform for Developing, Training, and Evaluating LLM-based Recommender Systems

    Authors: Shuyuan Xu, Wenyue Hua, Yongfeng Zhang

    Abstract: In recent years, the integration of Large Language Models (LLMs) into recommender systems has garnered interest among both practitioners and researchers. Despite this interest, the field is still emerging, and the lack of open-source R&D platforms may impede the exploration of LLM-based recommendations. This paper introduces OpenP5, an open-source platform designed as a resource to facilitate the…

    Submitted 10 April, 2024; v1 submitted 19 June, 2023; originally announced June 2023.

    Comments: In SIGIR 2024 Resource & Reproducibility Track

  34. arXiv:2306.03287  [pdf, other]

    cs.CV

    ICDAR 2023 Competition on Structured Text Extraction from Visually-Rich Document Images

    Authors: Wenwen Yu, Chengquan Zhang, Haoyu Cao, Wei Hua, Bohan Li, Huang Chen, Mingyu Liu, Mingrui Chen, Jianfeng Kuang, Mengjun Cheng, Yuning Du, Shikun Feng, Xiaoguang Hu, Pengyuan Lyu, Kun Yao, Yuechen Yu, Yuliang Liu, Wanxiang Che, Errui Ding, Cheng-Lin Liu, Jiebo Luo, Shuicheng Yan, Min Zhang, Dimosthenis Karatzas, Xing Sun, et al. (2 additional authors not shown)

    Abstract: Structured text extraction is one of the most valuable and challenging application directions in the field of Document AI. However, the scenarios of past benchmarks are limited, and the corresponding evaluation protocols usually focus on the submodules of the structured text extraction scheme. In order to eliminate these problems, we organized the ICDAR 2023 competition on Structured text extracti…

    Submitted 5 June, 2023; originally announced June 2023.

    Comments: ICDAR 2023 Competition on SVRD report (To be appear in ICDAR 2023)

  35. arXiv:2306.03235  [pdf, other]

    cs.LG cs.CR

    Information Flow Control in Machine Learning through Modular Model Architecture

    Authors: Trishita Tiwari, Suchin Gururangan, Chuan Guo, Weizhe Hua, Sanjay Kariyappa, Udit Gupta, Wenjie Xiong, Kiwan Maeng, Hsien-Hsin S. Lee, G. Edward Suh

    Abstract: In today's machine learning (ML) models, any part of the training data can affect its output. This lack of control for information flow from training data to model output is a major obstacle in training models on sensitive data when access control only allows individual users to access a subset of data. To enable secure machine learning for access controlled data, we propose the notion of informat…

    Submitted 5 June, 2023; originally announced June 2023.

  36. arXiv:2306.01205  [pdf, other]

    cs.CV

    SelFLoc: Selective Feature Fusion for Large-scale Point Cloud-based Place Recognition

    Authors: Qibo Qiu, Haiming Gao, Wenxiao Wang, Zhiyi Su, Tian Xie, Wei Hua, Xiaofei He

    Abstract: Point cloud-based place recognition is crucial for mobile robots and autonomous vehicles, especially when the global positioning sensor is not accessible. LiDAR points are scattered on the surface of objects and buildings, which have strong shape priors along different axes. To enhance message passing along particular axes, Stacked Asymmetric Convolution Block (SACB) is designed, which is one of t…

    Submitted 5 June, 2023; v1 submitted 1 June, 2023; originally announced June 2023.

  37. arXiv:2305.12090  [pdf, other]

    cs.IR cs.AI cs.CL cs.LG

    UP5: Unbiased Foundation Model for Fairness-aware Recommendation

    Authors: Wenyue Hua, Yingqiang Ge, Shuyuan Xu, Jianchao Ji, Yongfeng Zhang

    Abstract: Recent advances in Foundation Models such as Large Language Models (LLMs) have propelled them to the forefront of Recommender Systems (RS). Despite their utility, there is a growing concern that LLMs might inadvertently perpetuate societal stereotypes, resulting in unfair recommendations. Since fairness is critical for RS as many users take it for decision-making and demand fulfillment, this paper…

    Submitted 29 May, 2024; v1 submitted 20 May, 2023; originally announced May 2023.

    Comments: In EACL 2024

  38. A Hybrid 3D Eddy Detection Technique Based on Sea Surface Height and Velocity Field

    Authors: Weiping Hua, Karen Bemis, Dujuan Kang, Sedat Ozer, Deborah Silver

    Abstract: Eddy detection is a critical task for ocean scientists to understand and analyze ocean circulation. In this paper, we introduce a hybrid eddy detection approach that combines sea surface height (SSH) and velocity fields with geometric criteria defining eddy behavior. Our approach searches for SSH minima and maxima, which oceanographers expect to find at the center of eddies. Geometric criteria are…

    Submitted 31 October, 2023; v1 submitted 14 May, 2023; originally announced May 2023.

    Comments: 8 pages, 14 figures. Accepted by EnvirVis 2023. Project Link: https://github.com/VizlabRutgers/Hybrid-Eddy-detection

  39. arXiv:2305.07498  [pdf, other]

    cs.CV

    Visual Information Extraction in the Wild: Practical Dataset and End-to-end Solution

    Authors: Jianfeng Kuang, Wei Hua, Dingkang Liang, Mingkun Yang, Deqiang Jiang, Bo Ren, Xiang Bai

    Abstract: Visual information extraction (VIE), which aims to simultaneously perform OCR and information extraction in a unified framework, has drawn increasing attention due to its essential role in various applications like understanding receipts, goods, and traffic signs. However, as existing benchmark datasets for VIE mainly consist of document images without the adequate diversity of layout structures,…

    Submitted 14 June, 2023; v1 submitted 12 May, 2023; originally announced May 2023.

    Comments: 15 pages, 6 figures, ICDAR2023

  40. arXiv:2305.06569  [pdf, other]

    cs.IR cs.AI cs.CL cs.LG

    How to Index Item IDs for Recommendation Foundation Models

    Authors: Wenyue Hua, Shuyuan Xu, Yingqiang Ge, Yongfeng Zhang

    Abstract: Recommendation foundation model utilizes large language models (LLM) for recommendation by converting recommendation tasks into natural language tasks. It enables generative recommendation which directly generates the item(s) to recommend rather than calculating a ranking score for each and every candidate item as in traditional recommendation models, simplifying the recommendation pipeline from m…

    Submitted 25 September, 2023; v1 submitted 11 May, 2023; originally announced May 2023.

    Comments: Accepted as a full paper by ACM SIGIR-AP 2023

  41. arXiv:2305.06404  [pdf, other]

    cs.CL cs.AI

    LACoS-BLOOM: Low-rank Adaptation with Contrastive objective on 8 bits Siamese-BLOOM

    Authors: Wen-Yu Hua, Brian Williams, Davood Shamsi

    Abstract: Text embeddings are useful features for several NLP applications, such as sentence similarity, text clustering, and semantic search. In this paper, we present a Low-rank Adaptation with a Contrastive objective on top of 8-bit Siamese-BLOOM, a multilingual large language model optimized to produce semantically meaningful word embeddings. The innovation is threefold. First, we cast BLOOM weights to…

    Submitted 10 May, 2023; originally announced May 2023.

  42. arXiv:2304.04515  [pdf, other]

    cs.CV

    SOOD: Towards Semi-Supervised Oriented Object Detection

    Authors: Wei Hua, Dingkang Liang, Jingyu Li, Xiaolong Liu, Zhikang Zou, Xiaoqing Ye, Xiang Bai

    Abstract: Semi-Supervised Object Detection (SSOD), aiming to explore unlabeled data for boosting object detectors, has become an active task in recent years. However, existing SSOD approaches mainly focus on horizontal objects, leaving multi-oriented objects that are common in aerial images unexplored. This paper proposes a novel Semi-supervised Oriented Object Detection model, termed SOOD, built upon the m…

    Submitted 10 April, 2023; originally announced April 2023.

    Comments: Accepted to CVPR 2023. Code will be available at https://github.com/HamPerdredes/SOOD

  43. arXiv:2304.04370  [pdf, other]

    cs.AI cs.CL cs.LG

    OpenAGI: When LLM Meets Domain Experts

    Authors: Yingqiang Ge, Wenyue Hua, Kai Mei, Jianchao Ji, Juntao Tan, Shuyuan Xu, Zelong Li, Yongfeng Zhang

    Abstract: Human Intelligence (HI) excels at combining basic skills to solve complex tasks. This capability is vital for Artificial Intelligence (AI) and should be embedded in comprehensive AI Agents, enabling them to harness expert models for complex task-solving towards Artificial General Intelligence (AGI). Large Language Models (LLMs) show promising learning and reasoning abilities, and can effectively u…

    Submitted 3 November, 2023; v1 submitted 9 April, 2023; originally announced April 2023.

    Comments: In NeurIPS 2023

  44. arXiv:2303.07634  [pdf, other]

    cs.CV cs.AI cs.GR

    I$^2$-SDF: Intrinsic Indoor Scene Reconstruction and Editing via Raytracing in Neural SDFs

    Authors: Jingsen Zhu, Yuchi Huo, Qi Ye, Fujun Luan, Jifan Li, Dianbing Xi, Lisha Wang, Rui Tang, Wei Hua, Hujun Bao, Rui Wang

    Abstract: In this work, we present I$^2$-SDF, a new method for intrinsic indoor scene reconstruction and editing using differentiable Monte Carlo raytracing on neural signed distance fields (SDFs). Our holistic neural SDF-based framework jointly recovers the underlying shapes, incident radiance and materials from multi-view images. We introduce a novel bubble loss for fine-grained small objects and error-gu…

    Submitted 29 March, 2023; v1 submitted 14 March, 2023; originally announced March 2023.

    Comments: Accepted by CVPR 2023, project page: https://jingsenzhu.github.io/i2-sdf

  45. arXiv:2302.14338  [pdf, other]

    cs.CV

    Turning a CLIP Model into a Scene Text Detector

    Authors: Wenwen Yu, Yuliang Liu, Wei Hua, Deqiang Jiang, Bo Ren, Xiang Bai

    Abstract: The recent large-scale Contrastive Language-Image Pretraining (CLIP) model has shown great potential in various downstream tasks via leveraging the pretrained vision and language knowledge. Scene text, which contains rich textual and visual information, has an inherent connection with a model like CLIP. Recently, pretraining approaches based on vision language models have made effective progresses…

    Submitted 26 March, 2023; v1 submitted 28 February, 2023; originally announced February 2023.

    Comments: CVPR2023

  46. arXiv:2212.12418  [pdf]

    cs.RO

    Dynamic Speed Guidance for CAV Ramp Merging in Non-Cooperative Environment: An On-Site Experiment

    Authors: Wei Ji, Yechi Ma, Guangzhang Cui, Xiaotian Qin, Wei Hua

    Abstract: Ramp merging is a typical application of cooperative intelligent transportation system (C-ITS). Vehicle trajectories perceived by roadside sensors are an important complement to the limited visual field of on-board perception. Vehicle tracking and trajectory denoising algorithm is proposed in this paper to take full advantage of roadside cameras for vehicle trajectory and speed profile estimation.…

    Submitted 21 December, 2022; originally announced December 2022.

    Comments: This work has been submitted to IFAC for possible publication

  47. arXiv:2212.08204  [pdf, other]

    cs.CL cs.CY

    LegalRelectra: Mixed-domain Language Modeling for Long-range Legal Text Comprehension

    Authors: Wenyue Hua, Yuchen Zhang, Zhe Chen, Josie Li, Melanie Weber

    Abstract: The application of Natural Language Processing (NLP) to specialized domains, such as the law, has recently received a surge of interest. As many legal services rely on processing and analyzing large collections of documents, automating such tasks with NLP tools emerges as a key challenge. Many popular language models, such as BERT or RoBERTa, are general-purpose models, which have limitations on p…

    Submitted 15 December, 2022; originally announced December 2022.

  48. arXiv:2212.02019  [pdf, other]

    cs.CV

    SASFormer: Transformers for Sparsely Annotated Semantic Segmentation

    Authors: Hui Su, Yue Ye, Wei Hua, Lechao Cheng, Mingli Song

    Abstract: Semantic segmentation based on sparse annotation has advanced in recent years. It labels only part of each object in the image, leaving the remainder unlabeled. Most of the existing approaches are time-consuming and often necessitate a multi-stage training strategy. In this work, we propose a simple yet effective sparse annotated semantic segmentation framework based on segformer, dubbed SASFormer…

    Submitted 25 February, 2023; v1 submitted 4 December, 2022; originally announced December 2022.

    Comments: 8 pages, 6 figures, 6 tables; version4.0

  49. arXiv:2211.16101  [pdf, other]

    cs.CL cs.AI

    Dependency-aware Self-training for Entity Alignment

    Authors: Bing Liu, Tiancheng Lan, Wen Hua, Guido Zuccon

    Abstract: Entity Alignment (EA), which aims to detect entity mappings (i.e. equivalent entity pairs) in different Knowledge Graphs (KGs), is critical for KG fusion. Neural EA methods dominate current EA research but still suffer from their reliance on labelled mappings. To solve this problem, a few works have explored boosting the training of EA models with self-training, which adds confidently predicted ma…

    Submitted 29 November, 2022; originally announced November 2022.

    Comments: WSDM 2023

  50. arXiv:2211.15833  [pdf, other]

    cs.CL cs.AI

    Guiding Neural Entity Alignment with Compatibility

    Authors: Bing Liu, Harrisen Scells, Wen Hua, Guido Zuccon, Genghong Zhao, Xia Zhang

    Abstract: Entity Alignment (EA) aims to find equivalent entities between two Knowledge Graphs (KGs). While numerous neural EA models have been devised, they are mainly learned using labelled data only. In this work, we argue that different entities within one KG should have compatible counterparts in the other KG due to the potential dependencies among the entities. Making compatible predictions thus should…

    Submitted 28 November, 2022; originally announced November 2022.

    Comments: EMNLP 2022