Skip to main content

Showing 1–50 of 704 results for author: Wu, F

Searching in archive cs. Search in all archives.
.
  1. arXiv:2406.17706  [pdf, other

    cs.LG cs.CL cs.DC

    FedBiOT: LLM Local Fine-tuning in Federated Learning without Full Model

    Authors: Feijie Wu, Zitao Li, Yaliang Li, Bolin Ding, **g Gao

    Abstract: Large language models (LLMs) show amazing performance on many domain-specific tasks after fine-tuning with some appropriate data. However, many domain-specific data are privately distributed across multiple owners. Thus, this dilemma raises the interest in how to perform LLM fine-tuning in federated learning (FL). However, confronted with limited computation and communication capacities, FL client… ▽ More

    Submitted 25 June, 2024; originally announced June 2024.

    Comments: KDD 2024

  2. arXiv:2406.16989  [pdf, other

    cs.LG cs.AI

    Retrieval-Augmented Mixture of LoRA Experts for Uploadable Machine Learning

    Authors: Ziyu Zhao, Leilei Gan, Guoyin Wang, Yuwei Hu, Tao Shen, Hongxia Yang, Kun Kuang, Fei Wu

    Abstract: Low-Rank Adaptation (LoRA) offers an efficient way to fine-tune large language models (LLMs). Its modular and plug-and-play nature allows the integration of various domain-specific LoRAs, enhancing LLM capabilities. Open-source platforms like Huggingface and Modelscope have introduced a new computational paradigm, Uploadable Machine Learning (UML). In UML, contributors use decentralized data to tr… ▽ More

    Submitted 24 June, 2024; originally announced June 2024.

    Comments: arXiv admin note: substantial text overlap with arXiv:2402.09997

  3. arXiv:2406.12975  [pdf, other

    cs.CL cs.AI cs.CY

    SHIELD: Evaluation and Defense Strategies for Copyright Compliance in LLM Text Generation

    Authors: Xiaoze Liu, Ting Sun, Tianyang Xu, Feijie Wu, Cunxiang Wang, Xiaoqian Wang, **g Gao

    Abstract: Large Language Models (LLMs) have transformed machine learning but raised significant legal concerns due to their potential to produce text that infringes on copyrights, resulting in several high-profile lawsuits. The legal landscape is struggling to keep pace with these rapid advancements, with ongoing debates about whether generated text might plagiarize copyrighted materials. Current LLMs may i… ▽ More

    Submitted 18 June, 2024; originally announced June 2024.

  4. arXiv:2406.12928  [pdf, other

    cs.LG cs.AI cs.CL

    Evaluating the Generalization Ability of Quantized LLMs: Benchmark, Analysis, and Toolbox

    Authors: Yijun Liu, Yuan Meng, Fang Wu, Shenhao Peng, Hang Yao, Chaoyu Guan, Chen Tang, Xinzhu Ma, Zhi Wang, Wenwu Zhu

    Abstract: Large language models (LLMs) have exhibited exciting progress in multiple scenarios, while the huge computational demands hinder their deployments in lots of real-world applications. As an effective means to reduce memory footprint and inference cost, quantization also faces challenges in performance degradation at low bit-widths. Understanding the impact of quantization on LLM capabilities, espec… ▽ More

    Submitted 15 June, 2024; originally announced June 2024.

  5. arXiv:2406.11633  [pdf, other

    cs.CV

    DocGenome: An Open Large-scale Scientific Document Benchmark for Training and Testing Multi-modal Large Language Models

    Authors: Renqiu Xia, Song Mao, Xiangchao Yan, Hongbin Zhou, Bo Zhang, Haoyang Peng, Jiahao Pi, Daocheng Fu, Wenjie Wu, Hancheng Ye, Shiyang Feng, Bin Wang, Chao Xu, Conghui He, Pinlong Cai, Min Dou, Botian Shi, Sheng Zhou, Yongwei Wang, Bin Wang, Junchi Yan, Fei Wu, Yu Qiao

    Abstract: Scientific documents record research findings and valuable human knowledge, comprising a vast corpus of high-quality data. Leveraging multi-modality data extracted from these documents and assessing large models' abilities to handle scientific document-oriented tasks is therefore meaningful. Despite promising advancements, large models still perform poorly on multi-page scientific document extract… ▽ More

    Submitted 17 June, 2024; originally announced June 2024.

    Comments: Homepage of DocGenome: https://unimodal4reasoning.github.io/DocGenome_page 22 pages, 11 figures

  6. arXiv:2406.08709  [pdf, other

    cs.LG stat.ME

    Introducing Diminutive Causal Structure into Graph Representation Learning

    Authors: Hang Gao, Peng Qiao, Yifan **, Fengge Wu, Jiangmeng Li, Changwen Zheng

    Abstract: When engaging in end-to-end graph representation learning with Graph Neural Networks (GNNs), the intricate causal relationships and rules inherent in graph data pose a formidable challenge for the model in accurately capturing authentic data relationships. A proposed mitigating strategy involves the direct integration of rules or relationships corresponding to the graph data into the model. Howeve… ▽ More

    Submitted 12 June, 2024; originally announced June 2024.

  7. arXiv:2406.05504  [pdf, other

    cs.LG

    G-Transformer: Counterfactual Outcome Prediction under Dynamic and Time-varying Treatment Regimes

    Authors: Hong Xiong, Feng Wu, Leon Deng, Megan Su, Li-wei H Lehman

    Abstract: In the context of medical decision making, counterfactual prediction enables clinicians to predict treatment outcomes of interest under alternative courses of therapeutic actions given observed patient history. Prior machine learning approaches for counterfactual predictions under time-varying treatments focus on static time-varying treatment regimes where treatments do not depend on previous cova… ▽ More

    Submitted 27 June, 2024; v1 submitted 8 June, 2024; originally announced June 2024.

  8. arXiv:2406.05000  [pdf, other

    cs.CV

    AttnDreamBooth: Towards Text-Aligned Personalized Text-to-Image Generation

    Authors: Lianyu Pang, Jian Yin, Baoquan Zhao, Feize Wu, Fu Lee Wang, Qing Li, Xudong Mao

    Abstract: Recent advances in text-to-image models have enabled high-quality personalized image synthesis of user-provided concepts with flexible textual control. In this work, we analyze the limitations of two primary techniques in text-to-image personalization: Textual Inversion and DreamBooth. When integrating the learned concept into new prompts, Textual Inversion tends to overfit the concept, while Drea… ▽ More

    Submitted 7 June, 2024; originally announced June 2024.

  9. arXiv:2406.03703  [pdf, other

    cs.CL cs.LG

    Synthesizing Conversations from Unlabeled Documents using Automatic Response Segmentation

    Authors: Fanyou Wu, Weijie Xu, Chandan K. Reddy, Srinivasan H. Sengamedu

    Abstract: In this study, we tackle the challenge of inadequate and costly training data that has hindered the development of conversational question answering (ConvQA) systems. Enterprises have a large corpus of diverse internal documents. Instead of relying on a searching engine, a more compelling approach for people to comprehend these documents is to create a dialogue system. In this paper, we propose a… ▽ More

    Submitted 5 June, 2024; originally announced June 2024.

    Comments: findings of ACL 2024

  10. arXiv:2406.02027  [pdf, other

    cs.LG cs.AI cs.CR cs.CV

    Inference Attacks: A Taxonomy, Survey, and Promising Directions

    Authors: Feng Wu, Lei Cui, Shaowen Yao, Shui Yu

    Abstract: The prosperity of machine learning has also brought people's concerns about data privacy. Among them, inference attacks can implement privacy breaches in various MLaaS scenarios and model training/prediction phases. Specifically, inference attacks can perform privacy inference on undisclosed target training sets based on outputs of the target model, including but not limited to statistics, members… ▽ More

    Submitted 27 June, 2024; v1 submitted 4 June, 2024; originally announced June 2024.

  11. arXiv:2405.20626  [pdf, other

    cs.IR cs.IT

    Causal Distillation for Alleviating Performance Heterogeneity in Recommender Systems

    Authors: Shengyu Zhang, Ziqi Jiang, Jiangchao Yao, Fuli Feng, Kun Kuang, Zhou Zhao, Shuo Li, Hongxia Yang, Tat-Seng Chua, Fei Wu

    Abstract: Recommendation performance usually exhibits a long-tail distribution over users -- a small portion of head users enjoy much more accurate recommendation services than the others. We reveal two sources of this performance heterogeneity problem: the uneven distribution of historical interactions (a natural source); and the biased training of recommender models (a model source). As addressing this pr… ▽ More

    Submitted 31 May, 2024; originally announced May 2024.

    Comments: TKDE 2023

  12. arXiv:2405.20389  [pdf, other

    astro-ph.IM cs.AI cs.HC cs.IR

    Designing an Evaluation Framework for Large Language Models in Astronomy Research

    Authors: John F. Wu, Alina Hyk, Kiera McCormick, Christine Ye, Simone Astarita, Elina Baral, Jo Ciuca, Jesse Cranney, Anjalie Field, Kartheik Iyer, Philipp Koehn, Jenn Kotler, Sandor Kruk, Michelle Ntampaka, Charles O'Neill, Joshua E. G. Peek, Sanjib Sharma, Mikaeel Yunus

    Abstract: Large Language Models (LLMs) are shifting how scientific research is done. It is imperative to understand how researchers interact with these models and how scientific sub-communities like astronomy might benefit from them. However, there is currently no standard for evaluating the use of LLMs in astronomy. Therefore, we present the experimental design for an evaluation study on how astronomy rese… ▽ More

    Submitted 30 May, 2024; originally announced May 2024.

    Comments: 7 pages, 3 figures. Code available at https://github.com/jsalt2024-evaluating-llms-for-astronomy/astro-arxiv-bot

  13. arXiv:2405.18315  [pdf, other

    cs.AI cs.PL

    DSDL: Data Set Description Language for Bridging Modalities and Tasks in AI Data

    Authors: Bin Wang, Linke Ouyang, Fan Wu, Wenchang Ning, Xiao Han, Zhiyuan Zhao, Jiahui Peng, Yiying Jiang, Dahua Lin, Conghui He

    Abstract: In the era of artificial intelligence, the diversity of data modalities and annotation formats often renders data unusable directly, requiring understanding and format conversion before it can be used by researchers or developers with different needs. To tackle this problem, this article introduces a framework called Dataset Description Language (DSDL) that aims to simplify dataset processing by p… ▽ More

    Submitted 28 May, 2024; originally announced May 2024.

  14. arXiv:2405.17830  [pdf, other

    cs.CL

    More Than Catastrophic Forgetting: Integrating General Capabilities For Domain-Specific LLMs

    Authors: Chengyuan Liu, Shihang Wang, Yangyang Kang, Lizhi Qing, Fubang Zhao, Changlong Sun, Kun Kuang, Fei Wu

    Abstract: The performance on general tasks decreases after Large Language Models (LLMs) are fine-tuned on domain-specific tasks, the phenomenon is known as Catastrophic Forgetting (CF). However, this paper presents a further challenge for real application of domain-specific LLMs beyond CF, called General Capabilities Integration (GCI), which necessitates the integration of both the general capabilities and… ▽ More

    Submitted 28 May, 2024; originally announced May 2024.

  15. arXiv:2405.16847  [pdf, other

    cs.CV cs.AI

    TokenUnify: Scalable Autoregressive Visual Pre-training with Mixture Token Prediction

    Authors: Yinda Chen, Haoyuan Shi, Xiaoyu Liu, Te Shi, Ruobing Zhang, Dong Liu, Zhiwei Xiong, Feng Wu

    Abstract: Autoregressive next-token prediction is a standard pretraining method for large-scale language models, but its application to vision tasks is hindered by the non-sequential nature of image data, leading to cumulative errors. Most vision models employ masked autoencoder (MAE) based pretraining, which faces scalability issues. To address these challenges, we introduce \textbf{TokenUnify}, a novel pr… ▽ More

    Submitted 27 May, 2024; originally announced May 2024.

  16. arXiv:2405.14314  [pdf, other

    cs.AI cs.CL cs.LG cs.MA cs.RO

    Towards Efficient LLM Grounding for Embodied Multi-Agent Collaboration

    Authors: Yang Zhang, Shixin Yang, Chenjia Bai, Fei Wu, Xiu Li, Zhen Wang, Xuelong Li

    Abstract: Grounding the reasoning ability of large language models (LLMs) for embodied tasks is challenging due to the complexity of the physical world. Especially, LLM planning for multi-agent collaboration requires communication of agents or credit assignment as the feedback to re-adjust the proposed plans and achieve effective coordination. However, existing methods that overly rely on physical verificat… ▽ More

    Submitted 25 May, 2024; v1 submitted 23 May, 2024; originally announced May 2024.

    Comments: The first two authors contributed equally

  17. arXiv:2405.13097  [pdf, other

    cs.CV

    NieR: Normal-Based Lighting Scene Rendering

    Authors: Hongsheng Wang, Yang Wang, Yalan Liu, Fayuan Hu, Shengyu Zhang, Fei Wu, Feng Lin

    Abstract: In real-world road scenes, diverse material properties lead to complex light reflection phenomena, making accurate color reproduction crucial for enhancing the realism and safety of simulated driving environments. However, existing methods often struggle to capture the full spectrum of lighting effects, particularly in dynamic scenarios where viewpoint changes induce significant material color var… ▽ More

    Submitted 21 May, 2024; originally announced May 2024.

  18. arXiv:2405.12806  [pdf, other

    cs.CV

    MOSS: Motion-based 3D Clothed Human Synthesis from Monocular Video

    Authors: Hongsheng Wang, Xiang Cai, Xi Sun, **hong Yue, Zhanyun Tang, Shengyu Zhang, Feng Lin, Fei Wu

    Abstract: Single-view clothed human reconstruction holds a central position in virtual reality applications, especially in contexts involving intricate human motions. It presents notable challenges in achieving realistic clothing deformation. Current methodologies often overlook the influence of motion on surface deformation, resulting in surfaces lacking the constraints imposed by global motion. To overcom… ▽ More

    Submitted 21 June, 2024; v1 submitted 21 May, 2024; originally announced May 2024.

    Comments: arXiv admin note: text overlap with arXiv:1710.03746 by other authors

  19. arXiv:2405.12724  [pdf, other

    cs.CV

    RemoCap: Disentangled Representation Learning for Motion Capture

    Authors: Hongsheng Wang, Lizao Zhang, Zhangnan Zhong, Shuolin Xu, Xinrui Zhou, Shengyu Zhang, Huahao Xu, Fei Wu, Feng Lin

    Abstract: Reconstructing 3D human bodies from realistic motion sequences remains a challenge due to pervasive and complex occlusions. Current methods struggle to capture the dynamics of occluded body parts, leading to model penetration and distorted motion. RemoCap leverages Spatial Disentanglement (SD) and Motion Disentanglement (MD) to overcome these limitations. SD addresses occlusion interference betwee… ▽ More

    Submitted 21 May, 2024; originally announced May 2024.

  20. arXiv:2405.12505  [pdf, other

    cs.CV

    NOVA-3D: Non-overlapped Views for 3D Anime Character Reconstruction

    Authors: Hongsheng Wang, Nanjie Yao, Xinrui Zhou, Shengyu Zhang, Huahao Xu, Fei Wu, Feng Lin

    Abstract: In the animation industry, 3D modelers typically rely on front and back non-overlapped concept designs to guide the 3D modeling of anime characters. However, there is currently a lack of automated approaches for generating anime characters directly from these 2D designs. In light of this, we explore a novel task of reconstructing anime characters from non-overlapped views. This presents two main c… ▽ More

    Submitted 21 May, 2024; originally announced May 2024.

  21. arXiv:2405.12477  [pdf, other

    cs.CV

    Gaussian Control with Hierarchical Semantic Graphs in 3D Human Recovery

    Authors: Hongsheng Wang, Weiyue Zhang, Sihao Liu, Xinrui Zhou, **g Li, Zhanyun Tang, Shengyu Zhang, Fei Wu, Feng Lin

    Abstract: Although 3D Gaussian Splatting (3DGS) has recently made progress in 3D human reconstruction, it primarily relies on 2D pixel-level supervision, overlooking the geometric complexity and topological relationships of different body parts. To address this gap, we introduce the Hierarchical Graph Human Gaussian Control (HUGS) framework for achieving high-fidelity 3D human reconstruction. Our approach i… ▽ More

    Submitted 21 June, 2024; v1 submitted 20 May, 2024; originally announced May 2024.

  22. arXiv:2405.08443  [pdf, other

    cs.LG

    Safety Constrained Multi-Agent Reinforcement Learning for Active Voltage Control

    Authors: Yang Qu, **ming Ma, Feng Wu

    Abstract: Active voltage control presents a promising avenue for relieving power congestion and enhancing voltage quality, taking advantage of the distributed controllable generators in the power network, such as roof-top photovoltaics. While Multi-Agent Reinforcement Learning (MARL) has emerged as a compelling approach to address this challenge, existing MARL approaches tend to overlook the constrained opt… ▽ More

    Submitted 14 May, 2024; originally announced May 2024.

    Comments: Accepted by IJCAI2024

  23. arXiv:2405.06914  [pdf, other

    cs.CV

    Non-confusing Generation of Customized Concepts in Diffusion Models

    Authors: Wang Lin, **gyuan Chen, Jiaxin Shi, Yichen Zhu, Chen Liang, Junzhong Miao, Tao **, Zhou Zhao, Fei Wu, Shuicheng Yan, Hanwang Zhang

    Abstract: We tackle the common challenge of inter-concept visual confusion in compositional concept generation using text-guided diffusion models (TGDMs). It becomes even more pronounced in the generation of customized concepts, due to the scarcity of user-provided concept visual examples. By revisiting the two major stages leading to the success of TGDMs -- 1) contrastive image-language pre-training (CLIP)… ▽ More

    Submitted 11 May, 2024; originally announced May 2024.

  24. arXiv:2405.05722  [pdf, other

    cs.LG cond-mat.mtrl-sci physics.chem-ph

    A Framework of SO(3)-equivariant Non-linear Representation Learning and its Application to Electronic-Structure Hamiltonian Prediction

    Authors: Shi Yin, Xinyang Pan, Fengyan Wang, Feng Wu, Lixin He

    Abstract: We present both a theoretical and a methodological framework that addresses a critical challenge in applying deep learning to physical systems: the reconciliation of non-linear expressiveness with SO(3)-equivariance in predictions of SO(3)-equivariant quantities. Inspired by covariant theory in physics, we address this problem by exploring the mathematical relationships between SO(3)-invariant and… ▽ More

    Submitted 18 June, 2024; v1 submitted 9 May, 2024; originally announced May 2024.

  25. arXiv:2405.04307  [pdf, other

    cs.RO cs.AI cs.LG

    Improving Offline Reinforcement Learning with Inaccurate Simulators

    Authors: Yiwen Hou, Haoyuan Sun, **ming Ma, Feng Wu

    Abstract: Offline reinforcement learning (RL) provides a promising approach to avoid costly online interaction with the real environment. However, the performance of offline RL highly depends on the quality of the datasets, which may cause extrapolation error in the learning process. In many robotic applications, an inaccurate simulator is often available. However, the data directly collected from the inacc… ▽ More

    Submitted 7 May, 2024; originally announced May 2024.

  26. arXiv:2405.04044  [pdf, other

    cs.CV

    DMOFC: Discrimination Metric-Optimized Feature Compression

    Authors: Changsheng Gao, Yiheng Jiang, Li Li, Dong Liu, Feng Wu

    Abstract: Feature compression, as an important branch of video coding for machines (VCM), has attracted significant attention and exploration. However, the existing methods mainly focus on intra-feature similarity, such as the Mean Squared Error (MSE) between the reconstructed and original features, while neglecting the importance of inter-feature relationships. In this paper, we analyze the inter-feature r… ▽ More

    Submitted 7 May, 2024; originally announced May 2024.

  27. arXiv:2405.04009  [pdf, other

    cs.CV cs.AI

    Structured Click Control in Transformer-based Interactive Segmentation

    Authors: Long Xu, Yongquan Chen, Rui Huang, Feng Wu, Shiwu Lai

    Abstract: Click-point-based interactive segmentation has received widespread attention due to its efficiency. However, it's hard for existing algorithms to obtain precise and robust responses after multiple clicks. In this case, the segmentation results tend to have little change or are even worse than before. To improve the robustness of the response, we propose a structured click intent model based on gra… ▽ More

    Submitted 7 May, 2024; originally announced May 2024.

    Comments: 10 pages, 6 figures, submitted to NeurIPS 2024

  28. arXiv:2405.02918  [pdf, other

    cs.CV

    MERIT: Multi-view Evidential learning for Reliable and Interpretable liver fibrosis sTaging

    Authors: Yuanye Liu, Zheyao Gao, Nannan Shi, Fu** Wu, Yuxin Shi, Qingchao Chen, Xiahai Zhuang

    Abstract: Accurate staging of liver fibrosis from magnetic resonance imaging (MRI) is crucial in clinical practice. While conventional methods often focus on a specific sub-region, multi-view learning captures more information by analyzing multiple patches simultaneously. However, previous multi-view approaches could not typically calculate uncertainty by nature, and they generally integrate features from d… ▽ More

    Submitted 5 May, 2024; originally announced May 2024.

    Comments: Submitted to Medical Image Analysis

    MSC Class: 68U10 ACM Class: I.4.6

  29. arXiv:2404.19644  [pdf, other

    cs.CV

    MetaCoCo: A New Few-Shot Classification Benchmark with Spurious Correlation

    Authors: Min Zhang, Haoxuan Li, Fei Wu, Kun Kuang

    Abstract: Out-of-distribution (OOD) problems in few-shot classification (FSC) occur when novel classes sampled from testing distributions differ from base classes drawn from training distributions, which considerably degrades the performance of deep learning models deployed in real-world applications. Recent studies suggest that the OOD problems in FSC mainly including: (a) cross-domain few-shot classificat… ▽ More

    Submitted 30 April, 2024; originally announced April 2024.

    Comments: ICLR 24

  30. arXiv:2404.18143  [pdf, other

    cs.CV

    Tracking Transforming Objects: A Benchmark

    Authors: You Wu, Yuelong Wang, Yaxin Liao, Fuliang Wu, Hengzhou Ye, Shuiwang Li

    Abstract: Tracking transforming objects holds significant importance in various fields due to the dynamic nature of many real-world scenarios. By enabling systems accurately represent transforming objects over time, tracking transforming objects facilitates advancements in areas such as autonomous systems, human-computer interaction, and security applications. Moreover, understanding the behavior of transfo… ▽ More

    Submitted 28 April, 2024; originally announced April 2024.

  31. arXiv:2404.14233  [pdf, other

    cs.CV cs.AI cs.CL cs.LG

    Detecting and Mitigating Hallucination in Large Vision Language Models via Fine-Grained AI Feedback

    Authors: Wenyi Xiao, Ziwei Huang, Leilei Gan, Wanggui He, Haoyuan Li, Zhelun Yu, Hao Jiang, Fei Wu, Linchao Zhu

    Abstract: The rapidly develo** Large Vision Language Models (LVLMs) have shown notable capabilities on a range of multi-modal tasks, but still face the hallucination phenomena where the generated texts do not align with the given contexts, significantly restricting the usages of LVLMs. Most previous work detects and mitigates hallucination at the coarse-grained level or requires expensive annotation (e.g.… ▽ More

    Submitted 22 April, 2024; originally announced April 2024.

  32. arXiv:2404.14037  [pdf, other

    cs.CV cs.MM

    GaussianTalker: Speaker-specific Talking Head Synthesis via 3D Gaussian Splatting

    Authors: Hongyun Yu, Zhan Qu, Qihang Yu, Jianchuan Chen, Zhonghua Jiang, Zhiwen Chen, Shengyu Zhang, Jimin Xu, Fei Wu, Chengfei Lv, Gang Yu

    Abstract: Recent works on audio-driven talking head synthesis using Neural Radiance Fields (NeRF) have achieved impressive results. However, due to inadequate pose and expression control caused by NeRF implicit representation, these methods still have some limitations, such as unsynchronized or unnatural lip movements, and visual jitter and artifacts. In this paper, we propose GaussianTalker, a novel method… ▽ More

    Submitted 28 April, 2024; v1 submitted 22 April, 2024; originally announced April 2024.

    Comments: https://yuhongyun777.github.io/GaussianTalker/

  33. arXiv:2404.13322  [pdf, other

    cs.LG cs.AI

    MergeNet: Knowledge Migration across Heterogeneous Models, Tasks, and Modalities

    Authors: Kunxi Li, Tianyu Zhan, Kairui Fu, Shengyu Zhang, Kun Kuang, Jiwei Li, Zhou Zhao, Fei Wu

    Abstract: In this study, we focus on heterogeneous knowledge transfer across entirely different model architectures, tasks, and modalities. Existing knowledge transfer methods (e.g., backbone sharing, knowledge distillation) often hinge on shared elements within model structures or task-specific features/labels, limiting transfers to complex model types or tasks. To overcome these challenges, we present Mer… ▽ More

    Submitted 17 June, 2024; v1 submitted 20 April, 2024; originally announced April 2024.

  34. arXiv:2404.12638  [pdf, other

    cs.AI

    Learning to Cut via Hierarchical Sequence/Set Model for Efficient Mixed-Integer Programming

    Authors: Jie Wang, Zhihai Wang, Xijun Li, Yufei Kuang, Zhihao Shi, Fangzhou Zhu, Mingxuan Yuan, Jia Zeng, Yongdong Zhang, Feng Wu

    Abstract: Cutting planes (cuts) play an important role in solving mixed-integer linear programs (MILPs), which formulate many important real-world applications. Cut selection heavily depends on (P1) which cuts to prefer and (P2) how many cuts to select. Although modern MILP solvers tackle (P1)-(P2) by human-designed heuristics, machine learning carries the potential to learn more effective heuristics. Howev… ▽ More

    Submitted 19 April, 2024; originally announced April 2024.

    Comments: arXiv admin note: substantial text overlap with arXiv:2302.00244

  35. arXiv:2404.04102  [pdf, other

    cs.LG cs.AI cs.CL

    ROPO: Robust Preference Optimization for Large Language Models

    Authors: Xize Liang, Chao Chen, Shuang Qiu, Jie Wang, Yue Wu, Zhihang Fu, Zhihao Shi, Feng Wu, Jie** Ye

    Abstract: Preference alignment is pivotal for empowering large language models (LLMs) to generate helpful and harmless responses. However, the performance of preference alignment is highly sensitive to the prevalent noise in the preference data. Recent efforts for this problem either marginally alleviate the impact of noise without the ability to actually reduce its presence, or rely on costly teacher LLMs… ▽ More

    Submitted 28 May, 2024; v1 submitted 5 April, 2024; originally announced April 2024.

  36. arXiv:2404.01882  [pdf, other

    cs.CV

    Scene Adaptive Sparse Transformer for Event-based Object Detection

    Authors: Yansong Peng, Hebei Li, Yueyi Zhang, Xiaoyan Sun, Feng Wu

    Abstract: While recent Transformer-based approaches have shown impressive performances on event-based object detection tasks, their high computational costs still diminish the low power consumption advantage of event cameras. Image-based works attempt to reduce these costs by introducing sparse Transformers. However, they display inadequate sparsity and adaptability when applied to event-based object detect… ▽ More

    Submitted 2 April, 2024; originally announced April 2024.

  37. arXiv:2404.00942  [pdf, other

    cs.CL cs.AI cs.LG

    Evaluating the Factuality of Large Language Models using Large-Scale Knowledge Graphs

    Authors: Xiaoze Liu, Feijie Wu, Tianyang Xu, Zhuo Chen, Yichi Zhang, Xiaoqian Wang, **g Gao

    Abstract: The advent of Large Language Models (LLMs) has significantly transformed the AI landscape, enhancing machine learning and AI capabilities. Factuality issue is a critical concern for LLMs, as they may generate factually incorrect responses. In this paper, we propose GraphEval to evaluate an LLM's performance using a substantially large test dataset. Specifically, the test dataset is retrieved from… ▽ More

    Submitted 1 April, 2024; originally announced April 2024.

  38. arXiv:2403.17297  [pdf, other

    cs.CL cs.AI

    InternLM2 Technical Report

    Authors: Zheng Cai, Maosong Cao, Haojiong Chen, Kai Chen, Keyu Chen, Xin Chen, Xun Chen, Zehui Chen, Zhi Chen, Pei Chu, Xiaoyi Dong, Haodong Duan, Qi Fan, Zhaoye Fei, Yang Gao, Jiaye Ge, Chenya Gu, Yuzhe Gu, Tao Gui, Aijia Guo, Qipeng Guo, Conghui He, Yingfan Hu, Ting Huang, Tao Jiang , et al. (75 additional authors not shown)

    Abstract: The evolution of Large Language Models (LLMs) like ChatGPT and GPT-4 has sparked discussions on the advent of Artificial General Intelligence (AGI). However, replicating such advancements in open-source models has been challenging. This paper introduces InternLM2, an open-source LLM that outperforms its predecessors in comprehensive evaluations across 6 dimensions and 30 benchmarks, long-context m… ▽ More

    Submitted 25 March, 2024; originally announced March 2024.

  39. arXiv:2403.16854  [pdf, other

    cs.CL cs.AI

    An Expert is Worth One Token: Synergizing Multiple Expert LLMs as Generalist via Expert Token Routing

    Authors: Ziwei Chai, Guoyin Wang, **g Su, Tianjie Zhang, Xuanwen Huang, Xuwu Wang, **g**g Xu, Jianbo Yuan, Hongxia Yang, Fei Wu, Yang Yang

    Abstract: We present Expert-Token-Routing, a unified generalist framework that facilitates seamless integration of multiple expert LLMs. Our framework represents expert LLMs as special expert tokens within the vocabulary of a meta LLM. The meta LLM can route to an expert LLM like generating new tokens. Expert-Token-Routing not only supports learning the implicit expertise of expert LLMs from existing instru… ▽ More

    Submitted 11 June, 2024; v1 submitted 25 March, 2024; originally announced March 2024.

  40. arXiv:2403.14232  [pdf, other

    cs.LG

    Contrastive Balancing Representation Learning for Heterogeneous Dose-Response Curves Estimation

    Authors: Minqin Zhu, Anpeng Wu, Haoxuan Li, Ruoxuan Xiong, Bo Li, Xiaoqing Yang, Xuan Qin, Peng Zhen, Jiecheng Guo, Fei Wu, Kun Kuang

    Abstract: Estimating the individuals' potential response to varying treatment doses is crucial for decision-making in areas such as precision medicine and management science. Most recent studies predict counterfactual outcomes by learning a covariate representation that is independent of the treatment variable. However, such independence constraints neglect much of the covariate information that is useful f… ▽ More

    Submitted 21 March, 2024; originally announced March 2024.

  41. arXiv:2403.11694  [pdf, other

    eess.IV cs.CV

    Object Segmentation-Assisted Inter Prediction for Versatile Video Coding

    Authors: Zhuoyuan Li, Zikun Yuan, Li Li, Dong Liu, Xiaohu Tang, Feng Wu

    Abstract: In modern video coding standards, block-based inter prediction is widely adopted, which brings high compression efficiency. However, in natural videos, there are usually multiple moving objects of arbitrary shapes, resulting in complex motion fields that are difficult to compactly represent. This problem has been tackled by more flexible block partitioning methods in the Versatile Video Coding (VV… ▽ More

    Submitted 18 March, 2024; originally announced March 2024.

    Comments: 22 pages, 15 figures

  42. arXiv:2403.11449  [pdf, other

    cs.LG

    Graph Partial Label Learning with Potential Cause Discovering

    Authors: Hang Gao, Jiaguo Yuan, Jiangmeng Li, Peng Qiao, Fengge Wu, Changwen Zheng, Hua** Liu

    Abstract: Graph Neural Networks (GNNs) have garnered widespread attention for their potential to address the challenges posed by graph representation learning, which face complex graph-structured data across various domains. However, due to the inherent complexity and interconnectedness of graphs, accurately annotating graph data for training GNNs is extremely challenging. To address this issue, we have int… ▽ More

    Submitted 21 May, 2024; v1 submitted 17 March, 2024; originally announced March 2024.

  43. arXiv:2403.08157  [pdf

    cs.CV

    Multiscale Low-Frequency Memory Network for Improved Feature Extraction in Convolutional Neural Networks

    Authors: Fuzhi Wu, Jiasong Wu, Youyong Kong, Chunfeng Yang, Guanyu Yang, Huazhong Shu, Guy Carrault, Lotfi Senhadji

    Abstract: Deep learning and Convolutional Neural Networks (CNNs) have driven major transformations in diverse research areas. However, their limitations in handling low-frequency information present obstacles in certain tasks like interpreting global structures or managing smooth transition images. Despite the promising performance of transformer structures in numerous tasks, their intricate optimization co… ▽ More

    Submitted 12 March, 2024; originally announced March 2024.

    Comments: 9 pages, 10 figures,6 tables. AAAI 2024 conference

  44. arXiv:2403.07030  [pdf, other

    cs.LG cs.CV

    AuG-KD: Anchor-Based Mixup Generation for Out-of-Domain Knowledge Distillation

    Authors: Zihao Tang, Zheqi Lv, Shengyu Zhang, Yifan Zhou, Xinyu Duan, Fei Wu, Kun Kuang

    Abstract: Due to privacy or patent concerns, a growing number of large models are released without granting access to their training data, making transferring their knowledge inefficient and problematic. In response, Data-Free Knowledge Distillation (DFKD) methods have emerged as direct solutions. However, simply adopting models derived from DFKD for real-world applications suffers significant performance d… ▽ More

    Submitted 17 March, 2024; v1 submitted 10 March, 2024; originally announced March 2024.

    Comments: Accepted to ICLR 2024

  45. arXiv:2403.06504  [pdf, other

    cs.DC

    Adding NVMe SSDs to Enable and Accelerate 100B Model Fine-tuning on a Single GPU

    Authors: Changyue Liao, Mo Sun, Zihan Yang, Kaiqi Chen, Binhang Yuan, Fei Wu, Zeke Wang

    Abstract: Recent advances in large language models have brought immense value to the world, with their superior capabilities stemming from the massive number of parameters they utilize. However, even the GPUs with the highest memory capacities, currently peaking at 80GB, are far from sufficient to accommodate these vast parameters and their associated optimizer states when conducting stochastic gradient des… ▽ More

    Submitted 11 March, 2024; originally announced March 2024.

  46. arXiv:2403.06414  [pdf, other

    cs.CL

    Evolving Knowledge Distillation with Large Language Models and Active Learning

    Authors: Chengyuan Liu, Yangyang Kang, Fubang Zhao, Kun Kuang, Zhuoren Jiang, Changlong Sun, Fei Wu

    Abstract: Large language models (LLMs) have demonstrated remarkable capabilities across various NLP tasks. However, their computational costs are prohibitively high. To address this issue, previous research has attempted to distill the knowledge of LLMs into smaller models by generating annotated data. Nonetheless, these works have mainly focused on the direct use of LLMs for text generation and labeling, w… ▽ More

    Submitted 10 March, 2024; originally announced March 2024.

    Comments: Accepted by COLING 2024

  47. arXiv:2403.04369  [pdf, other

    cs.AI cs.CL

    From Graph to Word Bag: Introducing Domain Knowledge to Confusing Charge Prediction

    Authors: Ang Li, Qiangchao Chen, Yiquan Wu, Ming Cai, Xiang Zhou, Fei Wu, Kun Kuang

    Abstract: Confusing charge prediction is a challenging task in legal AI, which involves predicting confusing charges based on fact descriptions. While existing charge prediction methods have shown impressive performance, they face significant challenges when dealing with confusing charges, such as Snatch and Robbery. In the legal domain, constituent elements play a pivotal role in distinguishing confusing c… ▽ More

    Submitted 24 March, 2024; v1 submitted 7 March, 2024; originally announced March 2024.

  48. arXiv:2403.04366  [pdf, other

    cs.AI

    Enhancing Court View Generation with Knowledge Injection and Guidance

    Authors: Ang Li, Yiquan Wu, Yifei Liu, Fei Wu, Ming Cai, Kun Kuang

    Abstract: Court View Generation (CVG) is a challenging task in the field of Legal Artificial Intelligence (LegalAI), which aims to generate court views based on the plaintiff claims and the fact descriptions. While Pretrained Language Models (PLMs) have showcased their prowess in natural language generation, their application to the complex, knowledge-intensive domain of CVG often reveals inherent limitatio… ▽ More

    Submitted 7 March, 2024; originally announced March 2024.

  49. arXiv:2403.02624  [pdf, other

    cs.LG cs.AI

    Pareto-Optimal Estimation and Policy Learning on Short-term and Long-term Treatment Effects

    Authors: Yingrong Wang, Anpeng Wu, Haoxuan Li, Weiming Liu, Qiaowei Miao, Ruoxuan Xiong, Fei Wu, Kun Kuang

    Abstract: This paper focuses on develo** Pareto-optimal estimation and policy learning to identify the most effective treatment that maximizes the total reward from both short-term and long-term effects, which might conflict with each other. For example, a higher dosage of medication might increase the speed of a patient's recovery (short-term) but could also result in severe long-term side effects. Altho… ▽ More

    Submitted 12 March, 2024; v1 submitted 4 March, 2024; originally announced March 2024.

  50. arXiv:2403.02607  [pdf

    cs.GT cs.AI

    MEBS: Multi-task End-to-end Bid Shading for Multi-slot Display Advertising

    Authors: Zhen Gong, Lvyin Niu, Yang Zhao, Miao Xu, Zhenzhe Zheng, Haoqi Zhang, Zhilin Zhang, Fan Wu, Rongquan Bai, Chuan Yu, Jian Xu, Bo Zheng

    Abstract: Online bidding and auction are crucial aspects of the online advertising industry. Conventionally, there is only one slot for ad display and most current studies focus on it. Nowadays, multi-slot display advertising is gradually becoming popular where many ads could be displayed in a list and shown as a whole to users. However, multi-slot display advertising leads to different cost-effectiveness.… ▽ More

    Submitted 4 March, 2024; originally announced March 2024.