Showing 1–50 of 123 results for author: Long, G

Searching in archive cs.

  1. arXiv:2406.19631  [pdf, other]

    cs.LG cs.DC

    Personalized Interpretation on Federated Learning: A Virtual Concepts approach

    Authors: Peng Yan, Guodong Long, Jing Jiang, Michael Blumenstein

    Abstract: Tackling non-IID data is an open challenge in federated learning research. Existing FL methods, including robust FL and personalized FL, are designed to improve model performance without consideration of interpreting non-IID across clients. This paper aims to design a novel FL method to robust and interpret the non-IID data across clients. Specifically, we interpret each client's dataset as a mixt…

    Submitted 27 June, 2024; originally announced June 2024.

  2. arXiv:2406.01961  [pdf, other]

    cs.RO cs.CV

    Exploring Real World Map Change Generalization of Prior-Informed HD Map Prediction Models

    Authors: Samuel M. Bateman, Ning Xu, H. Charles Zhao, Yael Ben Shalom, Vince Gong, Greg Long, Will Maddern

    Abstract: Building and maintaining High-Definition (HD) maps represents a large barrier to autonomous vehicle deployment. This, along with advances in modern online map detection models, has sparked renewed interest in the online mapping problem. However, effectively predicting online maps at a high enough quality to enable safe, driverless deployments remains a significant challenge. Recent work on these m…

    Submitted 5 June, 2024; v1 submitted 4 June, 2024; originally announced June 2024.

    Comments: Accepted to CVPR 2024, Workshop on Autonomous Driving

  3. arXiv:2406.00004  [pdf, other]

    cs.IR cs.AI cs.LG

    Navigating the Future of Federated Recommendation Systems with Foundation Models

    Authors: Zhiwei Li, Guodong Long

    Abstract: In recent years, the integration of federated learning (FL) and recommendation systems (RS), known as Federated Recommendation Systems (FRS), has attracted attention for preserving user privacy by keeping private data on client devices. However, FRS faces inherent limitations such as data heterogeneity and scarcity, due to the privacy requirements of FL and the typical data sparsity issues of RSs.…

    Submitted 3 June, 2024; v1 submitted 12 May, 2024; originally announced June 2024.

    Comments: 20 pages, position paper

  4. arXiv:2405.20348  [pdf, other]

    physics.ao-ph cs.LG

    Personalized Adapter for Large Meteorology Model on Devices: Towards Weather Foundation Models

    Authors: Shengchao Chen, Guodong Long, Jing Jiang, Chengqi Zhang

    Abstract: This paper demonstrates that pre-trained language models (PLMs) are strong foundation models for on-device meteorological variables modeling. We present LM-Weather, a generic approach to taming PLMs, that have learned massive sequential knowledge from the universe of natural language databases, to acquire an immediate capability to obtain highly customized models for heterogeneous meteorological d…

    Submitted 24 May, 2024; originally announced May 2024.

    Comments: 42 pages, under review

  5. arXiv:2405.18123  [pdf, other]

    cs.AI

    PyTAG: Tabletop Games for Multi-Agent Reinforcement Learning

    Authors: Martin Balla, George E. M. Long, James Goodman, Raluca D. Gaina, Diego Perez-Liebana

    Abstract: Modern Tabletop Games present various interesting challenges for Multi-agent Reinforcement Learning. In this paper, we introduce PyTAG, a new framework that supports interacting with a large collection of games implemented in the Tabletop Games framework. In this work we highlight the challenges tabletop games provide, from a game-playing agent perspective, along with the opportunities they provid…

    Submitted 28 May, 2024; originally announced May 2024.

  6. arXiv:2405.16472  [pdf, other]

    cs.LG

    Multi-Level Additive Modeling for Structured Non-IID Federated Learning

    Authors: Shutong Chen, Tianyi Zhou, Guodong Long, Jie Ma, Jing Jiang, Chengqi Zhang

    Abstract: The primary challenge in Federated Learning (FL) is to model non-IID distributions across clients, whose fine-grained structure is important to improve knowledge sharing. For example, some knowledge is globally shared across all clients, some is only transferable within a subgroup of clients, and some are client-specific. To capture and exploit this structure, we train models organized in a multi-…

    Submitted 26 May, 2024; originally announced May 2024.

  7. arXiv:2405.04840  [pdf, other]

    cs.IR

    Federated Adaptation for Foundation Model-based Recommendations

    Authors: Chunxu Zhang, Guodong Long, Hongkuan Guo, Xiao Fang, Yang Song, Zhaojie Liu, Guorui Zhou, Zijian Zhang, Yang Liu, Bo Yang

    Abstract: With the recent success of large language models, particularly foundation models with generalization abilities, applying foundation models for recommendations becomes a new paradigm to improve existing recommendation systems. It becomes a new open challenge to enable the foundation model to capture user preference changes in a timely manner with reasonable communication and computation costs while…

    Submitted 8 May, 2024; originally announced May 2024.

    Comments: Accepted as a regular paper of IJCAI'24

  8. arXiv:2404.10942  [pdf, other]

    cs.LG cs.AI cs.CY stat.ME

    What Hides behind Unfairness? Exploring Dynamics Fairness in Reinforcement Learning

    Authors: Zhihong Deng, Jing Jiang, Guodong Long, Chengqi Zhang

    Abstract: In sequential decision-making problems involving sensitive attributes like race and gender, reinforcement learning (RL) agents must carefully consider long-term fairness while maximizing returns. Recent works have proposed many different types of fairness notions, but how unfairness arises in RL problems remains unclear. In this paper, we address this gap in the literature by investigating the sou…

    Submitted 28 April, 2024; v1 submitted 16 April, 2024; originally announced April 2024.

    Comments: 13 pages, 9 figures, accepted by IJCAI 2024

  9. arXiv:2403.19499  [pdf, other]

    cs.LG

    Client-supervised Federated Learning: Towards One-model-for-all Personalization

    Authors: Peng Yan, Guodong Long

    Abstract: Personalized Federated Learning (PerFL) is a new machine learning paradigm that delivers personalized models for diverse clients under federated learning settings. Most PerFL methods require extra learning processes on a client to adapt a globally shared model to the client-specific personalized model using its own local data. However, the model adaptation process in PerFL is still an open challen…

    Submitted 28 March, 2024; originally announced March 2024.

  10. arXiv:2403.19211  [pdf, other]

    cs.LG cs.AI cs.CL

    Dual-Personalizing Adapter for Federated Foundation Models

    Authors: Yiyuan Yang, Guodong Long, Tao Shen, Jing Jiang, Michael Blumenstein

    Abstract: Recently, foundation models, particularly large language models (LLMs), have demonstrated an impressive ability to adapt to various tasks by fine-tuning large amounts of instruction data. Notably, federated foundation models emerge as a privacy preservation method to fine-tune models collaboratively under federated learning (FL) settings by leveraging many distributed datasets with non-IID data. T…

    Submitted 28 March, 2024; originally announced March 2024.

  11. arXiv:2402.03944  [pdf, other]

    cs.CV

    IMUSE: IMU-based Facial Expression Capture

    Authors: Youjia Wang, Yiwen Wu, Hengan Zhou, Hongyang Lin, Xingyue Peng, Yingwenqi Jiang, Yingsheng Zhu, Guanpeng Long, Yatu Zhang, Jingya Wang, Lan Xu, Jingyi Yu

    Abstract: For facial motion capture and analysis, the dominant solutions are generally based on visual cues, which cannot protect privacy and are vulnerable to occlusions. Inertial measurement units (IMUs) serve as potential rescues yet are mainly adopted for full-body motion capture. In this paper, we propose IMUSE to fill the gap, a novel path for facial expression capture using purely IMU signals, signi…

    Submitted 12 June, 2024; v1 submitted 3 February, 2024; originally announced February 2024.

    Comments: Go to IMUSE project page https://sites.google.com/view/projectpage-imuse and watch our video https://youtu.be/Rki9syHsvpc

  12. arXiv:2312.03014  [pdf, other]

    cs.LG cs.AI cs.CV physics.ao-ph

    Foundation Models for Weather and Climate Data Understanding: A Comprehensive Survey

    Authors: Shengchao Chen, Guodong Long, Jing Jiang, Dikai Liu, Chengqi Zhang

    Abstract: As artificial intelligence (AI) continues to rapidly evolve, the realm of Earth and atmospheric sciences is increasingly adopting data-driven models, powered by progressive developments in deep learning (DL). Specifically, DL techniques are extensively utilized to decode the chaotic and nonlinear aspects of Earth systems, and to address climate challenges via understanding weather and climate data…

    Submitted 4 December, 2023; originally announced December 2023.

    Comments: Ongoing work. Survey Paper. 35 pages, 2 figures, 4 tables. The first work to comprehensively and systematically summarize DL-based weather and climate data understanding, paving the way for the development of weather and climate foundation models

  13. arXiv:2311.08734  [pdf, other]

    cs.CL

    Thread of Thought Unraveling Chaotic Contexts

    Authors: Yucheng Zhou, Xiubo Geng, Tao Shen, Chongyang Tao, Guodong Long, Jian-Guang Lou, Jianbing Shen

    Abstract: Large Language Models (LLMs) have ushered in a transformative era in the field of natural language processing, excelling in tasks related to text comprehension and generation. Nevertheless, they encounter difficulties when confronted with chaotic contexts (e.g., distractors rather than long irrelevant context), leading to the inadvertent omission of certain details within the chaotic context. In r…

    Submitted 15 November, 2023; originally announced November 2023.

    Comments: 11 pages, 7 figures, 5 tables

  14. arXiv:2309.12529  [pdf, other]

    cs.AI

    Curriculum Reinforcement Learning via Morphology-Environment Co-Evolution

    Authors: Shuang Ao, Tianyi Zhou, Guodong Long, Xuan Song, Jing Jiang

    Abstract: Throughout long history, natural species have learned to survive by evolving their physical structures adaptive to the environment changes. In contrast, current reinforcement learning (RL) studies mainly focus on training an agent with a fixed morphology (e.g., skeletal structure and joint attributes) in a fixed environment, which can hardly generalize to changing environments or new tasks. In thi…

    Submitted 21 September, 2023; originally announced September 2023.

  15. arXiv:2309.06275  [pdf, other]

    cs.CL

    Re-Reading Improves Reasoning in Large Language Models

    Authors: Xiaohan Xu, Chongyang Tao, Tao Shen, Can Xu, Hongbo Xu, Guodong Long, Jian-guang Lou

    Abstract: To enhance the reasoning capabilities of off-the-shelf Large Language Models (LLMs), we introduce a simple, yet general and effective prompting method, Re2, i.e., Re-Reading the question as input. Unlike most thought-eliciting prompting methods, such as Chain-of-Thought (CoT), which aim to elicit the reasoning process in the output, Re2 shifts the focus to the input by processing…

    Submitted 29 February, 2024; v1 submitted 12 September, 2023; originally announced September 2023.

    Comments: 25 pages

  16. arXiv:2307.09905  [pdf, other]

    cs.AI

    PyTAG: Challenges and Opportunities for Reinforcement Learning in Tabletop Games

    Authors: Martin Balla, George E. M. Long, Dominik Jeurissen, James Goodman, Raluca D. Gaina, Diego Perez-Liebana

    Abstract: In recent years, Game AI research has made important breakthroughs using Reinforcement Learning (RL). Despite this, RL for modern tabletop games has gained little to no attention, even when they offer a range of unique challenges compared to video games. To bridge this gap, we introduce PyTAG, a Python API for interacting with the Tabletop Games framework (TAG). TAG contains a growing set of more…

    Submitted 19 July, 2023; originally announced July 2023.

    Comments: Accepted for Publication in: IEEE Conference on Games (2023)

  17. arXiv:2307.04365  [pdf, other]

    cs.CV cs.LG

    One-Shot Pruning for Fast-adapting Pre-trained Models on Devices

    Authors: Haiyan Zhao, Guodong Long

    Abstract: Large-scale pre-trained models have been remarkably successful in resolving downstream tasks. Nonetheless, deploying these models on low-capability devices still requires an effective approach, such as model pruning. However, pruning the model from scratch can pose a practical challenge given the limited resources of each downstream task or device. To tackle this issue, we present a scalable one-s…

    Submitted 10 July, 2023; originally announced July 2023.

  18. arXiv:2307.01452  [pdf, other]

    cs.LG cs.AI

    Causal Reinforcement Learning: A Survey

    Authors: Zhihong Deng, Jing Jiang, Guodong Long, Chengqi Zhang

    Abstract: Reinforcement learning is an essential paradigm for solving sequential decision problems under uncertainty. Despite many remarkable achievements in recent decades, applying reinforcement learning methods in the real world remains challenging. One of the main obstacles is that reinforcement learning agents lack a fundamental understanding of the world and must therefore learn from scratch through n…

    Submitted 20 November, 2023; v1 submitted 3 July, 2023; originally announced July 2023.

    Comments: 52 pages, 10 figures

  19. arXiv:2306.03570  [pdf, other]

    cs.LG

    Personalization Disentanglement for Federated Learning: An explainable perspective

    Authors: Peng Yan, Guodong Long

    Abstract: Personalized federated learning (PFL) jointly trains a variety of local models through balancing between knowledge sharing across clients and model personalization per client. This paper addresses PFL via explicit disentangling latent representations into two parts to capture the shared knowledge and client-specific personalization, which leads to more reliable and effective PFL. The disentangleme…

    Submitted 13 July, 2023; v1 submitted 6 June, 2023; originally announced June 2023.

  20. arXiv:2306.01090  [pdf, other]

    cs.CL

    Improving the Robustness of Summarization Systems with Dual Augmentation

    Authors: Xiuying Chen, Guodong Long, Chongyang Tao, Mingzhe Li, Xin Gao, Chengqi Zhang, Xiangliang Zhang

    Abstract: A robust summarization system should be able to capture the gist of the document, regardless of the specific word choices or noise in the input. In this work, we first explore the summarization models' robustness against perturbations including word-level synonym substitution and noise. To create semantic-consistent substitutes, we propose a SummAttacker, which is an efficient approach to generati…

    Submitted 1 June, 2023; originally announced June 2023.

    Comments: 10 pages, 6 figures, ACL 2023 main conference

  21. arXiv:2305.18444  [pdf, other]

    cs.LG cs.AI

    Continual Task Allocation in Meta-Policy Network via Sparse Prompting

    Authors: Yijun Yang, Tianyi Zhou, Jing Jiang, Guodong Long, Yuhui Shi

    Abstract: How to train a generalizable meta-policy by continually learning a sequence of tasks? It is a natural human skill yet challenging to achieve by current reinforcement learning: the agent is expected to quickly adapt to new tasks (plasticity) meanwhile retaining the common knowledge from previous tasks (stability). We address it by "Continual Task Allocation via Sparse Prompting (CoTASP)", which lea…

    Submitted 3 June, 2023; v1 submitted 28 May, 2023; originally announced May 2023.

    Comments: Accepted by ICML 2023

  22. arXiv:2305.14244  [pdf, other]

    cs.LG

    Federated Prompt Learning for Weather Foundation Models on Devices

    Authors: Shengchao Chen, Guodong Long, Tao Shen, Jing Jiang, Chengqi Zhang

    Abstract: On-device intelligence for weather forecasting, which uses local deep learning models to analyze weather patterns without centralized cloud computing, holds significance for supporting human activities. Federated Learning is a promising solution for such forecasting by enabling collaborative model training without sharing raw data. However, it faces three main challenges that hinder its reliability: (1) d…

    Submitted 21 April, 2024; v1 submitted 23 May, 2023; originally announced May 2023.

    Comments: Accepted by Main Track in IJCAI'24 (the 33rd International Joint Conference on Artificial Intelligence)

  23. arXiv:2305.12650  [pdf, other]

    cs.IR

    When Federated Recommendation Meets Cold-Start Problem: Separating Item Attributes and User Interactions

    Authors: Chunxu Zhang, Guodong Long, Tianyi Zhou, Zijian Zhang, Peng Yan, Bo Yang

    Abstract: Federated recommendation system usually trains a global model on the server without direct access to users' private data on their own devices. However, this separation of the recommendation model and users' private data poses a challenge in providing quality service, particularly when it comes to new items, namely cold-start recommendations in federated settings. This paper introduces a novel meth…

    Submitted 24 February, 2024; v1 submitted 21 May, 2023; originally announced May 2023.

    Comments: Accepted as a regular paper of WWW'24

  24. arXiv:2305.07866  [pdf, other]

    cs.IR

    GPFedRec: Graph-guided Personalization for Federated Recommendation

    Authors: Chunxu Zhang, Guodong Long, Tianyi Zhou, Zijian Zhang, Peng Yan, Bo Yang

    Abstract: The federated recommendation system is an emerging AI service architecture that provides recommendation services in a privacy-preserving manner. Using user-relation graphs to enhance federated recommendations is a promising topic. However, it is still an open challenge to construct the user-relation graph while preserving data locality-based privacy protection in federated settings. Inspired by a…

    Submitted 18 June, 2024; v1 submitted 13 May, 2023; originally announced May 2023.

    Comments: Accepted as a regular paper of KDD'24

  25. arXiv:2305.07402  [pdf, other]

    cs.CL cs.IR

    Synergistic Interplay between Search and Large Language Models for Information Retrieval

    Authors: Jiazhan Feng, Chongyang Tao, Xiubo Geng, Tao Shen, Can Xu, Guodong Long, Dongyan Zhao, Daxin Jiang

    Abstract: Information retrieval (IR) plays a crucial role in locating relevant resources from vast amounts of data, and its applications have evolved from traditional knowledge bases to modern retrieval models (RMs). The emergence of large language models (LLMs) has further revolutionized the IR field by enabling users to interact with search systems in natural languages. In this paper, we explore the advan…

    Submitted 12 December, 2023; v1 submitted 12 May, 2023; originally announced May 2023.

    Comments: Pre-print. Work in progress

  26. arXiv:2304.14233  [pdf, other]

    cs.CL cs.IR

    Large Language Models are Strong Zero-Shot Retriever

    Authors: Tao Shen, Guodong Long, Xiubo Geng, Chongyang Tao, Tianyi Zhou, Daxin Jiang

    Abstract: In this work, we propose a simple method that applies a large language model (LLM) to large-scale retrieval in zero-shot scenarios. Our method, the Language language model as Retriever (LameR), is built upon no other neural models but an LLM, while breaking brute-force combinations of retrievers with LLMs and lifting the performance of zero-shot retrieval to be very competitive on benchmark datase…

    Submitted 1 August, 2023; v1 submitted 27 April, 2023; originally announced April 2023.

    Comments: Work in progress

  27. arXiv:2304.04158  [pdf, other]

    cs.LG

    Does Continual Learning Equally Forget All Parameters?

    Authors: Haiyan Zhao, Tianyi Zhou, Guodong Long, Jing Jiang, Chengqi Zhang

    Abstract: Distribution shift (e.g., task or domain shift) in continual learning (CL) usually results in catastrophic forgetting of neural networks. Although it can be alleviated by repeatedly replaying buffered data, the every-step replay is time-consuming. In this paper, we study which modules in neural networks are more prone to forgetting by investigating their training dynamics during CL. Our proposed m…

    Submitted 9 April, 2023; originally announced April 2023.

  28. arXiv:2302.02173  [pdf, other]

    cs.LG cs.AI

    A Survey on Deep Learning based Time Series Analysis with Frequency Transformation

    Authors: Kun Yi, Qi Zhang, Longbing Cao, Shoujin Wang, Guodong Long, Liang Hu, Hui He, Zhendong Niu, Wei Fan, Hui Xiong

    Abstract: Recently, frequency transformation (FT) has been increasingly incorporated into deep learning models to significantly enhance state-of-the-art accuracy and efficiency in time series analysis. The advantages of FT, such as high efficiency and a global view, have been rapidly explored and exploited in various time series tasks and applications, demonstrating the promising potential of FT as a new de…

    Submitted 15 October, 2023; v1 submitted 4 February, 2023; originally announced February 2023.

  29. arXiv:2301.11560  [pdf, other]

    cs.LG

    Voting from Nearest Tasks: Meta-Vote Pruning of Pre-trained Models for Downstream Tasks

    Authors: Haiyan Zhao, Tianyi Zhou, Guodong Long, Jing Jiang, Chengqi Zhang

    Abstract: As a few large-scale pre-trained models become the major choices of various applications, new challenges arise for model pruning, e.g., can we avoid pruning the same model from scratch for every downstream task? How to reuse the pruning results of previous tasks to accelerate the pruning for a new task? To address these challenges, we create a small model for a new task from the pruned models of s…

    Submitted 27 January, 2023; originally announced January 2023.

  30. arXiv:2301.11367  [pdf, other]

    cs.CV cs.CL

    Style-Aware Contrastive Learning for Multi-Style Image Captioning

    Authors: Yucheng Zhou, Guodong Long

    Abstract: Existing multi-style image captioning methods show promising results in generating a caption with accurate visual content and desired linguistic style. However, existing methods overlook the relationship between linguistic style and visual content. To overcome this drawback, we propose style-aware contrastive learning for multi-style image captioning. First, we present a style-aware visual encoder…

    Submitted 26 January, 2023; originally announced January 2023.

    Comments: Findings of EACL 2023

  31. arXiv:2301.11362  [pdf, other]

    cs.CV cs.CL

    Improving Cross-modal Alignment for Text-Guided Image Inpainting

    Authors: Yucheng Zhou, Guodong Long

    Abstract: Text-guided image inpainting (TGII) aims to restore missing regions based on a given text in a damaged image. Existing methods are based on a strong vision encoder and a cross-modal fusion model to integrate cross-modal features. However, these methods allocate most of the computation to visual encoding, while light computation on modeling modality interactions. Moreover, they take cross-modal fus…

    Submitted 26 January, 2023; originally announced January 2023.

    Comments: EACL 2023

  32. arXiv:2301.11357  [pdf, other]

    cs.CV cs.CL

    Multimodal Event Transformer for Image-guided Story Ending Generation

    Authors: Yucheng Zhou, Guodong Long

    Abstract: Image-guided story ending generation (IgSEG) is to generate a story ending based on given story plots and ending image. Existing methods focus on cross-modal feature fusion but overlook reasoning and mining implicit information from story plots and ending image. To tackle this drawback, we propose a multimodal event transformer, an event-based reasoning framework for IgSEG. Specifically, we constr…

    Submitted 26 January, 2023; originally announced January 2023.

    Comments: EACL 2023

  33. arXiv:2301.09152  [pdf, other]

    cs.LG

    Prompt Federated Learning for Weather Forecasting: Toward Foundation Models on Meteorological Data

    Authors: Shengchao Chen, Guodong Long, Tao Shen, Jing Jiang

    Abstract: To tackle the global climate challenge, it urgently needs to develop a collaborative platform for comprehensive weather forecasting on large-scale meteorological data. Despite urgency, heterogeneous meteorological sensors across countries and regions, inevitably causing multivariate heterogeneity and data exposure, become the main barrier. This paper develops a foundation model across regions capa…

    Submitted 27 May, 2023; v1 submitted 22 January, 2023; originally announced January 2023.

    Comments: Accepted by IJCAI'23 (32nd International Joint Conference on Artificial Intelligence)

  34. arXiv:2301.09109  [pdf, other]

    cs.LG

    Federated Recommendation with Additive Personalization

    Authors: Zhiwei Li, Guodong Long, Tianyi Zhou

    Abstract: Building recommendation systems via federated learning (FL) is a new emerging challenge for advancing next-generation Internet service and privacy protection. Existing approaches train shared item embedding by FL while keeping the user embedding private on client side. However, item embedding identical for all clients cannot capture users' individual differences on perceiving the same item and thu…

    Submitted 7 February, 2024; v1 submitted 22 January, 2023; originally announced January 2023.

    Comments: 9 pages, conference

  35. arXiv:2301.08143  [pdf, other]

    cs.IR cs.AI cs.LG

    Dual Personalization on Federated Recommendation

    Authors: Chunxu Zhang, Guodong Long, Tianyi Zhou, Peng Yan, Zijian Zhang, Chengqi Zhang, Bo Yang

    Abstract: Federated recommendation is a new Internet service architecture that aims to provide privacy-preserving recommendation services in federated settings. Existing solutions are used to combine distributed recommendation algorithms and privacy-preserving mechanisms. Thus it inherently takes the form of heavyweight models at the server and hinders the deployment of on-device intelligent models to end-u…

    Submitted 13 May, 2023; v1 submitted 16 January, 2023; originally announced January 2023.

    Comments: Accepted as a regular paper of IJCAI23

  36. arXiv:2212.10423  [pdf, other]

    cs.IR cs.CL

    Fine-Grained Distillation for Long Document Retrieval

    Authors: Yucheng Zhou, Tao Shen, Xiubo Geng, Chongyang Tao, Guodong Long, Can Xu, Daxin Jiang

    Abstract: Long document retrieval aims to fetch query-relevant documents from a large-scale collection, where knowledge distillation has become de facto to improve a retriever by mimicking a heterogeneous yet powerful cross-encoder. However, in contrast to passages or sentences, retrieval on long documents suffers from the scope hypothesis that a long document may cover multiple topics. This maximizes their…

    Submitted 20 December, 2022; originally announced December 2022.

    Comments: 13 pages, 5 figures, 5 tables

  37. arXiv:2211.13009  [pdf, other]

    cs.LG cs.AI cs.DC

    Federated Learning on Non-IID Graphs via Structural Knowledge Sharing

    Authors: Yue Tan, Yixin Liu, Guodong Long, Jing Jiang, Qinghua Lu, Chengqi Zhang

    Abstract: Graph neural networks (GNNs) have shown their superiority in modeling graph data. Owing to the advantages of federated learning, federated graph learning (FGL) enables clients to train strong GNN models in a distributed manner without sharing their private data. A core challenge in federated systems is the non-IID problem, which also widely exists in real-world graph data. For example, local data…

    Submitted 23 November, 2022; originally announced November 2022.

  38. arXiv:2211.08737  [pdf, other]

    quant-ph cs.AI cs.LG

    Near-Term Quantum Computing Techniques: Variational Quantum Algorithms, Error Mitigation, Circuit Compilation, Benchmarking and Classical Simulation

    Authors: He-Liang Huang, Xiao-Yue Xu, Chu Guo, Guojing Tian, Shi-Jie Wei, Xiaoming Sun, Wan-Su Bao, Gui-Lu Long

    Abstract: Quantum computing is a game-changing technology for global academia, research centers and industries including computational science, mathematics, finance, pharmaceutical, materials science, chemistry and cryptography. Although it has seen a major boost in the last decade, we are still a long way from reaching the maturity of a full-fledged quantum computer. That said, we will be in the Noisy-Inte…

    Submitted 27 December, 2022; v1 submitted 16 November, 2022; originally announced November 2022.

    Comments: Please feel free to email He-Liang Huang with any comments, questions, suggestions or concerns

    Journal ref: Sci. China-Phys. Mech. Astron. 66, 250302 (2023)

  39. arXiv:2211.05987  [pdf, other]

    cs.CL

    CCPrefix: Counterfactual Contrastive Prefix-Tuning for Many-Class Classification

    Authors: Yang Li, Canran Xu, Guodong Long, Tao Shen, Chongyang Tao, Jing Jiang

    Abstract: Recently, prefix-tuning was proposed to efficiently adapt pre-trained language models to a broad spectrum of natural language classification tasks. It leverages soft prefix as task-specific indicators and language verbalizers as categorical-label mentions to narrow the formulation gap from pre-training language models. However, when the label space increases considerably (i.e., many-class classifi…

    Submitted 12 February, 2024; v1 submitted 10 November, 2022; originally announced November 2022.

    Comments: has been accepted by EACL 2024

  40. arXiv:2210.15248  [pdf, ps, other]

    cs.CL

    Unsupervised Knowledge Graph Construction and Event-centric Knowledge Infusion for Scientific NLI

    Authors: Chenglin Wang, Yucheng Zhou, Guodong Long, Xiaodong Wang, Xiaowei Xu

    Abstract: With the advance of natural language inference (NLI), a rising demand for NLI is to handle scientific texts. Existing methods depend on pre-trained models (PTM) which lack domain-specific knowledge. To tackle this drawback, we introduce a scientific knowledge graph to generalize PTM to scientific domain. However, existing knowledge graph construction approaches suffer from some drawbacks, i.e., ex…

    Submitted 27 October, 2022; v1 submitted 27 October, 2022; originally announced October 2022.

  41. arXiv:2210.01855  [pdf, other]

    cs.SE cs.LG

    Multifaceted Hierarchical Report Identification for Non-Functional Bugs in Deep Learning Frameworks

    Authors: Guoming Long, Tao Chen, Georgina Cosma

    Abstract: Non-functional bugs (e.g., performance- or accuracy-related bugs) in Deep Learning (DL) frameworks can lead to some of the most devastating consequences. Reporting those bugs on a repository such as GitHub is a standard route to fix them. Yet, given the growing number of new GitHub reports for DL frameworks, it is intrinsically difficult for developers to distinguish those that reveal non-function…

    Submitted 4 October, 2022; originally announced October 2022.

    Comments: Accepted at APSEC 2022

  42. arXiv:2209.10083  [pdf, other]

    cs.CR cs.AI cs.LG

    Federated Learning from Pre-Trained Models: A Contrastive Learning Approach

    Authors: Yue Tan, Guodong Long, Jie Ma, Lu Liu, Tianyi Zhou, Jing Jiang

    Abstract: Federated Learning (FL) is a machine learning paradigm that allows decentralized clients to learn collaboratively without sharing their private data. However, excessive computation and communication demands pose challenges to current FL frameworks, especially when training large-scale models. To prevent these issues from hindering the deployment of FL systems, we propose a lightweight framework wh…

    Submitted 20 September, 2022; originally announced September 2022.

  43. arXiv:2206.08063  [pdf, other]

    cs.IR cs.CL

    Towards Robust Ranker for Text Retrieval

    Authors: Yucheng Zhou, Tao Shen, Xiubo Geng, Chongyang Tao, Can Xu, Guodong Long, Binxing Jiao, Daxin Jiang

    Abstract: A ranker plays an indispensable role in the de facto 'retrieval & rerank' pipeline, but its training still lags behind -- learning from moderate negatives or/and serving as an auxiliary module for a retriever. In this work, we first identify two major barriers to a robust ranker, i.e., inherent label noises caused by a well-trained retriever and non-ideal negatives sampled for a high-capable ranke…

    Submitted 16 June, 2022; originally announced June 2022.

    Comments: 11 pages of main content, 4 tables, 3 figures

  44. arXiv:2205.11194  [pdf, other]

    cs.IR cs.CL

    UnifieR: A Unified Retriever for Large-Scale Retrieval

    Authors: Tao Shen, Xiubo Geng, Chongyang Tao, Can Xu, Guodong Long, Kai Zhang, Daxin Jiang

    Abstract: Large-scale retrieval is to recall relevant documents from a huge collection given a query. It relies on representation learning to embed documents and queries into a common semantic encoding space. According to the encoding space, recent retrieval methods based on pre-trained language models (PLM) can be coarsely categorized into either dense-vector or lexicon-based paradigms. These two paradigms…

    Submitted 4 June, 2023; v1 submitted 23 May, 2022; originally announced May 2022.

    Comments: To appear at KDD ADS 2023

  45. arXiv:2205.10110  [pdf, other]

    cs.LG

    FedNoiL: A Simple Two-Level Sampling Method for Federated Learning with Noisy Labels

    Authors: Zhuowei Wang, Tianyi Zhou, Guodong Long, Bo Han, Jing Jiang

    Abstract: Federated learning (FL) aims at training a global model on the server side while the training data are collected and located at the local devices. Hence, the labels in practice are usually annotated by clients of varying expertise or criteria and thus contain different amounts of noises. Local training on noisy labels can easily result in overfitting to noisy labels, which is devastating to the gl…

    Submitted 20 May, 2022; originally announced May 2022.

    Comments: 12 pages

  46. Efficient Pipeline Planning for Expedited Distributed DNN Training

    Authors: Ziyue Luo, Xiaodong Yi, Guoping Long, Shiqing Fan, Chuan Wu, Jun Yang, Wei Lin

    Abstract: To train modern large DNN models, pipeline parallelism has recently emerged, which distributes the model across GPUs and enables different devices to process different microbatches in pipeline. Earlier pipeline designs allow multiple versions of model parameters to co-exist (similar to asynchronous training), and cannot ensure the same model convergence and accuracy performance as without pipelini…

    Submitted 22 April, 2022; originally announced April 2022.

    Comments: INFOCOM 2022

  47. arXiv:2204.07893  [pdf, other]

    cs.SE

    On Reporting Performance and Accuracy Bugs for Deep Learning Frameworks: An Exploratory Study from GitHub

    Authors: Guoming Long, Tao Chen

    Abstract: The tremendous success of Deep Learning (DL) has significantly boosted the number of open-sourced DL frameworks hosted on GitHub. Among others, performance and accuracy bugs are critical factors that affect the reputation of these DL frameworks, therefore understanding the practice of discovering and investigating them for DL is important. In this paper, we conduct an exploratory study on the natu…

    Submitted 16 April, 2022; originally announced April 2022.

    Comments: Accepted at EASE 2022

  48. arXiv:2204.05126  [pdf, other]

    cs.IT quant-ph

    General Hamiltonian Representation of ML Detection Relying on the Quantum Approximate Optimization Algorithm

    Authors: Jingjing Cui, Gui Lu Long, Lajos Hanzo

    Abstract: The quantum approximate optimization algorithm (QAOA) conceived for solving combinatorial optimization problems has attracted significant interest since it can be run on the existing noisy intermediate-scale quantum (NISQ) devices. A primary step of using the QAOA is the efficient Hamiltonian construction based on different problem instances. Hence, we solve the maximum likelihood (ML) detection p…

    Submitted 11 April, 2022; originally announced April 2022.

  49. arXiv:2203.02225  [pdf, other]

    cs.CL

    ClarET: Pre-training a Correlation-Aware Context-To-Event Transformer for Event-Centric Generation and Classification

    Authors: Yucheng Zhou, Tao Shen, Xiubo Geng, Guodong Long, Daxin Jiang

    Abstract: Generating new events given context with correlated ones plays a crucial role in many event-centric reasoning tasks. Existing works either limit their scope to specific scenarios or overlook event-level correlations. In this paper, we propose to pre-train a general Correlation-aware context-to-Event Transformer (ClarET) for event-centric reasoning. To achieve this, we propose three novel event-cen…

    Submitted 9 March, 2022; v1 submitted 4 March, 2022; originally announced March 2022.

    Comments: ACL 2022 camera-ready version

  50. arXiv:2203.00829  [pdf, other]

    cs.LG

    Personalized Federated Learning With Graph

    Authors: Fengwen Chen, Guodong Long, Zonghan Wu, Tianyi Zhou, Jing Jiang

    Abstract: Knowledge sharing and model personalization are two key components in the conceptual framework of personalized federated learning (PFL). Existing PFL methods focus on proposing new model personalization mechanisms while simply implementing knowledge sharing by aggregating models from all clients, regardless of their relation graph. This paper aims to enhance the knowledge-sharing process in PFL by…

    Submitted 30 April, 2022; v1 submitted 1 March, 2022; originally announced March 2022.