Showing 1–50 of 95 results for author: Zhan, D

  1. arXiv:2407.03257  [pdf, other]

    cs.LG

    Modern Neighborhood Components Analysis: A Deep Tabular Baseline Two Decades Later

    Authors: Han-Jia Ye, Huai-Hong Yin, De-Chuan Zhan

    Abstract: The growing success of deep learning in various domains has prompted investigations into its application to tabular data, where deep models have shown promising results compared to traditional tree-based methods. In this paper, we revisit Neighborhood Components Analysis (NCA), a classic tabular prediction method introduced in 2004, designed to learn a linear projection that captures semantic simil…

    Submitted 3 July, 2024; originally announced July 2024.

  2. arXiv:2407.00956  [pdf, other]

    cs.LG

    A Closer Look at Deep Learning on Tabular Data

    Authors: Han-Jia Ye, Si-Yang Liu, Hao-Run Cai, Qi-Le Zhou, De-Chuan Zhan

    Abstract: Tabular data is prevalent across various domains in machine learning. Although Deep Neural Network (DNN)-based methods have shown promising performance comparable to tree-based ones, in-depth evaluation of these methods is challenging due to varying performance ranks across diverse datasets. In this paper, we propose a comprehensive benchmark comprising 300 tabular datasets, covering a wide range…

    Submitted 1 July, 2024; originally announced July 2024.

  3. arXiv:2406.09486  [pdf, other]

    cs.CV cs.AI

    SeMOPO: Learning High-quality Model and Policy from Low-quality Offline Visual Datasets

    Authors: Shenghua Wan, Ziyuan Chen, Le Gan, Shuai Feng, De-Chuan Zhan

    Abstract: Model-based offline reinforcement learning (RL) is a promising approach that leverages existing data effectively in many real-world applications, especially those involving high-dimensional inputs like images and videos. To alleviate the distribution shift issue in offline RL, existing model-based methods heavily rely on the uncertainty of learned dynamics. However, the model uncertainty estimatio…

    Submitted 13 June, 2024; originally announced June 2024.

    Comments: 23 pages, 10 figures

  4. arXiv:2406.08477  [pdf, other]

    cs.IR

    Improving LLMs for Recommendation with Out-Of-Vocabulary Tokens

    Authors: Ting-Ji Huang, Jia-Qi Yang, Chunxu Shen, Kai-Qi Liu, De-Chuan Zhan, Han-Jia Ye

    Abstract: Characterizing users and items through vector representations is crucial for various tasks in recommender systems. Recent approaches attempt to apply Large Language Models (LLMs) in recommendation through a question and answer format, where real users and items (e.g., Item No.2024) are represented with in-vocabulary tokens (e.g., "item", "20", "24"). However, since LLMs are typically pretrained on…

    Submitted 12 June, 2024; originally announced June 2024.

  5. arXiv:2406.03496  [pdf, other]

    cs.CL cs.AI cs.LG

    Wings: Learning Multimodal LLMs without Text-only Forgetting

    Authors: Yi-Kai Zhang, Shiyin Lu, Yang Li, Yanqing Ma, Qing-Guo Chen, Zhao Xu, Weihua Luo, Kaifu Zhang, De-Chuan Zhan, Han-Jia Ye

    Abstract: Multimodal large language models (MLLMs), initiated with a trained LLM, first align images with text and then fine-tune on multimodal mixed inputs. However, the MLLM catastrophically forgets the text-only instructions, which do not include images and can be addressed within the initial LLM. In this paper, we present Wings, a novel MLLM that excels in both text-only dialogues and multimodal compreh…

    Submitted 5 June, 2024; originally announced June 2024.

  6. arXiv:2406.02539  [pdf, other]

    cs.CV cs.AI cs.CL cs.LG

    Parrot: Multilingual Visual Instruction Tuning

    Authors: Hai-Long Sun, Da-Wei Zhou, Yang Li, Shiyin Lu, Chao Yi, Qing-Guo Chen, Zhao Xu, Weihua Luo, Kaifu Zhang, De-Chuan Zhan, Han-Jia Ye

    Abstract: The rapid development of Multimodal Large Language Models (MLLMs) like GPT-4V has marked a significant step towards artificial general intelligence. Existing methods mainly focus on aligning vision encoders with LLMs through supervised fine-tuning (SFT) to endow LLMs with multimodal abilities, making MLLMs' inherent ability to react to multiple languages progressively deteriorate as the training p…

    Submitted 4 June, 2024; originally announced June 2024.

  7. arXiv:2405.16395  [pdf, other]

    cs.LG

    Daily Physical Activity Monitoring -- Adaptive Learning from Multi-source Motion Sensor Data

    Authors: Haoting Zhang, Donglin Zhan, Yunduan Lin, Jinghai He, Qing Zhu, Zuo-Jun Max Shen, Zeyu Zheng

    Abstract: In healthcare applications, there is a growing need to develop machine learning models that use data from a single source, such as that from a wrist wearable device, to monitor physical activities, assess health risks, and provide immediate health recommendations or interventions. However, the limitation of using single-source data often compromises the model's accuracy, as it fails to capture the…

    Submitted 25 May, 2024; originally announced May 2024.

  8. arXiv:2405.13078  [pdf, other]

    cs.LG

    Exploring Dark Knowledge under Various Teacher Capacities and Addressing Capacity Mismatch

    Authors: Xin-Chun Li, Wen-Shu Fan, Bowen Tao, Le Gan, De-Chuan Zhan

    Abstract: Knowledge Distillation (KD) could transfer the "dark knowledge" of a well-performing yet large neural network to a weaker but lightweight one. From the view of output logits and softened probabilities, this paper goes deeper into the dark knowledge provided by teachers with different capacities. Two fundamental observations are: (1) a larger teacher tends to produce probability vectors that are le…

    Submitted 21 May, 2024; originally announced May 2024.

  9. arXiv:2405.12493  [pdf, other]

    cs.LG

    Visualizing, Rethinking, and Mining the Loss Landscape of Deep Neural Networks

    Authors: Xin-Chun Li, Lan Li, De-Chuan Zhan

    Abstract: The loss landscape of deep neural networks (DNNs) is commonly considered complex and wildly fluctuating. However, an interesting observation is that the loss surfaces plotted along Gaussian noise directions are almost v-basin ones with the perturbed model lying on the basin. This motivates us to rethink whether the 1D or 2D subspace could cover more complex local geometry structures, and how to min…

    Submitted 21 May, 2024; originally announced May 2024.

  10. arXiv:2405.12489  [pdf, other]

    cs.LG cs.AI

    Exploring and Exploiting the Asymmetric Valley of Deep Neural Networks

    Authors: Xin-Chun Li, Jin-Lin Tang, Bo Zhang, Lan Li, De-Chuan Zhan

    Abstract: Exploring the loss landscape offers insights into the inherent principles of deep neural networks (DNNs). Recent work suggests an additional asymmetry of the valley beyond the flat and sharp ones, yet without thoroughly examining its causes or implications. Our study methodically explores the factors affecting the symmetry of DNN valleys, encompassing (1) the dataset, network architecture, initial…

    Submitted 28 June, 2024; v1 submitted 21 May, 2024; originally announced May 2024.

  11. arXiv:2405.07083  [pdf, other]

    cs.LG math.OC

    Data-Efficient and Robust Task Selection for Meta-Learning

    Authors: Donglin Zhan, James Anderson

    Abstract: Meta-learning methods typically learn tasks under the assumption that all tasks are equally important. However, this assumption is often not valid. In real-world applications, tasks can vary both in their importance during different training stages and in whether they contain noisy labeled data or not, making a uniform approach suboptimal. To address these issues, we propose the Data-Efficient and…

    Submitted 11 May, 2024; originally announced May 2024.

    Comments: Accepted by CVPR 2024 Workshop

  12. arXiv:2405.06979  [pdf, other]

    cs.LG

    Robust Semi-supervised Learning by Wisely Leveraging Open-set Data

    Authors: Yang Yang, Nan Jiang, Yi Xu, De-Chuan Zhan

    Abstract: Open-set Semi-supervised Learning (OSSL) considers a realistic setting in which unlabeled data may come from classes unseen in the labeled set, i.e., out-of-distribution (OOD) data, which could cause performance degradation in conventional SSL models. To handle this issue, in addition to the traditional in-distribution (ID) classifier, some existing OSSL approaches employ an extra OOD detection module to avoi…

    Submitted 20 May, 2024; v1 submitted 11 May, 2024; originally announced May 2024.

  13. arXiv:2405.05768  [pdf, other]

    cs.CV

    FastScene: Text-Driven Fast 3D Indoor Scene Generation via Panoramic Gaussian Splatting

    Authors: Yikun Ma, Dandan Zhan, Zhi Jin

    Abstract: Text-driven 3D indoor scene generation holds broad applications, ranging from gaming and smart homes to AR/VR applications. Fast and high-fidelity scene generation is paramount for ensuring user-friendly experiences. However, existing methods are characterized by lengthy generation processes or necessitate the intricate manual specification of motion parameters, which introduces inconvenience for…

    Submitted 9 May, 2024; originally announced May 2024.

    Comments: Accepted by IJCAI-2024

  14. arXiv:2404.17753  [pdf, other]

    cs.CV cs.AI

    Leveraging Cross-Modal Neighbor Representation for Improved CLIP Classification

    Authors: Chao Yi, Lu Ren, De-Chuan Zhan, Han-Jia Ye

    Abstract: CLIP showcases exceptional cross-modal matching capabilities due to its training on image-text contrastive learning tasks. However, without specific optimization for unimodal scenarios, its performance in single-modality feature extraction might be suboptimal. Despite this, some studies have directly used CLIP's image encoder for tasks like few-shot classification, introducing a misalignment betwe…

    Submitted 26 April, 2024; originally announced April 2024.

  15. arXiv:2404.17511  [pdf, other]

    cs.LG cs.CY cs.SI

    Bridging the Fairness Divide: Achieving Group and Individual Fairness in Graph Neural Networks

    Authors: Duna Zhan, Dongliang Guo, Pengsheng Ji, Sheng Li

    Abstract: Graph neural networks (GNNs) have emerged as a powerful tool for analyzing and learning from complex data structured as graphs, demonstrating remarkable effectiveness in various applications, such as social network analysis, recommendation systems, and drug discovery. However, despite their impressive performance, the fairness problem has increasingly gained attention as a crucial aspect to consid…

    Submitted 26 April, 2024; originally announced April 2024.

    Comments: 16 pages, 3 figures

  16. arXiv:2404.14801  [pdf, other]

    cs.CV

    DesignProbe: A Graphic Design Benchmark for Multimodal Large Language Models

    Authors: Jieru Lin, Danqing Huang, Tiejun Zhao, Dechen Zhan, Chin-Yew Lin

    Abstract: A well-executed graphic design typically achieves harmony at two levels, from the fine-grained design elements (color, font and layout) to the overall design. This complexity makes the comprehension of graphic design challenging, for it needs the capability to both recognize the design elements and understand the design. With the rapid development of Multimodal Large Language Models (MLLMs), we es…

    Submitted 23 April, 2024; originally announced April 2024.

    Comments: work in progress

  17. arXiv:2404.14197  [pdf, other]

    cs.LG

    SOFTS: Efficient Multivariate Time Series Forecasting with Series-Core Fusion

    Authors: Lu Han, Xu-Yang Chen, Han-Jia Ye, De-Chuan Zhan

    Abstract: Multivariate time series forecasting plays a crucial role in various fields such as finance, traffic management, energy, and healthcare. Recent studies have highlighted the advantages of channel independence to resist distribution drift but neglect channel correlations, limiting further enhancements. Several methods utilize mechanisms like attention or mixer to address this by capturing channel co…

    Submitted 12 June, 2024; v1 submitted 22 April, 2024; originally announced April 2024.

  18. arXiv:2404.12407  [pdf, other]

    cs.CV cs.LG

    TV100: A TV Series Dataset that Pre-Trained CLIP Has Not Seen

    Authors: Da-Wei Zhou, Zhi-Hong Qi, Han-Jia Ye, De-Chuan Zhan

    Abstract: The era of pre-trained models has ushered in a wealth of new insights for the machine learning community. Among the myriad of questions that arise, one of paramount importance is: 'Do pre-trained models possess comprehensive knowledge?' This paper seeks to address this crucial inquiry. In line with our objective, we have made publicly available a novel dataset comprised of images from TV series re…

    Submitted 16 April, 2024; originally announced April 2024.

    Comments: Project page: https://tv-100.github.io/

  19. arXiv:2404.11917  [pdf, other]

    cs.LG cs.AI stat.ML

    Expected Coordinate Improvement for High-Dimensional Bayesian Optimization

    Authors: Dawei Zhan

    Abstract: The Bayesian optimization (BO) algorithm is very popular for solving low-dimensional expensive optimization problems. Extending Bayesian optimization to high dimensions is a meaningful but challenging task. One of the major challenges is that it is difficult to find good infill solutions as the acquisition functions are also high-dimensional. In this work, we propose the expected coordinate improvement…

    Submitted 18 April, 2024; originally announced April 2024.

  20. arXiv:2404.09232  [pdf, other]

    cs.LG cs.DC

    MAP: Model Aggregation and Personalization in Federated Learning with Incomplete Classes

    Authors: Xin-Chun Li, Shaoming Song, Yinchuan Li, Bingshuai Li, Yunfeng Shao, Yang Yang, De-Chuan Zhan

    Abstract: In some real-world applications, data samples are usually distributed on local devices, where federated learning (FL) techniques are proposed to coordinate decentralized clients without directly sharing users' private data. FL commonly follows the parameter server architecture and contains multiple personalization and aggregation procedures. The natural data heterogeneity across clients, i.e., Non…

    Submitted 14 April, 2024; originally announced April 2024.

    Comments: Accepted by TKDE (11-Apr-2024)

  21. arXiv:2404.03988  [pdf, other]

    cs.LG cs.SI

    Model Selection with Model Zoo via Graph Learning

    Authors: Ziyu Li, Hilco van der Wilk, Danning Zhan, Megha Khosla, Alessandro Bozzon, Rihan Hai

    Abstract: Pre-trained deep learning (DL) models are increasingly accessible in public repositories, i.e., model zoos. Given a new prediction task, finding the best model to fine-tune can be computationally intensive and costly, especially when the number of pre-trained models is large. Selecting the right pre-trained models is crucial, yet complicated by the diversity of models from various model families (…

    Submitted 5 April, 2024; originally announced April 2024.

    Comments: Accepted at 40th IEEE International Conference on Data Engineering (ICDE 2024)

  22. arXiv:2404.03386  [pdf, other]

    cs.RO cs.AI cs.LG

    SENSOR: Imitate Third-Person Expert's Behaviors via Active Sensoring

    Authors: Kaichen Huang, Minghao Shao, Shenghua Wan, Hai-Hang Sun, Shuai Feng, Le Gan, De-Chuan Zhan

    Abstract: In many real-world visual Imitation Learning (IL) scenarios, there is a misalignment between the agent's and the expert's perspectives, which might lead to the failure of imitation. Previous methods have generally solved this problem by domain alignment, which incurs extra computation and storage costs, and these methods fail to handle the hard cases where the viewpoint gap is too large.…

    Submitted 4 April, 2024; originally announced April 2024.

  23. arXiv:2404.03382  [pdf, other]

    cs.LG cs.AI

    DIDA: Denoised Imitation Learning based on Domain Adaptation

    Authors: Kaichen Huang, Hai-Hang Sun, Shenghua Wan, Minghao Shao, Shuai Feng, Le Gan, De-Chuan Zhan

    Abstract: Imitating skills from low-quality datasets, such as sub-optimal demonstrations and observations with distractors, is common in real-world applications. In this work, we focus on the problem of Learning from Noisy Demonstrations (LND), where the imitator is required to learn from data with noise that often occurs during the processes of data collection or transmission. Previous IL methods improve t…

    Submitted 4 April, 2024; originally announced April 2024.

  24. arXiv:2403.13797  [pdf, other]

    cs.LG cs.CV

    Bridge the Modality and Capacity Gaps in Vision-Language Model Selection

    Authors: Chao Yi, De-Chuan Zhan, Han-Jia Ye

    Abstract: Vision Language Models (VLMs) excel in zero-shot image classification by pairing images with textual category names. The expanding variety of Pre-Trained VLMs enhances the likelihood of identifying a suitable VLM for specific tasks. Thus, a promising zero-shot image classification strategy is selecting the most appropriate Pre-Trained VLM from the VLM Zoo, relying solely on the text data of the ta…

    Submitted 20 March, 2024; originally announced March 2024.

  25. arXiv:2403.12030  [pdf, other]

    cs.CV cs.LG

    Expandable Subspace Ensemble for Pre-Trained Model-Based Class-Incremental Learning

    Authors: Da-Wei Zhou, Hai-Long Sun, Han-Jia Ye, De-Chuan Zhan

    Abstract: Class-Incremental Learning (CIL) requires a learning system to continually learn new classes without forgetting. Despite the strong performance of Pre-Trained Models (PTMs) in CIL, a critical issue persists: learning new classes often results in the overwriting of old ones. Excessive modification of the network causes forgetting, while minimal adjustments lead to an inadequate fit for new classes.…

    Submitted 18 March, 2024; originally announced March 2024.

    Comments: Accepted to CVPR 2024. Code is available at: https://github.com/sun-hailong/CVPR24-Ease

  26. arXiv:2403.09976  [pdf, other]

    cs.LG cs.CV

    AD3: Implicit Action is the Key for World Models to Distinguish the Diverse Visual Distractors

    Authors: Yucen Wang, Shenghua Wan, Le Gan, Shuai Feng, De-Chuan Zhan

    Abstract: Model-based methods have significantly contributed to distinguishing task-irrelevant distractors for visual control. However, prior research has primarily focused on heterogeneous distractors like noisy background videos, leaving homogeneous distractors that closely resemble controllable agents largely unexplored, which poses significant challenges to existing methods. To tackle this problem, we p…

    Submitted 5 June, 2024; v1 submitted 14 March, 2024; originally announced March 2024.

  27. arXiv:2401.16386  [pdf, other]

    cs.LG cs.CV

    Continual Learning with Pre-Trained Models: A Survey

    Authors: Da-Wei Zhou, Hai-Long Sun, Jingyi Ning, Han-Jia Ye, De-Chuan Zhan

    Abstract: Nowadays, real-world applications often face streaming data, which requires the learning system to absorb new knowledge as data evolves. Continual Learning (CL) aims to achieve this goal and meanwhile overcome the catastrophic forgetting of former knowledge when learning new ones. Typical CL methods build the model from scratch to grow with incoming data. However, the advent of the pre-trained mod…

    Submitted 23 April, 2024; v1 submitted 29 January, 2024; originally announced January 2024.

    Comments: Accepted to IJCAI 2024. Code is available at: https://github.com/sun-hailong/LAMDA-PILOT

  28. arXiv:2401.16375  [pdf, other]

    cs.CV

    Spot the Error: Non-autoregressive Graphic Layout Generation with Wireframe Locator

    Authors: Jieru Lin, Danqing Huang, Tiejun Zhao, Dechen Zhan, Chin-Yew Lin

    Abstract: Layout generation is a critical step in graphic design to achieve meaningful compositions of elements. Most previous works view it as a sequence generation problem by concatenating element attribute tokens (i.e., category, size, position). So far the autoregressive approach (AR) has achieved promising results, but is still limited in global context modeling and suffers from error propagation since…

    Submitted 29 January, 2024; originally announced January 2024.

    Comments: accepted by AAAI24

  29. arXiv:2401.14534  [pdf, other]

    math.OC cs.LG

    Meta-Learning Linear Quadratic Regulators: A Policy Gradient MAML Approach for Model-free LQR

    Authors: Leonardo F. Toso, Donglin Zhan, James Anderson, Han Wang

    Abstract: We investigate the problem of learning linear quadratic regulators (LQR) in a multi-task, heterogeneous, and model-free setting. We characterize the stability and personalization guarantees of a policy gradient-based (PG) model-agnostic meta-learning (MAML) (Finn et al., 2017) approach for the LQR problem under different task-heterogeneity settings. We show that our MAML-LQR algorithm produces a s…

    Submitted 31 May, 2024; v1 submitted 25 January, 2024; originally announced January 2024.

  30. arXiv:2312.16604  [pdf, other]

    cs.LG

    Twice Class Bias Correction for Imbalanced Semi-Supervised Learning

    Authors: Lan Li, Bowen Tao, Lu Han, De-chuan Zhan, Han-jia Ye

    Abstract: Differing from traditional semi-supervised learning, class-imbalanced semi-supervised learning presents two distinct challenges: (1) The imbalanced distribution of training samples leads to model bias towards certain classes, and (2) the distribution of unlabeled samples is unknown and potentially distinct from that of labeled samples, which further contributes to class bias in the pseudo-labels d…

    Submitted 27 December, 2023; originally announced December 2023.

    Comments: Accepted by AAAI24

  31. arXiv:2312.09598  [pdf, other]

    cs.CV

    CLAF: Contrastive Learning with Augmented Features for Imbalanced Semi-Supervised Learning

    Authors: Bowen Tao, Lan Li, Xin-Chun Li, De-Chuan Zhan

    Abstract: Due to the advantages of leveraging unlabeled data and learning meaningful representations, semi-supervised learning and contrastive learning have been progressively combined to achieve better performances in popular applications with few labeled data and abundant unlabeled data. One common approach is to assign pseudo-labels to unlabeled samples and select positive and negative samples from pseu…

    Submitted 24 December, 2023; v1 submitted 15 December, 2023; originally announced December 2023.

    Comments: Accepted to ICASSP'2024

  32. arXiv:2312.05229  [pdf, other]

    cs.CV cs.LG

    Few-Shot Class-Incremental Learning via Training-Free Prototype Calibration

    Authors: Qi-Wei Wang, Da-Wei Zhou, Yi-Kai Zhang, De-Chuan Zhan, Han-Jia Ye

    Abstract: Real-world scenarios are usually accompanied by continuously appearing classes with scarce labeled samples, which require the machine learning model to incrementally learn new classes and maintain the knowledge of base classes. In this Few-Shot Class-Incremental Learning (FSCIL) scenario, existing methods either introduce extra learnable components or rely on a frozen feature extractor to mitigate…

    Submitted 8 December, 2023; originally announced December 2023.

    Comments: Accepted to NeurIPS 2023. Code is available at: https://github.com/wangkiw/TEEN

  33. arXiv:2311.18341  [pdf, other]

    cs.LG physics.ao-ph

    Learning Robust Precipitation Forecaster by Temporal Frame Interpolation

    Authors: Lu Han, Xu-Yang Chen, Han-Jia Ye, De-Chuan Zhan

    Abstract: Recent advances in deep learning have significantly elevated weather prediction models. However, these models often falter in real-world scenarios due to their sensitivity to spatial-temporal shifts. This issue is particularly acute in weather forecasting, where models are prone to overfit to local and temporal variations, especially when tasked with fine-grained predictions. In this paper, we add…

    Submitted 1 December, 2023; v1 submitted 30 November, 2023; originally announced November 2023.

    Comments: The previous version has text overlap with last year's paper arXiv:2212.02968 since the competition's datasets do not change. We restate the dataset description to avoid it. We also polish the overall writing.

  34. arXiv:2311.00055  [pdf, other]

    cs.LG

    Training-Free Generalization on Heterogeneous Tabular Data via Meta-Representation

    Authors: Han-Jia Ye, Qi-Le Zhou, De-Chuan Zhan

    Abstract: Tabular data is prevalent across various machine learning domains. Yet, the inherent heterogeneities in attribute and class spaces across different tabular datasets hinder the effective sharing of knowledge, limiting a tabular model to benefit from other datasets. In this paper, we propose Tabular data Pre-Training via Meta-representation (TabPTM), which allows one tabular model pre-training on a…

    Submitted 31 October, 2023; originally announced November 2023.

  35. arXiv:2310.15149  [pdf, other]

    cs.LG

    Unlocking the Transferability of Tokens in Deep Models for Tabular Data

    Authors: Qi-Le Zhou, Han-Jia Ye, Le-Ye Wang, De-Chuan Zhan

    Abstract: Fine-tuning a pre-trained deep neural network has become a successful paradigm in various machine learning tasks. However, such a paradigm becomes particularly challenging with tabular data when there are discrepancies between the feature sets of pre-trained models and the target tasks. In this paper, we propose TabToken, a method that aims to enhance the quality of feature tokens (i.e., embeddings o…

    Submitted 23 October, 2023; originally announced October 2023.

  36. arXiv:2310.05495  [pdf, other]

    cs.LG stat.ML

    On the Convergence of Federated Averaging under Partial Participation for Over-parameterized Neural Networks

    Authors: Xin Liu, Wei li, Dazhi Zhan, Yu Pan, Xin Ma, Yu Ding, Zhisong Pan

    Abstract: Federated learning (FL) is a widely employed distributed paradigm for collaboratively training machine learning models from multiple clients without sharing local data. In practice, FL encounters challenges in dealing with partial client participation due to the limited bandwidth, intermittent connection and strict synchronized delay. Simultaneously, there exist few theoretical convergence guarant…

    Submitted 2 February, 2024; v1 submitted 9 October, 2023; originally announced October 2023.

  37. arXiv:2309.14891  [pdf, other]

    cs.IR

    RE-SORT: Removing Spurious Correlation in Multilevel Interaction for CTR Prediction

    Authors: Song-Li Wu, Liang Du, Jia-Qi Yang, Yu-Ai Wang, De-Chuan Zhan, Shuang Zhao, Zi-Xun Sun

    Abstract: Click-through rate (CTR) prediction is a critical task in recommendation systems, serving as the ultimate filtering step to sort items for a user. Most recent cutting-edge methods primarily focus on investigating complex implicit and explicit feature interactions; however, these methods neglect the spurious correlation issue caused by confounding factors, thereby diminishing the model's generaliza…

    Submitted 10 May, 2024; v1 submitted 26 September, 2023; originally announced September 2023.

    Comments: 15 pages, 7 figures

  38. arXiv:2309.07117  [pdf, other]

    cs.LG cs.CV

    PILOT: A Pre-Trained Model-Based Continual Learning Toolbox

    Authors: Hai-Long Sun, Da-Wei Zhou, Han-Jia Ye, De-Chuan Zhan

    Abstract: While traditional machine learning can effectively tackle a wide range of problems, it primarily operates within a closed-world setting, which presents limitations when dealing with streaming data. As a solution, incremental learning emerges to address real-world scenarios involving new data's arrival. Recently, pre-training has made significant advancements and garnered the attention of numerous…

    Submitted 13 September, 2023; originally announced September 2023.

    Comments: Code is available at https://github.com/sun-hailong/LAMDA-PILOT

  39. arXiv:2308.09158  [pdf, other]

    cs.LG cs.CL cs.CV

    ZhiJian: A Unifying and Rapidly Deployable Toolbox for Pre-trained Model Reuse

    Authors: Yi-Kai Zhang, Lu Ren, Chao Yi, Qi-Wei Wang, De-Chuan Zhan, Han-Jia Ye

    Abstract: The rapid expansion of foundation pre-trained models and their fine-tuned counterparts has significantly contributed to the advancement of machine learning. Leveraging pre-trained models to extract knowledge and expedite learning in real-world tasks, known as "Model Reuse", has become crucial in various applications. Previous research focuses on reusing models within a certain aspect, including re…

    Submitted 17 August, 2023; originally announced August 2023.

  40. arXiv:2307.07509  [pdf, other]

    cs.IR

    Streaming CTR Prediction: Rethinking Recommendation Task for Real-World Streaming Data

    Authors: Qi-Wei Wang, Hongyu Lu, Yu Chen, Da-Wei Zhou, De-Chuan Zhan, Ming Chen, Han-Jia Ye

    Abstract: The Click-Through Rate (CTR) prediction task is critical in industrial recommender systems, where models are usually deployed on dynamic streaming data in practical applications. Such streaming data in real-world recommender systems face many challenges, such as distribution shift, temporal non-stationarity, and systematic biases, which bring difficulties to the training and utilizing of recommend…

    Submitted 14 July, 2023; originally announced July 2023.

  41. arXiv:2306.10695  [pdf, other]

    cs.LG cs.AI cs.CV

    SeMAIL: Eliminating Distractors in Visual Imitation via Separated Models

    Authors: Shenghua Wan, Yucen Wang, Minghao Shao, Ruying Chen, De-Chuan Zhan

    Abstract: Model-based imitation learning (MBIL) is a popular reinforcement learning method that improves sample efficiency on high-dimension input sources, such as images and videos. Following the convention of MBIL research, existing algorithms are easily deceived by task-irrelevant information, especially moving distractors in videos. To tackle this problem, we propose a new algorithm - named Separated M…

    Submitted 19 June, 2023; originally announced June 2023.

    Comments: 18 pages, 7 figures

  42. arXiv:2306.05001  [pdf, other]

    cs.CV cs.LG

    COURIER: Contrastive User Intention Reconstruction for Large-Scale Visual Recommendation

    Authors: Jia-Qi Yang, Chenglei Dai, Dan OU, Dongshuai Li, Ju Huang, De-Chuan Zhan, Xiaoyi Zeng, Yang Yang

    Abstract: With the advancement of multimedia internet, the impact of visual characteristics on the decision of users to click or not within the online retail industry is increasingly significant. Thus, incorporating visual features is a promising direction for further performance improvements in click-through rate (CTR). However, experiments on our production system revealed that simply injecting the image…

    Submitted 6 June, 2024; v1 submitted 8 June, 2023; originally announced June 2023.

  43. arXiv:2306.04985  [pdf, other]

    cs.LG

    Beyond Probability Partitions: Calibrating Neural Networks with Semantic Aware Grouping

    Authors: Jia-Qi Yang, De-Chuan Zhan, Le Gan

    Abstract: Research has shown that deep networks tend to be overly optimistic about their predictions, leading to an underestimation of prediction errors. Due to the limited nature of data, existing studies have proposed various methods based on model prediction probabilities to bin the data and evaluate calibration error. We propose a more generalized definition of calibration error called Partitioned Calib…

    Submitted 21 October, 2023; v1 submitted 8 June, 2023; originally announced June 2023.

    Comments: NeurIPS'23. https://github.com/ThyrixYang/group_calibration

  44. arXiv:2306.03900  [pdf, other]

    cs.LG

    Model Spider: Learning to Rank Pre-Trained Models Efficiently

    Authors: Yi-Kai Zhang, Ting-Ji Huang, Yao-Xiang Ding, De-Chuan Zhan, Han-Jia Ye

    Abstract: Figuring out which Pre-Trained Model (PTM) from a model zoo fits the target task is essential to take advantage of plentiful model resources. With the availability of numerous heterogeneous PTMs from diverse fields, efficiently selecting the most suitable PTM is challenging due to the time-consuming costs of carrying out forward or backward passes over all PTMs. In this paper, we propose Model Spi…

    Submitted 6 June, 2023; originally announced June 2023.

  45. arXiv:2305.19270  [pdf, other]

    cs.CV cs.LG

    Learning without Forgetting for Vision-Language Models

    Authors: Da-Wei Zhou, Yuanhan Zhang, Jingyi Ning, Han-Jia Ye, De-Chuan Zhan, Ziwei Liu

    Abstract: Class-Incremental Learning (CIL) or continual learning is a desired capability in the real world, which requires a learning system to adapt to new tasks without forgetting former ones. While traditional CIL methods focus on visual information to grasp core features, recent advances in Vision-Language Models (VLM) have shown promising capabilities in learning generalizable representations with the…

    Submitted 30 May, 2023; originally announced May 2023.

  46. arXiv:2305.18978  [pdf, other]

    cs.AI cs.LG physics.optics

    IDToolkit: A Toolkit for Benchmarking and Developing Inverse Design Algorithms in Nanophotonics

    Authors: Jia-Qi Yang, Yucheng Xu, Jia-Lei Shen, Kebin Fan, De-Chuan Zhan, Yang Yang

    Abstract: Aiding humans with scientific designs is one of the most exciting of artificial intelligence (AI) and machine learning (ML), due to their potential for the discovery of new drugs, design of new materials and chemical compounds, etc. However, scientific design typically requires complex domain knowledge that is not familiar to AI researchers. Further, scientific studies involve professional skills…

    Submitted 31 May, 2023; v1 submitted 30 May, 2023; originally announced May 2023.

    Comments: KDD'23

  47. arXiv:2305.04201  [pdf, other]

    cs.LG

    MrTF: Model Refinery for Transductive Federated Learning

    Authors: Xin-Chun Li, Yang Yang, De-Chuan Zhan

    Abstract: We consider a real-world scenario in which a newly-established pilot project needs to make inferences for newly-collected data with the help of other parties under privacy protection policies. Current federated learning (FL) paradigms are devoted to solving the data heterogeneity problem without considering the to-be-inferred data. We propose a novel learning paradigm named transductive federated…

    Submitted 7 May, 2023; originally announced May 2023.

    Comments: Minor Revision to DMKD Journal

  48. arXiv:2304.06971  [pdf, other]

    cs.LG cs.CV

    Preserving Locality in Vision Transformers for Class Incremental Learning

    Authors: Bowen Zheng, Da-Wei Zhou, Han-Jia Ye, De-Chuan Zhan

    Abstract: Learning new classes without forgetting is crucial for real-world applications for a classification model. Vision Transformers (ViT) recently achieve remarkable performance in Class Incremental Learning (CIL). Previous works mainly focus on block design and model expansion for ViTs. However, in this paper, we find that when the ViT is incrementally trained, the attention layers gradually lose conc…

    Submitted 14 April, 2023; originally announced April 2023.

  49. arXiv:2304.05206  [pdf, other]

    cs.LG

    The Capacity and Robustness Trade-off: Revisiting the Channel Independent Strategy for Multivariate Time Series Forecasting

    Authors: Lu Han, Han-Jia Ye, De-Chuan Zhan

    Abstract: Multivariate time series data comprises various channels of variables. The multivariate forecasting models need to capture the relationship between the channels to accurately predict future values. However, recently, there has been an emergence of methods that employ the Channel Independent (CI) strategy. These methods view multivariate time series data as separate univariate time series and disre…

    Submitted 11 April, 2023; originally announced April 2023.

    Comments: under review

  50. arXiv:2303.07338  [pdf, other]

    cs.LG cs.CV

    Revisiting Class-Incremental Learning with Pre-Trained Models: Generalizability and Adaptivity are All You Need

    Authors: Da-Wei Zhou, Han-Jia Ye, De-Chuan Zhan, Ziwei Liu

    Abstract: Class-incremental learning (CIL) aims to adapt to emerging new classes without forgetting old ones. Traditional CIL models are trained from scratch to continually acquire knowledge as data evolves. Recently, pre-training has achieved substantial progress, making vast pre-trained models (PTMs) accessible for CIL. Contrary to traditional methods, PTMs possess generalizable embeddings, which can be e…

    Submitted 13 March, 2023; originally announced March 2023.

    Comments: Code is available at: https://github.com/zhoudw-zdw/RevisitingCIL