Showing 1–50 of 74 results for author: Kwok, T

  1. arXiv:2407.03641  [pdf, other]

    cs.LG

    Scalable Learned Model Soup on a Single GPU: An Efficient Subspace Training Strategy

    Authors: Tao Li, Weisen Jiang, Fanghui Liu, Xiaolin Huang, James T. Kwok

    Abstract: Pre-training followed by fine-tuning is widely adopted among practitioners. The performance can be improved by "model soups" (Wortsman et al., 2022) via exploring various hyperparameter configurations. The Learned-Soup, a variant of model soups, significantly improves the performance but suffers from substantial memory and time costs due to the requirements of (i) having to load all fine-tuned mod…

    Submitted 4 July, 2024; originally announced July 2024.

    Comments: ECCV 2024

  2. arXiv:2406.13183  [pdf, other]

    cs.LG cs.CR cs.DC

    Communication-Efficient and Privacy-Preserving Decentralized Meta-Learning

    Authors: Hansi Yang, James T. Kwok

    Abstract: Distributed learning, which does not require gathering training data in a central location, has become increasingly important in the big-data era. In particular, random-walk-based decentralized algorithms are flexible in that they do not need a central server trusted by all clients and do not require all clients to be active in all iterations. However, existing distributed learning algorithms assu…

    Submitted 18 June, 2024; originally announced June 2024.

  3. arXiv:2406.01417  [pdf, other]

    cs.LG cs.CV

    Mixup Augmentation with Multiple Interpolations

    Authors: Lifeng Shen, Jincheng Yu, Hansi Yang, James T. Kwok

    Abstract: Mixup and its variants form a popular class of data augmentation techniques. Using a random sample pair, it generates a new sample by linear interpolation of the inputs and labels. However, generating only one single interpolation may limit its augmentation ability. In this paper, we propose a simple yet effective extension called multi-mix, which generates multiple interpolations from a sample pai…

    Submitted 3 June, 2024; originally announced June 2024.

  4. arXiv:2405.21040  [pdf, other]

    cs.CL cs.AI

    Direct Alignment of Language Models via Quality-Aware Self-Refinement

    Authors: Runsheng Yu, Yong Wang, Xiaoqi Jiao, Youzhi Zhang, James T. Kwok

    Abstract: Reinforcement Learning from Human Feedback (RLHF) has been commonly used to align the behaviors of Large Language Models (LLMs) with human preferences. Recently, a popular alternative is Direct Preference Optimization (DPO), which replaces an LLM-based reward model with the policy itself, thus obviating the need for extra memory and training time to learn the reward model. However, DPO does not consid…

    Submitted 31 May, 2024; originally announced May 2024.

  5. arXiv:2405.00557  [pdf, other]

    cs.CL cs.AI

    Mixture of insighTful Experts (MoTE): The Synergy of Thought Chains and Expert Mixtures in Self-Alignment

    Authors: Zhili Liu, Yunhao Gou, Kai Chen, Lanqing Hong, Jiahui Gao, Fei Mi, Yu Zhang, Zhenguo Li, Xin Jiang, Qun Liu, James T. Kwok

    Abstract: As the capabilities of large language models (LLMs) have expanded dramatically, aligning these models with human values presents a significant challenge. Traditional alignment strategies rely heavily on human intervention, such as Supervised Fine-Tuning (SFT) and Reinforcement Learning from Human Feedback (RLHF), or on the self-alignment capacities of LLMs, which usually require a strong LLM's eme…

    Submitted 8 July, 2024; v1 submitted 1 May, 2024; originally announced May 2024.

  6. arXiv:2403.09572  [pdf, other]

    cs.CV

    Eyes Closed, Safety On: Protecting Multimodal LLMs via Image-to-Text Transformation

    Authors: Yunhao Gou, Kai Chen, Zhili Liu, Lanqing Hong, Hang Xu, Zhenguo Li, Dit-Yan Yeung, James T. Kwok, Yu Zhang

    Abstract: Multimodal large language models (MLLMs) have shown impressive reasoning abilities, which, however, are also more vulnerable to jailbreak attacks than their LLM predecessors. Although still capable of detecting unsafe responses, we observe that safety mechanisms of the pre-aligned LLMs in MLLMs can be easily bypassed due to the introduction of image features. To construct robust MLLMs, we propose…

    Submitted 22 March, 2024; v1 submitted 14 March, 2024; originally announced March 2024.

    Comments: Project Page: https://gyhdog99.github.io/projects/ecso/

  7. arXiv:2402.05382  [pdf, other]

    cs.CV cs.LG

    Task-customized Masked AutoEncoder via Mixture of Cluster-conditional Experts

    Authors: Zhili Liu, Kai Chen, Jianhua Han, Lanqing Hong, Hang Xu, Zhenguo Li, James T. Kwok

    Abstract: Masked Autoencoder (MAE) is a prevailing self-supervised learning method that achieves promising results in model pre-training. However, when the various downstream tasks have data distributions different from the pre-training data, the semantically irrelevant pre-training information might result in negative transfer, impeding MAE's scalability. To address this issue, we propose a novel MAE-based…

    Submitted 7 February, 2024; originally announced February 2024.

    Comments: Accepted by ICLR 2023

  8. KICGPT: Large Language Model with Knowledge in Context for Knowledge Graph Completion

    Authors: Yanbin Wei, Qiushi Huang, James T. Kwok, Yu Zhang

    Abstract: Knowledge Graph Completion (KGC) is crucial for addressing knowledge graph incompleteness and supporting downstream applications. Many models have been proposed for KGC. They can be categorized into two main classes: triple-based and text-based approaches. Triple-based methods struggle with long-tail entities due to limited structural information and imbalanced entity distributions. Text-based met…

    Submitted 23 February, 2024; v1 submitted 4 February, 2024; originally announced February 2024.

    Comments: Accepted to EMNLP 2023 Findings

  9. arXiv:2402.02130  [pdf, other]

    cs.CL

    GITA: Graph to Visual and Textual Integration for Vision-Language Graph Reasoning

    Authors: Yanbin Wei, Shuai Fu, Weisen Jiang, Zejian Zhang, Zhixiong Zeng, Qi Wu, James T. Kwok, Yu Zhang

    Abstract: Large Language Models (LLMs) are increasingly used for various tasks with graph structures. Though LLMs can process graph information in a textual format, they overlook the rich vision modality, which is an intuitive way for humans to comprehend structural information and conduct general graph reasoning. The potential benefits and capabilities of representing graph structures as visual images (i.e…

    Submitted 24 May, 2024; v1 submitted 3 February, 2024; originally announced February 2024.

  10. arXiv:2401.07502  [pdf, other]

    cs.CV

    Compositional Oil Spill Detection Based on Object Detector and Adapted Segment Anything Model from SAR Images

    Authors: Wenhui Wu, Man Sing Wong, Xinyu Yu, Guoqiang Shi, Coco Yin Tung Kwok, Kang Zou

    Abstract: Semantic segmentation-based methods have attracted extensive attention in oil spill detection from SAR images. However, the existing approaches require a large number of finely annotated segmentation samples in the training stage. To alleviate this issue, we propose a composite oil spill detection framework, SAM-OIL, comprising an object detector (e.g., YOLOv8), an adapted Segment Anything Model (…

    Submitted 15 January, 2024; originally announced January 2024.

    Comments: 5 pages, 4 figures

  11. arXiv:2312.12379  [pdf, other]

    cs.CV

    Mixture of Cluster-conditional LoRA Experts for Vision-language Instruction Tuning

    Authors: Yunhao Gou, Zhili Liu, Kai Chen, Lanqing Hong, Hang Xu, Aoxue Li, Dit-Yan Yeung, James T. Kwok, Yu Zhang

    Abstract: Instruction tuning of Large Vision-language Models (LVLMs) has revolutionized the development of versatile models with zero-shot generalization across a wide range of downstream vision-language tasks. However, the diversity of training tasks of different sources and formats would lead to inevitable task conflicts, where different tasks conflict for the same set of model parameters, resulting in su…

    Submitted 3 July, 2024; v1 submitted 19 December, 2023; originally announced December 2023.

    Comments: Project website: https://gyhdog99.github.io/projects/mocle/

  12. arXiv:2311.05936  [pdf, ps, other]

    cs.LG

    Aggregation Weighting of Federated Learning via Generalization Bound Estimation

    Authors: Mingwei Xu, Xiaofeng Cao, Ivor W. Tsang, James T. Kwok

    Abstract: Federated Learning (FL) typically aggregates client model parameters using a weighting approach determined by sample proportions. However, this naive weighting method may lead to unfairness and degradation in model performance due to statistical heterogeneity and the inclusion of noisy data among clients. Theoretically, distributional robustness analysis has shown that the generalization performan…

    Submitted 10 November, 2023; originally announced November 2023.

  13. arXiv:2310.15301  [pdf, other]

    cs.LG

    ADMarker: A Multi-Modal Federated Learning System for Monitoring Digital Biomarkers of Alzheimer's Disease

    Authors: Xiaomin Ouyang, Xian Shuai, Yang Li, Li Pan, Xifan Zhang, Heming Fu, Sitong Cheng, Xinyan Wang, Shihua Cao, Jiang Xin, Hazel Mok, Zhenyu Yan, Doris Sau Fung Yu, Timothy Kwok, Guoliang Xing

    Abstract: Alzheimer's Disease (AD) and related dementia are a growing global health challenge due to the aging population. In this paper, we present ADMarker, the first end-to-end system that integrates multi-modal sensors and new federated learning algorithms for detecting multidimensional AD digital biomarkers in natural living environments. ADMarker features a novel three-stage multi-modal federated lear…

    Submitted 12 April, 2024; v1 submitted 23 October, 2023; originally announced October 2023.

  14. arXiv:2310.01886  [pdf, other]

    cs.LG cs.CL cs.CV

    BYOM: Building Your Own Multi-Task Model For Free

    Authors: Weisen Jiang, Baijiong Lin, Han Shi, Yu Zhang, Zhenguo Li, James T. Kwok

    Abstract: Recently, various merging methods have been proposed to build a multi-task model from task-specific finetuned models without retraining. However, existing methods suffer from a large performance deterioration compared to using multiple task-specific models. In this paper, we propose to inject task-specific knowledge into the merged model and design two parameter-efficient approaches (BYOM-FFT and…

    Submitted 3 February, 2024; v1 submitted 3 October, 2023; originally announced October 2023.

    Comments: Technical Report

  15. arXiv:2309.14360  [pdf, other]

    cs.LG cs.CV

    Domain-Guided Conditional Diffusion Model for Unsupervised Domain Adaptation

    Authors: Yulong Zhang, Shuhao Chen, Weisen Jiang, Yu Zhang, Jiangang Lu, James T. Kwok

    Abstract: Limited transferability hinders the performance of deep learning models when applied to new application scenarios. Recently, Unsupervised Domain Adaptation (UDA) has achieved significant progress in addressing this issue via learning domain-invariant features. However, the performance of existing UDA methods is constrained by the large domain shift and limited target domain data. To alleviate thes…

    Submitted 23 September, 2023; originally announced September 2023.

    Comments: Work in progress

  16. arXiv:2309.12284  [pdf, other]

    cs.CL cs.AI

    MetaMath: Bootstrap Your Own Mathematical Questions for Large Language Models

    Authors: Longhui Yu, Weisen Jiang, Han Shi, Jincheng Yu, Zhengying Liu, Yu Zhang, James T. Kwok, Zhenguo Li, Adrian Weller, Weiyang Liu

    Abstract: Large language models (LLMs) have pushed the limits of natural language understanding and exhibited excellent problem-solving ability. Despite the great success, most existing open-source LLMs (e.g., LLaMA-2) are still far from satisfactory at solving mathematical problems due to the complex reasoning procedures. To bridge this gap, we propose MetaMath, a fine-tuned language model that specia…

    Submitted 3 May, 2024; v1 submitted 21 September, 2023; originally announced September 2023.

    Comments: To appear at ICLR 2024 (Spotlight). Project Page: https://meta-math.github.io/

  17. arXiv:2308.12029  [pdf, other]

    cs.LG cs.AI

    Dual-Balancing for Multi-Task Learning

    Authors: Baijiong Lin, Weisen Jiang, Feiyang Ye, Yu Zhang, Pengguang Chen, Ying-Cong Chen, Shu Liu, James T. Kwok

    Abstract: Multi-task learning (MTL), a learning paradigm to learn multiple related tasks simultaneously, has achieved great success in various fields. However, the task balancing problem remains a significant challenge in MTL, with the disparity in loss/gradient scales often leading to performance compromises. In this paper, we propose a Dual-Balancing Multi-Task Learning (DB-MTL) method to alleviate the task b…

    Submitted 29 September, 2023; v1 submitted 23 August, 2023; originally announced August 2023.

    Comments: Technical Report

  18. arXiv:2308.07758  [pdf, other]

    cs.CL cs.AI cs.LG

    Forward-Backward Reasoning in Large Language Models for Mathematical Verification

    Authors: Weisen Jiang, Han Shi, Longhui Yu, Zhengying Liu, Yu Zhang, Zhenguo Li, James T. Kwok

    Abstract: Self-Consistency samples diverse reasoning chains with answers and chooses the final answer by majority voting. It is based on forward reasoning and cannot further improve performance by sampling more reasoning chains when saturated. To further boost performance, we introduce backward reasoning to verify candidate answers. Specifically, for mathematical tasks, we mask a number in the question and…

    Submitted 4 June, 2024; v1 submitted 15 August, 2023; originally announced August 2023.

    Comments: Accepted by Findings of ACL 2024

  19. arXiv:2306.05675  [pdf, other]

    cs.CV

    Illumination Controllable Dehazing Network based on Unsupervised Retinex Embedding

    Authors: Jie Gui, Xiaofeng Cong, Lei He, Yuan Yan Tang, James Tin-Yau Kwok

    Abstract: On the one hand, the dehazing task is an ill-posed problem, which means that no unique solution exists. On the other hand, the dehazing task should take into account the subjective factor, which is to give the user selectable dehazed images rather than a single result. Therefore, this paper proposes a multi-output dehazing network by introducing illumination controllable ability, called IC-Deha…

    Submitted 9 June, 2023; originally announced June 2023.

  20. arXiv:2306.00618  [pdf, other]

    cs.CL cs.AI cs.LG

    Effective Structured Prompting by Meta-Learning and Representative Verbalizer

    Authors: Weisen Jiang, Yu Zhang, James T. Kwok

    Abstract: Prompt tuning for pre-trained masked language models (MLM) has shown promising performance in natural language processing tasks with few labeled examples. It tunes a prompt for the downstream task, and a verbalizer is used to bridge the predicted token and label prediction. Due to the limited training data, prompt initialization is crucial for prompt tuning. Recently, MetaPrompting (Hou et al., 20…

    Submitted 21 March, 2024; v1 submitted 1 June, 2023; originally announced June 2023.

    Comments: Accepted at ICML 2023

  21. arXiv:2305.10716  [pdf, other]

    cs.LG cs.AI

    A Survey on Time-Series Pre-Trained Models

    Authors: Qianli Ma, Zhen Liu, Zhenjing Zheng, Ziyang Huang, Siying Zhu, Zhongzhong Yu, James T. Kwok

    Abstract: Time-Series Mining (TSM) is an important research area since it shows great potential in practical applications. Deep learning models that rely on massive labeled data have been utilized for TSM successfully. However, constructing a large-scale well-labeled dataset is difficult due to data annotation costs. Recently, Pre-Trained Models have gradually attracted attention in the time series domain d…

    Submitted 18 May, 2023; originally announced May 2023.

    Comments: Under review in the IEEE Transactions on Knowledge and Data Engineering

  22. arXiv:2303.18049  [pdf, other]

    cs.CL

    No Place to Hide: Dual Deep Interaction Channel Network for Fake News Detection based on Data Augmentation

    Authors: Biwei Cao, Lulu Hua, Jiuxin Cao, Jie Gui, Bo Liu, James Tin-Yau Kwok

    Abstract: Online Social Network (OSN) has become a hotbed of fake news due to the low cost of information dissemination. Although the existing methods have made many attempts in news content and propagation structure, the detection of fake news is still facing two challenges: one is how to mine the unique key features and evolution patterns, and the other is how to tackle the problem of small samples to bui…

    Submitted 31 March, 2023; originally announced March 2023.

  23. arXiv:2303.17255  [pdf, other]

    cs.CV cs.CR eess.IV

    Fooling the Image Dehazing Models by First Order Gradient

    Authors: Jie Gui, Xiaofeng Cong, Chengwei Peng, Yuan Yan Tang, James Tin-Yau Kwok

    Abstract: The research on the single image dehazing task has been widely explored. However, as far as we know, no comprehensive study has been conducted on the robustness of the well-trained dehazing models. Therefore, there is no evidence that the dehazing networks can resist malicious attacks. In this paper, we focus on designing a group of attack methods based on first order gradient to verify the robust…

    Submitted 15 February, 2024; v1 submitted 30 March, 2023; originally announced March 2023.

    Comments: This paper is accepted by IEEE Transactions on Circuits and Systems for Video Technology (TCSVT)

  24. arXiv:2303.02405  [pdf, other]

    cs.LG cs.AI

    Decision Support System for Chronic Diseases Based on Drug-Drug Interactions

    Authors: Tian Bian, Yuli Jiang, Jia Li, Tingyang Xu, Yu Rong, Yi Su, Timothy Kwok, Helen Meng, Hong Cheng

    Abstract: Many patients with chronic diseases resort to multiple medications to relieve various symptoms, which raises concerns about the safety of multiple medication use, as severe drug-drug antagonism can lead to serious adverse effects or even death. This paper presents a Decision Support System, called DSSDDI, based on drug-drug interactions to support doctors' prescribing decisions. DSSDDI contains thr…

    Submitted 4 March, 2023; originally announced March 2023.

    Journal ref: ICDE2023

  25. Learning the Relation between Similarity Loss and Clustering Loss in Self-Supervised Learning

    Authors: Jidong Ge, Yuxiang Liu, Jie Gui, Lanting Fang, Ming Lin, James Tin-Yau Kwok, LiGuo Huang, Bin Luo

    Abstract: Self-supervised learning enables networks to learn discriminative features from massive data itself. Most state-of-the-art methods maximize the similarity between two augmentations of one image based on contrastive learning. By utilizing the consistency of two augmentations, the burden of manual annotations can be freed. Contrastive learning exploits instance-level information to learn robust feat…

    Submitted 5 June, 2023; v1 submitted 8 January, 2023; originally announced January 2023.

    Comments: This paper is accepted by IEEE Transactions on Image Processing

  26. AlignVE: Visual Entailment Recognition Based on Alignment Relations

    Authors: Biwei Cao, Jiuxin Cao, Jie Gui, Jiayun Shen, Bo Liu, Lei He, Yuan Yan Tang, James Tin-Yau Kwok

    Abstract: Visual entailment (VE) is to recognize whether the semantics of a hypothesis text can be inferred from the given premise image, which is one special task among recently emerged vision and language understanding tasks. Currently, most of the existing VE approaches are derived from the methods of visual question answering. They recognize visual entailment by quantifying the similarity between the hypo…

    Submitted 16 November, 2022; originally announced November 2022.

    Comments: This paper is accepted for publication as a REGULAR paper in the IEEE Transactions on Multimedia

  27. Automated Dominative Subspace Mining for Efficient Neural Architecture Search

    Authors: Yaofo Chen, Yong Guo, Daihai Liao, Fanbing Lv, Hengjie Song, James Tin-Yau Kwok, Mingkui Tan

    Abstract: Neural Architecture Search (NAS) aims to automatically find effective architectures within a predefined search space. However, the search space is often extremely large. As a result, directly searching in such a large search space is non-trivial and also very time-consuming. To address the above issues, in each search step, we seek to limit the search space to a small but effective subspace to boo…

    Submitted 6 June, 2024; v1 submitted 31 October, 2022; originally announced October 2022.

    Comments: Published in IEEE TCSVT

  28. arXiv:2209.13139  [pdf, other]

    cs.CV cs.AI

    Searching a High-Performance Feature Extractor for Text Recognition Network

    Authors: Hui Zhang, Quanming Yao, James T. Kwok, Xiang Bai

    Abstract: Feature extractor plays a critical role in text recognition (TR), but customizing its architecture is relatively less explored due to expensive manual tweaking. In this work, inspired by the success of neural architecture search (NAS), we propose to search for suitable feature extractors. We design a domain-specific search space by exploring principles for having good feature extractors. The space…

    Submitted 26 September, 2022; originally announced September 2022.

  29. arXiv:2207.14443  [pdf, other]

    cs.LG

    A Survey of Learning on Small Data: Generalization, Optimization, and Challenge

    Authors: Xiaofeng Cao, Weixin Bu, Shengjun Huang, Minling Zhang, Ivor W. Tsang, Yew Soon Ong, James T. Kwok

    Abstract: Learning on big data brings success for artificial intelligence (AI), but the annotation and training costs are expensive. In future, learning on small data that approximates the generalization ability of big data is one of the ultimate purposes of AI, which requires machines to recognize objectives and scenarios relying on small data as humans. A series of learning topics is going on this way suc…

    Submitted 6 June, 2023; v1 submitted 28 July, 2022; originally announced July 2022.

  30. arXiv:2206.15205  [pdf, other]

    cs.LG

    Black-box Generalization of Machine Teaching

    Authors: Xiaofeng Cao, Yaming Guo, Ivor W. Tsang, James T. Kwok

    Abstract: Hypothesis-pruning maximizes the hypothesis updates for active learning to find those desired unlabeled data. An inherent assumption is that this learning manner can derive those updates into the optimal hypothesis. However, its convergence may not be guaranteed well if those incremental updates are negative and disordered. In this paper, we introduce a black-box teaching hypothesis…

    Submitted 20 September, 2023; v1 submitted 30 June, 2022; originally announced June 2022.

  31. arXiv:2203.06168  [pdf, other]

    cs.DS cs.DM math.CO math.PR

    Cheeger Inequalities for Vertex Expansion and Reweighted Eigenvalues

    Authors: Tsz Chiu Kwok, Lap Chi Lau, Kam Chuen Tung

    Abstract: The classical Cheeger's inequality relates the edge conductance $\phi$ of a graph and the second smallest eigenvalue $\lambda_2$ of the Laplacian matrix. Recently, Olesker-Taylor and Zanetti discovered a Cheeger-type inequality $\psi^2 / \log |V| \lesssim \lambda_2^* \lesssim \psi$ connecting the vertex expansion $\psi$ of a graph $G=(V,E)$ and the maximum reweighted second smallest eigenvalue $\lambda_2^*$ of the Laplacian ma…

    Submitted 19 September, 2022; v1 submitted 11 March, 2022; originally announced March 2022.

    Comments: 65 pages, 1 figure. Minor changes

  32. arXiv:2202.08625  [pdf, other]

    cs.LG

    Revisiting Over-smoothing in BERT from the Perspective of Graph

    Authors: Han Shi, Jiahui Gao, Hang Xu, Xiaodan Liang, Zhenguo Li, Lingpeng Kong, Stephen M. S. Lee, James T. Kwok

    Abstract: Recently over-smoothing phenomenon of Transformer-based models is observed in both vision and language fields. However, no existing work has delved deeper to further investigate the main cause of this phenomenon. In this work, we make the attempt to analyze the over-smoothing problem from the perspective of graph, where such problem was first discovered and explored. Intuitively, the self-attentio…

    Submitted 17 February, 2022; originally announced February 2022.

    Comments: Accepted by ICLR 2022 (Spotlight)

  33. arXiv:2109.08342  [pdf, other]

    cs.LG

    Dropout's Dream Land: Generalization from Learned Simulators to Reality

    Authors: Zac Wellmer, James T. Kwok

    Abstract: A World Model is a generative model used to simulate an environment. World Models have proven capable of learning spatial and temporal representations of Reinforcement Learning environments. In some cases, a World Model offers an agent the opportunity to learn entirely inside of its own dream environment. In this work we explore improving the generalization capabilities from dream environments to…

    Submitted 16 September, 2021; originally announced September 2021.

    Comments: Published at ECML PKDD 2021

  34. arXiv:2107.00184  [pdf, other]

    cs.AI

    Bilinear Scoring Function Search for Knowledge Graph Learning

    Authors: Yongqi Zhang, Quanming Yao, James Tin-Yau Kwok

    Abstract: Learning embeddings for entities and relations in knowledge graph (KG) has benefited many downstream tasks. In recent years, scoring functions, the crux of KG learning, have been human-designed to measure the plausibility of triples and capture different kinds of relations in KGs. However, as relations exhibit intricate patterns that are hard to infer before training, none of them consistently pe…

    Submitted 4 March, 2022; v1 submitted 30 June, 2021; originally announced July 2021.

    Comments: TPAMI accepted

  35. arXiv:2106.06996  [pdf, other]

    eess.IV cs.CV

    Pyramidal Dense Attention Networks for Lightweight Image Super-Resolution

    Authors: Huapeng Wu, Jie Gui, Jun Zhang, James T. Kwok, Zhihui Wei

    Abstract: Recently, deep convolutional neural network methods have achieved an excellent performance in image super-resolution (SR), but they cannot be easily applied to embedded devices due to large memory cost. To solve this problem, we propose a pyramidal dense attention network (PDAN) for lightweight image super-resolution in this paper. In our method, the proposed pyramidal dense learning can gradually…

    Submitted 13 June, 2021; originally announced June 2021.

  36. arXiv:2106.06966  [pdf, other]

    eess.IV cs.CV

    Feedback Pyramid Attention Networks for Single Image Super-Resolution

    Authors: Huapeng Wu, Jie Gui, Jun Zhang, James T. Kwok, Zhihui Wei

    Abstract: Recently, convolutional neural network (CNN) based image super-resolution (SR) methods have achieved significant performance improvement. However, most CNN-based methods mainly focus on feed-forward architecture design and neglect to explore the feedback mechanism, which usually exists in the human visual system. In this paper, we propose feedback pyramid attention networks (FPAN) to fully exploit…

    Submitted 13 June, 2021; originally announced June 2021.

  37. arXiv:2106.06326  [pdf, other]

    cs.LG

    TOHAN: A One-step Approach towards Few-shot Hypothesis Adaptation

    Authors: Haoang Chi, Feng Liu, Wenjing Yang, Long Lan, Tongliang Liu, Bo Han, William K. Cheung, James T. Kwok

    Abstract: In few-shot domain adaptation (FDA), classifiers for the target domain are trained with accessible labeled data in the source domain (SD) and few labeled data in the target domain (TD). However, data usually contain private information in the current era, e.g., data distributed on personal phones. Thus, the private information will be leaked if we directly access data in SD to train a target-domai…

    Submitted 7 September, 2022; v1 submitted 11 June, 2021; originally announced June 2021.

  38. arXiv:2102.12871  [pdf, other]

    cs.LG

    SparseBERT: Rethinking the Importance Analysis in Self-attention

    Authors: Han Shi, Jiahui Gao, Xiaozhe Ren, Hang Xu, Xiaodan Liang, Zhenguo Li, James T. Kwok

    Abstract: Transformer-based models are popularly used in natural language processing (NLP). Its core component, self-attention, has aroused widespread interest. To understand the self-attention mechanism, a direct method is to visualize the attention map of a pre-trained model. Based on the patterns observed, a series of efficient Transformers with different sparse attention masks have been proposed. From a…

    Submitted 1 July, 2021; v1 submitted 25 February, 2021; originally announced February 2021.

    Comments: Accepted by ICML 2021

  39. arXiv:2011.04406  [pdf, other]

    cs.LG

    A Survey of Label-noise Representation Learning: Past, Present and Future

    Authors: Bo Han, Quanming Yao, Tongliang Liu, Gang Niu, Ivor W. Tsang, James T. Kwok, Masashi Sugiyama

    Abstract: Classical machine learning implicitly assumes that labels of the training data are sampled from a clean distribution, which can be too restrictive for real-world scenarios. However, statistical-learning-based methods may not train deep learning models robustly with these noisy labels. Therefore, it is urgent to design Label-Noise Representation Learning (LNRL) methods for robustly training deep mo…

    Submitted 20 February, 2021; v1 submitted 9 November, 2020; originally announced November 2020.

    Comments: The draft is kept updating; any comments and suggestions are welcome

  40. arXiv:2008.06542  [pdf, other]

    cs.LG stat.ML

    A Scalable, Adaptive and Sound Nonconvex Regularizer for Low-rank Matrix Completion

    Authors: Yaqing Wang, Quanming Yao, James T. Kwok

    Abstract: Matrix learning is at the core of many machine learning problems. A number of real-world applications such as collaborative filtering and text mining can be formulated as a low-rank matrix completion problem, which recovers an incomplete matrix using low-rank assumptions. To ensure that the matrix solution has a low rank, a recent trend is to use nonconvex regularizers that adaptively penalize sing…

    Submitted 22 February, 2021; v1 submitted 14 August, 2020; originally announced August 2020.

    Comments: WebConf 2021

  41. arXiv:2006.09117  [pdf, other]

    eess.IV cs.CV cs.RO

    End-to-End Real-time Catheter Segmentation with Optical Flow-Guided Warping during Endovascular Intervention

    Authors: Anh Nguyen, Dennis Kundrat, Giulio Dagnino, Wenqiang Chi, Mohamed E. M. K. Abdelaziz, Yao Guo, YingLiang Ma, Trevor M. Y. Kwok, Celia Riga, Guang-Zhong Yang

    Abstract: Accurate real-time catheter segmentation is an important pre-requisite for robot-assisted endovascular intervention. Most of the existing learning-based methods for catheter segmentation and tracking are only trained on small-scale datasets or synthetic data due to the difficulties of ground-truth annotation. Furthermore, the temporal continuity in intraoperative imaging sequences is not fully uti…

    Submitted 16 June, 2020; originally announced June 2020.

    Comments: ICRA 2020

  42. arXiv:1911.11322  [pdf, other]

    cs.LG stat.ML

    Effective Decoding in Graph Auto-Encoder using Triadic Closure

    Authors: Han Shi, Haozheng Fan, James T. Kwok

    Abstract: The (variational) graph auto-encoder and its variants have been popularly used for representation learning on graph-structured data. While the encoder is often a powerful graph convolutional network, the decoder reconstructs the graph structure by only considering two nodes at a time, thus ignoring possible interactions among edges. On the other hand, structured prediction, which considers the who…

    Submitted 25 November, 2019; originally announced November 2019.

    Comments: Accepted by AAAI 2020

  43. arXiv:1911.09336  [pdf, other]

    cs.LG stat.ML

    Bridging the Gap between Sample-based and One-shot Neural Architecture Search with BONAS

    Authors: Han Shi, Renjie Pi, Hang Xu, Zhenguo Li, James T. Kwok, Tong Zhang

    Abstract: Neural Architecture Search (NAS) has shown great potentials in finding better neural network designs. Sample-based NAS is the most reliable approach which aims at exploring the search space and evaluating the most promising architectures. However, it is computationally very costly. As a remedy, the one-shot approach has emerged as a popular technique for accelerating NAS using weight-sharing. Howe…

    Submitted 24 November, 2020; v1 submitted 21 November, 2019; originally announced November 2019.

    Comments: Accepted by NeurIPS 2020

  44. arXiv:1905.10936  [pdf, other]

    cs.LG cs.DC math.OC stat.ML

    Communication-Efficient Distributed Blockwise Momentum SGD with Error-Feedback

    Authors: Shuai Zheng, Ziyue Huang, James T. Kwok

    Abstract: Communication overhead is a major bottleneck hampering the scalability of distributed machine learning systems. Recently, there has been a surge of interest in using gradient compression to improve the communication efficiency of distributed neural network training. Using 1-bit quantization, signSGD with majority vote achieves a 32x reduction on communication cost. However, its convergence is base…

    Submitted 28 October, 2019; v1 submitted 26 May, 2019; originally announced May 2019.

    Comments: NeurIPS 2019

  45. arXiv:1905.09899  [pdf, other]

    cs.LG math.OC stat.ML

    Blockwise Adaptivity: Faster Training and Better Generalization in Deep Learning

    Authors: Shuai Zheng, James T. Kwok

    Abstract: Stochastic methods with coordinate-wise adaptive stepsize (such as RMSprop and Adam) have been widely used in training deep neural networks. Despite their fast convergence, they can generalize worse than stochastic gradient descent. In this paper, by revisiting the design of Adagrad, we propose to split the network parameters into blocks, and use a blockwise adaptive stepsize. Intuitively, blockwi…

    Submitted 23 May, 2019; originally announced May 2019.

  46. arXiv:1904.03213  [pdf, ps, other]

    cs.DS math.FA math.OC quant-ph

    Spectral analysis of matrix scaling and operator scaling

    Authors: Tsz Chiu Kwok, Lap Chi Lau, Akshay Ramachandran

    Abstract: We present a spectral analysis for matrix scaling and operator scaling. We prove that if the input matrix or operator has a spectral gap, then a natural gradient flow has linear convergence. This implies that a simple gradient descent algorithm also has linear convergence under the same assumption. The spectral gap condition for operator scaling is closely related to the notion of quantum expander…

    Submitted 5 April, 2019; originally announced April 2019.

  47. General Convolutional Sparse Coding with Unknown Noise

    Authors: Yaqing Wang, James T. Kwok, Lionel M. Ni

    Abstract: Convolutional sparse coding (CSC) can learn representative shift-invariant patterns from multiple kinds of data. However, existing CSC methods can only model noises from Gaussian distribution, which is restrictive and unrealistic. In this paper, we propose a general CSC model capable of dealing with complicated unknown noise. The noise is now modeled by Gaussian mixture model, which can approximat…

    Submitted 7 March, 2019; originally announced March 2019.

  48. arXiv:1811.09491  [pdf, other]

    cs.LG cs.AI stat.ML

    Differential Private Stack Generalization with an Application to Diabetes Prediction

    Authors: Quanming Yao, Xiawei Guo, James T. Kwok, WeiWei Tu, Yuqiang Chen, Wenyuan Dai, Qiang Yang

    Abstract: To meet the standard of differential privacy, noise is usually added into the original data, which inevitably deteriorates the predicting performance of subsequent learning algorithms. In this paper, motivated by the success of improving predicting performance by ensemble learning, we propose to enhance privacy-preserving logistic regression by stacking. We show that this can be done either by sam…

    Submitted 2 June, 2019; v1 submitted 23 November, 2018; originally announced November 2018.

  49. arXiv:1807.08725  [pdf, other]

    cs.LG stat.ML

    FasTer: Fast Tensor Completion with Nonconvex Regularization

    Authors: Quanming Yao, James T Kwok, Bo Han

    Abstract: Low-rank tensor completion problem aims to recover a tensor from limited observations, which has many real-world applications. Due to the easy optimization, the convex overlapping nuclear norm has been popularly used for tensor completion. However, it over-penalizes top singular values and lead to biased estimations. In this paper, we propose to use the nonconvex regularizer, which can less penali…

    Submitted 23 January, 2019; v1 submitted 23 July, 2018; originally announced July 2018.

  50. arXiv:1806.02927  [pdf, other]

    cs.LG math.OC stat.ML

    Lightweight Stochastic Optimization for Minimizing Finite Sums with Infinite Data

    Authors: Shuai Zheng, James T. Kwok

    Abstract: Variance reduction has been commonly used in stochastic optimization. It relies crucially on the assumption that the data set is finite. However, when the data are imputed with random noise as in data augmentation, the perturbed data set becomes essentially infinite. Recently, the stochastic MISO (S-MISO) algorithm is introduced to address this expected risk minimization problem. Though it conve…

    Submitted 7 June, 2018; originally announced June 2018.

    Comments: To appear in ICML 2018