
Showing 1–30 of 30 results for author: Jui, S

  1. arXiv:2402.02791  [pdf, other]

    cs.CL cs.AI cs.LG

    Rethinking Optimization and Architecture for Tiny Language Models

    Authors: Yehui Tang, Fangcheng Liu, Yunsheng Ni, Yuchuan Tian, Zheyuan Bai, Yi-Qi Hu, Sichao Liu, Shangling Jui, Kai Han, Yunhe Wang

    Abstract: The power of large language models (LLMs) has been demonstrated through numerous data and computing resources. However, the application of language models on mobile devices is facing a huge challenge in computation and memory costs, that is, tiny language models with high performance are urgently required. Limited by the highly complex training process, there are many details for optimizing lang…

    Submitted 6 February, 2024; v1 submitted 5 February, 2024; originally announced February 2024.

  2. arXiv:2312.15300  [pdf, other]

    cs.CV

    Q-Boost: On Visual Quality Assessment Ability of Low-level Multi-Modality Foundation Models

    Authors: Zicheng Zhang, Haoning Wu, Zhongpeng Ji, Chunyi Li, Erli Zhang, Wei Sun, Xiaohong Liu, Xiongkuo Min, Fengyu Sun, Shangling Jui, Weisi Lin, Guangtao Zhai

    Abstract: Recent advancements in Multi-modality Large Language Models (MLLMs) have demonstrated remarkable capabilities in complex high-level vision tasks. However, the exploration of MLLM potential in visual quality assessment, a vital aspect of low-level vision, remains limited. To address this gap, we introduce Q-Boost, a novel strategy designed to enhance low-level MLLMs in image quality assessment (IQA…

    Submitted 23 December, 2023; originally announced December 2023.

  3. arXiv:2312.15246  [pdf, other]

    cs.LG cs.AI math.NA math.PR

    A Theory of Non-Acyclic Generative Flow Networks

    Authors: Leo Maxime Brunswic, Yinchuan Li, Yushun Xu, Shangling Jui, Lizhuang Ma

    Abstract: GFlowNets is a novel flow-based method for learning a stochastic policy to generate objects via a sequence of actions and with probability proportional to a given positive reward. We contribute to relaxing hypotheses limiting the application range of GFlowNets, in particular: acyclicity (or lack thereof). To this end, we extend the theory of GFlowNets on measurable spaces which includes continuous…

    Submitted 23 December, 2023; originally announced December 2023.

    Comments: 16 pages, 8 figures, 1 table, AAAI 2024

    MSC Class: 68T07; 68T20; 60J05; 60J20; 60J22; 65F45; 65J20; 68T05; 68T20

  4. arXiv:2312.05476  [pdf, other]

    cs.CV

    Exploring the Naturalness of AI-Generated Images

    Authors: Zijian Chen, Wei Sun, Haoning Wu, Zicheng Zhang, Jun Jia, Zhongpeng Ji, Fengyu Sun, Shangling Jui, Xiongkuo Min, Guangtao Zhai, Wenjun Zhang

    Abstract: The proliferation of Artificial Intelligence-Generated Images (AGIs) has greatly expanded the Image Naturalness Assessment (INA) problem. Different from early definitions that mainly focus on tone-mapped images with limited distortions (e.g., exposure, contrast, and color reproduction), INA on AI-generated images is especially challenging as it has more diverse contents and could be affected by fa…

    Submitted 4 March, 2024; v1 submitted 9 December, 2023; originally announced December 2023.

    Comments: 33 pages

  5. arXiv:2309.00528  [pdf, other]

    cs.CV

    Trust your Good Friends: Source-free Domain Adaptation by Reciprocal Neighborhood Clustering

    Authors: Shiqi Yang, Yaxing Wang, Joost van de Weijer, Luis Herranz, Shangling Jui, Jian Yang

    Abstract: Domain adaptation (DA) aims to alleviate the domain shift between source domain and target domain. Most DA methods require access to the source data, but often that is not possible (e.g. due to data privacy or intellectual property). In this paper, we address the challenging source-free domain adaptation (SFDA) problem, where the source pretrained model is adapted to the target domain in the absen…

    Submitted 1 September, 2023; originally announced September 2023.

    Comments: Accepted by IEEE TPAMI, extended version of conference paper arXiv:2110.04202

  6. arXiv:2308.07641  [pdf, other]

    cs.LG cs.AI

    Ternary Singular Value Decomposition as a Better Parameterized Form in Linear Mapping

    Authors: Boyu Chen, Hanxuan Chen, Jiao He, Fengyu Sun, Shangling Jui

    Abstract: We present a simple yet novel parameterized form of linear mapping to achieve remarkable network compression performance: a pseudo SVD called Ternary SVD (TSVD). Unlike vanilla SVD, TSVD limits the $U$ and $V$ matrices in SVD to ternary matrices form in $\{\pm 1, 0\}$. This means that instead of using the expensive multiplication instructions, TSVD only requires addition instructions when compu…

    Submitted 15 August, 2023; originally announced August 2023.

  7. arXiv:2303.02733  [pdf, other]

    cs.LG cs.AI cs.CV

    Reparameterization through Spatial Gradient Scaling

    Authors: Alexander Detkov, Mohammad Salameh, Muhammad Fetrat Qharabagh, Jialin Zhang, Wei Lui, Shangling Jui, Di Niu

    Abstract: Reparameterization aims to improve the generalization of deep neural networks by transforming convolutional layers into equivalent multi-branched structures during training. However, there exists a gap in understanding how reparameterization may change and benefit the learning process of neural networks. In this paper, we present a novel spatial gradient scaling method to redistribute learning foc…

    Submitted 6 March, 2023; v1 submitted 5 March, 2023; originally announced March 2023.

    Comments: Published at ICLR 2023. Code available at https://github.com/Ascend-Research/Reparameterization

  8. A General-Purpose Transferable Predictor for Neural Architecture Search

    Authors: Fred X. Han, Keith G. Mills, Fabian Chudak, Parsa Riahi, Mohammad Salameh, Jialin Zhang, Wei Lu, Shangling Jui, Di Niu

    Abstract: Understanding and modelling the performance of neural architectures is key to Neural Architecture Search (NAS). Performance predictors have seen widespread use in low-cost NAS and achieve high ranking correlations between predicted and ground truth performance in several NAS benchmarks. However, existing predictors are often designed based on network encodings specific to a predefined search space…

    Submitted 21 February, 2023; originally announced February 2023.

    Comments: Accepted to SDM2023; version includes supplementary material; 12 Pages, 3 Figures, 6 Tables

  9. AIO-P: Expanding Neural Performance Predictors Beyond Image Classification

    Authors: Keith G. Mills, Di Niu, Mohammad Salameh, Weichen Qiu, Fred X. Han, Puyuan Liu, Jialin Zhang, Wei Lu, Shangling Jui

    Abstract: Evaluating neural network performance is critical to deep neural network design but a costly procedure. Neural predictors provide an efficient solution by treating architectures as samples and learning to estimate their performance on a given task. However, existing predictors are task-dependent, predominantly estimating neural network performance on image classification benchmarks. They are also…

    Submitted 24 April, 2023; v1 submitted 30 November, 2022; originally announced November 2022.

    Comments: AAAI 2023 Oral Presentation; version includes supplementary material; 16 Pages, 4 Figures, 22 Tables

  10. GENNAPE: Towards Generalized Neural Architecture Performance Estimators

    Authors: Keith G. Mills, Fred X. Han, Jialin Zhang, Fabian Chudak, Ali Safari Mamaghani, Mohammad Salameh, Wei Lu, Shangling Jui, Di Niu

    Abstract: Predicting neural architecture performance is a challenging task and is crucial to neural architecture design and search. Existing approaches either rely on neural performance predictors which are limited to modeling architectures in a predefined design space involving specific sets of operators and connection rules, and cannot generalize to unseen architectures, or resort to zero-cost proxies whi…

    Submitted 24 April, 2023; v1 submitted 30 November, 2022; originally announced November 2022.

    Comments: AAAI 2023 Oral Presentation; includes supplementary materials with more details on introduced benchmarks; 14 Pages, 6 Figures, 10 Tables

  11. arXiv:2210.01600  [pdf, other]

    cs.CV

    Positive Pair Distillation Considered Harmful: Continual Meta Metric Learning for Lifelong Object Re-Identification

    Authors: Kai Wang, Chenshen Wu, Andy Bagdanov, Xialei Liu, Shiqi Yang, Shangling Jui, Joost van de Weijer

    Abstract: Lifelong object re-identification incrementally learns from a stream of re-identification tasks. The objective is to learn a representation that can be applied to all tasks and that generalizes to previously unseen re-identification tasks. The main challenge is that at inference time the representation must generalize to previously unseen identities. To address this problem, we apply continual met…

    Submitted 4 October, 2022; originally announced October 2022.

    Comments: BMVC 2022

  12. arXiv:2206.03600  [pdf, other]

    cs.CV cs.LG

    OneRing: A Simple Method for Source-free Open-partial Domain Adaptation

    Authors: Shiqi Yang, Yaxing Wang, Kai Wang, Shangling Jui, Joost van de Weijer

    Abstract: In this paper, we investigate Source-free Open-partial Domain Adaptation (SF-OPDA), which addresses the situation where there exist both domain and category shifts between source and target domains. Under the SF-OPDA setting, which aims to address data privacy concerns, the model cannot access source data anymore during target adaptation. We propose a novel training scheme to learn a (n+1)-way cla…

    Submitted 11 January, 2023; v1 submitted 7 June, 2022; originally announced June 2022.

    Comments: Updated. It only focuses on source-free open-partial domain adaptation, to avoid any potential misunderstanding

  13. arXiv:2205.06454  [pdf, other]

    cs.AI

    R5: Rule Discovery with Reinforced and Recurrent Relational Reasoning

    Authors: Shengyao Lu, Bang Liu, Keith G. Mills, Shangling Jui, Di Niu

    Abstract: Systematicity, i.e., the ability to recombine known parts and rules to form new sequences while reasoning over relational data, is critical to machine intelligence. A model with strong systematicity is able to train on small-scale tasks and generalize to large-scale tasks. In this paper, we propose R5, a relational reasoning framework based on reinforcement learning that reasons over relational gr…

    Submitted 13 May, 2022; originally announced May 2022.

    Comments: ICLR 2022 Spotlight

  14. arXiv:2205.04183  [pdf, other]

    cs.CV cs.LG

    Attracting and Dispersing: A Simple Approach for Source-free Domain Adaptation

    Authors: Shiqi Yang, Yaxing Wang, Kai Wang, Shangling Jui, Joost van de Weijer

    Abstract: We propose a simple but effective source-free domain adaptation (SFDA) method. Treating SFDA as an unsupervised clustering problem and following the intuition that local neighbors in feature space should have more similar predictions than other features, we propose to optimize an objective of prediction consistency. This objective encourages local neighborhood features in feature space to have sim…

    Submitted 3 October, 2022; v1 submitted 9 May, 2022; originally announced May 2022.

    Comments: NeurIPS 2022

  15. arXiv:2111.04993  [pdf, other]

    cs.CV

    Incremental Meta-Learning via Episodic Replay Distillation for Few-Shot Image Recognition

    Authors: Kai Wang, Xialei Liu, Andy Bagdanov, Luis Herranz, Shangling Jui, Joost van de Weijer

    Abstract: Most meta-learning approaches assume the existence of a very large set of labeled data available for episodic meta-learning of base knowledge. This contrasts with the more realistic continual learning paradigm in which data arrives incrementally in the form of tasks containing disjoint classes. In this paper we consider this problem of Incremental Meta-Learning (IML) in which classes are presented…

    Submitted 11 November, 2021; v1 submitted 9 November, 2021; originally announced November 2021.

  16. arXiv:2110.08896  [pdf, other]

    cs.LG math.OC

    Damped Anderson Mixing for Deep Reinforcement Learning: Acceleration, Convergence, and Stabilization

    Authors: Ke Sun, Yafei Wang, Yi Liu, Yingnan Zhao, Bo Pan, Shangling Jui, Bei Jiang, Linglong Kong

    Abstract: Anderson mixing has been heuristically applied to reinforcement learning (RL) algorithms for accelerating convergence and improving the sampling efficiency of deep RL. Despite its heuristic improvement of convergence, a rigorous mathematical justification for the benefits of Anderson mixing in RL has not yet been put forward. In this paper, we provide deeper insights into a class of acceleration s…

    Submitted 20 October, 2021; v1 submitted 17 October, 2021; originally announced October 2021.

  17. arXiv:2110.04202  [pdf, other]

    cs.CV cs.LG

    Exploiting the Intrinsic Neighborhood Structure for Source-free Domain Adaptation

    Authors: Shiqi Yang, Yaxing Wang, Joost van de Weijer, Luis Herranz, Shangling Jui

    Abstract: Domain adaptation (DA) aims to alleviate the domain shift between source domain and target domain. Most DA methods require access to the source data, but often that is not possible (e.g. due to data privacy or intellectual property). In this paper, we address the challenging source-free domain adaptation (SFDA) problem, where the source pretrained model is adapted to the target domain in the absen…

    Submitted 29 November, 2021; v1 submitted 8 October, 2021; originally announced October 2021.

    Comments: NeurIPS 2021. Fix four number errors in Tab.5 (first two tables, row 3 and 4) and corresponding text

  18. Profiling Neural Blocks and Design Spaces for Mobile Neural Architecture Search

    Authors: Keith G. Mills, Fred X. Han, Jialin Zhang, Seyed Saeed Changiz Rezaei, Fabian Chudak, Wei Lu, Shuo Lian, Shangling Jui, Di Niu

    Abstract: Neural architecture search automates neural network design and has achieved state-of-the-art results in many deep learning applications. While recent literature has focused on designing networks to maximize accuracy, little work has been conducted to understand the compatibility of architecture design spaces to varying hardware. In this paper, we analyze the neural blocks used to build Once-for-Al…

    Submitted 25 September, 2021; originally announced September 2021.

    Comments: Accepted as an Applied Research Paper at CIKM 2021; 10 pages, 8 Figures, 2 Tables

  19. L$^{2}$NAS: Learning to Optimize Neural Architectures via Continuous-Action Reinforcement Learning

    Authors: Keith G. Mills, Fred X. Han, Mohammad Salameh, Seyed Saeed Changiz Rezaei, Linglong Kong, Wei Lu, Shuo Lian, Shangling Jui, Di Niu

    Abstract: Neural architecture search (NAS) has achieved remarkable results in deep neural network design. Differentiable architecture search converts the search over discrete architectures into a hyperparameter optimization problem which can be solved by gradient descent. However, questions have been raised regarding the effectiveness and generalizability of gradient methods for solving non-convex architect…

    Submitted 25 September, 2021; originally announced September 2021.

    Comments: Accepted as a Full Research Paper at CIKM 2021; 10 pages, 3 Figures, 5 Tables

  20. arXiv:2109.08776  [pdf, other]

    cs.LG

    Exploring the Training Robustness of Distributional Reinforcement Learning against Noisy State Observations

    Authors: Ke Sun, Yingnan Zhao, Shangling Jui, Linglong Kong

    Abstract: In real scenarios, state observations that an agent observes may contain measurement errors or adversarial noises, misleading the agent to take suboptimal actions or even collapse while training. In this paper, we study the training robustness of distributional Reinforcement Learning (RL), a class of state-of-the-art methods that estimate the whole distribution, as opposed to only the expectation,…

    Submitted 21 June, 2023; v1 submitted 17 September, 2021; originally announced September 2021.

    Comments: Accepted in ECML PKDD 2023. This is the authors version of the work. The definitive Version of Record will be published in the Proceedings of ECML PKDD 2023

  21. arXiv:2108.01614  [pdf, other]

    cs.CV

    Generalized Source-free Domain Adaptation

    Authors: Shiqi Yang, Yaxing Wang, Joost van de Weijer, Luis Herranz, Shangling Jui

    Abstract: Domain adaptation (DA) aims to transfer the knowledge learned from a source domain to an unlabeled target domain. Some recent works tackle source-free domain adaptation (SFDA) where only a source pre-trained model is available for adaptation to the target domain. However, those methods do not consider keeping source performance which is of high practical value in real world applications. In this p…

    Submitted 8 October, 2021; v1 submitted 3 August, 2021; originally announced August 2021.

    Comments: Accepted by ICCV 2021; Update the acknowledgement section

  22. arXiv:2105.09356  [pdf, other]

    cs.LG cs.CV

    Generative Adversarial Neural Architecture Search

    Authors: Seyed Saeed Changiz Rezaei, Fred X. Han, Di Niu, Mohammad Salameh, Keith Mills, Shuo Lian, Wei Lu, Shangling Jui

    Abstract: Despite the empirical success of neural architecture search (NAS) in deep learning applications, the optimality, reproducibility and cost of NAS schemes remain hard to assess. In this paper, we propose Generative Adversarial NAS (GA-NAS) with theoretically provable convergence guarantees, promoting stability and reproducibility in neural architecture search. Inspired by importance sampling, GA-NAS…

    Submitted 23 June, 2021; v1 submitted 19 May, 2021; originally announced May 2021.

    Comments: 17 pages, 9 figures, 13 Tables

  23. arXiv:2104.13742  [pdf, other]

    cs.CV

    MineGAN++: Mining Generative Models for Efficient Knowledge Transfer to Limited Data Domains

    Authors: Yaxing Wang, Abel Gonzalez-Garcia, Chenshen Wu, Luis Herranz, Fahad Shahbaz Khan, Shangling Jui, Joost van de Weijer

    Abstract: GANs largely increase the potential impact of generative models. Therefore, we propose a novel knowledge transfer method for generative models based on mining the knowledge that is most beneficial to a specific target domain, either from a single or multiple pretrained GANs. This is done using a miner network that identifies which part of the generative distribution of each pretrained GAN outputs…

    Submitted 4 December, 2023; v1 submitted 28 April, 2021; originally announced April 2021.

    Comments: accepted at IJCV. arXiv admin note: substantial text overlap with arXiv:1912.05270

  24. arXiv:2010.12427  [pdf, other]

    cs.CV

    Casting a BAIT for Offline and Online Source-free Domain Adaptation

    Authors: Shiqi Yang, Yaxing Wang, Joost van de Weijer, Luis Herranz, Shangling Jui

    Abstract: We address the source-free domain adaptation (SFDA) problem, where only the source model is available during adaptation to the target domain. We consider two settings: the offline setting where all target data can be visited multiple times (epochs) to arrive at a prediction for each target sample, and the online setting where the target data needs to be directly classified upon arrival. Inspired b…

    Submitted 10 June, 2023; v1 submitted 23 October, 2020; originally announced October 2020.

    Comments: Accepted by Computer Vision and Image Understanding

  25. Neural Architecture Search For Keyword Spotting

    Authors: Tong Mo, Yakun Yu, Mohammad Salameh, Di Niu, Shangling Jui

    Abstract: Deep neural networks have recently become a popular solution to keyword spotting systems, which enable the control of smart devices via voice. In this paper, we apply neural architecture search to search for convolutional neural network models that can help boost the performance of keyword spotting based on features extracted from acoustic signals while maintaining an acceptable memory footprint.…

    Submitted 2 September, 2020; v1 submitted 31 August, 2020; originally announced September 2020.

    Comments: will be presented in INTERSPEECH 2020

    Journal ref: Proc. Interspeech 2020, 1982-1986

  26. arXiv:2004.09199  [pdf, other]

    cs.CV cs.LG

    Generative Feature Replay For Class-Incremental Learning

    Authors: Xialei Liu, Chenshen Wu, Mikel Menta, Luis Herranz, Bogdan Raducanu, Andrew D. Bagdanov, Shangling Jui, Joost van de Weijer

    Abstract: Humans are capable of learning new tasks without forgetting previous ones, while neural networks fail due to catastrophic forgetting between new and previously-learned tasks. We consider a class-incremental setting which means that the task-ID is unknown at inference time. The imbalance between old and new classes typically results in a bias of the network towards the newest ones. This imbalance p…

    Submitted 20 April, 2020; originally announced April 2020.

    Comments: Accepted at CVPR2020: Workshop on Continual Learning in Computer Vision

  27. arXiv:2004.00440  [pdf, other]

    cs.CV cs.LG

    Semantic Drift Compensation for Class-Incremental Learning

    Authors: Lu Yu, Bartłomiej Twardowski, Xialei Liu, Luis Herranz, Kai Wang, Yongmei Cheng, Shangling Jui, Joost van de Weijer

    Abstract: Class-incremental learning of deep networks sequentially increases the number of classes to be classified. During training, the network has only access to data of one task at a time, where each task contains several classes. In this setting, networks suffer from catastrophic forgetting which refers to the drastic drop in performance on previous tasks. The vast majority of methods have studied this…

    Submitted 1 April, 2020; originally announced April 2020.

    Comments: Accepted at CVPR2020, Code available at https://github.com/yulu0724/SDC-IL

  28. arXiv:1910.01523  [pdf, other]

    cs.LG cs.NE

    ReNAS: Relativistic Evaluation of Neural Architecture Search

    Authors: Yixing Xu, Yunhe Wang, Kai Han, Yehui Tang, Shangling Jui, Chunjing Xu, Chang Xu

    Abstract: An effective and efficient architecture performance evaluation scheme is essential for the success of Neural Architecture Search (NAS). To save computational cost, most of existing NAS algorithms often train and evaluate intermediate neural architectures on a small proxy dataset with limited training epochs. But it is difficult to expect an accurate performance estimation of an architecture in suc…

    Submitted 22 September, 2021; v1 submitted 29 September, 2019; originally announced October 2019.

    Comments: CVPR 2021, Oral

  29. arXiv:1904.00775  [pdf, other]

    cs.CV cs.LG stat.ML

    Deep Demosaicing for Edge Implementation

    Authors: Ramchalam Kinattinkara Ramakrishnan, Shangling Jui, Vahid Patrovi Nia

    Abstract: Most digital cameras use sensors coated with a Color Filter Array (CFA) to capture channel components at every pixel location, resulting in a mosaic image that does not contain pixel values in all channels. Current research on reconstructing these missing channels, also known as demosaicing, introduces many artifacts, such as zipper effect and false color. Many deep learning demosaicing techniques…

    Submitted 23 May, 2019; v1 submitted 26 March, 2019; originally announced April 2019.

    Comments: Accepted in the 16th International Conference of Image Analysis and Recognition (ICIAR 2019)

  30. arXiv:1903.08606  [pdf, other]

    cs.AI

    Single-step Options for Adversary Driving

    Authors: Nazmus Sakib, Hengshuai Yao, Hong Zhang, Shangling Jui

    Abstract: In this paper, we use reinforcement learning for safety driving in adversary settings. In our work, the knowledge in state-of-art planning methods is reused by single-step options whose action suggestions are compared in parallel with primitive actions. We show two advantages by doing so. First, training this reinforcement learning agent is easier and faster than training the primitive-action agen…

    Submitted 28 November, 2019; v1 submitted 20 March, 2019; originally announced March 2019.