Showing 1–50 of 59 results for author: Zhan, Z

Searching in archive cs.
  1. arXiv:2406.16087  [pdf, other]

    cs.RO cs.AI cs.CV cs.LG

    Imperative Learning: A Self-supervised Neural-Symbolic Learning Framework for Robot Autonomy

    Authors: Chen Wang, Kaiyi Ji, Junyi Geng, Zhongqiang Ren, Taimeng Fu, Fan Yang, Yifan Guo, Haonan He, Xiangyu Chen, Zitong Zhan, Qiwei Du, Shaoshu Su, Bowen Li, Yuheng Qiu, Yi Du, Qihang Li, Yifan Yang, Xiao Lin, Zhipeng Zhao

    Abstract: Data-driven methods such as reinforcement and imitation learning have achieved remarkable success in robot autonomy. However, their data-centric nature still hinders them from generalizing well to ever-changing environments. Moreover, collecting large datasets for robotic tasks is often impractical and expensive. To overcome these challenges, we introduce a new self-supervised neural-symbolic (NeS…

    Submitted 23 June, 2024; originally announced June 2024.

  2. arXiv:2406.14359  [pdf, other]

    cs.NE

    Learning to Transfer for Evolutionary Multitasking

    Authors: Sheng-Hao Wu, Yuxiao Huang, Xingyu Wu, Liang Feng, Zhi-Hui Zhan, Kay Chen Tan

    Abstract: Evolutionary multitasking (EMT) is an emerging approach for solving multitask optimization problems (MTOPs) and has garnered considerable research interest. The implicit EMT is a significant research branch that utilizes evolution operators to enable knowledge transfer (KT) between tasks. However, current approaches in implicit EMT face challenges in adaptability, due to the use of a limited numbe…

    Submitted 22 June, 2024; v1 submitted 20 June, 2024; originally announced June 2024.

    Comments: Under review

  3. arXiv:2405.10620  [pdf, other]

    cs.AI cs.CL cs.CV

    MC-GPT: Empowering Vision-and-Language Navigation with Memory Map and Reasoning Chains

    Authors: Zhaohuan Zhan, Lisha Yu, Sijie Yu, Guang Tan

    Abstract: In the Vision-and-Language Navigation (VLN) task, the agent is required to navigate to a destination following a natural language instruction. While learning-based approaches have been a major solution to the task, they suffer from high training costs and lack of interpretability. Recently, Large Language Models (LLMs) have emerged as a promising tool for VLN due to their strong generalization cap…

    Submitted 17 May, 2024; originally announced May 2024.

  4. arXiv:2405.08151  [pdf, other]

    cs.CL

    Benchmarking Retrieval-Augmented Large Language Models in Biomedical NLP: Application, Robustness, and Self-Awareness

    Authors: Mingchen Li, Zaifu Zhan, Han Yang, Yongkang Xiao, Jiatan Huang, Rui Zhang

    Abstract: Large language models (LLM) have demonstrated remarkable capabilities in various biomedical natural language processing (NLP) tasks, leveraging the demonstration within the input context to adapt to new tasks. However, LLM is sensitive to the selection of demonstrations. To address the hallucination issue inherent in LLM, retrieval-augmented LLM (RAL) offers a solution by retrieving pertinent info…

    Submitted 16 May, 2024; v1 submitted 13 May, 2024; originally announced May 2024.

  5. arXiv:2405.07530  [pdf, other]

    cs.SE

    Prompt-based Code Completion via Multi-Retrieval Augmented Generation

    Authors: Hanzhuo Tan, Qi Luo, Ling Jiang, Zizheng Zhan, Jing Li, Haotian Zhang, Yuqun Zhang

    Abstract: Automated code completion, aiming at generating subsequent tokens from unfinished code, has been significantly benefited from recent progress in pre-trained Large Language Models (LLMs). However, these models often suffer from coherence issues and hallucinations when dealing with complex code logic or extrapolating beyond their training data. Existing Retrieval Augmented Generation (RAG) technique…

    Submitted 13 May, 2024; originally announced May 2024.

  6. arXiv:2403.05016  [pdf, other]

    cs.CV

    DiffClass: Diffusion-Based Class Incremental Learning

    Authors: Zichong Meng, Jie Zhang, Changdi Yang, Zheng Zhan, Pu Zhao, Yanzhi Wang

    Abstract: Class Incremental Learning (CIL) is challenging due to catastrophic forgetting. On top of that, Exemplar-free Class Incremental Learning is even more challenging due to forbidden access to previous task data. Recent exemplar-free CIL methods attempt to mitigate catastrophic forgetting by synthesizing previous task data. However, they fail to overcome the catastrophic forgetting due to the inabilit…

    Submitted 7 March, 2024; originally announced March 2024.

    Comments: Preprint

  7. arXiv:2402.11423  [pdf, other]

    cs.CR eess.SP

    VoltSchemer: Use Voltage Noise to Manipulate Your Wireless Charger

    Authors: Zihao Zhan, Yirui Yang, Haoqi Shan, Hanqiu Wang, Yier Jin, Shuo Wang

    Abstract: Wireless charging is becoming an increasingly popular charging solution in portable electronic products for a more convenient and safer charging experience than conventional wired charging. However, our research identified new vulnerabilities in wireless charging systems, making them susceptible to intentional electromagnetic interference. These vulnerabilities facilitate a set of novel attack vec…

    Submitted 17 February, 2024; originally announced February 2024.

    Comments: This paper has been accepted by the 33rd USENIX Security Symposium

  8. Invisible Finger: Practical Electromagnetic Interference Attack on Touchscreen-based Electronic Devices

    Authors: Haoqi Shan, Boyi Zhang, Zihao Zhan, Dean Sullivan, Shuo Wang, Yier Jin

    Abstract: Touchscreen-based electronic devices such as smart phones and smart tablets are widely used in our daily life. While the security of electronic devices have been heavily investigated recently, the resilience of touchscreens against various attacks has yet to be thoroughly investigated. In this paper, for the first time, we show that touchscreen-based electronic devices are vulnerable to intentiona…

    Submitted 3 February, 2024; originally announced February 2024.

    Comments: This paper has been accepted by 2022 IEEE Symposium on Security and Privacy (SP) and won distinguished paper award

  9. arXiv:2401.12193  [pdf, other]

    cs.CR eess.SP

    Programmable EM Sensor Array for Golden-Model Free Run-time Trojan Detection and Localization

    Authors: Hanqiu Wang, Max Panoff, Zihao Zhan, Shuo Wang, Christophe Bobda, Domenic Forte

    Abstract: Side-channel analysis has been proven effective at detecting hardware Trojans in integrated circuits (ICs). However, most detection techniques rely on large external probes and antennas for data collection and require a long measurement time to detect Trojans. Such limitations make these techniques impractical for run-time deployment and ineffective in detecting small Trojans with subtle side-chan…

    Submitted 22 January, 2024; originally announced January 2024.

    Comments: 6 pages, 5 figures, Accepted at DATE2024

  10. arXiv:2401.06127  [pdf, other]

    cs.CV cs.AI cs.LG

    E$^{2}$GAN: Efficient Training of Efficient GANs for Image-to-Image Translation

    Authors: Yifan Gong, Zheng Zhan, Qing Jin, Yanyu Li, Yerlan Idelbayev, Xian Liu, Andrey Zharkov, Kfir Aberman, Sergey Tulyakov, Yanzhi Wang, Jian Ren

    Abstract: One highly promising direction for enabling flexible real-time on-device image editing is utilizing data distillation by leveraging large-scale text-to-image diffusion models to generate paired datasets used for training generative adversarial networks (GANs). This approach notably alleviates the stringent requirements typically imposed by high-end commercial GPUs for performing image editing with…

    Submitted 2 June, 2024; v1 submitted 11 January, 2024; originally announced January 2024.

    Comments: ICML 2024. Project Page: https://yifanfanfanfan.github.io/e2gan/

  11. arXiv:2312.02141  [pdf, other]

    cs.CV

    iMatching: Imperative Correspondence Learning

    Authors: Zitong Zhan, Dasong Gao, Yun-Jou Lin, Youjie Xia, Chen Wang

    Abstract: Learning feature correspondence is a foundational task in computer vision, holding immense importance for downstream applications such as visual odometry and 3D reconstruction. Despite recent progress in data-driven models, feature correspondence learning is still limited by the lack of accurate per-pixel correspondence labels. To overcome this difficulty, we introduce a new self-supervised scheme…

    Submitted 4 December, 2023; originally announced December 2023.

  12. arXiv:2310.03749  [pdf]

    eess.SP cs.AI cs.LG

    SCVCNet: Sliding cross-vector convolution network for cross-task and inter-individual-set EEG-based cognitive workload recognition

    Authors: Qi Wang, Li Chen, Zhiyuan Zhan, Jianhua Zhang, Zhong Yin

    Abstract: This paper presents a generic approach for applying the cognitive workload recognizer by exploiting common electroencephalogram (EEG) patterns across different human-machine tasks and individual sets. We propose a neural network called SCVCNet, which eliminates task- and individual-set-related interferences in EEGs by analyzing finer-grained frequency structures in the power spectral densities. Th…

    Submitted 21 September, 2023; originally announced October 2023.

    Comments: 12 pages

  13. arXiv:2309.13035  [pdf, other]

    cs.RO

    PyPose v0.6: The Imperative Programming Interface for Robotics

    Authors: Zitong Zhan, Xiangfu Li, Qihang Li, Haonan He, Abhinav Pandey, Haitao Xiao, Yangmengfei Xu, Xiangyu Chen, Kuan Xu, Kun Cao, Zhipeng Zhao, Zihan Wang, Huan Xu, Zihang Fang, Yutian Chen, Wentao Wang, Xu Fang, Yi Du, Tianhao Wu, Xiao Lin, Yuheng Qiu, Fan Yang, Jingnan Shi, Shaoshu Su, Yiren Lu, et al. (11 additional authors not shown)

    Abstract: PyPose is an open-source library for robot learning. It combines a learning-based approach with physics-based optimization, which enables seamless end-to-end robot learning. It has been used in many tasks due to its meticulously designed application programming interface (API) and efficient implementation. From its initial launch in early 2022, PyPose has experienced significant enhancements, inco…

    Submitted 22 September, 2023; originally announced September 2023.

  14. arXiv:2309.11883  [pdf]

    cs.CV cs.RO

    On-the-Fly SfM: What you capture is What you get

    Authors: Zongqian Zhan, Rui Xia, Yifei Yu, Yibo Xu, Xin Wang

    Abstract: Over the last decades, ample achievements have been made on Structure from motion (SfM). However, the vast majority of them basically work in an offline manner, i.e., images are firstly captured and then fed together into a SfM pipeline for obtaining poses and sparse point cloud. In this work, on the contrary, we present an on-the-fly SfM: running online SfM while image capturing, the newly taken…

    Submitted 13 February, 2024; v1 submitted 21 September, 2023; originally announced September 2023.

  15. arXiv:2308.10619  [pdf, other]

    cs.LG

    centroIDA: Cross-Domain Class Discrepancy Minimization Based on Accumulative Class-Centroids for Imbalanced Domain Adaptation

    Authors: Xiaona Sun, Zhenyu Wu, Yichen Liu, Saier Hu, Zhiqiang Zhan, Yang Ji

    Abstract: Unsupervised Domain Adaptation (UDA) approaches address the covariate shift problem by minimizing the distribution discrepancy between the source and target domains, assuming that the label distribution is invariant across domains. However, in the imbalanced domain adaptation (IDA) scenario, covariate and long-tailed label shifts both exist across domains. To tackle the IDA problem, some current r…

    Submitted 21 August, 2023; originally announced August 2023.

  16. arXiv:2308.08366  [pdf, other]

    cs.LG

    Dual-Branch Temperature Scaling Calibration for Long-Tailed Recognition

    Authors: Jialin Guo, Zhenyu Wu, Zhiqiang Zhan, Yang Ji

    Abstract: The calibration for deep neural networks is currently receiving widespread attention and research. Miscalibration usually leads to overconfidence of the model. While, under the condition of long-tailed distribution of data, the problem of miscalibration is more prominent due to the different confidence levels of samples in minority and majority categories, and it will result in more serious overco…

    Submitted 16 August, 2023; originally announced August 2023.

  17. arXiv:2306.04366  [pdf, other]

    cs.SI cs.AI cs.HC cs.LG

    Enhancing Worker Recruitment in Collaborative Mobile Crowdsourcing: A Graph Neural Network Trust Evaluation Approach

    Authors: Zhongwei Zhan, Yingjie Wang, Peiyong Duan, Akshita Maradapu Vera Venkata Sai, Zhaowei Liu, Chaocan Xiang, Xiangrong Tong, Weilong Wang, Zhipeng Cai

    Abstract: Collaborative Mobile Crowdsourcing (CMCS) allows platforms to recruit worker teams to collaboratively execute complex sensing tasks. The efficiency of such collaborations could be influenced by trust relationships among workers. To obtain the asymmetric trust values among all workers in the social network, the Trust Reinforcement Evaluation Framework (TREF) based on Graph Convolutional Neural Netw…

    Submitted 21 March, 2024; v1 submitted 7 June, 2023; originally announced June 2023.

    Comments: The article has been accepted by IEEE TMC, and its DOI is 10.1109/TMC.2024.3373469

  18. arXiv:2305.00380  [pdf, other]

    cs.LG

    DualHSIC: HSIC-Bottleneck and Alignment for Continual Learning

    Authors: Zifeng Wang, Zheng Zhan, Yifan Gong, Yucai Shao, Stratis Ioannidis, Yanzhi Wang, Jennifer Dy

    Abstract: Rehearsal-based approaches are a mainstay of continual learning (CL). They mitigate the catastrophic forgetting problem by maintaining a small fixed-size buffer with a subset of data from past tasks. While most rehearsal-based approaches study how to effectively exploit the knowledge from the buffered past data, little attention is paid to the inter-task relationships with the critical task-specif…

    Submitted 30 April, 2023; originally announced May 2023.

    Comments: Accepted at ICML 2023 as a conference paper

  19. arXiv:2304.12825  [pdf, other]

    q-bio.BM cs.AI cs.LG

    GraphVF: Controllable Protein-Specific 3D Molecule Generation with Variational Flow

    Authors: Fang Sun, Zhihao Zhan, Hongyu Guo, Ming Zhang, Jian Tang

    Abstract: Designing molecules that bind to specific target proteins is a fundamental task in drug discovery. Recent models leverage geometric constraints to generate ligand molecules that bind cohesively with specific protein pockets. However, these models cannot effectively generate 3D molecules with 2D skeletal curtailments and property constraints, which are pivotal to drug potency and development. To ta…

    Submitted 23 February, 2023; originally announced April 2023.

    Comments: 15 pages, 8 figures

  20. arXiv:2304.12779  [pdf, ps, other]

    cs.DS

    An Approximation Algorithm for Covering Vertices by 4^+-Paths

    Authors: Mingyang Gong, Zhi-Zhong Chen, Guohui Lin, Zhaohui Zhan

    Abstract: This paper deals with the problem of finding a collection of vertex-disjoint paths in a given graph G=(V,E) such that each path has at least four vertices and the total number of vertices in these paths is maximized. The problem is NP-hard and admits an approximation algorithm which achieves a ratio of 2 and runs in O(|V|^8) time. The known algorithm is based on time-consuming local search, and it…

    Submitted 25 April, 2023; originally announced April 2023.

  21. arXiv:2303.03800  [pdf, other]

    cs.CV

    Lformer: Text-to-Image Generation with L-shape Block Parallel Decoding

    Authors: Jiacheng Li, Longhui Wei, ZongYuan Zhan, Xin He, Siliang Tang, Qi Tian, Yueting Zhuang

    Abstract: Generative transformers have shown their superiority in synthesizing high-fidelity and high-resolution images, such as good diversity and training stability. However, they suffer from the problem of slow generation since they need to generate a long token sequence autoregressively. To better accelerate the generative transformers while keeping good generation quality, we propose Lformer, a semi-au…

    Submitted 7 March, 2023; originally announced March 2023.

  22. arXiv:2302.03839  [pdf, other]

    eess.IV cs.CV cs.LG

    Futuristic Variations and Analysis in Fundus Images Corresponding to Biological Traits

    Authors: Muhammad Hassan, Hao Zhang, Ahmed Fateh Ameen, Home Wu Zeng, Shuye Ma, Wen Liang, Dingqi Shang, Jiaming Ding, Ziheng Zhan, Tsz Kwan Lam, Ming Xu, Qiming Huang, Dongmei Wu, Can Yang Zhang, Zhou You, Awiwu Ain, Pei Wu Qin

    Abstract: Fundus image captures rear of an eye, and which has been studied for the diseases identification, classification, segmentation, generation, and biological traits association using handcrafted, conventional, and deep learning methods. In biological traits estimation, most of the studies have been carried out for the age prediction and gender classification with convincing results. However, the curr…

    Submitted 7 February, 2023; originally announced February 2023.

    Comments: 10 pages, 4 figures, 3 tables

  23. arXiv:2212.05122  [pdf, other]

    cs.LG cs.AI cs.CV

    All-in-One: A Highly Representative DNN Pruning Framework for Edge Devices with Dynamic Power Management

    Authors: Yifan Gong, Zheng Zhan, Pu Zhao, Yushu Wu, Chao Wu, Caiwen Ding, Weiwen Jiang, Minghai Qin, Yanzhi Wang

    Abstract: During the deployment of deep neural networks (DNNs) on edge devices, many research efforts are devoted to the limited hardware resource. However, little attention is paid to the influence of dynamic power management. As edge devices typically only have a budget of energy with batteries (rather than almost unlimited energy support on servers or workstations), their dynamic power management often c…

    Submitted 9 December, 2022; originally announced December 2022.

  24. arXiv:2211.09108  [pdf, other]

    cs.CV

    Robust Online Video Instance Segmentation with Track Queries

    Authors: Zitong Zhan, Daniel McKee, Svetlana Lazebnik

    Abstract: Recently, transformer-based methods have achieved impressive results on Video Instance Segmentation (VIS). However, most of these top-performing methods run in an offline manner by processing the entire video clip at once to predict instance mask volumes. This makes them incapable of handling the long videos that appear in challenging new video instance segmentation datasets like UVO and OVIS. We…

    Submitted 16 November, 2022; originally announced November 2022.

  25. arXiv:2209.09476  [pdf, other]

    cs.LG cs.AI cs.CV

    SparCL: Sparse Continual Learning on the Edge

    Authors: Zifeng Wang, Zheng Zhan, Yifan Gong, Geng Yuan, Wei Niu, Tong Jian, Bin Ren, Stratis Ioannidis, Yanzhi Wang, Jennifer Dy

    Abstract: Existing work in continual learning (CL) focuses on mitigating catastrophic forgetting, i.e., model performance deterioration on past tasks when learning a new task. However, the training efficiency of a CL system is under-investigated, which limits the real-world application of CL systems under resource-limited scenarios. In this work, we propose a novel framework called Sparse Continual Learning…

    Submitted 20 September, 2022; originally announced September 2022.

    Comments: Published at NeurIPS 2022 as a conference paper

  26. arXiv:2207.12577  [pdf, other]

    cs.CV cs.AR cs.LG eess.IV

    Compiler-Aware Neural Architecture Search for On-Mobile Real-time Super-Resolution

    Authors: Yushu Wu, Yifan Gong, Pu Zhao, Yanyu Li, Zheng Zhan, Wei Niu, Hao Tang, Minghai Qin, Bin Ren, Yanzhi Wang

    Abstract: Deep learning-based super-resolution (SR) has gained tremendous popularity in recent years because of its high image quality performance and wide application scenarios. However, prior methods typically suffer from large amounts of computations and huge power consumption, causing difficulties for real-time inference, especially on resource-limited platforms such as mobile devices. To mitigate this,…

    Submitted 25 July, 2022; originally announced July 2022.

  27. Slim-neck by GSConv: A better design paradigm of detector architectures for autonomous vehicles

    Authors: Hulin Li, Jun Li, Hanbing Wei, Zheng Liu, Zhenfei Zhan, Qiliang Ren

    Abstract: Object detection is a significant downstream task in computer vision. For the on-board edge computing platforms, a giant model is difficult to achieve the real-time detection requirement. And, a lightweight model built from a large number of the depth-wise separable convolution layers cannot achieve the sufficient accuracy. We introduce a new lightweight convolution technique, GSConv, to lighten t…

    Submitted 17 August, 2022; v1 submitted 6 June, 2022; originally announced June 2022.

    Comments: 18 pages, 12 figures

  28. arXiv:2205.15045  [pdf]

    cs.ET physics.optics

    Intelligent optoelectronic processor for orbital angular momentum spectrum measurement

    Authors: Hao Wang, Ziyu Zhan, Futai Hu, Yuan Meng, Zeqi Liu, Xing Fu, Qiang Liu

    Abstract: Orbital angular momentum (OAM) detection underpins almost all aspects of vortex beams' advances such as communication and quantum analogy. Conventional schemes are frustrated by low speed, complicated system, limited detection range. Here, we devise an intelligent processor composed of photonic and electronic neurons for OAM spectrum measurement in a fast, accurate and direct manner. Specifically,…

    Submitted 5 September, 2022; v1 submitted 30 May, 2022; originally announced May 2022.

  29. arXiv:2203.12802  [pdf]

    cs.AI

    A platform for causal knowledge representation and inference in industrial fault diagnosis based on cubic DUCG

    Authors: Bu XuSong, Nie Hao, Zhang Zhan, Zhang Qin

    Abstract: The working conditions of large-scale industrial systems are very complex. Once a failure occurs, it will affect industrial production, cause property damage, and even endanger the workers' lives. Therefore, it is important to control the operation of the system to accurately grasp the operation status of the system and find out the failure in time. The occurrence of system failure is a gradual pr…

    Submitted 27 March, 2022; v1 submitted 23 March, 2022; originally announced March 2022.

  30. arXiv:2203.05553  [pdf, other]

    cs.CV

    Transfer of Representations to Video Label Propagation: Implementation Factors Matter

    Authors: Daniel McKee, Zitong Zhan, Bing Shuai, Davide Modolo, Joseph Tighe, Svetlana Lazebnik

    Abstract: This work studies feature representations for dense label propagation in video, with a focus on recently proposed methods that learn video correspondence using self-supervised signals such as colorization or temporal cycle consistency. In the literature, these methods have been evaluated with an array of inconsistent settings, making it difficult to discern trends or compare performance fairly. St…

    Submitted 10 March, 2022; originally announced March 2022.

  31. arXiv:2111.11581  [pdf, other]

    cs.LG cs.AI cs.CV cs.DC

    Automatic Mapping of the Best-Suited DNN Pruning Schemes for Real-Time Mobile Acceleration

    Authors: Yifan Gong, Geng Yuan, Zheng Zhan, Wei Niu, Zhengang Li, Pu Zhao, Yuxuan Cai, Sijia Liu, Bin Ren, Xue Lin, Xulong Tang, Yanzhi Wang

    Abstract: Weight pruning is an effective model compression technique to tackle the challenges of achieving real-time deep neural network (DNN) inference on mobile devices. However, prior pruning schemes have limited application scenarios due to accuracy degradation, difficulty in leveraging hardware acceleration, and/or restriction on certain types of DNN layers. In this paper, we propose a general, fine-gr…

    Submitted 22 November, 2021; originally announced November 2021.

  32. arXiv:2110.14032  [pdf, other]

    cs.LG cs.AI cs.CV cs.NE

    MEST: Accurate and Fast Memory-Economic Sparse Training Framework on the Edge

    Authors: Geng Yuan, Xiaolong Ma, Wei Niu, Zhengang Li, Zhenglun Kong, Ning Liu, Yifan Gong, Zheng Zhan, Chaoyang He, Qing Jin, Siyue Wang, Minghai Qin, Bin Ren, Yanzhi Wang, Sijia Liu, Xue Lin

    Abstract: Recently, a new trend of exploring sparsity for accelerating neural network training has emerged, embracing the paradigm of training on the edge. This paper proposes a novel Memory-Economic Sparse Training (MEST) framework targeting for accurate and fast execution on edge devices. The proposed MEST framework consists of enhancements by Elastic Mutation (EM) and Soft Memory Bound (&S) that ensure s…

    Submitted 26 October, 2021; originally announced October 2021.

    Comments: NeurIPS 2021 Spotlight Paper

  33. arXiv:2108.08910  [pdf, other]

    eess.IV cs.AI cs.CV cs.LG cs.NE

    Achieving on-Mobile Real-Time Super-Resolution with Neural Architecture and Pruning Search

    Authors: Zheng Zhan, Yifan Gong, Pu Zhao, Geng Yuan, Wei Niu, Yushu Wu, Tianyun Zhang, Malith Jayaweera, David Kaeli, Bin Ren, Xue Lin, Yanzhi Wang

    Abstract: Though recent years have witnessed remarkable progress in single image super-resolution (SISR) tasks with the prosperous development of deep neural networks (DNNs), the deep learning methods are confronted with the computation and memory consumption issues in practice, especially for resource-limited platforms such as mobile devices. To overcome the challenge and facilitate the real-time deploymen…

    Submitted 14 February, 2023; v1 submitted 18 August, 2021; originally announced August 2021.

  34. arXiv:2012.15531  [pdf, other]

    eess.IV cs.CV

    Colonoscopy Polyp Detection: Domain Adaptation From Medical Report Images to Real-time Videos

    Authors: Zhi-Qin Zhan, Huazhu Fu, Yan-Yao Yang, Jingjing Chen, Jie Liu, Yu-Gang Jiang

    Abstract: Automatic colorectal polyp detection in colonoscopy video is a fundamental task, which has received a lot of attention. Manually annotating polyp region in a large scale video dataset is time-consuming and expensive, which limits the development of deep learning techniques. A compromise is to train the target model by using labeled images and infer on colonoscopy videos. However, there are several…

    Submitted 31 December, 2020; originally announced December 2020.

  35. arXiv:2012.00596  [pdf, other]

    cs.LG cs.AI cs.CV cs.NE

    NPAS: A Compiler-aware Framework of Unified Network Pruning and Architecture Search for Beyond Real-Time Mobile Acceleration

    Authors: Zhengang Li, Geng Yuan, Wei Niu, Pu Zhao, Yanyu Li, Yuxuan Cai, Xuan Shen, Zheng Zhan, Zhenglun Kong, Qing Jin, Zhiyu Chen, Sijia Liu, Kaiyuan Yang, Bin Ren, Yanzhi Wang, Xue Lin

    Abstract: With the increasing demand to efficiently deploy DNNs on mobile edge devices, it becomes much more important to reduce unnecessary computation and increase the execution speed. Prior methods towards this goal, including model compression and network architecture search (NAS), are largely performed independently and do not fully consider compiler-level optimizations which is a must-do for mobile ac…

    Submitted 16 June, 2021; v1 submitted 1 December, 2020; originally announced December 2020.

    Comments: Accepted as an oral paper in the Conference on Computer Vision and Pattern Recognition (CVPR), 2021

  36. arXiv:2008.01928  [pdf, other]

    cs.CV

    Component Divide-and-Conquer for Real-World Image Super-Resolution

    Authors: Pengxu Wei, Ziwei Xie, Hannan Lu, Zongyuan Zhan, Qixiang Ye, Wangmeng Zuo, Liang Lin

    Abstract: In this paper, we present a large-scale Diverse Real-world image Super-Resolution dataset, i.e., DRealSR, as well as a divide-and-conquer Super-Resolution (SR) network, exploring the utility of guiding SR model with low-level image components. DRealSR establishes a new SR benchmark with diverse real-world degradation processes, mitigating the limitations of conventional simulated image degradation…

    Submitted 5 August, 2020; originally announced August 2020.

    Journal ref: European Conference on Computer Vision (ECCV), 2020

  37. arXiv:2007.14592  [pdf]

    cs.CV

    A SLAM Map Restoration Algorithm Based on Submaps and an Undirected Connected Graph

    Authors: Zongqian Zhan, Wenjie Jian, Yihui Li, Xin Wang, Yang Yue

    Abstract: Many visual simultaneous localization and mapping (SLAM) systems have been shown to be accurate and robust, and have real-time performance capabilities on both indoor and ground datasets. However, these methods can be problematic when dealing with aerial frames captured by a camera mounted on an unmanned aerial vehicle (UAV) because the flight height of the UAV can be difficult to control and is e…

    Submitted 29 July, 2020; originally announced July 2020.

  38. arXiv:2007.03860  [pdf]

    cs.CL

    Research on multi-dimensional end-to-end phrase recognition algorithm based on background knowledge

    Authors: Zheng Li, Gang Tu, Guang Liu, Zhi-Qiang Zhan, Yi-Jian Liu

    Abstract: At present, the deep end-to-end method based on supervised learning is used in entity recognition and dependency analysis. There are two problems in this method: firstly, background knowledge cannot be introduced; secondly, multi granularity and nested features of natural language cannot be recognized. In order to solve these problems, the annotation rules based on phrase window are proposed, and…

    Submitted 7 July, 2020; originally announced July 2020.

    Comments: in Chinese language

  39. arXiv:2005.01278  [pdf, other]

    cs.CL cs.AI cs.LG

    A New Data Normalization Method to Improve Dialogue Generation by Minimizing Long Tail Effect

    Authors: Zhiqiang Zhan, Zifeng Hou, Yang Zhang

    Abstract: Recent neural models have shown significant progress in dialogue generation. Most generation models are based on language models. However, due to the Long Tail Phenomenon in linguistics, the trained models tend to generate words that appear frequently in training datasets, leading to a monotonous issue. To address this issue, we analyze a large corpus from Wikipedia and propose three frequency-bas…

    Submitted 4 May, 2020; originally announced May 2020.

  40. arXiv:2004.11250  [pdf, other]

    cs.LG cs.CV cs.MM

    Towards Real-Time DNN Inference on Mobile Platforms with Model Pruning and Compiler Optimization

    Authors: Wei Niu, Pu Zhao, Zheng Zhan, Xue Lin, Yanzhi Wang, Bin Ren

    Abstract: High-end mobile platforms rapidly serve as primary computing devices for a wide range of Deep Neural Network (DNN) applications. However, the constrained computation and storage resources on these devices still pose significant challenges for real-time DNN inference executions. To address this problem, we propose a set of hardware-friendly structured model pruning and compiler optimization techniq…

    Submitted 21 April, 2020; originally announced April 2020.

    Comments: accepted by the IJCAI-PRICAI 2020 Demonstrations Track

  41. arXiv:2004.07561  [pdf]

    cs.NE

    AMPSO: Artificial Multi-Swarm Particle Swarm Optimization

    Authors: Haohao Zhou, Zhi-Hui Zhan, Zhi-Xin Yang, Xiangzhi Wei

    Abstract: In this paper we propose a novel artificial multi-swarm PSO which consists of an exploration swarm, an artificial exploitation swarm and an artificial convergence swarm. The exploration swarm is a set of equal-sized sub-swarms randomly distributed around the particles space, the exploitation swarm is artificially generated from a perturbation of the best particle of exploration swarm for a fixed p…

    Submitted 21 June, 2020; v1 submitted 16 April, 2020; originally announced April 2020.

  42. arXiv:2004.05531  [pdf, other]

    cs.LG cs.CV cs.NE

    A Unified DNN Weight Compression Framework Using Reweighted Optimization Methods

    Authors: Tianyun Zhang, Xiaolong Ma, Zheng Zhan, Shanglin Zhou, Minghai Qin, Fei Sun, Yen-Kuang Chen, Caiwen Ding, Makan Fardad, Yanzhi Wang

    Abstract: To address the large model size and intensive computation requirement of deep neural networks (DNNs), weight pruning techniques have been proposed and generally fall into two categories, i.e., static regularization-based pruning and dynamic regularization-based pruning. However, the former method currently suffers either complex workloads or accuracy degradation, while the latter one takes a long…

    Submitted 11 April, 2020; originally announced April 2020.

  43. arXiv:2003.06745  [pdf, other]

    cs.CV cs.CL

    Vision-Dialog Navigation by Exploring Cross-modal Memory

    Authors: Yi Zhu, Fengda Zhu, Zhaohuan Zhan, Bingqian Lin, Jianbin Jiao, Xiaojun Chang, Xiaodan Liang

    Abstract: Vision-dialog navigation posed as a new holy-grail task in vision-language disciplinary targets at learning an agent endowed with the capability of constant conversation for help with natural language and navigating according to human responses. Besides the common challenges faced in visual language navigation, vision-dialog navigation also requires to handle well with the language intentions of a…

    Submitted 14 March, 2020; originally announced March 2020.

    Comments: CVPR2020

  44. arXiv:2003.06513  [pdf, other]

    cs.LG cs.AI cs.CV cs.NE stat.ML

    A Privacy-Preserving-Oriented DNN Pruning and Mobile Acceleration Framework

    Authors: Yifan Gong, Zheng Zhan, Zhengang Li, Wei Niu, Xiaolong Ma, Wenhao Wang, Bin Ren, Caiwen Ding, Xue Lin, Xiaolin Xu, Yanzhi Wang

    Abstract: Weight pruning of deep neural networks (DNNs) has been proposed to satisfy the limited storage and computing capability of mobile edge devices. However, previous pruning methods mainly focus on reducing the model size and/or improving performance without considering the privacy of user data. To mitigate this concern, we propose a privacy-preserving-oriented pruning and mobile acceleration framewor…

    Submitted 16 September, 2020; v1 submitted 13 March, 2020; originally announced March 2020.

  45. arXiv:2001.08839  [pdf, other]

    cs.LG cs.CV cs.NE

    SS-Auto: A Single-Shot, Automatic Structured Weight Pruning Framework of DNNs with Ultra-High Efficiency

    Authors: Zhengang Li, Yifan Gong, Xiaolong Ma, Sijia Liu, Mengshu Sun, Zheng Zhan, Zhenglun Kong, Geng Yuan, Yanzhi Wang

    Abstract: Structured weight pruning is a representative model compression technique of DNNs for hardware efficiency and inference accelerations. Previous works in this area leave great space for improvement since sparse structures with combinations of different structured pruning schemes are not exploited fully and efficiently. To mitigate the limitations, we propose SS-Auto, a single-shot, automatic struct…

    Submitted 23 January, 2020; originally announced January 2020.

  46. arXiv:2001.08357  [pdf, other]

    cs.LG cs.AI cs.CV cs.NE stat.ML

    BLK-REW: A Unified Block-based DNN Pruning Framework using Reweighted Regularization Method

    Authors: Xiaolong Ma, Zhengang Li, Yifan Gong, Tianyun Zhang, Wei Niu, Zheng Zhan, Pu Zhao, Jian Tang, Xue Lin, Bin Ren, Yanzhi Wang

    Abstract: Accelerating DNN execution on various resource-limited computing platforms has been a long-standing problem. Prior works utilize l1-based group lasso or dynamic regularization such as ADMM to perform structured pruning on DNN models to leverage the parallel computing architectures. However, both of the pruning dimensions and pruning methods lack universality, which leads to degraded performance an…

    Submitted 21 February, 2020; v1 submitted 22 January, 2020; originally announced January 2020.

  47. arXiv:1912.00215  [pdf, other]

    cs.LG cs.CV stat.ML

    Probing the State of the Art: A Critical Look at Visual Representation Evaluation

    Authors: Cinjon Resnick, Zeping Zhan, Joan Bruna

    Abstract: Self-supervised research improved greatly over the past half decade, with much of the growth being driven by objectives that are hard to quantitatively compare. These techniques include colorization, cyclical consistency, and noise-contrastive estimation from image patches. Consequently, the field has settled on a handful of measurements that depend on linear probes to adjudicate which approaches…

    Submitted 12 August, 2021; v1 submitted 30 November, 2019; originally announced December 2019.

  48. arXiv:1909.07726  [pdf]

    cs.CV

    Building Change Detection for Remote Sensing Images Using a Dual Task Constrained Deep Siamese Convolutional Network Model

    Authors: Yi Liu, Chao Pang, Zongqian Zhan, Xiaomeng Zhang, Xue Yang

    Abstract: In recent years, building change detection methods have made great progress by introducing deep learning, but they still suffer from the problem of the extracted features not being discriminative enough, resulting in incomplete regions and irregular boundaries. To tackle this problem, we propose a dual task constrained deep Siamese convolutional network (DTCDSCN) model, which contains three sub-ne…

    Submitted 17 September, 2019; originally announced September 2019.

  49. arXiv:1905.08205  [pdf, other]

    cs.CL

    Towards Complex Text-to-SQL in Cross-Domain Database with Intermediate Representation

    Authors: Jiaqi Guo, Zecheng Zhan, Yan Gao, Yan Xiao, Jian-Guang Lou, Ting Liu, Dongmei Zhang

    Abstract: We present a neural approach called IRNet for complex and cross-domain Text-to-SQL. IRNet aims to address two challenges: 1) the mismatch between intents expressed in natural language (NL) and the implementation details in SQL; 2) the challenge in predicting columns caused by the large number of out-of-domain words. Instead of end-to-end synthesizing a SQL query, IRNet decomposes the synthesis pro…

    Submitted 28 May, 2019; v1 submitted 20 May, 2019; originally announced May 2019.

    Comments: To appear in ACL 2019

  50. arXiv:1812.03125  [pdf, other]

    cs.HC cs.AI

    Taking the Scenic Route: Automatic Exploration for Videogames

    Authors: Zeping Zhan, Batu Aytemiz, Adam M. Smith

    Abstract: Machine playtesting tools and game moment search engines require exposure to the diversity of a game's state space if they are to report on or index the most interesting moments of possible play. Meanwhile, mobile app distribution services would like to quickly determine if a freshly-uploaded game is fit to be published. Having access to a semantic map of reachable states in the game would enable…

    Submitted 7 December, 2018; originally announced December 2018.