Showing 1–42 of 42 results for author: Pang, Z

  1. arXiv:2406.10100  [pdf, other]

    cs.CV cs.AI

    SkySenseGPT: A Fine-Grained Instruction Tuning Dataset and Model for Remote Sensing Vision-Language Understanding

    Authors: Junwei Luo, Zhen Pang, Yongjun Zhang, Tingzhu Wang, Linlin Wang, Bo Dang, Jiangwei Lao, Jian Wang, Jingdong Chen, Yihua Tan, Yansheng Li

    Abstract: Remote Sensing Large Multi-Modal Models (RSLMMs) are developing rapidly and showcase significant capabilities in remote sensing imagery (RSI) comprehension. However, due to the limitations of existing datasets, RSLMMs have shortcomings in understanding the rich semantic relations among objects in complex remote sensing scenes. To unlock RSLMMs' complex comprehension ability, we propose a large-sca…

    Submitted 14 June, 2024; originally announced June 2024.

    Comments: 30 pages, 5 figures, 19 tables, dataset and code see https://github.com/Luo-Z13/SkySenseGPT

  2. arXiv:2406.08476  [pdf, other]

    cs.CV cs.AI

    RMem: Restricted Memory Banks Improve Video Object Segmentation

    Authors: Junbao Zhou, Ziqi Pang, Yu-Xiong Wang

    Abstract: With recent video object segmentation (VOS) benchmarks evolving to challenging scenarios, we revisit a simple but overlooked strategy: restricting the size of memory banks. This diverges from the prevalent practice of expanding memory banks to accommodate extensive historical information. Our specially designed "memory deciphering" study offers a pivotal insight underpinning such a strategy: expan…

    Submitted 12 June, 2024; originally announced June 2024.

    Comments: CVPR 2024, Project Page: https://restricted-memory.github.io/

  3. arXiv:2403.09993  [pdf, other]

    cs.CV eess.IV

    TRG-Net: An Interpretable and Controllable Rain Generator

    Authors: Zhiqiang Pang, Hong Wang, Qi Xie, Deyu Meng, Zongben Xu

    Abstract: Exploring and modeling rain generation mechanism is critical for augmenting paired data to ease training of rainy image processing models. Against this task, this study proposes a novel deep learning based rain generator, which fully takes the physical generation mechanism underlying rains into consideration and well encodes the learning of the fundamental rain factors (i.e., shape, orientation, l…

    Submitted 29 April, 2024; v1 submitted 14 March, 2024; originally announced March 2024.

  4. arXiv:2402.17486  [pdf, other]

    cs.CV

    MGE: A Training-Free and Efficient Model Generation and Enhancement Scheme

    Authors: Xuan Wang, Zeshan Pang, Yuliang Lu, Xuehu Yan

    Abstract: To provide a foundation for the research of deep learning models, the construction of model pool is an essential step. This paper proposes a Training-Free and Efficient Model Generation and Enhancement Scheme (MGE). This scheme primarily considers two aspects during the model generation process: the distribution of model parameters and model performance. Experiments result shows that generated mod…

    Submitted 27 February, 2024; originally announced February 2024.

  5. arXiv:2402.12770  [pdf, other]

    cs.CL

    Acknowledgment of Emotional States: Generating Validating Responses for Empathetic Dialogue

    Authors: Zi Haur Pang, Yahui Fu, Divesh Lala, Keiko Ochi, Koji Inoue, Tatsuya Kawahara

    Abstract: In the realm of human-AI dialogue, the facilitation of empathetic responses is important. Validation is one of the key communication techniques in psychology, which entails recognizing, understanding, and acknowledging others' emotional states, thoughts, and actions. This study introduces the first framework designed to engender empathetic dialogue with validating responses. Our approach incorpora…

    Submitted 20 February, 2024; originally announced February 2024.

    Comments: This paper has been accepted for presentation at International Workshop on Spoken Dialogue Systems Technology 2024 (IWSDS 2024)

  6. arXiv:2311.03194  [pdf]

    cs.CV

    Few-shot Learning using Data Augmentation and Time-Frequency Transformation for Time Series Classification

    Authors: Hao Zhang, Zhendong Pang, Jiangpeng Wang, Teng Li

    Abstract: Deep neural networks (DNNs) that tackle the time series classification (TSC) task have provided a promising framework in signal processing. In real-world applications, as a data-driven model, DNNs are suffered from insufficient data. Few-shot learning has been studied to deal with this limitation. In this paper, we propose a novel few-shot learning framework through data augmentation, which involv…

    Submitted 6 November, 2023; originally announced November 2023.

  7. arXiv:2310.12973  [pdf, other]

    cs.CV cs.AI cs.CL cs.LG

    Frozen Transformers in Language Models Are Effective Visual Encoder Layers

    Authors: Ziqi Pang, Ziyang Xie, Yunze Man, Yu-Xiong Wang

    Abstract: This paper reveals that large language models (LLMs), despite being trained solely on textual data, are surprisingly strong encoders for purely visual tasks in the absence of language. Even more intriguingly, this can be achieved by a simple yet previously overlooked strategy -- employing a frozen transformer block from pre-trained LLMs as a constituent encoder layer to directly process visual tok…

    Submitted 6 May, 2024; v1 submitted 19 October, 2023; originally announced October 2023.

    Comments: ICLR 2024 Spotlight. 23 pages, 13 figures. Code at https://github.com/ziqipang/LM4VisualEncoding

  8. arXiv:2310.07405  [pdf, ps, other]

    cs.IT eess.SP

    IRS Assisted Federated Learning: A Broadband Over-the-Air Aggregation Approach

    Authors: Deyou Zhang, Ming Xiao, Zhibo Pang, Lihui Wang, H. Vincent Poor

    Abstract: We consider a broadband over-the-air computation empowered model aggregation approach for wireless federated learning (FL) systems and propose to leverage an intelligent reflecting surface (IRS) to combat wireless fading and noise. We first investigate the conventional node-selection based framework, where a few edge nodes are dropped in model aggregation to control the aggregation error. We analy…

    Submitted 11 October, 2023; originally announced October 2023.

    Comments: This paper has been accepted by IEEE Transactions on Wireless Communications

  9. arXiv:2310.01351  [pdf, other]

    cs.CV

    Streaming Motion Forecasting for Autonomous Driving

    Authors: Ziqi Pang, Deva Ramanan, Mengtian Li, Yu-Xiong Wang

    Abstract: Trajectory forecasting is a widely-studied problem for autonomous navigation. However, existing benchmarks evaluate forecasting based on independent snapshots of trajectories, which are not representative of real-world applications that operate on a continuous stream of data. To bridge this gap, we introduce a benchmark that continuously queries future trajectories on streaming data and we refer t…

    Submitted 2 October, 2023; originally announced October 2023.

    Comments: IROS 2023, 8 pages, 9 figures

  10. arXiv:2310.00033  [pdf]

    cs.RO physics.app-ph

    OriWheelBot: An origami-wheeled robot

    Authors: Jie Liu, Zufeng Pang, Zhiyong Li, Guilin Wen, Zhoucheng Su, Junfeng He, Kaiyue Liu, Dezheng Jiang, Zenan Li, Shouyan Chen, Yang Tian, Yi Min Xie, Zhenpei Wang, Zhuangjian Liu

    Abstract: Origami-inspired robots with multiple advantages, such as being lightweight, requiring less assembly, and exhibiting exceptional deformability, have received substantial and sustained attention. However, the existing origami-inspired robots are usually of limited functionalities and developing feature-rich robots is very challenging. Here, we report an origami-wheeled robot (OriWheelBot) with vari…

    Submitted 29 September, 2023; originally announced October 2023.

    Comments: 23 pages, 7 figures

  11. arXiv:2307.15984  [pdf, other]

    cs.MM

    VATP360: Viewport Adaptive 360-Degree Video Streaming based on Tile Priority

    Authors: Zhiyu Pang

    Abstract: 360-degree video becomes increasingly popular among users. In the current network bandwidth, serving high resolution 360 degree video to users is quite difficult. Most of the work has been devoted to the prediction of user viewports or tile-based adaptive algorithms. However, it is difficult to predict user viewports more accurately using only information such as user's historical viewports or vid…

    Submitted 27 August, 2023; v1 submitted 29 July, 2023; originally announced July 2023.

  12. arXiv:2306.11011  [pdf, other]

    cs.CR

    virtCCA: Virtualized Arm Confidential Compute Architecture with TrustZone

    Authors: Xiangyi Xu, Wenhao Wang, Yongzheng Wu, Chenyu Wang, Huifeng Zhu, Haocheng Ma, Zhennan Min, Zixuan Pang, Rui Hou, Yier Jin

    Abstract: ARM recently introduced the Confidential Compute Architecture (CCA) as part of the upcoming ARMv9-A architecture. CCA enables the support of confidential virtual machines (cVMs) within a separate world called the Realm world, providing protection from the untrusted normal world. While CCA offers a promising future for confidential computing, the widespread availability of CCA hardware is not expec…

    Submitted 17 February, 2024; v1 submitted 19 June, 2023; originally announced June 2023.

  13. arXiv:2305.08851  [pdf, other]

    cs.CV

    MV-Map: Offboard HD-Map Generation with Multi-view Consistency

    Authors: Ziyang Xie, Ziqi Pang, Yu-Xiong Wang

    Abstract: While bird's-eye-view (BEV) perception models can be useful for building high-definition maps (HD-Maps) with less human labor, their results are often unreliable and demonstrate noticeable inconsistencies in the predicted HD-Maps from different viewpoints. This is because BEV perception is typically set up in an 'onboard' manner, which restricts the computation and consequently prevents algorithms…

    Submitted 8 October, 2023; v1 submitted 15 May, 2023; originally announced May 2023.

    Comments: ICCV 2023

  14. arXiv:2302.03802  [pdf, other]

    cs.CV cs.AI cs.LG cs.RO

    Standing Between Past and Future: Spatio-Temporal Modeling for Multi-Camera 3D Multi-Object Tracking

    Authors: Ziqi Pang, Jie Li, Pavel Tokmakov, Dian Chen, Sergey Zagoruyko, Yu-Xiong Wang

    Abstract: This work proposes an end-to-end multi-camera 3D multi-object tracking (MOT) framework. It emphasizes spatio-temporal continuity and integrates both past and future reasoning for tracked objects. Thus, we name it "Past-and-Future reasoning for Tracking" (PF-Track). Specifically, our method adapts the "tracking by attention" framework and represents tracked instances coherently over time with objec…

    Submitted 3 April, 2023; v1 submitted 7 February, 2023; originally announced February 2023.

    Comments: CVPR 2023 Camera Ready, 15 pages, 8 figures

  15. arXiv:2212.00998  [pdf, other]

    cs.LG

    Credit Assignment for Trained Neural Networks Based on Koopman Operator Theory

    Authors: Zhen Liang, Changyuan Zhao, Wanwei Liu, Bai Xue, Wenjing Yang, Zhengbin Pang

    Abstract: Credit assignment problem of neural networks refers to evaluating the credit of each network component to the final outputs. For an untrained neural network, approaches to tackling it have made great contributions to parameter update and model revolution during the training phase. This problem on trained neural networks receives rare attention, nevertheless, it plays an increasingly important role…

    Submitted 2 December, 2022; originally announced December 2022.

    Comments: 9 pages, 4 figures

    MSC Class: 68T01; ACM Class: I.2.0

  16. arXiv:2211.10056  [pdf, other]

    cs.CV

    Contrastive Losses Are Natural Criteria for Unsupervised Video Summarization

    Authors: Zongshang Pang, Yuta Nakashima, Mayu Otani, Hajime Nagahara

    Abstract: Video summarization aims to select the most informative subset of frames in a video to facilitate efficient video browsing. Unsupervised methods usually rely on heuristic training objectives such as diversity and representativeness. However, such methods need to bootstrap the online-generated summaries to compute the objectives for importance score regression. We consider such a pipeline inefficie…

    Submitted 18 November, 2022; originally announced November 2022.

    Comments: To appear in WACV2023

  17. arXiv:2209.11553  [pdf, other]

    cs.LG cs.AI

    On Efficient Reinforcement Learning for Full-length Game of StarCraft II

    Authors: Ruo-Ze Liu, Zhen-Jia Pang, Zhou-Yu Meng, Wenhai Wang, Yang Yu, Tong Lu

    Abstract: StarCraft II (SC2) poses a grand challenge for reinforcement learning (RL), of which the main difficulties include huge state space, varying action space, and a long time horizon. In this work, we investigate a set of RL techniques for the full-length game of StarCraft II. We investigate a hierarchical RL approach involving extracted macro-actions and a hierarchical architecture of neural networks…

    Submitted 23 September, 2022; originally announced September 2022.

    Comments: 48 pages, 21 figures

    Journal ref: JAIR, 75 (2022), 213-260

  18. Hardware-in-the-Loop Simulation for Evaluating Communication Impacts on the Wireless-Network-Controlled Robots

    Authors: Honghao Lv, Zhibo Pang, Ming Xiao, Geng Yang

    Abstract: More and more robot automation applications have changed to wireless communication, and network performance has a growing impact on robotic systems. This study proposes a hardware-in-the-loop (HiL) simulation methodology for connecting the simulated robot platform to real network devices. This project seeks to provide robotic engineers and researchers with the capability to experiment without heav…

    Submitted 28 September, 2022; v1 submitted 14 July, 2022; originally announced July 2022.

    Comments: 6 pages, 11 figures, to appear in 48th Annual Conference of the Industrial Electronics Society IECON 2022 Conference

  19. arXiv:2207.05267  [pdf]

    cs.SD eess.AS physics.ins-det physics.optics

    Indoor optical fiber eavesdropping approach and its avoidance

    Authors: Haiqing Hao, Zhongwang Pang, Guan Wang, Bo Wang

    Abstract: The optical fiber network has become a worldwide infrastructure. In addition to the basic functions in telecommunication, its sensing ability has attracted more and more attention. In this paper, we discuss the risk of household fiber being used for eavesdropping and demonstrate its performance in the lab. Using a 3-meter tail fiber in front of the household optical modem, voices of normal human s…

    Submitted 3 August, 2022; v1 submitted 11 July, 2022; originally announced July 2022.

    Comments: 8 pages, 4 figures, submitted to Optics Express

  20. arXiv:2203.00770  [pdf]

    cs.IT eess.SP

    Short-Packet Interleaver against Impulse Interference in Practical Industrial Environments

    Authors: Ming Zhan, Zhibo Pang, Dacfey Dzung, Kan Yu, Ming Xiao

    Abstract: The most common cause of transmission failure in Wireless High Performance (WirelessHP) target industry environments is impulse interference. As interleavers are commonly used to improve the reliability on the Orthogonal Frequency Division Multiplexing (OFDM) symbol level for long packet transmission, this paper considers the feasibility of applying short-packet bit interleaving to enhance the imp…

    Submitted 1 March, 2022; originally announced March 2022.

    Comments: 14 pages, 12 figures, submitted to IEEE Transactions on Wireless Communications

  21. arXiv:2112.06375  [pdf, other]

    cs.CV

    Embracing Single Stride 3D Object Detector with Sparse Transformer

    Authors: Lue Fan, Ziqi Pang, Tianyuan Zhang, Yu-Xiong Wang, Hang Zhao, Feng Wang, Naiyan Wang, Zhaoxiang Zhang

    Abstract: In LiDAR-based 3D object detection for autonomous driving, the ratio of the object size to input scene size is significantly smaller compared to 2D detection cases. Overlooking this difference, many 3D detectors directly follow the common practice of 2D detectors, which downsample the feature maps even after quantizing the point clouds. In this paper, we start by rethinking how such multi-stride s…

    Submitted 12 December, 2021; originally announced December 2021.

  22. arXiv:2111.13672  [pdf, other]

    cs.CV

    Immortal Tracker: Tracklet Never Dies

    Authors: Qitai Wang, Yuntao Chen, Ziqi Pang, Naiyan Wang, Zhaoxiang Zhang

    Abstract: Previous online 3D Multi-Object Tracking(3DMOT) methods terminate a tracklet when it is not associated with new detections for a few frames. But if an object just goes dark, like being temporarily occluded by other objects or simply getting out of FOV, terminating a tracklet prematurely will result in an identity switch. We reveal that premature tracklet termination is the main cause of identity s…

    Submitted 26 November, 2021; originally announced November 2021.

  23. arXiv:2111.10586  [pdf, other]

    eess.SP cs.LG

    Satellite Based Computing Networks with Federated Learning

    Authors: Hao Chen, Ming Xiao, Zhibo Pang

    Abstract: Driven by the ever-increasing penetration and proliferation of data-driven applications, a new generation of wireless communication, the sixth-generation (6G) mobile system enhanced by artificial intelligence (AI), has attracted substantial research interests. Among various candidate technologies of 6G, low earth orbit (LEO) satellites have appealing characteristics of ubiquitous wireless access.…

    Submitted 20 November, 2021; originally announced November 2021.

  24. arXiv:2111.09621  [pdf, other]

    cs.CV cs.RO

    SimpleTrack: Understanding and Rethinking 3D Multi-object Tracking

    Authors: Ziqi Pang, Zhichao Li, Naiyan Wang

    Abstract: 3D multi-object tracking (MOT) has witnessed numerous novel benchmarks and approaches in recent years, especially those under the "tracking-by-detection" paradigm. Despite their progress and usefulness, an in-depth analysis of their strengths and weaknesses is not yet available. In this paper, we summarize current 3D MOT methods into a unified framework by decomposing them into four constituent pa…

    Submitted 18 November, 2021; originally announced November 2021.

  25. arXiv:2111.00695  [pdf]

    cs.IT

    Noise Error Pattern Generation Based on Successive Addition-Subtraction for Guessing Decoding

    Authors: Ming Zhan, Zhibo Pang, Kan Yu, Jing Xu, Fang Wu

    Abstract: Guessing random additive noise decoding (GRAND) algorithm has emerged as an excellent decoding strategy that can meet both the high reliability and low latency constraints. This paper proposes a successive addition-subtraction algorithm to generate noise error permutations. A noise error patterns generation scheme is presented by embedding the "1" and "0" bursts alternately. Then detailed procedur…

    Submitted 1 November, 2021; originally announced November 2021.

    Comments: 6 pages, 7 figures, submitted to IEEE Communications Letters

  26. Nonlocal Patch-Based Fully-Connected Tensor Network Decomposition for Remote Sensing Image Inpainting

    Authors: Wen-Jie Zheng, Xi-Le Zhao, Yu-Bang Zheng, Zhi-Feng Pang

    Abstract: Remote sensing image (RSI) inpainting plays an important role in real applications. Recently, fully-connected tensor network (FCTN) decomposition has been shown the remarkable ability to fully characterize the global correlation. Considering the global correlation and the nonlocal self-similarity (NSS) of RSIs, this paper introduces the FCTN decomposition to the whole RSI and its NSS groups, and p…

    Submitted 13 September, 2021; originally announced September 2021.

    Journal ref: IEEE Geoscience and Remote Sensing Letters, 2021

  27. arXiv:2103.11441  [pdf, other]

    cs.CL cs.AI

    TextFlint: Unified Multilingual Robustness Evaluation Toolkit for Natural Language Processing

    Authors: Tao Gui, Xiao Wang, Qi Zhang, Qin Liu, Yicheng Zou, Xin Zhou, Rui Zheng, Chong Zhang, Qinzhuo Wu, Jiacheng Ye, Zexiong Pang, Yongxin Zhang, Zhengyan Li, Ruotian Ma, Zichu Fei, Ruijian Cai, Jun Zhao, Xingwu Hu, Zhiheng Yan, Yiding Tan, Yuan Hu, Qiyuan Bian, Zhihua Liu, Bolin Zhu, Shan Qin, et al. (9 additional authors not shown)

    Abstract: Various robustness evaluation methodologies from different perspectives have been proposed for different natural language processing (NLP) tasks. These methods have often focused on either universal or task-specific generalization capabilities. In this work, we propose a multilingual robustness evaluation platform for NLP tasks (TextFlint) that incorporates universal text transformation, task-spec…

    Submitted 5 May, 2021; v1 submitted 21 March, 2021; originally announced March 2021.

  28. arXiv:2103.06028  [pdf, other]

    cs.CV cs.RO

    Model-free Vehicle Tracking and State Estimation in Point Cloud Sequences

    Authors: Ziqi Pang, Zhichao Li, Naiyan Wang

    Abstract: Estimating the states of surrounding traffic participants stays at the core of autonomous driving. In this paper, we study a novel setting of this problem: model-free single-object tracking (SOT), which takes the object state in the first frame as input, and jointly solves state estimation and tracking in subsequent frames. The main purpose for this new setting is to break the strong limitation of…

    Submitted 5 August, 2021; v1 submitted 10 March, 2021; originally announced March 2021.

    Comments: Accepted by IROS2021, Camera ready version

  29. arXiv:2102.01955  [pdf, other]

    cs.CV q-bio.NC

    Predictive coding feedback results in perceived illusory contours in a recurrent neural network

    Authors: Zhaoyang Pang, Callum Biggs O'May, Bhavin Choksi, Rufin VanRullen

    Abstract: Modern feedforward convolutional neural networks (CNNs) can now solve some computer vision tasks at super-human levels. However, these networks only roughly mimic human visual perception. One difference from human vision is that they do not appear to perceive illusory contours (e.g. Kanizsa squares) in the same way humans do. Physiological evidence from visual cortex suggests that the perception o…

    Submitted 16 June, 2021; v1 submitted 3 February, 2021; originally announced February 2021.

    Comments: Manuscript under review

  30. arXiv:2012.07748  [pdf]

    cs.CY

    Investigation of the Impacts of COVID-19 on the Electricity Consumption of a University Dormitory Using Weather Normalization

    Authors: Zhihong Pang, Fan Feng, Zheng O'Neill

    Abstract: This study investigated the impacts of the COVID-19 pandemic on the electricity consumption of a university dormitory building in the southern U.S. The historical electricity consumption data of this university dormitory building and weather data of an on-campus weather station, which were collected from January 1st, 2017 to July 31st, 2020, were used for analysis. Four inverse data-driven predict…

    Submitted 4 December, 2020; originally announced December 2020.

  31. arXiv:2007.07437  [pdf]

    cs.CV

    ContourRend: A Segmentation Method for Improving Contours by Rendering

    Authors: Junwen Chen, Yi Lu, Yaran Chen, Dongbin Zhao, Zhonghua Pang

    Abstract: A good object segmentation should contain clear contours and complete regions. However, mask-based segmentation can not handle contour features well on a coarse prediction grid, thus causing problems of blurry edges. While contour-based segmentation provides contours directly, but misses contours' details. In order to obtain fine contours, we propose a segmentation method named ContourRend which a…

    Submitted 14 July, 2020; originally announced July 2020.

  32. arXiv:2003.11941  [pdf, other]

    cs.LG cs.AI stat.ML

    AliExpress Learning-To-Rank: Maximizing Online Model Performance without Going Online

    Authors: Guangda Huzhang, Zhen-Jia Pang, Yongqing Gao, Yawen Liu, Weijie Shen, Wen-Ji Zhou, Qing Da, An-Xiang Zeng, Han Yu, Yang Yu, Zhi-Hua Zhou

    Abstract: Learning-to-rank (LTR) has become a key technology in E-commerce applications. Most existing LTR approaches follow a supervised learning paradigm from offline labeled data collected from the online system. However, it has been noticed that previous LTR models can have a good validation performance over offline validation data but have a poor online performance, and vice versa, which implies a poss…

    Submitted 31 December, 2020; v1 submitted 25 March, 2020; originally announced March 2020.

  33. arXiv:1912.07186  [pdf, other]

    cs.IT eess.SP

    Minimizing Age of Information for Real-Time Monitoring in Resource-Constrained Industrial IoT Networks

    Authors: Qian Wang, He Chen, Yonghui Li, Zhibo Pang, Branka Vucetic

    Abstract: This paper considers an Industrial Internet of Thing (IIoT) system with a source monitoring a dynamic process with randomly generated status updates. The status updates are sent to an designated destination in a real-time manner over an unreliable link. The source is subject to a practical constraint of limited average transmission power. Thus, the system should carefully schedule when to transmit…

    Submitted 15 December, 2019; originally announced December 2019.

  34. arXiv:1911.12911  [pdf, other]

    cs.CV

    Unlocking the Full Potential of Small Data with Diverse Supervision

    Authors: Ziqi Pang, Zhiyuan Hu, Pavel Tokmakov, Yu-Xiong Wang, Martial Hebert

    Abstract: Virtually all of deep learning literature relies on the assumption of large amounts of available training data. Indeed, even the majority of few-shot learning methods rely on a large set of "base classes" for pretraining. This assumption, however, does not always hold. For some tasks, annotating a large number of classes can be infeasible, and even collecting the images themselves can be a challen…

    Submitted 26 April, 2021; v1 submitted 28 November, 2019; originally announced November 2019.

    Comments: Learning from Limited and Imperfect Data (L2ID) Workshop @ CVPR 2021

  35. arXiv:1903.00715  [pdf, other]

    cs.LG cs.AI stat.ML

    Efficient Reinforcement Learning for StarCraft by Abstract Forward Models and Transfer Learning

    Authors: Ruo-Ze Liu, Haifeng Guo, Xiaozhong Ji, Yang Yu, Zhen-Jia Pang, Zitai Xiao, Yuzhou Wu, Tong Lu

    Abstract: Injecting human knowledge is an effective way to accelerate reinforcement learning (RL). However, these methods are underexplored. This paper presents our discovery that an abstract forward model (thought-game (TG)) combined with transfer learning (TL) is an effective way. We take StarCraft II as our study environment. With the help of a designed TG, the agent can learn a 99% win-rate on a 64x64 m…

    Submitted 2 November, 2021; v1 submitted 2 March, 2019; originally announced March 2019.

  36. arXiv:1811.06166  [pdf, other]

    cs.MM

    Tiyuntsong: A Self-Play Reinforcement Learning Approach for ABR Video Streaming

    Authors: Tianchi Huang, Xin Yao, Chenglei Wu, Rui-Xiao Zhang, Zhangyuan Pang, Lifeng Sun

    Abstract: Existing reinforcement learning (RL)-based adaptive bitrate (ABR) approaches outperform the previous fixed control rules based methods by improving the Quality of Experience (QoE) score, as the QoE metric can hardly provide clear guidance for optimization, finally resulting in the unexpected strategies. In this paper, we propose Tiyuntsong, a self-play reinforcement learning approach with g…

    Submitted 2 May, 2019; v1 submitted 14 November, 2018; originally announced November 2018.

    Comments: Published in ICME 2019

  37. arXiv:1809.09095  [pdf, other]

    cs.LG cs.AI stat.ML

    On Reinforcement Learning for Full-length Game of StarCraft

    Authors: Zhen-Jia Pang, Ruo-Ze Liu, Zhou-Yu Meng, Yi Zhang, Yang Yu, Tong Lu

    Abstract: StarCraft II poses a grand challenge for reinforcement learning. The main difficulties of it include huge state and action space and a long-time horizon. In this paper, we investigate a hierarchical reinforcement learning approach for StarCraft II. The hierarchy involves two levels of abstraction. One is the macro-action automatically extracted from expert's trajectories, which reduces the action…

    Submitted 3 February, 2019; v1 submitted 23 September, 2018; originally announced September 2018.

    Comments: Appeared in AAAI 2019

  38. arXiv:1808.02079  [pdf, other]

    eess.SP cs.NI

    Low-latency Networking: Where Latency Lurks and How to Tame It

    Authors: Xiaolin Jiang, Hossein S. Ghadikolaei, Gabor Fodor, Eytan Modiano, Zhibo Pang, Michele Zorzi, Carlo Fischione

    Abstract: While the current generation of mobile and fixed communication networks has been standardized for mobile broadband services, the next generation is driven by the vision of the Internet of Things and mission critical communication services requiring latency in the order of milliseconds or sub-milliseconds. However, these new stringent requirements have a large technical impact on the design of all…

    Submitted 6 August, 2018; originally announced August 2018.

  39. arXiv:1702.06700  [pdf, other]

    cs.CV cs.AI cs.CL cs.NE

    Task-driven Visual Saliency and Attention-based Visual Question Answering

    Authors: Yuetan Lin, Zhangyang Pang, Donghui Wang, Yueting Zhuang

    Abstract: Visual question answering (VQA) has witnessed great progress since May, 2015 as a classic problem unifying visual and textual data into a system. Many enlightening VQA works explore deep into the image and question encodings and fusing methods, of which attention is the most effective and infusive mechanism. Current attention based methods focus on adequate fusion of visual and textual features, b…

    Submitted 22 February, 2017; originally announced February 2017.

    Comments: 8 pages, 3 figures

  40. arXiv:1605.09116  [pdf, ps, other]

    math.OC cs.CV

    Image segmentation based on the hybrid total variation model and the K-means clustering strategy

    Authors: Baoli Shi, Zhi-Feng Pang, Jing Xu

    Abstract: The performance of image segmentation highly relies on the original inputting image. When the image is contaminated by some noises or blurs, we can not obtain the efficient segmentation result by using direct segmentation methods. In order to efficiently segment the contaminated image, this paper proposes a two step method based on the hybrid total variation model with a box constraint and the K-m…

    Submitted 30 May, 2016; originally announced May 2016.

  41. arXiv:1509.07211  [pdf, other]

    cs.SD cs.CL

    Noise-Robust ASR for the third 'CHiME' Challenge Exploiting Time-Frequency Masking based Multi-Channel Speech Enhancement and Recurrent Neural Network

    Authors: Zaihu Pang, Fengyun Zhu

    Abstract: In this paper, the Lingban entry to the third 'CHiME' speech separation and recognition challenge is presented. A time-frequency masking based speech enhancement front-end is proposed to suppress the environmental noise utilizing multi-channel coherence and spatial cues. The state-of-the-art speech recognition techniques, namely recurrent neural network based acoustic and language modeling, state…

    Submitted 23 September, 2015; originally announced September 2015.

    Comments: The 3rd 'CHiME' Speech Separation and Recognition Challenge, 5 pages, 1 figure

  42. arXiv:1110.1804   

    cs.CV cs.IT math.OC

    The proximal point method for a hybrid model in image restoration

    Authors: Zhi-Feng Pang, Li-Lian Wang, Yu-Fei Yang

    Abstract: Models including two $L^1$-norm terms have been widely used in image restoration. In this paper we first propose the alternating direction method of multipliers (ADMM) to solve this class of models. Based on ADMM, we then propose the proximal point method (PPM), which is more efficient than ADMM. Following the operator theory, we also give the convergence analysis of the proposed methods. Further…

    Submitted 25 August, 2012; v1 submitted 9 October, 2011; originally announced October 2011.

    Comments: Since we find that there are some unsuitable errors, I withdraw this paper from this website!