Showing 201–250 of 348 results for author: Ouyang, W

  1. arXiv:2105.13695  [pdf, other]

    cs.CV

    AutoSampling: Search for Effective Data Sampling Schedules

    Authors: Ming Sun, Haoxuan Dou, Baopu Li, Lei Cui, Junjie Yan, Wanli Ouyang

    Abstract: Data sampling plays a pivotal role in training deep learning models. However, an effective sampling schedule is difficult to learn due to the inherently high dimension of parameters in learning the sampling schedule. In this paper, we propose an AutoSampling method to automatically learn sampling schedules for model training, which consists of the multi-exploitation step aiming for optimal local…

    Submitted 28 May, 2021; originally announced May 2021.

    Comments: AutoML for sampling firstly without any assumption

    Journal ref: ICML 2021

  2. arXiv:2105.10154  [pdf, other]

    cs.CV

    ViPNAS: Efficient Video Pose Estimation via Neural Architecture Search

    Authors: Lumin Xu, Yingda Guan, Sheng **, Wentao Liu, Chen Qian, ** Luo, Wanli Ouyang, Xiaogang Wang

    Abstract: Human pose estimation has achieved significant progress in recent years. However, most of the recent methods focus on improving accuracy using complicated models and ignoring real-time efficiency. To achieve a better trade-off between accuracy and efficiency, we propose a novel neural architecture search (NAS) method, termed ViPNAS, to search networks in both spatial and temporal levels for fast o…

    Submitted 21 May, 2021; originally announced May 2021.

    Comments: Accepted to CVPR 2021

  3. Learning Graph Meta Embeddings for Cold-Start Ads in Click-Through Rate Prediction

    Authors: Wentao Ouyang, Xiuwu Zhang, Shukui Ren, Li Li, Kun Zhang, **mei Luo, Zhaojie Liu, Yanlong Du

    Abstract: Click-through rate (CTR) prediction is one of the most central tasks in online advertising systems. Recent deep learning-based models that exploit feature embedding and high-order data nonlinearity have shown dramatic successes in CTR prediction. However, these models work poorly on cold-start ads with new IDs, whose embeddings are not well learned yet. In this paper, we propose Graph Meta Embeddi…

    Submitted 18 May, 2021; originally announced May 2021.

    Comments: SIGIR 2021

  4. arXiv:2105.07561  [pdf, other]

    cs.CV cs.LG

    Layerwise Optimization by Gradient Decomposition for Continual Learning

    Authors: Shixiang Tang, Dapeng Chen, **guo Zhu, Shijie Yu, Wanli Ouyang

    Abstract: Deep neural networks achieve state-of-the-art and sometimes super-human performance across various domains. However, when learning tasks sequentially, the networks easily forget the knowledge of previous tasks, known as "catastrophic forgetting". To achieve the consistencies between the old tasks and the new task, one effective solution is to modify the gradient for update. Previous methods enforc…

    Submitted 16 May, 2021; originally announced May 2021.

    Comments: CVPR 2021

  5. arXiv:2104.09880  [pdf, other]

    cs.LG

    GMLP: Building Scalable and Flexible Graph Neural Networks with Feature-Message Passing

    Authors: Wentao Zhang, Yu Shen, Zheyu Lin, Yang Li, Xiaosen Li, Wen Ouyang, Yangyu Tao, Zhi Yang, Bin Cui

    Abstract: In recent studies, neural message passing has proved to be an effective way to design graph neural networks (GNNs), which have achieved state-of-the-art performance in many graph-based tasks. However, current neural-message passing architectures typically need to perform an expensive recursive neighborhood expansion in multiple rounds and consequently suffer from a scalability issue. Moreover, mos…

    Submitted 20 April, 2021; originally announced April 2021.

  6. arXiv:2104.09770  [pdf, other]

    cs.CV

    M2TR: Multi-modal Multi-scale Transformers for Deepfake Detection

    Authors: Junke Wang, Zuxuan Wu, Wenhao Ouyang, Xintong Han, **g**g Chen, Ser-Nam Lim, Yu-Gang Jiang

    Abstract: The widespread dissemination of Deepfakes demands effective approaches that can detect perceptually convincing forged images. In this paper, we aim to capture the subtle manipulation artifacts at different scales using transformer models. In particular, we introduce a Multi-modal Multi-scale TRansformer (M2TR), which operates on patches of different sizes to detect local inconsistencies in images…

    Submitted 19 April, 2022; v1 submitted 20 April, 2021; originally announced April 2021.

    Comments: accepted by ICMR 2022

  7. arXiv:2103.16507  [pdf, other]

    cs.CV

    PyMAF: 3D Human Pose and Shape Regression with Pyramidal Mesh Alignment Feedback Loop

    Authors: Hongwen Zhang, Yating Tian, Xinchi Zhou, Wanli Ouyang, Yebin Liu, Limin Wang, Zhenan Sun

    Abstract: Regression-based methods have recently shown promising results in reconstructing human meshes from monocular images. By directly mapping raw pixels to model parameters, these methods can produce parametric models in a feed-forward manner via neural networks. However, minor deviation in parameters may lead to noticeable misalignment between the estimated meshes and image evidences. To address this…

    Submitted 23 August, 2021; v1 submitted 30 March, 2021; originally announced March 2021.

    Comments: Accepted to ICCV 2021, Oral paper. Code and model available at https://hongwenzhang.github.io/pymaf

  8. arXiv:2103.16237  [pdf, other]

    cs.CV

    Delving into Localization Errors for Monocular 3D Object Detection

    Authors: Xinzhu Ma, Yinmin Zhang, Dan Xu, Dongzhan Zhou, Shuai Yi, Haojie Li, Wanli Ouyang

    Abstract: Estimating 3D bounding boxes from monocular images is an essential component in autonomous driving, while accurate 3D object detection from this kind of data is very challenging. In this work, by intensive diagnosis experiments, we quantify the impact introduced by each sub-task and find that the 'localization error' is the vital factor in restricting monocular 3D detection. Besides, we also investiga…

    Submitted 30 March, 2021; originally announced March 2021.

    Comments: CVPR'2021, code will be made available

  9. arXiv:2103.12294  [pdf, other]

    cs.CV

    Gradient Regularized Contrastive Learning for Continual Domain Adaptation

    Authors: Shixiang Tang, Peng Su, Dapeng Chen, Wanli Ouyang

    Abstract: Human beings can quickly adapt to environmental changes by leveraging learning experience. However, adapting deep neural networks to dynamic environments by machine learning algorithms remains a challenge. To better understand this issue, we study the problem of continual domain adaptation, where the model is presented with a labelled source domain and a sequence of unlabelled target domains. The…

    Submitted 23 March, 2021; originally announced March 2021.

    Comments: Accepted by AAAI2021 (poster). arXiv admin note: text overlap with arXiv:2007.12942

  10. arXiv:2103.10130  [pdf, other]

    cs.CV

    Real-Time Visual Object Tracking via Few-Shot Learning

    Authors: **ghao Zhou, Bo Li, Peng Wang, Peixia Li, Weihao Gan, Wei Wu, Junjie Yan, Wanli Ouyang

    Abstract: Visual Object Tracking (VOT) can be seen as an extended task of Few-Shot Learning (FSL). While the concept of FSL is not new in tracking and has been previously applied by prior works, most of them are tailored to fit specific types of FSL algorithms and may sacrifice running speed. In this work, we propose a generalized two-stage framework that is capable of employing a large variety of FSL algor…

    Submitted 18 March, 2021; originally announced March 2021.

  11. arXiv:2103.10089  [pdf, other]

    cs.CV

    Higher Performance Visual Tracking with Dual-Modal Localization

    Authors: **ghao Zhou, Bo Li, Lei Qiao, Peng Wang, Weihao Gan, Wei Wu, Junjie Yan, Wanli Ouyang

    Abstract: Visual Object Tracking (VOT) has synchronous needs for both robustness and accuracy. While most existing works fail to operate simultaneously on both, we investigate in this work the problem of conflicting performance between accuracy and robustness. We first conduct a systematic comparison among existing methods and analyze their restrictions in terms of accuracy and robustness. Specifically, 4 f…

    Submitted 18 March, 2021; originally announced March 2021.

  12. arXiv:2102.12220  [pdf]

    cs.RO eess.SY

    A Trident Quaternion Framework for Inertial-based Navigation Part II: Error Models and Application to Initial Alignment

    Authors: Wei Ouyang, Yuanxin Wu

    Abstract: This work deals with error models for trident quaternion framework proposed in the companion paper (Part I) and further uses them to investigate the odometer-aided static/in-motion inertial navigation attitude alignment for land vehicles. By linearizing the trident quaternion kinematic equation, the left and right trident quaternion error models are obtained, which are found to be equivalent to th…

    Submitted 16 May, 2021; v1 submitted 24 February, 2021; originally announced February 2021.

    Comments: 17 pages, 13 figures

  13. arXiv:2102.12217  [pdf]

    cs.RO eess.SY

    A Trident Quaternion Framework for Inertial-based Navigation Part I: Rigid Motion Representation and Computation

    Authors: Wei Ouyang, Yuanxin Wu

    Abstract: Strapdown inertial navigation research involves the parameterization and computation of the attitude, velocity and position of a rigid body in a chosen reference frame. The community has long been devoted to finding the most concise and efficient representation for the strapdown inertial navigation system (INS). The current work is motivated by simplifying the existing dual quaternion representation of…

    Submitted 24 February, 2021; originally announced February 2021.

    Comments: 10 pages, 5 figures

  14. Continental-scale streamflow modeling of basins with reservoirs: towards a coherent deep-learning-based strategy

    Authors: Wenyu Ouyang, Kathryn Lawson, Dapeng Feng, Lei Ye, Chi Zhang, Chaopeng Shen

    Abstract: A large fraction of major waterways have dams influencing streamflow, which must be accounted for in large-scale hydrologic modeling. However, daily streamflow prediction for basins with dams is challenging for various modeling approaches, especially at large scales. Here we examined which types of dammed basins could be well represented by long short-term memory (LSTM) models using readily-availa…

    Submitted 12 May, 2021; v1 submitted 12 January, 2021; originally announced January 2021.

    Journal ref: Journal of Hydrology, 2021

  15. arXiv:2101.02843  [pdf, other]

    cs.CV eess.IV

    Probabilistic Graph Attention Network with Conditional Kernels for Pixel-Wise Prediction

    Authors: Dan Xu, Xavier Alameda-Pineda, Wanli Ouyang, Elisa Ricci, Xiaogang Wang, Nicu Sebe

    Abstract: Multi-scale representations deeply learned via convolutional neural networks have shown tremendous importance for various pixel-level prediction problems. In this paper we present a novel approach that advances the state of the art on pixel-level prediction in a fundamental aspect, i.e. structured multi-scale features learning and fusion. In contrast to previous works directly considering multi-sc…

    Submitted 13 March, 2022; v1 submitted 7 January, 2021; originally announced January 2021.

    Comments: Regular paper accepted at TPAMI 2020. arXiv admin note: text overlap with arXiv:1801.00524

  16. arXiv:2012.13587  [pdf, other]

    cs.CV

    Inception Convolution with Efficient Dilation Search

    Authors: Jie Liu, Chuming Li, Feng Liang, Chen Lin, Ming Sun, Junjie Yan, Wanli Ouyang, Dong Xu

    Abstract: As a variant of standard convolution, a dilated convolution can control effective receptive fields and handle large scale variance of objects without introducing additional computational costs. To fully explore the potential of dilated convolution, we proposed a new type of dilated convolution (referred to as inception convolution), where the convolution operations have independent dilation patter…

    Submitted 10 May, 2021; v1 submitted 25 December, 2020; originally announced December 2020.

  17. arXiv:2012.06785  [pdf, other]

    cs.CV

    DETR for Crowd Pedestrian Detection

    Authors: Matthieu Lin, Chuming Li, Xingyuan Bu, Ming Sun, Chen Lin, Junjie Yan, Wanli Ouyang, Zhidong Deng

    Abstract: Pedestrian detection in crowd scenes poses a challenging problem due to the heuristically defined mapping from anchors to pedestrians and the conflict between NMS and highly overlapped pedestrians. The recently proposed end-to-end detectors (ED), DETR and deformable DETR, replace hand-designed components such as NMS and anchors using the transformer architecture, which gets rid of duplicate predictions…

    Submitted 18 February, 2021; v1 submitted 12 December, 2020; originally announced December 2020.

  18. arXiv:2012.05586  [pdf, other]

    cs.CV

    Full Matching on Low Resolution for Disparity Estimation

    Authors: Hong Zhang, Shenglun Chen, Zhihui Wang, Haojie Li, Wanli Ouyang

    Abstract: A Multistage Full Matching disparity estimation scheme (MFM) is proposed in this work. We demonstrate that decoupling all similarity scores directly from the low-resolution 4D volume step by step, instead of estimating a low-resolution 3D cost volume by iteratively optimizing the low-resolution 4D volume, leads to more accurate disparity. To this end, we first propose to decompose the f…

    Submitted 10 December, 2020; originally announced December 2020.

    Comments: 9 pages, 5 figures

    MSC Class: 65D19

  19. arXiv:2012.05570  [pdf, other]

    cs.CV

    Direct Depth Learning Network for Stereo Matching

    Authors: Hong Zhang, Haojie Li, Shenglun Chen, Tiantian Yan, Zhihui Wang, Guo Lu, Wanli Ouyang

    Abstract: Being a crucial task of autonomous driving, Stereo matching has made great progress in recent years. Existing stereo matching methods estimate disparity instead of depth. They treat the disparity errors as the evaluation metric of the depth estimation errors, since the depth can be calculated from the disparity according to the triangulation principle. However, we find that the error of the depth…

    Submitted 10 December, 2020; originally announced December 2020.

    Comments: 10 pages, 4 figures

    MSC Class: 65D19

  20. arXiv:2011.13628  [pdf, other]

    cs.CV

    Temporal-Channel Transformer for 3D Lidar-Based Video Object Detection in Autonomous Driving

    Authors: Zhenxun Yuan, Xiao Song, Lei Bai, Wengang Zhou, Zhe Wang, Wanli Ouyang

    Abstract: The strong demand for autonomous driving in the industry has led to strong interest in 3D object detection and resulted in many excellent 3D object detection algorithms. However, the vast majority of algorithms only model single-frame data, ignoring the temporal information of the sequence of data. In this work, we propose a new transformer, called Temporal-Channel Transformer, to model the spatia…

    Submitted 27 November, 2020; originally announced November 2020.

  21. arXiv:2011.10904  [pdf, other]

    cs.CV

    Evolving Search Space for Neural Architecture Search

    Authors: Yuanzheng Ci, Chen Lin, Ming Sun, Boyu Chen, Hongwen Zhang, Wanli Ouyang

    Abstract: The automation of neural architecture design has been a coveted alternative to human experts. Recent works have small search space, which is easier to optimize but has a limited upper bound of the optimal solution. Extra human design is needed for those methods to propose a more suitable space with respect to the specific task and algorithm capacity. To further enhance the degree of automation for…

    Submitted 18 August, 2021; v1 submitted 21 November, 2020; originally announced November 2020.

    Comments: Accepted for publication at the 2021 International Conference on Computer Vision (ICCV 2021)

  22. Parity-Dependent Moiré Superlattices in Graphene/h-BN Heterostructures: A Route to Mechanomutable Metamaterials

    Authors: Wengen Ouyang, Oded Hod, Michael Urbakh

    Abstract: The superlattice of alternating graphene/h-BN few-layered heterostructures is found to exhibit strong dependence on the parity of the number of layers within the stack. Odd-parity systems show a unique flamingo-like pattern, whereas their even-parity counterparts exhibit regular hexagonal or rectangular superlattices. When the alternating stack consists of seven layers or more, the flamingo patter…

    Submitted 14 November, 2020; originally announced November 2020.

    Comments: 31 pages, 17 figures

    Journal ref: Phys. Rev. Lett. 126, 216101 (2021)

  23. arXiv:2010.15859  [pdf]

    physics.app-ph cond-mat.mtrl-sci

    Transient Grating Spectroscopy of Photocarrier Dynamics in Semiconducting Polymer Thin Films

    Authors: Wenkai Ouyang, Yu Li, Brett Yurash, Nora Schopp, Alejandro Vega-Flick, Viktor Brus, Thuc-Quyen Nguyen, Bolin Liao

    Abstract: While charge carrier dynamics and thermal management are both keys to the operational efficiency and stability for energy-related devices, experimental techniques that can simultaneously characterize both properties are still lacking. In this paper, we use laser-induced transient grating (TG) spectroscopy to characterize thin films of the archetypal organic semiconductor regioregular poly(3-hexylt…

    Submitted 29 October, 2020; originally announced October 2020.

    Comments: 18 pages; 5 Figures

  24. arXiv:2010.11041  [pdf, other]

    cs.LG math.OC

    Adaptive Gradient Method with Resilience and Momentum

    Authors: Jie Liu, Chen Lin, Chuming Li, Lu Sheng, Ming Sun, Junjie Yan, Wanli Ouyang

    Abstract: Several variants of stochastic gradient descent (SGD) have been proposed to improve the learning effectiveness and efficiency when training deep neural networks, among which some recent influential attempts would like to adaptively control the parameter-wise learning rate (e.g., Adam and RMSProp). Although they show a large improvement in convergence speed, most adaptive learning rate methods suff…

    Submitted 21 October, 2020; originally announced October 2020.

  25. arXiv:2010.04354  [pdf, other]

    cs.CV eess.IV

    Once Quantization-Aware Training: High Performance Extremely Low-bit Architecture Search

    Authors: Mingzhu Shen, Feng Liang, Ruihao Gong, Yuhang Li, Chuming Li, Chen Lin, Fengwei Yu, Junjie Yan, Wanli Ouyang

    Abstract: Quantization Neural Networks (QNN) have attracted a lot of attention due to their high efficiency. To enhance the quantization accuracy, prior works mainly focus on designing advanced quantization algorithms but still fail to achieve satisfactory results under the extremely low-bit case. In this work, we take an architecture perspective to investigate the potential of high-performance QNN. Therefo…

    Submitted 28 September, 2021; v1 submitted 8 October, 2020; originally announced October 2020.

    Comments: Accepted by ICCV2021

  26. arXiv:2009.14737  [pdf, other]

    cs.LG cs.CV stat.ML

    Improving Auto-Augment via Augmentation-Wise Weight Sharing

    Authors: Keyu Tian, Chen Lin, Ming Sun, Lu** Zhou, Junjie Yan, Wanli Ouyang

    Abstract: The recent progress on automatically searching augmentation policies has boosted the performance substantially for various tasks. A key component of automatic augmentation search is the evaluation process for a particular augmentation policy, which is utilized to return reward and usually runs thousands of times. A plain evaluation process, which includes full model training and validation, would…

    Submitted 22 October, 2020; v1 submitted 30 September, 2020; originally announced September 2020.

    Comments: Accepted to NeurIPS 2020 (Poster)

  27. arXiv:2009.10338  [pdf, other]

    cs.CV

    SAMOT: Switcher-Aware Multi-Object Tracking and Still Another MOT Measure

    Authors: Weitao Feng, Zhihao Hu, Baopu Li, Weihao Gan, Wei Wu, Wanli Ouyang

    Abstract: Multi-Object Tracking (MOT) is a popular topic in computer vision. However, the identity issue, i.e., an object being wrongly associated with another object of a different identity, remains a challenging problem. To address it, switchers, i.e., confusing targets that may cause identity issues, should be focused on. Based on this motivation, this paper proposes a novel switcher-aware framework for…

    Submitted 22 September, 2020; originally announced September 2020.

  28. arXiv:2009.05982  [pdf, other]

    cs.CV

    Improving Deep Video Compression by Resolution-adaptive Flow Coding

    Authors: Zhihao Hu, Zhenghao Chen, Dong Xu, Guo Lu, Wanli Ouyang, Shuhang Gu

    Abstract: In the learning based video compression approaches, it is an essential issue to compress pixel-level optical flow maps by developing new motion vector (MV) encoders. In this work, we propose a new framework called Resolution-adaptive Flow Coding (RaFC) to effectively compress the flow maps globally and locally, in which we use multi-resolution representations instead of single-resolution represent…

    Submitted 13 September, 2020; originally announced September 2020.

    Comments: ECCV 2020 (oral)

  29. arXiv:2009.05834  [pdf, other]

    cs.CV cs.CL

    Exploring the Hierarchy in Relation Labels for Scene Graph Generation

    Authors: Yi Zhou, Shuyang Sun, Chao Zhang, Yikang Li, Wanli Ouyang

    Abstract: By assigning each relationship a single label, current approaches formulate the relationship detection as a classification problem. Under this formulation, predicate categories are treated as completely different classes. However, different from the object labels where different classes have explicit boundaries, predicates usually have overlaps in their semantic meanings. For example, sit_on and…

    Submitted 12 September, 2020; originally announced September 2020.

  30. arXiv:2008.06226  [pdf, other]

    cs.CV

    BriNet: Towards Bridging the Intra-class and Inter-class Gaps in One-Shot Segmentation

    Authors: Xianghui Yang, Bairun Wang, Kaige Chen, Xinchi Zhou, Shuai Yi, Wanli Ouyang, Lu** Zhou

    Abstract: Few-shot segmentation focuses on the generalization of models to segment unseen object instances with limited training samples. Although tremendous improvements have been achieved, existing methods are still constrained by two factors. (1) The information interaction between query and support images is not adequate, leaving intra-class gap. (2) The object categories at the training and inference s…

    Submitted 14 August, 2020; originally announced August 2020.

    Comments: 14 pages, 6 figures, BMVC 2020 (Oral)

  31. arXiv:2008.04582  [pdf, other]

    cs.CV

    Rethinking Pseudo-LiDAR Representation

    Authors: Xinzhu Ma, Shinan Liu, Zhiyi Xia, Hongwen Zhang, Xingyu Zeng, Wanli Ouyang

    Abstract: The recently proposed pseudo-LiDAR based 3D detectors greatly improve the benchmark of monocular/stereo 3D detection task. However, the underlying mechanism remains obscure to the research community. In this paper, we perform an in-depth investigation and observe that the efficacy of pseudo-LiDAR representation comes from the coordinate transformation, instead of data representation itself. Based…

    Submitted 11 August, 2020; originally announced August 2020.

    Comments: ECCV2020. Supplemental Material attached

  32. arXiv:2008.02974  [pdf, other]

    cs.IR cs.LG

    MiNet: Mixed Interest Network for Cross-Domain Click-Through Rate Prediction

    Authors: Wentao Ouyang, Xiuwu Zhang, Lei Zhao, **mei Luo, Yu Zhang, Heng Zou, Zhaojie Liu, Yanlong Du

    Abstract: Click-through rate (CTR) prediction is a critical task in online advertising systems. Existing works mainly address the single-domain CTR prediction problem and model aspects such as feature interaction, user behavior history and contextual information. Nevertheless, ads are usually displayed with natural content, which offers an opportunity for cross-domain CTR prediction. In this paper, we addre…

    Submitted 6 August, 2020; originally announced August 2020.

    Comments: CIKM 2020

  33. arXiv:2008.00709  [pdf]

    cond-mat.mes-hall

    Sliding Over Graphene Grain Boundaries: A Step Towards Macroscale Superlubricity

    Authors: Xiang Gao, Wengen Ouyang, Oded Hod, Michael Urbakh

    Abstract: In light of the race towards macroscale superlubricity of graphitic contacts, the effect of grain boundaries on their frictional properties becomes of central importance. Here, we elucidate the unique frictional mechanisms characterizing topological defects along typical grain boundaries that can vary from being nearly flat to highly corrugated, depending on the boundary misfit angle. We find that…

    Submitted 3 August, 2020; originally announced August 2020.

  34. arXiv:2007.11864  [pdf, other]

    cs.CV

    Differentiable Hierarchical Graph Grouping for Multi-Person Pose Estimation

    Authors: Sheng **, Wentao Liu, Enze Xie, Wenhai Wang, Chen Qian, Wanli Ouyang, ** Luo

    Abstract: Multi-person pose estimation is challenging because it localizes body keypoints for multiple persons simultaneously. Previous methods can be divided into two streams, i.e. top-down and bottom-up methods. The top-down methods localize keypoints after human detection, while the bottom-up methods localize keypoints directly and then cluster/group them for different persons, which are generally more e…

    Submitted 23 July, 2020; originally announced July 2020.

    Comments: To appear on ECCV 2020

  35. arXiv:2007.11858  [pdf, other]

    cs.CV

    Whole-Body Human Pose Estimation in the Wild

    Authors: Sheng **, Lumin Xu, ** Xu, Can Wang, Wentao Liu, Chen Qian, Wanli Ouyang, ** Luo

    Abstract: This paper investigates the task of 2D human whole-body pose estimation, which aims to localize dense landmarks on the entire human body including face, hands, body, and feet. As existing datasets do not have whole-body annotations, previous methods have to assemble different deep models trained independently on different datasets of the human face, hand, and body, struggling with dataset biases a…

    Submitted 23 July, 2020; originally announced July 2020.

    Comments: To appear on ECCV2020

  36. Controllable Thermal Conductivity in Twisted Homogeneous Interfaces of Graphene and Hexagonal Boron Nitride

    Authors: Wengen Ouyang, Huasong Qin, Michael Urbakh, Oded Hod

    Abstract: Thermal conductivity of homogeneous twisted stacks of graphite is found to strongly depend on the misfit angle. The underlying mechanism relies on the angle dependence of phonon-phonon couplings across the twisted interface. Excellent agreement between the calculated thermal conductivity of narrow graphitic stacks and corresponding experimental results indicates the validity of the predictions. Th…

    Submitted 21 July, 2020; originally announced July 2020.

    Comments: 52 pages, 11 figures

    Journal ref: Nano Lett. 20, 7513-7518 (2020)

  37. INS/Odometer Land Navigation by Accurate Measurement Modeling and Multiple-Model Adaptive Estimation

    Authors: Wei Ouyang, Yuanxin Wu, Hongyue Chen

    Abstract: Land vehicle navigation based on inertial navigation system (INS) and odometers is a classical autonomous navigation application and has been extensively studied over the past several decades. In this work, we seriously analyze the error characteristics of the odometer (OD) pulses and investigate three types of odometer measurement models in the INS/OD integrated system. Specifically, in the pulse…

    Submitted 20 July, 2020; originally announced July 2020.

    Comments: 16 pages

    Journal ref: IEEE Transactions on Aerospace and Electronic Systems, 2021

  38. arXiv:2006.14539  [pdf, other]

    math.OC cs.GR

    Anderson Acceleration for Nonconvex ADMM Based on Douglas-Rachford Splitting

    Authors: Wenqing Ouyang, Yue Peng, Yuxin Yao, Juyong Zhang, Bailin Deng

    Abstract: The alternating direction multiplier method (ADMM) is widely used in computer graphics for solving optimization problems that can be nonsmooth and nonconvex. It converges quickly to an approximate solution, but can take a long time to converge to a solution of high-accuracy. Previously, Anderson acceleration has been applied to ADMM, by treating it as a fixed-point iteration for the concatenation…

    Submitted 26 June, 2020; v1 submitted 25 June, 2020; originally announced June 2020.

    Comments: To be published in Computer Graphics Forum and presented at Eurographics Symposium on Geometry Processing 2020

  39. arXiv:2006.05734  [pdf, other]

    cs.CV

    3D Human Mesh Regression with Dense Correspondence

    Authors: Wang Zeng, Wanli Ouyang, ** Luo, Wentao Liu, Xiaogang Wang

    Abstract: Estimating 3D mesh of the human body from a single 2D image is an important task with many applications such as augmented reality and Human-Robot interaction. However, prior works reconstructed 3D mesh from global image feature extracted by using convolutional neural network (CNN), where the dense correspondences between the mesh surface and the image pixels are missing, leading to suboptimal solu…

    Submitted 6 June, 2021; v1 submitted 10 June, 2020; originally announced June 2020.

    Comments: To appear at CVPR 2020

  40. arXiv:2006.02559  [pdf, other]

    math.OC math.NA

    Nonmonotone Globalization for Anderson Acceleration via Adaptive Regularization

    Authors: Wenqing Ouyang, Jiong Tao, Andre Milzarek, Bailin Deng

    Abstract: Anderson acceleration (AA) is a popular method for accelerating fixed-point iterations, but may suffer from instability and stagnation. We propose a globalization method for AA to improve stability and achieve unified global and local convergence. Unlike existing AA globalization approaches that rely on safeguarding operations and might hinder fast local convergence, we adopt a nonmonotone trust-r…

    Submitted 2 May, 2023; v1 submitted 3 June, 2020; originally announced June 2020.

    Comments: Accepted to Journal of Scientific Computing

  41. arXiv:2005.04854  [pdf, other]

    cs.CV

    Scope Head for Accurate Localization in Object Detection

    Authors: Geng Zhan, Dan Xu, Guo Lu, Wei Wu, Chunhua Shen, Wanli Ouyang

    Abstract: Existing anchor-based and anchor-free object detectors in multi-stage or one-stage pipelines have achieved very promising detection performance. However, they still encounter the design difficulty in hand-crafted 2D anchor definition and the learning complexity in 1D direct location regression. To tackle these issues, in this paper, we propose a novel detector coined as ScopeNet, which models anch…

    Submitted 11 May, 2020; v1 submitted 11 May, 2020; originally announced May 2020.

  42. arXiv:2004.12178  [pdf, other]

    cs.CV

    Cheaper Pre-training Lunch: An Efficient Paradigm for Object Detection

    Authors: Dongzhan Zhou, Xinchi Zhou, Hongwen Zhang, Shuai Yi, Wanli Ouyang

    Abstract: In this paper, we propose a general and efficient pre-training paradigm, Montage pre-training, for object detection. Montage pre-training needs only the target detection dataset while taking only 1/4 computational resources compared to the widely adopted ImageNet pre-training. To build such an efficient paradigm, we reduce the potential redundancy by carefully extracting useful samples from the ori…

    Submitted 31 August, 2020; v1 submitted 25 April, 2020; originally announced April 2020.

    Comments: Accepted by ECCV 2020

  43. arXiv:2004.10999  [pdf, other]

    cs.CV

    Location-Aware Feature Selection Text Detection Network

    Authors: Zengyuan Guo, Zilin Wang, Zhihui Wang, Wanli Ouyang, Haojie Li, Wen Gao

    Abstract: Regression-based text detection methods have already achieved promising performances with simple network structure and high efficiency. However, they are behind in accuracy compared with recent segmentation-based text detectors. In this work, we discover that one important reason for this is that regression-based methods usually utilize a fixed feature selection way, i.e. selecting features i…

    Submitted 25 May, 2020; v1 submitted 23 April, 2020; originally announced April 2020.

    Comments: 10 pages, 7 figures, 5 tables

    ACM Class: I.4.8

  44. arXiv:2003.14111  [pdf, other]

    cs.CV

    Disentangling and Unifying Graph Convolutions for Skeleton-Based Action Recognition

    Authors: Ziyu Liu, Hongwen Zhang, Zhenghao Chen, Zhiyong Wang, Wanli Ouyang

    Abstract: Spatial-temporal graphs have been widely used by skeleton-based action recognition algorithms to model human action dynamics. To capture robust movement patterns from these graphs, long-range and multi-scale context aggregation and spatial-temporal dependency modeling are critical aspects of a powerful feature extractor. However, existing methods have limitations in achieving (1) unbiased long-ran…

    Submitted 19 May, 2020; v1 submitted 31 March, 2020; originally announced March 2020.

    Comments: CVPR 2020

  45. arXiv:2003.11282  [pdf, other]

    eess.IV cs.CV

    Content Adaptive and Error Propagation Aware Deep Video Compression

    Authors: Guo Lu, Chunlei Cai, Xiaoyun Zhang, Li Chen, Wanli Ouyang, Dong Xu, Zhiyong Gao

    Abstract: Recently, learning based video compression methods attract increasing attention. However, the previous works suffer from error propagation due to the accumulation of reconstructed error in inter predictive coding. Meanwhile, the previous learning based video codecs are also not adaptive to different video contents. To address these two problems, we propose a content adaptive and error propagation…

    Submitted 25 March, 2020; originally announced March 2020.

    Comments: First two authors contributed equally

  46. arXiv:2003.06757  [pdf, ps, other]

    cs.CV

    Channel Pruning Guided by Classification Loss and Feature Importance

    Authors: Jinyang Guo, Wanli Ouyang, Dong Xu

    Abstract: In this work, we propose a new layer-by-layer channel pruning method called Channel Pruning guided by classification Loss and feature Importance (CPLI). In contrast to the existing layer-by-layer channel pruning approaches that only consider how to reconstruct the features from the next layer, our approach additionally take the classification loss into account in the channel pruning process. We al…

    Submitted 15 March, 2020; originally announced March 2020.

    Comments: AAAI2020

  47. arXiv:2003.05176  [pdf, other]

    cs.CV

    Equalization Loss for Long-Tailed Object Recognition

    Authors: Jingru Tan, Changbao Wang, Buyu Li, Quanquan Li, Wanli Ouyang, Changqing Yin, Junjie Yan

    Abstract: Object recognition techniques using convolutional neural networks (CNN) have achieved great success. However, state-of-the-art object detection methods still perform poorly on large vocabulary and long-tailed datasets, e.g. LVIS. In this work, we analyze this problem from a novel perspective: each positive sample of one category can be seen as a negative sample for other categories, making the tai…

    Submitted 14 April, 2020; v1 submitted 11 March, 2020; originally announced March 2020.

    Comments: CVPR 2020. Winner of LVIS Challenge 2019. Code has been available at https://github.com/tztztztztz/eql.detectron2

  48. arXiv:2001.01233  [pdf, other]

    cs.CV cs.LG cs.NE

    EcoNAS: Finding Proxies for Economical Neural Architecture Search

    Authors: Dongzhan Zhou, Xinchi Zhou, Wenwei Zhang, Chen Change Loy, Shuai Yi, Xuesen Zhang, Wanli Ouyang

    Abstract: Neural Architecture Search (NAS) achieves significant progress in many computer vision tasks. While many methods have been proposed to improve the efficiency of NAS, the search progress is still laborious because training and evaluating plausible architectures over large search space is time-consuming. Assessing network candidates under a proxy (i.e., computationally reduced setting) thus becomes…

    Submitted 26 February, 2020; v1 submitted 5 January, 2020; originally announced January 2020.

    Comments: CVPR2020

  49. arXiv:1912.13344  [pdf, other]

    cs.CV

    Learning 3D Human Shape and Pose from Dense Body Parts

    Authors: Hongwen Zhang, Jie Cao, Guo Lu, Wanli Ouyang, Zhenan Sun

    Abstract: Reconstructing 3D human shape and pose from monocular images is challenging despite the promising results achieved by the most recent learning-based methods. The commonly occurred misalignment comes from the facts that the mapping from images to the model space is highly non-linear and the rotation-based pose representation of body models is prone to result in the drift of joint positions. In this…

    Submitted 6 December, 2020; v1 submitted 31 December, 2019; originally announced December 2019.

    Comments: Journal article accepted by IEEE TPAMI. Project page: https://hongwenzhang.github.io/dense2mesh

  50. arXiv:1912.11234  [pdf, other]

    cs.CV cs.LG

    Computation Reallocation for Object Detection

    Authors: Feng Liang, Chen Lin, Ronghao Guo, Ming Sun, Wei Wu, Junjie Yan, Wanli Ouyang

    Abstract: The allocation of computation resources in the backbone is a crucial issue in object detection. However, classification allocation pattern is usually adopted directly to object detector, which is proved to be sub-optimal. In order to reallocate the engaged computation resources in a more efficient way, we present CR-NAS (Computation Reallocation Neural Architecture Search) that can learn computati…

    Submitted 24 December, 2019; originally announced December 2019.

    Comments: ICLR2020