-
High-Resolution Volumetric Reconstruction for Clothed Humans
Authors:
Sicong Tang,
Guangyuan Wang,
Qing Ran,
Lingzhi Li,
Li Shen,
Ping Tan
Abstract:
We present a novel method for reconstructing clothed humans from a sparse set of RGB images, e.g., 1 to 6. Despite impressive results from recent works employing deep implicit representations, we revisit the volumetric approach and demonstrate that better performance can be achieved with proper system design. The volumetric representation offers significant advantages in leveraging 3D spatial context through 3D convolutions, and the notorious quantization error is largely negligible with a reasonably large yet affordable volume resolution, e.g., 512. To handle memory and computation costs, we propose a sophisticated coarse-to-fine strategy with voxel culling and subspace sparse convolution. Our method starts with a discretized visual hull to compute a coarse shape and then focuses on a narrow band near the coarse shape for refinement. Once the shape is reconstructed, we adopt an image-based rendering approach, which computes the colors of surface points by blending input images with learned weights. Extensive experimental results show that our method reduces the mean point-to-surface (P2S) error of state-of-the-art methods by more than 50%, achieving approximately 2 mm accuracy with a 512 volume resolution. Additionally, images rendered from our textured model achieve a higher peak signal-to-noise ratio (PSNR) than those of state-of-the-art methods.
Submitted 25 July, 2023;
originally announced July 2023.
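The abstract above reports rendering quality as PSNR. As a reminder of what that metric measures, here is a minimal sketch using the standard definition, 10 * log10(MAX^2 / MSE). The function name `psnr`, the flat-list image representation, and the 8-bit default peak value are illustrative assumptions, not details taken from the paper.

```python
import math


def psnr(reference, rendered, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two images.

    Both images are flat sequences of pixel intensities of equal length
    (an assumption made here to keep the sketch dependency-free).
    """
    if len(reference) != len(rendered) or len(reference) == 0:
        raise ValueError("images must be non-empty and the same size")
    # Mean squared error over all pixels.
    mse = sum((a - b) ** 2 for a, b in zip(reference, rendered)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)
```

A higher value means the rendered image is closer to the reference; identical images give an infinite PSNR.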
-
Links Assignment Scheme based on Potential Edges Importance in Dual-layer Wavelength Routing Optical Satellite Networks
Authors:
Jingkai Yang,
Qiwen Ran,
Hongyu Wu,
Jing Ma
Abstract:
With the development of massive satellite constellations and on-orbit laser-based communication equipment, the wavelength routing optical satellite network (WROSN) has become a potential solution for on-orbit, high-capacity, and high-speed communication. Since the inter-satellite links (ISLs) are time-varying, one of the fundamental considerations in constructing the WROSN is assigning the limited laser communication terminals of each satellite to establish ISLs with the visible satellites. Therefore, we propose a links assignment scheme (LAS) based on the potential edges importance matrix (PEIM) algorithm to construct a temporarily stable topology of the ISLs for a dual-layer constellation. The simulation results show that the LAS based on the PEIM algorithm outperforms the LAS based on the random or Greedy algorithm in terms of node-to-node distance, node pair connectivity, wavelength demand, and transmission delay. Node pair connectivity and wavelength demand in the WROSN trade off against each other. The research in this paper also offers a novel method for reducing the cost of on-board resources, namely designing the topology of the ISLs with a links assignment algorithm.
Submitted 3 April, 2023;
originally announced April 2023.
-
Decoupling Makes Weakly Supervised Local Feature Better
Authors:
Kunhong Li,
Longguang Wang,
Li Liu,
Qing Ran,
Kai Xu,
Yulan Guo
Abstract:
Weakly supervised learning can help local feature methods overcome the obstacle of acquiring a large-scale dataset with densely labeled correspondences. However, since weak supervision cannot distinguish the losses caused by the detection and description steps, directly conducting weakly supervised learning within a joint describe-then-detect pipeline suffers from limited performance. In this paper, we propose a decoupled describe-then-detect pipeline tailored for weakly supervised local feature learning. Within our pipeline, the detection step is decoupled from the description step and postponed until discriminative and robust descriptors are learned. In addition, we introduce a line-to-window search strategy to explicitly use the camera pose information for better descriptor learning. Extensive experiments show that our method, namely PoSFeat (Camera Pose Supervised Feature), outperforms previous fully and weakly supervised methods and achieves state-of-the-art performance on a wide range of downstream tasks.
Submitted 25 March, 2022; v1 submitted 8 January, 2022;
originally announced January 2022.
-
ActiveZero: Mixed Domain Learning for Active Stereovision with Zero Annotation
Authors:
Isabella Liu,
Edward Yang,
Jianyu Tao,
Rui Chen,
Xiaoshuai Zhang,
Qing Ran,
Zhu Liu,
Hao Su
Abstract:
Traditional depth sensors generate accurate real-world depth estimates that surpass even the most advanced learning approaches trained only on simulation domains. Since ground truth depth is readily available in the simulation domain but quite difficult to obtain in the real domain, we propose a method that leverages the best of both worlds. In this paper, we present a new framework, ActiveZero, which is a mixed domain learning solution for active stereovision systems that requires no real-world depth annotation. First, we demonstrate the transferability of our method to out-of-distribution real data by using a mixed domain learning strategy. In the simulation domain, we use a combination of supervised disparity loss and self-supervised losses on a shape primitives dataset. By contrast, in the real domain, we only use self-supervised losses on a dataset that is out-of-distribution from either training simulation data or test real data. Second, our method introduces a novel self-supervised loss called temporal IR reprojection to increase the robustness and accuracy of our reprojections in hard-to-perceive regions. Finally, we show how the method can be trained end-to-end and that each module is important for attaining the end result. Extensive qualitative and quantitative evaluations on real data demonstrate state-of-the-art results that can even beat a commercial depth sensor.
Submitted 5 December, 2021;
originally announced December 2021.
-
WeChat Neural Machine Translation Systems for WMT21
Authors:
Xianfeng Zeng,
Yijin Liu,
Ernan Li,
Qiu Ran,
Fandong Meng,
Peng Li,
Jinan Xu,
Jie Zhou
Abstract:
This paper introduces WeChat AI's participation in the WMT 2021 shared news translation task on English->Chinese, English->Japanese, Japanese->English and English->German. Our systems are based on the Transformer (Vaswani et al., 2017) with several novel and effective variants. In our experiments, we employ data filtering, large-scale synthetic data generation (i.e., back-translation, knowledge distillation, forward-translation, iterative in-domain knowledge transfer), advanced finetuning approaches, and a boosted Self-BLEU based model ensemble. Our constrained systems achieve 36.9, 46.9, 27.8 and 31.3 case-sensitive BLEU scores on English->Chinese, English->Japanese, Japanese->English and English->German, respectively. The BLEU scores of English->Chinese, English->Japanese and Japanese->English are the highest among all submissions, and that of English->German is the highest among all constrained submissions.
Submitted 13 September, 2021; v1 submitted 5 August, 2021;
originally announced August 2021.
-
Depth Evaluation for Metal Surface Defects by Eddy Current Testing using Deep Residual Convolutional Neural Networks
Authors:
Tian Meng,
Yang Tao,
Ziqi Chen,
Jorge R. Salas Avila,
Qiaoye Ran,
Yuchun Shao,
Ruochen Huang,
Yuedong Xie,
Qian Zhao,
Zhijie Zhang,
Hujun Yin,
Anthony J. Peyton,
Wuliang Yin
Abstract:
Eddy current testing (ECT) is an effective technique for evaluating the depth of metal surface defects. However, in practice, the evaluation primarily relies on the experience of an operator and is often carried out by manual inspection. In this paper, we address the challenges of automatic depth evaluation of metal surface defects by virtue of state-of-the-art deep learning (DL) techniques. The main contributions are three-fold. Firstly, a highly integrated portable ECT device is developed, which takes advantage of an advanced field programmable gate array (Zynq-7020 system on chip) and provides fast data acquisition and in-phase/quadrature demodulation. Secondly, a dataset, termed MDDECT, is constructed by human operators using the ECT device and made openly available. It contains 48,000 scans from 18 defects of different depths and lift-offs. Thirdly, the depth evaluation problem is formulated as a time series classification problem, and various state-of-the-art 1-d residual convolutional neural networks are trained and evaluated on the MDDECT dataset. A 38-layer 1-d ResNeXt achieves an accuracy of 93.58% in discriminating the surface defects in a stainless steel sheet. The depths of the defects vary from 0.3 mm to 2.0 mm at a resolution of 0.1 mm. In addition, results show that the trained ResNeXt1D-38 model is immune to lift-off signals.
Submitted 8 March, 2021;
originally announced April 2021.
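The abstract above mentions in-phase/quadrature (I/Q) demodulation without describing how the device implements it. The following is a generic software sketch of the textbook technique: multiply the sampled signal by cosine and sine references at the excitation frequency and average. It is not the authors' FPGA design; the function name `iq_demodulate` and all parameter names are hypothetical.

```python
import math


def iq_demodulate(samples, signal_hz, sample_rate_hz):
    """Return (I, Q) of a sinusoid A*cos(2*pi*f*t + phi) sampled uniformly.

    Assumes the samples span a whole number of signal periods, so the
    double-frequency mixing products average out. With the 2/n scaling,
    I = A*cos(phi) and Q = A*sin(phi).
    """
    n = len(samples)
    if n == 0:
        raise ValueError("no samples")
    i_sum = 0.0
    q_sum = 0.0
    for k, s in enumerate(samples):
        ref_phase = 2.0 * math.pi * signal_hz * k / sample_rate_hz
        i_sum += s * math.cos(ref_phase)   # in-phase mixing
        q_sum -= s * math.sin(ref_phase)   # quadrature mixing
    return 2.0 * i_sum / n, 2.0 * q_sum / n
```

The amplitude and phase of the measured signal then follow as `math.hypot(i, q)` and `math.atan2(q, i)`.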
-
Learning to Recover from Multi-Modality Errors for Non-Autoregressive Neural Machine Translation
Authors:
Qiu Ran,
Yankai Lin,
Peng Li,
Jie Zhou
Abstract:
Non-autoregressive neural machine translation (NAT) predicts the entire target sequence simultaneously and significantly accelerates the inference process. However, NAT discards the dependency information in a sentence, and thus inevitably suffers from the multi-modality problem: the target tokens may be provided by different possible translations, often causing repeated or missing tokens. To alleviate this problem, we propose a novel semi-autoregressive model, RecoverSAT, which generates a translation as a sequence of segments. The segments are generated simultaneously while each segment is predicted token-by-token. By dynamically determining segment length and deleting repetitive segments, RecoverSAT is capable of recovering from repetitive and missing token errors. Experimental results on three widely-used benchmark datasets show that our proposed model achieves more than 4$\times$ speedup while maintaining performance comparable to that of the corresponding autoregressive model.
Submitted 9 June, 2020;
originally announced June 2020.
-
Guiding Non-Autoregressive Neural Machine Translation Decoding with Reordering Information
Authors:
Qiu Ran,
Yankai Lin,
Peng Li,
Jie Zhou
Abstract:
Non-autoregressive neural machine translation (NAT) generates each target word in parallel and has achieved promising inference acceleration. However, existing NAT models still lag well behind autoregressive neural machine translation models in translation quality due to the enormous decoding space. To address this problem, we propose a novel NAT framework named ReorderNAT, which explicitly models the reordering information in the decoding procedure. We further introduce deterministic and non-deterministic decoding strategies that utilize reordering information to narrow the decoding search space in our proposed ReorderNAT. Experimental results on various widely-used datasets show that our proposed model achieves better performance than existing NAT models, and even achieves translation quality comparable to that of autoregressive translation models with a significant speedup.
Submitted 16 December, 2020; v1 submitted 6 November, 2019;
originally announced November 2019.
-
NumNet: Machine Reading Comprehension with Numerical Reasoning
Authors:
Qiu Ran,
Yankai Lin,
Peng Li,
Jie Zhou,
Zhiyuan Liu
Abstract:
Numerical reasoning, such as addition, subtraction, sorting and counting, is a critical skill in human reading comprehension, which has not been well considered in existing machine reading comprehension (MRC) systems. To address this issue, we propose a numerical MRC model named NumNet, which utilizes a numerically-aware graph neural network to consider the comparing information and performs numerical reasoning over numbers in the question and passage. Our system achieves an EM-score of 64.56% on the DROP dataset, outperforming all existing machine reading comprehension models by considering the numerical relations among numbers.
Submitted 15 October, 2019;
originally announced October 2019.
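The abstract above says the graph neural network considers "comparing information" among numbers but does not spell out the graph construction. One plausible reading, sketched here purely as an assumption, is a directed graph whose nodes are the numbers and whose edges are typed by their magnitude comparison. The function name `comparison_edges` and the two edge types are illustrative, not taken from the abstract.

```python
def comparison_edges(numbers):
    """Build typed directed edges between every ordered pair of numbers.

    Returns two lists of (source_index, target_index) pairs:
    one for "source > target" and one for "source <= target".
    """
    greater = []
    lower_or_equal = []
    for i, a in enumerate(numbers):
        for j, b in enumerate(numbers):
            if i == j:
                continue  # no self-loops
            if a > b:
                greater.append((i, j))
            else:
                lower_or_equal.append((i, j))
    return greater, lower_or_equal
```

For example, the numbers `[3, 7, 7]` yield the "greater" edges `[(1, 0), (2, 0)]`, with every other ordered pair falling into the "lower or equal" list.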
-
Option Comparison Network for Multiple-choice Reading Comprehension
Authors:
Qiu Ran,
Peng Li,
Weiwei Hu,
Jie Zhou
Abstract:
Multiple-choice reading comprehension (MCRC) is the task of selecting the correct answer from multiple options given a question and an article. Existing MCRC models typically either read each option independently or compute a fixed-length representation for each option before comparing them. However, humans typically compare the options at multiple levels of granularity before reading the article in detail to make reasoning more efficient. Mimicking humans, we propose an option comparison network (OCN) for MCRC, which compares options at the word level to better identify their correlations and thereby help reasoning. Specifically, each option is encoded into a vector sequence using a skimmer to retain as much fine-grained information as possible. An attention mechanism is leveraged to compare these sequences vector-by-vector to identify more subtle correlations between options, which is potentially valuable for reasoning. Experimental results on the human English exam MCRC dataset RACE show that our model outperforms existing methods significantly. Moreover, it is also the first model that surpasses Amazon Mechanical Turker performance on the whole dataset.
Submitted 7 March, 2019;
originally announced March 2019.