Search | arXiv e-print repository

High-Resolution Volumetric Reconstruction for Clothed Humans

Authors: Sicong Tang, Guangyuan Wang, Qing Ran, Lingzhi Li, Li Shen, ** Tan

Abstract: We present a novel method for reconstructing clothed humans from a sparse set of, e.g., 1 to 6 RGB images. Despite impressive results from recent works employing deep implicit representation, we revisit the volumetric approach and demonstrate that better performance can be achieved with proper system design. The volumetric representation offers significant advantages in leveraging 3D spatial context through 3D convolutions, and the notorious quantization error is largely negligible with a reasonably large yet affordable volume resolution, e.g., 512. To handle memory and computation costs, we propose a sophisticated coarse-to-fine strategy with voxel culling and subspace sparse convolution. Our method starts with a discretized visual hull to compute a coarse shape and then focuses on a narrow band around the coarse shape for refinement. Once the shape is reconstructed, we adopt an image-based rendering approach, which computes the colors of surface points by blending input images with learned weights. Extensive experimental results show that our method reduces the mean point-to-surface (P2S) error of state-of-the-art methods by more than 50%, achieving approximately 2mm accuracy with a 512 volume resolution. Additionally, images rendered from our textured model achieve a higher peak signal-to-noise ratio (PSNR) compared to state-of-the-art methods.
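For readers unfamiliar with the PSNR metric named in the abstract above, the following is a minimal sketch of its standard definition, 10 * log10(MAX^2 / MSE). The function name `psnr`, the flat pixel lists, and the default 8-bit peak value of 255 are illustrative assumptions; the abstract does not describe the authors' evaluation code.

```python
import math


def psnr(reference, rendered, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel sequences.

    Hypothetical helper for illustration only; max_value=255 assumes 8-bit images.
    """
    if len(reference) != len(rendered) or len(reference) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error over all pixel values.
    mse = sum((a - b) ** 2 for a, b in zip(reference, rendered)) / len(reference)
    if mse == 0:
        # Identical images: PSNR is unbounded.
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / mse)
```

A higher value means the rendered image is closer to the reference; identical inputs give infinity.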

Submitted 25 July, 2023; originally announced July 2023.

arXiv:2304.00708

doi 10.1002/sat.1496

Links Assignment Scheme based on Potential Edges Importance in Dual-layer Wavelength Routing Optical Satellite Networks

Authors: **gkai Yang, Qiwen Ran, Hongyu Wu, **g Ma

Abstract: With the development of massive satellite constellations and on-orbit laser-based communication equipment, the wavelength routing optical satellite network (WROSN) has become a potential solution for on-orbit, high-capacity, and high-speed communication. Since the inter-satellite links (ISLs) are time-varying, one of the fundamental considerations in the construction of the WROSN is assigning limited laser communication terminals for each satellite to establish ISLs with the visible satellites. Therefore, we propose a links assignment scheme (LAS) based on the potential edges importance matrix (PEIM) algorithm to construct a temporarily stable topology of the ISLs for a dual-layer constellation. The simulation results showed that the LAS based on the PEIM algorithm is better than LAS based on the random or Greedy algorithm in terms of node-to-node distance, node pair connectivity, wavelength demand, and transmission delay. Node pair connectivity and wavelength demand in the WROSN present a trade-off. The research in this paper also offers a novel way to reduce the cost of on-board resources, namely designing the topology of the ISLs with a links assignment algorithm.

Submitted 3 April, 2023; originally announced April 2023.

Comments: This is the manuscript version that was submitted to the International Journal of Satellite Communications and Networking (SAT-23-0018)

Journal ref: Int J Satell Commun Network. 2023; 1-12 (2023)

arXiv:2201.02861

Decoupling Makes Weakly Supervised Local Feature Better

Authors: Kunhong Li, Longguang Wang, Li Liu, Qing Ran, Kai Xu, Yulan Guo

Abstract: Weakly supervised learning can help local feature methods to overcome the obstacle of acquiring a large-scale dataset with densely labeled correspondences. However, since weak supervision cannot distinguish the losses caused by the detection and description steps, directly conducting weakly supervised learning within a joint describe-then-detect pipeline suffers from limited performance. In this paper, we propose a decoupled describe-then-detect pipeline tailored for weakly supervised local feature learning. Within our pipeline, the detection step is decoupled from the description step and postponed until discriminative and robust descriptors are learned. In addition, we introduce a line-to-window search strategy to explicitly use the camera pose information for better descriptor learning. Extensive experiments show that our method, namely PoSFeat (Camera Pose Supervised Feature), outperforms previous fully and weakly supervised methods and achieves state-of-the-art performance on a wide range of downstream tasks.

Submitted 25 March, 2022; v1 submitted 8 January, 2022; originally announced January 2022.

Comments: CVPR2022

arXiv:2112.02772

ActiveZero: Mixed Domain Learning for Active Stereovision with Zero Annotation

Authors: Isabella Liu, Edward Yang, Jianyu Tao, Rui Chen, Xiaoshuai Zhang, Qing Ran, Zhu Liu, Hao Su

Abstract: Traditional depth sensors generate accurate real world depth estimates that surpass even the most advanced learning approaches trained only on simulation domains. Since ground truth depth is readily available in the simulation domain but quite difficult to obtain in the real domain, we propose a method that leverages the best of both worlds. In this paper we present a new framework, ActiveZero, which is a mixed domain learning solution for active stereovision systems that requires no real world depth annotation. First, we demonstrate the transferability of our method to out-of-distribution real data by using a mixed domain learning strategy. In the simulation domain, we use a combination of supervised disparity loss and self-supervised losses on a shape primitives dataset. By contrast, in the real domain, we only use self-supervised losses on a dataset that is out-of-distribution from either training simulation data or test real data. Second, our method introduces a novel self-supervised loss called temporal IR reprojection to increase the robustness and accuracy of our reprojections in hard-to-perceive regions. Finally, we show how the method can be trained end-to-end and that each module is important for attaining the end result. Extensive qualitative and quantitative evaluations on real data demonstrate state of the art results that can even beat a commercial depth sensor.

Submitted 5 December, 2021; originally announced December 2021.

arXiv:2108.02401

WeChat Neural Machine Translation Systems for WMT21

Authors: Xianfeng Zeng, Yi** Liu, Ernan Li, Qiu Ran, Fandong Meng, Peng Li, **an Xu, Jie Zhou

Abstract: This paper introduces WeChat AI's participation in the WMT 2021 shared news translation task on English->Chinese, English->Japanese, Japanese->English and English->German. Our systems are based on the Transformer (Vaswani et al., 2017) with several novel and effective variants. In our experiments, we employ data filtering, large-scale synthetic data generation (i.e., back-translation, knowledge distillation, forward-translation, iterative in-domain knowledge transfer), advanced finetuning approaches, and boosted Self-BLEU based model ensemble. Our constrained systems achieve 36.9, 46.9, 27.8 and 31.3 case-sensitive BLEU scores on English->Chinese, English->Japanese, Japanese->English and English->German, respectively. The BLEU scores of English->Chinese, English->Japanese and Japanese->English are the highest among all submissions, and that of English->German is the highest among all constrained submissions.

Submitted 13 September, 2021; v1 submitted 5 August, 2021; originally announced August 2021.

Comments: Accepted by WMT 2021 as a system paper

arXiv:2104.02472

Depth Evaluation for Metal Surface Defects by Eddy Current Testing using Deep Residual Convolutional Neural Networks

Authors: Tian Meng, Yang Tao, Ziqi Chen, Jorge R. Salas Avila, Qiaoye Ran, Yuchun Shao, Ruochen Huang, Yuedong Xie, Qian Zhao, Zhijie Zhang, Hujun Yin, Anthony J. Peyton, Wuliang Yin

Abstract: Eddy current testing (ECT) is an effective technique in the evaluation of the depth of metal surface defects. However, in practice, the evaluation primarily relies on the experience of an operator and is often carried out by manual inspection. In this paper, we address the challenges of automatic depth evaluation of metal surface defects by virtue of state-of-the-art deep learning (DL) techniques. The main contributions are three-fold. Firstly, a highly-integrated portable ECT device is developed, which takes advantage of an advanced field programmable gate array (Zynq-7020 system on chip) and provides fast data acquisition and in-phase/quadrature demodulation. Secondly, a dataset, termed MDDECT, is constructed using the ECT device by human operators and made openly available. It contains 48,000 scans from 18 defects of different depths and lift-offs. Thirdly, the depth evaluation problem is formulated as a time series classification problem, and various state-of-the-art 1-d residual convolutional neural networks are trained and evaluated on the MDDECT dataset. A 38-layer 1-d ResNeXt achieves an accuracy of 93.58% in discriminating the surface defects in a stainless steel sheet. The depths of the defects vary from 0.3 mm to 2.0 mm at a resolution of 0.1 mm. In addition, results show that the trained ResNeXt1D-38 model is immune to lift-off signals.

Submitted 8 March, 2021; originally announced April 2021.

arXiv:2006.05165

Learning to Recover from Multi-Modality Errors for Non-Autoregressive Neural Machine Translation

Authors: Qiu Ran, Yankai Lin, Peng Li, Jie Zhou

Abstract: Non-autoregressive neural machine translation (NAT) predicts the entire target sequence simultaneously and significantly accelerates the inference process. However, NAT discards the dependency information in a sentence, and thus inevitably suffers from the multi-modality problem: the target tokens may be provided by different possible translations, often causing repeated or missing tokens. To alleviate this problem, we propose a novel semi-autoregressive model RecoverSAT in this work, which generates a translation as a sequence of segments. The segments are generated simultaneously while each segment is predicted token-by-token. By dynamically determining segment length and deleting repetitive segments, RecoverSAT is capable of recovering from repetitive and missing token errors. Experimental results on three widely-used benchmark datasets show that our proposed model achieves more than 4$\times$ speedup while maintaining comparable performance compared with the corresponding autoregressive model.

Submitted 9 June, 2020; originally announced June 2020.

Comments: This work has been accepted for publication at ACL2020

arXiv:1911.02215

Guiding Non-Autoregressive Neural Machine Translation Decoding with Reordering Information

Authors: Qiu Ran, Yankai Lin, Peng Li, Jie Zhou

Abstract: Non-autoregressive neural machine translation (NAT) generates each target word in parallel and has achieved promising inference acceleration. However, existing NAT models still have a big gap in translation quality compared to autoregressive neural machine translation models due to the enormous decoding space. To address this problem, we propose a novel NAT framework named ReorderNAT which explicitly models the reordering information in the decoding procedure. We further introduce deterministic and non-deterministic decoding strategies that utilize reordering information to narrow the decoding search space in our proposed ReorderNAT. Experimental results on various widely-used datasets show that our proposed model achieves better performance compared to existing NAT models, and even achieves translation quality comparable to autoregressive translation models with a significant speedup.

Submitted 16 December, 2020; v1 submitted 6 November, 2019; originally announced November 2019.

Comments: Accepted by AAAI 2021

arXiv:1910.06701

NumNet: Machine Reading Comprehension with Numerical Reasoning

Authors: Qiu Ran, Yankai Lin, Peng Li, Jie Zhou, Zhiyuan Liu

Abstract: Numerical reasoning, such as addition, subtraction, sorting and counting, is a critical skill in human reading comprehension, which has not been well considered in existing machine reading comprehension (MRC) systems. To address this issue, we propose a numerical MRC model named NumNet, which utilizes a numerically-aware graph neural network to consider the comparison information and performs numerical reasoning over numbers in the question and passage. Our system achieves an EM-score of 64.56% on the DROP dataset, outperforming all existing machine reading comprehension models by considering the numerical relations among numbers.
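As a rough illustration of what "comparison information" among numbers could look like as graph structure, the sketch below builds two directed edge lists over the numbers found in a question and passage. The function name `comparison_edges` and the choice of two edge types (greater-than, and lower-or-equal) are assumptions made for this example; the abstract does not specify how the graph is constructed.

```python
def comparison_edges(numbers):
    """Build directed comparison edges between every ordered pair of numbers.

    Hypothetical sketch: nodes are positions in `numbers`; an edge (i, j) is
    placed in `greater` when numbers[i] > numbers[j], otherwise in
    `lower_or_equal`. Self-pairs are skipped.
    """
    greater = []
    lower_or_equal = []
    for i, a in enumerate(numbers):
        for j, b in enumerate(numbers):
            if i == j:
                continue
            if a > b:
                greater.append((i, j))
            else:
                lower_or_equal.append((i, j))
    return greater, lower_or_equal
```

A graph neural network could then pass messages along each edge type separately, which is one plausible way to make the node representations aware of relative magnitude.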

Submitted 15 October, 2019; originally announced October 2019.

Comments: Accepted to EMNLP2019; 11 pages, 2 figures, 6 tables

arXiv:1903.03033

Option Comparison Network for Multiple-choice Reading Comprehension

Authors: Qiu Ran, Peng Li, Weiwei Hu, Jie Zhou

Abstract: Multiple-choice reading comprehension (MCRC) is the task of selecting the correct answer from multiple options given a question and an article. Existing MCRC models typically either read each option independently or compute a fixed-length representation for each option before comparing them. However, humans typically compare the options at multiple levels of granularity before reading the article in detail to make reasoning more efficient. Mimicking humans, we propose an option comparison network (OCN) for MCRC which compares options at word-level to better identify their correlations to help reasoning. Specifically, each option is encoded into a vector sequence using a skimmer to retain fine-grained information as much as possible. An attention mechanism is leveraged to compare these sequences vector-by-vector to identify more subtle correlations between options, which is potentially valuable for reasoning. Experimental results on the human English exam MCRC dataset RACE show that our model outperforms existing methods significantly. Moreover, it is also the first model that surpasses Amazon Mechanical Turker performance on the whole dataset.

Submitted 7 March, 2019; originally announced March 2019.

Comments: 6 pages, 2 tables

Showing 1–10 of 10 results for author: Ran, Q