Search | arXiv e-print repository

Mani-GS: Gaussian Splatting Manipulation with Triangular Mesh

Authors: Xiangjun Gao, Xiaoyu Li, Yiyu Zhuang, Qi Zhang, Wenbo Hu, Chaopeng Zhang, Yao Yao, Ying Shan, Long Quan

Abstract: Neural 3D representations such as Neural Radiance Fields (NeRF) excel at producing photo-realistic rendering results but lack the flexibility for manipulation and editing, which is crucial for content creation. Previous works have attempted to address this issue by deforming a NeRF in canonical space or manipulating the radiance field based on an explicit mesh. However, manipulating NeRF is not highly controllable and requires a long training and inference time. With the emergence of 3D Gaussian Splatting (3DGS), extremely high-fidelity novel view synthesis can be achieved using an explicit point-based 3D representation with much faster training and rendering speed. However, there is still a lack of effective means to manipulate 3DGS freely while maintaining rendering quality. In this work, we aim to tackle the challenge of achieving manipulable photo-realistic rendering. We propose to utilize a triangular mesh to manipulate 3DGS directly with self-adaptation. This approach reduces the need to design various algorithms for different types of Gaussian manipulation. By utilizing a triangle shape-aware Gaussian binding and adapting method, we can achieve 3DGS manipulation and preserve high-fidelity rendering after manipulation. Our approach is capable of handling large deformations, local manipulations, and soft body simulations while keeping high-quality rendering. Furthermore, we demonstrate that our method is also effective with inaccurate meshes extracted from 3DGS. Experiments conducted demonstrate the effectiveness of our method and its superiority over baseline approaches.

Submitted 28 May, 2024; originally announced May 2024.

Comments: Project page here: https://gaoxiangjun.github.io/mani_gs/

arXiv:2405.13874 [pdf, other]

Affine-based Deformable Attention and Selective Fusion for Semi-dense Matching

Authors: Hongkai Chen, Zixin Luo, Yurun Tian, Xuyang Bai, Ziyu Wang, Lei Zhou, Mingmin Zhen, Tian Fang, David McKinnon, Yanghai Tsin, Long Quan

Abstract: Identifying robust and accurate correspondences across images is a fundamental problem in computer vision that enables various downstream tasks. Recent semi-dense matching methods emphasize the effectiveness of fusing relevant cross-view information through Transformer. In this paper, we propose several improvements upon this paradigm. Firstly, we introduce affine-based local attention to model cross-view deformations. Secondly, we present selective fusion to merge local and global messages from cross attention. Apart from network structure, we also identify the importance of enforcing spatial smoothness in loss design, which has been omitted by previous works. Based on these augmentations, our network demonstrates strong matching capacity under different settings. The full version of our network achieves state-of-the-art performance among semi-dense matching methods at a similar cost to LoFTR, while the slim version reaches LoFTR baseline's performance with only 15% computation cost and 18% parameters.

Submitted 22 May, 2024; originally announced May 2024.

Comments: Accepted to CVPR2024 Image Matching Workshop

arXiv:2403.17288 [pdf, other]

Sparse-Graph-Enabled Formation Planning for Large-Scale Aerial Swarms

Authors: Yuan Zhou, Lun Quan, Chao Xu, Guangtong Xu, Fei Gao

Abstract: Formation trajectory planning using complete graphs to model collaborative constraints becomes computationally intractable as the number of drones increases due to the curse of dimensionality. To tackle this issue, this paper presents a sparse graph construction method for formation planning to realize a better efficiency-performance trade-off. Firstly, a sparsification mechanism for complete graphs is designed to ensure the global rigidity of sparsified graphs, which is a necessary condition for uniquely corresponding to a geometric shape. Secondly, a good sparse graph is constructed to preserve the main structural feature of complete graphs sufficiently. Since the graph-based formation constraint is described by the Laplacian matrix, the sparse graph construction problem is equivalent to submatrix selection, which has combinatorial time complexity and needs a scoring metric. Via comparative simulations, the Max-Trace matrix-revealing metric shows promising performance. The sparse graph is integrated into the formation planning. Simulation results with 72 drones in complex environments demonstrate that when preserving 30% of connection edges, our method has comparable formation error and recovery performance w.r.t. complete graphs. Meanwhile, the planning efficiency is improved by approximately an order of magnitude. Benchmark comparisons and ablation studies are conducted to fully validate the merits of our method.

Submitted 25 March, 2024; originally announced March 2024.

arXiv:2311.17123 [pdf, other]

ConTex-Human: Free-View Rendering of Human from a Single Image with Texture-Consistent Synthesis

Authors: Xiangjun Gao, Xiaoyu Li, Chaopeng Zhang, Qi Zhang, Yanpei Cao, Ying Shan, Long Quan

Abstract: In this work, we propose a method to address the challenge of rendering a 3D human from a single image in a free-view manner. Some existing approaches could achieve this by using generalizable pixel-aligned implicit fields to reconstruct a textured mesh of a human or by employing a 2D diffusion model as guidance with the Score Distillation Sampling (SDS) method, to lift the 2D image into 3D space. However, a generalizable implicit field often results in an over-smooth texture field, while the SDS method tends to lead to a texture-inconsistent novel view with the input image. In this paper, we introduce a texture-consistent back view synthesis module that could transfer the reference image content to the back view through depth and text-guided attention injection. Moreover, to alleviate the color distortion that occurs in the side region, we propose a visibility-aware patch consistency regularization for texture mapping and refinement combined with the synthesized back view texture. With the above techniques, we could achieve high-fidelity and texture-consistent human rendering from a single image. Experiments conducted on both real and synthetic data demonstrate the effectiveness of our method and show that our approach outperforms previous baseline methods.

Submitted 28 November, 2023; originally announced November 2023.

Comments: see project page: https://gaoxiangjun.github.io/contex_human/

arXiv:2311.15980 [pdf, other]

Direct2.5: Diverse Text-to-3D Generation via Multi-view 2.5D Diffusion

Authors: Yuanxun Lu, Jingyang Zhang, Shiwei Li, Tian Fang, David McKinnon, Yanghai Tsin, Long Quan, Xun Cao, Yao Yao

Abstract: Recent advances in generative AI have unveiled significant potential for the creation of 3D content. However, current methods either apply a pre-trained 2D diffusion model with the time-consuming score distillation sampling (SDS), or a direct 3D diffusion model trained on limited 3D data, losing generation diversity. In this work, we approach the problem by employing a multi-view 2.5D diffusion fine-tuned from a pre-trained 2D diffusion model. The multi-view 2.5D diffusion directly models the structural distribution of 3D data, while still maintaining the strong generalization ability of the original 2D diffusion model, filling the gap between 2D diffusion-based and direct 3D diffusion-based methods for 3D content generation. During inference, multi-view normal maps are generated using the 2.5D diffusion, and a novel differentiable rasterization scheme is introduced to fuse the almost consistent multi-view normal maps into a consistent 3D model. We further design a normal-conditioned multi-view image generation module for fast appearance generation given the 3D geometry. Our method is a one-pass diffusion process and does not require any SDS optimization as post-processing. We demonstrate through extensive experiments that our direct 2.5D generation with the specially-designed fusion scheme can achieve diverse, mode-seeking-free, and high-fidelity 3D content generation in only 10 seconds. Project page: https://nju-3dv.github.io/projects/direct25.

Submitted 21 March, 2024; v1 submitted 27 November, 2023; originally announced November 2023.

Comments: CVPR 2024 camera ready, including more evaluations and discussions. Project webpage: https://nju-3dv.github.io/projects/direct25

arXiv:2311.07100 [pdf, other]

Collaborative Planning for Catching and Transporting Objects in Unstructured Environments

Authors: Liuao Pei, Junxiao Lin, Zhichao Han, Lun Quan, Yanjun Cao, Chao Xu, Fei Gao

Abstract: Multi-robot teams have attracted attention from industry and academia for their ability to perform collaborative tasks in unstructured environments, such as wilderness rescue and collaborative transportation. In this paper, we propose a trajectory planning method for a non-holonomic robotic team with collaboration in unstructured environments. For the adaptive state collaboration of a robot team to catch and transport targets to be rescued using a net, we model the process of catching the falling target with a net in a continuous and differentiable form. This enables the robot team to fully exploit the kinematic potential, thereby adaptively catching the target in an appropriate state. Furthermore, the size safety and topological safety of the net, resulting from the collaborative support of the robots, are guaranteed through geometric constraints. We integrate our algorithm on a car-like robot team and test it in simulations and real-world experiments to validate our performance. Our method is compared to state-of-the-art multi-vehicle trajectory planning methods, demonstrating significant performance in efficiency and trajectory quality.

Submitted 13 November, 2023; originally announced November 2023.

arXiv:2310.06744 [pdf, other]

HiFi-123: Towards High-fidelity One Image to 3D Content Generation

Authors: Wangbo Yu, Li Yuan, Yan-Pei Cao, Xiangjun Gao, Xiaoyu Li, Wenbo Hu, Long Quan, Ying Shan, Yonghong Tian

Abstract: Recent advances in diffusion models have enabled 3D generation from a single image. However, current methods often produce suboptimal results for novel views, with blurred textures and deviations from the reference image, limiting their practical applications. In this paper, we introduce HiFi-123, a method designed for high-fidelity and multi-view consistent 3D generation. Our contributions are twofold: First, we propose a Reference-Guided Novel View Enhancement (RGNV) technique that significantly improves the fidelity of diffusion-based zero-shot novel view synthesis methods. Second, capitalizing on the RGNV, we present a novel Reference-Guided State Distillation (RGSD) loss. When incorporated into the optimization-based image-to-3D pipeline, our method significantly improves 3D generation quality, achieving state-of-the-art performance. Comprehensive evaluations demonstrate the effectiveness of our approach over existing methods, both qualitatively and quantitatively. Video results are available on the project page.

Submitted 25 March, 2024; v1 submitted 10 October, 2023; originally announced October 2023.

Comments: Project Page: https://drexubery.github.io/HiFi-123/

arXiv:2310.06347 [pdf, other]

JointNet: Extending Text-to-Image Diffusion for Dense Distribution Modeling

Authors: Jingyang Zhang, Shiwei Li, Yuanxun Lu, Tian Fang, David McKinnon, Yanghai Tsin, Long Quan, Yao Yao

Abstract: We introduce JointNet, a novel neural network architecture for modeling the joint distribution of images and an additional dense modality (e.g., depth maps). JointNet is extended from a pre-trained text-to-image diffusion model, where a copy of the original network is created for the new dense modality branch and is densely connected with the RGB branch. The RGB branch is locked during network fine-tuning, which enables efficient learning of the new modality distribution while maintaining the strong generalization ability of the large-scale pre-trained diffusion model. We demonstrate the effectiveness of JointNet by using RGBD diffusion as an example and through extensive experiments, showcasing its applicability in a variety of applications, including joint RGBD generation, dense depth prediction, depth-conditioned image generation, and coherent tile-based 3D panorama generation.

Submitted 10 October, 2023; originally announced October 2023.

arXiv:2309.14677 [pdf, other]

XGV-BERT: Leveraging Contextualized Language Model and Graph Neural Network for Efficient Software Vulnerability Detection

Authors: Vu Le Anh Quan, Chau Thuan Phat, Kiet Van Nguyen, Phan The Duy, Van-Hau Pham

Abstract: With the advancement of deep learning (DL) in various fields, there are many attempts to reveal software vulnerabilities by data-driven approaches. Nonetheless, such existing works lack an effective representation that can retain the non-sequential semantic characteristics and contextual relationship of source code attributes. Hence, in this work, we propose XGV-BERT, a framework that combines the pre-trained CodeBERT model and Graph Neural Network (GCN) to detect software vulnerabilities. By jointly training the CodeBERT and GCN modules within XGV-BERT, the proposed model leverages the advantages of large-scale pre-training, harnessing vast raw data, and transfer learning by learning representations for training data through graph convolution. The research results demonstrate that the XGV-BERT method significantly improves vulnerability detection accuracy compared to two existing methods, VulDeePecker and SySeVR. For the VulDeePecker dataset, XGV-BERT achieves an impressive F1-score of 97.5%, significantly outperforming VulDeePecker, which achieved an F1-score of 78.3%. Again, with the SySeVR dataset, XGV-BERT achieves an F1-score of 95.5%, surpassing the results of SySeVR with an F1-score of 83.5%.

Submitted 26 September, 2023; originally announced September 2023.

arXiv:2303.17147 [pdf, other]

NeILF++: Inter-Reflectable Light Fields for Geometry and Material Estimation

Authors: Jingyang Zhang, Yao Yao, Shiwei Li, Jingbo Liu, Tian Fang, David McKinnon, Yanghai Tsin, Long Quan

Abstract: We present a novel differentiable rendering framework for joint geometry, material, and lighting estimation from multi-view images. In contrast to previous methods which assume a simplified environment map or co-located flashlights, in this work, we formulate the lighting of a static scene as one neural incident light field (NeILF) and one outgoing neural radiance field (NeRF). The key insight of the proposed method is the union of the incident and outgoing light fields through physically-based rendering and inter-reflections between surfaces, making it possible to disentangle the scene geometry, material, and lighting from image observations in a physically-based manner. The proposed incident light and inter-reflection framework can be easily applied to other NeRF systems. We show that our method can not only decompose the outgoing radiance into incident lights and surface materials, but also serve as a surface refinement module that further improves the reconstruction detail of the neural surface. We demonstrate on several datasets that the proposed method is able to achieve state-of-the-art results in terms of geometry reconstruction quality, material estimation accuracy, and the fidelity of novel view rendering.

Submitted 30 March, 2023; originally announced March 2023.

Comments: Project page: https://yoyo000.github.io/NeILF_pp

arXiv:2210.04048 [pdf, other]

Robust and Efficient Trajectory Planning for Formation Flight in Dense Environments

Authors: Lun Quan, Longji Yin, Tingrui Zhang, Mingyang Wang, Ruilin Wang, Sheng Zhong, Zhou Xin, Yanjun Cao, Chao Xu, Fei Gao

Abstract: Formation flight has a vast potential for aerial robot swarms in various applications. However, existing methods lack the capability to achieve fully autonomous large-scale formation flight in dense environments. To bridge the gap, we present a complete formation flight system that effectively integrates real-world constraints into aerial formation navigation. This paper proposes a differentiable graph-based metric to quantify the overall similarity error between formations. This metric is invariant to rotation, translation, and scaling, providing more freedom for formation coordination. We design a distributed trajectory optimization framework that considers formation similarity, obstacle avoidance, and dynamic feasibility. The optimization is decoupled to make large-scale formation flights computationally feasible. To improve the elasticity of formation navigation in highly constrained scenes, we present a swarm reorganization method that adaptively adjusts the formation parameters and task assignments by generating local navigation goals. A novel swarm agreement strategy called global-remap-local-replan and a formation-level path planner are proposed in this work to coordinate the global planning and local trajectory optimizations. To validate the proposed method, we design comprehensive benchmarks and simulations with other cutting-edge works in terms of adaptability, predictability, elasticity, resilience, and efficiency. Finally, integrated with palm-sized swarm platforms with onboard computers and sensors, the proposed method demonstrates its efficiency and robustness by achieving the largest scale formation flight in dense outdoor environments.

Submitted 6 August, 2023; v1 submitted 8 October, 2022; originally announced October 2022.

Comments: Accepted for IEEE Transactions on Robotics

arXiv:2209.04791 [pdf, other]

doi 10.1145/3551349.3560427

Towards Understanding the Faults of JavaScript-Based Deep Learning Systems

Authors: Lili Quan, Qianyu Guo, Xiaofei Xie, Sen Chen, Xiaohong Li, Yang Liu

Abstract: Quality assurance is of great importance for deep learning (DL) systems, especially when they are applied in safety-critical applications. While quality issues of native DL applications have been extensively analyzed, the issues of JavaScript-based DL applications have never been systematically studied. Compared with native DL applications, JavaScript-based DL applications can run on major browsers, making them platform- and device-independent. Specifically, the quality of JavaScript-based DL applications depends on three parts: the application, the third-party DL library used, and the underlying DL framework (e.g., TensorFlow.js), together called a JavaScript-based DL system. In this paper, we conduct the first empirical study on the quality issues of JavaScript-based DL systems. Specifically, we collect and analyze 700 real-world faults from relevant GitHub repositories, including the official TensorFlow.js repository, 13 third-party DL libraries, and 58 JavaScript-based DL applications. To better understand the characteristics of these faults, we manually analyze and construct taxonomies for the fault symptoms, root causes, and fix patterns, respectively. Moreover, we also study the fault distributions of symptoms and root causes, in terms of the different stages of the development lifecycle, the 3-level architecture in the DL system, and the 4 major components of the TensorFlow.js framework. Based on the results, we suggest actionable implications and research avenues that can potentially facilitate the development, testing, and debugging of JavaScript-based DL systems.

Submitted 11 September, 2022; originally announced September 2022.

Comments: 13 pages, 8 figures, ASE 2022

arXiv:2208.14201 [pdf, other]

ASpanFormer: Detector-Free Image Matching with Adaptive Span Transformer

Authors: Hongkai Chen, Zixin Luo, Lei Zhou, Yurun Tian, Mingmin Zhen, Tian Fang, David Mckinnon, Yanghai Tsin, Long Quan

Abstract: Generating robust and reliable correspondences across images is a fundamental task for a diversity of applications. To capture context at both global and local granularity, we propose ASpanFormer, a Transformer-based detector-free matcher that is built on hierarchical attention structure, adopting a novel attention operation which is capable of adjusting attention span in a self-adaptive manner. To achieve this goal, first, flow maps are regressed in each cross attention phase to locate the center of search region. Next, a sampling grid is generated around the center, whose size, instead of being empirically configured as fixed, is adaptively computed from a pixel uncertainty estimated along with the flow map. Finally, attention is computed across two images within derived regions, referred to as attention span. By these means, we are able to not only maintain long-range dependencies, but also enable fine-grained attention among pixels of high relevance that compensates essential locality and piece-wise smoothness in matching tasks. State-of-the-art accuracy on a wide range of evaluation benchmarks validates the strong matching capability of our method.

Submitted 30 August, 2022; originally announced August 2022.

Comments: Accepted to ECCV2022, project page at https://aspanformer.github.io/

arXiv:2206.03087 [pdf, other]

Critical Regularizations for Neural Surface Reconstruction in the Wild

Authors: Jingyang Zhang, Yao Yao, Shiwei Li, Tian Fang, David McKinnon, Yanghai Tsin, Long Quan

Abstract: Neural implicit functions have recently shown promising results on surface reconstructions from multiple views. However, current methods still suffer from excessive time complexity and poor robustness when reconstructing unbounded or complex scenes. In this paper, we present RegSDF, which shows that proper point cloud supervisions and geometry regularizations are sufficient to produce high-quality and robust reconstruction results. Specifically, RegSDF takes an additional oriented point cloud as input, and optimizes a signed distance field and a surface light field within a differentiable rendering framework. We also introduce the two critical regularizations for this optimization. The first one is the Hessian regularization that smoothly diffuses the signed distance values to the entire distance field given noisy and incomplete input. And the second one is the minimal surface regularization that compactly interpolates and extrapolates the missing geometry. Extensive experiments are conducted on DTU, BlendedMVS, and Tanks and Temples datasets. Compared with recent neural surface reconstruction approaches, RegSDF is able to reconstruct surfaces with fine details even for open scenes with complex topologies and unstructured camera trajectories.

Submitted 7 June, 2022; originally announced June 2022.

Comments: CVPR 2022

arXiv:2205.11212 [pdf, other]

CircleChain: Tokenizing Products with a Role-based Scheme for a Circular Economy

Authors: Mojtaba Eshghie, Li Quan, Gustav Andersson Kasche, Filip Jacobson, Cosimo Bassi, Cyrille Artho

Abstract: In a circular economy, tracking the flow of second-life components for quality control is critical. Tokenization can enhance the transparency of the flow of second-life components. However, simple tokenization does not correspond to real economic models and lacks the ability to finely manage complex business processes. In particular, existing systems have to take into account the different roles of the parties in the supply chain. Based on the Algorand blockchain, we propose a role-based token management scheme, which can achieve authentication, synthesis, circulation, and reuse of these second-life components in a trustless environment. The proposed scheme not only achieves fine-grained and scalable second-life component management, but also enables on-chain trading, subsidies, and green-bond issuance. Furthermore, we implemented and performed scalability tests for the proposed architecture on the Algorand blockchain using its smart contracts and Algorand Standard Assets (ASA). The open-source implementation and tests, along with results, are available on our GitHub page.

Submitted 23 May, 2022; originally announced May 2022.

arXiv:2204.02130 [pdf, other]

SemanticCAP: Chromatin Accessibility Prediction Enhanced by Features Learning from a Language Model

Authors: Yikang Zhang, Xiaomin Chu, Yelu Jiang, Hongjie Wu, Lijun Quan

Abstract: A large number of inorganic and organic compounds are able to bind DNA and form complexes, among which drug-related molecules are important. Chromatin accessibility changes not only directly affect drug-DNA interactions, but also promote or inhibit the expression of critical genes associated with drug resistance by affecting the DNA binding capacity of TFs and transcriptional regulators. However, biological experimental techniques for measuring it are expensive and time-consuming. In recent years, several kinds of computational methods have been proposed to identify accessible regions of the genome. Existing computational models mostly ignore the contextual information of bases in gene sequences. To address these issues, we proposed a new solution named SemanticCAP. It introduces a gene language model which models the context of gene sequences, thus being able to provide an effective representation of a certain site in gene sequences. Basically, we merge the features provided by the gene language model into our chromatin accessibility model. During the process, we designed some methods to make feature fusion smoother. Compared with other systems under public benchmarks, our model proved to have better performance.

Submitted 6 April, 2022; v1 submitted 5 April, 2022; originally announced April 2022.

arXiv:2203.07182 [pdf, other]

NeILF: Neural Incident Light Field for Physically-based Material Estimation

Authors: Yao Yao, Jingyang Zhang, Jingbo Liu, Yihang Qu, Tian Fang, David McKinnon, Yanghai Tsin, Long Quan

Abstract: We present a differentiable rendering framework for material and lighting estimation from multi-view images and a reconstructed geometry. In the framework, we represent scene lightings as the Neural Incident Light Field (NeILF) and material properties as the surface BRDF modelled by multi-layer perceptrons. Compared with recent approaches that approximate scene lightings as the 2D environment map, NeILF is a fully 5D light field that is capable of modelling illuminations of any static scenes. In addition, occlusions and indirect lights can be handled naturally by the NeILF representation without requiring multiple bounces of ray tracing, making it possible to estimate material properties even for scenes with complex lightings and geometries. We also propose a smoothness regularization and a Lambertian assumption to reduce the material-lighting ambiguity during the optimization. Our method strictly follows the physically-based rendering equation, and jointly optimizes material and lighting through the differentiable rendering process. We have intensively evaluated the proposed method on our in-house synthetic dataset, the DTU MVS dataset, and real-world BlendedMVS scenes. Our method is able to outperform previous methods by a significant margin in terms of novel view rendering quality, setting a new state-of-the-art for image-based material and lighting estimation.

Submitted 18 March, 2022; v1 submitted 14 March, 2022; originally announced March 2022.

arXiv:2112.11548 [pdf]

Parallel-disk and cone-and-plate viscometry of a viscoplastic hydrogel with apparent wall slip

Authors: Li Quan, Dilhan M. Kalyon

Abstract: Hydrogels are widely used in myriad applications including those in the biomedical and personal care fields. It is generally the rheology, i.e., the flow and deformation properties, of hydrogels that define their functionalities. However, the ubiquitous viscoplasticity and associated wall slip behavior of hydrogels handicap the accurate characterization of their rheological material functions. Here parallel-disk and cone-and-plate viscometers were used to characterize the shear viscosity and wall slip behavior of a crosslinked poly(acrylic acid), PAA, hydrogel (carbomer, specifically Carbopol). It is demonstrated that parallel-disk viscometry, i.e., the steady torsional flow in between two parallel disks, but not the cone-and-plate flow can be used to determine the yield stress and other parameters of viscoplastic constitutive equations and wall slip behavior unambiguously. Gap-dependent data from parallel-disk viscometry can then be used to characterize the other parameters of the shear viscosity and wall slip behavior of the hydrogel. The accuracies of the parameters of wall slip and Herschel-Bulkley type viscoplastic constitutive equation in representing the flow and deformation behavior of the hydrogel were tested via the comparisons of the calculated and experimentally determined velocity distributions and torques. The excellent agreements found reflect the accuracies of the parameters of shear viscosity and wall slip and indicate that the methodologies demonstrated here provide the means necessary to understand in detail the steady flow and deformation behavior of hydrogels. Such a detailed understanding of the viscoplastic nature and wall slip behavior of hydrogels can then be used to design and develop novel hydrogels with a wider range of applications in the medical and other industrial areas, and for finding optimum conditions for their processing and manufacturing.

Submitted 21 December, 2021; originally announced December 2021.

Comments: 59 pages, 22 figures. arXiv admin note: text overlap with arXiv:2106.13351

arXiv:2109.07682 [pdf, other]

Distributed Swarm Trajectory Optimization for Formation Flight in Dense Environments

Authors: Lun Quan, Longji Yin, Chao Xu, Fei Gao

Abstract: For aerial swarms, navigation in a prescribed formation is widely practiced in various scenarios. However, the associated planning strategies typically lack the capability of avoiding obstacles in cluttered environments. To address this deficiency, we present an optimization-based method that ensures collision-free trajectory generation for formation flight. In this paper, a novel differentiable metric is proposed to quantify the overall similarity distance between formations. We then formulate this metric into an optimization framework, which achieves spatial-temporal planning using polynomial trajectories. Minimization over collision penalty is also incorporated into the framework, so that formation preservation and obstacle avoidance can be handled simultaneously. To validate the efficiency of our method, we conduct benchmark comparisons with other cutting-edge works. Integrated with an autonomous distributed aerial swarm system, the proposed method demonstrates its efficiency and robustness in real-world experiments with obstacle-rich surroundings. We will release the source code for the reference of the community.

Submitted 21 April, 2022; v1 submitted 15 September, 2021; originally announced September 2021.

Comments: Accepted by IEEE International Conference on Robotics and Automation (ICRA2022)

arXiv:2108.09964 [pdf, other]

Learning Signed Distance Field for Multi-view Surface Reconstruction

Authors: Jingyang Zhang, Yao Yao, Long Quan

Abstract: Recent works on implicit neural representations have shown promising results for multi-view surface reconstruction. However, most approaches are limited to relatively simple geometries and usually require clean object masks for reconstructing complex and concave objects. In this work, we introduce a novel neural surface reconstruction framework that leverages the knowledge of stereo matching and feature consistency to optimize the implicit surface representation. More specifically, we apply a signed distance field (SDF) and a surface light field to represent the scene geometry and appearance respectively. The SDF is directly supervised by geometry from stereo matching, and is refined by optimizing the multi-view feature consistency and the fidelity of rendered images. Our method is able to improve the robustness of geometry estimation and support reconstruction of complex scene topologies. Extensive experiments have been conducted on DTU, EPFL and Tanks and Temples datasets. Compared to previous state-of-the-art methods, our method achieves better mesh reconstruction in wide open scenes without masks as input.

Submitted 23 August, 2021; originally announced August 2021.

Comments: ICCV 2021 (Oral)

arXiv:2108.08771 [pdf, other]

Learning to Match Features with Seeded Graph Matching Network

Authors: Hongkai Chen, Zixin Luo, Jiahui Zhang, Lei Zhou, Xuyang Bai, Zeyu Hu, Chiew-Lan Tai, Long Quan

Abstract: Matching local features across images is a fundamental problem in computer vision. Targeting high accuracy and efficiency, we propose Seeded Graph Matching Network, a graph neural network with sparse structure to reduce redundant connectivity and learn compact representation. The network consists of 1) Seeding Module, which initializes the matching by generating a small set of reliable matches as seeds, and 2) Seeded Graph Neural Network, which utilizes seed matches to pass messages within/across images and predicts assignment costs. Three novel operations are proposed as basic elements for message passing: 1) Attentional Pooling, which aggregates keypoint features within the image to seed matches; 2) Seed Filtering, which enhances seed features and exchanges messages across images; and 3) Attentional Unpooling, which propagates seed features back to original keypoints. Experiments show that our method reduces computational and memory complexity significantly compared with typical attention-based networks while competitive or higher performance is achieved.

Submitted 19 August, 2021; originally announced August 2021.

Comments: Accepted by ICCV2021, code to be released at https://github.com/vdvchen/SGMNet

arXiv:2104.12937 [pdf]

Scalable Metagrating for Efficient Ultrasonic Focusing

Authors: Yan Kei Chiang, Li Quan, Yugui Peng, Shahrokh Sepehrirahnama, Sebastian Oberst, Andrea Alù, David Powell

Abstract: Acoustic focusing plays a pivotal role in a wide variety of applications ranging from medical science to nondestructive testing. Previous works have shown that acoustic metagratings can overcome the inherent efficiency limitations of gradient metasurfaces in beam steering. In this work, we propose a new design principle for acoustic metalenses, based on metagratings, to achieve efficient ultrasonic focusing. We achieve beam focusing by locally controlling the excitation of a single diffraction order with the use of adiabatically varying metagratings over the lens aperture. A set of metagratings is optimized by a semi-analytical approach using a genetic algorithm, enabling efficient anomalous reflection for a wide range of reflection angles. Numerical results reveal that our metalens can effectively focus impinging ultrasonic waves to a focal point of FWHM = 0.364λ. The focusing performance of the metalens is demonstrated experimentally, validating our proposed approach.

Submitted 26 April, 2021; originally announced April 2021.

arXiv:2102.07957 [pdf, other]

doi 10.1073/pnas.2104425118

Vibrational relaxation dynamics in layered perovskite quantum wells

Authors: Li Na Quan, Yoonjae Park, Peijun Guo, Mengyu Gao, Jianbo Jin, Jianmei Huang, Jason K. Copper, Adam Schwartzberg, Richard Schaller, David T. Limmer, Peidong Yang

Abstract: Organic-inorganic layered perovskites are two-dimensional quantum wells with layers of lead-halide octahedra stacked between organic ligand barriers. The combination of their dielectric confinement and ionic sublattice results in excitonic excitations with substantial binding energies that are strongly coupled to the surrounding soft, polar lattice. However, the ligand environment in layered perovskites can significantly alter their optical properties due to the complex dynamic disorder of soft perovskite lattice. Here, we observe the dynamic disorder through phonon dephasing lifetimes initiated by ultrafast photoexcitation employing high-resolution resonant impulsive stimulated Raman spectroscopy of a variety of ligand substitutions. We demonstrate that vibrational relaxation in layered perovskite formed from flexible alkyl-amines as organic barriers is fast and relatively independent of the lattice temperature. Relaxation in aromatic amine based layered perovskite is slower, though still fast relative to pure inorganic lead bromide lattices, with a rate that is temperature dependent. Using molecular dynamics simulations, we explain the fast rates of relaxation by quantifying the large anharmonic coupling of the optical modes with the ligand layers and rationalize the temperature independence due to their amorphous packing. This work provides a molecular and time-domain depiction of the relaxation of nascent optical excitations and opens opportunities to understand how they couple to the complex layered perovskite lattice, elucidating design principles for optoelectronic devices.

Submitted 15 February, 2021; originally announced February 2021.

Comments: 7 pages, 4 figures, SI

arXiv:2101.00991 [pdf]

Underwater Image Enhancement based on Deep Learning and Image Formation Model

Authors: Xuelei Chen, Pin Zhang, Lingwei Quan, Chao Yi, Cunyue Lu

Abstract: Underwater robots play an important role in oceanic geological exploration, resource exploitation, ecological research, and other fields. However, the visual perception of underwater robots is affected by various environmental factors. The main challenge now is that images captured by underwater robots are color-distorted. The hue of underwater images tends to be close to green and blue. In addition, the contrast is low and the details are fuzzy. In this paper, a new underwater image enhancement algorithm based on deep learning and image formation model is proposed. Experimental results show that the advantages of the proposed method are that it eliminates the influence of underwater environmental factors, enriches the color, enhances details, achieves higher scores in PSNR and SSIM metrics, and helps feature key-point matching get better results. Another significant advantage is that its computation speed is much faster than other methods.

Submitted 7 January, 2021; v1 submitted 4 January, 2021; originally announced January 2021.

arXiv:2011.04246 [pdf, other]

EVA-Planner: Environmental Adaptive Quadrotor Planning

Authors: Lun Quan, Zhiwei Zhang, Xingguang Zhong, Chao Xu, Fei Gao

Abstract: The quadrotor is popularly used in challenging environments due to its superior agility and flexibility. In these scenarios, trajectory planning plays a vital role in generating safe motions to avoid obstacles while ensuring flight smoothness. Although many works on quadrotor planning have been proposed, a research gap exists in incorporating self-adaptation into a planning framework to enable a drone to automatically fly slower in denser environments and increase its speed in a safer area. In this paper, we propose an environmental adaptive planner to adjust the flight aggressiveness effectively based on the obstacle distribution and quadrotor state. Firstly, we design an environmental adaptive safety aware method to assign the priority of the surrounding obstacles according to the environmental risk level and instantaneous motion tendency. Then, we apply it into a multi-layered model predictive contouring control (Multi-MPCC) framework to generate adaptive, safe, and dynamically feasible local trajectories. Extensive simulations and real-world experiments verify the efficiency and robustness of our planning framework. Benchmark comparison also shows superior performance of our method over another advanced environmental adaptive planning algorithm. Moreover, we release our planning framework as open-source ros-packages.

Submitted 5 July, 2021; v1 submitted 9 November, 2020; originally announced November 2020.

Comments: IEEE International Conference on Robotics and Automation (ICRA 2021)

arXiv:2009.11411 [pdf]

Dynamics of the sub-ambient gelation and shearing of solutions of P3HT incorporated with a non-fullerene acceptor o-IDTBR towards active layer formation in bulk heterojunction organic solar cells

Authors: Li Quan, Dongrun Ju, Stephanie Lee, Dilhan M. Kalyon

Abstract: Organic solar cells (OSCs) containing an active layer consisting of a nanostructured blend of a conjugated polymer like poly(3-hexylthiophene) (P3HT) and an electron acceptor molecule have the potential of competing against silicon-based photovoltaic panels. However, this potential is unfulfilled primarily due to interrelated production and stability issues. The generally employed spin coating process for fabricating organic solar cells cannot be scaled up. Recently, He et al., have reported that the gelation of P3HT with [6,6]-phenyl-C61-butyric acid methyl ester (PC60BM) under sub-ambient conditions can provide a continuous extrusion/coating based route to the processing of organic solar cells and that increases in power conversion efficiencies (PCEs) of the P3HT/PC60BM active layer are possible under certain shearing and thermal histories of the P3HT/PC60BM gels. Here oscillatory and steady torsional flows were used to investigate the gel formation dynamics of P3HT with a recently proposed non-fullerene o-IDTBR under sub-ambient conditions. The gel strengths defined on the basis of linear viscoelastic material functions as determined via small-amplitude oscillatory shear were observed to be functions of the P3HT and o-IDTBR concentrations, the solvent used and the shearing conditions. Overall, the gels which formed upon quenching to sub-zero temperatures were found to be stable during small-amplitude oscillatory shear (linear viscoelastic range) but broke down even at the relatively low shear rates associated with steady torsional flows, suggesting that the shearing conditions used during the processing of gels of P3HT with small molecule acceptor blends can alter the gel structure and possibly affect the resulting active layer performance.

Submitted 23 September, 2020; originally announced September 2020.

Comments: 27 manuscript pages and 19 figures

arXiv:2008.04800 [pdf, other]

Learning Stereo Matchability in Disparity Regression Networks

Authors: Jingyang Zhang, Yao Yao, Zixin Luo, Shiwei Li, Tianwei Shen, Tian Fang, Long Quan

Abstract: Learning-based stereo matching has recently achieved promising results, yet still suffers difficulties in establishing reliable matches in weakly matchable regions that are textureless, non-Lambertian, or occluded. In this paper, we address this challenge by proposing a stereo matching network that considers pixel-wise matchability. Specifically, the network jointly regresses disparity and matchability maps from 3D probability volume through expectation and entropy operations. Next, a learned attenuation is applied as the robust loss function to alleviate the influence of weakly matchable pixels in the training. Finally, a matchability-aware disparity refinement is introduced to improve the depth inference in weakly matchable regions. The proposed deep stereo matchability (DSM) framework can improve the matching result or accelerate the computation while still guaranteeing the quality. Moreover, the DSM framework is portable to many recent stereo networks. Extensive experiments are conducted on Scene Flow and KITTI stereo datasets to demonstrate the effectiveness of the proposed framework over the state-of-the-art learning-based stereo methods.

Submitted 11 August, 2020; originally announced August 2020.

Comments: Accepted to ICPR 2020

arXiv:2008.01270 [pdf, other]

Learning Discriminative Feature with CRF for Unsupervised Video Object Segmentation

Authors: Mingmin Zhen, Shiwei Li, Lei Zhou, Jiaxiang Shang, Haoan Feng, Tian Fang, Long Quan

Abstract: In this paper, we introduce a novel network, called discriminative feature network (DFNet), to address the unsupervised video object segmentation task. To capture the inherent correlation among video frames, we learn discriminative features (D-features) from the input images that reveal feature distribution from a global perspective. The D-features are then used to establish correspondence with all features of test image under conditional random field (CRF) formulation, which is leveraged to enforce consistency between pixels. The experiments verify that DFNet outperforms state-of-the-art methods by a large margin with a mean IoU score of 83.4% and ranks first on the DAVIS-2016 leaderboard while using much fewer parameters and achieving much more efficient performance in the inference phase. We further evaluate DFNet on the FBMS dataset and the video saliency dataset ViSal, reaching a new state-of-the-art. To further demonstrate the generalizability of our framework, DFNet is also applied to the image object co-segmentation task. We perform experiments on a challenging dataset PASCAL-VOC and observe the superiority of DFNet. The thorough experiments verify that DFNet is able to capture and mine the underlying relations of images and discover the common foreground objects.

Submitted 3 August, 2020; originally announced August 2020.

Journal ref: European Conference on Computer Vision 2020

arXiv:2008.00446 [pdf, other]

Stochastic Bundle Adjustment for Efficient and Scalable 3D Reconstruction

Authors: Lei Zhou, Zixin Luo, Mingmin Zhen, Tianwei Shen, Shiwei Li, Zhuofei Huang, Tian Fang, Long Quan

Abstract: Current bundle adjustment solvers such as the Levenberg-Marquardt (LM) algorithm are limited by the bottleneck in solving the Reduced Camera System (RCS) whose dimension is proportional to the camera number. When the problem is scaled up, this step is neither efficient in computation nor manageable for a single compute node. In this work, we propose a stochastic bundle adjustment algorithm which seeks to decompose the RCS approximately inside the LM iterations to improve the efficiency and scalability. It first reformulates the quadratic programming problem of an LM iteration based on the clustering of the visibility graph by introducing the equality constraints across clusters. Then, we propose to relax it into a chance constrained problem and solve it through sampled convex program. The relaxation is intended to eliminate the interdependence between clusters embodied by the constraints, so that a large RCS can be decomposed into independent linear sub-problems. Numerical experiments on unordered Internet image sets and sequential SLAM image sets, as well as distributed experiments on large-scale datasets, have demonstrated the high efficiency and scalability of the proposed approach. Codes are released at https://github.com/zlthinker/STBA.

Submitted 2 August, 2020; originally announced August 2020.

Comments: Accepted by ECCV 2020

arXiv:2007.12494 [pdf, other]

Self-Supervised Monocular 3D Face Reconstruction by Occlusion-Aware Multi-view Geometry Consistency

Authors: Jiaxiang Shang, Tianwei Shen, Shiwei Li, Lei Zhou, Mingmin Zhen, Tian Fang, Long Quan

Abstract: Recent learning-based approaches, in which models are trained by single-view images have shown promising results for monocular 3D face reconstruction, but they suffer from the ill-posed face pose and depth ambiguity issue. In contrast to previous works that only enforce 2D feature constraints, we propose a self-supervised training architecture by leveraging the multi-view geometry consistency, which provides reliable constraints on face pose and depth estimation. We first propose an occlusion-aware view synthesis method to apply multi-view geometry consistency to self-supervised learning. Then we design three novel loss functions for multi-view consistency, including the pixel consistency loss, the depth consistency loss, and the facial landmark-based epipolar loss. Our method is accurate and robust, especially under large variations of expressions, poses, and illumination conditions. Comprehensive experiments on the face alignment and 3D face reconstruction benchmarks have demonstrated superiority over state-of-the-art methods. Our code and model are released in https://github.com/jiaxiangshang/MGCNet.

Submitted 24 July, 2020; originally announced July 2020.

Comments: Accepted to ECCV 2020, supplementary materials included

arXiv:2003.10629 [pdf, other]

KFNet: Learning Temporal Camera Relocalization using Kalman Filtering

Authors: Lei Zhou, Zixin Luo, Tianwei Shen, Jiahui Zhang, Mingmin Zhen, Yao Yao, Tian Fang, Long Quan

Abstract: Temporal camera relocalization estimates the pose with respect to each video frame in sequence, as opposed to one-shot relocalization which focuses on a still image. Even though the time dependency has been taken into account, current temporal relocalization methods still generally underperform the state-of-the-art one-shot approaches in terms of accuracy. In this work, we improve the temporal relocalization method by using a network architecture that incorporates Kalman filtering (KFNet) for online camera relocalization. In particular, KFNet extends the scene coordinate regression problem to the time domain in order to recursively establish 2D and 3D correspondences for the pose determination. The network architecture design and the loss formulation are based on Kalman filtering in the context of Bayesian learning. Extensive experiments on multiple relocalization benchmarks demonstrate the high accuracy of KFNet at the top of both one-shot and temporal relocalization approaches. Our codes are released at https://github.com/zlthinker/KFNet.

Submitted 23 March, 2020; originally announced March 2020.

Comments: An oral paper of CVPR 2020

arXiv:2003.10071 [pdf, other]

ASLFeat: Learning Local Features of Accurate Shape and Localization

Authors: Zixin Luo, Lei Zhou, Xuyang Bai, Hongkai Chen, Jiahui Zhang, Yao Yao, Shiwei Li, Tian Fang, Long Quan

Abstract: This work focuses on mitigating two limitations in the joint learning of local feature detectors and descriptors. First, the ability to estimate the local shape (scale, orientation, etc.) of feature points is often neglected during dense feature extraction, while the shape-awareness is crucial to acquire stronger geometric invariance. Second, the localization accuracy of detected keypoints is not sufficient to reliably recover camera geometry, which has become the bottleneck in tasks such as 3D reconstruction. In this paper, we present ASLFeat, with three light-weight yet effective modifications to mitigate the above issues. First, we resort to deformable convolutional networks to densely estimate and apply local transformation. Second, we take advantage of the inherent feature hierarchy to restore spatial resolution and low-level details for accurate keypoint localization. Finally, we use a peakiness measurement to relate feature responses and derive more indicative detection scores. The effect of each modification is thoroughly studied, and the evaluation is extensively conducted across a variety of practical scenarios. State-of-the-art results are reported that demonstrate the superiority of our methods.

Submitted 19 April, 2020; v1 submitted 23 March, 2020; originally announced March 2020.

Comments: Accepted to CVPR 2020, supplementary materials included, code available

arXiv:2003.10061 [pdf]

Tunable microwave absorption performance of nitrogen and sulfur dual-doped graphene by varying doping sequence

Authors: L. Quan, H. T. Lu, F. X. Qin, D. Estevez, Y. F. Wang, Y. H. Li, Y. Tian, H. Wang, H. X. Peng

Abstract: Sulfur and nitrogen dual doped graphene have been extensively investigated in the field of oxygen reduction reaction, supercapacitors and batteries, but their magnetic and absorption performance have not been explored. Besides, the effects of doping sequence of sulfur and nitrogen atoms on the morphology, structural property and the corresponding microwave absorption performance of the dual doped graphene remain unexplored. In this work, nitrogen and sulfur dual doped graphene with different doping sequence were successfully prepared using a controllable two steps facile thermal treatment method. The first doping process played a decisive role on the morphology, crystal size, interlayer distance, doping degree and ultimately magnetic and microwave absorption properties of the dual doped graphene samples. Meanwhile, the second doping step affected the doping sites and further had a repairing or damaging effect on the final doped graphene. The dual doped graphene samples exhibited two pronounced absorption peaks which intensity was decided by the order of the doping elements. This nitrogen and sulfur dual doped graphene with controlled doping order provides a strategy for understanding of the interaction between nitrogen and sulfur as dual dopants in graphene and further acquiring microwave absorbing materials with tunable absorption bands by varying the doping sequence.

Submitted 22 March, 2020; originally announced March 2020.

arXiv:2003.03164 [pdf, other]

D3Feat: Joint Learning of Dense Detection and Description of 3D Local Features

Authors: Xuyang Bai, Zixin Luo, Lei Zhou, Hongbo Fu, Long Quan, Chiew-Lan Tai

Abstract: A successful point cloud registration often lies on robust establishment of sparse matches through discriminative 3D local features. Despite the fast evolution of learning-based 3D feature descriptors, little attention has been drawn to the learning of 3D feature detectors, even less for a joint learning of the two tasks. In this paper, we leverage a 3D fully convolutional network for 3D point clouds, and propose a novel and practical learning mechanism that densely predicts both a detection score and a description feature for each 3D point. In particular, we propose a keypoint selection strategy that overcomes the inherent density variations of 3D point clouds, and further propose a self-supervised detector loss guided by the on-the-fly feature matching results during training. Finally, our method achieves state-of-the-art results in both indoor and outdoor scenarios, evaluated on 3DMatch and KITTI datasets, and shows its strong generalization ability on the ETH dataset. Towards practical use, we show that by adopting a reliable feature detector, sampling a smaller number of features is sufficient to achieve accurate and fast point cloud alignment. Code release: https://github.com/XuyangBai/D3Feat

Submitted 6 March, 2020; originally announced March 2020.

Comments: Accepted to CVPR 2020, supplementary materials included

arXiv:1911.10127 [pdf, other]

BlendedMVS: A Large-scale Dataset for Generalized Multi-view Stereo Networks

Authors: Yao Yao, Zixin Luo, Shiwei Li, Jingyang Zhang, Yufan Ren, Lei Zhou, Tian Fang, Long Quan

Abstract: While deep learning has recently achieved great success on multi-view stereo (MVS), limited training data makes the trained model hard to be generalized to unseen scenarios. Compared with other computer vision tasks, it is rather difficult to collect a large-scale MVS dataset as it requires expensive active scanners and labor-intensive process to obtain ground truth 3D structures. In this paper, we introduce BlendedMVS, a novel large-scale dataset, to provide sufficient training ground truth for learning-based MVS. To create the dataset, we apply a 3D reconstruction pipeline to recover high-quality textured meshes from images of well-selected scenes. Then, we render these mesh models to color images and depth maps. To introduce the ambient lighting information during training, the rendered color images are further blended with the input images to generate the training input. Our dataset contains over 17k high-resolution images covering a variety of scenes, including cities, architectures, sculptures and small objects. Extensive experiments demonstrate that BlendedMVS endows the trained model with significantly better generalization ability compared with other MVS datasets. The dataset and pretrained models are available at https://github.com/YoYo000/BlendedMVS.

Submitted 13 April, 2020; v1 submitted 22 November, 2019; originally announced November 2019.

Comments: Accepted to CVPR2020

arXiv:1911.01778 [pdf, other]

Downsampling and Transparent Coding for Blockchain

Authors: Qin Huang, Li Quan, Shengli Zhang

Abstract: With the development of blockchain, the huge history data limits the scalability of the blockchain. This paper proposes to downsample these data to reduce the storage overhead of nodes. These nodes keep good independency, if downsampling follows the entropy of blockchain. Moreover, it demonstrates that the entire blockchain history can be efficiently recovered through the cooperative decoding of a group of nodes like fountain codes, if reserved data over these nodes obey the soliton distribution. However, these data on nodes are uncoded (transparent). Thus, the proposed algorithm not only keeps decentralization and security, but also has good scalability in independency and recovery.

Submitted 26 November, 2019; v1 submitted 5 November, 2019; originally announced November 2019.

Comments: 8pages, 6 figures

arXiv:1909.09115 [pdf, other]

Self-Supervised Learning of Depth and Motion Under Photometric Inconsistency

Authors: Tianwei Shen, Lei Zhou, Zixin Luo, Yao Yao, Shiwei Li, Jiahui Zhang, Tian Fang, Long Quan

Abstract: The self-supervised learning of depth and pose from monocular sequences provides an attractive solution by using the photometric consistency of nearby frames as it depends much less on the ground-truth data. In this paper, we address the issue when previous assumptions of the self-supervised approaches are violated due to the dynamic nature of real-world scenes. Different from handling the noise as uncertainty, our key idea is to incorporate more robust geometric quantities and enforce internal consistency in the temporal image sequence. As demonstrated on commonly used benchmark datasets, the proposed method substantially improves the state-of-the-art methods on both depth and relative pose estimation for monocular image sequences, without adding inference overhead.

Submitted 19 September, 2019; originally announced September 2019.

Comments: International Conference on Computer Vision (ICCV) Workshop 2019

arXiv:1908.04964 [pdf, other]

Learning Two-View Correspondences and Geometry Using Order-Aware Network

Authors: Jiahui Zhang, Dawei Sun, Zixin Luo, Anbang Yao, Lei Zhou, Tianwei Shen, Yurong Chen, Long Quan, Hongen Liao

Abstract: Establishing correspondences between two images requires both local and global spatial context. Given putative correspondences of feature points in two views, in this paper, we propose Order-Aware Network, which infers the probabilities of correspondences being inliers and regresses the relative pose encoded by the essential matrix. Specifically, this proposed network is built hierarchically and comprises three novel operations. First, to capture the local context of sparse correspondences, the network clusters unordered input correspondences by learning a soft assignment matrix. These clusters are in a canonical order and invariant to input permutations. Next, the clusters are spatially correlated to form the global context of correspondences. After that, the context-encoded clusters are recovered back to the original size through a proposed upsampling operator. We intensively experiment on both outdoor and indoor datasets. The accuracy of the two-view geometry and correspondences are significantly improved over the state-of-the-arts. Code will be available at https://github.com/zjhthu/OANet.git.

Submitted 14 August, 2019; originally announced August 2019.

Comments: Accepted to ICCV 2019, and Winner solution to both tracks of CVPR IMW 2019 Challenge. Code will be available soon at https://github.com/zjhthu/OANet.git

arXiv:1906.10362 [pdf, other]

EVulHunter: Detecting Fake Transfer Vulnerabilities for EOSIO's Smart Contracts at Webassembly-level

Authors: Lijin Quan, Lei Wu, Haoyu Wang

Abstract: As one of the representative Delegated Proof-of-Stake (DPoS) blockchain platforms, EOSIO's ecosystem grows rapidly in recent years. A number of vulnerabilities and corresponding attacks of EOSIO's smart contracts have been discovered and observed in the wild, which caused a large amount of financial damages. However, the majority of EOSIO's smart contracts are not open-sourced. As a result, the WebAssembly code may become the only available object to be analyzed in most cases. Unfortunately, current tools are web-application oriented and cannot be applied to EOSIO WebAssembly code directly, which makes it more difficult to detect vulnerabilities from those smart contracts. In this paper, we propose EVulHunter, a static analysis tool that can be used to detect vulnerabilities from EOSIO WASM code automatically. We focus on one particular type of vulnerabilities named fake-transfer, and the exploitation of such vulnerabilities has led to millions of dollars in damages. To the best of our knowledge, it is the first attempt to build an automatic tool to detect vulnerabilities of EOSIO's smart contracts. The experimental results demonstrate that our tool is able to detect fake transfer vulnerabilities quickly and precisely. EVulHunter is available on GitHub (tool and benchmarks: https://github.com/EVulHunter/EVulHunter) and YouTube (demo video: https://youtu.be/5SJ0ZJKVZvw).

Submitted 25 June, 2019; originally announced June 2019.

arXiv:1905.08929 [pdf, other]

Learning Fully Dense Neural Networks for Image Semantic Segmentation

Authors: Mingmin Zhen, Jinglu Wang, Lei Zhou, Tian Fang, Long Quan

Abstract: Semantic segmentation is pixel-wise classification which retains critical spatial information. The "feature map reuse" has been commonly adopted in CNN based approaches to take advantage of feature maps in the early layers for the later spatial reconstruction. Along this direction, we go a step further by proposing a fully dense neural network with an encoder-decoder structure that we abbreviate as FDNet. For each stage in the decoder module, feature maps of all the previous blocks are adaptively aggregated to feed-forward as input. On the one hand, it reconstructs the spatial boundaries accurately. On the other hand, it learns more efficiently with the more efficient gradient backpropagation. In addition, we propose the boundary-aware loss function to focus more attention on the pixels near the boundary, which boosts the "hard examples" labeling. We have demonstrated the best performance of the FDNet on the two benchmark datasets: PASCAL VOC 2012, NYUDv2 over previous works when not considering training on other datasets.

Submitted 21 May, 2019; originally announced May 2019.

Journal ref: AAAI 2019

arXiv:1905.03931 [pdf]

A Plainified Composite Absorber Enabled by Vertical Interphase

Authors: Yuhan Li, Faxiang Qin, Le Quan, Huijie Wei, Huan Wang, Hua-Xin Peng

Abstract: Interface constitutes a significant volume fraction in nanocomposites, and it requires the ability to tune and tailor interfaces to tap the full potential of nanocomposites. However, the development and optimization of nanocomposites is currently restricted by the limited exploration and utilization of interfaces at different length scales. In this research, we have designed and introduced a relatively large-scale vertical interphase into carbon nanocomposites, in which the dielectric response and dispersion features in microwave frequency range are successfully adjusted. A remarkable relaxation process has been observed in vertical-interphase nanocomposites, showing sensitivity to both filler loading and the discrepancy in polarization ability across the interphase. Together with our analyses on dielectric spectra and relaxation processes, it is suggested that the intrinsic effect of vertical interphase lies in its ability to constrain and localize heterogeneous charges under external fields. Following this logic, systematic research is presented in this article affording to realize tunable frequency-dependent dielectric functionality by means of vertical interphase engineering. Overall, this study provides a novel method to utilize interfacial effects rationally. The research approach demonstrated here has great potential in developing microwave dielectric nanocomposites and devices with targeted or unique performance such as tunable broadband absorbers.

Submitted 4 June, 2019; v1 submitted 10 May, 2019; originally announced May 2019.

Comments: 24pages, 11 figures

arXiv:1904.05300 [pdf, ps, other]

An In-Depth Comparison of s-t Reliability Algorithms over Uncertain Graphs

Authors: Xiangyu Ke, Arijit Khan, Leroy Lim Hong Quan

Abstract: Uncertain, or probabilistic, graphs have been increasingly used to represent noisy linked data in many emerging applications, and have recently attracted the attention of the database research community. A fundamental problem on uncertain graphs is the s-t reliability, which measures the probability that a target node t is reachable from a source node s in a probabilistic (or uncertain) graph, i.e., a graph where every edge is assigned a probability of existence. Due to the inherent complexity of the s-t reliability estimation problem (#P-hard), various sampling and indexing based efficient algorithms were proposed in the literature. However, since they have not been thoroughly compared with each other, it is not clear whether the later algorithm outperforms the earlier ones. More importantly, the comparison framework, datasets, and metrics were often not consistent (e.g., different convergence criteria were employed to find the optimal number of samples) across these works. We address this serious concern by re-implementing six state-of-the-art s-t reliability estimation methods in a common system and code base, using several medium and large-scale, real-world graph datasets, identical evaluation metrics, and query workloads. Through our systematic and in-depth analysis of experimental results, we report surprising findings, such as many follow-up algorithms can actually be several orders of magnitude inefficient, less accurate, and more memory intensive compared to the ones that were proposed earlier. We conclude by discussing our recommendations on the road ahead.

Submitted 10 April, 2019; originally announced April 2019.

arXiv:1904.04084 [pdf, other]

ContextDesc: Local Descriptor Augmentation with Cross-Modality Context

Authors: Zixin Luo, Tianwei Shen, Lei Zhou, Jiahui Zhang, Yao Yao, Shiwei Li, Tian Fang, Long Quan

Abstract: Most existing studies on learning local features focus on the patch-based descriptions of individual keypoints, whereas neglecting the spatial relations established from their keypoint locations. In this paper, we go beyond the local detail representation by introducing context awareness to augment off-the-shelf local feature descriptors. Specifically, we propose a unified learning framework that leverages and aggregates the cross-modality contextual information, including (i) visual context from high-level image representation, and (ii) geometric context from 2D keypoint distribution. Moreover, we propose an effective N-pair loss that eschews the empirical hyper-parameter search and improves the convergence. The proposed augmentation scheme is lightweight compared with the raw local feature description, meanwhile improves remarkably on several large-scale benchmarks with diversified scenes, which demonstrates both strong practicality and generalization ability in geometric matching applications.

Submitted 8 April, 2019; originally announced April 2019.

Comments: Accepted to CVPR 2019 (oral), supplementary materials included. (https://github.com/lzx551402/contextdesc)

arXiv:1902.10556 [pdf, other]

Recurrent MVSNet for High-resolution Multi-view Stereo Depth Inference

Authors: Yao Yao, Zixin Luo, Shiwei Li, Tianwei Shen, Tian Fang, Long Quan

Abstract: Deep learning has recently demonstrated its excellent performance for multi-view stereo (MVS). However, one major limitation of current learned MVS approaches is the scalability: the memory-consuming cost volume regularization makes the learned MVS hard to be applied to high-resolution scenes. In this paper, we introduce a scalable multi-view stereo framework based on the recurrent neural network. Instead of regularizing the entire 3D cost volume in one go, the proposed Recurrent Multi-view Stereo Network (R-MVSNet) sequentially regularizes the 2D cost maps along the depth direction via the gated recurrent unit (GRU). This reduces dramatically the memory consumption and makes high-resolution reconstruction feasible. We first show the state-of-the-art performance achieved by the proposed R-MVSNet on the recent MVS benchmarks. Then, we further demonstrate the scalability of the proposed method on several large-scale scenarios, where previous learned approaches often fail due to the memory constraint. Code is available at https://github.com/YoYo000/MVSNet.

Submitted 27 February, 2019; originally announced February 2019.

Comments: Accepted by CVPR2019

arXiv:1902.09103 [pdf, other]

Beyond Photometric Loss for Self-Supervised Ego-Motion Estimation

Authors: Tianwei Shen, Zixin Luo, Lei Zhou, Hanyu Deng, Runze Zhang, Tian Fang, Long Quan

Abstract: Accurate relative pose is one of the key components in visual odometry (VO) and simultaneous localization and mapping (SLAM). Recently, the self-supervised learning framework that jointly optimizes the relative pose and target image depth has attracted the attention of the community. Previous works rely on the photometric error generated from depths and poses between adjacent frames, which contains large systematic error under realistic scenes due to reflective surfaces and occlusions. In this paper, we bridge the gap between geometric loss and photometric loss by introducing the matching loss constrained by epipolar geometry in a self-supervised framework. Evaluated on the KITTI dataset, our method outperforms the state-of-the-art unsupervised ego-motion estimation methods by a large margin. The code and data are available at https://github.com/hlzz/DeepMatchVO.

Submitted 25 February, 2019; originally announced February 2019.

Comments: Accepted by ICRA 2019

arXiv:1812.02318 [pdf, other]

doi 10.1038/s41467-019-10915-5

Acoustic meta-atom with maximum Willis coupling

Authors: Anton Melnikov, Li Quan, Sebastian Oberst, Andrea Alù, Steffen Marburg, David Powell

Abstract: Acoustic metamaterials are structures with exotic acoustic properties, having promising applications in acoustic beam steering, focusing, impedance matching, absorption and isolation. Recent work has shown that the efficiency of many acoustic metamaterials can be enhanced by controlling an additional parameter known as Willis coupling, which is analogous to bianisotropy in electromagnetic metamaterials. The magnitude of Willis coupling in an acoustic meta-atom has been shown theoretically to have an upper limit, however the feasibility of reaching this limit has not been experimentally investigated. Here we introduce a meta-atom with Willis coupling which closely approaches this theoretical limit, that is much simpler and less prone to thermo-viscous losses than previously reported structures. We perform two-dimensional experiments to measure the strong Willis coupling, supported by numerical calculations. Our meta-atom geometry is readily modeled analytically, enabling the strength of Willis coupling and its peak frequency to be easily controlled. Together with its ease of fabrication, this will facilitate the design of future high efficiency acoustic devices.

Submitted 5 December, 2018; originally announced December 2018.

Comments: 16 pages, 6 figures

Journal ref: Nature Communications, Volume 10, Article number: 3148 (2019)

arXiv:1811.10343 [pdf, other]

Matchable Image Retrieval by Learning from Surface Reconstruction

Authors: Tianwei Shen, Zixin Luo, Lei Zhou, Runze Zhang, Siyu Zhu, Tian Fang, Long Quan

Abstract: Convolutional Neural Networks (CNNs) have achieved superior performance on object image retrieval, while Bag-of-Words (BoW) models with handcrafted local features still dominate the retrieval of overlapping images in 3D reconstruction. In this paper, we narrow down this gap by presenting an efficient CNN-based method to retrieve images with overlaps, which we refer to as the matchable image retrieval problem. Different from previous methods that generates training data based on sparse reconstruction, we create a large-scale image database with rich 3D geometrics and exploit information from surface reconstruction to obtain fine-grained training data. We propose a batched triplet-based loss function combined with mesh re-projection to effectively learn the CNN representation. The proposed method significantly accelerates the image retrieval process in 3D reconstruction and outperforms the state-of-the-art CNN-based and BoW methods for matchable image retrieval. The code and data are available at https://github.com/hlzz/mirror.

Submitted 10 December, 2018; v1 submitted 26 November, 2018; originally announced November 2018.

Comments: accepted by ACCV 2018

arXiv:1809.09641 [pdf]

doi 10.1103/PhysRevLett.123.064301

Non-Reciprocal Willis Coupling in Zero-Index Moving Media

Authors: Li Quan, Dimitrios L. Sounas, Andrea Alu

Abstract: Mechanical motion can break the symmetry in which sound travels in a medium, but significant non-reciprocity is typically achieved only for very large motion speeds. Here we combine moving media with zero-index acoustic propagation, yielding extreme non-reciprocity and induced bianisotropy for modest applied speeds. The metamaterial is formed by an array of waveguides loaded by Helmholtz resonators, and it exhibits opposite signs of the refractive index sustained by asymmetric Willis coupling for propagation in opposite directions. We use this response to design a non-reciprocal acoustic lens focusing only when excitation from one side, with applications for imaging and ultrasound technology.

Submitted 25 September, 2018; originally announced September 2018.

Journal ref: Phys. Rev. Lett. 123, 064301 (2019)

arXiv:1807.06294 [pdf, other]

doi 10.1007/978-3-030-01240-3_11

GeoDesc: Learning Local Descriptors by Integrating Geometry Constraints

Authors: Zixin Luo, Tianwei Shen, Lei Zhou, Siyu Zhu, Runze Zhang, Yao Yao, Tian Fang, Long Quan

Abstract: Learned local descriptors based on Convolutional Neural Networks (CNNs) have achieved significant improvements on patch-based benchmarks, whereas not having demonstrated strong generalization ability on recent benchmarks of image-based 3D reconstruction. In this paper, we mitigate this limitation by proposing a novel local descriptor learning approach that integrates geometry constraints from multi-view reconstructions, which benefits the learning process in terms of data generation, data sampling and loss computation. We refer to the proposed descriptor as GeoDesc, and demonstrate its superior performance on various large-scale benchmarks, and in particular show its great success on challenging reconstruction tasks. Moreover, we provide guidelines towards practical integration of learned descriptors in Structure-from-Motion (SfM) pipelines, showing the good trade-off that GeoDesc delivers to 3D reconstruction tasks between accuracy and efficiency.

Submitted 16 August, 2018; v1 submitted 17 July, 2018; originally announced July 2018.

Comments: Accepted to ECCV'18

arXiv:1807.05653 [pdf, other]

doi 10.1007/978-3-030-01267-0_31

Learning and Matching Multi-View Descriptors for Registration of Point Clouds

Authors: Lei Zhou, Siyu Zhu, Zixin Luo, Tianwei Shen, Runze Zhang, Mingmin Zhen, Tian Fang, Long Quan

Abstract: Critical to the registration of point clouds is the establishment of a set of accurate correspondences between points in 3D space. The correspondence problem is generally addressed by the design of discriminative 3D local descriptors on the one hand, and the development of robust matching strategies on the other hand. In this work, we first propose a multi-view local descriptor, which is learned from the images of multiple views, for the description of 3D keypoints. Then, we develop a robust matching approach, aiming at rejecting outlier matches based on the efficient inference via belief propagation on the defined graphical model. We have demonstrated the boost of our approaches to registration on the public scanning and multi-view stereo datasets. The superior performance has been verified by the intensive comparisons against a variety of descriptors and matching methods.

Submitted 27 February, 2023; v1 submitted 15 July, 2018; originally announced July 2018.

Showing 1–50 of 57 results for author: Quan, L