
Depth Completion with Multiple Balanced Bases and Confidence for Dense Monocular SLAM

Authors: Weijian Xie, Guanyi Chu, Quanhao Qian, Yihao Yu, Hai Li, Danpeng Chen, Shangjin Zhai, Nan Wang, Hujun Bao, Guofeng Zhang

Abstract: Dense SLAM based on monocular cameras does indeed have immense application value in the field of AR/VR, especially when it is performed on a mobile device. In this paper, we propose a novel method that integrates a light-weight depth completion network into a sparse SLAM system using a multi-basis depth representation, so that dense mapping can be performed online even on a mobile phone. Specifically, we present a specifically optimized multi-basis depth completion network, called BBC-Net, tailored to the characteristics of traditional sparse SLAM systems. BBC-Net can predict multiple balanced bases and a confidence map from a monocular image with sparse points generated by off-the-shelf keypoint-based SLAM systems. The final depth is a linear combination of predicted depth bases that can be optimized by tuning the corresponding weights. To seamlessly incorporate the weights into traditional SLAM optimization and ensure efficiency and robustness, we design a set of depth weight factors, which makes our network a versatile plug-in module, facilitating easy integration into various existing sparse SLAM systems and significantly enhancing global depth consistency through bundle adjustment. To verify the portability of our method, we integrate BBC-Net into two representative SLAM systems. The experimental results on various datasets show that the proposed method achieves better performance in monocular dense mapping than the state-of-the-art methods. We provide an online demo running on a mobile phone, which verifies the efficiency and mapping quality of the proposed method in real-world scenarios.

Submitted 20 September, 2023; v1 submitted 8 September, 2023; originally announced September 2023.
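
The abstract above states that the final depth is a linear combination of predicted depth bases, adjusted by tuning the weights. The following is a minimal sketch of that one idea only, assuming each basis is a small 2D grid of floats. The function name `combine_depth_bases` and the toy values are hypothetical and are not taken from BBC-Net.

```python
from typing import List

DepthMap = List[List[float]]


def combine_depth_bases(bases: List[DepthMap], weights: List[float]) -> DepthMap:
    """Return the per-pixel weighted sum of the depth bases."""
    if len(bases) != len(weights):
        raise ValueError("need exactly one weight per basis")
    rows, cols = len(bases[0]), len(bases[0][0])
    depth = [[0.0] * cols for _ in range(rows)]
    for basis, weight in zip(bases, weights):
        for r in range(rows):
            for c in range(cols):
                depth[r][c] += weight * basis[r][c]
    return depth


# Two hypothetical 2x2 bases (toy values, not network output).
bases = [
    [[1.0, 2.0], [3.0, 4.0]],
    [[0.5, 0.5], [1.0, 1.0]],
]

# Changing only the weights changes every pixel of the resulting depth map.
print(combine_depth_bases(bases, [1.0, 0.0]))
print(combine_depth_bases(bases, [0.5, 2.0]))
```

Because the whole depth map depends on only a handful of weights, an optimizer would have a few scalars per frame to adjust rather than one value per pixel. How the paper's depth weight factors enter bundle adjustment is not shown here.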

arXiv:2109.07690 [pdf, other]

The Neural Metric Factorization for Computational Drug Repositioning

Authors: Xinxing Yang, Genke Yang, Jian Chu

Abstract: Computational drug repositioning aims to discover new therapeutic diseases for marketed drugs and has the advantages of low cost, short development cycle, and high controllability compared to traditional drug development. The matrix factorization model has become the cornerstone technique for computational drug repositioning due to its ease of implementation and excellent scalability. However, the matrix factorization model uses the inner product to represent the association between drugs and diseases, which is lacking in expressive ability. Moreover, the degree of similarity of drugs or diseases cannot be inferred from their respective latent factor vectors, which does not match the common sense of drug discovery. Therefore, a neural metric factorization model (NMF) for computational drug repositioning is proposed in this work. We treat the latent factor vector of each drug and disease as a point in a high-dimensional coordinate system and propose a generalized Euclidean distance to represent the association between drugs and diseases, compensating for the shortcomings of the inner product. Furthermore, by embedding multiple drug (disease) metrics information into the encoding space of the latent factor vector, the information about the similarity between drugs (diseases) can be reflected in the distance between latent factor vectors. Finally, we conduct extensive analysis experiments on two real datasets to demonstrate the effectiveness of the above improvements and the superiority of the NMF model.

Submitted 28 November, 2021; v1 submitted 15 September, 2021; originally announced September 2021.

Comments: 16 pages
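
The abstract above contrasts the inner product with a distance-based association score. The sketch below uses the plain Euclidean distance, which is a stand-in and not the paper's generalized Euclidean distance. The vectors `drug`, `disease_a`, and `disease_b` are made-up toy values.

```python
import math
from typing import Sequence


def inner_product(u: Sequence[float], v: Sequence[float]) -> float:
    """Association score used by standard matrix factorization."""
    return sum(a * b for a, b in zip(u, v))


def euclidean_distance(u: Sequence[float], v: Sequence[float]) -> float:
    """Plain Euclidean distance; a smaller value means a closer association."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


# Hypothetical 2-dimensional latent factor vectors.
drug = [1.0, 0.0]
disease_a = [3.0, 0.0]  # far from the drug, but with a large norm
disease_b = [1.0, 0.1]  # almost at the same point as the drug

# The inner product prefers disease_a because of its larger norm.
print(inner_product(drug, disease_a), inner_product(drug, disease_b))

# The distance prefers disease_b because it lies closer to the drug.
print(euclidean_distance(drug, disease_a), euclidean_distance(drug, disease_b))
```

A distance also satisfies the triangle inequality, so two drugs that sit close together in the latent space necessarily receive similar distances to any disease. The inner product gives no such guarantee.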

arXiv:2107.07866 [pdf, other]

MD Simulation of Hundred-Billion-Metal-Atom Cascade Collision on Sunway Taihulight

Authors: Genshen Chu, Yang Li, Runchu Zhao, Shuai Ren, Wen Yang, Xinfu He, Changjun Hu, Jue Wang

Abstract: Radiation damage to the steel material of reactor pressure vessels is a major threat to nuclear reactor safety. It is caused by the metal atom cascade collision, initiated when the atoms are struck by a high-energy neutron. The paper presents MISA-MD, a new implementation of molecular dynamics, to simulate such cascade collisions with the EAM potential. MISA-MD realizes (1) a hash-based data structure to efficiently store an atom and find its neighbors, and (2) several acceleration and optimization strategies based on the SW26010 processor of the Sunway Taihulight supercomputer, including an efficient potential table storage and interpolation method, a coloring method to avoid write conflicts, and double-buffer and data reuse strategies. The experimental results demonstrate that MISA-MD has good accuracy and scalability, and obtains a parallel efficiency of over 79% in a 655-billion-atom system. Compared with the state-of-the-art MD program LAMMPS, MISA-MD requires less memory and achieves better computational performance.

Submitted 16 July, 2021; originally announced July 2021.

arXiv:2010.04904 [pdf, other]

Multi-path Neural Networks for On-device Multi-domain Visual Classification

Authors: Qifei Wang, Junjie Ke, Joshua Greaves, Grace Chu, Gabriel Bender, Luciano Sbaiz, Alec Go, Andrew Howard, Feng Yang, Ming-Hsuan Yang, Jeff Gilbert, Peyman Milanfar

Abstract: Learning multiple domains/tasks with a single model is important for improving data efficiency and lowering inference cost for numerous vision tasks, especially on resource-constrained mobile devices. However, hand-crafting a multi-domain/task model can be both tedious and challenging. This paper proposes a novel approach to automatically learn a multi-path network for multi-domain visual classification on mobile devices. The proposed multi-path network is learned from neural architecture search by applying one reinforcement learning controller for each domain to select the best path in the super-network created from a MobileNetV3-like search space. An adaptive balanced domain prioritization algorithm is proposed to balance optimizing the joint model on multiple domains simultaneously. The determined multi-path model selectively shares parameters across domains in shared nodes while keeping domain-specific parameters within non-shared nodes in individual domain paths. This approach effectively reduces the total number of parameters and FLOPS, encouraging positive knowledge transfer while mitigating negative interference across domains. Extensive evaluations on the Visual Decathlon dataset demonstrate that the proposed multi-path model achieves state-of-the-art performance in terms of accuracy, model size, and FLOPS against other approaches using MobileNetV3-like architectures. Furthermore, the proposed method improves average accuracy over learning single-domain models individually, and reduces the total number of parameters and FLOPS by 78% and 32% respectively, compared to the approach that simply bundles single-domain models for multi-domain learning.

Submitted 8 January, 2021; v1 submitted 10 October, 2020; originally announced October 2020.

Comments: WACV 2021

arXiv:2008.08178 [pdf, other]

Discovering Multi-Hardware Mobile Models via Architecture Search

Authors: Grace Chu, Okan Arikan, Gabriel Bender, Weijun Wang, Achille Brighton, Pieter-Jan Kindermans, Hanxiao Liu, Berkin Akin, Suyog Gupta, Andrew Howard

Abstract: Hardware-aware neural architecture designs have been predominantly focusing on optimizing model performance on single hardware and model development complexity, where another important factor, model deployment complexity, has been largely ignored. In this paper, we argue that, for applications that may be deployed on multiple hardware, having different single-hardware models across the deployed hardware makes it hard to guarantee consistent outputs across hardware and duplicates engineering work for debugging and fixing. To minimize such deployment cost, we propose an alternative solution, multi-hardware models, where a single architecture is developed for multiple hardware. With thoughtful search space design and incorporating the proposed multi-hardware metrics in neural architecture search, we discover multi-hardware models that give state-of-the-art (SoTA) performance across multiple hardware in both average and worst case scenarios. For performance on individual hardware, the single multi-hardware model yields similar or better results than SoTA performance on accelerators like GPU, DSP and EdgeTPU, which was achieved by different models, while having similar performance to the MobileNetV3 Large Minimalistic model on mobile CPU.

Submitted 23 April, 2021; v1 submitted 18 August, 2020; originally announced August 2020.

Comments: CVPR Workshop 2021

arXiv:2008.06120 [pdf, other]

Can weight sharing outperform random architecture search? An investigation with TuNAS

Authors: Gabriel Bender, Hanxiao Liu, Bo Chen, Grace Chu, Shuyang Cheng, Pieter-Jan Kindermans, Quoc Le

Abstract: Efficient Neural Architecture Search methods based on weight sharing have shown good promise in democratizing Neural Architecture Search for computer vision models. There is, however, an ongoing debate whether these efficient methods are significantly better than random search. Here we perform a thorough comparison between efficient and random search methods on a family of progressively larger and more challenging search spaces for image classification and detection on ImageNet and COCO. While the efficacies of both methods are problem-dependent, our experiments demonstrate that there are large, realistic tasks where efficient search methods can provide substantial gains over random search. In addition, we propose and evaluate techniques which improve the quality of searched architectures and reduce the need for manual hyper-parameter tuning. Source code and experiment data are available at https://github.com/google-research/google-research/tree/master/tunas

Submitted 13 August, 2020; originally announced August 2020.

Comments: Published at CVPR 2020

ACM Class: I.2.10

Journal ref: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2020, pp. 14323-14332

arXiv:1908.06970 [pdf]

Agent-based (BDI) modeling for automation of penetration testing

Authors: Ge Chu, Alexei Lisitsa

Abstract: Penetration testing (or pentesting) is one of the widely used and important methodologies to assess the security of computer systems and networks. Traditional pentesting relies on domain expert knowledge and requires considerable human effort, all of which incurs a high cost. Automation can significantly improve the efficiency and availability, and lower the cost, of penetration testing. Existing approaches to automation include those which map vulnerability scanner results to the corresponding exploit tools, and those addressing pentesting as a planning problem expressed in terms of attack graphs. Due to mainly non-interactive processing, such solutions can deal effectively only with static and simple targets. In this paper, we propose an automated penetration testing approach based on the belief-desire-intention (BDI) agent model, which is central in the research on agent-based processing in that it deals interactively with dynamic, uncertain and complex environments. Penetration testing actions are defined as a series of BDI plans and the BDI reasoning cycle is used to represent the penetration testing process. The model is extensible and new plans can be added once they have been elicited from human experts. We report on the results of testing a proof-of-concept BDI-based penetration testing tool in a simulated environment.

Submitted 18 August, 2019; originally announced August 2019.

arXiv:1906.01737 [pdf, other]

Geo-Aware Networks for Fine-Grained Recognition

Authors: Grace Chu, Brian Potetz, Weijun Wang, Andrew Howard, Yang Song, Fernando Brucher, Thomas Leung, Hartwig Adam

Abstract: Fine-grained recognition distinguishes among categories with subtle visual differences. In order to differentiate between these challenging visual categories, it is helpful to leverage additional information. Geolocation is a rich source of additional information that can be used to improve fine-grained classification accuracy, but has been understudied. Our contributions to this field are twofold. First, to the best of our knowledge, this is the first paper which systematically examined various ways of incorporating geolocation information into fine-grained image classification through the use of geolocation priors, post-processing or feature modulation. Secondly, to overcome the situation where no fine-grained dataset has complete geolocation information, we release two fine-grained datasets with geolocation by providing complementary information to existing popular datasets - iNaturalist and YFCC100M. By leveraging geolocation information we improve top-1 accuracy in iNaturalist from 70.1% to 79.0% for a strong baseline image-only model. Comparing several models, we found that best performance was achieved by a post-processing model that consumed the output of the image-only baseline alongside geolocation. However, for a resource-constrained model (MobileNetV2), performance was better with a feature modulation model that trains jointly over pixels and geolocation: accuracy increased from 59.6% to 72.2%. Our work makes a strong case for incorporating geolocation information in fine-grained recognition models for both server and on-device.

Submitted 4 September, 2019; v1 submitted 4 June, 2019; originally announced June 2019.

Comments: ICCVW 2019

arXiv:1905.02244 [pdf, other]

Searching for MobileNetV3

Authors: Andrew Howard, Mark Sandler, Grace Chu, Liang-Chieh Chen, Bo Chen, Mingxing Tan, Weijun Wang, Yukun Zhu, Ruoming Pang, Vijay Vasudevan, Quoc V. Le, Hartwig Adam

Abstract: We present the next generation of MobileNets based on a combination of complementary search techniques as well as a novel architecture design. MobileNetV3 is tuned to mobile phone CPUs through a combination of hardware-aware network architecture search (NAS) complemented by the NetAdapt algorithm and then subsequently improved through novel architecture advances. This paper starts the exploration of how automated search algorithms and network design can work together to harness complementary approaches improving the overall state of the art. Through this process we create two new MobileNet models for release: MobileNetV3-Large and MobileNetV3-Small which are targeted for high and low resource use cases. These models are then adapted and applied to the tasks of object detection and semantic segmentation. For the task of semantic segmentation (or any dense pixel prediction), we propose a new efficient segmentation decoder Lite Reduced Atrous Spatial Pyramid Pooling (LR-ASPP). We achieve new state of the art results for mobile classification, detection and segmentation. MobileNetV3-Large is 3.2% more accurate on ImageNet classification while reducing latency by 15% compared to MobileNetV2. MobileNetV3-Small is 4.6% more accurate while reducing latency by 5% compared to MobileNetV2. MobileNetV3-Large detection is 25% faster at roughly the same accuracy as MobileNetV2 on COCO detection. MobileNetV3-Large LR-ASPP is 30% faster than MobileNetV2 R-ASPP at similar accuracy for Cityscapes segmentation.

Submitted 20 November, 2019; v1 submitted 6 May, 2019; originally announced May 2019.

Comments: ICCV 2019

arXiv:1805.02751 [pdf, other]

doi 10.1109/JIOT.2018.2866423

Security and Privacy Analyses of Internet of Things Children's Toys

Authors: Gordon Chu, Noah Apthorpe, Nick Feamster

Abstract: This paper investigates the security and privacy of Internet-connected children's smart toys through case studies of three commercially-available products. We conduct network and application vulnerability analyses of each toy using static and dynamic analysis techniques, including application binary decompilation and network monitoring. We discover several publicly undisclosed vulnerabilities that violate the Children's Online Privacy Protection Rule (COPPA) as well as the toys' individual privacy policies. These vulnerabilities, especially security flaws in network communications with first-party servers, are indicative of a disconnect between many IoT toy developers and security and privacy best practices despite increased attention to Internet-connected toy hacking risks.

Submitted 28 August, 2018; v1 submitted 7 May, 2018; originally announced May 2018.

Comments: 8 pages, 8 figures; publication version

Journal ref: IEEE Internet of Things Journal (IoT-J), 2018

arXiv:1507.07648 [pdf, ps, other]

Projected Model Counting

Authors: Rehan Abdul Aziz, Geoffrey Chu, Christian Muise, Peter Stuckey

Abstract: Model counting is the task of computing the number of assignments to variables V that satisfy a given propositional theory F. Model counting is an essential tool in probabilistic reasoning. In this paper, we introduce the problem of model counting projected on a subset P of original variables that we call 'priority' variables. The task is to compute the number of assignments to P such that there exists an extension to 'non-priority' variables V \ P that satisfies F. Projected model counting arises when some parts of the model are irrelevant to the counts, in particular when we require additional variables to model the problem we are counting in SAT. We discuss three different approaches to projected model counting (two of which are novel), and compare their performance on different benchmark problems. To appear in 18th International Conference on Theory and Applications of Satisfiability Testing, September 24-27, 2015, Austin, Texas, USA

Submitted 28 July, 2015; originally announced July 2015.
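
The definition in the abstract above can be made concrete with a brute-force counter. This is a sketch of the problem statement only, not one of the three approaches the paper discusses, and it enumerates all assignments, so it is usable only for tiny formulas. The formula `f` and the variable names are hypothetical.

```python
from itertools import product
from typing import Callable, Dict, Sequence

Assignment = Dict[str, bool]


def count_models(variables: Sequence[str],
                 formula: Callable[[Assignment], bool]) -> int:
    """Count the assignments to all variables V that satisfy F."""
    total = 0
    for values in product([False, True], repeat=len(variables)):
        if formula(dict(zip(variables, values))):
            total += 1
    return total


def count_projected_models(variables: Sequence[str],
                           priority: Sequence[str],
                           formula: Callable[[Assignment], bool]) -> int:
    """Count the assignments to P that extend to some model of F."""
    seen = set()
    for values in product([False, True], repeat=len(variables)):
        assignment = dict(zip(variables, values))
        if formula(assignment):
            # Keep only the priority part; duplicates collapse in the set.
            seen.add(tuple(assignment[p] for p in priority))
    return len(seen)


# Hypothetical theory F = (a or b) and (c or not a).
def f(m: Assignment) -> bool:
    return (m["a"] or m["b"]) and (m["c"] or not m["a"])


print(count_models(["a", "b", "c"], f))
print(count_projected_models(["a", "b", "c"], ["a", "b"], f))
```

Here F has four models over {a, b, c}, but two of them differ only in the value of c, so projecting onto P = {a, b} leaves three distinct assignments.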

arXiv:1411.5410 [pdf, other]

Stable Model Counting and Its Application in Probabilistic Logic Programming

Authors: Rehan Abdul Aziz, Geoffrey Chu, Christian Muise, Peter Stuckey

Abstract: Model counting is the problem of computing the number of models that satisfy a given propositional theory. It has recently been applied to solving inference tasks in probabilistic logic programming, where the goal is to compute the probability of given queries being true provided a set of mutually independent random variables, a model (a logic program) and some evidence. The core of solving this inference task involves translating the logic program to a propositional theory and using a model counter. In this paper, we show that for some problems that involve inductive definitions like reachability in a graph, the translation of logic programs to SAT can be expensive for the purpose of solving inference tasks. For such problems, direct implementation of stable model semantics allows for more efficient solving. We present two implementation techniques, based on unfounded set detection, that extend a propositional model counter to a stable model counter. Our experiments show that for particular problems, our approach can outperform a state-of-the-art probabilistic logic programming solver by several orders of magnitude in terms of running time and space requirements, and can solve instances of significantly larger sizes on which the current solver runs out of time or memory.

Submitted 19 November, 2014; originally announced November 2014.

Comments: Accepted in AAAI, 2015

arXiv:1405.3362 [pdf, ps, other]

Grounding Bound Founded Answer Set Programs

Authors: Rehan Abdul Aziz, Geoffrey Chu, Peter James Stuckey

Abstract: To appear in Theory and Practice of Logic Programming (TPLP). Bound Founded Answer Set Programming (BFASP) is an extension of Answer Set Programming (ASP) that extends stable model semantics to numeric variables. While the theory of BFASP is defined on ground rules, in practice BFASP programs are written as complex non-ground expressions. Flattening of BFASP is a technique used to simplify arbitrary expressions of the language to a small and well defined set of primitive expressions. In this paper, we first show how we can flatten arbitrary BFASP rule expressions, to give equivalent BFASP programs. Next, we extend the bottom-up grounding technique and magic set transformation used by ASP to BFASP programs. Our implementation shows that for BFASP problems, these techniques can significantly reduce the ground program size, and improve subsequent solving.

Submitted 14 May, 2014; originally announced May 2014.

arXiv:1306.4418 [pdf, other]

Structure Based Extended Resolution for Constraint Programming

Authors: Geoffrey Chu, Peter J. Stuckey

Abstract: Nogood learning is a powerful approach to reducing search in Constraint Programming (CP) solvers. The current state of the art, called Lazy Clause Generation (LCG), uses resolution to derive nogoods expressing the reasons for each search failure. Such nogoods can prune other parts of the search tree, producing exponential speedups on a wide variety of problems. Nogood learning solvers can be seen as resolution proof systems. The stronger the proof system, the faster it can solve a CP problem. It has recently been shown that the proof system used in LCG is at least as strong as general resolution. However, stronger proof systems such as extended resolution exist. Extended resolution allows for literals expressing arbitrary logical concepts over existing variables to be introduced and can allow exponentially smaller proofs than general resolution. The primary problem in using extended resolution is to figure out exactly which literals are useful to introduce. In this paper, we show that we can use the structural information contained in a CP model in order to introduce useful literals, and that this can translate into significant speedups on a range of problems.

Submitted 19 June, 2013; originally announced June 2013.

Showing 1–14 of 14 results for author: Chu, G