
I$^2$-SDF: Intrinsic Indoor Scene Reconstruction and Editing via Raytracing in Neural SDFs

Authors: Jingsen Zhu, Yuchi Huo, Qi Ye, Fujun Luan, Jifan Li, Dianbing Xi, Lisha Wang, Rui Tang, Wei Hua, Hujun Bao, Rui Wang

Abstract: In this work, we present I$^2$-SDF, a new method for intrinsic indoor scene reconstruction and editing using differentiable Monte Carlo raytracing on neural signed distance fields (SDFs). Our holistic neural SDF-based framework jointly recovers the underlying shapes, incident radiance and materials from multi-view images. We introduce a novel bubble loss for fine-grained small objects and error-guided adaptive sampling scheme to largely improve the reconstruction quality on large-scale indoor scenes. Further, we propose to decompose the neural radiance field into spatially-varying material of the scene as a neural field through surface-based, differentiable Monte Carlo raytracing and emitter semantic segmentations, which enables physically based and photorealistic scene relighting and editing applications. Through a number of qualitative and quantitative experiments, we demonstrate the superior quality of our method on indoor scene reconstruction, novel view synthesis, and scene editing compared to state-of-the-art baselines.

Submitted 29 March, 2023; v1 submitted 14 March, 2023; originally announced March 2023.

Comments: Accepted by CVPR 2023, project page: https://**gsenzhu.github.io/i2-sdf

arXiv:2302.14338 [pdf, other]

Turning a CLIP Model into a Scene Text Detector

Authors: Wenwen Yu, Yuliang Liu, Wei Hua, Deqiang Jiang, Bo Ren, Xiang Bai

Abstract: The recent large-scale Contrastive Language-Image Pretraining (CLIP) model has shown great potential in various downstream tasks via leveraging the pretrained vision and language knowledge. Scene text, which contains rich textual and visual information, has an inherent connection with a model like CLIP. Recently, pretraining approaches based on vision language models have made effective progress in the field of text detection. In contrast to these works, this paper proposes a new method, termed TCM, focusing on Turning the CLIP Model directly for text detection without a pretraining process. We demonstrate the advantages of the proposed TCM as follows: (1) The underlying principle of our framework can be applied to improve existing scene text detectors. (2) It facilitates the few-shot training capability of existing methods, e.g., by using 10% of labeled data, we significantly improve the performance of the baseline method with an average of 22% in terms of the F-measure on 4 benchmarks. (3) By turning the CLIP model into existing scene text detection methods, we further achieve promising domain adaptation ability. The code will be publicly released at https://github.com/wenwenyu/TCM.

Submitted 26 March, 2023; v1 submitted 28 February, 2023; originally announced February 2023.

Comments: CVPR2023

arXiv:2212.12418 [pdf]

Dynamic Speed Guidance for CAV Ramp Merging in Non-Cooperative Environment: An On-Site Experiment

Authors: Wei Ji, Yechi Ma, Guangzhang Cui, Xiaotian Qin, Wei Hua

Abstract: Ramp merging is a typical application of cooperative intelligent transportation systems (C-ITS). Vehicle trajectories perceived by roadside sensors are an important complement to the limited visual field of on-board perception. A vehicle tracking and trajectory denoising algorithm is proposed in this paper to take full advantage of roadside cameras for vehicle trajectory and speed profile estimation. A dynamic speed guidance algorithm is proposed to help on-ramp vehicles merge into the mainline smoothly, even in a non-cooperative environment where mainline vehicles are not expected to slow down to accommodate on-ramp vehicles. On-site experiments were carried out in a merging area of Hangzhou Belt Highway to test our prototype system, and simulation analysis shows our proposed algorithm can achieve significant fuel savings during the ramp merging process.

Submitted 21 December, 2022; originally announced December 2022.

Comments: This work has been submitted to IFAC for possible publication

arXiv:2212.08204 [pdf, other]

LegalRelectra: Mixed-domain Language Modeling for Long-range Legal Text Comprehension

Authors: Wenyue Hua, Yuchen Zhang, Zhe Chen, Josie Li, Melanie Weber

Abstract: The application of Natural Language Processing (NLP) to specialized domains, such as the law, has recently received a surge of interest. As many legal services rely on processing and analyzing large collections of documents, automating such tasks with NLP tools emerges as a key challenge. Many popular language models, such as BERT or RoBERTa, are general-purpose models, which have limitations on processing specialized legal terminology and syntax. In addition, legal documents may contain specialized vocabulary from other domains, such as medical terminology in personal injury text. Here, we propose LegalRelectra, a legal-domain language model that is trained on mixed-domain legal and medical corpora. We show that our model improves over general-domain and single-domain medical and legal language models when processing mixed-domain (personal injury) text. Our training architecture implements the Electra framework, but utilizes Reformer instead of BERT for its generator and discriminator. We show that this improves the model's performance on processing long passages and results in better long-range text comprehension.

Submitted 15 December, 2022; originally announced December 2022.

arXiv:2212.02019 [pdf, other]

SASFormer: Transformers for Sparsely Annotated Semantic Segmentation

Authors: Hui Su, Yue Ye, Wei Hua, Lechao Cheng, Mingli Song

Abstract: Semantic segmentation based on sparse annotation has advanced in recent years. It labels only part of each object in the image, leaving the remainder unlabeled. Most of the existing approaches are time-consuming and often necessitate a multi-stage training strategy. In this work, we propose a simple yet effective sparse annotated semantic segmentation framework based on segformer, dubbed SASFormer, that achieves remarkable performance. Specifically, the framework first generates hierarchical patch attention maps, which are then multiplied by the network predictions to produce correlated regions separated by valid labels. Besides, we also introduce the affinity loss to ensure consistency between the features of correlation results and network predictions. Extensive experiments showcase that our proposed approach is superior to existing methods and achieves cutting-edge performance. The source code is available at https://github.com/su-hui-zz/SASFormer.

Submitted 25 February, 2023; v1 submitted 4 December, 2022; originally announced December 2022.

Comments: 8 pages, 6 figures, 6 tables; version4.0

arXiv:2211.16101 [pdf, other]

Dependency-aware Self-training for Entity Alignment

Authors: Bing Liu, Tiancheng Lan, Wen Hua, Guido Zuccon

Abstract: Entity Alignment (EA), which aims to detect entity mappings (i.e. equivalent entity pairs) in different Knowledge Graphs (KGs), is critical for KG fusion. Neural EA methods dominate current EA research but still suffer from their reliance on labelled mappings. To solve this problem, a few works have explored boosting the training of EA models with self-training, which adds confidently predicted mappings into the training data iteratively. Though the effectiveness of self-training can be glimpsed in some specific settings, we still have very limited knowledge about it. One reason is the existing works concentrate on devising EA models and only treat self-training as an auxiliary tool. To fill this knowledge gap, we change the perspective to self-training to shed light on it. In addition, the existing self-training strategies have limited impact because they introduce either much False Positive noise or a low quantity of True Positive pseudo mappings. To improve self-training for EA, we propose exploiting the dependencies between entities, a particularity of EA, to suppress the noise without hurting the recall of True Positive mappings. Through extensive experiments, we show that the introduction of dependency makes the self-training strategy for EA reach a new level. The value of self-training in alleviating the reliance on annotation is actually much higher than what has been realised. Furthermore, we suggest future study on smart data annotation to break the ceiling of EA performance.

Submitted 29 November, 2022; originally announced November 2022.

Comments: WSDM 2023

arXiv:2211.15833 [pdf, other]

Guiding Neural Entity Alignment with Compatibility

Authors: Bing Liu, Harrisen Scells, Wen Hua, Guido Zuccon, Genghong Zhao, Xia Zhang

Abstract: Entity Alignment (EA) aims to find equivalent entities between two Knowledge Graphs (KGs). While numerous neural EA models have been devised, they are mainly learned using labelled data only. In this work, we argue that different entities within one KG should have compatible counterparts in the other KG due to the potential dependencies among the entities. Making compatible predictions thus should be one of the goals of training an EA model along with fitting the labelled data: this aspect however is neglected in current methods. To power neural EA models with compatibility, we devise a training framework by addressing three problems: (1) how to measure the compatibility of an EA model; (2) how to inject the property of being compatible into an EA model; (3) how to optimise parameters of the compatibility model. Extensive experiments on widely-used datasets demonstrate the advantages of integrating compatibility within EA models. In fact, state-of-the-art neural EA models trained within our framework using just 5% of the labelled data can achieve comparable effectiveness with supervised training using 20% of the labelled data.

Submitted 28 November, 2022; originally announced November 2022.

Comments: EMNLP 2022

arXiv:2211.04476 [pdf, other]

Discover, Explanation, Improvement: An Automatic Slice Detection Framework for Natural Language Processing

Authors: Wenyue Hua, Lifeng Jin, Linfeng Song, Haitao Mi, Yongfeng Zhang, Dong Yu

Abstract: Pretrained natural language processing (NLP) models have achieved high overall performance, but they still make systematic errors. Instead of manual error analysis, research on slice detection models (SDM), which automatically identify underperforming groups of datapoints, has caught escalated attention in Computer Vision for both understanding model behaviors and providing insights for future model training and designing. However, little research on SDM and quantitative evaluation of their effectiveness have been conducted on NLP tasks. Our paper fills the gap by proposing a benchmark named "Discover, Explain, Improve (DEIM)" for classification NLP tasks along with a new SDM Edisa. Edisa discovers coherent and underperforming groups of datapoints; DEIM then unites them under human-understandable concepts and provides comprehensive evaluation tasks and corresponding quantitative metrics. The evaluation in DEIM shows that Edisa can accurately select error-prone datapoints with informative semantic features that summarize error patterns. Detecting difficult datapoints directly boosts model performance without tuning any original model parameters, showing that discovered slices are actionable for users.

Submitted 10 September, 2023; v1 submitted 8 November, 2022; originally announced November 2022.

Comments: 15 pages, 5 figures, accepted by Transactions of the Association for Computational Linguistics

arXiv:2209.06994 [pdf, ps, other]

PriorLane: A Prior Knowledge Enhanced Lane Detection Approach Based on Transformer

Authors: Qibo Qiu, Haiming Gao, Wei Hua, Gang Huang, Xiaofei He

Abstract: Lane detection is one of the fundamental modules in self-driving. In this paper we employ a transformer-only method for lane detection, thus it could benefit from the blooming development of fully vision transformer and achieve the state-of-the-art (SOTA) performance on both CULane and TuSimple benchmarks, by fine-tuning the weight fully pre-trained on large datasets. More importantly, this paper proposes a novel and general framework called PriorLane, which is used to enhance the segmentation performance of the fully vision transformer by introducing the low-cost local prior knowledge. Specifically, PriorLane utilizes an encoder-only transformer to fuse the feature extracted by a pre-trained segmentation model with prior knowledge embeddings. Note that a Knowledge Embedding Alignment (KEA) module is adapted to enhance the fusion performance by aligning the knowledge embedding. Extensive experiments on our Zjlab dataset show that PriorLane outperforms SOTA lane detection methods by a 2.82% mIoU when prior knowledge is employed, and the code will be released at: https://github.com/vincentqqb/PriorLane.

Submitted 7 February, 2023; v1 submitted 14 September, 2022; originally announced September 2022.

Comments: Accepted by ICRA 2023

arXiv:2208.11125 [pdf, other]

doi 10.1145/3511808.3557374

Large-scale Entity Alignment via Knowledge Graph Merging, Partitioning and Embedding

Authors: Kexuan Xin, Zequn Sun, Wen Hua, Wei Hu, Jianfeng Qu, Xiaofang Zhou

Abstract: Entity alignment is a crucial task in knowledge graph fusion. However, most entity alignment approaches have the scalability problem. Recent methods address this issue by dividing large KGs into small blocks for embedding and alignment learning in each. However, such a partitioning and learning process results in an excessive loss of structure and alignment. Therefore, in this work, we propose a scalable GNN-based entity alignment approach to reduce the structure and alignment loss from three perspectives. First, we propose a centrality-based subgraph generation algorithm to recall some landmark entities serving as the bridges between different subgraphs. Second, we introduce self-supervised entity reconstruction to recover entity representations from incomplete neighborhood subgraphs, and design cross-subgraph negative sampling to incorporate entities from other subgraphs in alignment learning. Third, during the inference process, we merge the embeddings of subgraphs to make a single space for alignment search. Experimental results on the benchmark OpenEA dataset and the proposed large DBpedia1M dataset verify the effectiveness of our approach.

Submitted 23 August, 2022; originally announced August 2022.

Comments: Accepted by CIKM 2022

arXiv:2208.10366 [pdf, other]

doi 10.1145/3511808.3557352

High-quality Task Division for Large-scale Entity Alignment

Authors: Bing Liu, Wen Hua, Guido Zuccon, Genghong Zhao, Xia Zhang

Abstract: Entity Alignment (EA) aims to match equivalent entities that refer to the same real-world objects and is a key step for Knowledge Graph (KG) fusion. Most neural EA models cannot be applied to large-scale real-life KGs due to their excessive consumption of GPU memory and time. One promising solution is to divide a large EA task into several subtasks such that each subtask only needs to match two small subgraphs of the original KGs. However, it is challenging to divide the EA task without losing effectiveness. Existing methods display low coverage of potential mappings, insufficient evidence in context graphs, and largely differing subtask sizes. In this work, we design the DivEA framework for large-scale EA with high-quality task division. To include in the EA subtasks a high proportion of the potential mappings originally present in the large EA task, we devise a counterpart discovery method that exploits the locality principle of the EA task and the power of trained EA models. Unique to our counterpart discovery method is the explicit modelling of the chance of a potential mapping. We also introduce an evidence passing mechanism to quantify the informativeness of context entities and find the most informative context graphs with flexible control of the subtask size. Extensive experiments show that DivEA achieves higher EA performance than alternative state-of-the-art solutions.

Submitted 22 August, 2022; originally announced August 2022.

arXiv:2207.03722 [pdf, other]

Frequency-based Randomization for Guaranteeing Differential Privacy in Spatial Trajectories

Authors: Fengmei Jin, Wen Hua, Boyu Ruan, Xiaofang Zhou

Abstract: With the popularity of GPS-enabled devices, a huge amount of trajectory data has been continuously collected and a variety of location-based services have been developed that greatly benefit our daily life. However, the released trajectories also bring severe concern about personal privacy, and several recent studies have demonstrated the existence of personally-identifying information in spatial trajectories. Trajectory anonymization is nontrivial due to the trade-off between privacy protection and utility preservation. Furthermore, recovery attack has not been well studied in the current literature. To tackle these issues, we propose a frequency-based randomization model with a rigorous differential privacy guarantee for trajectory data publishing. In particular, we introduce two randomized mechanisms to perturb the local/global frequency distributions of significantly important locations in trajectories by injecting Laplace noise. We design a hierarchical indexing along with a novel search algorithm to support efficient trajectory modification, ensuring the modified trajectories satisfy the perturbed distributions without compromising privacy guarantee or data utility. Extensive experiments on a real-world trajectory dataset verify the effectiveness of our approaches in resisting individual re-identification and recovery attacks and meanwhile preserving desirable data utility as well as the feasibility in practice.

Submitted 8 July, 2022; originally announced July 2022.

Comments: 13 pages, 5 figures, 38th IEEE International Conference on Data Engineering (ICDE) 2022

arXiv:2206.06635 [pdf, ps, other]

CVR-LSE: Compact Vectorization Representation of Local Static Environments for Unmanned Ground Vehicles

Authors: Haiming Gao, Qibo Qiu, Wei Hua, Xuebo Zhang, Zhengyong Han, Shun Zhang

Abstract: To meet the requirements of general static obstacle detection, this paper proposes a compact vectorization representation approach of local static environments for unmanned ground vehicles. First, by fusing LiDAR and IMU data, high-frequency pose information is obtained. Then, through two-dimensional (2D) obstacle point generation, a process of grid map maintenance with a fixed size is proposed. Finally, the local static environment is described via multiple convex polygons, which is realized through double threshold-based boundary simplification and convex polygon segmentation. Our proposed approach has been applied in a practical driverless project in the park, and the qualitative experimental results on typical scenes verify its effectiveness and robustness. In addition, the quantitative evaluation shows superior performance in using fewer points (decreased by about 60%) to represent the local static environment compared with the traditional grid map-based methods. Furthermore, the running time (15ms) shows that the proposed approach can be used for real-time local static environment perception. The corresponding code can be accessed at https://github.com/ghm0819/cvr_lse.

Submitted 14 June, 2022; originally announced June 2022.

arXiv:2203.14441 [pdf, other]

An Interactive Image-based Modeling System

Authors: Zhi He, Rui Wang, Wei Hua, Yuchi Huo

Abstract: This paper proposes an interactive 3D modeling method and corresponding system based on single or multiple uncalibrated images. The main feature of this method is that, according to the modeling habits of ordinary people, the 3D model of the target is reconstructed from coarse to fine images. On the basis of determining the approximate shape, the user adds or modifies projection constraints and spatial constraints, applies topology modification, gradually realizes camera calibration, refines the rough model, and finally completes the reconstruction of objects with arbitrary geometry and topology. During the interactive process, the geometric parameters and camera projection matrix are solved in real time, and the reconstruction results are displayed in a 3D window.

Submitted 27 March, 2022; originally announced March 2022.

arXiv:2203.12339 [pdf, other]

Real-time Rendering and Editing of Scattering Effects for Translucent Objects

Authors: Rui Wang, Wei Hua, Yuchi Huo, Hujun Bao

Abstract: The photorealistic rendering of the transparent effect of translucent objects has been a hot research topic in recent years. A real-time photorealistic rendering and dynamic material editing method for the diffuse scattering effect of translucent objects is proposed based on the dipole approximation of the bidirectional surface scattering reflectance distribution function (BSSRDF). The diffuse scattering material function in the dipole approximation is decomposed into the product of a shape-related function and a translucent material-related function through principal component analysis; using this decomposition, real-time editing of translucent object materials under various light sources is realized within a real-time photorealistic rendering framework based on precomputed radiative transfer and scattering transfer. In addition, a method for quadratic wavelet compression of precomputed radiative transfer data in the spatial domain is also proposed. Using the correlation of surface points in their spatial distribution, the data is greatly compressed and rendering efficiency is improved while the rendering quality is preserved. The experimental results show that the method in this paper can generate a highly realistic translucent effect and ensure real-time rendering speed.

Submitted 23 March, 2022; originally announced March 2022.

arXiv:2203.11484 [pdf, other]

A Virtual Point Light Generation Method in Close-Range Area

Authors: Shihao Jin, Rui Wang, Wenting Zheng, Wei Hua, Yuchi Huo

Abstract: This paper proposes a new hybrid algorithm for sampling virtual point lights (VPLs). The indirect lighting calculation of the scene is used to distribute the VPLs reasonably. In the process of generating VPLs, we divide the scene into two parts according to the camera position and orientation: the close-range part, which the camera pays attention to, and the distant-range part, which the camera does not pay attention to or rarely pays attention to. For the close-range part, we use a patch-based VPL sampling method to distribute the VPLs as evenly as possible on the patches in the near-field area; for the distant-range part, we use sparse instant radiosity (IR) for sampling. Compared with conventional instant radiosity VPL generation algorithms, the method proposed in this paper can greatly improve the quality of the final rendered image when the number of VPLs is the same; at the same rendering quality, the rendering speed can be greatly improved.

Submitted 22 March, 2022; originally announced March 2022.

arXiv:2203.10521 [pdf, other]

Variational Hierarchical Directed Bounding Box Construction for Solid Mesh Models

Authors: Rui Wang, Wei Hua, Gaofeng Xu, Yuchi Huo, Hujun Bao

Abstract: The oriented bounding box tree (OBB-Tree for short) has a wide range of applications in collision detection, real-time rendering, etc. The construction of the hierarchical directed bounding box of the solid mesh model is studied, and a new optimization solution method is proposed. First, the part of the external space volume that does not belong to the solid mesh model is used as the error, and an error calculation method based on hardware acceleration is given. Secondly, the hierarchical bounding box construction problem is transformed into a variational approximation problem, and the optimal hierarchical directed bounding box is obtained by solving for the global error minimum. In the optimization calculation, we propose combining Lloyd clustering iteration within the same layer and MultiGrid-like reciprocating iteration between layers. Compared with previous results, this method generates hierarchical directed bounding boxes that approximate the original solid mesh models more tightly. In the practical application of collision detection, the results constructed using this method can reduce the computational time of collision detection and improve detection efficiency.

Submitted 20 March, 2022; originally announced March 2022.

arXiv:2203.06308 [pdf, other]

Ensemble Semi-supervised Entity Alignment via Cycle-teaching

Authors: Kexuan Xin, Zequn Sun, Wen Hua, Bing Liu, Wei Hu, Jianfeng Qu, Xiaofang Zhou

Abstract: Entity alignment aims to find identical entities in different knowledge graphs. Although embedding-based entity alignment has recently achieved remarkable progress, training data insufficiency remains a critical challenge. Conventional semi-supervised methods also suffer from the incorrect entity alignment in newly proposed training data. To resolve these issues, we design an iterative cycle-teaching framework for semi-supervised entity alignment. The key idea is to train multiple entity alignment models (called aligners) simultaneously and let each aligner iteratively teach its successor the proposed new entity alignment. We propose a diversity-aware alignment selection method to choose reliable entity alignment for each aligner. We also design a conflict resolution mechanism to resolve the alignment conflict when combining the new alignment of an aligner and that from its teacher. Besides, considering the influence of cycle-teaching order, we carefully design a strategy to arrange the optimal order that can maximize the overall performance of multiple aligners. The cycle-teaching process can break the limitations of each model's learning capability and reduce the noise in new training data, leading to improved performance. Extensive experiments on benchmark datasets demonstrate the effectiveness of the proposed cycle-teaching framework, which significantly outperforms the state-of-the-art models when the training data is insufficient and the new entity alignment is noisy.

Submitted 11 March, 2022; originally announced March 2022.

arXiv:2203.02549 [pdf, other]

Structured Pruning is All You Need for Pruning CNNs at Initialization

Authors: Yaohui Cai, Weizhe Hua, Hongzheng Chen, G. Edward Suh, Christopher De Sa, Zhiru Zhang

Abstract: Pruning is a popular technique for reducing the model size and computational cost of convolutional neural networks (CNNs). However, a slow retraining or fine-tuning procedure is often required to recover the accuracy loss caused by pruning. Recently, a new research direction on weight pruning, pruning-at-initialization (PAI), has been proposed to directly prune CNNs before training so that fine-tuning or retraining can be avoided. While PAI has shown promising results in reducing the model size, existing approaches rely on fine-grained weight pruning which requires unstructured sparse matrix computation, making it difficult to achieve real speedup in practice unless the sparsity is very high. This work is the first to show that fine-grained weight pruning is in fact not necessary for PAI. Instead, the layerwise compression ratio is the main critical factor to determine the accuracy of a CNN model pruned at initialization. Based on this key observation, we propose PreCropping, a structured hardware-efficient model compression scheme. PreCropping directly compresses the model at the channel level following the layerwise compression ratio. Compared to weight pruning, the proposed scheme is regular and dense in both storage and computation without sacrificing accuracy. In addition, since PreCropping compresses CNNs at initialization, the computational and memory costs of CNNs are reduced for both training and inference on commodity hardware. We empirically demonstrate our approaches on several modern CNN architectures, including ResNet, ShuffleNet, and MobileNet for both CIFAR-10 and ImageNet.

Submitted 31 May, 2022; v1 submitted 4 March, 2022; originally announced March 2022.

arXiv:2202.10447 [pdf, other]

Transformer Quality in Linear Time

Authors: Weizhe Hua, Zihang Dai, Hanxiao Liu, Quoc V. Le

Abstract: We revisit the design choices in Transformers, and propose methods to address their weaknesses in handling long sequences. First, we propose a simple layer named gated attention unit, which allows the use of a weaker single-head attention with minimal quality loss. We then propose a linear approximation method complementary to this new layer, which is accelerator-friendly and highly competitive in quality. The resulting model, named FLASH, matches the perplexity of improved Transformers over both short (512) and long (8K) context lengths, achieving training speedups of up to 4.9$\times$ on Wiki-40B and 12.1$\times$ on PG-19 for auto-regressive language modeling, and 4.8$\times$ on C4 for masked language modeling.

Submitted 27 June, 2022; v1 submitted 21 February, 2022; originally announced February 2022.

Comments: Accepted to the 39th International Conference on Machine Learning (ICML'22)

arXiv:2202.10098 [pdf, other]

Applications of blockchain and artificial intelligence technologies for enabling prosumers in smart grids: A review

Authors: Weiqi Hua, Ying Chen, Meysam Qadrdan, **g Jiang, Hongjian Sun, Jianzhong Wu

Abstract: Governments' net zero emission target aims at increasing the share of renewable energy sources as well as influencing the behaviours of consumers to support the cost-effective balancing of energy supply and demand. These will be achieved by the advanced information and control infrastructures of smart grids which allow the interoperability among various stakeholders. Under this circumstance, an increasing number of consumers produce, store, and consume energy, giving them a new role of prosumers. The integration of prosumers and accommodation of incurred bidirectional flows of energy and information rely on two key factors: flexible structures of energy markets and intelligent operations of power systems. The blockchain and artificial intelligence (AI) are innovative technologies to fulfil these two factors, by which the blockchain provides decentralised trading platforms for energy markets and the AI supports the optimal operational control of power systems. This paper attempts to address how to incorporate the blockchain and AI in the smart grids for facilitating prosumers to participate in energy markets. To achieve this objective, first, this paper reviews how policy designs price carbon emissions caused by the fossil-fuel based generation so as to facilitate the integration of prosumers with renewable energy sources. Second, the potential structures of energy markets with the support of the blockchain technologies are discussed. Last, how to apply the AI for enhancing the state monitoring and decision making during the operations of power systems is introduced.

Submitted 21 February, 2022; originally announced February 2022.

Comments: Accepted by Renewable & Sustainable Energy Reviews on 21 Feb 2022

MSC Class: 68Txx ACM Class: I.2; J.7

Journal ref: Renewable & Sustainable Energy Reviews 2022

arXiv:2201.00304 [pdf, other]

doi 10.1145/3488560.3498523

Informed Multi-context Entity Alignment

Authors: Kexuan Xin, Zequn Sun, Wen Hua, Wei Hu, Xiaofang Zhou

Abstract: Entity alignment is a crucial step in integrating knowledge graphs (KGs) from multiple sources. Previous attempts at entity alignment have explored different KG structures, such as neighborhood-based and path-based contexts, to learn entity embeddings, but they are limited in capturing the multi-context features. Moreover, most approaches directly utilize the embedding similarity to determine entity alignment without considering the global interaction among entities and relations. In this work, we propose an Informed Multi-context Entity Alignment (IMEA) model to address these issues. In particular, we introduce Transformer to flexibly capture the relation, path, and neighborhood contexts, and design holistic reasoning to estimate alignment probabilities based on both embedding similarity and the relation/entity functionality. The alignment evidence obtained from holistic reasoning is further injected back into the Transformer via the proposed soft label editing to inform embedding learning. Experimental results on several benchmark datasets demonstrate the superiority of our IMEA model compared with existing state-of-the-art entity alignment methods.

Submitted 2 January, 2022; originally announced January 2022.

Comments: Accepted by WSDM 2022

arXiv:2112.06254 [pdf, other]

Sinan: Data Driven Resource Management for Cloud Microservices

Authors: Yanqi Zhang, Weizhe Hua, Zhuangzhuang Zhou, Ed Suh, Christina Delimitrou

Abstract: Cloud applications are increasingly shifting to interactive and loosely-coupled microservices. Despite their advantages, microservices complicate resource management, due to inter-tier dependencies. We present Sinan, a cluster manager for interactive microservices that leverages easily-obtainable tracing data instead of empirical decisions, to infer the impact of a resource allocation on end-to-end performance, and allocate appropriate resources to each tier. In a preliminary evaluation of Sinan with an end-to-end social network built with microservices, we show that Sinan's data-driven approach allows the service to always meet its QoS without sacrificing resource efficiency.

Submitted 12 December, 2021; originally announced December 2021.

Comments: arXiv admin note: substantial text overlap with arXiv:2105.13424

arXiv:2110.06474 [pdf, other]

ActiveEA: Active Learning for Neural Entity Alignment

Authors: Bing Liu, Harrisen Scells, Guido Zuccon, Wen Hua, Genghong Zhao

Abstract: Entity Alignment (EA) aims to match equivalent entities across different Knowledge Graphs (KGs) and is an essential step of KG fusion. Current mainstream methods -- neural EA models -- rely on training with seed alignment, i.e., a set of pre-aligned entity pairs which are very costly to annotate. In this paper, we devise a novel Active Learning (AL) framework for neural EA, aiming to create highly informative seed alignment to obtain more effective EA models with less annotation cost. Our framework tackles two main challenges encountered when applying AL to EA: (1) How to exploit dependencies between entities within the AL strategy. Most AL strategies assume that the data instances to sample are independent and identically distributed. However, entities in KGs are related. To address this challenge, we propose a structure-aware uncertainty sampling strategy that can measure the uncertainty of each entity as well as its impact on its neighbour entities in the KG. (2) How to recognise entities that appear in one KG but not in the other KG (i.e., bachelors). Identifying bachelors would likely save annotation budget. To address this challenge, we devise a bachelor recognizer paying attention to alleviate the effect of sampling bias. Empirical results show that our proposed AL strategy can significantly improve sampling quality with good generality across different datasets, EA models and amount of bachelors.

Submitted 12 October, 2021; originally announced October 2021.

arXiv:2110.02369 [pdf, ps, other]

EntQA: Entity Linking as Question Answering

Authors: Wenzheng Zhang, Wenyue Hua, Karl Stratos

Abstract: A conventional approach to entity linking is to first find mentions in a given document and then infer their underlying entities in the knowledge base. A well-known limitation of this approach is that it requires finding mentions without knowing their entities, which is unnatural and difficult. We present a new model that does not suffer from this limitation called EntQA, which stands for Entity linking as Question Answering. EntQA first proposes candidate entities with a fast retrieval module, and then scrutinizes the document to find mentions of each candidate with a powerful reader module. Our approach combines progress in entity linking with that in open-domain question answering and capitalizes on pretrained models for dense entity retrieval and reading comprehension. Unlike in previous works, we do not rely on a mention-candidates dictionary or large-scale weak supervision. EntQA achieves strong results on the GERBIL benchmarking platform.

Submitted 7 March, 2022; v1 submitted 5 October, 2021; originally announced October 2021.

Comments: ICLR 2022

arXiv:2109.14707 [pdf, other]

BulletTrain: Accelerating Robust Neural Network Training via Boundary Example Mining

Authors: Weizhe Hua, Yichi Zhang, Chuan Guo, Zhiru Zhang, G. Edward Suh

Abstract: Neural network robustness has become a central topic in machine learning in recent years. Most training algorithms that improve the model's robustness to adversarial and common corruptions also introduce a large computational overhead, requiring as many as ten times the number of forward and backward passes in order to converge. To combat this inefficiency, we propose BulletTrain $-$ a boundary example mining technique to drastically reduce the computational cost of robust training. Our key observation is that only a small fraction of examples are beneficial for improving robustness. BulletTrain dynamically predicts these important examples and optimizes robust training algorithms to focus on the important examples. We apply our technique to several existing robust training algorithms and achieve a 2.1$\times$ speed-up for TRADES and MART on CIFAR-10 and a 1.7$\times$ speed-up for AugMix on CIFAR-10-C and CIFAR-100-C without any reduction in clean and robust accuracy.

Submitted 4 December, 2021; v1 submitted 29 September, 2021; originally announced September 2021.

Comments: Appeared in NeurIPS 2021

arXiv:2109.04221 [pdf, other]

Multi-Constraint Shortest Path using Forest Hop Labeling

Authors: Ziyi Liu, Lei Li, Mengxuan Zhang, Wen Hua, Xiaofang Zhou

Abstract: The \textit{Multi-Constraint Shortest Path (MCSP)} problem aims to find the shortest path between two nodes in a network subject to a given constraint set. It is typically processed as a \textit{skyline path} problem. However, the number of intermediate skyline paths becomes larger as the network size increases and the constraint number grows, which brings about a dramatic growth in computational cost and further makes the existing index-based methods hardly capable of obtaining the complete exact results. In this paper, we propose a novel high-dimensional skyline path concatenation method to avoid the expensive skyline path search, which then supports the efficient construction of a hop labeling index for \textit{MCSP} queries. Specifically, a set of insightful observations and techniques are proposed to improve the efficiency of concatenating two skyline path sets, an \textit{n-Cube} technique is designed to prune the concatenation space among multiple hops, and a \textit{constraint pruning} method is used to avoid unnecessary computation. Furthermore, to scale up to larger networks, we propose a novel \textit{forest hop labeling} which enables parallel label construction from different network partitions. Our approach is the first method that can achieve both accuracy and efficiency for \textit{MCSP} query answering. Extensive experiments on real-life road networks demonstrate the superiority of our method over the state-of-the-art solutions.
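As a rough illustration of what concatenating two skyline path sets involves, the sketch below uses the naive baseline: sum every pair of cost vectors and keep only the non-dominated results. It is not the paper's method (the n-Cube and constraint pruning techniques are not shown); the names `dominates`, `skyline`, and `concatenate` are hypothetical, and each path is assumed to be summarized by a tuple of additive costs.

```python
from itertools import product
from typing import List, Tuple

# Assumption: a path is summarized by a tuple of additive costs,
# e.g. (distance, toll). Lower is better in every dimension.
Cost = Tuple[float, ...]


def dominates(a: Cost, b: Cost) -> bool:
    """True if a is no worse than b in every dimension and differs from b."""
    return all(x <= y for x, y in zip(a, b)) and a != b


def skyline(costs: List[Cost]) -> List[Cost]:
    """Keep only cost vectors that no other vector dominates (sorted, deduplicated)."""
    unique = sorted(set(costs))
    return [c for c in unique if not any(dominates(o, c) for o in unique)]


def concatenate(left: List[Cost], right: List[Cost]) -> List[Cost]:
    """Naive concatenation: sum every pair of costs, then filter to the skyline."""
    sums = [tuple(x + y for x, y in zip(a, b)) for a, b in product(left, right)]
    return skyline(sums)


if __name__ == "__main__":
    s_to_hop = [(1, 5), (3, 2)]   # skyline costs from source to an intermediate hop
    hop_to_t = [(2, 2), (4, 1)]   # skyline costs from that hop to the target
    print(concatenate(s_to_hop, hop_to_t))
```

The pairwise step is quadratic in the sizes of the two sets, which is the kind of cost that pruning techniques such as those named in the abstract aim to reduce.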

Submitted 9 September, 2021; originally announced September 2021.

arXiv:2106.14174 [pdf, other]

Transfer-based adaptive tree for multimodal sentiment analysis based on user latent aspects

Authors: Sana Rahmani, Saeid Hosseini, Raziyeh Zall, Mohammad Reza Kangavari, Sara Kamran, Wen Hua

Abstract: Multimodal sentiment analysis benefits various applications such as human-computer interaction and recommendation systems. It aims to infer the users' bipolar ideas using visual, textual, and acoustic signals. Although researchers affirm the association between cognitive cues and emotional manifestations, most of the current multimodal approaches in sentiment analysis disregard user-specific aspects. To tackle this issue, we devise a novel method to perform multimodal sentiment prediction using cognitive cues, such as personality. Our framework constructs an adaptive tree by hierarchically dividing users and trains the LSTM-based submodels, utilizing an attention-based fusion to transfer cognitive-oriented knowledge within the tree. Subsequently, the framework consumes the conclusive agglomerative knowledge from the adaptive tree to predict final sentiments. We also devise a dynamic dropout method to facilitate data sharing between neighboring nodes, reducing data sparsity. The empirical results on real-world datasets determine that our proposed model for sentiment prediction can surpass trending rivals. Moreover, compared to other ensemble approaches, the proposed transfer-based algorithm can better utilize the latent cognitive cues and foster the prediction outcomes. Based on the given extrinsic and intrinsic analysis results, we note that compared to other theoretical-based techniques, the proposed hierarchical clustering approach can better group the users within the adaptive tree.

Submitted 27 June, 2021; originally announced June 2021.

Comments: Under Review on IEEE Transactions on Pattern Analysis and Machine Intelligence

arXiv:2106.07363 [pdf, other]

Cognitive-aware Short-text Understanding for Inferring Professions

Authors: Sayna Esmailzadeh, Saeid Hosseini, Mohammad Reza Kangavari, Wen Hua

Abstract: Leveraging short-text contents to estimate the occupation of microblog authors has significant gains in many applications. Yet challenges abound. Firstly, brief textual contents come with excessive lexical noise that makes the inference problem challenging. Secondly, cognitive-semantics are not evident, and important linguistic features are latent in short-text contents. Thirdly, it is hard to measure the correlation between the cognitive short-text semantics and the features pertaining to various occupations. We argue that the multi-aspect cognitive features are needed to correctly associate short-text contents to a particular job and discover suitable people for the careers. To this end, we devise a novel framework that, on the one hand, can infer short-text contents and exploit cognitive features, and on the other hand, fuses various adopted novel algorithms, such as curve fitting, support vector, and boosting modules to better predict the occupation of the authors. The final estimation module manufactures the $R^w$-tree via coherence weight to tune the best outcome in the inferring process. We conduct comprehensive experiments on real-life Twitter data. The experimental results show that compared to other rivals, our cognitive multi-aspect model can achieve a higher performance in the career estimation procedure, where it is inevitable to neglect the contextual semantics of users.

Submitted 4 June, 2021; originally announced June 2021.

arXiv:2106.01706 [pdf, other]

EmoDNN: Understanding emotions from short texts through a deep neural network ensemble

Authors: Sara Kamran, Raziyeh Zall, Mohammad Reza Kangavari, Saeid Hosseini, Sana Rahmani, Wen Hua

Abstract: The latent knowledge in the emotions and the opinions of the individuals that are manifested via social networks is crucial to numerous applications including social management, dynamical processes, and public security. Affective computing, as an interdisciplinary research field linking artificial intelligence to cognitive inference, is capable of exploiting emotion-oriented knowledge from brief contents. The textual contents convey hidden information such as personality and cognition about corresponding authors that can determine both correlations and variations between users. Emotion recognition from brief contents should embrace the contrast between authors where the differences in personality and cognition can be traced within emotional expressions. To tackle this challenge, we devise a framework that, on the one hand, infers latent individual aspects from brief contents and, on the other hand, presents a novel ensemble classifier equipped with dynamic dropout convnets to extract emotions from textual context. To categorize short text contents, our proposed method conjointly leverages cognitive factors and exploits hidden information. We utilize the outcome vectors in a novel embedding model to foster emotion-pertinent features that are collectively assembled by lexicon inductions. Experimental results show that compared to other competitors, our proposed model can achieve a higher performance in recognizing emotion from noisy contents.

Submitted 3 June, 2021; originally announced June 2021.

arXiv:2105.13424 [pdf, other]

Sinan: Data-Driven, QoS-Aware Cluster Management for Microservices

Authors: Yanqi Zhang, Weizhe Hua, Zhuangzhuang Zhou, Edward Suh, Christina Delimitrou

Abstract: Cloud applications are increasingly shifting from large monolithic services to large numbers of loosely-coupled, specialized microservices. Despite their advantages in terms of facilitating development, deployment, modularity, and isolation, microservices complicate resource management, as dependencies between them introduce backpressure effects and cascading QoS violations. We present Sinan, a data-driven cluster manager for interactive cloud microservices that is online and QoS-aware. Sinan leverages a set of scalable and validated machine learning models to determine the performance impact of dependencies between microservices, and allocate appropriate resources per tier in a way that preserves the end-to-end tail latency target. We evaluate Sinan both on dedicated local clusters and large-scale deployments on Google Compute Engine (GCE) across representative end-to-end applications built with microservices, such as social networks and hotel reservation sites. We show that Sinan always meets QoS, while also keeping cluster utilization high, in contrast to prior work which leads to unpredictable performance or sacrifices resource efficiency. Furthermore, the techniques in Sinan are explainable, meaning that cloud operators can gain insights from the ML models on how to better deploy and design their applications to reduce unpredictable performance.

Submitted 27 May, 2021; originally announced May 2021.

arXiv:2101.06023 [pdf, other]

doi 10.1103/PhysRevLett.126.152502

New $α$-Emitting Isotope $^{214}$U and Abnormal Enhancement of $α$-Particle Clustering in Lightest Uranium Isotopes

Authors: Z. Y. Zhang, H. B. Yang, M. H. Huang, Z. G. Gan, C. X. Yuan, C. Qi, A. N. Andreyev, M. L. Liu, L. Ma, M. M. Zhang, Y. L. Tian, Y. S. Wang, J. G. Wang, C. L. Yang, G. S. Li, Y. H. Qiang, W. Q. Yang, R. F. Chen, H. B. Zhang, Z. W. Lu, X. X. Xu, L. M. Duan, H. R. Yang, W. X. Huang, Z. Liu, et al. (17 additional authors not shown)

Abstract: A new $α$-emitting isotope $^{214}$U, produced by fusion-evaporation reaction $^{182}$W($^{36}$Ar, 4n)$^{214}$U, was identified by employing the gas-filled recoil separator SHANS and recoil-$α$ correlation technique. More precise $α$-decay properties of even-even nuclei $^{216,218}$U were also measured in reactions of $^{40}$Ar, $^{40}$Ca with $^{180, 182, 184}$W targets. By combining the experimental data, improved $α$-decay reduced widths $δ^2$ for the even-even Po--Pu nuclei in the vicinity of magic neutron number $N=126$ were deduced. Their systematic trends are discussed in terms of $N_{p}N_{n}$ scheme in order to study the influence of proton-neutron interaction on $α$ decay in this region of nuclei. It is strikingly found that the reduced widths of $^{214,216}$U are significantly enhanced by a factor of two as compared with the $N_{p}N_{n}$ systematics for the $84 \leq Z \leq 90$ and $N<126$ even-even nuclei. The abnormal enhancement is interpreted by the strong monopole interaction between the valence protons and neutrons occupying the $π1f_{7/2}$ and $ν1f_{5/2}$ spin-orbit partner orbits, which is supported by a large-scale shell model calculation.

Submitted 15 January, 2021; originally announced January 2021.

Journal ref: Phys. Rev. Lett. 126, 152502 (2021)

arXiv:2101.02847 [pdf, other]

Color Contrast Enhanced Rendering for Optical See-through Head-mounted Displays

Authors: Yun** Zhang, Rui Wang, Yifan Peng, Wei Hua, Hujun Bao

Abstract: Most commercially available optical see-through head-mounted displays (OST-HMDs) utilize optical combiners to simultaneously visualize the physical background and virtual objects. The displayed images perceived by users are a blend of rendered pixels and background colors. Enabling high fidelity color perception in mixed reality (MR) scenarios using OST-HMDs is an important but challenging task. We propose a real-time rendering scheme to enhance the color contrast between virtual objects and the surrounding background for OST-HMDs. Inspired by the discovery of color perception in psychophysics, we first formulate the color contrast enhancement as a constrained optimization problem. We then design an end-to-end algorithm to search the optimal complementary shift in both chromaticity and luminance of the displayed color. This aims at enhancing the contrast between virtual objects and the real background as well as keeping the consistency with the original color. We assess the performance of our approach using a simulated OST-HMD environment and an off-the-shelf OST-HMD. Experimental results from objective evaluations and subjective user studies demonstrate that the proposed approach makes rendered virtual objects more distinguishable from the surrounding background, thereby bringing a better visual experience.

Submitted 7 January, 2021; originally announced January 2021.

Comments: 13 pages, 22 figures, submitted to TVCG

arXiv:2012.11938 [pdf, other]

3D Point-to-Keypoint Voting Network for 6D Pose Estimation

Authors: Weitong Hua, Jiaxin Guo, Yue Wang, Rong Xiong

Abstract: Object 6D pose estimation is an important research topic in the field of computer vision due to its wide application requirements and the challenges brought by complexity and changes in the real-world. We think fully exploring the characteristics of spatial relationship between points will help to improve the pose estimation performance, especially in the scenes of background clutter and partial occlusion. But this information was usually ignored in previous work using RGB image or RGB-D data. In this paper, we propose a framework for 6D pose estimation from RGB-D data based on spatial structure characteristics of 3D keypoints. We adopt point-wise dense feature embedding to vote for 3D keypoints, which makes full use of the structure information of the rigid body. After the direction vectors pointing to the keypoints are predicted by CNN, we use RANSAC voting to calculate the coordinate of the 3D keypoints, then the pose transformation can be easily obtained by the least square method. In addition, a spatial dimension sampling strategy for points is employed, which makes the method achieve excellent performance on small training sets. The proposed method is verified on two benchmark datasets, LINEMOD and OCCLUSION LINEMOD. The experimental results show that our method outperforms the state-of-the-art approaches, achieves ADD(-S) accuracy of 98.7\% on LINEMOD dataset and 52.6\% on OCCLUSION LINEMOD dataset in real-time.

Submitted 22 December, 2020; originally announced December 2020.

arXiv:2011.14369 [pdf, other]

Comment on "Longitudinal wobbling in $^{133}$La [Eur. Phys. J. A 55, 159 (2019)]"

Authors: W. Hua, S. Guo, C. M. Petrache

Abstract: In [S. Biswas et al., Eur. Phys. J. A 55, 159 (2019)] a longitudinal wobbling band was reported in $^{133}$La. The critical experimental proof for this assignment is the E2 dominated linking transitions between the wobbling and normal bands, which are supported by angular distribution and linear polarization measurements. However, severe problems are found in the reported experimental information, indicating that the assignment of wobbling band was not firmly established.

Submitted 8 December, 2020; v1 submitted 29 November, 2020; originally announced November 2020.

arXiv:2011.14354 [pdf]

doi 10.1016/j.physletb.2022.137010

Probing the nature of the conjectured low-spin wobbling bands in atomic nuclei

Authors: S. Guo, X. H. Zhou, C. M. Petrache, E. A. Lawrie, S. Mthembu, Y. D. Fang, H. Y. Wu, H. L. Wang, H. Y. Meng, G. S. Li, Y. H. Qiang, J. G. Wang, M. L. Liu, Y. Zheng, B. Ding, W. Q. Zhang, A. Rohilla, K. R. Mukhi, Y. Y. Yang, H. J. Ong, J. B. Ma, S. W. Xu, Z. Bai, H. L. Fan, J. F. Huang, et al. (6 additional authors not shown)

Abstract: Precession is a unique motion in which the orientation of the rotational axis of a rotating body is not fixed but moving, and it generally exists in the Universe from giant stars through tiny atomic nuclei. In principle, the precession of an atomic nuclide can be approximately described as wobbling motion, arising from the coupling of a rotation and a harmonic vibration. Recently, a number of wobbling bands were reported at low spin, which violate the wobbling approximation that can be valid only at high spin. Here we explore the nature of the reported low-spin wobbling bands. Via a new experiment, we demonstrate that one such band in $^{187}$Au is generated by dominant single-particle excitation rather than by the excitation of a wobbling phonon. We point out that the imperfect research paradigm used previously would lead to unreliable identification of low-spin wobbling bands. Consequently, new experimental approaches should be developed to distinguish among the different excitation mechanisms that can give rise to the observed low-spin bands in odd-even nuclei.

Submitted 18 September, 2021; v1 submitted 29 November, 2020; originally announced November 2020.

arXiv:2011.08968 [pdf, other]

Contrastive Weight Regularization for Large Minibatch SGD

Authors: Qiwei Yuan, Weizhe Hua, Yi Zhou, Cunxi Yu

Abstract: The minibatch stochastic gradient descent method (SGD) is widely applied in deep learning due to its efficiency and scalability that enable training deep networks with a large volume of data. Particularly in the distributed setting, SGD is usually applied with large batch size. However, as opposed to small-batch SGD, neural network models trained with large-batch SGD can hardly generalize well, i.e., the validation accuracy is low. In this work, we introduce a novel regularization technique, namely distinctive regularization (DReg), which replicates a certain layer of the deep network and encourages the parameters of both layers to be diverse. The DReg technique introduces very little computation overhead. Moreover, we empirically show that optimizing the neural network with DReg using large-batch SGD achieves a significant boost in the convergence and improved generalization performance. We also demonstrate that DReg can boost the convergence of large-batch SGD with momentum. We believe that DReg can be used as a simple regularization trick to accelerate large-batch training in deep learning.

Submitted 17 November, 2020; originally announced November 2020.

arXiv:2010.12807 [pdf, other]

doi 10.1109/LRA.2021.3062304

REDE: End-to-end Object 6D Pose Robust Estimation Using Differentiable Outliers Elimination

Authors: Weitong Hua, Zhongxiang Zhou, Jun Wu, Huang Huang, Yue Wang, Rong Xiong

Abstract: Object 6D pose estimation is a fundamental task in many applications. Conventional methods solve the task by detecting and matching the keypoints, then estimating the pose. Recent efforts bringing deep learning into the problem mainly overcome the vulnerability of conventional methods to environmental variation due to the hand-crafted feature design. However, these methods cannot achieve end-to-end learning and good interpretability at the same time. In this paper, we propose REDE, a novel end-to-end object pose estimator using RGB-D data, which utilizes network for keypoint regression, and a differentiable geometric pose estimator for pose error back-propagation. Besides, to achieve better robustness when outlier keypoint prediction occurs, we further propose a differentiable outliers elimination method that regresses the candidate result and the confidence simultaneously. Via confidence weighted aggregation of multiple candidates, we can reduce the effect from the outliers in the final estimation. Finally, following the conventional method, we apply a learnable refinement process to further improve the estimation. The experimental results on three benchmark datasets show that REDE slightly outperforms the state-of-the-art approaches and is more robust to object occlusion.

Submitted 24 February, 2021; v1 submitted 24 October, 2020; originally announced October 2020.
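
Note: the "confidence weighted aggregation of multiple candidates" mentioned in the REDE abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes each candidate is a plain 3D translation vector (rotations would need a different averaging rule) and that confidences are normalized with a softmax, which is an assumption made here for illustration only. The function name aggregate_candidates is hypothetical.

```python
import math


def aggregate_candidates(candidates, confidences):
    """Softmax-weighted mean of candidate vectors (illustrative assumption).

    candidates:  list of equal-length lists of floats (e.g. 3D translations)
    confidences: list of raw confidence scores, one per candidate
    """
    # Subtract the maximum score before exponentiating for numerical stability.
    top = max(confidences)
    weights = [math.exp(c - top) for c in confidences]
    total = sum(weights)
    dim = len(candidates[0])
    # Weighted average per coordinate; low-confidence outliers contribute little.
    return [
        sum(w * cand[i] for w, cand in zip(weights, candidates)) / total
        for i in range(dim)
    ]


# Two agreeing candidates with high confidence and one low-confidence outlier.
estimate = aggregate_candidates(
    [[1.0, 0.0, 0.0], [1.0, 0.0, 0.0], [100.0, 0.0, 0.0]],
    [10.0, 10.0, -10.0],
)
```

Because every step is a smooth function of the confidences, a weighting of this kind stays differentiable, which is the property the abstract relies on for end-to-end training.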

arXiv:2010.01516 [pdf, other]

doi 10.1109/TKDE.2020.3036633

Trajectory-Based Spatiotemporal Entity Linking

Authors: Fengmei Jin, Wen Hua, Thomas Zhou, Jiajie Xu, Matteo Francia, Maria E Orlowska, Xiaofang Zhou

Abstract: Trajectory-based spatiotemporal entity linking is to match the same moving object in different datasets based on their movement traces. It is a fundamental step to support spatiotemporal data integration and analysis. In this paper, we study the problem of spatiotemporal entity linking using effective and concise signatures extracted from their trajectories. This linking problem is formalized as a k-nearest neighbor (k-NN) query on the signatures. Four representation strategies (sequential, temporal, spatial, and spatiotemporal) and two quantitative criteria (commonality and unicity) are investigated for signature construction. A simple yet effective dimension reduction strategy is developed together with a novel indexing structure called the WR-tree to speed up the search. A number of optimization methods are proposed to improve the accuracy and robustness of the linking. Our extensive experiments on real-world datasets verify the superiority of our approach over the state-of-the-art solutions in terms of both accuracy and efficiency.

Submitted 4 October, 2020; originally announced October 2020.

Comments: 15 pages, 3 figures, 15 tables

Journal ref: IEEE Transactions on Knowledge and Data Engineering 2020
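
Note: the abstract above formalizes linking as a k-NN query on trajectory signatures. A brute-force sketch of such a query is given below, purely as an illustration. It assumes a signature is a sparse mapping from a spatial cell to a weight and uses Euclidean distance; the paper's actual signature construction, distance, and WR-tree index are not reproduced here, and the names signature_distance, knn_link, and the sample object ids are hypothetical.

```python
import heapq
import math


def signature_distance(sig_a, sig_b):
    """Euclidean distance between two sparse signatures (dict: cell -> weight)."""
    keys = set(sig_a) | set(sig_b)
    return math.sqrt(
        sum((sig_a.get(k, 0.0) - sig_b.get(k, 0.0)) ** 2 for k in keys)
    )


def knn_link(query_sig, candidates, k=1):
    """Return the ids of the k candidates closest to query_sig, nearest first."""
    scored = (
        (signature_distance(query_sig, sig), obj_id)
        for obj_id, sig in candidates.items()
    )
    return [obj_id for _, obj_id in heapq.nsmallest(k, scored)]


# Hypothetical signatures of moving objects in a second dataset.
candidates = {
    "taxi_17": {"cell_a": 0.9, "cell_b": 0.1},
    "taxi_42": {"cell_c": 0.7, "cell_d": 0.3},
    "taxi_88": {"cell_a": 0.5, "cell_c": 0.5},
}
query = {"cell_a": 0.8, "cell_b": 0.2}
matches = knn_link(query, candidates, k=2)
```

A linear scan like this costs one distance computation per candidate, which is the cost an index structure is meant to avoid on large datasets.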

arXiv:2008.11632 [pdf, other]

GuardNN: Secure Accelerator Architecture for Privacy-Preserving Deep Learning

Authors: Weizhe Hua, Muhammad Umar, Zhiru Zhang, G. Edward Suh

Abstract: This paper proposes GuardNN, a secure DNN accelerator that provides hardware-based protection for user data and model parameters even in an untrusted environment. GuardNN shows that the architecture and protection can be customized for a specific application to provide strong confidentiality and integrity guarantees with negligible overhead. The design of the GuardNN instruction set reduces the TCB to just the accelerator and allows confidentiality protection even when the instructions from a host cannot be trusted. GuardNN minimizes the overhead of memory encryption and integrity verification by customizing the off-chip memory protection for the known memory access patterns of a DNN accelerator. GuardNN is prototyped on an FPGA, demonstrating effective confidentiality protection with ~3% performance overhead for inference.

Submitted 25 May, 2022; v1 submitted 26 August, 2020; originally announced August 2020.

Comments: Accepted to the 59th Design Automation Conference (DAC'22)

arXiv:2007.12034 [pdf, other]

AttentionNAS: Spatiotemporal Attention Cell Search for Video Classification

Authors: Xiaofang Wang, Xuehan Xiong, Maxim Neumann, AJ Piergiovanni, Michael S. Ryoo, Anelia Angelova, Kris M. Kitani, Wei Hua

Abstract: Convolutional operations have two limitations: (1) do not explicitly model where to focus as the same filter is applied to all the positions, and (2) are unsuitable for modeling long-range dependencies as they only operate on a small neighborhood. While both limitations can be alleviated by attention operations, many design choices remain to be determined to use attention, especially when applying attention to videos. Towards a principled way of applying attention to videos, we address the task of spatiotemporal attention cell search. We propose a novel search space for spatiotemporal attention cells, which allows the search algorithm to flexibly explore various design choices in the cell. The discovered attention cells can be seamlessly inserted into existing backbone networks, e.g., I3D or S3D, and improve video classification accuracy by more than 2% on both Kinetics-600 and MiT datasets. The discovered attention cells outperform non-local blocks on both datasets, and demonstrate strong generalization across different modalities, backbones, and datasets. Inserting our attention cells into I3D-R50 yields state-of-the-art performance on both datasets.

Submitted 31 July, 2020; v1 submitted 23 July, 2020; originally announced July 2020.

Comments: ECCV 2020

arXiv:2007.04155 [pdf, other]

Personalized Dynamic Treatment Regimes in Continuous Time: A Bayesian Approach for Optimizing Clinical Decisions with Timing

Authors: William Hua, Hongyuan Mei, Sarah Zohar, Magali Giral, Yanxun Xu

Abstract: Accurate models of clinical actions and their impacts on disease progression are critical for estimating personalized optimal dynamic treatment regimes (DTRs) in medical/health research, especially in managing chronic conditions. Traditional statistical methods for DTRs usually focus on estimating the optimal treatment or dosage at each given medical intervention, but overlook the important question of "when this intervention should happen." We fill this gap by developing a two-step Bayesian approach to optimize clinical decisions with timing. In the first step, we build a generative model for a sequence of medical interventions-which are discrete events in continuous time-with a marked temporal point process (MTPP) where the mark is the assigned treatment or dosage. Then this clinical action model is embedded into a Bayesian joint framework where the other components model clinical observations including longitudinal medical measurements and time-to-event data conditional on treatment histories. In the second step, we propose a policy gradient method to learn the personalized optimal clinical decision that maximizes the patient survival by interacting the MTPP with the model on clinical observations while accounting for uncertainties in clinical observations learned from the posterior inference of the Bayesian joint model in the first step. A signature application of the proposed approach is to schedule follow-up visitations and assign a dosage at each visitation for patients after kidney transplantation. We evaluate our approach with comparison to alternative methods on both simulated and real-world datasets. In our experiments, the personalized decisions made by the proposed method are clinically useful: they are interpretable and successfully help improve patient survival.

Submitted 18 February, 2021; v1 submitted 8 July, 2020; originally announced July 2020.

arXiv:2004.09679 [pdf, other]

MGX: Near-Zero Overhead Memory Protection for Data-Intensive Accelerators

Authors: Weizhe Hua, Muhammad Umar, Zhiru Zhang, G. Edward Suh

Abstract: This paper introduces MGX, a near-zero overhead memory protection scheme for hardware accelerators. MGX minimizes the performance overhead of off-chip memory encryption and integrity verification by exploiting the application-specific properties of the accelerator execution. In particular, accelerators tend to explicitly manage data movement between on-chip and off-chip memories. Therefore, the general memory access pattern of an accelerator can largely be determined for a given application. Exploiting these characteristics, MGX generates version numbers used in memory encryption and integrity verification using on-chip accelerator state rather than storing them in the off-chip memory; it also customizes the granularity of the memory protection to match the granularity used by the accelerator. To demonstrate the efficacy of MGX, we present an in-depth study of MGX for DNN and graph algorithms. Experimental results show that on average, MGX lowers the performance overhead of memory protection from 28% and 33% to 4% and 5% for DNN and graph processing accelerators in a wide range of benchmarks, respectively.

Submitted 25 May, 2022; v1 submitted 20 April, 2020; originally announced April 2020.

Comments: Accepted to the 49th International Symposium on Computer Architecture (ISCA'22)

arXiv:2002.07136 [pdf, other]

Precision Gating: Improving Neural Network Efficiency with Dynamic Dual-Precision Activations

Authors: Yichi Zhang, Ritchie Zhao, Weizhe Hua, Nayun Xu, G. Edward Suh, Zhiru Zhang

Abstract: We propose precision gating (PG), an end-to-end trainable dynamic dual-precision quantization technique for deep neural networks. PG computes most features in a low precision and only a small proportion of important features in a higher precision to preserve accuracy. The proposed approach is applicable to a variety of DNN architectures and significantly reduces the computational cost of DNN execution with almost no accuracy loss. Our experiments indicate that PG achieves excellent results on CNNs, including statically compressed mobile-friendly networks such as ShuffleNet. Compared to the state-of-the-art prediction-based quantization schemes, PG achieves the same or higher accuracy with 2.4$\times$ less compute on ImageNet. PG furthermore applies to RNNs. Compared to 8-bit uniform quantization, PG obtains a 1.2% improvement in perplexity per word with 2.7$\times$ computational cost reduction on LSTM on the Penn Tree Bank dataset. Code is available at: https://github.com/cornell-zhang/dnn-gating

Submitted 28 May, 2020; v1 submitted 17 February, 2020; originally announced February 2020.

Comments: Published as a conference paper at ICLR 2020

arXiv:1910.13065 [pdf, other]

A Survey on Map-Matching Algorithms

Authors: Pingfu Chao, Yehong Xu, Wen Hua, Xiaofang Zhou

Abstract: The map-matching is an essential preprocessing step for most of the trajectory-based applications. Although it has been an active topic for more than two decades and, driven by the emerging applications, is still under development. There is a lack of categorisation of existing solutions recently and analysis for future research directions. In this paper, we review the current status of the map-matching problem and survey the existing algorithms. We propose a new categorisation of the solutions according to their map-matching models and working scenarios. In addition, we experimentally compare three representative methods from different categories to reveal how matching model affects the performance. Besides, the experiments are conducted on multiple real datasets with different settings to demonstrate the influence of other factors in map-matching problem, like the trajectory quality, data compression and matching latency.

Submitted 28 October, 2019; originally announced October 2019.

Comments: 12 pages, 5 figures, submitted to ADC 2020

arXiv:1910.12261 [pdf, other]

Typical Snapshots Selection for Shortest Path Query in Dynamic Road Networks

Authors: Mengxuan Zhang, Lei Li, Wen Hua, Xiaofang Zhou

Abstract: Finding the shortest paths in road network is an important query in our life nowadays, and various index structures are constructed to speed up the query answering. However, these indexes can hardly work in real-life scenario because the traffic condition changes dynamically, which makes the pathfinding slower than in the static environment. In order to speed up path query answering in the dynamic road network, we propose a framework to support these indexes. Firstly, we view the dynamic graph as a series of static snapshots. After that, we propose two kinds of methods to select the typical snapshots. The first kind is time-based and it only considers the temporal information. The second category is the graph representation-based, which considers more insights: edge-based that captures the road continuity, and vertex-based that reflects the region traffic fluctuation. Finally, we propose the snapshot matching to find the most similar typical snapshot for the current traffic condition and use its index to answer the query directly. Extensive experiments on real-life road network and traffic conditions validate the effectiveness of our approach.

Submitted 27 October, 2019; originally announced October 2019.

arXiv:1910.12180 [pdf, other]

SoulMate: Short-text author linking through Multi-aspect temporal-textual embedding

Authors: Saeed Najafipour, Saeid Hosseini, Wen Hua, Mohammad Reza Kangavari, Xiaofang Zhou

Abstract: Linking authors of short-text contents has important usages in many applications, including Named Entity Recognition (NER) and human community detection. However, certain challenges lie ahead. Firstly, the input short-text contents are noisy, ambiguous, and do not follow the grammatical rules. Secondly, traditional text mining methods fail to effectively extract concepts through words and phrases. Thirdly, the textual contents are temporally skewed, which can affect the semantic understanding by multiple time facets. Finally, using the complementary knowledge-bases makes the results biased to the content of the external database and deviates the understanding and interpretation away from the real nature of the given short text corpus. To overcome these challenges, we devise a neural network-based temporal-textual framework that generates the tightly connected author subgraphs from microblog short-text contents. Our approach, on the one hand, computes the relevance score (edge weight) between the authors through considering a portmanteau of contents and concepts, and on the other hand, employs a stack-wise graph cutting algorithm to extract the communities of the related authors. Experimental results show that compared to other knowledge-centered competitors, our multi-aspect vector space model can achieve a higher performance in linking short-text authors. Additionally, given the author linking task, the more comprehensive the dataset is, the higher the significance of the extracted concepts will be.

Submitted 27 October, 2019; originally announced October 2019.

arXiv:1906.08172 [pdf, other]

MediaPipe: A Framework for Building Perception Pipelines

Authors: Camillo Lugaresi, Jiuqiang Tang, Hadon Nash, Chris McClanahan, Esha Uboweja, Michael Hays, Fan Zhang, Chuo-Ling Chang, Ming Guang Yong, Juhyun Lee, Wan-Teh Chang, Wei Hua, Manfred Georg, Matthias Grundmann

Abstract: Building applications that perceive the world around them is challenging. A developer needs to (a) select and develop corresponding machine learning algorithms and models, (b) build a series of prototypes and demos, (c) balance resource consumption against the quality of the solutions, and finally (d) identify and mitigate problematic cases. The MediaPipe framework addresses all of these challenges. A developer can use MediaPipe to build prototypes by combining existing perception components, to advance them to polished cross-platform applications and measure system performance and resource consumption on target platforms. We show that these features enable a developer to focus on the algorithm or model development and use MediaPipe as an environment for iteratively improving their application with results reproducible across different devices and platforms. MediaPipe will be open-sourced at https://github.com/google/mediapipe.

Submitted 14 June, 2019; originally announced June 2019.

arXiv:1901.02985 [pdf, other]

Auto-DeepLab: Hierarchical Neural Architecture Search for Semantic Image Segmentation

Authors: Chenxi Liu, Liang-Chieh Chen, Florian Schroff, Hartwig Adam, Wei Hua, Alan Yuille, Li Fei-Fei

Abstract: Recently, Neural Architecture Search (NAS) has successfully identified neural network architectures that exceed human designed ones on large-scale image classification. In this paper, we study NAS for semantic image segmentation. Existing works often focus on searching the repeatable cell structure, while hand-designing the outer network structure that controls the spatial resolution changes. This choice simplifies the search space, but becomes increasingly problematic for dense image prediction which exhibits a lot more network level architectural variations. Therefore, we propose to search the network level structure in addition to the cell level structure, which forms a hierarchical architecture search space. We present a network level search space that includes many popular designs, and develop a formulation that allows efficient gradient-based architecture search (3 P100 GPU days on Cityscapes images). We demonstrate the effectiveness of the proposed method on the challenging Cityscapes, PASCAL VOC 2012, and ADE20K datasets. Auto-DeepLab, our architecture searched specifically for semantic image segmentation, attains state-of-the-art performance without any ImageNet pretraining.

Submitted 6 April, 2019; v1 submitted 9 January, 2019; originally announced January 2019.

Comments: To appear in CVPR 2019 as oral. Code for Auto-DeepLab released at https://github.com/tensorflow/models/tree/master/research/deeplab

arXiv:1810.02663 [pdf, ps, other]

doi 10.1016/j.physletb.2019.02.015

Advantages of the multinucleon transfer reactions based on 238U target for producing neutron-rich isotopes around N = 126

Authors: Long Zhu, Cheng Li, Jun Su, Chen-Chen Guo, Wei Hua

Abstract: The mechanism of multinucleon transfer (MNT) reactions for producing neutron-rich heavy nuclei around N = 126 is investigated within two different theoretical frameworks: dinuclear system (DNS) model and isospin-dependent quantum molecular dynamics (IQMD) model. The effects of mass asymmetry relaxation, N=Z equilibration, and shell closures on production cross sections of neutron-rich heavy nuclei are investigated. For the first time, the advantages for producing neutron-rich heavy nuclei around N = 126 is found in MNT reactions based on 238U target. We propose the reactions with 238U target for producing unknown neutron-rich heavy nuclei around N = 126 in the future.

Submitted 14 February, 2019; v1 submitted 5 October, 2018; originally announced October 2018.

Comments: 6 pages, 6 figures

Showing 51–100 of 116 results for author: Hua, W