Search | arXiv e-print repository

CTooth+: A Large-scale Dental Cone Beam Computed Tomography Dataset and Benchmark for Tooth Volume Segmentation

Authors: Weiwei Cui, Yaqi Wang, Yilong Li, Dan Song, Xingyong Zuo, Jiaojiao Wang, Yifan Zhang, Huiyu Zhou, Bung san Chong, Liaoyuan Zeng, Qianni Zhang

Abstract: Accurate tooth volume segmentation is a prerequisite for computer-aided dental analysis. Deep learning-based tooth segmentation methods have achieved satisfying performances but require a large quantity of tooth data with ground truth. The dental data publicly available is limited, meaning the existing methods cannot be reproduced, evaluated and applied in clinical practice. In this paper, we establish a 3D dental CBCT dataset CTooth+, with 22 fully annotated volumes and 146 unlabeled volumes. We further evaluate several state-of-the-art tooth volume segmentation strategies based on fully-supervised learning, semi-supervised learning and active learning, and define the performance principles. This work provides a new benchmark for the tooth volume segmentation task, and the experiment can serve as the baseline for future AI-based dental imaging research and clinical application development.

Submitted 2 August, 2022; originally announced August 2022.

arXiv:2112.07129 [pdf]

Output fusion of MPC and PID and its application in intelligent layered water injection of oilfield

Authors: Yuan-Long Yue, Hao-Yang Wen, Xin Zuo, Mao Sheng, Fu-Chao Sun

Abstract: To improve the dynamic response performance of wave code communication in intelligent layered water injection of oilfield, this paper proposes an output optimal fusion control method based on MPC-PID. Firstly, depending on the well structure and the flow-pressure characteristics of the layer, the steady-state model between the differential pressure and flow of the whole well and different layer sections is established for layered water injection, and the corresponding wave code amplitude at the steady-state operating point of different layer sections is solved; the numerical calculation verifies that the increase of the nozzle opening in a single layer section will drive the pressure and flow curve of the whole well downward. Secondly, combining the dynamic response characteristics and steady-state model of the whole-well water distribution equipment, a dynamic model of layered intelligent water injection is established, and the generation process of the wave code is defined. Finally, the MPC-PID optimal fusion control algorithm structure is designed to derive the fusion control law that minimizes the cost function under fixed weights, and the optimal weights are calculated by combining the internal model structure of the controller, so the optimization performance of each algorithm in the optimal fusion control is balanced. By analyzing the control simulation results, the fast response characteristics of the fusion control method are verified. Meanwhile, simulation comparison experiments of fast wave code communication under different methods are conducted with the actual working conditions; the results show that the fusion control method has both fast tracking control capability and strong robustness, which effectively enhances the efficiency of wave code communication and shortens the wave code operation time.

Submitted 13 December, 2021; originally announced December 2021.

arXiv:2106.15283 [pdf, other]

doi 10.1145/3448021

Similarity Embedding Networks for Robust Human Activity Recognition

Authors: Chenglin Li, Carrie Lu Tong, Di Niu, Bei Jiang, Xiao Zuo, Lei Cheng, Jian Xiong, Jianming Yang

Abstract: Deep learning models for human activity recognition (HAR) based on sensor data have been heavily studied recently. However, the generalization ability of deep models on complex real-world HAR data is limited by the availability of high-quality labeled activity data, which are hard to obtain. In this paper, we design a similarity embedding neural network that maps input sensor signals onto real vectors through carefully designed convolutional and LSTM layers. The embedding network is trained with a pairwise similarity loss, encouraging the clustering of samples from the same class in the embedded real space, and can be effectively trained on a small dataset and even on a noisy dataset with mislabeled samples. Based on the learned embeddings, we further propose both nonparametric and parametric approaches for activity recognition. Extensive evaluation based on two public datasets has shown that the proposed similarity embedding network significantly outperforms state-of-the-art deep models on HAR classification tasks, is robust to mislabeled samples in the training set, and can also be used to effectively denoise a noisy dataset.

Submitted 31 May, 2021; originally announced June 2021.

arXiv:2106.00615 [pdf, other]

doi 10.1145/3442381.3450006

Meta-HAR: Federated Representation Learning for Human Activity Recognition

Authors: Chenglin Li, Di Niu, Bei Jiang, Xiao Zuo, Jianming Yang

Abstract: Human activity recognition (HAR) based on mobile sensors plays an important role in ubiquitous computing. However, the rise of data regulatory constraints precludes collecting private and labeled signal data from personal devices at scale. Federated learning has emerged as a decentralized alternative solution to model training, which iteratively aggregates locally updated models into a shared global model, therefore being able to leverage decentralized, private data without central collection. However, the effectiveness of federated learning for HAR is affected by the fact that each user has different activity types and even a different signal distribution for the same activity type. Furthermore, it is uncertain if a single global model trained can generalize well to individual users or new users with heterogeneous data. In this paper, we propose Meta-HAR, a federated representation learning framework, in which a signal embedding network is meta-learned in a federated manner, while the learned signal representations are further fed into a personalized classification network at each user for activity prediction. In order to boost the representation ability of the embedding network, we treat the HAR problem at each user as a different task and train the shared embedding network through a Model-Agnostic Meta-learning framework, such that the embedding network can generalize to any individual user. Personalization is further achieved on top of the robustly learned representations in an adaptation procedure. We conducted extensive experiments based on two publicly available HAR datasets as well as a newly created HAR dataset. Results verify that Meta-HAR is effective at maintaining high test accuracies for individual users, including new users, and significantly outperforms several baselines, including Federated Averaging, Reptile and even centralized learning in certain cases.
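
The aggregation step that the abstract attributes to federated learning (combining locally updated models into a shared global model) can be illustrated with a minimal sketch of Federated Averaging, one of the baselines named above. This is not the Meta-HAR method itself; the function name `federated_average` and the representation of a model as a dictionary of flat parameter lists are assumptions made for illustration only.

```python
from typing import Dict, List, Tuple

Params = Dict[str, List[float]]


def federated_average(client_updates: List[Tuple[Params, int]]) -> Params:
    """Average client parameters, weighting each client by its sample count.

    client_updates: list of (parameters, number_of_local_samples) pairs.
    The parameter layout (name -> flat list of floats) is a simplifying
    assumption for this sketch.
    """
    if not client_updates:
        raise ValueError("at least one client update is required")
    total_samples = sum(n for _, n in client_updates)
    if total_samples <= 0:
        raise ValueError("total sample count must be positive")

    # Start from zeros with the same shape as the first client's parameters.
    first_params = client_updates[0][0]
    global_params: Params = {
        name: [0.0] * len(values) for name, values in first_params.items()
    }

    # Accumulate each client's contribution, scaled by its data share.
    for params, n_samples in client_updates:
        weight = n_samples / total_samples
        for name, values in params.items():
            for i, value in enumerate(values):
                global_params[name][i] += weight * value
    return global_params


if __name__ == "__main__":
    clients = [
        ({"w": [1.0, 2.0]}, 1),  # client with 1 local sample
        ({"w": [3.0, 4.0]}, 3),  # client with 3 local samples
    ]
    print(federated_average(clients))
```

In a real system this step would be repeated every communication round, with the averaged parameters sent back to the clients for further local training.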

Submitted 31 May, 2021; originally announced June 2021.

arXiv:2007.09198 [pdf, other]

Speech2Video Synthesis with 3D Skeleton Regularization and Expressive Body Poses

Authors: Miao Liao, Sibo Zhang, Peng Wang, Hao Zhu, Xinxin Zuo, Ruigang Yang

Abstract: In this paper, we propose a novel approach to convert given speech audio to a photo-realistic speaking video of a specific person, where the output video has synchronized, realistic, and expressive rich body dynamics. We achieve this by first generating 3D skeleton movements from the audio sequence using a recurrent neural network (RNN), and then synthesizing the output video via a conditional generative adversarial network (GAN). To make the skeleton movement realistic and expressive, we embed the knowledge of an articulated 3D human skeleton and a learned dictionary of personal speech iconic gestures into the generation process in both learning and testing pipelines. The former prevents the generation of unreasonable body distortion, while the latter helps our model quickly learn meaningful body movement through a few recorded videos. To produce photo-realistic and high-resolution video with motion details, we propose to insert part attention mechanisms in the conditional GAN, where each detailed part, e.g. head and hand, is automatically zoomed in to have their own discriminators. To validate our approach, we collect a dataset with 20 high-quality videos from 1 male and 1 female model reading various documents under different topics. Compared with previous SoTA pipelines handling similar tasks, our approach achieves better results by a user study.

Submitted 8 October, 2020; v1 submitted 17 July, 2020; originally announced July 2020.

Comments: Accepted by ACCV 2020

arXiv:1905.07486 [pdf, other]

doi 10.1109/JPROC.2019.2909694

Aeronautical Ad Hoc Networking for the Internet-Above-The-Clouds

Authors: Jiankang Zhang, Taihai Chen, Shida Zhong, Jingjing Wang, Wenbo Zhang, Xin Zuo, Robert G. Maunder, Lajos Hanzo

Abstract: The engineering vision of relying on the "smart sky" for supporting air traffic and the "Internet above the clouds" for in-flight entertainment has become imperative for the future aircraft industry. Aeronautical ad hoc Networking (AANET) constitutes a compelling concept for providing broadband communications above clouds by extending the coverage of Air-to-Ground (A2G) networks to oceanic and remote airspace via autonomous and self-configured wireless networking amongst commercial passenger airplanes. The AANET concept may be viewed as a new member of the family of Mobile ad hoc Networks (MANETs) in action above the clouds. However, AANETs have more dynamic topologies, larger and more variable geographical network size, stricter security requirements and more hostile transmission conditions. These specific characteristics lead to more grave challenges in aircraft mobility modeling, aeronautical channel modeling and interference mitigation as well as in network scheduling and routing. This paper provides an overview of AANET solutions by characterizing the associated scenarios, requirements and challenges. Explicitly, the research addressing the key techniques of AANETs, such as their mobility models, network scheduling and routing, security and interference are reviewed. Furthermore, we also identify the remaining challenges associated with developing AANETs and present their prospective solutions as well as open issues. The design framework of AANETs and the key technical issues are investigated along with some recent research results. Furthermore, a range of performance metrics optimized in designing AANETs and a number of representative multi-objective optimization algorithms are outlined.

Submitted 17 May, 2019; originally announced May 2019.

Journal ref: Proceedings of the IEEE (Volume: 107, Issue: 5, May 2019)

arXiv:1904.10506 [pdf, other]

Detailed Human Shape Estimation from a Single Image by Hierarchical Mesh Deformation

Authors: Hao Zhu, Xinxin Zuo, Sen Wang, Xun Cao, Ruigang Yang

Abstract: This paper presents a novel framework to recover detailed human body shapes from a single image. It is a challenging task due to factors such as variations in human shapes, body poses, and viewpoints. Prior methods typically attempt to recover the human body shape using a parametric-based template that lacks surface details. As such, the resulting body shape appears to be without clothing. In this paper, we propose a novel learning-based framework that combines the robustness of a parametric model with the flexibility of free-form 3D deformation. We use deep neural networks to refine the 3D shape in a Hierarchical Mesh Deformation (HMD) framework, utilizing the constraints from body joints, silhouettes, and per-pixel shading information. We are able to restore detailed human body shapes beyond skinned models. Experiments demonstrate that our method has outperformed previous state-of-the-art approaches, achieving better accuracy in terms of both 2D IoU number and 3D metric distance. The code is available at https://github.com/zhuhao-nju/hmd.git
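
For readers unfamiliar with the 2D IoU metric mentioned in the abstract, the following is a minimal sketch of intersection-over-union between two binary silhouette masks. It is a generic illustration, not taken from the linked repository; the function name `mask_iou`, the nested-list mask format, and the convention of returning 1.0 when both masks are empty are assumptions.

```python
from typing import Sequence


def mask_iou(pred: Sequence[Sequence[int]], target: Sequence[Sequence[int]]) -> float:
    """Intersection-over-union of two binary masks of identical shape.

    Any nonzero entry counts as foreground. Returns 1.0 when both masks
    are empty (an assumed convention, since the ratio is undefined there).
    """
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of rows")

    intersection = 0
    union = 0
    for row_pred, row_target in zip(pred, target):
        if len(row_pred) != len(row_target):
            raise ValueError("masks must have the same number of columns")
        for p, t in zip(row_pred, row_target):
            p_fg, t_fg = bool(p), bool(t)
            intersection += int(p_fg and t_fg)  # pixel in both masks
            union += int(p_fg or t_fg)          # pixel in either mask
    return intersection / union if union else 1.0


if __name__ == "__main__":
    predicted = [[1, 1], [0, 0]]
    reference = [[1, 0], [1, 0]]
    print(mask_iou(predicted, reference))
```

Here one pixel is shared and three pixels are covered by at least one mask, so the ratio is one third.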

Submitted 8 May, 2019; v1 submitted 24 April, 2019; originally announced April 2019.

Comments: CVPR 2019 Oral

Showing 1–7 of 7 results for author: Zuo, X