Refining 3D Point Cloud Normal Estimation via Sample Selection

 Jun Zhou, Yaoshun Li, Hongchen Tan, Mingjie Wang, Nannan Li, ** Liu
The authors would like to thank the High Performance Computing Center of Dalian Maritime University for providing the computing resources. This research was supported in part by the Natural Science Foundation of China under Grants 62002040, 61976040, and 62201020, in part by China Postdoctoral Science Foundation 2021M690501, in part by the Science Foundation of Zhejiang Sci-Tech University under Grant number 22062338-Y, and in part by Beijing Postdoctoral Science Foundation under Grant number 2022-ZZ-069. (Corresponding author: Jun Zhou.) J. Zhou, Y. Li and N. Li are with the School of Information Science and Technology, Dalian Maritime University, Dalian, China (E-mail: [email protected], [email protected], [email protected]). H. Tan is with the Institute of Artificial Intelligence, Beijing University of Technology, Beijing, China (E-mail: [email protected]). M. Wang is with the School of Science, Zhejiang Sci-Tech University, Zhejiang, China (E-mail: [email protected]). X. Liu is with the School of Mathematical Sciences, Dalian University of Technology, Dalian 116024, China (E-mail: [email protected]).
Abstract

In recent years, point cloud normal estimation, as a classical and foundational algorithm, has garnered extensive attention in the field of 3D geometric processing. Despite the remarkable performance achieved by current neural network-based methods, their robustness is still influenced by the quality of the training data and the models’ performance. In this study, we design a fundamental framework for normal estimation, enhancing existing models through the incorporation of global information and various constraint mechanisms. Additionally, we employ a confidence-based strategy to select reasonable samples for fair and robust network training. The introduced sample confidence can be integrated into the loss function to balance the influence of different samples on model training. Finally, we utilize existing orientation methods to correct the estimated non-oriented normals, achieving state-of-the-art performance in both oriented and non-oriented tasks. Extensive experimental results demonstrate that our method works well on widely used benchmarks.

Index Terms:
Normal estimation, Sample selection, Robust training

I Introduction

Point cloud normal estimation is a pivotal task in computer graphics. Due to the inherently unordered and non-uniform distribution of point clouds, normals serve as fundamental features that provide valuable additional information. Effective methods for normal estimation have proven valuable across various downstream applications such as odometry and mapping [1, 2], 3D reconstruction [3, 4, 5, 6, 7], point cloud denoising [8], and semantic segmentation [9, 10].

In recent years, deep learning methods have gained considerable attention for the normal estimation task. In contrast to traditional approaches such as PCA-based methods [11] and jets [12], deep learning techniques demonstrate promise in handling diverse data and exhibiting robustness to noise without requiring extensive parameter tuning. Typically, these methods adhere to a unified paradigm: they sample local neighborhood patches as input, use a neural network to extract features, and employ a regressor to estimate normals. This regressor can either directly estimate local normals or use surface fitting techniques. Training data is sourced from the PCPNet [13] dataset. Algorithms developed based on this paradigm have achieved outstanding performance. Furthermore, thanks to their generalization capabilities, deep learning methods can be directly applied to various real scanned datasets, such as Semantic3D [14], SceneNN [15], and NYU depth v2 [16].

Figure 1: Explanation of how corrupt noise samples affect network training. (A) The top part illustrates the differences between patches with low and high noise levels regarding their underlying surfaces. In high noise scenarios, local patches may lack clear surface patterns. Additionally, compared to patches with low noise levels, those with high noise may show notable differences between the normals of query points and those of the nearest points on the surface. These instances, known as corrupt samples, can weaken the model’s robustness during training. (B) The table outlines four training strategies: A utilizes the entire PCPNet dataset, B exclusively uses clean data, C corrects normals from each point’s underlying surface, and D employs our confidence-based training method. It shows that relying solely on clean data or on corrected normals is not ideal due to insufficient training data, resulting in reduced model robustness. (C) This part illustrates how the sampling range varies based on confidence values, which decrease as the noise scale increases.

Benefiting from the feature extraction capabilities of neural networks, learning-based normal estimation algorithms, particularly the recent SHS-Net [17], have achieved outstanding performance. However, existing methods still rely on the PCPNet [13] dataset for training, which contains various scales of noise. This diversity can enhance the network’s generalization capability. Nevertheless, as the noise scale of a model increases, there is a higher probability that the sampled training patches will deviate further from the underlying clean surface, as illustrated in the top part of Fig. 1. Consequently, both the patterns of the sampled patches and their corresponding ground truth will deviate from the ideal scenario, introducing ambiguous information into the training process. This limitation can restrict the algorithm’s performance, causing trained models to produce biased results on noise-free data. As depicted in the table portion of Fig. 1, we evaluate a simple single-scale normal estimation model. Training solely with clean data, or with the entire dataset using corrected normals, adversely impacts the test results on clean data. However, applying confidence value constraints weakens the impact of corrupted samples on training, thereby enhancing model robustness.

To tackle this issue, we devise an evaluation process for each training sample that assigns it an individual confidence weight. These weights are incorporated into the loss function, so that sample selection is implemented as a soft constraint. This soft constraint approach offers advantages over direct ground truth correction. It not only addresses inconsistencies in ground truth data linked to high noise but also acknowledges inherent irregularities in the noisy data itself. Consequently, this method aids in mitigating model biases. Extensive analyses show that it maintains model performance on low-noise datasets while also yielding satisfactory results on datasets with substantial noise levels. In addition, regarding the architecture of the normal estimation network, we validate the effectiveness of multiple modules. By introducing global information and incorporating constraints such as the QSTN module, z loss, and local loss, we enhance the model’s performance. Finally, leveraging existing neural gradient techniques for correction, our approach demonstrates competitiveness in both unoriented normal estimation and consistent normal orientation tasks.

In summary, the contributions of this study are threefold:

  • A two-branch normal estimation neural network is proposed. This framework simultaneously captures local information of query points and their relative global context. By integrating multiple constraints, we construct a more robust baseline normal estimation model.

  • Two methods for estimating confidence values of sampling patches are proposed. These values are then used as weights in the loss function to select reasonable samples and suppress corrupt ones. Compared to directly using corrected normals, our approach leads to a more robust normal estimation model.

  • We utilize existing orientation methods to update the normals estimated by our model, thereby obtaining normals that are both consistently oriented and accurate. We experimentally demonstrate that our method estimates normals with high accuracy and achieves state-of-the-art results in both unoriented and oriented normal estimation.

II Related Work

In this section, we mainly review the task of unoriented normal estimation, whose methods can be generally divided into two categories: traditional methods and learning-based methods. Then, for completeness, we also briefly review the task of consistent normal orientation in the final subsection.

II-A Traditional Methods

Normal estimation in point clouds is a well-established field in geometry processing. Principal Component Analysis (PCA) [11] is the most renowned method, involving sampling fixed-size neighbors and using statistical algorithms to fit a local tangent plane. Variations like Moving Least Squares (MLS) [18], truncated Taylor expansion fitting (n-jet) [12], and local spherical surface fitting [19] have also been proposed within this paradigm, introducing more advanced fitting strategies to mitigate the impact of noise levels and patch scales on algorithm accuracy. Additionally, robust statistics-based methods [20, 21, 22, 23, 24, 25] have been developed to estimate local patches reasonably and alleviate the influence of patch anisotropy on accuracy. However, these methods still struggle with oversmoothing sharp features and geometric details. Consequently, techniques based on Voronoi diagrams [26, 27, 28], Hough transform [29], and plane voting [30] have been proposed. While these methods offer strong theoretical guarantees for stability and accuracy, they often require tedious parameter tuning for different data models. In recent years, driven by the rapid development of deep learning technology, data-driven approaches for normal estimation have emerged, achieving remarkable results.

II-B Learning-based Methods

As a pioneering work, HoughCNN [31] was the first to integrate deep learning models into normal estimation tasks. It used image information constructed from a transformed Hough space accumulator as input and employed Convolutional Neural Networks (CNNs) to estimate point cloud normals. This method demonstrated outstanding performance at the time and laid the foundation for the subsequent introduction of deep learning techniques based on point clouds. Following this, a series of more competitive algorithms emerged, including PCPNet [13] based on point cloud representation, Nesti-Net [32] based on 3D modified Fisher vector (3DmFV) representations, and IterNet [33] based on graph representation. These frameworks notably improved the accuracy and efficiency of normal estimation algorithms. In particular, PCPNet [13] garnered significant attention from researchers due to its simplicity and direct operation on point clouds. Inspired by this approach, various algorithms have emerged to enhance normal estimation methods, broadly divided into two categories: deep surface fitting and direct regression methods.

Deep surface fitting methods combine deep neural networks with traditional fitting techniques to accurately estimate normals from point clouds. For instance, Cao et al. [34] propose a differentiable RANSAC-like module at the end of a point cloud-based neural network to predict a latent tangent plane. Lenssen et al. [33] utilize a learnable anisotropic kernel to iteratively fit the local tangent plane. Additionally, Refine-Net [35] explores refining the initial normal for each point by considering various representations. Although these models make various efforts to enhance local fitting performance, the assumption of a local first-order plane is not suitable for handling diverse patch styles. Therefore, DeepFit [36] utilizes a neural network to learn point-wise weights, enabling weighted least-squares surface fitting. Subsequently, several variations of DeepFit have been proposed to further bolster the robustness of the fitting process. Among them, Zhang et al. [37] employ pre-estimated weights to guide the network in learning, thereby improving weight estimation accuracy. AdaFit [38] introduces a novel layer to aggregate features from multiple global scales and predicts point-wise offsets to enhance normal estimation accuracy. The offset strategy resembles that of prior work [39], albeit with the inclusion of offsets for the top-k points. Additionally, GraphFit [40] integrates graph convolution and adaptive fusion layers into the weight estimation network. Du et al. [41] introduce two simple strategies, a z-direction alignment loss and a learnable residual term, which significantly enhance DeepFit and its variants. Although these advancements mitigate the algorithm’s sensitivity to fitting order, they have not fundamentally resolved the issue. Consequently, deep surface fitting methods continue to grapple with challenges related to overfitting and underfitting.

In the early stages, direct regression models based on point cloud network frameworks, such as PCPNet [13] and its variants [42], were limited by network capacity and could not achieve outstanding performance. However, recent advancements in point cloud frameworks, such as Transformers, have enabled direct regression methods to exhibit stronger capabilities. For instance, Zhou et al. [43] introduce a fast patch stitching method utilizing transformer modules for direct multi-normal regression within a patch. Similarly, MSECNet [44] introduces a more effective Multi-Scale Edge Conditioning stream framework, resulting in higher accuracy. Unlike this paradigm of direct multi-normal regression for a patch, another type of direct regression method still follows the approach of PCPNet [13], where only the normal of the patch center is regressed. Recent works in this category include HSurf-Net [45] and SHS-Net [17], which first transform point clouds into a hyperspace through local and global feature extraction and then conduct plane fitting in the constructed space. Additionally, NeAF [46] infers an angle field around the ground truth normal to provide more information for learning from the input patch. Moreover, CMG-Net [47] identifies inaccurately annotated training samples and proposes the Chamfer Normal Distance to address this issue. However, the normal corrections introduced by that algorithm may not always be entirely reasonable. Even after correction, the normals of points far from the underlying surface in the presence of significant noise may still contain errors. Furthermore, this type of patch lacks sufficient underlying surface information, reducing the accuracy of the trained model when evaluating normals of low-noise point clouds. In this work, we propose a training strategy to select reasonable samples, significantly enhancing the performance of point cloud normal estimation algorithms.

II-C Consistent Normal Orientation

Note that the normals estimated by previous methods lack consistent orientation, a crucial aspect that has been extensively studied. Classic methods [11, 48, 49, 50, 51, 52, 53], inspired by the Minimum Spanning Tree (MST) algorithm, employ local diffusion strategies to achieve consistent normals for point clouds. Recently, ODP [54] proposes utilizing a neural network to learn oriented normals within local neighborhoods, introducing a dipole propagation strategy for global consistency. However, lacking global information incorporation renders this method less robust. Then, some approaches [55, 56, 57, 58, 26, 59, 60, 61, 62, 63] based on volumetric representation have been proposed to enhance orientation consistency. Current learning-based methods for consistent orientation estimation can be categorized into two types: direct regression using neural networks [17, 64] and those based on the Neural Gradient Function technique [65, 66]. Due to their superior capability in capturing global consistency, direct neural gradient methods enhance qualitative accuracy, though potentially compromising quantitative precision in normal regression. Hence, in this paper, we employ NeuralGF [66] to refine estimated normals, ensuring their effective application in 3D reconstruction tasks.

Figure 2: The learning pipeline of our method. Data preprocessing: local patches of query points and globally sampled patches based on probabilities are initialized after PCA processing. Network architecture: a shared QSTN aligns the inputs of the multiple branches, while the NeuralGF method rectifies normal orientation. Sample selection: two strategies for estimating confidence values are presented, with the surface-based confidence assessing the distance of each point to the potential surface, and the normal-based confidence evaluating the disparity between the normals of each point and those of the potential surface.

III Method

III-A Overview

The overall pipeline is depicted in Fig. 2. Similar to SHS-Net [17], a regression network is used to directly estimate the normal of the center point within a local patch. However, in contrast to SHS-Net [17], we enhance the network’s performance by introducing a rotation module and additional constraint losses. Furthermore, inspired by the integration of global information in SHS-Net [17], our approach incorporates globally sampled points as auxiliary information during normal regression. Detailed descriptions of the network architecture are provided in Sec. III-C. In Sec. III-D, we present two simple mechanisms for estimating the confidence of training samples based on position and normal differences with the underlying surface, respectively. Additionally, we introduce a confidence-based training strategy for sample selection, as described in Sec. III-E. Finally, to address potential errors in normal orientation introduced by PCA techniques, we employ the currently popular NeuralGF [66] method to refine the orientation of the estimated normals (Sec. III-F).

III-B Data pre-processing

As shown in the left part of Fig. 2, given a 3D point cloud $X=\{p_1,p_2,\cdots,p_L\}\in R^{L\times 3}$, we first sample a local patch $P_i=\{p_{i,j} \mid p_{i,j}\in KNN(p_i),\ j=1,2,\cdots,r\}$ for each query point $p_i\in X$ using k-nearest neighbor (kNN) search. Additionally, we sample a global set $P'_i$ including $r'$ points for the query point $p_i$, representing a patch with globally distributed points on the shape surface. As in previous works [17, 64], we introduce a probability-based sampling strategy to capture the local structure of the query point $p_i$ while retaining global information of the point cloud $X$. This strategy samples global points according to a density gradient that decreases with increasing distance from the query point $p_i$. The gradient of a point from $X$ can be calculated as follows:

$$g(p_j)=\begin{cases}\left[1-1.5\,\dfrac{\|p_j-p_i\|_2}{\max_{q\in X}\|q-p_i\|_2}\right]_{0.05}^{1} & \\ 1 & \text{if } j\in\mathcal{R}.\end{cases} \tag{1}$$

where $[\cdot]_{0.05}^{1}$ indicates value clamping, and $\mathcal{R}$ is a random sample index set of point cloud $X$ with $r'$ items. Consequently, we can sample points according to probability distributions. We then use the classical PCA method to align the sampled patches, both local and global, as the initial input to our network.
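
To make the sampling rule concrete, the following is a minimal sketch of Eq. (1) in plain Python. It is not the authors' implementation: the function names, the `forced` argument (our reading of how the index set $\mathcal{R}$ overrides the weight with 1), and the use of Efraimidis-Spirakis keys for weighted sampling without replacement are all assumptions made for illustration.

```python
import math
import random

def density_gradient(points, query, forced=frozenset()):
    """Eq. (1): weight 1 - 1.5 * d / d_max, clamped to [0.05, 1].

    `forced` is a hypothetical stand-in for the index set R: indices in it
    receive weight 1 regardless of distance.
    """
    dists = [math.dist(p, query) for p in points]
    d_max = max(dists) or 1.0  # guard against a degenerate single-point cloud
    weights = []
    for j, d in enumerate(dists):
        if j in forced:
            weights.append(1.0)
        else:
            weights.append(min(1.0, max(0.05, 1.0 - 1.5 * d / d_max)))
    return weights

def sample_global_patch(points, query, n_samples, rng):
    """Draw n_samples distinct indices with probability increasing in the weight.

    Uses Efraimidis-Spirakis keys u**(1/w) (an assumed sampling scheme):
    the indices with the largest keys form the sample.
    """
    weights = density_gradient(points, query)
    keys = [(rng.random() ** (1.0 / w), j) for j, w in enumerate(weights)]
    keys.sort(reverse=True)
    return [j for _, j in keys[:n_samples]]
```

Because the weights are clamped at 0.05 rather than 0, even the farthest points keep a small chance of being selected, which is what lets the global patch retain coarse shape information.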

III-C Architecture configuration

As shown in the top part of Fig. 2, our network framework comprises two branches: one for encoding features from local patches and the other from the global point cloud, similar to the architecture proposed by SHS-Net [17]. However, unlike SHS-Net, both the local patch $P_i$ and the global point cloud input $P'_i$ are initially passed through a shared QSTN module [41, 13] to initialize the orientation of the input patches. As the orientation determined by the local patch is more accurate, the QSTN transformation estimated from the local patch is shared by both inputs. Then, like SHS-Net [17], we group the local features by k-NN and capture geometric information using MLP and max-pooling. For feature encoding, we also use a distance-based weight strategy to evaluate the importance of each point. To allow each point in the local patch to acquire global information, we use max-pooling and repetition operations to ensure the global latent code has the same dimension as the local latent code. The two codes are then fused by concatenation. Finally, through hierarchical information fusion and attention-weighted normal prediction, the final normal of the patch center $n_i^{pred}$ and the neighbors’ normals are predicted.
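
The max-pooling, repetition, and concatenation step can be illustrated with a small sketch. This is only a schematic of the fusion described above, using nested lists in place of tensors; the function name and the feature shapes are assumptions, and the learned MLP layers around the fusion are omitted.

```python
def fuse_local_global(local_feats, global_feats):
    """Fuse per-point local features with one pooled global code.

    local_feats:  r  x C  list of per-point features from the local branch.
    global_feats: r' x C' list of per-point features from the global branch.
    Returns an r x (C + C') list: each local point gets the same global code.
    """
    # Max-pool over the r' global points, channel by channel.
    global_code = [max(channel) for channel in zip(*global_feats)]
    # Repeat the pooled code for every local point and concatenate.
    return [list(feat) + global_code for feat in local_feats]
```

Pooling before repetition makes the global code independent of the order and number of global points, so the fused feature has a fixed width whatever $r'$ is.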

III-D Confidence Estimation

First, we identify two potential issues during the training stage: 1) unreliable annotated normals exist for some training samples, as shown in the bottom right corner of Fig. 1, and 2) noisy patches may lack sufficient underlying surface structure, as depicted in the upper part of Fig. 1. For this reason, we propose two confidence-based strategies, namely surface-inclusion-based and normal-discrepancy-based confidence, to evaluate the reliability of training samples. These strategies can effectively mitigate the impact of corrupted samples on the robustness of the model during the training process.

Specifically, the surface-inclusion-based strategy primarily involves evaluating the distance from each sampled patch’s query point to the underlying surface. It is evident that points located far from the underlying surface are unlikely to effectively capture local surface structure information. Given a query point $p_i\in X$, our objective is to compute the distance from this point to the underlying surface, denoted as $d^S_i=D(p_i,X)$. However, estimating the underlying surface for the model $X$ affected by noise is challenging. Therefore, we use the corresponding noise-free point cloud $\hat{X}$ and assume it can represent the entire underlying surface. Based on this assumption, we compute the surface distance as $d^S_i=D(p_i,\hat{X})$. The noise-free point cloud $\hat{X}$ is represented as a point set $\hat{X}=\{\hat{p}_1,\hat{p}_2,\cdots,\hat{p}_L\}\in R^{L\times 3}$, and the distance $d^S_i$ can be approximately calculated as follows:

$$d^S_i=\min_{\hat{p}_j\in\hat{X}}\|p_i-\hat{p}_j\|_2. \tag{2}$$

Consequently, the surface-inclusion-based confidence value can be calculated:

$$c^S_i=\exp\left(-\frac{d^S_i}{s\,\sigma_S}\right), \tag{3}$$

where $s$ denotes the scale value of the given point cloud $X$, and $\sigma_S$ is set to 0.05 based on extensive experimentation. More details and comparative analyses are provided in the subsequent ablation experiments.
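
As an illustration, Eqs. (2) and (3) can be sketched in a few lines of plain Python. The function names are hypothetical, and the brute-force nearest-neighbour search is chosen only for readability; a practical pipeline would presumably use a KD-tree over the noise-free cloud.

```python
import math

def surface_distance(p, clean_points):
    """Eq. (2): distance from p to its nearest point in the noise-free cloud."""
    return min(math.dist(p, q) for q in clean_points)

def surface_confidence(p, clean_points, scale, sigma_s=0.05):
    """Eq. (3): c = exp(-d / (s * sigma_S)), with sigma_S = 0.05 as in the text."""
    return math.exp(-surface_distance(p, clean_points) / (scale * sigma_s))
```

For a point lying exactly on a clean sample the confidence is 1, and it decays to $e^{-1}\approx 0.37$ once the point is $s\,\sigma_S$ away from the nearest clean sample.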

In addition to the surface-inclusion-based confidence estimation strategy, we also introduce a normal-discrepancy-based confidence estimation strategy. We assume that the true normal for each query point should be consistent with the normal near the underlying surface. While this assumption may not always hold, it helps filter out corrupted training samples. Specifically, for a query point $p_i$ with an annotated normal vector $n_i\in R^3$, we search for the nearest point $\hat{p}_i$ on the noise-free point cloud $\hat{X}$. The point $\hat{p}_i$ approximates a point on the underlying surface $S$. Therefore, for the query point $p_i$, the normal at the approximate surface point can be obtained as follows:

$$\hat{n}_i=\left\{\hat{n}_j \,\middle|\, \arg\min_{\hat{p}_j\in\hat{X}}\|p_i-\hat{p}_j\|\right\}. \tag{4}$$

Then, the difference between the normal of the query point and the normal near the surface point can be calculated as follows:

$$d_i^N=\frac{\arccos\left(\left|\langle\mathbf{n}_i,\hat{\mathbf{n}}_i\rangle\right|\right)}{\pi/2}, \tag{5}$$

where $\langle\cdot,\cdot\rangle$ represents the inner product of two vectors and the $\arccos(\cdot)$ function computes the inverse cosine, returning the angle in radians. Finally, the normal-discrepancy confidence value can be calculated as follows:

$$c^N_i=\exp\left(-\frac{d^N_i}{\sigma_N}\right), \tag{6}$$

where $\sigma_N$ is set to 0.06 based on extensive experimentation.
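
Eqs. (4) to (6) can be sketched as follows, again in plain Python with hypothetical function names. The sketch assumes unit-length normals and takes the 0.06 value quoted for Eq. (6) as the bandwidth $\sigma_N$; the clamp before `acos` is our addition to guard against rounding error.

```python
import math

def nearest_clean_normal(p, clean_points, clean_normals):
    """Eq. (4): normal of the noise-free point closest to the query point p."""
    j = min(range(len(clean_points)), key=lambda k: math.dist(p, clean_points[k]))
    return clean_normals[j]

def normal_discrepancy(n, n_hat):
    """Eq. (5): unoriented angle between unit normals, scaled to [0, 1]."""
    dot = abs(sum(a * b for a, b in zip(n, n_hat)))
    dot = min(1.0, dot)  # clamp so rounding error cannot push acos out of range
    return math.acos(dot) / (math.pi / 2)

def normal_confidence(n, n_hat, sigma_n=0.06):
    """Eq. (6): c = exp(-d / sigma_N)."""
    return math.exp(-normal_discrepancy(n, n_hat) / sigma_n)
```

Because of the absolute value in Eq. (5), a flipped normal is not penalized: the measure is orientation-agnostic, which matches the unoriented setting of the training stage.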

III-E Loss function

Firstly, similar to the approach in [41], we use the transformation regularization loss and the z-direction transformation loss to constrain the output rotation matrix $R_i\in R^{3\times 3}$ of the QSTN operation for the $i$-th sample of the point cloud $X$. The goal of these regularization losses is to achieve a rigid transformation, aligning the sampled patch vertically with the z-axis as much as possible, thereby minimizing extra degrees of freedom in the given patch $P_i$. These two loss functions are defined as follows:

L1=IRiRiT2,subscript𝐿1superscriptnorm𝐼subscript𝑅𝑖superscriptsubscript𝑅𝑖𝑇2L_{1}=\|I-R_{i}R_{i}^{T}\|^{2},italic_L start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT = ∥ italic_I - italic_R start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT italic_R start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_T end_POSTSUPERSCRIPT ∥ start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT , (7)
L2=niRi×zsubscript𝐿2normsubscript𝑛𝑖subscript𝑅𝑖𝑧L_{2}=\|n_{i}R_{i}\times z\|italic_L start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT = ∥ italic_n start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT italic_R start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT × italic_z ∥ (8)

where IR3×3𝐼superscript𝑅33I\in R^{3\times 3}italic_I ∈ italic_R start_POSTSUPERSCRIPT 3 × 3 end_POSTSUPERSCRIPT represents the identity matrix and z=(0,0,1)𝑧001z=(0,0,1)italic_z = ( 0 , 0 , 1 ).
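The two regularization terms in Eqs. (7) and (8) can be sketched in NumPy as follows. This is a minimal illustration, not the authors' code: the function name `qstn_regularization` is hypothetical, the matrix norm in Eq. (7) is assumed to be the Frobenius norm, and $n_i$ is treated as a row vector so that $n_i R_i$ is a vector-matrix product.

```python
import numpy as np

Z_AXIS = np.array([0.0, 0.0, 1.0])

def qstn_regularization(R, n):
    """Return (L1, L2) of Eqs. (7)-(8) for one patch.

    R: (3, 3) matrix predicted by the QSTN; n: (3,) unit normal (row vector).
    """
    # Eq. (7): squared Frobenius norm (assumed), zero when R is orthogonal.
    l1 = np.sum((np.eye(3) - R @ R.T) ** 2)
    # Eq. (8): norm of the cross product, zero when n R is parallel to z.
    l2 = np.linalg.norm(np.cross(n @ R, Z_AXIS))
    return l1, l2

n = np.array([0.0, 1.0, 0.0])
# A rotation whose second row is z, so that n @ R_align equals (0, 0, 1).
R_align = np.array([[1.0, 0.0, 0.0], [0.0, 0.0, 1.0], [0.0, -1.0, 0.0]])
l1_aligned, l2_aligned = qstn_regularization(R_align, n)
l1_identity, l2_identity = qstn_regularization(np.eye(3), n)
```

Both terms vanish for `R_align`, whereas the identity matrix leaves the normal perpendicular to the z-axis and yields the maximal value of Eq. (8) for a unit normal.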

In addition to the regularization loss, we also introduce the center loss, neighborhood consistency loss, and weight constraint loss. Unlike previous methods, we re-weight the loss term for the $i$-th sample using the corresponding confidence value $c_i$. Here, $c_i$ can be either the surface-inclusion-based confidence or the normal-discrepancy-based confidence. Specifically, the re-weighted center loss for the patch $P_i$ minimizes the sine loss between the estimated normal $n_i^{pred}$ and the ground truth $n_i^{gt}$ as follows:

$$L_3 = c_i \|n_i^{pred} \times n_i^{gt}\|. \tag{9}$$
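As a small illustration of Eq. (9), the sketch below evaluates the confidence-weighted sine loss for a batch of patches. The function name `center_loss`, the batch layout, and the toy values are assumptions made for this example.

```python
import numpy as np

def center_loss(n_pred, n_gt, c):
    """Eq. (9) per patch: confidence times the sine of the angle between unit normals."""
    return c * np.linalg.norm(np.cross(n_pred, n_gt), axis=-1)

n_gt = np.array([[0.0, 0.0, 1.0], [0.0, 0.0, 1.0], [0.0, 0.0, 1.0]])
n_pred = np.array([[0.0, 0.0, 1.0], [0.0, 0.0, -1.0], [1.0, 0.0, 0.0]])
c = np.array([1.0, 1.0, 0.5])   # hypothetical confidence values
loss = center_loss(n_pred, n_gt, c)
```

The first two entries are zero because the sine loss does not distinguish a normal from its flipped counterpart, which matches the unoriented setting; the third entry shows a perpendicular prediction whose contribution is halved by its confidence.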

The weight constraint loss [37] incorporating our confidence value can be written as:

$$L_4 = c_i \frac{1}{M}\sum_{k=1}^{M} \left(w_{i,k}^{pred} - w_{i,k}^{gt}\right)^2, \tag{10}$$

where $w^{pred}$ are the predicted weights for each point in the given patch, $M$ represents the cardinality of the downsampled patch, $w_{i,k}^{gt} = \exp(-(p_{i,k} \cdot n_i^{gt})^2/\delta^2)$ and $\delta = \max(0.05^2, 0.3\sum_{k=1}^{M}(p_{i,k} \cdot n_i^{gt})^2/M)$, where $p_{i,k}$ is a point in the downsampled patch of $P_i$. Moreover, the neighborhood consistency loss emphasizes the importance of local points near the query center. The loss function for patch $P_i$ is expressed as:

$$L_5 = c_i \sum_{k=1}^{M} w_{i,k}^{pred} \|n_{i,k}^{pred} - n_{i,k}^{gt}\|^2, \tag{11}$$

where the neighborhood point normals $n_{i,k}^{pred}$ are predicted by our network. Therefore, the final loss function for the sample patch $P_i$ is defined as follows:

$$L = \lambda_1 L_1 + \lambda_2 L_2 + \lambda_3 L_3 + \lambda_4 L_4 + \lambda_5 L_5, \tag{12}$$

where $\lambda_1 = 0.1$, $\lambda_2 = 0.5$, $\lambda_3 = 0.1$, $\lambda_4 = 1$, and $\lambda_5 = 0.25$ are weighting factors.
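The remaining terms, Eqs. (10) to (12), can be sketched in the same way. The code below is a hedged illustration: all function names are hypothetical, the patch points are assumed to be expressed relative to the query point (so that $p_{i,k} \cdot n_i^{gt}$ is a signed distance to the tangent plane), and $\delta$ is implemented exactly as printed in the text.

```python
import numpy as np

def gt_weights(points, n_gt):
    """Ground-truth weights w^{gt} for one patch; points: (M, 3), n_gt: (3,)."""
    sq_dist = (points @ n_gt) ** 2                      # (p_{i,k} . n_i^{gt})^2
    delta = max(0.05 ** 2, 0.3 * sq_dist.mean())        # delta as printed in the text
    return np.exp(-sq_dist / delta ** 2)

def weight_loss(w_pred, w_gt, c):
    return c * np.mean((w_pred - w_gt) ** 2)            # Eq. (10)

def neighborhood_loss(w_pred, n_pred_nb, n_gt_nb, c):
    sq_err = np.sum((n_pred_nb - n_gt_nb) ** 2, axis=1)
    return c * np.sum(w_pred * sq_err)                  # Eq. (11)

LAMBDAS = (0.1, 0.5, 0.1, 1.0, 0.25)                    # lambda_1..lambda_5 from the text

def total_loss(l1, l2, l3, l4, l5, lambdas=LAMBDAS):
    return sum(lam * l for lam, l in zip(lambdas, (l1, l2, l3, l4, l5)))  # Eq. (12)

# Toy patch: two points on the tangent plane and one point 0.05 above it.
points = np.array([[0.1, 0.0, 0.0], [0.0, 0.2, 0.0], [0.0, 0.0, 0.05]])
w_gt = gt_weights(points, np.array([0.0, 0.0, 1.0]))
l4 = weight_loss(w_gt, w_gt, c=0.8)
l5 = neighborhood_loss(
    np.array([1.0, 0.5]),
    np.array([[0.0, 0.0, 1.0], [1.0, 0.0, 0.0]]),
    np.array([[0.0, 0.0, 1.0], [0.0, 0.0, 1.0]]),
    c=0.5,
)
```

In the toy patch, the in-plane points receive a weight of 1 and the off-plane point is strongly down-weighted, which is the behaviour the weight constraint loss asks the network to reproduce.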

III-F Normal orientation correction

It is well known that normal estimation methods that directly regress normals with neural networks often struggle to achieve both correct orientation and high accuracy. To address this problem, our study separates these aspects: the network focuses solely on the accuracy of the regressed normal, while NeuralGF [66] is introduced to estimate the orientation for the entire input model using neural gradient functions. Details of this method can be found in [66]. With this approach, we obtain the orientation $n_i^{gf}$ for each query point of the given model. Then, as shown in the upper right part of Fig. 2, the orientation of the normal $n_i^{pred}$ can be corrected using the following formula:

$$\hat{n}_i^{pred} = \mathrm{sign}(n_i^{gf} \cdot n_i^{pred})\, n_i^{pred}, \tag{13}$$

where $\mathrm{sign}(\cdot)$ is the signum function.
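The correction in Eq. (13) amounts to a per-point sign flip, as the following sketch shows. The function name `correct_orientation` is hypothetical, and the handling of the degenerate case in which the two normals are exactly orthogonal (where the signum would return zero) is an assumption that the text does not discuss.

```python
import numpy as np

def correct_orientation(n_pred, n_gf):
    """Flip each predicted normal so that it agrees with the reference orientation."""
    s = np.sign(np.sum(n_gf * n_pred, axis=1, keepdims=True))
    s[s == 0] = 1.0   # assumption: keep the prediction unchanged if the dot product is zero
    return s * n_pred

n_pred = np.array([[0.0, 0.0, 1.0], [0.0, 0.0, 1.0], [1.0, 0.0, 0.0]])
n_gf = np.array([[0.0, 0.1, 0.9], [0.0, 0.0, -1.0], [0.0, 1.0, 0.0]])
n_oriented = correct_orientation(n_pred, n_gf)
```

Only the sign of the prediction changes, so the unoriented accuracy of the regressed normal is preserved while the orientation is inherited from the reference.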

TABLE I: Comparison of the RMSE angle error for unoriented normal estimation between our method, classical geometric methods, and deep learning methods on the PCPNet, FamousShape, and SceneNN datasets. For PCPNet and FamousShape, the columns None, 0.12%, 0.6%, and 1.2% are noise levels, and Striped and Gradient are density variations. * means the code is incomplete.

| Category | PCPNet: None | PCPNet: 0.12% | PCPNet: 0.6% | PCPNet: 1.2% | PCPNet: Striped | PCPNet: Gradient | PCPNet: Average | FamousShape: None | FamousShape: 0.12% | FamousShape: 0.6% | FamousShape: 1.2% | FamousShape: Striped | FamousShape: Gradient | FamousShape: Average | SceneNN: Clean | SceneNN: Noise | SceneNN: Average |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Jet [12] | 12.35 | 12.84 | 18.33 | 27.68 | 13.39 | 13.13 | 16.29 | 20.11 | 20.57 | 31.34 | 45.19 | 18.82 | 18.69 | 25.79 | 15.17 | 15.59 | 15.38 |
| PCA [11] | 12.29 | 12.87 | 18.38 | 27.52 | 13.66 | 12.81 | 16.25 | 19.90 | 20.60 | 31.33 | 45.00 | 19.84 | 18.54 | 25.87 | 15.93 | 16.32 | 16.12 |
| PCPNet [13] | 9.64 | 11.51 | 18.27 | 22.84 | 11.73 | 13.46 | 14.58 | 18.47 | 21.07 | 32.60 | 39.93 | 18.14 | 19.50 | 24.95 | 20.86 | 21.40 | 21.13 |
| Zhou et al. [42] | 8.67 | 10.49 | 17.62 | 24.14 | 10.29 | 10.66 | 13.62 | - | - | - | - | - | - | - | - | - | - |
| Nesti-Net [32] | 7.06 | 10.24 | 17.77 | 22.31 | 8.64 | 8.95 | 12.49 | 11.60 | 16.80 | 31.61 | 39.22 | 12.33 | 11.77 | 20.55 | 13.01 | 15.19 | 14.10 |
| Lenssen et al. [33] | 6.72 | 9.95 | 17.18 | 21.96 | 7.73 | 7.51 | 11.84 | 11.62 | 16.97 | 30.62 | 39.43 | 11.21 | 10.76 | 20.10 | 10.24 | 13.00 | 11.62 |
| DeepFit [36] | 6.51 | 9.21 | 16.73 | 23.12 | 7.92 | 7.31 | 11.80 | 11.21 | 16.39 | 29.84 | 39.95 | 11.84 | 10.54 | 19.96 | 10.33 | 13.07 | 11.70 |
| Refine-Net [35] | 5.92 | 9.04 | 16.52 | 22.19 | 7.70 | 7.20 | 11.43 | - | - | - | - | - | - | - | - | - | - |
| Zhang et al. [37] | 5.62 | 9.91 | 16.78 | 22.93 | 6.68 | 6.29 | 11.25 | - | - | - | - | - | - | - | - | - | - |
| AdaFit [40] | 5.19 | 9.05 | 16.45 | 21.94 | 6.01 | 5.90 | 10.76 | 9.09 | 15.78 | 29.78 | 38.74 | 8.52 | 8.57 | 18.41 | 8.39 | 12.85 | 10.62 |
| GraphFit [40] | 5.21 | 8.96 | 16.12 | 21.71 | 6.30 | 5.86 | 10.69 | 8.91 | 15.73 | 29.37 | 38.67 | 9.10 | 8.62 | 18.40 | 8.39 | 12.85 | 10.62 |
| Hsurf [45] | 4.17 | 8.78 | 16.25 | 21.61 | 4.98 | 4.86 | 10.11 | 7.59 | 15.64 | 29.43 | 38.54 | 7.63 | 7.40 | 17.70 | 7.55 | 12.33 | 9.89 |
| SHS-Net [17] | 3.95 | 8.55 | 16.13 | 21.53 | 4.91 | 4.67 | 9.96 | 7.41 | 15.34 | 29.33 | 38.56 | 7.74 | 7.28 | 17.61 | 7.93 | 12.40 | 10.17 |
| Li et al. [65] | 4.06 | 8.70 | 16.12 | 21.65 | 4.80 | 4.56 | 9.89 | 7.25 | 15.60 | 29.35 | 38.74 | 7.60 | 7.20 | 17.62 | - | - | - |
| CMG-Net [47] | 3.86 | 8.45 | 16.08 | 21.89 | 4.85 | 4.45 | 9.93 | 7.07 | 14.83 | 29.04 | 38.93 | 7.43 | 7.02 | 17.39 | 7.64 | 11.82 | 9.73 |
| MSECNet [44] | 3.84 | 8.74 | 16.10 | 21.05 | 4.34 | 4.51 | 9.76 | 6.73 | 15.52 | 29.19 | 38.06 | 6.68 | 6.70 | 17.15 | 6.94 | 11.66 | 9.30 |
| Ours(S) | 3.43 | 8.38 | 16.27 | 21.59 | 4.18 | 4.10 | 9.66 | 6.71 | 15.07 | 29.37 | 39.30 | 6.89 | 6.65 | 17.33 | 7.42 | 11.70 | 9.56 |
| Ours(N) | 3.42 | 8.32 | 16.22 | 21.71 | 4.10 | 4.14 | 9.65 | 6.62 | 15.03 | 29.25 | 38.85 | 6.86 | 6.63 | 17.21 | 7.42 | 11.81 | 9.61 |
Figure 3: AUC on the PCPNet and FamousShape datasets. The X and Y axes are the angle threshold and the percentage of good point (PGP) normals, respectively.
Figure 4: Visualization comparisons on the PCPNet and FamousShape datasets, with numbers indicating RMSEs. The angle error is visualized using a heatmap.
Figure 5: Qualitative comparisons on the SceneNN dataset (Noise: $\sigma$ = 0.3%).

IV Experiments

IV-A Datasets and Settings

In this study, all compared models are trained on the same training set from the PCPNet dataset [13], which provides ground-truth normals with consistent orientation. Various scales of noise and distribution density are applied during training. To evaluate the generalization capability of our method, we use both synthetic and real datasets. The synthetic datasets comprise PCPNet [13] and FamousShape [67], while the real datasets comprise Semantic3D [14] and SceneNN [15]. The test dataset configuration is identical to that of SHS-Net [17]. Following SHS-Net [17], we employ two metrics, the angle Root Mean Squared Error (RMSE) and the Area Under the Curve (AUC), to quantitatively compare different methods. For training, we use the Adam optimizer with a batch size of 145, 800 epochs, and a base learning rate of 0.0009. Our network is trained on a single NVIDIA RTX 3090 GPU. For the network configuration, in local patch encoding we randomly select a query point and its 700 neighboring points to create a patch, and in global shape encoding we sample 1200 points from the shape point cloud. Further details regarding parameter settings are provided in the ablation study subsection.
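For readers unfamiliar with the two metrics, the sketch below shows one common way to compute the unoriented angle RMSE and the area under the percentage-of-good-points (PGP) curve. It is not the evaluation code used in the experiments: the function names, the threshold range of 0 to 30 degrees, and the normalization of the AUC are assumptions chosen for illustration.

```python
import numpy as np

def unoriented_angle_errors_deg(n_pred, n_gt):
    """Per-point angle (degrees) between unit normals, ignoring their sign."""
    cos = np.clip(np.abs(np.sum(n_pred * n_gt, axis=1)), 0.0, 1.0)
    return np.degrees(np.arccos(cos))

def rmse(err_deg):
    return float(np.sqrt(np.mean(err_deg ** 2)))

def pgp_curve(err_deg, thresholds_deg):
    """Share of points whose error is below each angle threshold."""
    return np.array([np.mean(err_deg < t) for t in thresholds_deg])

def auc(thresholds_deg, pgp):
    """Area under the PGP curve (trapezoidal rule), normalized by the threshold range."""
    area = np.sum(0.5 * (pgp[1:] + pgp[:-1]) * np.diff(thresholds_deg))
    return float(area / (thresholds_deg[-1] - thresholds_deg[0]))

n_gt = np.array([[0.0, 0.0, 1.0]] * 4)
n_pred = np.array([[0.0, 0.0, -1.0], [0.0, 0.0, 1.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]])
errors = unoriented_angle_errors_deg(n_pred, n_gt)   # 0, 0, 90, 0 degrees
thresholds = np.linspace(0.0, 30.0, 31)              # hypothetical threshold range
curve = pgp_curve(errors, thresholds)
area = auc(thresholds, curve)
```

For oriented normals, the absolute value in the cosine would be dropped so that a flipped normal counts as a 180-degree error.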

TABLE II: Comparison of the RMSE angle error for corrected unoriented normal estimation between our method, classical geometric methods, and deep learning methods on the modified PCPNet dataset. The columns None, 0.12%, 0.6%, and 1.2% are noise levels, and Striped and Gradient are density variations. * means the code is incomplete.

| Category | None | 0.12% | 0.6% | 1.2% | Striped | Gradient | Average |
| --- | --- | --- | --- | --- | --- | --- | --- |
| PCPNet [13] | 9.62 | 11.23 | 17.28 | 20.16 | 11.15 | 11.69 | 13.52 |
| Nesti-Net [32] | 8.43 | 10.57 | 15.00 | 18.16 | 10.20 | 10.66 | 12.17 |
| DeepFit [36] | 6.51 | 8.98 | 13.98 | 19.00 | 7.93 | 7.31 | 10.62 |
| AdaFit [40] | 5.21 | 8.79 | 13.55 | 17.31 | 6.01 | 5.90 | 9.46 |
| GraphFit [40] | 4.49 | 8.43 | 13.00 | 16.93 | 5.40 | 5.20 | 8.91 |
| Hsurf [45] | 4.17 | 8.52 | 13.23 | 16.72 | 4.98 | 4.86 | 8.75 |
| SHS-Net [17] | 3.95 | 8.29 | 13.13 | 16.60 | 4.91 | 4.67 | 8.59 |
| CMG-Net [47] | 3.86 | 8.13 | 12.55 | 16.23 | 4.85 | 4.45 | 8.35 |
| Ours(S) | 3.43 | 8.06 | 13.21 | 16.63 | 4.10 | 4.18 | 8.27 |
| Ours(N) | 3.51 | 7.79 | 12.92 | 16.12 | 4.32 | 4.05 | 8.15 |

IV-B Normal estimation performance

Results on synthetic datasets. As shown in Table I, our method is competitive with existing deep learning-based and traditional methods on the synthetic PCPNet [13] and FamousShape [67] datasets. Here, ’S’ represents the surface-inclusion-based strategy, and ’N’ represents the normal-discrepancy-based strategy. Notably, our method achieves superior scores for point clouds with lower noise and varying density. In higher-noise scenarios, our method achieves accuracy comparable to mainstream methods. This suggests that our model and sample selection strategy enhance the exploration of local patch patterns, thereby reducing interference from noisy samples during training and leading to more accurate local normal estimation. We then compared the accuracy of state-of-the-art algorithms on corrected normals sourced from the PCPNet dataset, as reported in Table II. Our method remains the most competitive on point clouds with high noise, demonstrating the robustness of our normal estimation architecture. Additionally, the AUC results of all methods are shown in Fig. 3, where our method demonstrates superior performance and remains stable across various angular thresholds. In Fig. 4, qualitative comparison results are presented using a heatmap to illustrate the angular error at each point of the point cloud. Our method exhibits the smallest errors in regions characterized by varying density, intricate geometry, and local details. In Table III, we present quantitative comparison results of oriented normal estimation on the PCPNet and FamousShape datasets. Our method provides the most accurate normals under almost all noise levels and density variations for both datasets. The experimental results demonstrate a significant improvement in normal estimation accuracy by combining the regression network with NeuralGF [66], a global-oriented normal estimation method. This approach contrasts with SHS-Net [17], which employs a single network for both normal and orientation estimation.

Results on real datasets. To further evaluate the generalization capacity of our method, we conducted quantitative and qualitative experiments on real datasets, including the indoor SceneNN [15] dataset and the outdoor Semantic3D [14] dataset. The SceneNN dataset is captured using a depth camera, and the ground-truth normals are calculated from the reconstructed meshes. Testing on the same models as SHS-Net [17], we present the RMSE scores of different methods in Table I. Our method shows significant improvement over SHS-Net [17] and remains competitive with the latest methods, including CMG-Net [47] and MSECNet [44]. This further demonstrates the effectiveness of the two proposed sample selection strategies in enhancing the model’s generalization capability. Additionally, Figure 5 visualizes the angle errors; these results show that our method is more accurate in local details. We also evaluate our method on the Semantic3D dataset [14], which comprises laser-scanned point clouds without ground-truth normals. As shown in Figure 6, we provide visual comparisons by mapping the normals to the RGB space. Our method achieves more accurate estimations in the detailed regions, resulting in sharper estimation results.
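The text does not specify how normals are mapped to the RGB space. A widely used convention, shown here purely as an assumption, rescales each component from $[-1, 1]$ to $[0, 255]$; the function name `normals_to_rgb` is hypothetical.

```python
import numpy as np

def normals_to_rgb(normals):
    """Map unit normals of shape (N, 3) to 8-bit RGB colors via (n + 1) / 2."""
    n = normals / np.linalg.norm(normals, axis=1, keepdims=True)
    return np.round((n + 1.0) * 0.5 * 255.0).astype(np.uint8)

colors = normals_to_rgb(np.array([[0.0, 0.0, 1.0], [-1.0, 0.0, 0.0]]))
```

Under this convention, smooth regions appear as slowly varying colors and sharp features appear as abrupt color changes, which is what makes such renderings useful for visual comparison when no ground truth is available.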

Figure 6: Qualitative comparisons on the Semantic3D dataset, with point normals represented as RGB colors.
TABLE III: Comparison of the RMSE angle error for oriented normal estimation between our method and other methods on the PCPNet and FamousShape datasets. The columns None, 0.12%, 0.6%, and 1.2% are noise levels, and Striped and Gradient are density variations. * means the code is incomplete.

| Category | PCPNet: None | PCPNet: 0.12% | PCPNet: 0.6% | PCPNet: 1.2% | PCPNet: Striped | PCPNet: Gradient | PCPNet: Average | FamousShape: None | FamousShape: 0.12% | FamousShape: 0.6% | FamousShape: 1.2% | FamousShape: Striped | FamousShape: Gradient | FamousShape: Average |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| PCA+MST [11] | 19.05 | 30.20 | 31.76 | 39.64 | 27.11 | 23.38 | 28.52 | 35.88 | 41.67 | 38.09 | 60.16 | 31.69 | 35.40 | 40.48 |
| PCA+SNO [49] | 18.55 | 21.61 | 30.94 | 39.54 | 23.00 | 25.46 | 26.52 | 32.25 | 39.39 | 41.80 | 61.91 | 36.69 | 35.82 | 41.31 |
| PCA+ODP [54] | 28.96 | 25.86 | 34.91 | 51.52 | 28.70 | 32.00 | 32.16 | 30.47 | 31.29 | 41.65 | 84.00 | 39.41 | 30.72 | 42.92 |
| AdaFit [38]+MST | 27.67 | 43.69 | 48.83 | 54.39 | 36.18 | 40.46 | 41.87 | 43.12 | 39.33 | 62.28 | 60.27 | 45.58 | 42.00 | 48.76 |
| AdaFit [38]+SNO | 26.41 | 24.17 | 40.31 | 48.76 | 27.74 | 31.56 | 33.16 | 27.55 | 37.60 | 69.56 | 62.77 | 27.86 | 29.19 | 42.42 |
| AdaFit [38]+ODP | 26.37 | 24.86 | 35.44 | 51.88 | 26.45 | 20.57 | 30.93 | 41.75 | 39.19 | 44.31 | 72.91 | 45.09 | 42.37 | 47.60 |
| Hsurf-Net [45]+MST | 29.82 | 44.49 | 50.47 | 55.47 | 40.54 | 43.15 | 43.99 | 54.02 | 42.67 | 68.37 | 65.91 | 52.52 | 53.96 | 56.24 |
| Hsurf-Net [45]+SNO | 30.34 | 32.34 | 44.08 | 51.71 | 33.46 | 40.49 | 38.74 | 41.62 | 41.06 | 67.41 | 62.04 | 45.59 | 43.83 | 50.26 |
| Hsurf-Net [45]+ODP | 26.91 | 24.85 | 35.87 | 51.75 | 26.91 | 20.16 | 31.07 | 43.77 | 43.74 | 46.91 | 72.70 | 45.09 | 43.98 | 49.37 |
| PCPNet [13] | 33.34 | 34.22 | 40.54 | 44.46 | 37.95 | 35.44 | 37.66 | 40.51 | 41.09 | 46.67 | 54.36 | 40.54 | 44.26 | 44.57 |
| DPGO [64] | 23.79 | 25.19 | 35.66 | 43.89 | 28.99 | 29.33 | 31.14 | - | - | - | - | - | - | - |
| SHS-Net [17] | 10.28 | 13.23 | 25.40 | 35.51 | 16.40 | 17.92 | 19.79 | 21.63 | 25.96 | 41.14 | 52.67 | 26.39 | 28.97 | 32.79 |
| Li et al. [65] | 12.52 | 12.97 | 25.94 | 33.25 | 16.81 | 9.47 | 18.49 | 13.22 | 18.66 | 39.70 | 51.96 | 31.32 | 11.30 | 27.69 |
| NeuralGF [66] | 10.60 | 18.30 | 24.76 | 33.45 | 12.27 | 12.85 | 18.70 | 16.57 | 19.28 | 36.22 | 50.27 | 17.23 | 17.38 | 26.16 |
| Ours(S) | 6.67 | 11.51 | 24.69 | 33.76 | 8.18 | 8.66 | 15.58 | 10.52 | 18.67 | 36.65 | 52.68 | 11.07 | 10.37 | 23.33 |
| Ours(N) | 6.76 | 11.53 | 24.64 | 33.54 | 7.91 | 8.57 | 15.49 | 10.37 | 18.41 | 36.43 | 51.31 | 11.13 | 10.29 | 22.99 |
TABLE IV: Ablation studies for unoriented normals on the PCPNet dataset with different settings. (A) denotes the single-branch architecture with local patch input, and (B) denotes the two-branch architecture with global and local patch inputs. The columns None, 0.12%, 0.6%, and 1.2% are noise levels, and Striped and Gradient are density variations.

| Architecture | Ablation | None | 0.12% | 0.6% | 1.2% | Striped | Gradient | Average | $\Delta(\%)$ |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| (A) | baseline (center loss + neighbor loss) | 3.71 | 8.66 | 16.13 | 21.65 | 4.69 | 4.48 | 9.89 | - |
| (A) | baseline + qstn | 3.60 | 8.56 | 16.61 | 21.58 | 4.41 | 4.73 | 9.84 | 0.51 |
| (A) | baseline + qstn + z-loss | 3.67 | 8.63 | 16.26 | 21.69 | 4.52 | 4.56 | 9.89 | 0.00 |
| (A) | best (baseline + qstn + z-loss + neighbor-sin) | 3.70 | 8.50 | 16.24 | 21.54 | 4.51 | 4.49 | 9.83 | 0.61 |
| (A) | best + train with modified gt | 3.69 | 8.49 | 16.08 | 21.87 | 4.73 | 4.33 | 9.86 | 0.30 |
| (A) | best + surface confidence ($\sigma = 0.05$) | 3.50 | 8.51 | 16.26 | 21.67 | 4.29 | 4.16 | 9.73 | 1.62 |
| (A) | best + normal confidence ($\sigma = 0.05$) | 3.44 | 8.43 | 16.18 | 21.98 | 4.10 | 4.22 | 9.72 | 1.72 |
| (B) | baseline (center loss + neighbor loss) | 3.87 | 8.56 | 16.44 | 21.56 | 4.57 | 4.84 | 9.97 | - |
| (B) | baseline + qstn | 4.07 | 8.55 | 16.50 | 21.68 | 4.71 | 4.90 | 10.07 | -1.00 |
| (B) | baseline + qstn + z-loss | 3.53 | 8.54 | 16.32 | 21.58 | 4.31 | 4.61 | 9.82 | 1.50 |
| (B) | best (baseline + qstn + z-loss + neighbor-sin) | 3.52 | 8.38 | 16.25 | 21.45 | 4.31 | 4.41 | 9.72 | 2.51 |
| (B) | best + train with modified gt | 3.62 | 8.39 | 16.22 | 21.92 | 4.58 | 4.52 | 9.87 | 1.00 |
| (B) | best + surface confidence ($\sigma = 0.05$) | 3.43 | 8.38 | 16.27 | 21.59 | 4.18 | 4.10 | 9.66 | 3.11 |
| (B) | best + normal confidence ($\sigma = 0.05$) | 3.51 | 8.32 | 16.31 | 21.67 | 4.32 | 4.05 | 9.70 | 2.71 |
TABLE V: Ablation studies for unoriented normals on the PCPNet dataset with different $\sigma$ values for the surface-inclusion-based (surface conf) and normal-discrepancy-based (normal conf) confidence. The columns None, 0.12%, 0.6%, and 1.2% are noise levels, and Striped and Gradient are density variations.

| $\sigma$ | surface conf | normal conf | None | 0.12% | 0.6% | 1.2% | Striped | Gradient | Average |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 0.1 | ✓ | | 6.62 | 9.50 | 16.94 | 22.24 | 7.98 | 7.25 | 11.75 |
| 0.1 | | ✓ | 3.43 | 8.40 | 16.26 | 21.72 | 4.27 | 4.27 | 9.72 |
| 0.08 | ✓ | | 6.41 | 9.45 | 16.69 | 22.07 | 7.64 | 6.92 | 11.53 |
| 0.08 | | ✓ | 3.43 | 8.39 | 16.28 | 21.67 | 4.20 | 4.09 | 9.68 |
| 0.06 | ✓ | | 3.62 | 8.52 | 16.28 | 21.90 | 4.44 | 4.32 | 9.85 |
| 0.06 | | ✓ | 3.42 | 8.32 | 16.23 | 21.71 | 4.10 | 4.14 | 9.65 |
| 0.05 | ✓ | | 3.43 | 8.38 | 16.27 | 21.59 | 4.18 | 4.10 | 9.66 |
| 0.05 | | ✓ | 3.51 | 8.32 | 16.31 | 21.67 | 4.32 | 4.05 | 9.70 |
| 0.03 | ✓ | | 3.77 | 8.43 | 16.44 | 21.54 | 4.74 | 4.57 | 9.91 |
| 0.03 | | ✓ | 3.60 | 8.39 | 16.30 | 21.59 | 4.349 | 4.293 | 9.75 |
| 0.02 | ✓ | | 6.24 | 9.42 | 16.78 | 22.08 | 7.49 | 6.93 | 11.49 |
| 0.02 | | ✓ | 3.52 | 8.46 | 16.34 | 22.14 | 4.21 | 4.23 | 9.82 |
| 0.01 | ✓ | | 3.40 | 8.40 | 16.34 | 22.14 | 4.10 | 4.24 | 9.77 |
| 0.01 | | ✓ | 3.44 | 8.46 | 16.36 | 22.12 | 4.07 | 4.00 | 9.74 |
Figure 7: The comparison of Poisson surface reconstruction using normals estimated from different methods.

IV-C Ablation Study

Parameter Setting. The confidence values estimated by our surface-inclusion-based and normal-discrepancy-based strategies are used to suppress unreasonable samples during training, so their choice significantly affects model accuracy. Because these confidence values are governed by $\sigma$, this section discusses the choice of $\sigma$ in more detail. If $\sigma$ is set too high, the confidence values of different samples become similar, making it difficult to distinguish between them. Conversely, if $\sigma$ is too low, many noisy samples can hardly participate in the training process. Therefore, it is crucial to choose a moderate value. As shown in Table V, a $\sigma$ value of 0.05 is appropriate for calculating the surface-inclusion-based confidence, while a $\sigma$ value of 0.06 is most suitable for the normal-discrepancy-based confidence.
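The trade-off described above can be seen numerically with the exponential form $c = \exp(-d/\sigma)$ of Eq. (6). The discrepancy values and the three $\sigma$ settings below are hypothetical and chosen only to exaggerate the effect; they are not taken from the experiments.

```python
import math

def confidence(d, sigma):
    """Exponential confidence c = exp(-d / sigma) for a normalized discrepancy d."""
    return math.exp(-d / sigma)

def confidence_spread(discrepancies, sigma):
    """Gap between the highest and lowest confidence in a set of samples."""
    values = [confidence(d, sigma) for d in discrepancies]
    return max(values) - min(values)

discrepancies = [0.0, 0.02, 0.05, 0.2]        # hypothetical sample discrepancies
spread_large = confidence_spread(discrepancies, sigma=1.0)
spread_moderate = confidence_spread(discrepancies, sigma=0.05)
weight_noisy_small = confidence(0.2, sigma=0.005)
```

With a large $\sigma$ the spread between clean and noisy samples is small, so the weighting barely discriminates; with a very small $\sigma$ the weight of the noisiest sample is numerically negligible, so it effectively drops out of training.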

Network Architecture. To demonstrate the effectiveness of our architecture and the reweighting-based sample selection strategy, we conducted extensive ablation experiments on the PCPNet dataset. As shown in Table IV (A)-(B), we compared two baselines: a single-branch architecture with local patch input and a two-branch architecture with both global and local patch inputs. Without the additional losses and the QSTN module, the extra global information can negatively impact model training, leading to decreased model accuracy (the average RMSE increases from 9.89 to 9.97). Table IV also shows that, for both the single-branch and the two-branch frameworks, introducing the QSTN module, the z-direction transformation loss, and the neighborhood consistency loss improves model accuracy (by 0.61% and 2.51% relative to the respective baselines). With the enhancement of model representation and the introduction of effective constraints, the global branch provides valuable information that further improves the performance of normal estimation. Therefore, we ultimately chose the two-branch framework as the final normal estimation model. Finally, we evaluated the effectiveness of the reweighting-based sample selection method, as shown in the last three rows of Table IV (A) and (B). Directly training with the corrected normals, as done in CMG-Net [47], does not lead to significant improvements in either the single-branch or the two-branch architecture. In contrast, the confidence-weighted loss functions based on surface inclusion and normal discrepancy proposed in this paper bring more noticeable improvements, especially in the two-branch architecture. This suggests that the corrected normals may not necessarily represent true normals, and that a reweighting-based strategy may offer a more flexible constraint. In these ablation experiments, we set $\sigma$ to 0.05.
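The $\Delta(\%)$ column of Table IV is not defined explicitly in the text. The reported values are consistent with the relative reduction of the average RMSE with respect to the baseline of the same architecture; the sketch below uses this inferred formula, and the function name `relative_improvement` is hypothetical.

```python
def relative_improvement(baseline_rmse, rmse):
    """Relative reduction of the average RMSE, in percent (inferred definition)."""
    return 100.0 * (baseline_rmse - rmse) / baseline_rmse

delta_single = round(relative_improvement(9.89, 9.83), 2)   # (A) best vs. baseline
delta_two = round(relative_improvement(9.97, 9.72), 2)      # (B) best vs. baseline
delta_qstn = round(relative_improvement(9.97, 10.07), 2)    # (B) baseline + qstn
```

These three calls reproduce the entries 0.61, 2.51, and -1.00 of Table IV; a negative value indicates that the average RMSE is higher than that of the baseline.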

IV-D Application of the Proposed Method

Poisson Reconstruction. To explore the potential of our proposed method for other tasks, we investigated the effectiveness of the estimated normals in surface reconstruction from point clouds. We utilized the classic Poisson reconstruction method [4] for this purpose. As depicted in Fig. 7, we present surfaces reconstructed using normals estimated with various methods. The results indicate that surfaces reconstructed with normals from our method are more accurate and exhibit sharper details and boundaries in local regions.

Figure 8: Qualitative results of point cloud denoising. For each instance: the first row displays the denoised point clouds, the second row exhibits the corresponding reconstructed surfaces, and the third row demonstrates feature detection results of the denoised point clouds.

Point Cloud Denoising. We also apply a normal-based denoising method [68] to validate the accuracy of the normals estimated by our method. The denoised point clouds, the corresponding reconstructed surfaces, and feature detection results are displayed in Fig. 8. The visualization shows that the normals estimated by our method contribute significantly to point cloud denoising. In comparison to other methods, ours yields smoother surfaces in flat regions while retaining sharp features at edges.

V Conclusion

In this study, we introduce a novel framework for normal estimation that integrates a normal estimation architecture with orientation algorithms to achieve higher accuracy and consistency. We propose two confidence-based sample selection training strategies to mitigate the impact of corrupted samples on model training, ensuring robustness. Extensive experiments validate that our method outperforms competitors in both accuracy and robustness for normal estimation. Additionally, we demonstrate its effectiveness in tasks such as normal-based reconstruction and point cloud denoising.

References

  • [1] I. Vizzo, X. Chen, N. Chebrolu, J. Behley, and C. Stachniss, “Poisson surface reconstruction for lidar odometry and mapping,” in 2021 IEEE International Conference on Robotics and Automation (ICRA).   IEEE, 2021, pp. 5624–5630.
  • [2] H. Guo, J. Zhu, and Y. Chen, “E-loam: Lidar odometry and mapping with expanded local structural information,” IEEE Transactions on Intelligent Vehicles, vol. 8, no. 2, pp. 1911–1921, 2022.
  • [3] M. Berger, A. Tagliasacchi, L. M. Seversky, P. Alliez, J. A. Levine, A. Sharf, and C. T. Silva, “State of the art in surface reconstruction from point clouds,” in 35th Annual Conference of the European Association for Computer Graphics, Eurographics 2014-State of the Art Reports, no. CONF.   The Eurographics Association, 2014.
  • [4] M. Kazhdan, M. Bolitho, and H. Hoppe, “Poisson surface reconstruction,” in Proceedings of the fourth Eurographics symposium on Geometry processing, vol. 7, 2006, p. 0.
  • [5] T. Hashimoto and M. Saito, “Normal estimation for accurate 3d mesh reconstruction with point cloud model incorporating spatial structure.” in CVPR workshops, vol. 1, 2019.
  • [6] M. Kazhdan and H. Hoppe, “Screened poisson surface reconstruction,” ACM Transactions on Graphics (ToG), vol. 32, no. 3, pp. 1–13, 2013.
  • [7] J. Huang, Z. Gojcic, M. Atzmon, O. Litany, S. Fidler, and F. Williams, “Neural kernel surface reconstruction,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023, pp. 4369–4379.
  • [8] D. Zhang, X. Lu, H. Qin, and Y. He, “Pointfilter: Point cloud filtering via encoder-decoder modeling,” IEEE Transactions on Visualization and Computer Graphics, vol. 27, no. 3, pp. 2015–2027, 2020.
  • [9] E. Grilli, F. Menna, and F. Remondino, “A review of point clouds segmentation and classification algorithms,” The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, vol. 42, pp. 339–344, 2017.
  • [10] E. Che and M. J. Olsen, “Multi-scan segmentation of terrestrial laser scanning data based on normal variation analysis,” ISPRS journal of photogrammetry and remote sensing, vol. 143, pp. 233–248, 2018.
  • [11] H. Hoppe, T. DeRose, T. Duchamp, J. McDonald, and W. Stuetzle, “Surface reconstruction from unorganized points,” in Proceedings of the 19th annual conference on computer graphics and interactive techniques, 1992, pp. 71–78.
  • [12] F. Cazals and M. Pouget, “Estimating differential quantities using polynomial fitting of osculating jets,” Computer Aided Geometric Design, vol. 22, no. 2, pp. 121–146, 2005.
  • [13] P. Guerrero, Y. Kleiman, M. Ovsjanikov, and N. J. Mitra, “Pcpnet learning local shape properties from raw point clouds,” in Computer graphics forum, vol. 37, no. 2.   Wiley Online Library, 2018, pp. 75–85.
  • [14] T. Hackel, N. Savinov, L. Ladicky, J. D. Wegner, K. Schindler, and M. Pollefeys, “Semantic3d. net: A new large-scale point cloud classification benchmark,” arXiv preprint arXiv:1704.03847, 2017.
  • [15] B.-S. Hua, Q.-H. Pham, D. T. Nguyen, M.-K. Tran, L.-F. Yu, and S.-K. Yeung, “Scenenn: A scene meshes dataset with annotations,” in 2016 fourth international conference on 3D vision (3DV).   Ieee, 2016, pp. 92–101.
  • [16] N. Silberman, D. Hoiem, P. Kohli, and R. Fergus, “Indoor segmentation and support inference from rgbd images,” in Computer Vision–ECCV 2012: 12th European Conference on Computer Vision, Florence, Italy, October 7-13, 2012, Proceedings, Part V 12.   Springer, 2012, pp. 746–760.
  • [17] Q. Li, H. Feng, K. Shi, Y. Gao, Y. Fang, Y.-S. Liu, and Z. Han, “Shs-net: Learning signed hyper surfaces for oriented normal estimation of point clouds,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023, pp. 13 591–13 600.
  • [18] D. Levin, “The approximation power of moving least-squares,” Mathematics of computation, vol. 67, no. 224, pp. 1517–1531, 1998.
  • [19] G. Guennebaud and M. Gross, “Algebraic point set surfaces,” in ACM SIGGRAPH 2007 papers, 2007, pp. 23–es.
  • [20] S. Fleishman, D. Cohen-Or, and C. T. Silva, “Robust moving least-squares fitting with sharp features,” ACM transactions on graphics (TOG), vol. 24, no. 3, pp. 544–552, 2005.
  • [21] B. Li, R. Schnabel, R. Klein, Z. Cheng, G. Dang, and S. Jin, “Robust normal estimation for point clouds with sharp features,” Computers & Graphics, vol. 34, no. 2, pp. 94–106, 2010.
  • [22] M. Yoon, Y. Lee, S. Lee, I. Ivrissimtzis, and H.-P. Seidel, “Surface and normal ensembles for surface reconstruction,” Computer-Aided Design, vol. 39, no. 5, pp. 408–420, 2007.
  • [23] B. Mederos, L. Velho, and L. H. de Figueiredo, “Robust smoothing of noisy point clouds,” in Proc. SIAM Conference on Geometric Design and Computing, vol. 2004, no. 1.   SIAM Philadelphia, PA, USA, 2003, p. 2.
  • [24] Y. Wang, H.-Y. Feng, F.-É. Delorme, and S. Engin, “An adaptive normal estimation method for scanned point clouds with sharp features,” Computer-Aided Design, vol. 45, no. 11, pp. 1333–1348, 2013.
  • [25] J. Wang, K. Xu, L. Liu, J. Cao, S. Liu, Z. Yu, and X. D. Gu, “Consolidation of low-quality point clouds from outdoor scenes,” in Computer graphics forum, vol. 32, no. 5.   Wiley Online Library, 2013, pp. 207–216.
  • [26] P. Alliez, D. Cohen-Steiner, Y. Tong, and M. Desbrun, “Voronoi-based variational reconstruction of unoriented point sets,” in Symposium on Geometry processing, vol. 7, 2007, pp. 39–48.
  • [27] N. Amenta and M. Bern, “Surface reconstruction by voronoi filtering,” in Proceedings of the fourteenth annual symposium on Computational geometry, 1998, pp. 39–48.
  • [28] Q. Mérigot, M. Ovsjanikov, and L. J. Guibas, “Voronoi-based curvature and feature estimation from point clouds,” IEEE Transactions on Visualization and Computer Graphics, vol. 17, no. 6, pp. 743–756, 2010.
  • [29] A. Boulch and R. Marlet, “Fast and robust normal estimation for point clouds with sharp features,” in Computer graphics forum, vol. 31, no. 5.   Wiley Online Library, 2012, pp. 1765–1774.
  • [30] J. Zhang, J. Cao, X. Liu, H. Chen, B. Li, and L. Liu, “Multi-normal estimation via pair consistency voting,” IEEE transactions on visualization and computer graphics, vol. 25, no. 4, pp. 1693–1706, 2018.
  • [31] A. Boulch and R. Marlet, “Deep learning for robust normal estimation in unstructured point clouds,” in Computer Graphics Forum, vol. 35, no. 5.   Wiley Online Library, 2016, pp. 281–290.
  • [32] Y. Ben-Shabat, M. Lindenbaum, and A. Fischer, “Nesti-net: Normal estimation for unstructured 3d point clouds using convolutional neural networks,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2019, pp. 10 112–10 120.
  • [33] J. E. Lenssen, C. Osendorfer, and J. Masci, “Deep iterative surface normal estimation,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020, pp. 11 247–11 256.
  • [34] J. Cao, H. Zhu, Y. Bai, J. Zhou, J. Pan, and Z. Su, “Latent tangent space representation for normal estimation,” IEEE Transactions on Industrial Electronics, vol. 69, no. 1, pp. 921–929, 2021.
  • [35] H. Zhou, H. Chen, Y. Zhang, M. Wei, H. Xie, J. Wang, T. Lu, J. Qin, and X.-P. Zhang, “Refine-net: Normal refinement neural network for noisy point clouds,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 45, no. 1, pp. 946–963, 2022.
  • [36] Y. Ben-Shabat and S. Gould, “Deepfit: 3d surface fitting via neural network weighted least squares,” in Computer Vision–ECCV 2020: 16th European Conference, Glasgow, UK, August 23–28, 2020, Proceedings, Part I 16.   Springer, 2020, pp. 20–34.
  • [37] J. Zhang, J.-J. Cao, H.-R. Zhu, D.-M. Yan, and X.-P. Liu, “Geometry guided deep surface normal estimation,” Computer-Aided Design, vol. 142, p. 103119, 2022.
  • [38] R. Zhu, Y. Liu, Z. Dong, Y. Wang, T. Jiang, W. Wang, and B. Yang, “Adafit: Rethinking learning-based normal estimation on point clouds,” in Proceedings of the IEEE/CVF international conference on computer vision, 2021, pp. 6118–6127.
  • [39] J. Zhou, W. Jin, M. Wang, X. Liu, Z. Li, and Z. Liu, “Improvement of normal estimation for point clouds via simplifying surface fitting,” Computer-Aided Design, vol. 161, p. 103533, 2023.
  • [40] K. Li, M. Zhao, H. Wu, D.-M. Yan, Z. Shen, F.-Y. Wang, and G. Xiong, “Graphfit: Learning multi-scale graph-convolutional representation for point cloud normal estimation,” in European Conference on Computer Vision.   Springer, 2022, pp. 651–667.
  • [41] H. Du, X. Yan, J. Wang, D. Xie, and S. Pu, “Rethinking the approximation error in 3d surface fitting for point cloud normal estimation,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023, pp. 9486–9495.
  • [42] J. Zhou, H. Huang, B. Liu, and X. Liu, “Normal estimation for 3d point clouds via local plane constraint and multi-scale selection,” Computer-Aided Design, vol. 129, p. 102916, 2020.
  • [43] J. Zhou, W. Jin, M. Wang, X. Liu, Z. Li, and Z. Liu, “Fast and accurate normal estimation for point clouds via patch stitching,” Computer-Aided Design, vol. 142, p. 103121, 2022.
  • [44] H. Xiu, X. Liu, W. Wang, K.-S. Kim, and M. Matsuoka, “Msecnet: Accurate and robust normal estimation for 3d point clouds by multi-scale edge conditioning,” in Proceedings of the 31st ACM International Conference on Multimedia, 2023, pp. 2535–2543.
  • [45] Q. Li, Y.-S. Liu, J.-S. Cheng, C. Wang, Y. Fang, and Z. Han, “Hsurf-net: Normal estimation for 3d point clouds by learning hyper surfaces,” Advances in Neural Information Processing Systems, vol. 35, pp. 4218–4230, 2022.
  • [46] S. Li, J. Zhou, B. Ma, Y.-S. Liu, and Z. Han, “Neaf: Learning neural angle fields for point normal estimation,” in Proceedings of the AAAI conference on artificial intelligence, vol. 37, no. 1, 2023, pp. 1396–1404.
  • [47] Y. Wu, M. Zhao, K. Li, W. Quan, T. Yu, J. Yang, X. Jia, and D.-M. Yan, “Cmg-net: Robust normal estimation for point clouds via chamfer normal distance and multi-scale geometry,” arXiv preprint arXiv:2312.09154, 2023.
  • [48] S. König and S. Gumhold, “Consistent propagation of normal orientations in point clouds.” in VMV, 2009, pp. 83–92.
  • [49] N. Schertler, B. Savchynskyy, and S. Gumhold, “Towards globally optimal normal orientations for large point clouds,” in Computer Graphics Forum, vol. 36, no. 1.   Wiley Online Library, 2017, pp. 197–208.
  • [50] L. M. Seversky, M. S. Berger, and L. Yin, “Harmonic point cloud orientation,” Computers & Graphics, vol. 35, no. 3, pp. 492–499, 2011.
  • [51] J. Wang, Z. Yang, and F. Chen, “A variational model for normal computation of point clouds,” The Visual Computer, vol. 28, pp. 163–174, 2012.
  • [52] M. Xu, S. Xin, and C. Tu, “Towards globally optimal normal orientations for thin surfaces,” Computers & Graphics, vol. 75, pp. 36–43, 2018.
  • [53] J. Jakob, C. Buchenau, and M. Guthe, “Parallel globally consistent normal orientation of raw unorganized point clouds,” in Computer Graphics Forum, vol. 38, no. 5.   Wiley Online Library, 2019, pp. 163–173.
  • [54] G. Metzer, R. Hanocka, D. Zorin, R. Giryes, D. Panozzo, and D. Cohen-Or, “Orienting point clouds with dipole propagation,” ACM Transactions on Graphics (TOG), vol. 40, no. 4, pp. 1–14, 2021.
  • [55] V. Mello, L. Velho, and G. Taubin, “Estimating the in/out function of a surface represented by points,” in Proceedings of the eighth ACM symposium on Solid modeling and applications, 2003, pp. 108–114.
  • [56] P. Mullen, F. De Goes, M. Desbrun, D. Cohen-Steiner, and P. Alliez, “Signing the unsigned: Robust surface reconstruction from raw pointsets,” in Computer Graphics Forum, vol. 29, no. 5.   Wiley Online Library, 2010, pp. 1733–1741.
  • [57] C. Walder, O. Chapelle, and B. Schölkopf, “Implicit surface modelling as an eigenvalue problem,” in Proceedings of the 22nd international conference on Machine learning, 2005, pp. 936–939.
  • [58] Z. Huang, N. Carr, and T. Ju, “Variational implicit point set surfaces,” ACM Transactions on Graphics (TOG), vol. 38, no. 4, pp. 1–13, 2019.
  • [59] S. Katz, A. Tal, and R. Basri, “Direct visibility of point sets,” in ACM SIGGRAPH 2007 papers, 2007, pp. 24–es.
  • [60] Y.-L. Chen, B.-Y. Chen, S.-H. Lai, and T. Nishita, “Binary orientation trees for volume and surface reconstruction from unoriented point clouds,” in Computer Graphics Forum, vol. 29, no. 7.   Wiley Online Library, 2010, pp. 2011–2019.
  • [61] H. Xie, K. T. McDonnell, and H. Qin, “Surface reconstruction of noisy and defective data sets,” in IEEE Visualization 2004.   IEEE, 2004, pp. 259–266.
  • [62] D. Xiao, Z. Shi, S. Li, B. Deng, and B. Wang, “Point normal orientation and surface reconstruction by incorporating isovalue constraints to poisson equation,” Computer Aided Geometric Design, vol. 103, p. 102195, 2023.
  • [63] R. Xu, Z. Dou, N. Wang, S. Xin, S. Chen, M. Jiang, X. Guo, W. Wang, and C. Tu, “Globally consistent normal orientation for point clouds by regularizing the winding-number field,” ACM Transactions on Graphics (TOG), vol. 42, no. 4, pp. 1–15, 2023.
  • [64] S. Wang, X. Liu, J. Liu, S. Li, and J. Cao, “Deep patch-based global normal orientation,” Computer-Aided Design, vol. 150, p. 103281, 2022.
  • [65] Q. Li, H. Feng, K. Shi, Y. Fang, Y.-S. Liu, and Z. Han, “Neural gradient learning and optimization for oriented point normal estimation,” in SIGGRAPH Asia 2023 Conference Papers, 2023, pp. 1–9.
  • [66] Q. Li, H. Feng, K. Shi, Y. Gao, Y. Fang, Y.-S. Liu, and Z. Han, “Neuralgf: Unsupervised point normal estimation by learning neural gradient function,” Advances in Neural Information Processing Systems, vol. 36, 2024.
  • [67] P. Erler, P. Guerrero, S. Ohrhallinger, N. J. Mitra, and M. Wimmer, “Points2surf: Learning implicit surfaces from point clouds,” in European Conference on Computer Vision.   Springer, 2020, pp. 108–124.
  • [68] X. Lu, S. Schaefer, J. Luo, L. Ma, and Y. He, “Low rank matrix approximation for 3d geometry filtering,” IEEE transactions on visualization and computer graphics, vol. 28, no. 4, pp. 1835–1847, 2020.

Biography

Jun Zhou received the BSc and PhD degrees in computational mathematics from Dalian University of Technology, China, in 2013 and 2020, respectively. He is a lecturer in the College of Information Science and Technology at Dalian Maritime University, China. His research interests include computer graphics, image processing and machine learning.
Yaoshun Li received the bachelor’s degree in oil and gas storage and transportation engineering from Liaoning Petrochemical University in 2018. He is currently pursuing a master’s degree at Dalian Maritime University. His research interests include computer vision and computer graphics.
Hongchen Tan is a lecturer at the Artificial Intelligence Research Institute, Beijing University of Technology. He received the Ph.D. degree in computational mathematics from Dalian University of Technology in 2021. His research interests include person re-identification, image synthesis, and referring segmentation.
Mingjie Wang is an associate professor in the School of Science at Zhejiang Sci-Tech University. He received the Ph.D. degree from the University of Guelph in 2022. His research interests include image processing, computer vision and deep learning.
Nannan Li is an associate professor in the School of Information Science and Technology at Dalian Maritime University. She received her B.S. and M.D-Ph.D in computational mathematics from Dalian University of Technology. Her research interests include computer graphics, differential geometry analysis and processing, computer aided geometric design and machine learning.
Xiuping Liu received the BSc degree from Jilin University, China, in 1990, and the PhD degree from the Dalian University of Technology, China, in 1999. She is a professor with the Dalian University of Technology. Between 1999 and 2001, she conducted research as a postdoctoral scholar in the School of Mathematics, Sun Yat-sen University, China. Her research interests include shape modeling and analysis.