Meta-PU: An Arbitrary-Scale Upsampling Network for Point Cloud
Shuquan Ye, Dongdong Chen, Songfang Han, Ziyu Wan, Jing Liao
JOURNAL OF LATEX CLASS FILES, VOL. 14, NO. 8, AUGUST 2015
Abstract—Point cloud upsampling is vital for the quality of the mesh in three-dimensional reconstruction. Recent research on point cloud upsampling has achieved great success due to the development of deep learning. However, the existing methods regard point cloud upsampling of different scale factors as independent tasks. Thus, these methods need to train a specific model for each scale factor, which is both inefficient and impractical for storage and computation in real applications. To address this limitation, in this work, we propose a novel method called "Meta-PU" that is the first to support point cloud upsampling of arbitrary scale factors with a single model. In Meta-PU, besides the backbone network consisting of residual graph convolution (RGC) blocks, a meta-subnetwork is learned to adjust the weights of the RGC blocks dynamically, and a farthest sampling block is adopted to sample different numbers of points. Together, these two components enable Meta-PU to continuously upsample a point cloud with arbitrary scale factors using only a single model. In addition, the experiments reveal that training on multiple scales simultaneously is mutually beneficial. Thus, Meta-PU even outperforms the existing methods trained for a specific scale factor only.
Index Terms—Point cloud, upsampling, meta-learning, deep learning.
1 INTRODUCTION

Point clouds are the most fundamental and popular representation for three-dimensional (3D) environment modeling. When reconstructing the 3D model of an object from the real world, a common technique is to obtain the point cloud and then recover the mesh from it. However, a raw point cloud generated from depth cameras or reconstruction algorithms is usually sparse and noisy due to the restrictions of hardware devices or the limitations of algorithms, which leads to a low-quality mesh. To solve this problem, it is common to apply point cloud upsampling prior to meshing, which takes a set of sparse points as input and generates a denser set of points that better reflects the underlying surface.

Conventional point cloud upsampling methods [1], [2], [3] are optimization-based, with various shape priors as constraints, such as the local smoothness of the surface and the normal. These works perform well for simple objects but cannot handle complex and delicate structures. Due to the success of deep learning, some data-driven methods [4], [5], [6], [7] have emerged recently and achieved state-of-the-art performance by employing powerful deep neural networks to learn the upsampling process in an end-to-end way.

However, all existing point cloud upsampling networks only consider certain integer scale factors (e.g., 2x). They regard the upsampling of different scale factors as independent tasks.
Thus, a specific model has to be trained for each scale factor, limiting the use of these methods in real-world scenarios where different scale factors are needed to fit different densities of raw point clouds. Some works [6], [7] suggest achieving larger scales through

• Shuquan Ye, Ziyu Wan and Jing Liao are with the Department of Computer Science, City University of Hong Kong, HK, China. E-mails: [email protected], [email protected], [email protected]
• Dongdong Chen is with Microsoft Research. E-mail: [email protected]
• Songfang Han is with University of California, San Diego, USA. E-mail: [email protected]
• Jing Liao is the corresponding author.
Fig. 1: Arbitrary-scale model Meta-PU vs. the single-scale models over the example scale R = 2.5. The existing single-scale models first need to scale to a larger integer scale (e.g., 4x), then use a downsample algorithm to achieve the noninteger scale of 2.5x.

iterative upsampling (e.g., upsampling 4x by running the 2x model twice). However, this repeated computation is time-consuming, and upsampling of non-integer factors still cannot be achieved (e.g., 2.5x with single-scale models), as depicted in Fig. 1.

In real-world scenarios, it is very common and necessary to upsample raw point clouds into various user-customized densities for mesh reconstruction, point cloud processing, or other needs. Thus, an efficient method for upsampling with arbitrary scale factors is desired to solve the aforementioned drawbacks of the existing methods. However, this is not easy for vanilla neural networks. Their behavior is fixed once trained because of the deterministic learned weights, so it is not straightforward to let the network handle an arbitrary scale factor on the fly.

Motivated by the development of meta-learning [8], [9] and the latest image super-resolution method [10], we propose an efficient and novel network called "Meta-PU" for point upsampling with arbitrary scale factors. By incorporating one extra cheap meta-subnetwork as the controller, Meta-PU can dynamically change its behavior during runtime depending on the desired scale factor. Compared with storing the weights for individual scale factors, storing the meta-subnetwork is more convenient and flexible.

Specifically, the backbone of Meta-PU is based on a graph convolutional network (GCN), consisting of several residual graph convolutional (RGC) blocks to extract the feature representation of each point as well as its relationships to its nearest neighbors. The meta-subnetwork is trained to generate weights for the meta-RGC block given the input of a scale factor. Then, the meta-convolution uses these weights to extract features that are adaptively tailored to the scale factor. Following several RGC blocks, a farthest sampling block is further added to output an arbitrary number of points. In this way, different scale factors can be trained simultaneously with a single model. At the inference stage, when users specify an upsampling scale factor, the meta-subnetwork dynamically changes the behavior of the meta-RGC block by adapting its weights and outputs the corresponding upsampling results.

To demonstrate the effectiveness and flexibility of our method, we compare it with several strong baseline methods. The comparison shows that our method can even achieve state-of-the-art (SOTA) performance for specific single scale factors while supporting arbitrary-scale upsampling for the first time. In other words, our approach is both stronger and more flexible than the SOTA approaches. To better understand the underlying working principle and broader applications, we further provide a comprehensive analysis from different perspectives.

In summary, our contribution is three-fold:
• We propose the first point cloud upsampling network that supports arbitrary scale factors (including noninteger factors), via a meta-learning approach.
• We show that jointly training multiple scale factors with one model improves performance.
Our arbitrary-scale model even achieves better results at each specific scale than the single-scale counterpart.
• We evaluate our method on multiple benchmark datasets and demonstrate that Meta-PU advances state-of-the-art performance.
2 RELATED WORK
Optimization-based upsampling.
Point cloud upsampling is formulated as an optimization problem in early work. A pioneering solution proposed by Alexa et al. [1] constructs a Voronoi diagram on the surface and then inserts points at its vertices. Lipman et al. [11] designed a novel locally optimal projection operator for point resampling and surface reconstruction based on the L1 median, which is robust to noise and outliers. Later, Huang et al. [2] improved the locally optimal projection operator to enable edge-aware point set upsampling. Wu et al. [3] employed a joint optimization method for the inner points and surface points defined in their new point set representation. However, most of these methods rely on strong a priori assumptions (e.g., a reasonable normal estimation or a smooth surface in the local geometry). Thus, they may easily suffer from complex and massive point cloud data.

Deep-learning-based upsampling.
Recently, deep learning has become a powerful tool for extracting features directly from point cloud data in a data-driven way. Qi et al. first proposed PointNet [12] and PointNet++ [13] for extracting multi-level features from point sets. Based on these flexible feature extractors, deep neural networks have been applied to many point cloud tasks, such as those in [14], [15], [16]. As for point cloud upsampling, Yu et al. [4] presented a point cloud upsampling neural network operating on the patch level, making it possible to directly input a high-resolution point cloud. Then, Yu et al. developed EC-Net [17] to improve the quality of the upsampled point cloud using an edge-aware joint learning strategy. Wang et al. proposed a progressive point set upsampling network [5] to further suppress noise and preserve the details of the upsampled point cloud. Moreover, different frameworks, such as the generative adversarial network (GAN) [18] and the graph convolutional network (GCN) [19], have attracted researchers' attention for handling point cloud upsampling. Li et al. proposed PU-GAN [6], formulating a GAN framework to obtain more uniformly distributed point cloud results. Wu et al. proposed AR-GCN [7], making the first attempt to model point cloud upsampling with a GCN. However, these networks are only designed for upsampling with a fixed scale factor. When different upsampling scales are required in practical applications, multiple models have to be retrained. Unlike these methods, Meta-PU supports upsampling point clouds for arbitrary scale factors, by employing meta-learning to predict the weights of the network and dynamically change its behavior for each scale factor.
Meta-learning.
Meta-learning, or learning to learn, refers to learning by observing the performance of different machine learning methods on various learning tasks. It is normally a two-level model: a meta-level model performed across tasks, and a base-level model acting within each task. Early meta-learning approaches were primarily used in few-shot/zero-shot learning and transfer learning [20], [21]. Recent works have also applied meta-learning to various tasks and achieved state-of-the-art results in object detection [22], instance segmentation [23], image super-resolution [10], image smoothing [8], [9], [24], network pruning [25], etc. A more comprehensive survey of meta-learning can be found in [26]. Among these works, Meta-SR [10], which learns the weights of the network for arbitrary-scale image super-resolution, is the most closely related to ours. However, it cannot be applied to our task. The main reason is that the target of Meta-SR is images with a regular grid structure, whereas our target is the much more challenging irregular and orderless point cloud. For regular grid-based images, since the correspondence between each output pixel and the corresponding input pixel is pre-determined, Meta-SR can directly use relative offset information to regress the local upsampling weights. However, no such correspondence exists for point clouds. Therefore, we resort to using the meta-subnetwork together with the farthest sampling block. The meta-subnetwork is responsible for adaptively tailoring the point features to a specific scale factor by dynamically adjusting the weights of the RGC block, while the sampling block is responsible for sampling a particular number of points.
3 METHOD
In this section, we define the task of arbitrary-scale point cloud upsampling. Then we introduce the proposed Meta-PU in detail.
Given a sparse and unordered point set X = {p_i}_{i=1}^n of n points and a scale factor R, the task of arbitrary-scale point cloud upsampling is to generate a dense point set Y = {p_i}_{i=1}^N of N = ⌊R × n⌋ points. It is worth noting that R is not necessarily an integer, and theoretically, N can be any positive integer greater than n. The output Y does not necessarily include points in X. In addition, X may not be uniformly distributed. In order to meet the needs of practical applications, we need the upsampled point cloud to satisfy the following two constraints. Firstly, each point of Y lies on the underlying geometry surface described by X. Secondly, the distribution of output points should be smooth and uniform, for any scale factor R or input point number n.

Fig. 2: Overview of Meta-PU. Given a sparse input point cloud X of n points and a scale factor R, Meta-PU generates a denser point cloud Y' with ⌊R × n⌋ points. A compound loss function is employed to encourage Y' to lie uniformly on the underlying surface of the target Y. The pink box is the core part of Meta-PU. The meta-subnetwork takes the scale factor R as input and outputs the weight tensor for the convolutional kernels in the meta-RGC block, to adapt the feature extraction to different upscales.

Overview.
The backbone upsampling network contains five basic modules, distinguished by different colors in Fig. 2. The input point cloud first goes through a point convolutional neural network (CNN) and several RGC blocks to extract features for each centroid and its neighbors. Among these RGC blocks, the meta-RGC block is special: its weights are dynamically generated by a meta-subnetwork given the input of R. Thus the features extracted by this meta-RGC block are tailored to the given scale factor. After the RGC blocks, an unpooling layer outputs ⌊R_max × n⌋ points, where R_max denotes the maximum scale factor supported by our network, and R_max = 16 by default. Afterward, the farthest sampling block is adopted to sample N points from the ⌊R_max × n⌋ points as the final output, which is constrained by a compound loss function. In the following sections, we elaborate on the detailed structure of each block in Meta-PU, and the training loss.

Point CNN.
The point CNN is a simple structure in the spatial domain to extract features from the input point cloud X. In detail, for each point p ∈ X (a 1 × 3 vector), we first group its k nearest neighbors into a k × 3 tensor, and then feed them into a series of point-wise convolutions followed by a max-pooling layer to obtain a 1 × c feature, where c is the channel number of the point cloud feature. Thus, the output feature F_out is a tensor of shape n × c. Recursively applied convolution reaches a wider receptive field representing more information, whereas the max-pooling layer aggregates information from all points in the previous layer. In our implementation, we set k = 8 and c = 128.

RGC Block.
As shown in Fig. 3a, the RGC block contains several graph convolutional layers and residual skip-connections, inspired by [7]. It takes the feature tensor F_in as input and outputs F_out of the same shape n × c as F_in.

The graph convolution in the RGC block is defined on a graph G = (V, ε), where V denotes the node set and ε denotes the corresponding adjacency matrix. The graph convolution is formulated as follows:

f_out^p = ω_1 ∗ f_in^p + ω_2 ∗ Σ_{q ∈ N(p)} f_in^q, ∀p ∈ V    (1)

where f_in^p denotes the input feature of vertex p, f_out^p represents the output feature of vertex p after graph convolution, ω_1 and ω_2 are the learnable parameters, and ∗ denotes the point-wise convolutional operation.

The core idea of the RGC block is to operate the convolution separately on the central-point feature and the neighbor features, as illustrated in Fig. 3b. The neighbor features are grouped with the k nearest neighbors of the input point cloud and then go through a 1 × 1 graph convolution. The central-point features are convolved separately from those of the neighbors and are then concatenated with the neighbors' features. Moreover, residual skip-connections are introduced to address the vanishing gradient and slow convergence problems. In our implementation, we set k = 8 and c = 128, and a total of 22 RGC blocks are used. Among them, the second one is a special meta-RGC block, which is described in detail next.

Meta-RGC Block and Meta-subnetwork.
To solve the arbitrary-scale point cloud upsampling problem with a single model, we propose a meta-RGC block, which is the core part of Meta-PU. The meta-RGC block is similar to a normal RGC block, but the
Fig. 3: Structure of the RGC block (a), meta-RGC block (b) and unpooling block (c). Both the RGC and meta-RGC blocks convolve centroid features and neighborhood features, respectively. In the meta-RGC block, the weights of its convolutional layers are dynamically predicted based on the scale factor R. The unpooling block follows the last RGC block.

graph convolutional weights are dynamically predicted, depending on the given scale factor R. Instead of feeding R directly into the meta-RGC block, we create a scale vector R̃ = {max(0, R − i), R}_{i=1...⌈R⌉} as the input and fill the rest with {−1, −1} to achieve the size 2 × R_max. The philosophy behind this design is inspired by Meta-SR [10]. More specifically, because each input point is essentially transformed into a group of output points, {max(0, R − i), R} can serve as a location identifier to guide the point processing network to differentiate the i-th new point from the other points generated by the same input seed point.

The meta convolution is formulated as follows:

f_out^p = ϕ(R̃; θ_1) ∗ f_in^p + ϕ(R̃; θ_2) ∗ Σ_{q ∈ N(p)} f_in^q, ∀p ∈ V    (2)

where the convolution weights are predicted by the meta-subnetwork ϕ(·) taking the scale vector R̃ as input. Please note that we have two branches of meta-convolution, as shown in Fig. 3b. One branch is for the feature of the center point p, and the other is for the features of the neighbors defined by the adjacency matrix ε. Since there is no pre-defined adjacency matrix ε for point clouds, we define it as N(p), the k nearest neighbors of p. The convolution weights of these two branches are generated by two meta-subnetworks with parameters θ_1 and θ_2, respectively.

Each meta-subnetwork for meta-convolution comprises five fully-connected (FC) layers [27] and several activation layers, as shown in Fig. 4.
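The scale-vector construction can be sketched as follows. This is a minimal numpy version reflecting our reading of the text: the padding value −1 and the fixed length 2 × R_max are assumptions recovered from the garbled description, and the function name is ours.

```python
import math

import numpy as np

def scale_vector(R, R_max=16):
    """Pairs (max(0, R - i), R) for i = 1..ceil(R), padded with -1
    entries up to a fixed length of 2 * R_max (assumed padding value)."""
    v = []
    for i in range(1, math.ceil(R) + 1):
        v += [max(0.0, R - i), R]
    v += [-1.0] * (2 * R_max - len(v))
    return np.array(v)

v = scale_vector(2.5)
print(v.shape)  # (32,) for R_max = 16
print(v[:6])    # first ceil(2.5) = 3 pairs: 1.5, 2.5, 0.5, 2.5, 0.0, 2.5
```

Note how a non-integer R such as 2.5 yields a distinct vector from R = 2 or R = 3, which is what lets the meta-subnetwork produce scale-specific weights.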
In the forward pass, the first FC layer takes the scale vector created from R as input and produces a vector of c_hidden entries. After the activation function, the second FC layer produces an output of the same size as its input. Following the activation function, the input of the third FC layer is the c_hidden-entry encoding, and its output has length c_in × c_out × l × l. Next, the fourth FC layer outputs a vector w_0 with the same shape as its input. Unlike the previous four concatenated layers, the last FC layer serves as a skip-connection that obtains an output w_skip of shape c_in × c_out × l × l directly from the 2 × R_max-dimensional scale vector. The two outputs w_0 and w_skip are added and then reshaped to (c_in, c_out, l, l) as the weight matrix w for the meta-convolution. We set c_out = c_in = 128 and c_hidden = 128. Here l represents the kernel size of the convolution, which is fixed in our implementation. In the backward pass, instead of directly updating the weight matrix of the convolution, we calculate the gradients with respect to the weights of the FC layers of the meta-subnetwork. The gradient of the meta-subnetwork can be naturally calculated by the chain rule, so it can be trained end-to-end.

The meta-RGC block with dynamic weights predicted by the meta-subnetwork is necessary for the arbitrary-scale upsampling task, because the iR-th to (i+1)R-th upsampled points are generated directly from the features of the i-th input point and its nearest neighbors extracted via the RGC blocks. The point locations in the output have to differ across scale factors, to ensure that the upsampled points uniformly cover the underlying surface. Therefore, the embedding features must be adaptively adjusted according to the scale factor.
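The overall weight-prediction-then-convolution flow of Eq. (2) can be sketched with toy stand-ins. Here each "meta-subnetwork" is collapsed into a single linear map from the scale vector to a (c, c) weight matrix (our simplification of the five-FC-layer design with its skip connection); all names are ours, and the predicted weights are then used exactly like ω_1, ω_2 in the plain graph convolution of Eq. (1).

```python
import numpy as np

def meta_conv(F_in, knn, predict_w1, predict_w2, R_vec):
    """Eq. (2) sketch: two predictors map the scale vector to point-wise
    convolution weights for the center branch and the neighbor branch.

    F_in: (n, c) point features; knn: (n, k) neighbor indices.
    """
    w1 = predict_w1(R_vec)                    # (c, c) center-branch weights
    w2 = predict_w2(R_vec)                    # (c, c) neighbor-branch weights
    neighbor_sum = F_in[knn].sum(axis=1)      # aggregate features over N(p)
    return F_in @ w1 + neighbor_sum @ w2

# Toy single-linear-layer "meta-subnetworks" (NOT the paper's architecture).
rng = np.random.default_rng(0)
n, k, c, d = 16, 8, 32, 32                    # d: scale-vector length (2 * R_max)
A1, A2 = rng.standard_normal((2, d, c * c)) * 0.01
predict_w1 = lambda v: (v @ A1).reshape(c, c)
predict_w2 = lambda v: (v @ A2).reshape(c, c)

F_in = rng.standard_normal((n, c))
knn = rng.integers(0, n, size=(n, k))
R_vec = rng.standard_normal(d)                # in Meta-PU this is the scale vector
F_out = meta_conv(F_in, knn, predict_w1, predict_w2, R_vec)
print(F_out.shape)  # (16, 32)
```

Because the weights are a function of R_vec rather than fixed parameters, changing the scale factor changes the extracted features without retraining, which is the core mechanism of the meta-RGC block.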
This adjustment is much better than merely upsampling to R_max times and then downsampling; the experiments in Section 4.5 are designed to prove this.

The unpooling block takes the point cloud X and the corresponding features F_in as input. It is an RGC-based structure, while the output channels of the convolutional layers are set to R_max × 3. Specifically, the feature F_in of shape n × c is transformed to a tensor of size n × (R_max × 3), subsequently reshaped to n × R_max × 3, denoted as T_out. As a residual block, similar to the residual connection between the input and output features in the RGC block, we introduce a skip connection between points. Thus, the tensor T_out is point-wisely added to X to produce the output Y'_max of shape n × R_max × 3. Note that the "add" operation naturally expands X to R_max copies in a broadcast manner.

The farthest sampling block performs a farthest sampling strategy to retain Y' with ⌊R × n⌋ points from Y'_max with ⌊R_max × n⌋ points. The advantages are two-fold. First, farthest sampling can sample an arbitrary number of points from the input point set, which helps obtain the required number of output points. Second, since farthest sampling iteratively constructs a point set with the farthest point-wise Euclidean distance from a global perspective, this step further enhances the uniformity of the point set distribution.

For end-to-end training of Meta-PU, we adopt a compound loss with both a reconstruction term L_rec and uniformity terms L_uni, L_rep:

L = λ_rec L_rec + λ_uni L_uni + λ_rep L_rep    (3)

The latter two terms aim at encouraging the uniformity of the generated point cloud and improving the visual quality.
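The farthest sampling strategy used by the sampling block can be sketched as follows; this is a toy numpy version (function name is ours, and a production implementation would run batched on GPU).

```python
import numpy as np

def farthest_point_sampling(P, m):
    """Iteratively pick m points from P (shape (n, 3)), each time taking
    the point with the largest Euclidean distance to the set chosen so far."""
    chosen = [0]                                   # start from an arbitrary point
    d = ((P - P[0]) ** 2).sum(-1)                  # squared distance to chosen set
    for _ in range(m - 1):
        nxt = int(np.argmax(d))                    # farthest remaining point
        chosen.append(nxt)
        d = np.minimum(d, ((P - P[nxt]) ** 2).sum(-1))
    return P[chosen]

rng = np.random.default_rng(0)
P = rng.standard_normal((100, 3))                  # stands in for Y'_max
Q = farthest_point_sampling(P, 25)                 # keep floor(R * n) points
print(Q.shape)  # (25, 3)
```

The greedy max-min rule is what gives the sampled subset its spatial uniformity, matching the second advantage described above.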
Fig. 4: Structure of the meta-subnetwork. The meta-subnetwork inside the pink box predicts weights for the convolutional layers in the meta-RGC block.
The repulsion loss [4] L_rep is represented as follows:

L_rep = Σ_{i=0}^{N} Σ_{i' ∈ K(i)} η(‖p_i' − p_i‖) w(‖p_i' − p_i‖)    (4)

where N is the number of output points, K(i) is the index set of the k nearest neighbors of point p_i in the output point cloud Y', η(r) = −r is the repulsion term, and w(r) = e^(−r²/h²) is a fast-decaying weight.

Uniform loss.
The term [6] L_uni comprises two parts: U_imbalance, accounting for global uniformity, and U_clutter, accounting for local uniformity:

L_uni = Σ_{j=1}^{M} U_imbalance(S_j) · U_clutter(S_j)    (5)

where S_j, j = 1..M, refers to the ball-queried point subsets with radius r_d, centered at M seed points farthest-sampled from Y'. The imbalance term is

U_imbalance(S_j) = (|S_j| − n̂)² / n̂

where n̂ is the expected number of points in S_j. Note that the imbalance term is not differentiable; it acts as a weight for the following clutter term:

U_clutter(S_j) = Σ_{k=1}^{|S_j|} (d_{j,k} − d̂)² / d̂

where d_{j,k} is the point-to-neighbor distance of the k-th point in S_j, while d̂ = sqrt(2π r_d² / (√3 |S_j|)) denotes the expected distance.

Unbiased Sinkhorn divergences [28] are adopted as the reconstruction loss, to encourage the distribution of generated points to lie on the underlying mesh surface. The Sinkhorn divergence interpolates between the Wasserstein distance and the kernel distance. The Sinkhorn divergence between the output Y' and the ground truth Y can be formulated as follows:

L_rec = S_ε(Y, Y') = OT_ε(Y, Y') − (1/2) OT_ε(Y, Y) − (1/2) OT_ε(Y', Y')    (6)

where ε is the regularization parameter and

OT_ε(Y, Y') := min_{π_1 = Y, π_2 = Y'} ∫ C dπ + ε KL(π | Y ⊗ Y')

with the cost function on the feature space X ⊂ R^D of dimension D:

C(x, y) = (1/2) ‖x − y‖²    (7)

where the optimization is performed over coupling measures π ∈ M_1^+(X × X), and (π_1, π_2) denote the two marginals of π.

In the training process of most existing single-scale point cloud upsampling methods, each model is trained with one scale factor. However, because the scale factor varies in our arbitrary-scale upsampling task, we design a variable-scale training scheme to train all factors jointly. We first sample scale factors densely from the range (1, R_max] with a fixed stride and put them in a set S_R. For each epoch, a scale factor R is randomly sampled from S_R, and this factor is shared within a batch. To avoid overfitting, we also perform a series of data augmentations: rotation, random scaling, shifting, jittering, and perturbation with low probability.

4 EXPERIMENTS
Dataset.
For training and testing, we utilize the same dataset adopted by PU-Net [4] and AR-GCN [7]. This dataset contains models from the Visionair repository; following the protocol in the above two works, one subset of the models is used for training and the rest are used for testing.

For training, 100 patches are extracted from each model. We uniformly sample N points using Poisson disk sampling from each patch as the ground truth, and non-uniformly sample n points from the ground truth as the input, where n = ⌊n_max / R⌋ and N = R × ⌊n_max / R⌋, corresponding to the scale factor R. Moreover, we set n_max = 4096 as the maximum number of points in our training. For testing, we use the whole model instead of patches. The sampling process for the ground truth and input is similar to that in training, but, constrained by the GPU memory limit, we set different numbers of input points for different scale factors (progressively fewer points for R ≤ 4, 4 < R ≤ 6, 6 < R ≤ 12, and R > 12).

Metrics.
For a fair comparison, we employ several popular metrics. Chamfer Distance (CD) and Earth Mover Distance (EMD), defined on the Euclidean distance, measure the difference between the predicted points Y' and the ground-truth point cloud Y. The CD sums the square of the distance between each
TABLE 1: Quantitative comparisons. Single-scale models (including AR-GCN and PU-GAN) trained with each specific scale factor (top two rows) vs. the naive approach to arbitrary-scale upsampling (rows 3 to 5) vs. our full model (last row). The NUC scores are tested with a fixed p.

2x scale:
Method                          CD      EMD     F-score   NUC     mean    std
AR-GCN                          -       -       -         -       -       -
PU-GAN                          0.016   0.0090  32.17%    0.249   0.012   0.015
AR-GCN(x16)+random-sampling     0.015   0.023   30.14%    0.307   0.0089  0.014
AR-GCN(x16)+farthest-sampling   0.014   0.012   33.52%    0.227   0.0088  0.011
AR-GCN(x16)+disk-sampling       0.015   0.013   36.98%    0.273   0.0067  0.0082
ours

4x scale:
Method                          CD      EMD     F-score   NUC     mean    std
AR-GCN                          0.0086  0.018   70.09%    0.339   0.0029  0.0033
PU-GAN                          0.0097  0.016   69.75%    0.202   0.0030  0.0031
AR-GCN(x16)+random-sampling     0.012   0.041   45.34%    0.256   0.0081  0.0096
AR-GCN(x16)+farthest-sampling   0.011   0.018   52.67%    0.318   0.0072  0.0092
AR-GCN(x16)+disk-sampling       0.013   0.013   54.05%    0.288   0.0066  0.0080
ours

6x / 9x / 16x scales (same columns):
AR-GCN                          -       -       -         -       -       -
TABLE 2: Quantitative comparisons with EAR. The NUC scores are tested with a fixed p.

Method     CD↓     EMD↓    F-score↑  NUC↓   mean↓   std↓
EAR (2x)   0.0113  0.0214  48.07%    0.747  0.0048  0.0113
Ours (2x)
EAR (4x)   0.0112  0.0176  51.26%    0.478  0.0074  0.0137
Ours (4x)
EAR (6x)   0.0120  0.0184  52.26%    0.421  0.0085  0.0145
Ours (6x)
EAR (9x)   0.0119  0.0174  52.93%    0.442  0.0089  0.0140
Ours (9x)

point and the nearest point in the other point set, then calculates the average for each point set. The EMD measures the minimum cost of turning one of the point sets into the other. For these two metrics, the lower, the better. We also report the
F-score between Y' and Y, which treats point cloud super-resolution as a classification problem, as in [7]. For this metric, larger is better. We employ the normalized uniformity coefficient (NUC) [4] to evaluate the uniformity of Y' by directly comparing the output point cloud Y' with the corresponding ground-truth meshes, and the deviation mean and std to measure the difference between the output point cloud and the ground-truth mesh. For these two metrics, smaller is better.

We train the network for 60 epochs with a batch size of 18. Adam is adopted as the optimizer. The learning rate is initially set to 0.001 for FC layers and 0.0001 for convolutions and other parameters, and is decayed with a cosine annealing scheduler. The weights λ_rec, λ_uni and λ_rep of the joint loss function are set to fixed values. Generally, training takes less than seven hours on two Titan-XP GPUs. Theoretically, Meta-PU supports any large scale, but we set the maximum scale to 16 due to the limitations of computing resources and practical needs.

Fig. 5: Ablation on the meta-RGC block. Meta-PU is applied to upsample the point clouds to 2x, but the weights of its meta-RGC block are generated with a different input scale factor R. R = 2 achieves the best performance on both F-score and CD, which indicates that our meta-RGC block adapts the convolutional weights appropriately to different scale factors.

In this experiment, we compare Meta-PU with state-of-the-art single-scale upsampling methods, including PU-GAN [6] and AR-GCN [7], upsampling the sparse point cloud with several scale factors R. Their models are trained with the author-released code, and all settings are the same as stated in their papers. Since they are single-scale upsampling methods, an individual model is trained for each scale factor.
Due to the limitations of its two-stage upsampling strategy, AR-GCN can only be trained with a subset of the factors, whereas PU-GAN can be trained with all of them. Their performance is reported in the first two rows of Table 1. We surprisingly observe that our arbitrary-scale model even outperforms their single-scale models at most scale factors. In particular, our model performs significantly better on the F-score,
TABLE 3: Quantitative comparisons with MPU. Our method obtains superior results under most metrics.

Method    CD      EMD    F-score  NUC at p = 0.2% / 0.4% / 0.6% / 0.8% / 1.0%  Deviation(1e-2) mean / std  Time
MPU(2x)
MPU(4x)   0.0086  0.012  73.16%   0.321 / 0.282 / 0.265 / 0.256 / 0.249
MPU(16x)
TABLE 4: Comparison of the inference time.

Method   AR-GCN+Disk-sampling  PU-GAN+Disk-sampling  EAR     ours
Time(s)  10.28                 10.06                 351.10
NUC, mean, and std metrics than the other models, and is more stable across all scales. This may be because jointly training tasks of multiple scales can benefit each other, thus improving performance. In addition, Meta-PU needs to be trained only once for all tests, while the others need to train multiple models, which is very inefficient.
A naive approach to arbitrary-scale upsampling is to first use a state-of-the-art single-scale model to upsample the point cloud to a large scale, and then downsample it to the specific smaller scale. We compare our method with this naive approach. Specifically, we choose AR-GCN [7] to upsample point clouds to 16x and then downsample them to 2x, 4x, 6x and 9x with the random sampling, disk sampling, and farthest sampling algorithms. The results are reported in rows 3 to 5 of Table 1. We can see that random sampling obtains the worst scores because it downsamples the points non-uniformly. In comparison, the more advanced sampling algorithms, including disk sampling and farthest sampling, perform better by considering uniformity. Our method is still superior to all of them, because the result of a smaller scale factor in our method is not simply a subset of the large-factor one. In fact, Meta-PU can adaptively adjust the locations of the output points to better fit the underlying surface and maintain uniformity according to different scale factors. This is analyzed in the ablation study of the meta-RGC block in the next subsection. Moreover, compared to the strongest baseline (AR-GCN+disk-sampling), ours is 120 times faster (Table 4), because this advanced downsampling algorithm, which requires mesh reconstruction, is slow.

We also compare our method with the state-of-the-art optimization-based method EAR [29], which is also applicable to variable scales. The results are provided in Table 2. Our method yields superior results under all metrics.

Further, we compare Meta-PU with the state-of-the-art multi-step upsampling method MPU [5], which recursively upsamples a point set and is applicable to scales that are powers of 2 (e.g., 2, 4, 16). The results for scales 2, 4 and 16 are provided in Table 3. Our method obtains superior results under most metrics. In addition, we provide a comparison of inference times.
The running time of our method is much lower at all scales, demonstrating that our method is more efficient than the recursive approach.
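The naive baseline discussed in this section can be sketched as a pipeline: always run a fixed 16x model, then throw points away to hit the requested count. Everything below is a toy stand-in (point duplication for the "network", random choice for the "downsampler"), not the real AR-GCN or disk-sampling implementations; it only illustrates the control flow and the output-size arithmetic.

```python
import math

import numpy as np

def naive_arbitrary_upsample(X, R, model_16x, downsample):
    """Naive arbitrary scale: fixed 16x single-scale model + downsampling
    to floor(R * n). Meta-PU instead adapts its features to R directly."""
    Y16 = model_16x(X)                               # always 16 * n points
    return downsample(Y16, math.floor(R * X.shape[0]))

# Toy stand-ins (NOT the real networks or samplers).
model_16x = lambda X: np.repeat(X, 16, axis=0)
downsample = lambda Y, m: Y[np.random.default_rng(0).choice(len(Y), m, replace=False)]

X = np.random.default_rng(1).random((100, 3))
Y = naive_arbitrary_upsample(X, 2.5, model_16x, downsample)
print(Y.shape)  # (250, 3): floor(2.5 * 100) points
```

Note that the baseline's output is always a subset of the fixed 16x result, which is exactly the limitation the meta-RGC block removes.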