An octree cells occupancy geometric dimensionality descriptor for massive on-server point cloud visualisation and classification
Rémi Cura, Julien Perret, Nicolas Paparoditis
Université Paris-Est, IGN, SRIG, COGIT & MATIS, 73 avenue de Paris, 94160 Saint Mandé, France
[email protected]
ABSTRACT:
Lidar datasets are becoming more and more common. They are appreciated for their precise 3D nature, and have a wide range of applications, such as surface reconstruction, object detection, visualisation, etc. For all these applications, having additional semantic information per point has the potential of increasing the quality and the efficiency of the application. In the last decade the use of Machine Learning, and more specifically of classification methods, has proved successful at creating this semantic information. In this paradigm, the goal is to classify points into a set of given classes (for instance tree, building, ground, other). Some of these methods use descriptors (also called features) of a point to learn and predict its class. Designing the descriptors is then the heart of these methods. Descriptors can be based on point geometry and attributes, use contextual information, etc. Furthermore, descriptors can be used by humans for easier visual understanding and sometimes filtering. In this work we propose a new simple geometric descriptor that gives information about the implicit local dimensionality of the point cloud at various scales. For instance a tree seen from afar is more volumetric in nature (3D), yet locally each leaf is rather planar (2D). To do so we build an octree centred on the point to consider, and compare the variation of the occupancy of the cells across the levels of the octree. We compare this descriptor with the state of the art dimensionality descriptor and show its interest. We further test the descriptor for classification within the Point Cloud Server (Cura, 2016), and demonstrate efficiency and correctness results.
1. INTRODUCTION

1.1 Problem
The democratisation of sensing devices has resulted in an expansion of acquired point clouds. At the same time, the acquisition frequency and precision of Lidar devices are also increasing, resulting in an explosion of the number of points. Lidar datasets are becoming more and more common. They are appreciated for their precise 3D nature, and have a wide range of applications, such as surface reconstruction, object detection, visualisation, etc.

Semantic information in addition to the raw point data can be very useful for these applications. It allows increasing quality and speeding up computation. For instance a method that reconstructs façades can safely skip the points pertaining to the ground. Similarly, a user visually exploring a dataset may find it very useful to isolate points pertaining to trees, for instance. Logically, adding semantic information to point clouds has been researched for a long time by many researchers.

In the last decade the use of Machine Learning, and more specifically of classification methods, has proved popular. In this paradigm, the goal is to classify points into a set of given classes (for instance tree, building, ground, other). Some of these methods use descriptors (also called features) for each point that are leveraged on a training set to learn to associate descriptors with semantic information. Once the association is learned, it can be used to extrapolate semantic classes on similar point clouds. The heart of such approaches is then to design appropriate descriptors that enable accurate and efficient disambiguation between classes. Many different descriptors have been used, based on geometry or other attributes of points, using or not the context of the point, etc. We refer to the very recent thesis of Weinmann (2016).
Recently, deep learning methods have made it possible to learn features on the fly (see Huang and You (2016)), bypassing the need to craft descriptors, but also introducing new trade-offs in the process (necessary training set and prior knowledge of similar point clouds). Of course machine learning approaches require substantial training sets, and may fail if the processed point cloud is too different from the learned point clouds. Furthermore, machine learning methods are sophisticated and require significant computing time. Deep learning methods have also been applied to information extracted from point clouds, such as 2D images (Boulch et al. (2017)), voxels (Tchapmi et al. (2017)), and graphs of relations between superpoints (Landrieu and Simonovsky (2017)).

In this work we focus on a new simple geometric descriptor that gives information about the implicit local dimensionality of the point cloud, regardless of prior knowledge, method of acquisition and scale. This descriptor is based on the octree cells occupancy of a point cloud. It is designed with massive point clouds and scaling in mind, which means that it is included in a global Level Of Detail approach, and is computed on groups of points rather than on individual points. Our aim is to provide a basic descriptor that has a direct geometrical interpretation, is extremely fast to compute and scales well due to its hierarchical nature. This descriptor can then be used for visualisation or as a preprocessing step for classification and reconstruction. We compare this descriptor with the state of the art. We demonstrate the potential of the descriptor to perform efficient patch classification within the Point Cloud Server (Cura, 2016).
All the methods are tested on billion-scale point clouds, and are open source for the sake of reproducibility, tests and improvements.
Figure 1: Graphical Abstract: a Lidar point cloud (1) is split into patches (2) and stored in a Point Cloud Server; patches are re-ordered to obtain free LOD (3) (a gradient of LOD here); lastly the ordering by-product is a multiscale dimensionality descriptor used as a feature for learning and efficient filtering (4).

Our main contribution is an efficient and robust geometric dimensionality descriptor with a scale parameter that is local rather than global. Our second contribution is to explore the interest of classifying groups of points (patches) rather than points for massive point clouds, and to estimate the trade-offs associated with this approach.
This work follows the classical plan Introduction, Method, Results, Discussion, Conclusion (IMRaD). Section 2. presents the geometric dimensionality descriptor, and how it can be leveraged for classification. Section 3. reports on the experiments validating the descriptor's interest and how it compares to the state of the art geometric dimension descriptors. Finally, details, limitations, and potential applications are discussed in Section 4.
2. METHOD
In this section, we first present the Point Cloud Server (Section 2.1) (PCS, Cura et al. (2016)) that this article extends. Then we introduce a dimensionality descriptor that is based on octree cells occupancy (2.2). This descriptor can be used in the PCS to perform density correction and classification at the patch level (2.3). This classification can be directly transferred to points, or indirectly exploited in a pre-filtering step.
Our method strongly depends on the Point Cloud Server described in Cura et al. (2016); therefore we introduce its principles and key relevant features (see Figure 2). The PCS is a complete and efficient point cloud management system based on a database server that works on groups of points rather than individual points. This system is specifically designed to solve all the basic needs of point cloud users: fast loading, compressed storage, powerful filtering, easy data access and exporting, and integrated processing. The core of the PCS is to store groups of points (called patches) that are multi-indexed (spatially, on attributes, etc.), and represented with different generalisations depending on the applications.
[Figure 2 diagram: LOAD; STORE (groups of points, compressed); METADATA (secure and relational, extended: trajectory, sources; generalisation/visualisation); FILTER (indexes, can use other GIS data); PROCESSING (in-base easy prototyping, classic out-of-base workflow); EXPORT]
Figure 2: Overall and storage organisation of the Point Cloud Server.

Points can be grouped with any rules. In this work, the points are regrouped spatially, by cubes (Paris) or squares (Vosges). All the methods described in this work are applied on patches.

The goal is to estimate the geometric dimensionality of a group of points (is the group of points more 1D, 2D or 3D in nature?). For instance a door would be mostly 2D, and a telephone line mostly 1D. Our approach is multi-Level Of Detail (LOD), because we consider that the geometric dimensionality of a group of points depends on the scale / the level of detail.

Let us consider a group of points; we build an octree where the cell of the first level is a cube containing all the points. In the Point Cloud Server, the cube size is chosen for storage efficiency (between 50 m and 1 m of side length) and aligned on a 3D grid. For a given level L_i of the octree, we note O_i the number of cells that are occupied by one or more points. Doing so for n levels creates a feature vector ppl = (O_1, O_2, .., O_n) where O_i is the number of cells of level i of the octree that contain 1 or more points.

For instance, given a level L_i, a centred line would occupy 2^L_i cells, a centred plane 4^L_i cells, and a volume all of the cells (8^L_i cells). Thus simply by counting the number of occupied cells we get an idea of the geometric dimensionality of the group of points for a given level (See Figure 3).

Figure 3: Octree cells occupancy is a basic dimensionality descriptor: a 3D line (2^L occupancy), a 2D surface (4^L occupancy) or a volume (8^L occupancy) occupy a different amount of cells.

This occupancy is only correctly estimated when the patch is fully filled and dimensionally homogeneous. However, we can also characterize the dimensionality (Dim_LODDiff) by the way the occupancy evolves (difference mode). Indeed, a line occupying k cells of an octree at level L_i will occupy 2*k cells at level L_(i+1), if there are enough points (see Fig. 4).
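As an illustration, the ppl occupancy vector can be computed with a simple voxel-grid count per level. The sketch below in NumPy is ours, not the PCS implementation: the function name and the normalisation of the points into a unit root cell are assumptions for the example.

```python
import numpy as np

def occupancy_per_level(points, n_levels=4):
    """Count occupied octree cells per level for points in a cubic root cell.

    Returns ppl = (O_1, .., O_n): O_i is the number of cells of the
    2^i x 2^i x 2^i grid of level i containing at least one point.
    """
    pts = np.asarray(points, dtype=float)
    # normalise into the root cell [0, 1)^3 (the PCS instead uses fixed grid-aligned cubes)
    mins, maxs = pts.min(axis=0), pts.max(axis=0)
    span = max((maxs - mins).max(), 1e-12)
    pts = (pts - mins) / span
    ppl = []
    for level in range(1, n_levels + 1):
        k = 2 ** level                         # cells per axis at this level
        idx = np.clip((pts * k).astype(int), 0, k - 1)
        ppl.append(len({tuple(c) for c in idx}))
    return ppl

# a straight 3D line occupies ~2^L cells per level
t = np.linspace(0, 1, 10000)
line = np.stack([t, t, t], axis=1)
print(occupancy_per_level(line))  # → [2, 4, 8, 16]
```

A dense planar patch would instead yield counts close to (4, 16, 64, 256), i.e. 4^L, matching Figure 3.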
Figure 4: Illustration of the octree cell occupancy evolution for a point cloud acquisition of a real-life tree (the dimension is embedded in the power of 2). Depending on the level (thus the scale), the geometric dimensionality of the object goes from 3D to 1D.

Figure 5 illustrates this. Typical parts of a street in the Paris dataset were segmented: a car, a wall with window, a 2-wheeler, a public light, a tree, a person, poles and pieces of ground including curbs. Due to the geometry of acquisition and sampling, the public light is almost a 3D line, resulting in the occupation of very few octree cells. A typical number of octree cells per level for a public-light patch would then be (1, 2, 4, 8, ...), which looks like a (2^1)^L function. A piece of ground is often extremely flat and very similar to a planar surface, which means that the occupied octree cells could be (1, 4, 16, ...), a (2^2)^L function. Lastly a piece of tree foliage is going to be very volumetric in nature, due to the fact that a leaf is about the same size as the point spacing and is partly transparent to the laser (leading to several echoes). A foliage patch would then typically be (1, 8, 64, ...) (if enough points), so a (2^3)^L function. (Tree patches are in fact a special case, see 3.3.1.)

In Cura et al. (2016) we introduced a new method for Level Of Detail of massive point clouds. The main idea was to store the LOD implicitly within the order of the points. Thus, the method relies on ordering the points with the most important (geometrically speaking) first. This ordering method, called MidOc, is also built around an octree. When using the Implicit LOD MidOc building process, the number of chosen points per level can be stored. Each ordered patch is then associated with a vector of number of points per level ppl = (N_L0, .., N_Lmax).

Figure 5: All successive levels of LOD from Cura et al. (2016) for common objects (car, window, 2-wheeler, light, tree, people, pole, ground); colour is intensity for other points.
The number of picked points for L_i is almost the voxel occupancy for level i of the corresponding octree. Almost, because in MidOc points picked at a level do not count for the next levels. Occupancy over a voxel grid has already been used as a descriptor (See Bustos et al. (2005)). However we can go a step further. For the following we consider that patches contain enough points and that levels are low enough so that the removal of picked points has no influence. Thus, by comparing ppl[L_i] to the theoretical 2^L_i, 4^L_i, 8^L_i we retrieve a dimensionality feature Dim_LOD[i] about the dimensionality of the patch at level L_i (See Figure 3).

A sophisticated per-point dimensionality descriptor is introduced in Demantké (2014); Weinmann et al. (2015), where it is then used to find the optimal neighbourhood size. This approach is very different from ours. The first difference is that this feature is computed for each point (and is thus extremely costly to compute). On the opposite, in the PCS we compute the feature at the patch level; we do not need to find the scale at which to compute dimensionality, as the descriptor is computed on the whole patch.

This dimensionality descriptor (Dim_cov) relies on computing the covariance of the points centred on the barycentre (3D structure tensor), then a normalisation of the covariance eigenvalues. As such, the method is similar to, and has the same advantages and limitations as, Principal Component Analysis (See Shlens (2014) for a reader-friendly introduction). It can be seen as fitting a 3D ellipsoid to the points.

First, this method is sensitive to density variations because all the points are considered for the fitting. Contrary to our hypothesis, this method implicitly considers that density holds information about the nature of the sensed objects. Second, this method only fits one ellipsoid, which is insufficient to capture complex geometric forms. Last, this method is very local and does not allow exploring different scales for a point cloud as a whole; indeed this method is classically used on points within a growing sphere to extend the scale.

The second difference is more methodological: typically the scale of the feature is the radius within which points are considered to compute the feature. Thus it is not possible to analyse a large group of points at a geometrically fine scale. On the opposite, in the case of the octree cells occupancy, the scale is directly related to the geometric size of the octree cells.

We compute both dimensionality descriptors and then compare them on the Paris dataset.
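A covariance-based estimate of this kind can be sketched as follows. This is an illustrative version only: we use a Demantké-style linearity/planarity/scatter normalisation of the eigenvalue square roots, which may differ in detail from the exact formulation of Weinmann et al. (2015); the function name is ours.

```python
import numpy as np

def dim_cov(points):
    """Covariance-based dimensionality in [1, 3] (structure tensor).

    p approximates [p_1D, p_2D, p_3D] from the square roots of the
    covariance eigenvalues, then Dim_cov = sum(i * p_dim[i]).
    """
    pts = np.asarray(points, dtype=float)
    cov = np.cov(pts - pts.mean(axis=0), rowvar=False)
    ev = np.sort(np.linalg.eigvalsh(cov))[::-1]        # l1 >= l2 >= l3 >= 0
    s = np.sqrt(np.clip(ev, 0.0, None))
    p = np.array([s[0] - s[1], s[1] - s[2], s[2]]) / max(s[0], 1e-12)
    return float(np.dot([1, 2, 3], p))                 # Dim_cov = Σ i·p_dim[i]

rng = np.random.default_rng(0)
print(dim_cov(np.stack([np.linspace(0, 1, 1000)] * 3, axis=1)))  # line  → ~1
print(dim_cov(rng.random((1000, 3))))                            # cube  → ~3
```

This makes the limitations discussed above easy to reproduce: stretching one axis of the cube, or removing half of its points, shifts the estimate even though the underlying geometry is unchanged.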
Because x → (2^1)^x, x → (2^2)^x and x → (2^3)^x diverge very fast, we only need to use a few levels to have a quite differentiating descriptor. For instance, using L = 2, we have x_i = [4, 16, 64], which are very distinguishable values, and do not require a high total density per patch. As long as the patch contains a minimal number of points, the descriptor is density and scale invariant. Lastly a mixed result (following none of the x → (2^i)^x functions) can be used as an indicator that the patch contains mixed geometry, either due to the nature of the objects in the patch, or due to the way the patch is defined (sampling).

Figure 6: Covariance-based geometric dimensionality is estimated similarly to a PCA; its scale is the size of the neighbourhood radius. On the opposite, the octree cells occupancy geometric dimensionality descriptor's scale is in fact the scale at which the points are analysed.

Although it might be possible to go a step further and decompose a patch's ppl vector on the base of x → (2^i)^x, i ∈ [1..3], a direct and exact decomposition cannot be used because the decomposition might depend on L_i. For instance a plane with a small line could appear as a plane for the first levels, and start to appear differently at higher levels. In this case, an Expectation-Maximization scheme might be able to decompose robustly, separating the points into more dimensionally coherent groups.

We propose to perform patch classification using the Point Cloud Server and the previously introduced octree cells occupancy, along with other basic descriptors, using a Random Forest classifier. Following the position of the PCS towards abstraction, the classification is performed at the patch level and not at the point level.
This induces a massive scaling and speeding effect, at the cost of introducing quantization error. Indeed, compared to a point classification, a patch may contain points belonging to several classes (due to generalisation), yet it will only be classified in one class, hence the "quantization" error. Because patch classification is so fast and scales so well, the end goal can however be slightly different than for usual point classification.

Patch classification can be used as a fast pre-process to another, slower point classification, be it to speed it up (an artificial recall increase for patch classification may be needed, see Figure 15), or to improve a point classification. The patch classification can provide a rough classification, based on which the most adapted point classifier is chosen (similarly to cascaded classifiers), thus improving the result of the final point classification. For instance a patch classified as urban object would lead to choosing a classifier specialized in urban objects, and not the general classifier. This is especially precious for classes that are statistically under-represented (i.e. pedestrian, urban furniture).

Patch classification may also be used as a filtering pre-process for applications that only require one class. Many applications only need one class, and do not require all the points in it, but only a subset with good confidence. For this it is possible to artificially boost the precision (by accepting only high-confidence predictions). For instance computing a Digital Terrain Model (DTM) only requires ground points. Moreover, the ground will have many parts missing due to objects, so using only a part of all the points will suffice anyway. The patch classifier allows finding most of the ground patches extremely fast. Another example is registration. A registration process typically requires reliable points to perform mapping and registration.
In this case there is no need to use all points, and the patch classification can provide patches from ground and façade with high accuracy (for point cloud to point cloud or point cloud to 3D model registration), or patches of objects and trees (for point cloud to landmark registration). In other applications, finding only a part of the points may be sufficient, for instance when computing a building map from façade patches.

The Random Forest method started with Amit and Geman (1997), was theorized by Breiman (2001) and has been very popular since then. Random Forests are for instance used by Golovinskiy et al. (2009), who perform object detection, segmentation and classification. They analyse each task separately on an urban data set, thus providing valuable comparison. Their method is uniquely dedicated to this task, like that of Serna and Marcotegui (2014), who provide another method and a state of the art of the segmentation/classification subject. Both of these methods are in fact 2D methods, working on an elevation image obtained by projecting the point cloud. However we observe that street point clouds are dominated by vertical surfaces, like buildings (about 70% in the Paris data set). Our method is fully 3D and can thus easily be used to detect vertical object details, like windows or doors on buildings.
The first descriptor is ppl, the octree cells occupancy dimensionality descriptor, usually produced by the MidOc ordering (see Section 2.2), or computed independently using fast octree building by point ordering. We use the number of points for the first levels. For each level L, the number of points is normalized by the maximum number of points possible (8^L), so that every feature is in [0, 1].

We also use other simple features that require very limited computing (avoiding complex features like contextual features). Due to the PCS patch compression mechanism, the min, max, and average of any attribute of the points are directly available. Using the LOD allows quickly computing other simple features, like the 2D area of the points of a patch (points being considered with a given diameter).

Dealing with data set particularities
The Paris data set classes are organized in a hierarchy (100 classes in theory, 22 populated). The rough patch classifier is not designed to deal with so many classes, so a way to determine which level of the hierarchy to use is needed. We propose to perform this choice with the help of a graph of similarity between classes (See Fig. 10 and 11). We first determine how similar the classes are for the simple dimensionality descriptors, classifying with all the classes and computing a class-to-class confusion matrix. This confusion matrix can be interpreted as a class affinity matrix, and thus as a graph. We draw the graph using a spectral layout (team Networkx (2014)), which amounts to drawing the graph following the first two eigenvectors of the matrix (similar to Principal Component Analysis). Those two vectors maximize the variance of the data (while being orthogonal), and thus best explain the data. This graph visually helps to choose the appropriate number of classes to use. A fully automatic method may be used via an unsupervised clustering approach on the matrix (like the Affinity Propagation of Frey and Dueck (2007)).

Even when reducing the number of classes, the Paris dataset is unbalanced (some classes have far fewer observations than others). We tried two classical strategies to balance the data set regarding the number of observations per class. The first is under-sampling big classes: we randomly under-sample the observations to get roughly the same number of observations in every class. The second strategy is to compute a statistical weight for every observation based on the class prevalence. This weight is then used in the learning process when building the Random Forest.
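The confusion-matrix-to-graph step can be sketched with NetworkX. The symmetrisation below is one possible choice and the function name is ours; note that `spectral_layout` positions nodes along eigenvectors of the graph Laplacian, which is one concrete way to obtain the eigenvector-based placement described above.

```python
import numpy as np
import networkx as nx

def class_affinity_layout(confusion, labels):
    """Build an affinity graph from a confusion matrix and lay it out spectrally."""
    m = np.asarray(confusion, dtype=float)
    m = (m + m.T) / 2.0                    # symmetrise: confusion -> affinity
    np.fill_diagonal(m, 0.0)               # self-affinity is not informative here
    g = nx.Graph()
    for i, a in enumerate(labels):
        g.add_node(a)
        for j in range(i + 1, len(labels)):
            if m[i, j] > 0:
                g.add_edge(a, labels[j], weight=m[i, j])
    # 2D positions driven by the weighted graph Laplacian eigenvectors
    return g, nx.spectral_layout(g, weight="weight")

# toy 3-class confusion matrix: rows = truth, columns = prediction
conf = [[50, 5, 1], [4, 60, 2], [0, 3, 40]]
g, pos = class_affinity_layout(conf, ["road", "sidewalk", "tree"])
print({k: v.round(2) for k, v in pos.items()})
```

Classes that are often confused get heavy edges and are pulled close together, which is exactly the visual cue used to merge hierarchy levels.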
Contrary to classical classification methods, we are not exclusively interested in precision and recall per class, but also in the evolution of precision when prediction confidence varies. In fact, for a filtering application, we can leverage the confidence information provided by the Random Forest method to artificially boost precision (at the cost of a recall diminution). We can do this by limiting the minimal confidence allowed for every prediction. Orthogonally, it is possible for some classes to increase recall at the cost of precision by using the result of a first patch classification and then incorporating the other neighbouring patches into the result.

We stress that if the goal is to detect objects (and not classify each point), this strategy can be extremely efficient, for instance if we are looking for objects that are big enough to span several patches (e.g. a car). In this case we can perform the classification (which is very fast and efficient), then keep only highly confident predictions, and then use the position of the predictions to perform a local search for the car limits. The classical alternative solution would be to perform a per-point classification on each point, which would be extremely slow.
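The confidence-threshold trade-off can be sketched with scikit-learn: keep only predictions whose class probability exceeds a minimum, trading recall for precision. The dataset, class weights and thresholds below are illustrative, not the paper's experimental setup.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# imbalanced 3-class toy problem standing in for the patch features
X, y = make_classification(n_samples=2000, n_classes=3, n_informative=6,
                           weights=[0.6, 0.3, 0.1], random_state=0)
Xtr, Xte, ytr, yte = train_test_split(X, y, random_state=0)

# class_weight="balanced" is the per-observation weighting strategy of the text
rf = RandomForestClassifier(n_estimators=100, class_weight="balanced",
                            random_state=0).fit(Xtr, ytr)

proba = rf.predict_proba(Xte)
conf = proba.max(axis=1)                   # confidence of the winning class
pred = rf.classes_[proba.argmax(axis=1)]

for t in (0.0, 0.5, 0.8):                  # raising t boosts precision, lowers recall
    keep = conf >= t
    acc = (pred[keep] == yte[keep]).mean() if keep.any() else float("nan")
    print(f"threshold {t:.1f}: kept {keep.mean():.0%} of patches, accuracy {acc:.2f}")
```

The fraction of kept patches plays the role of recall here: rejecting low-confidence predictions discards patches but makes the surviving labels more trustworthy.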
3. RESULT

3.1 Introduction to results
We design and execute several experiments in order to test the descriptors for a random forest classifier on two large data sets, proving their usefulness. We analyse the potential of these descriptors, and what they bring when used in conjunction with other simple descriptors.

The base DBMS is team PostgreSQL (2014-). The spatial layer team PostGIS (2014-) is added to benefit from generic geometric types and multidimensional indexes. The specific point cloud storage and functions come from pgPointCloud (2014-). MidOc is either plpgsql or made in Python with team SciPy (2014-). The classification is done with team Scikit (2014-), and the network clustering with team Networkx (2014). Timings are only orders of magnitude due to the influence of database caching.

Figure 7: Histogram of number of points per patch, with a logarithmic scale for the X and Y axes.

We use two data sets. They were chosen as different as possible to further evaluate how the proposed methods can generalise on different data (see Figure 7 for the histogram of patch density). The first data set is IQmulus (2014) (Paris data set), an open source urban data set with varying density, singularities, and very challenging point cloud geometry. Every point is labelled with a hierarchy of 100 classes. The training set is only 12 million points. Only 22 classes are represented. We group points in cubes. The histogram of density seems to follow an exponential law (See Figure 7), the effect being that many patches with few points exist.

We also use the Vosges data set, which is a very widespread, aerial Lidar, 5.5 billion point cloud. Density is much more constant, at 10k pts/patch. A vector ground truth about surface occupation nature (type of forest) is produced by the French Forest Agency. Again the classes are hierarchical, with 28 classes. We group points in 50 × 50 m squares.

All the experiments are performed using a Point Cloud Server (cf. Cura (2014)).
The key ideas are that point clouds are stored inside a DBMS (PostgreSQL), as patches. Patches are compressed groups of points along with some basic statistics about the points in the group. We hypothesize that in a typical point cloud processing workflow, a point is never needed alone, but almost always with its surrounding points.

Each patch of points is then indexed in an R-tree for the most interesting attributes (obviously X, Y, Z, but also time of acquisition, metadata, number of points, distance to source, etc.). Having such a meta-type with powerful indexes allows us to find points based on various criteria extremely fast (order of magnitude: ms). As an example, for a 2 billion point dataset, we can find in a few milliseconds all patches:
- between -1 and 3 meters high in reference to the vehicle wheels
- in a given 2D area defined by any polygon
- acquired between 8h and 8h10
- etc.

The PCS offers an easy means to perform data-partition based parallelism. We extensively use it in our experiments.
We test the dimensionality descriptor (ppl) in two ways. First we compare the extracted Dim_LOD to the classical structure tensor based descriptor Dim_cov. Second we assess how useful it is for classification, by analysing how well it separates classes, and how much it is used when several other features are available.
We compute Dim_cov following the indications of Weinmann et al. (2015) to get p_dim ∈ [0..1]^3, i.e. the probability to belong to [1D, 2D, 3D]. We convert this to Dim_cov with Dim_cov = Σ_{i=1..3} i * p_dim[i]. Optionally, we test a filtering option so that the maximum distance in the two biggest dimensions is more equivalent. However this approach fails to significantly improve results.

We test several methods to extract Dim_LOD from ppl. The first method is to compute Dim_LODs[i] = log2(ppl[i]) / i, which gives the simple geometric dimension for each level. The second method is the same but works on occupancy evolution, with Dim_LODd[i] = log2(ppl[i] / ppl[i-1]) (discarding L_0). In both cases the result is a geometric dimension between 0 and 3 for each level. We use both indices to fuse the dimensionality across levels (working on Dim_LODA = Dim_LODs ∪ Dim_LODd). The first method uses a RANSAC (team SciPy (2014-) implementation of Choi et al. (2009)) to find the best linear regression. The slope gives an idea of confidence (ideally, it should be 0), and the value of the line at the middle of the abscissa is an estimate of Dim_LOD. The second method robustly filters Dim_LODA based on the median distance to the median value and averages the inliers to estimate Dim_LOD.

Dim_cov and Dim_LOD are computed with in-base and out-of-base processing, the latter being executed in parallel (8 workers). For 10k patches (12 M points), retrieving data and writing results accounts for …, computing Dim_LOD for …, and Dim_cov for …. Computing ppl (which is multi-scale) using a linear octree takes between … and ….

Comparing Dim_cov and Dim_LOD is not straightforward because the implicit definition of dimension is very different in the two methods. We analyse the patches where |Dim_LOD − Dim_cov| <= 0.5; 0.5 is an arbitrary threshold, but we feel that it represents the point above which the descriptors predict unreconcilable dimensions. Those patches represent 93% of the data set (96% of points), with a correlation of 0.80. Overall the proposed dimensions are similar for the majority of patches, especially for well filled 1D and 2D patches (See Fig. 8). We analyse the 684 remaining patches to look for possible explanations of the difference in dimension (See Fig. 9). We consider the following four main sources of limitations of Dim_cov.

• Elongated patch. Dim_cov = 1.42, Dim_LOD = 1.92. If the patch is not roughly a square, Dim_cov gives a bad estimation as it is biased by the asymmetry of the point distribution.

• Ellipsoid too simple. Dim_cov = 1.68, Dim_LOD = 2.24. Dim_cov fits an ellipsoid, which cannot cope with complex objects, especially when the barycentre does not happen to be at a favourable place.

• Coping with heterogeneous sampling. p_dim = [0.56, 0.32, 0.12], Dim_cov = 1.57, Dim_LOD = 2.16. Dim_cov is sensitive to differences in point density. The points on the bottom plane are about 3 times less dense than in the vertical plane, leading to a wrong estimate.

• Definition of dimension different. General: Dim_cov ∈ [1…, …], Dim_LOD ∈ [1…, …]. This patch: p_dim = [0.11, 0.23, 0.66], ppl = [1, 8, 36, 74, ..], Dim_LODd = [3.0, 2.17, 1.04]. Trees are a good example of how the two descriptors rely on a different dimension definition. For Dim_cov points may be well spread out, so usually p_3D is high. Yet tree patches are also subject to density variation, and may also be elongated, which renders Dim_cov very variable. On the opposite, Dim_LOD considers the dimensionality at different scales (See Fig. 4). From afar a tree patch is volumetric; at a lower scale, it seems planar (leaves and small sticks form rough planes together); lastly, at small scale, the tree looks linear (sticks).
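For reference, the two per-level extraction formulas (Dim_LODs and Dim_LODd) can be sketched directly from a ppl vector. This is illustrative code with our own function names; the RANSAC and median-filter fusion steps described above are omitted.

```python
import math

def dim_lod_s(ppl):
    """Per-level dimension from absolute occupancy: log2(ppl[i]) / i.

    ppl[0] is the occupancy O_1 of level 1, ppl[1] is O_2, etc.
    """
    return [math.log2(ppl[i]) / (i + 1) for i in range(len(ppl))]

def dim_lod_d(ppl):
    """Per-level dimension from occupancy evolution: log2(ppl[i] / ppl[i-1])."""
    return [math.log2(ppl[i] / ppl[i - 1]) for i in range(1, len(ppl))]

# a perfect plane occupies 4^L cells per level: ppl = (4, 16, 64, 256)
plane = [4, 16, 64, 256]
print(dim_lod_s(plane))  # → [2.0, 2.0, 2.0, 2.0]
print(dim_lod_d(plane))  # → [2.0, 2.0, 2.0]
```

A tree-like patch such as ppl = [8, 36, 74, ...] instead yields a decreasing Dim_LODd, reproducing the 3D-to-1D behaviour of Figure 4.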
Using only the ppl descriptor, a classification is performed on the Paris data set, then a confusion matrix is computed. We use the spectral layout (see Section 2.3.2) to automatically produce the graph in Figure 10. We manually add 1D, 2D and 3D arrows. On this graph, classes that are most similar according to the classification with ppl are close. The graph clearly presents an organisation following 3 axes, and those axes coincide with the dimensionality of the classes. For instance the "tree" class has a strong 3D dimensionality. The "Punctual object" class, defined by "objects whose representation on a map should be a point", is strongly 1D (lines), with objects like bollards and public lights. The "Road" class is strongly 2D, the road being locally roughly a plane. The centre of the graph is occupied by classes that may have mixed dimensionality; for instance "4+ wheeler" (i.e. car) may be planar or more volumetric, because of the sampling. "Building" and "sidewalk" are not as clearly 2D as the "road" class. Indeed, the patches of the "sidewalk" class are strongly mixed (containing 22% of non-sidewalk points, see Figure 13). The building class is also not purely 2D because building façades in Paris contain balconies, building decoration and floors, which introduce lots of object-like patches; this explains why building is closer to the object cluster (See Figure 5 for instance). The dimensionality descriptor clearly separates classes following their dimensionality, but cannot separate classes with mixed dimensionality.

Figure 8: Dim_LOD and Dim_cov are mostly comparable, except for a few patches (5%, coloured).

Figure 9: Representative patches for |Dim_LOD − Dim_cov| > . (elongated patch: ellipsoid too simple; not the same definition of dimension; inhomogeneous sampling; e.g. p_dim = [0.11, 0.23, 0.66], Dim_LOD = [3.0, 2.17, 1.04]). Most differences are explained by Dim_cov limitations (See 3.3.1).

Figure 10: Spectral clustering of the confusion matrix of the Paris data set classification using only the ppl descriptor. Edge width and colour are proportional to affinity. Node position is determined fully automatically. Red-ish arrows are manually placed as visual clues.

To further evaluate the dimensionality descriptor, we introduce other classification features (see 3.4), perform classification and compute each feature importance. The overall precision and recall results of these classifications are correct, and the ppl descriptor is of significant use (See Figures 13 and 12), especially in the Vosges data set. The ppl descriptor is less used in the Paris data set, maybe because lots of classes cannot really be defined geometrically, but rather by context. The dimensionality descriptor alone cannot be used to perform sophisticated classification, because many semantically different objects have similar dimension (for instance, a piece of wall and a piece of ground are dimensionally very similar, yet semantically very different). We introduce additional simple features for classification (See Section 2.3.2).
All use already stored patch statistics, and thus are extremely fast to compute (P: for Paris, V: for Vosges):
- average altitude with respect to the sensing device origin (P)
- area of the patch bounding box (P)
- patch height (P)
- points per level (ppl), levels 1 to 4 (P+V)
- average intensity (P+V)
- average number of echoes (P+V)
- average Z (V)
For the Vosges data set, we reach a speed of 1 M points/s/worker to extract those features.
Undersampling and weighting are used on the Paris data set: first undersampling reduces the over-dominant building class to at most 100 times the support of the smallest class, then weighting compensates for the remaining differences in support. For the Vosges data set only the weighting strategy is used. The weighting approach is favoured over undersampling because it lessens the variability of results when classes are very heterogeneous. To ensure significant results we follow a K-fold cross-validation method. We again compute a confusion matrix (i.e. affinity between classes) on the Paris data set to choose which level of the class hierarchy should be used.
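The undersample-then-weight balancing described above can be sketched as follows. This is a stdlib-only illustration of the logic; the function name and the normalisation of the weights are our choices, not the paper's.

```python
import random
from collections import Counter

def balance(labels, max_ratio=100, seed=0):
    # Undersample so that no class keeps more than max_ratio times the
    # support of the smallest class, then return per-class weights that
    # compensate for the remaining support differences.
    rng = random.Random(seed)
    counts = Counter(labels)
    cap = max_ratio * min(counts.values())
    idx = list(range(len(labels)))
    rng.shuffle(idx)
    kept, seen = [], Counter()
    for i in idx:
        if seen[labels[i]] < cap:
            seen[labels[i]] += 1
            kept.append(i)
    # weight proportional to 1 / support, largest kept class gets weight 1
    max_support = max(seen.values())
    weights = {c: max_support / n for c, n in seen.items()}
    return sorted(kept), weights
```

For example, with 1000 "building" patches, 5 "tree" patches and max_ratio = 100, the building class is capped at 500 samples and then receives weight 1.0 against 100.0 for trees.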
Figure 11: Result of automatic spectral clustering over the confusion matrix for patch classification of the Paris data set with all simple features (clusters: vegetation, objects, buildings, poles, vehicle, ground). Edge width and colour are proportional to confusion. Clusters are manually drawn for easier understanding.
Choosing which level of the class hierarchy to use depends on the data set and the application. In a canonical classification perspective, we have to strongly reduce the number of classes if we want significant results. However, reducing the number of classes (i.e. using a higher level in the class hierarchy) also means that classes are more heterogeneous. Both data sets are extremely unbalanced (factor 100 or more). Thus our simple and direct Random Forest approach is ill suited for dealing with extremely small classes (a cascading or one-versus-all framework would be needed). For the Vosges data set a short analysis convinced us to use 3 classes: Forest, Land, and Other; among these three classes, the Land class is statistically overweighted by the two others. For the Paris data set, we again use a spectral layout to represent the affinity graph (See Figure 11). Introducing other features clearly helps to separate more classes. The graph also shows the limit of the classification, because some classes cannot be properly defined without context (e.g. the sidewalk, which is by definition the space between building and road, hence is defined contextually).
Figure 12: Vosges data set. (Table 2) Precision (prec.), recall (rec.), support (supp.), and average percent of points of the class in the patches (mix.), for comparison with a point based method, for the classes Closed forest, Moor and Not forest. (Table 1) Feature usage (ppl_1 to ppl_4, number of echoes, mean Z, patch height, intensity).
Precision_point = Precision_patch × Mix (and the same for recall). Further filtering on confidence cannot increase precision, but will reduce the variability of the found building patches. This result (red) would provide a much better base for building reconstruction, for instance.

The patch classifier can also be used as a filtering preprocess. In this case, the goal is not to reach a great precision, but to be fast and to have a good recall. Such recall may be increased artificially for classes of objects bigger than the sampling size ( for Paris). We take the example of ground classification (See Figure 15). The goal is to find all ground patches very fast. We focus on a small area for illustration purposes. This area contains patches, including ground patches. Patch classification finds ground patches, with a recall of . %. Using the found patches, all the surrounding patches (X,Y: , Z: . ) are added to the result (a few seconds). There are now patches in the result, and the recall is %. This means that, from a filtering point of view, a complex classifier that would try to find ground points can be used on / of the data set, at the price of a few seconds of computing, without any loss of information.

Figure 13: Results for the Paris data set, at various levels of the class hierarchy. Precision (prec.), recall (rec.), support (supp.) and average percent of points of the class in the patches of the class, for comparison with a point based method (mix.). Classes of the same type are in the same continuous tone. Feature usage is evaluated for each level in the class hierarchy.

Figure 14: Plotting of patches classified as building, using confidence to increase precision. Ground truth from IGN and Open Data Paris.
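The patch-to-point conversion Precision_point = Precision_patch × Mix is a plain multiplication; a short numerical illustration (both input values here are hypothetical, not results from the paper):

```python
def point_metric(patch_metric, mix):
    # Precision_point = Precision_patch * Mix (identically for recall):
    # a correctly classified patch only contributes the fraction `mix`
    # of its points that actually belong to the class.
    return patch_metric * mix

# hypothetical values: patch precision 0.95, class purity (mix) 0.883
print(round(point_metric(0.95, 0.883), 3))  # 0.839
```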
4. DISCUSSION

4.1 Point cloud server
We refer the reader to Cura et al. (2015) for an exhaustive analysis of the Point Cloud Server. Briefly, the PCS has demonstrated all the required capacities to manage point clouds and scales well. It is, to the best of our knowledge, the fastest and easiest way to filter very big point clouds using complex spatial and temporal criteria, as well as to natively integrate point clouds with other GIS data (raster, vector). The main limitation lies in the hypothesis that points can be regrouped into patches that are meaningful regarding further point utilisation. If this hypothesis is false, the PCS loses most of its interest.
Tree patches are challenging for both dimensionality descriptors. Their possible dimension varies a lot (See Fig. 16), although Dim_LOD is more concentrated. Yet, ppl is extremely useful to classify trees. Indeed, ppl contains the dimensionality at various scales, and potentially its variation, which is quite specific for trees (See Fig. 4). We stress that a true octree cell occupancy (i.e. without picking points as in the ppl) can be obtained without computing the octree, simply by using the binary coordinates. We implement it in Python as a proof of concept. Computing it is about as fast as computing Dim_cov.

Figure 15: Map of patch clustering result for ground. The classical result finds a few extra patches that are not ground (blue), and misses some ground patches (red). Recall is increased by adding to the ground class all patches that are less than 2 meters in X,Y and 0.5 meter in Z around the found patches. Extra patches are much more numerous, but all the ground patches are found.
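The binary-coordinate occupancy idea mentioned above can be sketched as follows (our own minimal illustration, not the paper's proof of concept): the level-L octree cell of a point is just the first L bits of each normalised coordinate, so per-level occupancy needs no explicit octree.

```python
def cells_per_level(points, bbox_min, size, max_level=4):
    # Number of occupied octree cells per level, computed directly from
    # binary coordinates: at level L the cell index of a point along each
    # axis is floor(normalised_coord * 2**L), i.e. its first L bits.
    out = []
    for level in range(max_level + 1):
        n = 1 << level
        cells = {tuple(min(int((c - m) / size * n), n - 1)
                       for c, m in zip(p, bbox_min))
                 for p in points}
        out.append(len(cells))
    return out

# four points spread in a unit cube
pts = [(0.1, 0.1, 0.1), (0.9, 0.1, 0.1), (0.1, 0.9, 0.1), (0.9, 0.9, 0.9)]
print(cells_per_level(pts, (0, 0, 0), 1.0, max_level=2))  # [1, 4, 4]
```

The `min(..., n - 1)` guard only handles points lying exactly on the upper face of the bounding cube.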
Figure 16: Histogram of Dim_LOD and Dim_cov for patches in trees (500 k pts). Tree dimension ranges from 1.2 to 2.6, yet Dim_LOD is less ambiguous than Dim_cov.
Overall, ppl offers a good alternative to the classical dimensionality descriptor (Dim_cov), being more robust and multi-scale. However the ppl also has limitations. First, the quality of the dimensionality description may be affected by a low number of points in the patch. Second, in some cases it is hard to reduce it to a meaningful Dim_LOD. Last, because of assumptions on density, it is sensitive to geometric noise.

The ppl descriptor contains lots of information about the patch. This information is leveraged by the Random Forest method and permits a rough classification based on geometric differences. As expected, ppl descriptors are not sufficient to correctly separate complex objects, which is the main limitation for a classification application. The additional features are extremely simple, and far from the ones used in the state of the art. Notably, we do not use any contextual feature. We choose to classify directly in N classes, whereas due to the large unbalance in the data sets, cascade or one-versus-all approaches would be more adapted.
The figure 11 shows the limit of a classification without contextual information. For instance the classes grid and building are similar because in Paris building balconies are typically made of grids. To better identify confusion between classes, we use a spectral layout on the affinity matrix. Graphing this matrix in 2D amounts to a problem of dimensionality reduction. It could use more advanced methods than simply using the first two eigenvectors; in particular, the two vectors would not need to be orthogonal (for instance, as in Hyvärinen and Oja (2000)).
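A minimal sketch of such a spectral layout, under the assumption of a symmetrised affinity matrix and a normalised graph Laplacian (the paper does not specify its exact construction):

```python
import numpy as np

def spectral_layout(affinity):
    # Symmetrise the confusion/affinity matrix, build the normalised graph
    # Laplacian, and use the eigenvectors of the two smallest non-trivial
    # eigenvalues as 2D node coordinates.
    A = (affinity + affinity.T) / 2.0
    np.fill_diagonal(A, 0.0)
    d = A.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(np.maximum(d, 1e-12)))
    L = np.eye(len(A)) - D_inv_sqrt @ A @ D_inv_sqrt
    vals, vecs = np.linalg.eigh(L)  # eigenvalues in ascending order
    return vecs[:, 1:3]             # skip the trivial constant eigenvector

# toy affinity: classes (0,1) and (2,3) are strongly confused pairs
A = np.array([[0, 5, 1, 1],
              [5, 0, 1, 1],
              [1, 1, 0, 5],
              [1, 1, 5, 0]], dtype=float)
xy = spectral_layout(A)
# the confused pair (0,1) lands on the same side of the first spectral axis
print(xy[:, 0])
```

Strongly confused classes receive similar coordinates, which is what makes the clusters visible in Figures 10 and 11.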
First, the feature usage for the Vosges data set clearly shows that amongst all the simple descriptors, the ppl descriptor is largely favoured. This may be explained by the fact that forest and bare land have very different dimensionality, which is conveyed by the ppl descriptor. Second, the patch classifier appears to have very good results to predict whether a patch is forest or not. The precision is almost perfect for forest; we reach the limit of precision of the ground truth. Because most of the errors are on border areas, the recall for forest can also be easily artificially increased. The percent of points in a patch that belong to the patch class allows comparing results with a point based method. For instance the average precision per point for closed forest would be . × 0.883 = 0. . We stress that these are averaged results, and better precision per point could be achieved because we may use random forest confidence to guess border areas (with a separate learning for instance). For comparison with point methods, the patch classifier predicts with very good precision and support over 6 billion points in a few seconds (a few minutes for training). We do not know another method with similar results that is as fast and natively 3D. The Moor class cannot be separated without more specialised descriptors, because the Moor and Not forest classes are geometrically very similar. The principal limitation is that for this kind of aerial Lidar data set the 2.5D approximation may be sufficient, which enables many raster based methods that may perform better or faster.

The figure 13 gives full results for the Paris data set, at various class hierarchy levels. Because the goal is filtering and not pure classification, we only comment on the 7 classes result. The proposed method appears very effective to find buildings, ground and trees. Even taking into consideration the fact that patches may contain mixed classes (column mix.), the results are in the range of state of the art point classifiers, while being extremely fast. These results are sufficient to increase recall or precision to 1 if necessary. We stress that even results appearing less good (4+wheelers, 0.69 precision, 0.45 recall) are in fact sufficient to increase recall to 1 (by spatial dilatation of the result), which then enables more subtle methods on the filtered patches. The ppl descriptor is less used than for the Vosges data set, but is still useful, particularly when there are few classes. It is interesting to note that the mean intensity descriptor seems to be used to distinguish between objects, which makes it less useful in the 7 classes case. The patch classifier for the Paris data set is clearly limited to separating simple classes. In particular, the performances for objects are clearly lower than the state of the art.
A dedicated approach should be used (cascaded or one-versus-all classifier).

4.3.3 Estimating the speed and performance of patch based classification compared to point based classification

The Point Cloud Server is designed to work on patches, which in turn enables massive scaling. Timing a server is difficult because of the different layers of caches and the background workers. Therefore, timings should be considered as orders of magnitude. For the Paris data set, extracting the extra classification features requires ∼ n workers (1 to 8 workers), learning ∼ , and classification ∼ a few s. We refer to Weinmann et al. (2015) (Table 5) for point classification timings on the same data set (4.28 h, 2 s, 90 s) (please note that their training set is much reduced by data balancing). As expected the speed gain is high for complex feature computing (not required) and point classification (done on patches and not points in our case). For the Vosges data set, features are extracted at pts/s/worker, learning takes ∼ a few min, classification ∼ . The Vosges data set has not been used in other articles, therefore we propose to compare the timings to Shapovalov et al. (2010) (Table 3). Keeping only feature computation and random forest training (again on a reduced data set), they process 2 M points in 2 min, whereas our method processes the equivalent of 5.5 B points in a few minutes. Learning and classification are mono-threaded (3-fold validation), although the latter is easy to parallelise. Overall, the proposed method is one to three orders of magnitude faster. For the Paris data set (Fig. 13), we compare to Weinmann et al. (2015) (Table 5). As expected their results are better, particularly in terms of precision (except for the vegetation class). This trend is aggravated by taking into account the "mix." factor. Indeed we perform patch classification, and a patch may not pertain to only one class, which is measured by the mix factor (number of points in the main class divided by the total number of points).
However, including the mix factor, the results are still within the 85 to 90% precision range for the main classes (ground, building, natural). For the Vosges data set (Fig. 12), we refer to Shapovalov et al. (2010) (Table 2). Their random forest classifier gets trees with 93% precision and 89% recall. Including the mix factor, we get trees with 87% precision and 80% recall. As a comparison to image based classification, an informal experiment of classification with satellite images reaches between 85% and 93% precision for the forest class, depending on the pixel size (between 5 and 0.5 m). Overall, the proposed method gets reasonably good results compared to more sophisticated methods, while being much faster. It thus makes a good candidate as a preprocessing filtering step. Because the proposed methods are a pre-process or filtering step, it can be advantageous to increase precision or recall.
Figure 17: Precision of the 4+wheelers class is roughly a rising function of random forest confidence scores.

Figure 14 gives a visual example where increasing precision and reducing class heterogeneity is advantageous. This illustrates that having a given precision or recall is not necessarily the ultimate goal. In this case it would be much easier to perform line detection starting from the red patches rather than the blue patches. The limitation of artificial precision increase is that it is only possible when precision is roughly a rising function of random forest confidence, as seen in Figure 17. For this class, by accepting only predictions of the random forest that have a confidence over , the precision goes from . to . , at the cost of ignoring half the predictions for this class. This strategy is not possible for an incoherent class, like the unclassified class. The method we present for artificial recall increase is only possible if at least one patch of each object is retrieved, and objects are spatially dense, because a spatial dilatation operation is used. This is the case for "4+wheelers" objects in the Paris data set for instance. The whole method is possible because spatial dilatation is very fast in the Point Cloud Server (thanks to the index). Moreover, because the global goal is to find all patches of a class while leaving out some patches, it would be senseless to dilate with a very big distance. In such a case recall would be 1, but all patches would be in the result, thus there would be no filtering, and no speed-up. The limitation is that this recall increase method is more a deformation of already correct results than a magical method that will work with all classes.
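The recall-increase step can be sketched as a brute-force version of the dilatation query (the Point Cloud Server uses its spatial index instead; the 2 m / 0.5 m thresholds are those of Figure 15):

```python
def dilate_selection(patches, found, dxy=2.0, dz=0.5):
    # Artificial recall increase: add every patch whose centre lies within
    # dxy metres in X,Y and dz metres in Z of an already-found patch.
    # Brute force O(n * |found|); the PCS does this with an indexed query.
    grown = set(found)
    for i, (x, y, z) in enumerate(patches):
        for j in found:
            fx, fy, fz = patches[j]
            if abs(x - fx) <= dxy and abs(y - fy) <= dxy and abs(z - fz) <= dz:
                grown.add(i)
                break
    return grown

# patch centres; patch 0 is the only one initially classified as ground
patches = [(0, 0, 0), (1, 0, 0), (3, 0, 0), (10, 0, 0), (1, 0, 5)]
print(sorted(dilate_selection(patches, {0})))  # [0, 1]
```

Patch 2 is 3 m away in X and patch 4 is 5 m away in Z, so neither is added; only the immediate neighbour joins the result.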
5. CONCLUSION
We propose a dimensionality descriptor based on octree cells occupancy at different scales. It allows describing the local geometrical dimensionality of a point cloud in a meaningful way, and concentrates information, which can be leveraged for helping visualisation, or by a rough classifier. This descriptor is less affected by sampling than the alternative structure tensor geometric descriptor, is fast to compute (a simple ordering of the points), and is intrinsically multi-scale. We show the interest of this descriptor, both by comparison to the state of the art dimensionality descriptor, and by proof of its usefulness in real Lidar data set classification. Classification is extremely fast, sometimes at the price of performance (precision/recall). However, we prove that those results can be used as a pre-processing step for more complex methods, using if necessary precision-increase or recall-increase strategies.
6. BIBLIOGRAPHY