An octree cells occupancy geometric dimensionality descriptor for massive on-server point cloud visualisation and classification
Rémi Cura, Julien Perret, Nicolas Paparoditis
Université Paris-Est, IGN, SRIG, COGIT & MATIS, 73 avenue de Paris, 94160 Saint Mandé, France
[email protected]
ABSTRACT:
Lidar datasets are becoming more and more common. They are appreciated for their precise 3D nature, and have a wide range of applications, such as surface reconstruction, object detection, visualisation, etc. For all these applications, having additional semantic information per point has the potential of increasing the quality and the efficiency of the application. In the last decade the use of Machine Learning, and more specifically of classification methods, has proved successful at creating this semantic information. In this paradigm, the goal is to classify points into a set of given classes (for instance tree, building, ground, other). Some of these methods use descriptors (also called features) of a point to learn and predict its class. Designing the descriptors is then the heart of these methods. Descriptors can be based on point geometry and attributes, use contextual information, etc. Furthermore, descriptors can be used by humans for easier visual understanding and sometimes filtering. In this work we propose a new simple geometric descriptor that gives information about the implicit local dimensionality of the point cloud at various scales. For instance a tree seen from afar is more volumetric in nature (3D), yet locally each leaf is rather planar (2D). To do so we build an octree centred on the point to consider, and compare the variation of the occupancy of the cells across the levels of the octree. We compare this descriptor with the state of the art dimensionality descriptor and show its interest. We further test the descriptor for classification within the Point Cloud Server (Cura, 2016), and demonstrate efficiency and correctness results.
1. INTRODUCTION

1.1 Problem
The democratisation of sensing devices has resulted in an expansion of acquired point clouds. At the same time, the acquisition frequency and precision of Lidar devices are also increasing, resulting in an explosion of the number of points. Lidar datasets are becoming more and more common. They are appreciated for their precise 3D nature, and have a wide range of applications, such as surface reconstruction, object detection, visualisation, etc.

Semantic information in addition to the raw point data can be very useful for these applications. It allows increasing quality and speeding up computation. For instance a method that reconstructs façades can safely skip the points pertaining to the ground. Similarly, a user visually exploring a dataset may find it very useful to isolate points pertaining to trees, for instance. Logically, adding semantic information to point clouds has been researched for a long time by many researchers.

In the last decade the use of Machine Learning, and more specifically of classification methods, has proved popular. In this paradigm, the goal is to classify points into a set of given classes (for instance tree, building, ground, other). Some of these methods use descriptors (also called features) for each point that are leveraged on a training set to learn to associate descriptors with semantic information. Once the association is learned, it can be used to extrapolate semantic classes on similar point clouds. The heart of such approaches is then to design appropriate descriptors that enable accurate and efficient disambiguation between classes. Many different descriptors have been used, based on geometry or other attributes of points, using or not the context of the point, etc. We refer to the very recent thesis of Weinmann (2016).
Recently, deep learning methods have made it possible to learn features on the fly (see Huang and You (2016)), bypassing the need to craft descriptors, but also introducing new trade-offs in the process (necessary training set and prior knowledge of similar point clouds). Of course machine learning approaches require substantial training sets, and may fail if the processed point cloud is too different from the learned point clouds. Furthermore, machine learning methods are sophisticated and require significant computing time. Deep learning methods have also been applied to information extracted from point clouds, such as 2D images (Boulch et al. (2017)), voxels (Tchapmi et al. (2017)), and graphs of relations between superpoints (Landrieu and Simonovsky (2017)).

In this work we focus on a new simple geometric descriptor that gives information about the implicit local dimensionality of the point cloud, regardless of prior knowledge, method of acquisition and scale. This descriptor is based on the octree cells occupancy of a point cloud. It is designed with massive point clouds and scaling in mind, which means that it is included in a global Level Of Detail approach, and is computed on groups of points rather than on individual points. Our aim is to provide a basic descriptor that has a direct geometrical interpretation, is extremely fast to compute and scales well due to its hierarchical nature. This descriptor can then be used for visualisation or as a preprocessing step for classification and reconstruction. We compare this descriptor with the state of the art. We demonstrate the potential of the descriptor to perform efficient patch classification within the Point Cloud Server (Cura, 2016).
All the methods are tested on billion-scale point clouds, and are open source for the sake of reproducibility, tests and improvements.
Figure 1: Graphical Abstract: a Lidar point cloud (1) is split into patches (2) and stored in a Point Cloud Server; patches are re-ordered to obtain free LOD (3) (a gradient of LOD here); lastly the ordering by-product is a multiscale dimensionality descriptor used as a feature for learning and efficient filtering (4).

Our main contribution is an efficient and robust geometric dimensionality descriptor with a scale parameter that is local rather than global. Our second contribution is to explore the interest of classifying groups of points (patches) rather than points for massive point clouds, and to estimate the trade-offs associated with this approach.
This work follows the classical plan Introduction, Method, Results, Discussion, Conclusion (IMRaD). Section 2. presents the geometric dimensionality descriptor, and how it can be leveraged for classification. Section 3. reports on the experiments validating the descriptor's interest and how it compares to the state of the art geometric dimension descriptors. Finally, details, limitations, and potential applications are discussed in Section 4.
2. METHOD
In this section, we first present the Point Cloud Server (Section 2.1) (PCS, Cura et al. (2016)) that this article extends. Then we introduce a dimensionality descriptor that is based on octree cells occupancy (2.2). This descriptor can be used in the PCS to perform density correction and classification at the patch level (2.3). This classification can be directly transferred to points, or indirectly exploited in a pre-filtering step.
Our method strongly depends on the Point Cloud Server described in Cura et al. (2016); therefore we introduce its principles and key relevant features (see Figure 2). The PCS is a complete and efficient point cloud management system based on a database server that works on groups of points rather than individual points. This system is specifically designed to solve all the basic needs of point cloud users: fast loading, compressed storage, powerful filtering, easy data access and exporting, and integrated processing. The core of the PCS is to store groups of points (called patches) that are multi-indexed (spatially, on attributes, etc.), and represented with different generalisations depending on the applications.
[Figure 2 diagram: LOAD; STORE (groups of points, compressed); METADATA (secure and relational, extended: trajectory, sources; generalisation/visualisation); FILTER (indexes, can use other GIS data); PROCESSING (in-base easy prototyping, classic out-of-base workflow); EXPORT]
Figure 2: Overall and storage organisation of the Point Cloud Server.

Points can be grouped with any rules. In this work, the points are regrouped spatially, by cubes (Paris) or squares (Vosges). All the methods described in this work are applied on patches.

The goal is to estimate the geometric dimensionality of a group of points (is the group of points more 1D, 2D or 3D in nature?). For instance a door would be mostly 2D, and a telephone line mostly 1D. Our approach is multi-Level Of Detail (LOD), because we consider that the geometric dimensionality of a group of points depends on the scale / the level of detail.

Let us consider a group of points; we build an octree where the cell of the first level is a cube containing all the points. In the Point Cloud Server, the cube size is chosen for storage efficiency (between 50 m and 1 m of side length) and aligned on a 3D grid. For a given level L_i of the octree, we note O_i the number of cells that are occupied by one or more points. Doing so for n levels creates a feature vector ppl = (O_1, O_2, .., O_n) where O_i is the number of cells of level i of the octree that contain 1 or more points.

For instance, given a level L_i, a centred line would occupy 2^L_i cells, a centred plane 4^L_i cells, and a volume all of the cells (8^L_i cells). Thus simply by counting the number of occupied cells we get an idea of the geometric dimensionality of the group of points for a given level (See Figure 3).

Figure 3: Octree cells occupancy is a basic dimensionality descriptor: a 3D line (2^L occupancy), a 2D surface (4^L occupancy) or a volume (8^L occupancy) occupy a different amount of cells.

This occupancy is only correctly estimated when the patch is fully filled and dimensionally homogeneous. However, we can also characterize the dimensionality (Dim_LODDiff) by the way the occupancy evolves (difference mode). Indeed, a line occupying k cells of an octree at level L_i will occupy 2*k cells at level L_(i+1), if there are enough points (see Fig. 4).
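As an illustration, the ppl occupancy vector can be computed with a simple voxel-grid count per level. The sketch below in NumPy is ours, not the PCS implementation: the function name and the normalisation of the points into a unit root cell are assumptions for the example.

```python
import numpy as np

def occupancy_per_level(points, n_levels=4):
    """Count occupied octree cells per level for points in a cubic root cell.

    Returns ppl = (O_1, .., O_n): O_i is the number of cells of the
    2^i x 2^i x 2^i grid of level i containing at least one point.
    """
    pts = np.asarray(points, dtype=float)
    # normalise into the root cell [0, 1)^3 (the PCS instead uses fixed grid-aligned cubes)
    mins, maxs = pts.min(axis=0), pts.max(axis=0)
    span = max((maxs - mins).max(), 1e-12)
    pts = (pts - mins) / span
    ppl = []
    for level in range(1, n_levels + 1):
        k = 2 ** level                         # cells per axis at this level
        idx = np.clip((pts * k).astype(int), 0, k - 1)
        ppl.append(len({tuple(c) for c in idx}))
    return ppl

# a straight 3D line occupies ~2^L cells per level
t = np.linspace(0, 1, 10000)
line = np.stack([t, t, t], axis=1)
print(occupancy_per_level(line))  # → [2, 4, 8, 16]
```

A dense planar patch would instead yield counts close to (4, 16, 64, 256), i.e. 4^L, matching Figure 3.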
Figure 4: Illustration of the octree cell occupancy evolution for a point cloud acquisition of a real-life tree (the dimension is embedded in the power of 2). Depending on the level (thus the scale), the geometric dimensionality of the object goes from 3D to 1D.

Figure 5 illustrates this. Typical parts of a street in the Paris dataset were segmented: a car, a wall with window, a 2-wheeler, a public light, a tree, a person, poles and pieces of ground including curbs. Due to the geometry of acquisition and sampling, the public light is almost a 3D line, resulting in the occupation of very few octree cells. A typical number of octree cells per level for a public-light patch would then be (1, 2, 4, 8, ...), which looks like a (2^1)^L function. A piece of ground is often extremely flat and very similar to a planar surface, which means that the occupied octree cells could be (1, 4, 16, ...), a (2^2)^L function. Lastly a piece of tree foliage is going to be very volumetric in nature, due to the fact that a leaf is about the same size as the point spacing and is partly transparent to the laser (leading to several echoes). A foliage patch would then typically be (1, 8, 64, ...) (if enough points), so a (2^3)^L function. (Tree patches are in fact a special case, see 3.3.1.)

In Cura et al. (2016) we introduced a new method for Level Of Detail of massive point clouds. The main idea was to store the LOD implicitly within the order of the points. Thus, the method relies on ordering the points with the most important (geometrically speaking) first. This ordering method, called MidOc, is also built around an octree. When using the Implicit LOD MidOc building process, the number of chosen points per level can be stored. Each ordered patch is then associated with a vector of number of points per level ppl = (N_L0, .., N_Lmax).

Figure 5: All successive levels of LOD from Cura et al. (2016) for common objects (car, window, 2-wheeler, light, tree, people, pole, ground); colour is intensity for other points.
The number of picked points for L_i is almost the voxel occupancy for level i of the corresponding octree. Almost, because in MidOc points picked at a level do not count for the next levels. Occupancy over a voxel grid has already been used as a descriptor (See Bustos et al. (2005)). However we can go a step further. For the following we consider that patches contain enough points and that levels are low enough so that the removal of picked points has no influence. Thus, by comparing ppl[L_i] to the theoretical 2^L_i, 4^L_i, 8^L_i we retrieve a dimensionality feature Dim_LOD[i] about the dimensionality of the patch at level L_i (See Figure 3).

A sophisticated per-point dimensionality descriptor is introduced in Demantké (2014); Weinmann et al. (2015), where it is then used to find the optimal neighbourhood size. This approach is very different from ours. The first difference is that this feature is computed for each point (and is thus extremely costly to compute). On the opposite, in the PCS we compute the feature at the patch level; we do not need to find the scale at which to compute dimensionality, as the descriptor is computed on the whole patch.

This dimensionality descriptor (Dim_cov) relies on computing the covariance of the points centred on the barycentre (3D structure tensor), then a normalisation of the covariance eigenvalues. As such, the method is similar to, and has the same advantages and limitations as, Principal Component Analysis (See Shlens (2014) for a reader-friendly introduction). It can be seen as fitting a 3D ellipsoid to the points.

First, this method is sensitive to density variations because all the points are considered for the fitting. Contrary to our hypothesis, this method implicitly considers that density holds information about the nature of the sensed objects. Second, this method only fits one ellipsoid, which is insufficient to capture complex geometric forms. Last, this method is very local and does not allow exploring different scales for a point cloud as a whole; indeed this method is classically used on points within a growing sphere to extend the scale.

The second difference is more methodological: typically the scale of the feature is the radius within which points are considered to compute the feature. Thus it is not possible to analyse a large group of points at a geometrically fine scale. On the opposite, in the case of the octree cells occupancy, the scale is directly related to the geometric size of the octree cells.

We compute both dimensionality descriptors and then compare them on the Paris dataset.
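A covariance-based estimate of this kind can be sketched as follows. This is an illustrative version only: we use a Demantké-style linearity/planarity/scatter normalisation of the eigenvalue square roots, which may differ in detail from the exact formulation of Weinmann et al. (2015); the function name is ours.

```python
import numpy as np

def dim_cov(points):
    """Covariance-based dimensionality in [1, 3] (structure tensor).

    p approximates [p_1D, p_2D, p_3D] from the square roots of the
    covariance eigenvalues, then Dim_cov = sum(i * p_dim[i]).
    """
    pts = np.asarray(points, dtype=float)
    cov = np.cov(pts - pts.mean(axis=0), rowvar=False)
    ev = np.sort(np.linalg.eigvalsh(cov))[::-1]        # l1 >= l2 >= l3 >= 0
    s = np.sqrt(np.clip(ev, 0.0, None))
    p = np.array([s[0] - s[1], s[1] - s[2], s[2]]) / max(s[0], 1e-12)
    return float(np.dot([1, 2, 3], p))                 # Dim_cov = Σ i·p_dim[i]

rng = np.random.default_rng(0)
print(dim_cov(np.stack([np.linspace(0, 1, 1000)] * 3, axis=1)))  # line  → ~1
print(dim_cov(rng.random((1000, 3))))                            # cube  → ~3
```

This makes the limitations discussed above easy to reproduce: stretching one axis of the cube, or removing half of its points, shifts the estimate even though the underlying geometry is unchanged.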
Because x → (2^1)^x, x → (2^2)^x and x → (2^3)^x diverge very fast, we only need to use a few levels to have a quite differentiating descriptor. For instance, using L = 2, we have x_i = [4, 16, 64], which are very distinguishable values, and do not require a high total density per patch. As long as the patch contains a minimal number of points, the descriptor is density and scale invariant. Lastly a mixed result (following none of the x → (2^i)^x functions) can be used as an indicator that the patch contains mixed geometry, either due to the nature of the objects in the patch, or due to the way the patch is defined (sampling).

Figure 6: Covariance-based geometric dimensionality is estimated similarly to a PCA; its scale is the size of the neighbourhood radius. On the opposite, the octree cells occupancy geometric dimensionality descriptor's scale is in fact the scale at which the points are analysed.

Although it might be possible to go a step further and decompose a patch's ppl vector on the base of x → (2^i)^x, i ∈ [1..3], a direct and exact decomposition cannot be used because the decomposition might depend on L_i. For instance a plane with a small line could appear as a plane for the first levels, and start to appear differently at higher levels. In this case, an Expectation-Maximization scheme might be able to decompose robustly, separating the points into more dimensionally coherent groups.

We propose to perform patch classification using the Point Cloud Server and the previously introduced octree cells occupancy, along with other basic descriptors, using a Random Forest classifier. Following the position of the PCS towards abstraction, the classification is performed at the patch level and not at the point level.
This induces a massive scaling and speeding effect, at the cost of introducing quantization error. Indeed, compared to a point classification, a patch may contain points belonging to several classes (due to generalisation), yet it will only be classified in one class, hence the "quantization" error. Because patch classification is so fast and scales so well, the end goal can however be slightly different than for usual point classification.

Patch classification can be used as a fast pre-process to another, slower point classification, be it to speed it up (an artificial recall increase for patch classification may be needed, see Figure 15), or to improve a point classification. The patch classification can provide a rough classification, based on which the most adapted point classifier is chosen (similarly to cascaded classifiers), thus improving the result of the final point classification. For instance a patch classified as urban object would lead to choosing a classifier specialized in urban objects, and not the general classifier. This is especially precious for classes that are statistically under-represented (i.e. pedestrian, urban furniture).

Patch classification may also be used as a filtering pre-process for applications that only require one class. Many applications only need one class, and do not require all the points in it, but only a subset with good confidence. For this it is possible to artificially boost the precision (by accepting only high-confidence predictions). For instance computing a Digital Terrain Model (DTM) only requires ground points. Moreover, the ground will have many parts missing due to objects, so using only a part of all the points will suffice anyway. The patch classifier allows finding most of the ground patches extremely fast. Another example is registration. A registration process typically requires reliable points to perform mapping and registration.
In this case there is no need to use all points, and the patch classification can provide patches from ground and façade with high accuracy (for point cloud to point cloud or point cloud to 3D model registration), or patches of objects and trees (for point cloud to landmark registration). In other applications, finding only a part of the points may be sufficient, for instance when computing a building map from façade patches.

The Random Forest method started with Amit and Geman (1997), was theorized by Breiman (2001) and has been very popular since then. Random Forests are for instance used by Golovinskiy et al. (2009), who perform object detection, segmentation and classification. They analyse each task separately on an urban data set, thus providing valuable comparison. Their method is uniquely dedicated to this task, like that of Serna and Marcotegui (2014), who provide another method and a state of the art of the segmentation/classification subject. Both of these methods are in fact 2D methods, working on an elevation image obtained by projecting the point cloud. However we observe that street point clouds are dominated by vertical surfaces, like buildings (about 70% in the Paris data set). Our method is fully 3D and can thus easily be used to detect vertical object details, like windows or doors on buildings.
The first descriptor is ppl, the octree cells occupancy dimensionality descriptor, usually produced by the MidOc ordering (see Section 2.2), or computed independently using fast octree building by point ordering. We use the number of points for the first levels. For each level L, the number of points is normalized by the maximum number of points possible (8^L), so that every feature is in [0, 1].

We also use other simple features that require very limited computing (avoiding complex features like contextual features). Due to the PCS patch compression mechanism, the min, max, and average of any attribute of the points are directly available. Using the LOD allows quickly computing other simple features, like the 2D area of the points of a patch (points being considered with a given diameter).

Dealing with data set particularities
The Paris data set classes are organized in a hierarchy (100 classes in theory, 22 populated). The rough patch classifier is not designed to deal with so many classes, so a way to determine which level of the hierarchy to use is needed. We propose to perform this choice with the help of a graph of similarity between classes (See Fig. 10 and 11). We first determine how similar the classes are for the simple dimensionality descriptors, classifying with all the classes and computing a class-to-class confusion matrix. This confusion matrix can be interpreted as a class affinity matrix, and thus as a graph. We draw the graph using a spectral layout (team Networkx (2014)), which amounts to drawing the graph following the first two eigenvectors of the matrix (similar to Principal Component Analysis). Those two vectors maximize the variance of the data (while being orthogonal), and thus best explain the data. This graph visually helps to choose the appropriate number of classes to use. A fully automatic method may be used via an unsupervised clustering approach on the matrix (like the Affinity Propagation of Frey and Dueck (2007)).

Even when reducing the number of classes, the Paris dataset is unbalanced (some classes have far fewer observations than others). We tried two classical strategies to balance the data set regarding the number of observations per class. The first is under-sampling big classes: we randomly under-sample the observations to get roughly the same number of observations in every class. The second strategy is to compute a statistical weight for every observation based on the class prevalence. This weight is then used in the learning process when building the Random Forest.
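The confusion-matrix-to-graph step can be sketched with NetworkX. The symmetrisation below is one possible choice and the function name is ours; note that `spectral_layout` positions nodes along eigenvectors of the graph Laplacian, which is one concrete way to obtain the eigenvector-based placement described above.

```python
import numpy as np
import networkx as nx

def class_affinity_layout(confusion, labels):
    """Build an affinity graph from a confusion matrix and lay it out spectrally."""
    m = np.asarray(confusion, dtype=float)
    m = (m + m.T) / 2.0                    # symmetrise: confusion -> affinity
    np.fill_diagonal(m, 0.0)               # self-affinity is not informative here
    g = nx.Graph()
    for i, a in enumerate(labels):
        g.add_node(a)
        for j in range(i + 1, len(labels)):
            if m[i, j] > 0:
                g.add_edge(a, labels[j], weight=m[i, j])
    # 2D positions driven by the weighted graph Laplacian eigenvectors
    return g, nx.spectral_layout(g, weight="weight")

# toy 3-class confusion matrix: rows = truth, columns = prediction
conf = [[50, 5, 1], [4, 60, 2], [0, 3, 40]]
g, pos = class_affinity_layout(conf, ["road", "sidewalk", "tree"])
print({k: v.round(2) for k, v in pos.items()})
```

Classes that are often confused get heavy edges and are pulled close together, which is exactly the visual cue used to merge hierarchy levels.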
Contrary to classical classification methods, we are not exclusively interested in precision and recall per class, but also in the evolution of precision when prediction confidence varies. In fact, for a filtering application, we can leverage the confidence information provided by the Random Forest method to artificially boost precision (at the cost of a recall diminution). We can do this by limiting the minimal confidence allowed for every prediction. Orthogonally, it is possible for some classes to increase recall at the cost of precision by using the result of a first patch classification and then incorporating the other neighbouring patches into the result.

We stress that if the goal is to detect objects (and not classify each point), this strategy can be extremely efficient, for instance if we are looking for objects that are big enough to span several patches (e.g. a car). In this case we can perform the classification (which is very fast and efficient), then keep only highly confident predictions, and then use the position of the predictions to perform a local search for the car limits. The classical alternative solution would be to perform a per-point classification on each point, which would be extremely slow.
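The confidence-threshold trade-off can be sketched with scikit-learn: keep only predictions whose class probability exceeds a minimum, trading recall for precision. The dataset, class weights and thresholds below are illustrative, not the paper's experimental setup.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# imbalanced 3-class toy problem standing in for the patch features
X, y = make_classification(n_samples=2000, n_classes=3, n_informative=6,
                           weights=[0.6, 0.3, 0.1], random_state=0)
Xtr, Xte, ytr, yte = train_test_split(X, y, random_state=0)

# class_weight="balanced" is the per-observation weighting strategy of the text
rf = RandomForestClassifier(n_estimators=100, class_weight="balanced",
                            random_state=0).fit(Xtr, ytr)

proba = rf.predict_proba(Xte)
conf = proba.max(axis=1)                   # confidence of the winning class
pred = rf.classes_[proba.argmax(axis=1)]

for t in (0.0, 0.5, 0.8):                  # raising t boosts precision, lowers recall
    keep = conf >= t
    acc = (pred[keep] == yte[keep]).mean() if keep.any() else float("nan")
    print(f"threshold {t:.1f}: kept {keep.mean():.0%} of patches, accuracy {acc:.2f}")
```

The fraction of kept patches plays the role of recall here: rejecting low-confidence predictions discards patches but makes the surviving labels more trustworthy.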
3. RESULT

3.1 Introduction to results
We design and execute several experiments in order to test the descriptors for a random forest classifier on two large data sets, proving their usefulness. We analyse the potential of these descriptors, and what they bring when used in conjunction with other simple descriptors.

The base DBMS is team PostgreSQL (2014-). The spatial layer team PostGIS (2014-) is added to benefit from generic geometric types and multidimensional indexes. The specific point cloud storage and functions come from pgPointCloud (2014-). MidOc is either plpgsql or made in Python with team SciPy (2014-). The classification is done with team Scikit (2014-), and the network clustering with team Networkx (2014). Timings are only orders of magnitude due to the influence of database caching.

Figure 7: Histogram of number of points per patch, with a logarithmic scale for the X and Y axes.

We use two data sets. They were chosen as different as possible to further evaluate how the proposed methods can generalise on different data (see Figure 7 for the histogram of patch density). The first data set is IQmulus (2014) (Paris data set), an open source urban data set with varying density, singularities, and very challenging point cloud geometry. Every point is labelled with a hierarchy of 100 classes. The training set is only 12 million points. Only 22 classes are represented. We group points in cubes. The histogram of density seems to follow an exponential law (See Figure 7), the effect being that many patches with few points exist.

We also use the Vosges data set, which is a very widespread, aerial Lidar, 5.5 billion point cloud. Density is much more constant, at 10k pts/patch. A vector ground truth about surface occupation nature (type of forest) is produced by the French Forest Agency. Again the classes are hierarchical, with 28 classes. We group points in 50 × 50 m squares.

All the experiments are performed using a Point Cloud Server (cf. Cura (2014)).
The key ideas are that point clouds are stored inside a DBMS (PostgreSQL), as patches. Patches are compressed groups of points along with some basic statistics about the points in the group. We hypothesize that in a typical point cloud processing workflow, a point is never needed alone, but almost always with its surrounding points.

Each patch of points is then indexed in an R-tree for the most interesting attributes (obviously X, Y, Z, but also time of acquisition, metadata, number of points, distance to source, etc.). Having such a meta-type with powerful indexes allows us to find points based on various criteria extremely fast (order of magnitude: ms). As an example, for a 2 billion point dataset, we can find in a few milliseconds all patches:
- between -1 and 3 meters high in reference to the vehicle wheels
- in a given 2D area defined by any polygon
- acquired between 8h and 8h10
- etc.

The PCS offers an easy means to perform data-partition based parallelism. We extensively use it in our experiments.
We test the dimensionality descriptor (ppl) in two ways. First we compare the extracted Dim_LOD to the classical structure tensor based descriptor Dim_cov. Second we assess how useful it is for classification, by analysing how well it separates classes, and how much it is used when several other features are available.
We compute Dim_cov following the indications of Weinmann et al. (2015) to get p_dim ∈ [0..1]^3, i.e. the probability to belong to [1D, 2D, 3D]. We convert this to Dim_cov with Dim_cov = Σ_{i=1..3} i * p_dim[i]. Optionally, we test a filtering option so that the maximum distance in the two biggest dimensions is more equivalent. However this approach fails to significantly improve results.

We test several methods to extract Dim_LOD from ppl. The first method is to compute Dim_LODs[i] = log2(ppl[i]) / i, which gives the simple geometric dimension for each level. The second method is the same but works on occupancy evolution, with Dim_LODd[i] = log2(ppl[i] / ppl[i-1]) (discarding L_0). In both cases the result is a geometric dimension between 0 and 3 for each level. We use both indices to fuse the dimensionality across levels (working on Dim_LODA = Dim_LODs ∪ Dim_LODd). The first method uses a RANSAC (team SciPy (2014-) implementation of Choi et al. (2009)) to find the best linear regression. The slope gives an idea of confidence (ideally, it should be 0), and the value of the line at the middle of the abscissa is an estimate of Dim_LOD. The second method robustly filters Dim_LODA based on the median distance to the median value and averages the inliers to estimate Dim_LOD.

Dim_cov and Dim_LOD are computed with in-base and out-of-base processing, the latter being executed in parallel (8 workers). For 10k patches (12 M points), retrieving data and writing results accounts for …, computing Dim_LOD for …, and Dim_cov for …. Computing ppl (which is multi-scale) using a linear octree takes between … and ….

Comparing Dim_cov and Dim_LOD is not straightforward because the implicit definition of dimension is very different in the two methods. We analyse the patches where |Dim_LOD − Dim_cov| <= 0.5; 0.5 is an arbitrary threshold, but we feel that it represents the point above which the descriptors predict unreconcilable dimensions. Those patches represent 93% of the data set (96% of points), with a correlation of 0.80. Overall the proposed dimensions are similar for the majority of patches, especially for well filled 1D and 2D patches (See Fig. 8). We analyse the 684 remaining patches to look for possible explanations of the difference in dimension (See Fig. 9). We consider the following four main sources of limitations of Dim_cov.

• Elongated patch. Dim_cov = 1.42, Dim_LOD = 1.92. If the patch is not roughly a square, Dim_cov gives a bad estimation as it is biased by the asymmetry of the point distribution.

• Ellipsoid too simple. Dim_cov = 1.68, Dim_LOD = 2.24. Dim_cov fits an ellipsoid, which cannot cope with complex objects, especially when the barycentre does not happen to be at a favourable place.

• Coping with heterogeneous sampling. p_dim = [0.56, 0.32, 0.12], Dim_cov = 1.57, Dim_LOD = 2.16. Dim_cov is sensitive to differences in point density. The points on the bottom plane are about 3 times less dense than in the vertical plane, leading to a wrong estimate.

• Definition of dimension different. General: Dim_cov ∈ [1…, …], Dim_LOD ∈ [1…, …]. This patch: p_dim = [0.11, 0.23, 0.66], ppl = [1, 8, 36, 74, ..], Dim_LODd = [3.0, 2.17, 1.04]. Trees are a good example of how the two descriptors rely on a different dimension definition. For Dim_cov points may be well spread out, so usually p_3D is high. Yet tree patches are also subject to density variation, and may also be elongated, which renders Dim_cov very variable. On the opposite, Dim_LOD considers the dimensionality at different scales (See Fig. 4). From afar a tree patch is volumetric; at a lower scale, it seems planar (leaves and small sticks form rough planes together); lastly, at small scale, the tree looks linear (sticks).
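For reference, the two per-level extraction formulas (Dim_LODs and Dim_LODd) can be sketched directly from a ppl vector. This is illustrative code with our own function names; the RANSAC and median-filter fusion steps described above are omitted.

```python
import math

def dim_lod_s(ppl):
    """Per-level dimension from absolute occupancy: log2(ppl[i]) / i.

    ppl[0] is the occupancy O_1 of level 1, ppl[1] is O_2, etc.
    """
    return [math.log2(ppl[i]) / (i + 1) for i in range(len(ppl))]

def dim_lod_d(ppl):
    """Per-level dimension from occupancy evolution: log2(ppl[i] / ppl[i-1])."""
    return [math.log2(ppl[i] / ppl[i - 1]) for i in range(1, len(ppl))]

# a perfect plane occupies 4^L cells per level: ppl = (4, 16, 64, 256)
plane = [4, 16, 64, 256]
print(dim_lod_s(plane))  # → [2.0, 2.0, 2.0, 2.0]
print(dim_lod_d(plane))  # → [2.0, 2.0, 2.0]
```

A tree-like patch such as ppl = [8, 36, 74, ...] instead yields a decreasing Dim_LODd, reproducing the 3D-to-1D behaviour of Figure 4.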
Using only the ppl descriptor, a classification is performed on the Paris data set, then a confusion matrix is computed. We use the spectral layout (see Section 2.3.2) to automatically produce the graph in Figure 10. We manually add 1D, 2D and 3D arrows. On this graph, classes that are most similar according to the classification with ppl are close. The graph clearly presents an organisation following 3 axes, and those axes coincide with the dimensionality of the classes. For instance the "tree" class has a strong 3D dimensionality. The "Punctual object" class, defined by "objects whose representation on a map should be a point", is strongly 1D (lines), with objects like bollards and public lights. The "Road" class is strongly 2D, the road being locally roughly a plane. The centre of the graph is occupied by classes that may have mixed dimensionality; for instance "4+ wheeler" (i.e. car) may be planar or more volumetric, because of the sampling. "Building" and "sidewalk" are not as clearly 2D as the "road" class. Indeed, the patches of the "sidewalk" class are strongly mixed (containing 22% of non-sidewalk points, see Figure 13). The building class is also not purely 2D because building façades in Paris contain balconies, building decoration and floors, which introduce lots of object-like patches; this explains why building is closer to the object cluster (See Figure 5 for instance). The dimensionality descriptor clearly separates classes following their dimensionality, but cannot separate classes with mixed dimensionality.

Figure 8: Dim_LOD and Dim_cov are mostly comparable, except for a few patches (5%, coloured).

Figure 9: Representative patches for |Dim_LOD − Dim_cov| > . (elongated patch: ellipsoid too simple; not the same definition of dimension; inhomogeneous sampling; e.g. p_dim = [0.11, 0.23, 0.66], Dim_LOD = [3.0, 2.17, 1.04]). Most differences are explained by Dim_cov limitations (See 3.3.1).

Figure 10: Spectral clustering of the confusion matrix of the Paris data set classification using only the ppl descriptor. Edge width and colour are proportional to affinity. Node position is determined fully automatically. Red-ish arrows are manually placed as visual clues.

To further evaluate the dimensionality descriptor, we introduce other classification features (see 3.4), perform classification and compute each feature importance. The overall precision and recall results of these classifications are correct, and the ppl descriptor is of significant use (See Figures 13 and 12), especially in the Vosges data set. The ppl descriptor is less used in the Paris data set, maybe because lots of classes cannot really be defined geometrically, but rather by context. The dimensionality descriptor alone cannot be used to perform sophisticated classification, because many semantically different objects have similar dimension (for instance, a piece of wall and a piece of ground are dimensionally very similar, yet semantically very different). We introduce additional simple features for classification (See Section 2.3.2).
All use already stored patch statistics, and thus are extremely fast to compute (P: for Paris, V: for Vosges):
- average altitude with respect to the sensing device origin (P)
- area of the patch bounding box (P)
- patch height (P)
- points per level (ppl), levels 1 to 4 (P+V)
- average intensity (P+V)
- average number of echoes (P+V)
- average Z (V)
For the Vosges data set, we reach a speed of 1 M points/s/worker to extract those features.
Undersampling and weighting are used on the Paris data set: first undersampling reduces the over-dominant building class to at most 100 times the support of the smallest class, then weighting compensates for the remaining differences in support. For the Vosges data set only the weighting strategy is used. The weighting approach is favoured over undersampling because it lessens the variability of results when classes are very heterogeneous. To ensure significant results we follow a K-fold cross-validation method. We again compute a confusion matrix (i.e. affinity between classes) on the Paris data set to choose which level of the class hierarchy should be used.
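The undersample-then-weight balancing described above can be sketched as follows. This is a stdlib-only illustration of the logic; the function name and the normalisation of the weights are our choices, not the paper's.

```python
import random
from collections import Counter

def balance(labels, max_ratio=100, seed=0):
    # Undersample so that no class keeps more than max_ratio times the
    # support of the smallest class, then return per-class weights that
    # compensate for the remaining support differences.
    rng = random.Random(seed)
    counts = Counter(labels)
    cap = max_ratio * min(counts.values())
    idx = list(range(len(labels)))
    rng.shuffle(idx)
    kept, seen = [], Counter()
    for i in idx:
        if seen[labels[i]] < cap:
            seen[labels[i]] += 1
            kept.append(i)
    # weight proportional to 1 / support, largest kept class gets weight 1
    max_support = max(seen.values())
    weights = {c: max_support / n for c, n in seen.items()}
    return sorted(kept), weights
```

For example, with 1000 "building" patches, 5 "tree" patches and max_ratio = 100, the building class is capped at 500 samples and then receives weight 1.0 against 100.0 for trees.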
Figure 11: Result of automatic spectral clustering over the confusion matrix for patch classification of the Paris data set with all simple features (clusters: vegetation, objects, buildings, poles, vehicle, ground). Edge width and colour are proportional to confusion. Clusters are manually drawn for easier understanding.
Choosing which level of the class hierarchy to use depends on the data set and the application. In a canonical classification perspective, we have to strongly reduce the number of classes if we want significant results. However, reducing the number of classes (i.e. using a higher level in the class hierarchy) also means that classes are more heterogeneous. Both data sets are extremely unbalanced (factor 100 or more). Thus our simple and direct Random Forest approach is ill suited for dealing with extremely small classes (a cascading or one-versus-all framework would be needed). For the Vosges data set a short analysis convinced us to use 3 classes: Forest, Land, and Other; among these three classes, the Land class is statistically overweighted by the two others. For the Paris data set, we again use a spectral layout to represent the affinity graph (See Figure 11). Introducing other features clearly helps to separate more classes. The graph also shows the limit of the classification, because some classes cannot be properly defined without context (e.g. the sidewalk, which is by definition the space between building and road, hence is defined contextually).
Figure 12: Vosges data set. (Table 2) Precision (prec.), recall (rec.), support (supp.), and average percent of points of the class in the patches (mix.), for comparison with a point based method, for the classes Closed forest, Moor and Not forest. (Table 1) Feature usage (ppl_1 to ppl_4, number of echoes, mean Z, patch height, intensity).
Precision_point = Precision_patch × Mix (and the same for recall). Further filtering on confidence cannot increase precision, but will reduce the variability of the found building patches. This result (red) would provide a much better base for building reconstruction, for instance.

The patch classifier can also be used as a filtering preprocess. In this case, the goal is not to reach a great precision, but to be fast and to have a good recall. Such recall may be increased artificially for classes of objects bigger than the sampling size ( for Paris). We take the example of ground classification (See Figure 15). The goal is to find all ground patches very fast. We focus on a small area for illustration purposes. This area contains patches, including ground patches. Patch classification finds ground patches, with a recall of . %. Using the found patches, all the surrounding patches (X,Y: , Z: . ) are added to the result (a few seconds). There are now patches in the result, and the recall is %. This means that, from a filtering point of view, a complex classifier that would try to find ground points can be used on / of the data set, at the price of a few seconds of computing, without any loss of information.

Figure 13: Results for the Paris data set, at various levels of the class hierarchy. Precision (prec.), recall (rec.), support (supp.) and average percent of points of the class in the patches of the class, for comparison with a point based method (mix.). Classes of the same type are in the same continuous tone. Feature usage is evaluated for each level in the class hierarchy.

Figure 14: Plotting of patches classified as building, using confidence to increase precision. Ground truth from IGN and Open Data Paris.
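The patch-to-point conversion Precision_point = Precision_patch × Mix is a plain multiplication; a short numerical illustration (both input values here are hypothetical, not results from the paper):

```python
def point_metric(patch_metric, mix):
    # Precision_point = Precision_patch * Mix (identically for recall):
    # a correctly classified patch only contributes the fraction `mix`
    # of its points that actually belong to the class.
    return patch_metric * mix

# hypothetical values: patch precision 0.95, class purity (mix) 0.883
print(round(point_metric(0.95, 0.883), 3))  # 0.839
```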
4. DISCUSSION

4.1 Point cloud server
We refer the reader to Cura et al. (2015) for an exhaustive analysis of the Point Cloud Server. Briefly, the PCS has demonstrated all the required capacities to manage point clouds and scales well. It is, to the best of our knowledge, the fastest and easiest way to filter very big point clouds using complex spatial and temporal criteria, as well as to natively integrate point clouds with other GIS data (raster, vector). The main limitation lies in the hypothesis that points can be regrouped into patches that are meaningful regarding further point utilisation. If this hypothesis is false, the PCS loses most of its interest.
Tree patches are challenging for both dimensionality descriptors. Their possible dimension varies a lot (See Fig. 16), although Dim_LOD is more concentrated. Yet, ppl is extremely useful to classify trees. Indeed, ppl contains the dimensionality at various scales, and potentially its variation, which is quite specific for trees (See Fig. 4). We stress that a true octree cell occupancy (i.e. without picking points as in the ppl) can be obtained without computing the octree, simply by using the binary coordinates. We implement it in Python as a proof of concept. Computing it is about as fast as computing Dim_cov.

Figure 15: Map of patch clustering result for ground. The classical result finds a few extra patches that are not ground (blue), and misses some ground patches (red). Recall is increased by adding to the ground class all patches that are less than 2 meters in X,Y and 0.5 meter in Z around the found patches. Extra patches are much more numerous, but all the ground patches are found.
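The binary-coordinate occupancy idea mentioned above can be sketched as follows (our own minimal illustration, not the paper's proof of concept): the level-L octree cell of a point is just the first L bits of each normalised coordinate, so per-level occupancy needs no explicit octree.

```python
def cells_per_level(points, bbox_min, size, max_level=4):
    # Number of occupied octree cells per level, computed directly from
    # binary coordinates: at level L the cell index of a point along each
    # axis is floor(normalised_coord * 2**L), i.e. its first L bits.
    out = []
    for level in range(max_level + 1):
        n = 1 << level
        cells = {tuple(min(int((c - m) / size * n), n - 1)
                       for c, m in zip(p, bbox_min))
                 for p in points}
        out.append(len(cells))
    return out

# four points spread in a unit cube
pts = [(0.1, 0.1, 0.1), (0.9, 0.1, 0.1), (0.1, 0.9, 0.1), (0.9, 0.9, 0.9)]
print(cells_per_level(pts, (0, 0, 0), 1.0, max_level=2))  # [1, 4, 4]
```

The `min(..., n - 1)` guard only handles points lying exactly on the upper face of the bounding cube.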
Figure 16: Histogram of Dim_LOD and Dim_cov for patches in trees (500 k pts). Tree dimension ranges from 1.2 to 2.6, yet Dim_LOD is less ambiguous than Dim_cov.
Overall, ppl offers a good alternative to the classical dimensionality descriptor (Dim_cov), being more robust and multi-scale. However the ppl also has limitations. First, the quality of the dimensionality description may be affected by a low number of points in the patch. Second, in some cases it is hard to reduce it to a meaningful Dim_LOD. Last, because of assumptions on density, it is sensitive to geometric noise.

The ppl descriptor contains lots of information about the patch. This information is leveraged by the Random Forest method and permits a rough classification based on geometric differences. As expected, ppl descriptors are not sufficient to correctly separate complex objects, which is the main limitation for a classification application. The additional features are extremely simple, and far from the ones used in the state of the art. Notably, we do not use any contextual feature. We choose to classify directly in N classes, whereas due to the large unbalance in the data sets, cascade or one-versus-all approaches would be more adapted.
The figure 11 shows the limit of a classification without contextual information. For instance the classes grid and building are similar because in Paris building balconies are typically made of grids. To better identify confusion between classes, we use a spectral layout on the affinity matrix. Graphing this matrix in 2D amounts to a problem of dimensionality reduction. It could use more advanced methods than simply using the first two eigenvectors; in particular, the two vectors would not need to be orthogonal (for instance, as in Hyvärinen and Oja (2000)).
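A minimal sketch of such a spectral layout, under the assumption of a symmetrised affinity matrix and a normalised graph Laplacian (the paper does not specify its exact construction):

```python
import numpy as np

def spectral_layout(affinity):
    # Symmetrise the confusion/affinity matrix, build the normalised graph
    # Laplacian, and use the eigenvectors of the two smallest non-trivial
    # eigenvalues as 2D node coordinates.
    A = (affinity + affinity.T) / 2.0
    np.fill_diagonal(A, 0.0)
    d = A.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(np.maximum(d, 1e-12)))
    L = np.eye(len(A)) - D_inv_sqrt @ A @ D_inv_sqrt
    vals, vecs = np.linalg.eigh(L)  # eigenvalues in ascending order
    return vecs[:, 1:3]             # skip the trivial constant eigenvector

# toy affinity: classes (0,1) and (2,3) are strongly confused pairs
A = np.array([[0, 5, 1, 1],
              [5, 0, 1, 1],
              [1, 1, 0, 5],
              [1, 1, 5, 0]], dtype=float)
xy = spectral_layout(A)
# the confused pair (0,1) lands on the same side of the first spectral axis
print(xy[:, 0])
```

Strongly confused classes receive similar coordinates, which is what makes the clusters visible in Figures 10 and 11.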
First, the feature usage for the Vosges data set clearly shows that amongst all the simple descriptors, the ppl descriptor is largely favoured. This may be explained by the fact that forest and bare land have very different dimensionality, which is conveyed by the ppl descriptor. Second, the patch classifier appears to have very good results to predict whether a patch is forest or not. The precision is almost perfect for forest; we reach the limit of precision of the ground truth. Because most of the errors are on border areas, the recall for forest can also be easily artificially increased. The percent of points in a patch that belong to the patch class allows comparing results with a point based method. For instance the average precision per point for closed forest would be . × 0.883 = 0. . We stress that these are averaged results, and better precision per point could be achieved because we may use random forest confidence to guess border areas (with a separate learning for instance). For comparison with point methods, the patch classifier predicts with very good precision and support over 6 billion points in a few seconds (a few minutes for training). We do not know another method with similar results that is as fast and natively 3D. The Moor class cannot be separated without more specialised descriptors, because the Moor and Not forest classes are geometrically very similar. The principal limitation is that for this kind of aerial Lidar data set the 2.5D approximation may be sufficient, which enables many raster based methods that may perform better or faster.

The figure 13 gives full results for the Paris data set, at various class hierarchy levels. Because the goal is filtering and not pure classification, we only comment on the 7 classes result. The proposed method appears very effective to find buildings, ground and trees. Even taking into consideration the fact that patches may contain mixed classes (column mix.), the results are in the range of state of the art point classifiers, while being extremely fast. These results are sufficient to increase recall or precision to 1 if necessary. We stress that even results appearing less good (4+wheelers, 0.69 precision, 0.45 recall) are in fact sufficient to increase recall to 1 (by spatial dilatation of the result), which then enables more subtle methods on the filtered patches. The ppl descriptor is less used than for the Vosges data set, but is still useful, particularly when there are few classes. It is interesting to note that the mean intensity descriptor seems to be used to distinguish between objects, which makes it less useful in the 7 classes case. The patch classifier for the Paris data set is clearly limited to separating simple classes. In particular, the performances for objects are clearly lower than the state of the art.
A dedicated approach should be used (cascaded or one-versus-all classifier).

4.3.3 Estimating the speed and performance of patch based classification compared to point based classification

The Point Cloud Server is designed to work on patches, which in turn enables massive scaling. Timing a server is difficult because of the different layers of caches and the background workers. Therefore, timings should be considered as orders of magnitude. For the Paris data set, extracting the extra classification features requires ∼ n workers (1 to 8 workers), learning ∼ , and classification ∼ a few s. We refer to Weinmann et al. (2015) (Table 5) for point classification timings on the same data set (4.28 h, 2 s, 90 s) (please note that their training set is much reduced by data balancing). As expected the speed gain is high for complex feature computing (not required) and point classification (done on patches and not points in our case). For the Vosges data set, features are extracted at pts/s/worker, learning takes ∼ a few min, classification ∼ . The Vosges data set has not been used in other articles, therefore we propose to compare the timings to Shapovalov et al. (2010) (Table 3). Keeping only feature computation and random forest training (again on a reduced data set), they process 2 M points in 2 min, whereas our method processes the equivalent of 5.5 B points in a few minutes. Learning and classification are mono-threaded (3-fold validation), although the latter is easy to parallelise. Overall, the proposed method is one to three orders of magnitude faster. For the Paris data set (Fig. 13), we compare to Weinmann et al. (2015) (Table 5). As expected their results are better, particularly in terms of precision (except for the vegetation class). This trend is aggravated by taking into account the "mix." factor. Indeed we perform patch classification, and a patch may not pertain to only one class, which is measured by the mix factor (number of points in the main class divided by the total number of points).
However, including the mix factor, the results are still within the 85 to 90% precision range for the main classes (ground, building, natural). For the Vosges data set (Fig. 12), we refer to Shapovalov et al. (2010) (Table 2). Their random forest classifier gets trees with 93% precision and 89% recall. Including the mix factor, we get trees with 87% precision and 80% recall. As a comparison to image based classification, an informal experiment of classification with satellite images reaches between 85% and 93% precision for the forest class, depending on the pixel size (between 5 and 0.5 m). Overall, the proposed method gets reasonably good results compared to more sophisticated methods, while being much faster. It thus makes a good candidate as a preprocessing filtering step. Because the proposed methods are a pre-process or filtering step, it can be advantageous to increase precision or recall.
Figure 17: Precision of the 4+wheelers class is roughly a rising function of random forest confidence scores.

Figure 14 gives a visual example where increasing precision and reducing class heterogeneity is advantageous. This illustrates that having a given precision or recall is not necessarily the ultimate goal. In this case it would be much easier to perform line detection starting from the red patches rather than the blue patches. The limitation of artificial precision increase is that it is only possible when precision is roughly a rising function of random forest confidence, as seen in Figure 17. For this class, by accepting only predictions of the random forest that have a confidence over , the precision goes from . to . , at the cost of ignoring half the predictions for this class. This strategy is not possible for an incoherent class, like the unclassified class. The method we present for artificial recall increase is only possible if at least one patch of each object is retrieved, and objects are spatially dense, because a spatial dilatation operation is used. This is the case for "4+wheelers" objects in the Paris data set for instance. The whole method is possible because spatial dilatation is very fast in the Point Cloud Server (thanks to the index). Moreover, because the global goal is to find all patches of a class while leaving out some patches, it would be senseless to dilate with a very big distance. In such a case recall would be 1, but all patches would be in the result, thus there would be no filtering, and no speed-up. The limitation is that this recall increase method is more a deformation of already correct results than a magical method that will work with all classes.
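The recall-increase step can be sketched as a brute-force version of the dilatation query (the Point Cloud Server uses its spatial index instead; the 2 m / 0.5 m thresholds are those of Figure 15):

```python
def dilate_selection(patches, found, dxy=2.0, dz=0.5):
    # Artificial recall increase: add every patch whose centre lies within
    # dxy metres in X,Y and dz metres in Z of an already-found patch.
    # Brute force O(n * |found|); the PCS does this with an indexed query.
    grown = set(found)
    for i, (x, y, z) in enumerate(patches):
        for j in found:
            fx, fy, fz = patches[j]
            if abs(x - fx) <= dxy and abs(y - fy) <= dxy and abs(z - fz) <= dz:
                grown.add(i)
                break
    return grown

# patch centres; patch 0 is the only one initially classified as ground
patches = [(0, 0, 0), (1, 0, 0), (3, 0, 0), (10, 0, 0), (1, 0, 5)]
print(sorted(dilate_selection(patches, {0})))  # [0, 1]
```

Patch 2 is 3 m away in X and patch 4 is 5 m away in Z, so neither is added; only the immediate neighbour joins the result.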
5. CONCLUSION
We propose a dimensionality descriptor based on octree cells occupancy at different scales. It allows describing the local geometrical dimensionality of a point cloud in a meaningful way, and concentrates information, which can be leveraged for helping visualisation, or by a rough classifier. This descriptor is less affected by sampling than the alternative structure tensor geometric descriptor, is fast to compute (a simple ordering of the points), and is intrinsically multi-scale. We show the interest of this descriptor, both by comparison to the state of the art dimensionality descriptor, and by proof of its usefulness in real Lidar data set classification. Classification is extremely fast, sometimes at the price of performance (precision/recall). However, we prove that those results can be used as a pre-processing step for more complex methods, using if necessary precision-increase or recall-increase strategies.
6. BIBLIOGRAPHY