Automated Ground Truth Estimation For Automotive Radar Tracking Applications With Portable GNSS And IMU Devices
Nicolas Scheiner*, Stefan Haag*, Nils Appenrodt*, Bharanidhar Duraisamy*, Jürgen Dickmann*, Martin Fritzsche*, Bernhard Sick**
* Daimler AG, Ulm, Germany, email: {nicolas.scheiner, stefan.s.haag}@daimler.com
** University of Kassel, Kassel, Germany, email: [email protected]
Abstract:
Baseline generation for tracking applications is a difficult task when working with real world radar data. Data sparsity usually only allows an indirect way of estimating the original tracks as most objects' centers are not represented in the data. This article proposes an automated way of acquiring reference trajectories by using a highly accurate hand-held global navigation satellite system (GNSS). An embedded inertial measurement unit (IMU) is used for estimating orientation and motion behavior. This article contains two major contributions. First, a method for associating radar data to vulnerable road user (VRU) tracks is described. It is evaluated how accurately the system performs under different GNSS reception conditions and how carrying a reference system alters radar measurements. Second, the system is used to track pedestrians and cyclists over many measurement cycles in order to generate object centered occupancy grid maps. The reference system allows real world radar data distributions of VRUs to be generated much more precisely than with conventional methods. Hereby, an important step towards radar-based VRU tracking is accomplished.
1. Introduction
Autonomous driving is one of the major topics in current automotive research. In order to achieve excellent environmental perception, various techniques are being investigated.
Extended object tracking (EOT) aims to estimate length, width and orientation in addition to position and state of movement of other traffic participants and is, therefore, an important example of these methods. In the radar domain, research is usually focused on using data on a detection level, such as in [1] or [2]. Major problems of applying EOT to radar data are a higher sensor noise, clutter and a reduced resolution compared to other sensor types. Among other issues, this leads to a missing ground truth of the object's extent when working with non-simulated data. A workaround could be to test an algorithm's performance by comparing the points merged in a track with the data annotations gathered from data labeling. The data itself, however, suffers from occlusions and other effects which usually limit the major part of radar detections to the objects' edges that face the observing sensor. The object center can either be neglected in the evaluation process or it can be determined manually during the data annotation, i.e., labeling process. For abstract data representations as in this task, labeling is particularly tedious and expensive, even for experts. As estimating the object centers for all data clusters introduces even more complexity to an already challenging task, alternative approaches for data annotation become more appealing. To this end, this article proposes using a hand-held highly accurate global navigation satellite system (GNSS) which is referenced to another GNSS module mounted on a vehicle (cf. Fig. 1). The portable system is incorporated in a backpack that can be carried by vulnerable road users (VRU) such as pedestrians and cyclists. The GNSS positioning is accompanied by an inertial measurement unit (IMU) for orientation and motion estimation. This makes it possible to determine the relative positioning of vehicle and observed object and, therefore, to associate radar data and corresponding VRU tracks. In [3], the same hardware setup is used for automated data labeling of individual radar points in order to create a data set for machine learning applications. It was found that the internal position estimation filter which fuses GNSS and IMU is not well equipped for processing unsteady VRU movements, thus only GNSS was used there. In comparison to the work in [3], the task of finding a reference track in the radar data is more difficult. The requirements are stricter in this case because overestimating the area corresponding to the outlines of the VRUs is more critical. Therefore, this article aims to incorporate the IMU measurements after all. In particular, it is shown how IMU data can be used to improve the accuracy of separating VRU data from surrounding reflection points and how a fine-tuned version of the internal position filtering is beneficial in rare situations. The article consists of two major contributions. First, the proposed system for generating a track reference is introduced. Second, the GNSS reference system is used to analyze real world VRU behavior. To this end, the advantage of measuring stable object centers is used to generate object signatures for pedestrians and cyclists which are not based on erroneous tracking algorithms, but are all centered to a fixed reference point.
Figure 1: Automated ground truth estimation: GNSS positions of ego-vehicle and VRU are used to automatically determine the object center and an enclosing area in close proximity around the VRU's location.
2. Ground Truth Estimation
According to [3], the proposed reference system consists of the following components: VRUs and vehicle are each equipped with a device combining a GNSS receiver and an IMU for orientation estimation. VRUs comprise pedestrians and cyclists for this article. The communication between car and the VRU's receiver is handled via Wi-Fi.
The GNSS receivers use GPS and GLONASS satellites and real-time kinematic (RTK) positioning to reach centimeter-level accuracy. RTK is a more accurate version of GNSS processing which uses an additional base station with known location in close distance to the desired position of the so-called rover [4]. It is based on the assumption that most errors measured by the rover are essentially the same at the base station and can, therefore, be eliminated by using a correction signal that is sent from base station to rover. All system components for the VRU system except the antennas are installed in a backpack including a power supply. The GNSS antenna is mounted on a hat to ensure best possible satellite reception; the Wi-Fi antenna is attached to the backpack. Especially for the ego-vehicle, a complete pose estimation (position + orientation) is necessary for the correct annotation of global GNSS positions and radar measurements in sensor coordinates. For a complete track reference, the orientation of the VRU is also an essential component. Furthermore, both vehicle and VRU can benefit from a position update via IMU if the GNSS signal is erroneous or simply lost for a short period. Experiments in [3] revealed that the standard configuration of the internal position filter, which fuses both signals in the GNSS + IMU unit, is not well equipped for unsteady movements of VRUs, especially not for pedestrians. This quickly led to accumulating positioning errors. The internal filter uses several heuristics about minimum velocities and turning rates that are required for initialization and standstill detection. Exemplary trajectories of combined GNSS + IMU positioning with fine-tuned filter parameters versus pure GNSS can be found in Fig. 2, along with some examples of the data selection area which will be explained in the remainder of this section.
It is clearly visible that the combined GNSS + IMU and pure GNSS trajectories both remain on the preset eight-shaped course for regular measurements as depicted in the left plot of Fig. 2. In rare cases, the GNSS signal itself contains prominent errors. These errors may result, e.g., from multipath reflections, satellite occlusion, or high ionospheric activity. During the measurement campaign for this article this situation occurred only once within roughly ten recorded sequences. While such measurements would need to be repeated for pure GNSS processing, the second plot in Fig. 2 outlines the benefits of IMU-based position correction for erroneous GNSS measurements. A quantitative comparison of both methods is given in Sec. 4.
Figure 2: Five repetitions of estimated reference system trajectories based on GNSS position with and without usage of IMU. The left plot depicts a normal measurement cycle with smoothed GNSS signal. On the right, the benefits of a well calibrated position filter are visible when taking into account IMU data. (Panels: "Standard GNSS Measurement" and "Erroneous GNSS Measurement"; axes: x and y in car coordinates [m]; legend: pure GNSS reference, GNSS + IMU reference, selection shape example.)
Figure 3: Processing steps of ground truth estimation strategy: transform all data to car coordinates; position smoothing + interpolation; calculate capturing ellipses; label all radar points in ellipses; detection-based shape refinement.
Once all data from GNSS, IMU and radar is captured, VRU tracks have to be assigned to corresponding radar reflections. A basic overview is given in Fig. 3. At first, all data needs to be transformed to a common coordinate system, e.g., car coordinates. Then, GNSS, IMU, and combined information is smoothed with a moving average filter of length […] to remove jitter in positions and movements. The length corresponds to roughly […] for GNSS and […] for IMU data. Both durations are a trade-off between good smoothing characteristics and the expected interval of continuous VRU behavior. For each timestamp of each radar measurement the position is estimated by cubic spline interpolation. In the next step, an area around the position has to be defined. If ordinary car tracks were referenced with this method, a rigid selection area could be easily defined as car orientation and outer dimensions can usually be estimated very precisely. For VRUs though, the selection area is more volatile. Swinging body parts or turning handlebars of cyclists, for example, complicate defining a fixed enclosing structure. Hence, two different versions of non-rigid surrounding shapes are proposed for the VRUs under consideration. For a pedestrian, a Gaussian distribution is assumed, thus an ellipse is used with its major axis oriented in movement direction (yaw angle) φ. Major and minor axis of the ellipse are calculated from fixed minimum pedestrian dimensions (empirically determined: […] × […]) plus a variable extra length for swinging body parts defined by its velocity v and yaw rate φ̇:
\[ \mathrm{ax}_{\mathrm{major}} = \begin{cases} [\ldots] + |v| \cdot [\ldots], & \text{if } v \geq [\ldots].05\,\mathrm{m\,s^{-1}} \\ [\ldots], & \text{otherwise} \end{cases} \tag{1} \]
\[ \mathrm{ax}_{\mathrm{minor}} = \begin{cases} [\ldots] + |\dot{\varphi}| \cdot [\ldots], & \text{if } v \geq [\ldots].05\,\mathrm{m\,s^{-1}} \\ [\ldots], & \text{otherwise} \end{cases} \tag{2} \]
φ̇ can be directly adopted from the IMU data. Given stable position filtering, the reference system can also return good estimates for φ and v, which makes the more complicated derivation from pure GNSS data in [3] obsolete. The cyclist is labeled inside a rectangle with fixed length of […] oriented in driving direction φ and width of […] plus a variable amount based on its yaw rate:
\[ \mathrm{width}_{\mathrm{rect}} = 1.[\ldots] + |\dot{\varphi}| \cdot [\ldots] \tag{3} \]
As bikes usually cannot turn without driving, the derivation of φ assumes constant continuation of the cyclist's orientation for v < […].05 m s⁻¹, overruling the estimated yaw angle of the reference system. At each time step all radar detections that lie inside the defined regions are assigned the corresponding track. Lastly, the enclosing shape is stored as a bounding box that fits a symmetric ellipse-shaped area around the object center which is aligned with the VRU's orientation φ and includes all radar points of the current time step. This ensures that overestimating the outlines in previous steps has a lower impact on the estimated radar track.
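The labeling step of this section, i.e., testing whether a radar detection falls inside the pedestrian ellipse or the cyclist rectangle, can be illustrated with a minimal sketch. It is not the implementation used for the experiments. The function names are hypothetical, the axis values are treated as full lengths (an assumption), and the shape dimensions are passed in as plain parameters because they result from Eqs. (1) to (3). All numbers in the usage note are purely illustrative.

```python
import math


def to_object_frame(point, center, yaw):
    """Express a point given in car coordinates in a frame that is centered
    at `center` and whose x-axis points along the yaw angle (radians)."""
    dx, dy = point[0] - center[0], point[1] - center[1]
    cos_y, sin_y = math.cos(yaw), math.sin(yaw)
    # Rotation by -yaw aligns the movement direction with the x-axis.
    return cos_y * dx + sin_y * dy, -sin_y * dx + cos_y * dy


def in_ellipse(point, center, yaw, ax_major, ax_minor):
    """True if the point lies inside the ellipse (pedestrian case).
    Assumption: ax_major and ax_minor are full axis lengths."""
    x, y = to_object_frame(point, center, yaw)
    return (x / (ax_major / 2.0)) ** 2 + (y / (ax_minor / 2.0)) ** 2 <= 1.0


def in_rectangle(point, center, yaw, length, width):
    """True if the point lies inside the oriented rectangle (cyclist case)."""
    x, y = to_object_frame(point, center, yaw)
    return abs(x) <= length / 2.0 and abs(y) <= width / 2.0


def label_detections(detections, center, yaw, shape, dim_long, dim_lat):
    """Return one boolean per detection: assigned to the VRU track or not."""
    test = in_ellipse if shape == "ellipse" else in_rectangle
    return [test(p, center, yaw, dim_long, dim_lat) for p in detections]
```

For example, `label_detections([(10.0, 2.5), (10.5, 2.0)], (10.0, 2.0), math.pi / 2, "ellipse", 1.2, 0.8)` assigns only the first detection, because the second one lies outside the shorter lateral axis of a pedestrian moving along the y-axis.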
3. VRU Object Signatures
Sophisticated and sensor specialized measurement models allow Extended Object Tracking and fusion to obtain a higher precision and higher robustness as they allow a better interpretation of the measurements. Also, modeling errors can be significantly reduced. The distribution of car measurements can, e.g., be modeled as Gaussian distributed over the car's extent or it can be modeled in a more sophisticated way using several reflection centers [1]. Therefore, obtaining specialized measurement models of VRUs is particularly important. Knowing the position of an observed object in a traffic scenario makes it possible to extract its detections in every time step and transform them into an observed object coordinate system, where the x-axis corresponds to the direction of movement. The target signature of a pedestrian in Fig. 4 is obtained by accumulating all measurements over time in this coordinate system. All measurements of an eight-shaped movement as depicted in Fig. 2 are evaluated to include all aspect angles. In contrast to the target signatures shown in [2], the target signature is closer to a round Gaussian distribution, but without a clear peak at the object center. Frequent measurements occur in a circular area with 25 cm diameter around the origin. In contrast to cyclist and car signatures, the major axis is not clearly headed towards the movement direction. This obstructs the estimation of the object's orientation. In comparison, the cyclist in Fig. 5 is more elongated in driving direction. Both wheels seem to build two different peaks and also pedals and legs are roughly indicated. The cyclist signature is bent rightwards in its heading direction. This might be caused by GNSS inaccuracies or, more likely, by one loop providing more radar detections than the other half of the eight due to a position in the scanning area with higher resolution facilitating more detections per object per scan.
Figure 4: Pedestrian radar object signature calculated relative to GNSS+IMU position measurements. (Color bar: relative frequency.)
Figure 5: Cyclist radar object signature calculated relative to GNSS+IMU position measurements. (Color bar: relative frequency.)
The presented measurement models have to be further adjusted to sensor specific influences. The impact of varying aspect angles and distances has to be examined to create non-static sensor specific measurement models. Furthermore, a sophisticated range rate measurement model has to be developed for pedestrians and cyclists adjusted to non-uniform leg and arm movements. The GNSS+IMU position measurements can be exploited to obtain those sophisticated measurement models with real world data.
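The accumulation of detections into an object centered signature, as described in this section, can be sketched as follows. This is a simplified illustration under stated assumptions, not the processing chain behind Figs. 4 and 5. The function name, the grid cell size, and the grid extent are hypothetical choices, and each frame is assumed to provide the reference position, the yaw angle, and the detections already assigned to the VRU in car coordinates.

```python
import math


def accumulate_signature(frames, cell_size=0.1, half_extent=1.5):
    """Accumulate detections from many radar cycles in an object-centered grid.

    frames: iterable of (center, yaw, detections), where center is (x, y) in
    car coordinates, yaw is in radians and detections is a list of (x, y).
    Returns a square grid (list of rows) of relative frequencies; rows index
    the lateral (y) axis and columns the movement (x) axis of the object.
    """
    n_cells = int(round(2 * half_extent / cell_size))
    counts = [[0] * n_cells for _ in range(n_cells)]
    total = 0
    for (cx, cy), yaw, detections in frames:
        cos_y, sin_y = math.cos(yaw), math.sin(yaw)
        for px, py in detections:
            dx, dy = px - cx, py - cy
            # Rotate by -yaw so that the x-axis is the direction of movement.
            x = cos_y * dx + sin_y * dy
            y = -sin_y * dx + cos_y * dy
            col = int(math.floor((x + half_extent) / cell_size))
            row = int(math.floor((y + half_extent) / cell_size))
            # Detections outside the grid are ignored.
            if 0 <= row < n_cells and 0 <= col < n_cells:
                counts[row][col] += 1
                total += 1
    if total == 0:
        return [[0.0] * n_cells for _ in range(n_cells)]
    return [[c / total for c in row] for row in counts]
```

Because every frame is shifted and rotated by the reference pose before binning, detections from all aspect angles of the eight-shaped course end up in the same coordinate system, which is the property that a tracker-derived center cannot guarantee.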
4. System Evaluation
A series of experiments was conducted to evaluate the performance of the proposed reference system. In order to get measurements from all angles of the VRU, an eight-shaped path was chosen (cf. Fig. 2) and evaluated in three different scenarios consisting of more than 8500 radar cycles using two chirp sequence radar sensors operating at 77 GHz with resolutions of approximately 15 cm in range, […]° in azimuth angle, and […].17 m s⁻¹ in radial velocity:
1) Pedestrian walking at constant normal speed (with/without reference)
2) Cyclist driving at approximately […] (with/without reference)
3) Cyclist driving at approximately […] with GNSS perturbations (with reference)
Scene 3) occurred only by chance and is, hence, only available with the reference system. All measurements were hand-labeled by a human expert and additionally all measurements including a GNSS reference were annotated automatically with the proposed method. Several indicators are important for comparing the proposed method with conventional manual labeling. First, the accuracy of the method has to be compared against the ground truth obtained from manual labeling. Second, the differences in measured values for VRUs carrying or not carrying the reference system have to be estimated.
To determine the performance of the point-to-track assignment system, two scores were calculated. Let TP (true positives) be the number of points correctly assigned to a VRU, FP (false positives) the number of incorrectly assigned points, and FN (false negatives) the number of points that incorrectly have not been included in a track. Then, the precision of the method can be calculated as Pr = TP / (TP + FP) ∈ [0, 1] and the recall is Re = TP / (TP + FN) ∈ [0, 1]. The scores are calculated for scenarios 1) and 2) with attached GNSS reference. The macro-averaged results, i.e., averaged individual scores, yield a precision of […].53 % and a recall of […].61 % (cf. precision of […].48 % and recall of […].66 % in [3]). Note that it would certainly be possible to improve these scores for this data set by fine-tuning the parameters of the selection area. This could, however, easily result in overfitting on the given data set, i.e., a parameterization that would not generalize well to other data. Both scores, precision and recall, are identical up to the second decimal place for pure GNSS and combined GNSS + IMU referencing when only regarding the first two scenarios. In scene 3) the fluctuations of the GNSS signal deteriorate the selection scores to a precision of […].35 % and a recall of […].41 % when using the pure GNSS trajectory. However, the combined GNSS + IMU position yields a precision of […].49 % and a recall of […].71 %, which is close to perfect. Despite generally high accuracies for both methods, these findings suggest a higher stability against adverse environmental conditions when using a combined GNSS + IMU reference system. This is especially beneficial for a wider series of experiments as it results in a lower rejection rate of measurement files.
In order to determine how wearing the GNSS equipment alters measured values, manually labeled data of scenes 1) and 2) are compared for scenarios where the reference system was and was not worn. Important criteria for comparison are measured amplitudes, variations of Doppler values, the spatial extent, and the number of detections per measurement. Therefore, in [3] the mean reflected power compensated for free-space path loss using R^[…] correction, the standard deviation of Doppler values, the length of the major and minor axis of the 95 % confidence ellipse, and the number of detections weighted by the mean distance to the sensor are calculated for each measurement cycle. Unpaired t-tests on all data in [3] revealed the only statistically significant differences in the length of the minor axis of the 95 % confidence ellipse of pedestrians and in the standard deviation of Doppler values for cyclists. In order to evaluate the found differences more closely, Figs. 6 and 7 display the distribution of minor axis lengths and Doppler standard deviations, respectively. For the pedestrian it can be seen how the distribution of values for the minor axis of the confidence ellipse of all radar detections is slightly shifted towards larger values. This makes sense as this axis usually corresponds to the sagittal body direction which is directly elongated by the reference backpack. The same effect on the extent of a pedestrian should, however, also occur from any other ordinary backpack with, e.g., a laptop inside. For the cyclist a similar behavior of radial velocity standard deviations can be recognized, i.e., with the exception of the higher regions above […] m s⁻¹ the backpack distribution is shifted towards bigger values. This is less intuitive as it would be expected that a backpack enlarges the quasi-static torso and, therefore, leads to smaller Doppler deviations. A simple explanation for this behavior would be different driving speeds during these scenarios. In any case, the distributions still look very similar, suggesting that the observed differences introduce a bias, but are unlikely to make the measurements less relevant.
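The precision and recall scores defined earlier in this section, together with their macro average, can be reproduced with a few lines of code. The following is a minimal sketch with hypothetical function names. It assumes that each scenario is given as two equally long boolean sequences, one from the automatic assignment and one from manual labeling, that the macro average is taken over scenarios, and that a score with an empty denominator is reported as 0.0 (a convention chosen here, not taken from the article).

```python
def precision_recall(predicted, reference):
    """Compute (precision, recall) for one scenario.

    predicted[i] is True if radar point i was assigned to the VRU track
    automatically; reference[i] is True if the manual label says it belongs
    to the VRU.
    """
    pairs = list(zip(predicted, reference))
    tp = sum(1 for p, r in pairs if p and r)
    fp = sum(1 for p, r in pairs if p and not r)
    fn = sum(1 for p, r in pairs if not p and r)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def macro_average(scores):
    """Average individual (precision, recall) pairs with equal weight."""
    precisions, recalls = zip(*scores)
    return sum(precisions) / len(precisions), sum(recalls) / len(recalls)
```

With equal weights per scenario, a short sequence influences the result as much as a long one. A micro average, which pools TP, FP, and FN over all scenarios first, would weight every radar point equally instead.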
5. Extended Object Tracking
EOT denotes filtering techniques that supplement point tracking by tracking the objects' spatial extents and their orientations over time. It is sufficient for EOT in traffic scenarios to approximate the road users' extents with basic shapes such as ellipses, rectangles or circles. In Section 3 and [2], it was shown that ellipses are suitable forms for VRUs. Furthermore, as the accumulated target measurements are Gaussian distributed over the object's extent, the Random Matrix Model (RMM) [5], [6] seems promising for EOT of VRUs on the provided data. The RMM complements the centroid and kinematic state vector x_k with a symmetric positive definite matrix X_k that represents the object's extent. The object's length, width and orientation are obtained by principal component analysis of the extent matrix. The given scenario is very challenging due to the non-linear movement. Therefore, the RMM is combined with the constant turn motion model [7] and adjustments of the extent matrix according to the object motion [8].
Figure 6: Histogram of a pedestrian's extension along minor axis of confidence ellipse, scenario 1). (Axes: minor axis of detection confidence ellipses [m] vs. normed frequency [%]; legend: without backpack, with backpack.)
Figure 7: Histogram of a cyclist's standard deviations in mean Doppler velocity, scenario 2). (Axes: standard deviation of Doppler velocity [m s⁻¹] vs. normed frequency [%]; legend: without backpack, with backpack.)
Fig. 8 shows the results of pedestrian and cyclist tracking on the eight-shaped walk/ride scenarios 1) and 2). Both VRU centroids can be tracked accurately. Utilizing the Doppler velocity measurements reduces the RMSE drastically. Length and width are well estimated when the Doppler measurements are incorporated. Without them, stability is lost in the first 20 s. It can be observed that a wrong prior orientation first causes changes in axis length instead of a rotation. The tracking framework is optimized to maintain stable yaw rate and, thereby, orientation estimation. This makes it possible to handle non-linear movements and reduce orientation errors in contrast to [2] where non-linear movements caused estimation errors. Fig. 9 depicts the absolute yaw rate and orientation errors. Utilizing Doppler values, the yaw rate is determined very accurately but with a time delay. The orientation errors are very high for both VRUs. This shows that estimating the yaw rate accurately is not sufficient for the determination of object orientation. Hence, specific measurement models are needed to gain a stable orientation in challenging scenarios.
6. Conclusion
In this article, a method for fast reference generation for automotive radar tracking of VRUs was proposed. The system is based on the combination of two GNSS receivers mounted on the ego-vehicle and the VRU in combination with an IMU. Radar data is automatically assigned to a track if it falls within a close area around the VRU's estimated position. The selection area is determined by the kind of tracked VRU, i.e., pedestrian or cyclist, its speed and yaw rate. Experiments demonstrate the accuracy of the proposed method with precision and recall both over 99 %. The supplementary IMU kept the scores over 98 % during perturbations in the GNSS signal. The article concludes by using the generated reference to evaluate Extended Object Tracking on VRUs. By collecting more data, sophisticated measurement models can easily be generated. This allows developing more precise, robust, and faster extended object tracking methods in order to provide a higher safety level for all road users. Besides creating better VRU models, it is also planned to use several GNSS backpacks for tracking multiple VRUs in future work. This involves adapting the selection strategy to cope with situations where selection areas overlap.
References
[1] K. Granström, M. Baum, and S. Reuter, “Extended Object Tracking: Introduction, Overview and Applications,” Journal of Advances in Information Fusion, vol. 12, no. 2, pp. 139–174, 2017. [Online]. Available: http://arxiv.org/abs/1604.00970
[2] S. Haag, B. Duraisamy, W. Koch, and J. Dickmann, “Radar and Lidar Target Signatures of Various Object Types and Evaluation of Extended Object Tracking Methods for Autonomous Driving Applications,” in 2018 21st International Conference on Information Fusion (FUSION). Cambridge: IEEE, 2018, pp. 1746–1755.
[3] N. Scheiner, N. Appenrodt, J. Dickmann, and B. Sick, “Automated Ground Truth Estimation of Vulnerable Road Users in Automotive Radar Data Using GNSS,” in […]. Detroit, USA: IEEE, in press.
[4] T. Pany, Navigation Signal Processing for GNSS Software Receivers. Artech House Pub., 2010.
[5] W. Koch, “Bayesian Approach to Extended Object and Cluster Tracking using Random Matrices,” IEEE Transactions on Aerospace and Electronic Systems, vol. 44, no. 3, pp. 1042–1059, 2008.
[6] M. Feldmann, D. Fränken, and W. Koch, “Tracking of extended objects and group targets using random matrices,” IEEE Transactions on Signal Processing, vol. 59, no. 4, pp. 1409–1420, 2011.
[7] G. Zhai, H. Meng, and X. Wang, “A constant speed changing rate and constant turn rate model for maneuvering target tracking,” Sensors (Switzerland), vol. 14, no. 3, pp. 5239–5253, 2014.
[8] J. Lan and X. R. Li, “Tracking of Extended Object or Target Group Using Random Matrix, Part I: New Model and Approach,” in Fusion2012, 2012, pp. 2185–2192.