FAITH: Fast Iterative Half-plane Focus of Expansion Estimation Using Event-Based Optic Flow

Raoul Dinaux, Nikhil Wessendorp, Julien Dupeyroux, and Guido C. H. E. de Croon*

*All authors are with the Faculty of Aerospace Engineering, Delft University of Technology, Kluyverweg 1, 2629HS Delft, The Netherlands. [email protected]

Abstract — Course estimation is a key component for the development of autonomous navigation systems for robots. While state-of-the-art methods widely use visual-based algorithms, they fail to deal with the complexity of the real world: they are computationally greedy and sometimes too slow. They often require obstacles to be highly textured to perform well, particularly when the obstacle is located near the focus of expansion (FOE), where the optic flow (OF) is almost null. This study proposes the FAst ITerative Half-plane (FAITH) method to determine the course of a micro air vehicle (MAV). This is achieved by means of an event-based camera, along with a fast RANSAC-based algorithm that uses event-based OF to determine the FOE. The performance is validated on a benchmark in a simulated environment and then tested on a dataset collected for indoor obstacle avoidance. Our results show that the computational efficiency of our solution outperforms state-of-the-art methods while keeping a high level of accuracy. This has been further demonstrated onboard an MAV equipped with an event-based camera, showing that our event-based FOE estimation can be achieved online onboard tiny drones, thus opening the path towards fully neuromorphic solutions for autonomous obstacle avoidance and navigation onboard MAVs.
I. INTRODUCTION
Autonomous navigation, including path planning, obstacle avoidance, and localization, for both ground and aerial robots is considered one of the top ten technological challenges of our time [1]. Despite outstanding studies in this field, we still fail at tackling real-world scenarios, where lighting conditions can change abruptly, light can be absent, and where obstacles can severely hamper the performance of the navigation system running onboard the robot. Another crucial aspect in making an autonomous navigation system suitable for real-world applications is ensuring that the robot can deal with high speeds. This is precisely one of the bottlenecks for drone applications, where computational resources and energy usage are important factors for the viability of the proposed method. Lastly, the navigation system must be endowed with deep auto-adaptation skills to make it worth deploying onboard robots in complex environments that remain hard to model, or of which the core nature is simply not understood yet.

The limitations of navigation systems are multi-factorial, but the most important reason for this may be the sensing component itself. Observing the animal kingdom, one can note that each species has optimized its sensors to better evolve in its environment.
Fig. 1. An application of the FAITH method for fast and accurate FOE estimation in MAV flight towards a single pole. The left plot shows a video image from the MAV, flying towards a single pole in the TU Delft flying arena. The right plot shows an event-image of the pole with an overlay of optic flow vectors and the FOE estimation performed by our method.

For instance, birds and insects are sensitive to the polarization state of the skylight to estimate their course, dung beetles retrieve navigational cues from the Milky Way, and eagles have an extremely high visual acuity to better find their prey. In comparison, robots are often equipped with cameras whose temporal resolution (on the order of tens of frames per second) and visual resolution are limited. This gets even more crucial with small drones. Given their small dimensions and weight, micro air vehicles (MAVs) are safe to operate autonomously around humans in complex environments. Unfortunately, MAVs are endowed with highly restricted power capacity and extremely limited computational resources. Their use is also hampered by the risk of GPS failure indoors, magnetometer disturbances caused by surrounding ferrous materials (buildings, infrastructures), and IMU drift over time. Embedded cameras have greatly contributed to the reduction of navigation failure, but their use remains limited by the low computational resources available onboard. It is therefore crucial to determine fast and efficient methods to allow MAVs to autonomously navigate and avoid both static and moving obstacles.

The recent developments in neuromorphic systems represent a promising opportunity for autonomous obstacle avoidance and navigation onboard robots, in particular for MAVs. In this respect, event-based cameras were first released in 2008 by Lichtsteiner et al. [2]. Unlike conventional cameras, which output images at a fixed frame rate, event-based cameras produce a stream of asynchronous and independent events reporting changes in brightness at the pixel level [3]. Therefore, these cameras inherently capture the apparent motion. The intensity change threshold that triggers a pixel is user-defined. The events are labeled by the pixel location, trigger time, and a polarity (+1 for a positive change of brightness, −1 for a negative change). Event-based cameras offer a high dynamic range (> 120 dB) along with a high temporal resolution (in the range of microseconds). These advantages make event-based cameras inherently insensitive to classical visual artifacts such as motion blur or the tunnel effect. As a result, these cameras provide accurate visual information at extremely high speed, making them suitable for aerial robotics, including MAVs, for various tasks such as obstacle avoidance [4], [5] and visual odometry [6].

In this study, we propose the FAITH (FAst ITerative Half-plane) method to estimate the course of the MAV by means of an event-based camera (i.e., the DVS240C [7]), along with a fast RANSAC-based algorithm for the determination of the focus of expansion (FOE) using optic flow (OF) as an input (Fig. 1). Optic flow is the pattern of apparent motion of objects in a visual scene caused by the relative motion between the observer and the scene [8]. The FOE is defined as the singular point from which the apparent OF expands, assuming the scene is static and the motion of the observer is purely translational. This point indicates the course of the observer and is therefore a crucial element in visual-based navigation.
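For concreteness, the information carried by each event can be captured in a minimal record type. The Python sketch below uses our own field names, not those of a particular camera driver.

    from dataclasses import dataclass

    @dataclass
    class Event:
        # One DVS event as described above; field names are ours.
        x: int          # pixel column
        y: int          # pixel row
        t: int          # trigger time in microseconds
        polarity: int   # +1 brightness increase, -1 decrease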
Appendix I gives a theoretical background to OF and FOE estimation. Determining the FOE is challenging as only normal flow is available, and the computational limitations of the MAV do not allow for expensive online visual processing.

The determination of the FOE onboard mobile systems equipped with cameras has received large attention from researchers over the past decades, showing a great variety of approaches to solve this very complex problem. In the following study, we focus on sparse OF-based FOE estimation, for which state-of-the-art solutions can be divided into three categories: (i) counting vector directions [9], [10], (ii) creating a probability map based on negative vector intersections, and (iii) using negative half-planes.

Methods relying on counting vectors show limited performance when exploited in online MAV applications. To reduce the computational cost of online FOE estimation, methods based on probability maps seem to be a promising alternative. Guzel et al. [11] proposed to compute a probability map based on the number of OF vector intersections per location. They demonstrated the performance of their method through a navigation task with a ground robot equipped with a camera. A similar method was implemented by Buczko et al. [12], where a RANSAC scheme randomly selects two OF vectors, creates a candidate FOE location by calculating their intersection, and tests this location against all OF vectors. After a predetermined number of iterations, the candidate with the highest number of inliers is selected as the FOE. Results obtained with an RGB camera showed a translation error as low as 0.81%. Yet, using vector intersections remains a limited solution to FOE estimation, since OF estimation on natural scenes is a complex task and the resulting estimates (normal flow) can differ from the true flow.

Fig. 2. Schematic example of an FOE estimation by the FAITH method. The arrows represent normal optic flow, the dotted lines their orthogonal half-planes. As the centre of the potential FOE area lies within three half-planes, the iteration score of this estimation is three.
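To make the intersection step of [12] concrete, the following Python sketch (our own function names and thresholds, not the original implementation) computes a candidate FOE from two flow vectors and counts how many vectors point away from it.

    import numpy as np

    def candidate_foe(p1, f1, p2, f2):
        # Intersect the two lines through points p1, p2 along flow
        # directions f1, f2; returns None for (near-)parallel flow.
        A = np.column_stack((f1, -f2))
        if abs(np.linalg.det(A)) < 1e-9:
            return None
        t1, _ = np.linalg.solve(A, p2 - p1)
        return p1 + t1 * f1

    def count_inliers(foe, points, flows, max_angle=0.3):
        # A vector is an inlier if it points (roughly) away from the
        # candidate FOE, i.e. it aligns with the radial direction.
        radial = points - foe
        cos = np.einsum('ij,ij->i', radial, flows) / (
            np.linalg.norm(radial, axis=1) * np.linalg.norm(flows, axis=1) + 1e-12)
        return int(np.sum(cos > np.cos(max_angle)))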
To compensate for this, it has been proposed to build the probability map using negative half-planes [13]. As the normal flow is computed, the assumption is made that the FOE must lie in the negative half-plane of as many normal OF vectors as possible. For each OF vector, a line orthogonal to the vector and passing through its location is taken. The negative half-plane of this orthogonal line is used to update the probability map. All locations which are not updated are subject to exponential decay over time. The location with the highest value on the probability map is selected as the FOE. This method has been used with event-based cameras to estimate time-to-contact (TTC) in the context of obstacle avoidance with MAVs [14]. Although the negative half-plane approach improves the course estimation, the additional computation introduced by the plane estimation and intersection considerably affects the overall performance.
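A minimal sketch of such a half-plane probability-map update might look as follows, assuming a dense map the size of the image; the decay rate and unit vote are illustrative values. Note that each flow vector touches every pixel of the map, which is exactly the cost analyzed in Section II-B.

    import numpy as np

    def update_probability_map(prob, point, flow, decay=0.99):
        # One update of a half-plane FOE probability map (sketch):
        # vote for every pixel in the negative half-plane of the line
        # through `point` orthogonal to `flow`; decay all other pixels.
        h, w = prob.shape
        ys, xs = np.mgrid[0:h, 0:w]
        in_half_plane = (xs - point[0]) * flow[0] + (ys - point[1]) * flow[1] < 0
        prob[in_half_plane] += 1.0
        prob[~in_half_plane] *= decay  # exponential decay elsewhere
        return prob

    # FOE estimate = location of the maximum of the map:
    # y_foe, x_foe = np.unravel_index(np.argmax(prob), prob.shape)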
Contributions – We propose the FAITH method for FOE estimation based on negative half-plane intersections, further optimized by means of a RANSAC process. Our contributions are:
(a) a novel course estimation algorithm (FAITH) that is highly computationally efficient, runs in real time onboard robots (including MAVs), and provides a robust estimate of the FOE even with poorly textured obstacles;
(b) an exhaustive assessment of the overall performance of the FAITH algorithm, first using the ESIM event-based camera simulator [15], and then using an extensive dataset collected in the TU Delft flying arena equipped with the OptiTrack motion tracking system;
(c) a real-world demonstration of the performance onboard an MAV designed for the purpose of this study.
II. MATERIALS AND METHODS
A. The proposed FAITH method
We apply an event-surface method to compute the local normal flow based on visual events streamed by an event-based camera [16], [17]. When an OF vector is available, we assume the FOE lies in the negative half-plane delimited by the straight line orthogonal to the OF vector (Fig. 2). The aperture problem limits OF on edges to be normal to the edge, whereas only OF on corners results in true OF. Therefore, the assumption is made that the FOE must lie in the negative half-plane of a line orthogonal to the OF vector. We then build on the approach from [13] to compute a probability region for the FOE estimation. As MAVs are limited by their computational resources, the ego-motion estimation should require as little computation as possible while still assuring accuracy. Because we are using an event-based camera, the algorithm implemented in [13], which updates all pixels of the probability map for each OF vector, will inevitably lead to large computational needs.

To limit the computational cost, we propose to apply a RANSAC scheme to create an FOE area by taking the intersection of the negative half-planes of two randomly chosen vectors (Algorithm 1). A new OF vector is then chosen and the intersection with its negative half-plane is computed. If the new vector reduces the size of the FOE area, it is added as a new boundary. This process is continued until the newly chosen vector does not reduce the size of the FOE area. Then the center of this area is calculated (Fig. 2) and an iteration score is assigned by counting in how many negative half-planes the FOE estimate lies. The score and center position are saved and another iteration is performed. After a user-defined number of iterations, the search is stopped and the iteration with the highest score is chosen as the best FOE candidate.

Algorithm 1: FAITH method for FOE estimation

    for iterations do
        stop_search = false
        Pick two random OF vectors.
        Calculate current FOE area (bounded by the negative half-planes
        orthogonal to the selected vectors).
        while stop_search == false do
            Pick a new random vector.
            Calculate new area (bounded by the current FOE area and the
            negative half-plane of the selected vector).
            if new area < current FOE area then
                current FOE area = new area
            else
                Calculate score as the total number of half-planes the
                center of the new area lies in.
                stop_search = true
                if score > max score then
                    max score = score
                    best area = new area
    FOE = center of best area
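The sketch below mirrors Algorithm 1 in Python, under our own design choices: the FOE area is represented as a convex polygon initialized to the image rectangle and clipped by one negative half-plane per selected vector (Sutherland-Hodgman clipping), and the area center is approximated by the vertex mean. It is a sketch of the procedure, not the reference implementation.

    import random
    import numpy as np

    def clip_polygon(poly, point, normal):
        # Sutherland-Hodgman clip of convex polygon `poly` against the
        # negative half-plane {q : (q - point) . normal <= 0}.
        out = []
        for i in range(len(poly)):
            a, b = poly[i], poly[(i + 1) % len(poly)]
            da, db = np.dot(a - point, normal), np.dot(b - point, normal)
            if da <= 0:
                out.append(a)
            if da * db < 0:  # edge crosses the boundary line
                out.append(a + (b - a) * (da / (da - db)))
        return out

    def polygon_area(poly):
        # Shoelace formula; returns 0 for degenerate polygons.
        x = np.array([p[0] for p in poly])
        y = np.array([p[1] for p in poly])
        return 0.5 * abs(np.dot(x, np.roll(y, -1)) - np.dot(y, np.roll(x, -1)))

    def faith_foe(points, flows, img_rect, n_iters=52, seed=0):
        # points, flows: (N, 2) arrays of OF locations and vectors.
        # img_rect: image boundary as a list of 4 corners, e.g.
        # [np.array(c, float) for c in [(0, 0), (240, 0), (240, 180), (0, 180)]]
        rng = random.Random(seed)
        idx = list(range(len(points)))
        best_score, best_center = -1, None
        for _ in range(n_iters):
            rng.shuffle(idx)
            area = img_rect
            for i in idx[:2]:  # seed the FOE area with two half-planes
                area = clip_polygon(area, points[i], flows[i])
            for i in idx[2:]:  # shrink until a vector stops reducing the area
                new_area = clip_polygon(area, points[i], flows[i])
                if new_area and polygon_area(new_area) < polygon_area(area):
                    area = new_area
                else:
                    break
            if not area:
                continue
            center = np.mean(np.asarray(area), axis=0)  # center of the area
            # Iteration score: number of half-planes the center lies in.
            score = int(np.sum(np.einsum('ij,ij->i', center - points, flows) <= 0))
            if score > best_score:
                best_score, best_center = score, center
        return best_center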
B. Computational complexity analysis
Given that our proposed method extends the one introduced by Clady et al. [13], we determined the computational complexity of both algorithms to assess their overall computational performance. For each vector, the method by Clady et al. updates all locations of a probability map. Therefore the computational complexity of this method can be written as O(N · M_p), with N the number of OF vectors and M_p the number of pixels in the probability map.

In our method, the majority of the computational complexity lies in checking how many inliers the candidate locations have. This depends on the total number of OF vectors and candidate locations, the latter being equal to the user-defined number of iterations. It can therefore be expressed as O(N · I), with N the number of OF vectors and I the number of iterations (i.e., potential FOE locations). To get an estimate of I, the theoretical minimum number of RANSAC iterations required to construct a proper model with a chosen probability is:

I = log(1 − p) / log(1 − w^n)     (1)

where I is the required number of iterations, p is the probability of selecting a proper model, w is the ratio between inliers and the total set, and n is the number of inliers required for a proper model. This formula can be seen as a theoretical upper bound, as it assumes that the random selection of vectors can include already chosen vectors. For example: requiring a probability of 95% to find a proper model, assuming at least 10 vectors are required for creating a proper model, and assuming that 75% of the total set consists of inliers, the total number of iterations required is 52. Comparing the computational complexity of both methods (assuming a 240 × 180 pixel probability map) shows that the method implemented by Clady et al., O(43200 · N), is a few orders of magnitude more complex than the proposed method, O(52 · N). Therefore, it is concluded that in the general case the theoretical computational complexity of our method is lower, and user-defined.
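As a quick check of the worked example, Eq. 1 can be evaluated directly; the snippet below reproduces the 52 iterations quoted above (rounding up to a whole iteration).

    import math

    def ransac_iterations(p=0.95, w=0.75, n=10):
        # Eq. 1, rounded up to a whole number of iterations.
        return math.ceil(math.log(1 - p) / math.log(1 - w ** n))

    print(ransac_iterations())  # -> 52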
III. PERFORMANCE BENCHMARK

To assess the performance of our method, we first test it in a virtual environment featuring an event-based camera simulator. Then, we demonstrate its robustness by testing it on a manually controlled obstacle avoidance dataset that we collected in our indoor flying arena equipped with the OptiTrack motion capture system (see Supplementary Materials). Lastly, the online performance is demonstrated by testing the method onboard an MAV equipped with a DVS240 camera in an autonomous obstacle detection and avoidance task. For the sake of experimental simplification, the motion of the MAV is bounded to translation in the horizontal plane and the FOE is kept within the camera FOV. Appendix II discusses the impact of this assumption. While assessing the performance of our proposed method, we compare it with state-of-the-art FOE estimation methods. As described in Section I, three categories of FOE estimation methods using sparse normal flow are identified. Therefore, we implement the following three algorithms and test them on both the simulated and real-world datasets: (i) the vector counting method from Huang et al. [10], (ii) the probability map method based on vector intersections implemented by Buczko et al. [12], and (iii) the negative half-planes method introduced by Clady et al. [13], further referred to as 'NESW', 'Vec. Intersections', and 'Half-planes', respectively.

Fig. 3. Rendering of four scenes used in the simulated benchmark: (A) the TU Delft flying arena, (B) a kitchen, (C) a set of storage shelves, and (D) a wood warehouse.
A. Benchmark on simulated data
First, the FAITH method is benchmarked in a simulated environment using the ESIM event-based camera simulator [15], provided with the specifications of the DVS240 event-based camera used in our indoor obstacle avoidance dataset. Four distinct 3D scenes are exported from the open-source software Blender to .obj files. These scenes have different textures and layouts to ensure the diversity of environments (Fig. 3). We then provide these scenes to the ESIM simulator along with 100 flight trajectories (camera coordinates over time in a .csv file). To test the robustness of the methods to different FOE locations, the trajectories are chosen such that the FOE covers all course angles in the FOV. Both straight trajectories (with different yaw angles) and sway trajectories (varying the FOE during simulation) are used. The ground truth FOE is known from the simulated trajectory and camera pose.

The results of these N = 100 simulations are shown in Figure 4 and Table I. The mean course angle estimation error is compared for the four methods. The FAITH method shows state-of-the-art accuracy (Table I), while the worst accuracy is achieved by the 'Vec. Intersections' method. Fig. 4-B shows the mean computation time per 1000 vectors. This confirms the large reduction in computational effort for our method, in line with the theoretical insight detailed in Section II-B. It also clearly demonstrates the computational efficiency of the 'Vec. Intersections' method, which also does not update all probability map pixels and uses a RANSAC scheme. In contrast, the mean course estimation error is significantly larger for the 'Vec. Intersections' method, confirming that using the OF vectors instead of half-planes decreases the accuracy (normal vs. true flow).

B. Benchmark on an event-based obstacle avoidance dataset
The event-based domain requires new approaches and datasets due to the sparse, asynchronous event representation. To address this challenge, a novel obstacle avoidance dataset using a real event-based camera was recorded (see Supplementary Materials). It consists of manual obstacle avoidance runs performed with an MAV equipped with an event-based camera (DVS240), a 24-GHz radar sensor, a Full-HD RGB camera, a 6-axis IMU, and OptiTrack data for position and attitude ground truth. The obstacles consist of one or two 50-cm wide poles, of which the ground truth location is known (Fig. 5). Each trial consists of approximately 10 seconds of recording.

The benchmark on the obstacle avoidance dataset is performed by comparing the four methods. The ground truth FOE is available as the OptiTrack system tracks both the pose and the position of the MAV during the trials. The trials contain a variety of trajectories, obstacles, and backgrounds to ensure diversity of environment and motion.

The results of this benchmark, shown in Figure 6 and Table I, confirm those obtained with the simulator, indicating that the FAITH method outperforms the others. The comparison also shows that the FOE estimation accuracy of all methods on the live recorded data is lower than on simulated data. This is a consequence of multiple factors, such as the higher amount of noise from the DVS240 camera, vibrations caused by the propellers of the MAV, and the increased sparsity of the OF due to the texture and lighting conditions of the scene. In contrast, the relative performance of the methods does not change: our method is still the most accurate and computationally efficient.

Fig. 4. Comparison of the overall performance of our FOE estimation method with three other state-of-the-art methods, after testing over 100 distinct trajectories in the four simulated environments (Fig. 3). (A) Average angular error (in degrees) in the FOE estimation. (B) Mean computation time (in seconds) required to process OF vectors.
TABLE I. OVERALL PERFORMANCE OBTAINED WITH THE FAITH METHOD, COMPARED TO OTHER STATE-OF-THE-ART FOE ESTIMATION METHODS, FOR BOTH THE SIMULATED BENCHMARK (ESIM) AND OUR OBSTACLE AVOIDANCE DATASET.

    Method                  | ESIM benchmark (N = 100)           | Obstacle avoidance dataset (N = 1300)
                            | Angular error  | Computation time  | Angular error  | Computation time
    NESW [10]               | ...° ± ...°    | ... ± ... s       | ...° ± ...°    | ... ± ... s
    Vec. Intersections [12] | ...° ± ...°    | ... ± ... s       | ...° ± ...°    | ... ± ... s
    Half-planes [13]        | ...° ± ...°    | ... ± ... s       | ...° ± ...°    | ... ± ... s
    FAITH                   | ...° ± ...°    | ... ± ... s       | ...° ± ...°    | ... ± ... s

Fig. 5. Representation of 78 sample trajectories from the obstacle avoidance dataset. This dataset is used to validate the performance of the FOE estimation method. The MAV is controlled manually and two poles in the center of the TU Delft flying arena are avoided.

Fig. 6. Comparison of the overall performance of the FAITH method with three other state-of-the-art FOE estimation methods, after testing over 1300 samples of our obstacle avoidance dataset. (A) Average angular error (in degrees) in the FOE estimation. (B) Mean computation time (in seconds) required to process OF vectors.
C. Experiment onboard MAV
To show the onboard performance of the FAITH method, it is implemented within the ROS (Robot Operating System) framework using C++ and used in an autonomous obstacle avoidance task. The MAV is set to fly straight forward at a constant velocity in the flying arena and encounters a pole approximately halfway. The obstacle avoidance algorithm then detects the pole and issues an avoidance command to the iNav autopilot running onboard the MAV.
1) Obstacle avoidance strategy:
A straightforward obstacle avoidance algorithm is designed using OF as input and an avoidance course as output, which is fed to the iNav autopilot. In order to detect an obstacle, OF is clustered based on the concatenated image coordinates (normalized between 0 and 1) and TTC (normalized by mean and variance). The FOE is estimated using the FAITH method, and the TTC is calculated from this FOE estimate (Appendix I-C). To cluster the vectors, we apply Density-Based Spatial Clustering of Applications with Noise (DBSCAN) [18] with minPts = 20, a fixed ε, and a Euclidean distance measure. High TTC values are clipped to a user-defined maximum. The mean TTC of each cluster is calculated, and the cluster with the lowest TTC is assumed to be the highest-priority obstacle. A bounding box is drawn around this obstacle cluster. If the FOE location is within the obstacle region and the mean obstacle TTC is below a user-defined threshold, the algorithm issues a 1.5-second roll command to the autopilot to avoid the obstacle. The sign of the roll command is determined by selecting the direction towards the cluster with the highest mean TTC.
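A sketch of this clustering step using scikit-learn's DBSCAN is given below; ε, the TTC cap, and the resolution-based normalization are illustrative assumptions rather than the exact onboard settings.

    import numpy as np
    from sklearn.cluster import DBSCAN

    def cluster_obstacles(xy, ttc, eps=0.1, min_pts=20, ttc_max=5.0):
        # xy: (N, 2) pixel coordinates of OF vectors; ttc: (N,) TTC values.
        # eps and ttc_max are illustrative, not the onboard settings.
        ttc = np.clip(ttc, None, ttc_max)             # clip high TTC values
        feats = np.column_stack((
            xy / np.array([240.0, 180.0]),            # coords normalized to [0, 1]
            (ttc - ttc.mean()) / (ttc.std() + 1e-9),  # TTC normalized by mean/var
        ))
        labels = DBSCAN(eps=eps, min_samples=min_pts).fit_predict(feats)
        clusters = [c for c in set(labels) if c != -1]
        if not clusters:
            return labels, None
        # The cluster with the lowest mean TTC is the priority obstacle.
        target = min(clusters, key=lambda c: ttc[labels == c].mean())
        return labels, target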
2) Hardware architecture:
The MAV is a quadrotor built upon the GEPRC FPV frame kit Mark4, featuring the Kakute F7 Tekko ESC Combo v1.5 flashed with the iNav autopilot. The embedded CPU is an Intel Up board (64-bit Intel Atom x5 Z8350 1.92 GHz processor) running the Ubuntu 18.04 LTS operating system. An overview of the hardware architecture is provided in Fig. 8. The board is used for data acquisition and processing, autonomous navigation, and wireless communication with the host machine. All MAV-related embedded processing, i.e., DVS data acquisition, OF and FOE estimation, obstacle avoidance, and navigation, is performed within the ROS (Robot Operating System) framework. As for the obstacle avoidance dataset, the MAV is equipped with the DVS240 event-based camera (240 × 180 pixels). The altitude of the drone is controlled separately with a downward-facing micro LiDAR (TFMini, QWiic).

Fig. 7. Example of DBSCAN clustering for the onboard FOE estimation experiment, implementing the FAITH method in an obstacle avoidance task. (A) Frame-based image of the pole. (B) Event-based image of the pole. (C) Clustering of optic flow based on TTC and position. (D) Clusters, mapped to a spatial plot. As Cluster 2 has the lowest mean TTC, it is identified as the (highest-priority) object and a bounding box is drawn.

Fig. 8. Hardware architecture of the MAV. The visual processing and the obstacle avoidance algorithms are processed onboard Intel's Up board within the ROS environment. A switch allows the user to switch from manual control to autonomous mode. In both cases, the altitude is kept constant by means of the micro LiDAR.
3) Experimental setup:
During the experiments, the MAV ground truth position and attitude are determined by the OptiTrack motion capture system installed in the flying arena. A pole, of which the ground truth location is known, is positioned in the center of the flying arena. The MAV is set to autonomously fly along a straight trajectory, from 12 different starting positions and headings. 60% of the trajectories are designed as collision courses with the pole, while the remaining 40% concern a near pass of the pole. This configuration is meant to qualitatively assess the robustness of the FOE estimation and obstacle avoidance methods in real-world conditions.
Fig. 9. 20 successful autonomous obstacle avoidance trajectories using the FAITH method to estimate the FOE. This shows the successful implementation of our method in an obstacle avoidance task.
4) Results:
The autonomous obstacle avoidance method, using FAITH to estimate the FOE, is shown to perform a successful obstacle avoidance manoeuvre in 80% of the runs (20 out of 25). The faulty runs are a result of the low-textured scene, which impedes the FOE estimation. When the potential FOE area (Fig. 2) is not fully bounded by OF, the FOE estimation becomes less accurate. This also influences the TTC estimation and subsequently deteriorates the clustering quality. As a result, occasionally, when no fully bounding OF is generated in the scene, the object is not detected correctly. Fig. 9 shows the trajectories of the 20 successful obstacle avoidance runs. This figure shows the ability of the MAV to autonomously determine its course using the FAITH method and avoid the object, demonstrating the successful onboard performance of our method in a real-time obstacle avoidance task.
IV. CONCLUSION AND FUTURE WORK
We introduced the novel FAITH method to determine the course of an MAV by means of an event-based camera, along with a fast RANSAC-based algorithm for the determination of the FOE. Using event-based normal OF as input, the method is able to efficiently estimate the course of the MAV. The accuracy and computational performance are validated by performing a benchmark using both simulated event-based camera data and a novel live obstacle avoidance dataset containing real sensor data. On both simulated and real event-based camera data, the FAITH method shows state-of-the-art accuracy, with beyond state-of-the-art computational performance.

We further tested our method in an obstacle avoidance task onboard an MAV, successfully demonstrating its real-time performance. The limitations of OF-based strategies in low-textured environments remain a bottleneck for autonomous MAV applications, as also suggested by the results obtained with our dataset.
ACKNOWLEDGMENT
This work is part of the Comp4Drones project and has received funding from the ECSEL Joint Undertaking (JU) under grant agreement No. 826610. The JU receives support from the European Union's Horizon 2020 research and innovation program and Spain, Austria, Belgium, Czech Republic, France, Italy, Latvia, and the Netherlands.
SUPPLEMENTARY MATERIALS
The ROS implementation of FAITH can be found at https://github.com/tudelft/faith, along with the supporting video at https://youtu.be/X09mIqoqAFU. The Obstacle Detection and Avoidance dataset is available at https://github.com/tudelft/ODA_Dataset.

REFERENCES

[1] G.-Z. Yang, J. Bellingham, P. E. Dupont, P. Fischer, L. Floridi, R. Full, N. Jacobstein, V. Kumar, M. McNutt, R. Merrifield et al., "The grand challenges of Science Robotics," Science Robotics, vol. 3, no. 14, p. eaar7650, 2018.
[2] P. Lichtsteiner, C. Posch, and T. Delbruck, "A 128×128 120 dB 15 µs latency asynchronous temporal contrast vision sensor," IEEE Journal of Solid-State Circuits, vol. 43, no. 2, pp. 566–576, 2008.
[3] G. Gallego, T. Delbruck, G. M. Orchard, C. Bartolozzi, B. Taba, A. Censi, S. Leutenegger, A. Davison, J. Conradt, K. Daniilidis, and D. Scaramuzza, "Event-based vision: A survey," IEEE Transactions on Pattern Analysis and Machine Intelligence, pp. 1–1, 2020.
[4] D. Falanga, K. Kleber, and D. Scaramuzza, "Dynamic obstacle avoidance for quadrotors with event cameras," Science Robotics, vol. 5, no. 40, 2020.
[5] A. Mitrokhin, C. Fermüller, C. Parameshwara, and Y. Aloimonos, "Event-based moving object detection and tracking," in 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). IEEE, 2018, pp. 1–9.
[6] A. R. Vidal, H. Rebecq, T. Horstschaefer, and D. Scaramuzza, "Ultimate SLAM? Combining events, images, and IMU for robust visual SLAM in HDR and high-speed scenarios," IEEE Robotics and Automation Letters, vol. 3, no. 2, pp. 994–1001, 2018.
[7] C. Brandli, R. Berner, M. Yang, S.-C. Liu, and T. Delbruck, "A 240×180 130 dB 3 µs latency global shutter spatiotemporal vision sensor," IEEE Journal of Solid-State Circuits, vol. 49, no. 10, pp. 2333–2341, 2014.
[8] J. J. Gibson, P. Olum, and F. Rosenblatt, "Parallax and perspective during aircraft landings," The American Journal of Psychology, vol. 68, no. 3, pp. 372–385, 1955.
[9] K. Souhila and A. Karim, "Optical flow based robot obstacle avoidance," International Journal of Advanced Robotic Systems, vol. 4, no. 1, p. 2, 2007.
[10] R. Huang and S. Ericson, "An efficient way to estimate the focus of expansion," 2018, pp. 691–695.
[11] M. Serdar Guzel and R. Bicker, "Optical flow based system design for mobile robots," 2010, pp. 545–550.
[12] M. Buczko and V. Willert, "Monocular outlier detection for visual odometry," in 2017 IEEE Intelligent Vehicles Symposium (IV), 2017, pp. 739–745.
[13] X. Clady, C. Clercq, S.-H. Ieng, F. Houseini, M. Randazzo, L. Natale, C. Bartolozzi, and R. Benosman, "Asynchronous visual event-based time-to-contact," Frontiers in Neuroscience, vol. 8, 2014.
[14] F. Colonnier, L. D. Vedova, R. Teo, and G. Orchard, "Obstacle avoidance using event-based visual sensor and time-to-contact processing," 2018.
[15] H. Rebecq, D. Gehrig, and D. Scaramuzza, "ESIM: an open event camera simulator," in Conference on Robot Learning, 2018, pp. 969–982.
[16] R. Benosman, C. Clercq, X. Lagorce, S. Ieng, and C. Bartolozzi, "Event-based visual flow," IEEE Transactions on Neural Networks and Learning Systems, vol. 25, no. 2, pp. 407–417, 2014.
[17] B. Hordijk, K. Scheper, and G. de Croon, "Vertical landing for micro air vehicles using event-based optical flow," Journal of Field Robotics, vol. 35, pp. 69–90, 2018.
[18] M. Ester, H.-P. Kriegel, J. Sander, and X. Xu, "A density-based algorithm for discovering clusters in large spatial databases with noise," in Proceedings of the Second International Conference on Knowledge Discovery and Data Mining (KDD-96). AAAI Press, 1996, pp. 226–231.
[19] H. C. Longuet-Higgins and K. Prazdny, "The interpretation of a moving retinal image," Proceedings of the Royal Society of London. Series B, Biological Sciences, vol. 208, pp. 385–397, 1980.

APPENDIX I
OPTIC FLOW, FOCUS OF EXPANSION AND TIME-TO-CONTACT THEORY
A. Optic Flow Theory
Optic flow (OF) consists of two components, due to translation and rotation. The OF generated by translation gives information about the scene and the ego-motion of the observer. In contrast, the OF generated by rotation does not provide any insight into translational ego-motion. Therefore, the OF in this research is derotated using an onboard IMU such that only translational OF is used. This research uses an event-surface method, first proposed by Benosman et al. [16], and later improved for online application by Hordijk et al. [17]. This method generates sparse normal OF.

In order to describe the underlying geometry of OF, an arbitrary point from the 3D world is projected onto a 2D surface. The projected point on the surface has the following coordinates (Fig. 10):

x = X/Z,   y = Y/Z     (2)

To determine the motion of this point, the equation above is differentiated with respect to time:

ẋ = Ẋ/Z − XŻ/Z²,   ẏ = Ẏ/Z − YŻ/Z²     (3)

Values for Ẋ, Ẏ, and Ż can be derived (for a derivation see Longuet-Higgins et al. [19]), resulting in the following OF equations:

u = −U/Z + xW/Z + Axy − Bx² − B + Cy = u_T + u_R
v = −V/Z + yW/Z − Cx + A + Ay² − Bxy = v_T + v_R     (4)

Note that these equations consist of a translational (u_T, v_T) and a rotational (u_R, v_R) component. The rotational component is a result of camera rotations and does not contain information about the ego-motion of the observer. This effect is compensated in this research by using the known ego-rotation from an onboard Inertial Measurement Unit.
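Assuming the rotation rates (A, B, C) about the camera axes are known from the gyroscope, the derotation amounts to subtracting the rotational terms of Eq. 4; a minimal sketch (our own function name):

    import numpy as np

    def derotate_flow(x, y, u, v, A, B, C):
        # Subtract the rotational terms of Eq. 4, given gyro rates
        # (A, B, C) about the camera axes; x, y are normalized image
        # coordinates and u, v the measured flow components.
        u_rot = A * x * y - B * (x ** 2 + 1) + C * y
        v_rot = A * (1 + y ** 2) - B * x * y - C * x
        return u - u_rot, v - v_rot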
When an observer translates through a static scene, the OF diverges from a singular point called the FOE. At this location in the image the OF is zero, and all OF is directed outwards. This position is an indication of the course of the observer (Fig. 10).

Fig. 10. Optic flow reference system. From Longuet-Higgins et al. [19].

If the OF is zero and the rotational component is filtered out, the following derivation is made using Eq. 4:

u_T = 0 = −U/Z + x_FOE · W/Z
v_T = 0 = −V/Z + y_FOE · W/Z     (5)

Rewriting these equations gives the following result:

x_FOE = U/W,   y_FOE = V/W     (6)

To show that the OF diverges from the FOE, (5) and (6) are used to re-express u_T and v_T:

u_T = −U/Z + xW/Z = (−U/W + x) · W/Z = (x − x_FOE) · W/Z
v_T = −V/Z + yW/Z = (−V/W + y) · W/Z = (y − y_FOE) · W/Z     (7)

Rewriting this equation shows the geometrical relation which results in the OF diverging from the FOE:

u_T / v_T = (x − x_FOE) / (y − y_FOE)     (8)

This geometrical relation is used as the basis for the methods discussed in the benchmark (Section III).
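For illustration, Eq. 8 can be rearranged into one linear constraint per flow vector, v·x_FOE − u·y_FOE = v·x − u·y, and solved for the FOE in a least-squares sense. This baseline assumes true flow rather than normal flow, which is precisely why FAITH relies on half-planes instead; the sketch below is ours, not one of the benchmarked methods.

    import numpy as np

    def foe_least_squares(x, y, u, v):
        # Stack one constraint v*x_FOE - u*y_FOE = v*x - u*y per vector
        # and solve for (x_FOE, y_FOE) in a least-squares sense.
        A = np.column_stack((v, -u))
        b = v * x - u * y
        foe, *_ = np.linalg.lstsq(A, b, rcond=None)
        return foe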
C. Time-To-Contact Theory
The Time-To-Contact (TTC) is a property of each point in an image, describing its relative velocity along the camera principal axis. Eq. 7 can be rewritten to the following equation for divergence:

W/Z = u_T / (x − x_FOE) = v_T / (y − y_FOE)     (9)

The divergence is inversely related to the TTC:

τ = Z/W     (10)

In the onboard test of the FAITH method for estimating the FOE, an obstacle detection method is used which clusters the OF based on the vector position and TTC. Divergence is inversely related to the TTC and has converging properties. Although this seems an advantage over TTC, divergence values are much lower (i.e., zero for infinite obstacle distance or zero observer velocity) and unsuitable for proper clustering. Therefore, TTC is chosen as the primary clustering variable in this research.
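A sketch of the per-vector TTC computation that feeds the clustering of Section III-C; averaging the two components and the clip value are our illustrative choices (the paper specifies only clipping to a user-defined maximum).

    import numpy as np

    def time_to_contact(x, y, u, v, foe, ttc_max=10.0):
        # tau = (x - x_FOE) / u_T from Eqs. 9-10; the y component gives an
        # equivalent estimate. Averaging both and clipping are our choices.
        eps = 1e-9
        u_safe = np.where(np.abs(u) < eps, eps, u)
        v_safe = np.where(np.abs(v) < eps, eps, v)
        tau = 0.5 * ((x - foe[0]) / u_safe + (y - foe[1]) / v_safe)
        return np.clip(tau, 0.0, ttc_max)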
APPENDIX II
FOE OUTSIDE THE FIELD OF VIEW
The performed benchmark on simulated and real event-based camera data considers only FOE locations inside the camera field of view. In the obstacle avoidance strategy, the MAV flies towards clusters which are within the field of view, as this gives certainty about the scene the MAV is flying towards. If the MAV flies a course which lies outside the field of view, our method will provide an unbounded FOE region, and thus no exact FOE location. Although this is a limitation of our method, it does provide the general direction the MAV is moving towards: the side on which the FOE region is unbounded is also the side on which the FOE lies, so the FOE location is still bounded to a half-plane. Of the compared methods in the benchmark, only the vector intersections method (e.g., as implemented by Buczko et al. [12]) is able to estimate FOE locations outside the field of view. Fig. 11 shows the performance of the vector intersections method for 40 simulated trials, with FOE angles outside the FOV. The course estimation error and the coefficient of variation (CV) grow rapidly as the course moves further outside the FOV, showing a very low estimation certainty. Therefore, it is concluded that this method has a limited advantage over our method regarding estimating the FOE outside the field of view.

Fig. 11. Performance of the 'Vec. Intersections' method implemented by Buczko et al. [12] for courses outside the FOV. The lower plot shows the coefficient of variation as a percentage, CV = (σ/μ) × 100%.