Fast Simulation of Crowd Collision Avoidance
John Charlton, Luis Rene Montana Gonzalez, Steve Maddock, Paul Richmond
University of Sheffield, Sheffield S10 2TN, UK
{j.a.charlton, lrmontanagonzalez1, s.maddock, p.richmond}@sheffield.ac.uk
Abstract.
Real-time large-scale crowd simulations with realistic behavior are important for many application areas. On CPUs, the ORCA pedestrian steering model is often used for agent-based pedestrian simulations. This paper introduces a technique for running the ORCA pedestrian steering model on the GPU. Performance improvements of up to 30 times over a multi-core CPU model are demonstrated. This improvement is achieved through a specialized linear program solver on the GPU and spatial partitioning of information sharing. This allows over 100,000 people to be simulated in real time (60 frames per second).
Keywords:
Pedestrian Simulation · Real-time rendering · GPU-computing
Crowd simulations are important for many applications, such as safety studies for communal transport hubs and flows within sports stadiums and large buildings [29]. Such simulations require believable dynamics that match observed behavior, including correct collision avoidance, or steering behavior. The Optimal Reciprocal Collision Avoidance (ORCA) algorithm [4] is an agent-based solution that can simulate many real crowd behaviors. Existing implementations of the ORCA algorithm target single- and multi-core CPUs. This paper presents a GPU implementation, supporting real-time simulations and interactivity for very large populations of order 5 × .

Computer models that contain inherent parallelism are suitable candidates for GPUs. This applies to agent-based pedestrian simulation models, where all agents follow the same rules. Using steering techniques that lend themselves well to implementation on GPU architecture can result in much faster performance [2,5]. By increasing performance, greater numbers of people can be simulated and/or a more accurate, possibly more time-consuming, algorithm can be used for the simulation.

This paper presents a GPU implementation of the ORCA model for agent-based pedestrian simulation. We parallelize as much of the data and computation as possible, choosing data-parallel algorithms and using spatial partitioning for communication between people to provide speedup. Our solution makes use of a novel low-dimension linear program solver developed for the architecture of a GPU [8], and a grid-based spatial partitioning scheme for information transfer between GPU threads [22]. Grid-partitioned data structures are an efficient form of spatial partitioning on the GPU [17]. With these changes, our GPU implementation shows performance increases of up to 30 times over the original multi-core CPU version [4,26]. In addition, it consistently outperforms the CPU version for sufficiently large numbers of people.

The organization of the paper is as follows. Section 2 covers background information and related work. Section 3 explains in detail the implementation of the ORCA model on the GPU. Section 4 presents results and discussion of the multi-core CPU and GPU ORCA models. Finally, section 5 gives the conclusions.
Many types of models have been proposed to generate local pedestrian motion and collision avoidance [20,21,27]. The simplest separation of steering models is between continuum models and microscopic models. Continuum models attempt to treat the whole crowd in a similar way to a fluid, allowing for fast simulation of larger numbers of people, but lack accuracy at the individual person scale [18]. Moving part of the calculation to the GPU has shown performance improvements [9]. Overall, however, continuum models are not ideal for solving on the GPU due to their large sparse data structures. In comparison, microscopic models tend to be paired with a global path planner to give people goal locations and trajectories. Such models specify rules at the individual person scale, with crowd-scale dynamics being an emergent effect of the rules and interactions, and easily allow for non-homogeneous agents and behavior.

Popular microscopic models are cellular automata (CA), social forces [14] and velocity obstacles (VO) [10]. CA are popular due to their ability to reproduce observable phenomena [6,7], but a downside is the inability to reproduce other behaviors due to the use of discrete space. CA models are computationally lightweight and lend themselves well to specifying certain complex behavior. However, CA pedestrian models tend to use discrete spatial rules in which agents move sequentially, which does not lend itself to parallelism and GPU implementations [24]. Social forces models use a computationally lightweight set of rules that allows for crowd-scale observables such as lane formation. They are well suited to parallelizing on the GPU since all agents can be updated simultaneously, with good performance for many simulated people [15,23].
However, generated simulations can result in unrealistic-looking motion and produce undesirable behavior at large densities.

Velocity obstacles (VO) work by examining the velocity and position of nearby moving objects to compute a collision-free trajectory. Velocity-space is analyzed to determine which velocities can be taken without causing collisions. VO models lend themselves to parallelization since agents are updated simultaneously and navigate independently of one another with minimal explicit communication. They tend to be more computationally and memory intensive than social forces models, but the large throughput capability of the GPU for such parallel tasks makes them very suitable for GPU implementation. Early models assumed that each person would take full responsibility for avoiding other people. Several variations include the reactive behavior of other models [1,11,16]. One example is reciprocal velocity obstacles (RVO), where the assumption is that all other people will take half the responsibility for avoiding collisions [3,12]. This model has been implemented on the GPU [5] and has shown credible speedup over the multi-core CPU implementation through use of hashing instead of naive nearest neighbor search. Group behavior has also been included in VO models [13,30], allowing people to be joined into groups. Such people attempt to remain close to other members of the group and aim for the same goal location. A further extension is optimal reciprocal collision avoidance (ORCA). It provides sufficient conditions for collision-free motion. It works by solving low-dimension linear programs. Freely available code libraries have been implemented for both single- and multi-core CPU [26].

VO techniques are very suitable candidates for GPU implementation. The RVO model and implementation by Bleiweiss [5] show notable performance gains against equivalent multi-core CPU models.
However, these methods must perform expensive calculations to find a suitable velocity. They tend to perform slower and are not guaranteed to find the best velocity. ORCA is deemed more suitable because of its performance relative to other VO models and its collision-free motion, theoretically providing "better" motion (i.e. fewer collisions).

Linear programming is a way of maximizing an objective function subject to a set of constraints. For ORCA, linear programming is used to find the closest velocity to a person's desired velocity which does not result in collisions. It is important to choose a solver that is efficient on the GPU at low dimensions. A popular solver type is the Simplex method. This is best suited for large dimension problems and struggles at lower dimensions. The incremental solver [25] is efficient at low dimensions but suffers on the GPU due to load balance: not all GPU threads have the same amount of computation, which reduces the performance on such parallel architecture. The batch GPU two-dimensional linear solver [8] is an efficient way to solve the numerous linear problems simultaneously. We make use of this approach, demonstrating its use for large-scale pedestrian simulations.
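As a rough illustration of the per-person problem described above, the optimization can be written as follows. The symbols are assumptions introduced here for illustration and are not taken from the paper: v_a^pref is the desired velocity of person a, v_a^max the capped maximum speed, and each neighbor b contributes a half-plane with boundary point p_b and unit normal n_b. Note that the objective is a distance rather than a linear function; low-dimensional incremental solvers of the kind discussed above can still handle it because the feasible region is convex.

```latex
\begin{aligned}
v_a^{\mathrm{new}} \;=\; \operatorname*{arg\,min}_{v \in \mathbb{R}^2} \quad
  & \lVert v - v_a^{\mathrm{pref}} \rVert \\
\text{subject to} \quad
  & (v - p_b) \cdot n_b \;\ge\; 0 \quad \text{for every observed neighbor } b, \\
  & \lVert v \rVert \;\le\; v_a^{\max}.
\end{aligned}
```

One such problem exists per person per simulation step, which is why a batch solver that handles many small problems at once is attractive.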
The proposed algorithm is based on the multi-core ORCA model and applies GPU optimizations. This section provides an overview of the algorithm as well as the important changes and optimizations needed to make the simulation efficient on the GPU. For a more in-depth description of the ORCA algorithm, see the work of van den Berg et al. [4]. The main changes are the use of an efficient linear program solver designed for GPUs and an efficient method of communication between GPU cores for people to "observe" properties of other people.

As an overview of the ORCA model, each person in the model has a start location and an end location they want to reach as quickly as possible, subject to an average speed and capped maximum speed. For each simulation iteration, each agent "observes" properties of nearby people, namely radius, current position and velocity. For each nearby agent a half-plane of restricted velocities is calculated (figure 1). By selecting a velocity not restricted by this half-plane, the two agents are guaranteed not to collide within time τ, where τ is the lookahead time, the amount of forward time planning people make to avoid collisions. By considering all nearby agents, the set of half-planes creates a set of velocities that, if taken, do not collide with any nearby agents in time τ. The agent then selects from the permissible velocities the one closest to its desired velocity and goal. Figure 1d shows the resulting half-planes caused by neighboring agents on an example setup, and the optimal velocity that most closely matches the person's desired velocity.

Fig. 1: (a) A system of 2 people a and b with corresponding radii r_a and r_b. (b) The associated velocity obstacle VO_{a|b} in velocity space for a look-ahead period of time τ caused by the neighbor b for a. (c) The relative velocity v_a^opt − v_b^opt lies within the velocity obstacle VO_{a|b}. The vector u is the shortest vector from the relative velocity to the edge of the obstacle. The corresponding half-plane ORCA_{a|b} is in the direction of u, and its boundary passes through the point v_a^opt + u/2. (d) A view of a blue agent and its neighbors, as well as the generated half-planes caused by the neighbors interacting with the blue agent. The solid blue arrow shows the desired velocity of the blue agent. The dotted blue arrow is the resulting calculated velocity that does not collide with any neighbor in time τ.

It is possible that the generated set of half-planes does not contain any possible velocities. Such situations are caused by large densities of people.
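The half-plane construction of figure 1 can be sketched in Python as below. This is a minimal sketch and not the paper's GPU code: it follows the geometry used in the publicly available RVO2 library [26] for two agents that do not yet overlap, and the function name, argument order and the (point, normal) return convention are assumptions made for this example. Each agent takes half of the correction u.

```python
import math

def orca_half_plane(pos_a, vel_a, rad_a, pos_b, vel_b, rad_b, tau):
    """Return (point, normal) describing ORCA_{a|b} for non-overlapping agents.

    A velocity v is permitted for agent a when dot(v - point, normal) >= 0.
    All names here are illustrative, not taken from the paper.
    """
    rx, ry = pos_b[0] - pos_a[0], pos_b[1] - pos_a[1]   # relative position
    vx, vy = vel_a[0] - vel_b[0], vel_a[1] - vel_b[1]   # relative velocity
    dist_sq = rx * rx + ry * ry
    combined = rad_a + rad_b
    if dist_sq <= combined * combined:
        raise ValueError("overlapping agents are not handled in this sketch")

    # w points from the centre of the cut-off circle to the relative velocity.
    wx, wy = vx - rx / tau, vy - ry / tau
    w_len_sq = wx * wx + wy * wy
    dot1 = wx * rx + wy * ry

    if dot1 < 0.0 and dot1 * dot1 > combined * combined * w_len_sq:
        # Closest boundary point lies on the cut-off circle.
        w_len = math.sqrt(w_len_sq)
        nx, ny = wx / w_len, wy / w_len
        scale = combined / tau - w_len
        ux, uy = scale * nx, scale * ny
    else:
        # Closest boundary point lies on one of the two legs of the cone.
        leg = math.sqrt(dist_sq - combined * combined)
        if rx * wy - ry * wx > 0.0:   # left leg
            dx = (rx * leg - ry * combined) / dist_sq
            dy = (rx * combined + ry * leg) / dist_sq
        else:                         # right leg
            dx = -(rx * leg + ry * combined) / dist_sq
            dy = -(-rx * combined + ry * leg) / dist_sq
        proj = vx * dx + vy * dy
        ux, uy = proj * dx - vx, proj * dy - vy
        nx, ny = -dy, dx              # outward normal of the leg

    # Agent a takes half of the correction u.
    point = (vel_a[0] + 0.5 * ux, vel_a[1] + 0.5 * uy)
    return point, (nx, ny)
```

For two agents walking straight at each other, the current velocity falls on the forbidden side of the returned half-plane; for two agents walking apart, it falls on the permitted side.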
The solution is to select a velocity that least penetrates the set of half-planes induced by the other agents. In this case, there is no guarantee of collision-free motion.

The computation of velocity subject to the set of half-planes is done using linear programming. The linear program is defined with constraints corresponding to the half-planes ORCA_{a|b} of velocities, and attempts to minimize the difference between the chosen velocity and the desired velocity. Since each agent needs to find a new velocity, there is one linear program per agent per iteration. The algorithm used to solve these is the batch-GPU-LP algorithm [8]. It is designed for solving multiple low-dimensional linear programs on the GPU, and is based on the randomized incremental linear program solver of Seidel [25].

This batch-LP solver works by initially assigning each thread to a problem (i.e. one pedestrian). Each thread must solve a set of half-plane constraints, subject to an optimization function. Respectively, these are that the person should not choose a velocity that collides with other people, and that the person wants to travel as close to their desired velocity as possible.

Each half-plane constraint is considered incrementally. If the current velocity does not satisfy the constraint under consideration, a new valid velocity is calculated. The calculation of a new velocity is one of the most computationally expensive operations. It is also highly branched, as only some of the solvers require a new valid velocity while others can maintain their current value. This branching causes the threads that do not need to perform a calculation to remain idle while the other threads perform the operation.
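The incremental treatment of constraints described above can be sketched for a single person as follows. This is a sequential Python sketch in the style of the incremental solver [25] and the RVO2 library [26], not the batch GPU algorithm of [8]. The names `solve_velocity` and `_best_on_line`, and the (point, normal) half-plane representation with unit normals, are assumptions for this example. Constraints are processed in the order given (no randomization), and the infeasible case simply returns `None` instead of computing the least-penetrating velocity.

```python
import math

EPS = 1e-9

def _best_on_line(p, d, previous, v_pref, v_max):
    """Closest point to v_pref on the line p + t*d, subject to earlier constraints."""
    p_dot_d = p[0] * d[0] + p[1] * d[1]
    disc = p_dot_d ** 2 + v_max ** 2 - (p[0] ** 2 + p[1] ** 2)
    if disc < 0.0:
        return None                      # line misses the maximum-speed disc
    root = math.sqrt(disc)
    t_lo, t_hi = -p_dot_d - root, -p_dot_d + root
    for q, m in previous:
        denom = d[0] * m[0] + d[1] * m[1]
        numer = (q[0] - p[0]) * m[0] + (q[1] - p[1]) * m[1]
        if abs(denom) < EPS:             # boundary lines are parallel
            if numer > 0.0:
                return None
            continue
        t = numer / denom
        if denom > 0.0:
            t_lo = max(t_lo, t)
        else:
            t_hi = min(t_hi, t)
        if t_lo > t_hi:
            return None
    t = d[0] * (v_pref[0] - p[0]) + d[1] * (v_pref[1] - p[1])
    t = min(max(t, t_lo), t_hi)          # clamp the projection to the feasible interval
    return (p[0] + t * d[0], p[1] + t * d[1])

def solve_velocity(half_planes, v_pref, v_max):
    """half_planes: list of (point, unit_normal); v is allowed if dot(v - point, normal) >= 0."""
    speed = math.hypot(v_pref[0], v_pref[1])
    v = v_pref if speed <= v_max else (v_pref[0] * v_max / speed, v_pref[1] * v_max / speed)
    for i, (p, n) in enumerate(half_planes):
        if (v[0] - p[0]) * n[0] + (v[1] - p[1]) * n[1] >= 0.0:
            continue                     # constraint already satisfied, keep v
        d = (n[1], -n[0])                # direction of the constraint boundary
        v = _best_on_line(p, d, half_planes[:i], v_pref, v_max)
        if v is None:
            return None                  # infeasible; fallback not shown here
    return v
```

The `continue` branch versus the `_best_on_line` call is the branching the text refers to: on a GPU, threads whose velocity already satisfies the constraint would wait for those that must recompute.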
This is an unbalanced workload on the GPU device and can vastly reduce the throughput, as many threads do not perform any calculations. The problem is exacerbated by the fact that the threads performing the operation take a long time to complete it.

The implementation of this calculation uses ideas from cooperative thread arrays [28] to subdivide the calculation into "work units", blocks of equal-size computation. These work units can be transferred to and computed by different threads, allowing for a balanced workload and good performance. If a thread does not need to compute a new velocity, then it can aid in another problem's calculation. This algorithm shows performance improvements over state-of-the-art CPU LP solvers and other GPU LP solvers [8].

Fig. 2: FLAME message partitioning. The simulation is discretized into spatial bins and people save their message to the corresponding bin. A given person (blue star) does not read the messages of those in non-neighboring bins (white pentagons). For those within the same or neighboring partitioning bins, it calculates whether they are within the observation radius r_obs. If not, they are ignored (grey pentagons). If they are within the observation radius (green pentagon) the person is aware of them and will attempt to avoid them accordingly, by generating corresponding ORCA half-planes of valid velocities.

The other main improvement concerns the communication between people. Some information must be observed by people in the model. Examples are the radius, speed and position of others nearby. In order to communicate this information between people we use the idea of messages from the FLAME GPU framework [22], which is demonstrated in figure 2. Each agent creates a message which contains information on observables about itself. Each message is assigned a spatial location equal to the position of the agent in the simulation. These messages are organized into spatial bins.
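The binning and lookup illustrated in figure 2 can be sketched on the CPU as follows. This is only an illustration of the idea and does not use the FLAME GPU API; the function names, the dictionary-based grid and the reduced message layout (x, y, radius) are assumptions for this example. The sketch requires the bin size to be at least the observation radius so that checking the 3×3 block of surrounding bins is sufficient.

```python
import math
from collections import defaultdict

def build_bins(messages, cell_size):
    """Group message indices by the grid cell containing each message position."""
    bins = defaultdict(list)
    for idx, (x, y, _radius) in enumerate(messages):
        bins[(math.floor(x / cell_size), math.floor(y / cell_size))].append(idx)
    return bins

def observed_neighbors(agent_idx, messages, bins, cell_size, r_obs):
    """Indices of other agents within r_obs, reading only the 3x3 surrounding bins."""
    if cell_size < r_obs:
        raise ValueError("cell_size must be at least r_obs for a 3x3 lookup")
    x, y, _ = messages[agent_idx]
    cx, cy = math.floor(x / cell_size), math.floor(y / cell_size)
    found = []
    for dx in (-1, 0, 1):
        for dy in (-1, 0, 1):
            # .get avoids creating empty bins while reading
            for idx in bins.get((cx + dx, cy + dy), ()):
                if idx == agent_idx:
                    continue
                ox, oy, _ = messages[idx]
                if math.hypot(ox - x, oy - y) <= r_obs:
                    found.append(idx)   # close enough to be observed
    return sorted(found)

# Hypothetical example: four agents, 2 m bins, 1 m observation radius.
messages = [(0.5, 0.5, 0.3), (1.2, 0.5, 0.3), (5.0, 5.0, 0.3), (2.4, 0.5, 0.3)]
bins = build_bins(messages, 2.0)
```

In this example agent 0 reads the messages of agents 1 and 3 (same or neighboring bins) but only observes agent 1; agent 2 sits in a non-neighboring bin and its message is never read.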
Each agent will then read the messages from its associated bin and from the neighboring bins. This method is far faster than a brute-force read-all approach for large spaces and many people/messages. The overhead of organizing messages into bins is outweighed by the cost of reading all messages and discarding those far away. A possible alternative implementation, which is used by the CPU model, uses KD-tree spatial partitioning. It is expected, from the work of Li and Mukundan [17], that this grid-based spatial partitioning is faster than a KD-tree implementation on the GPU.

This section presents the results of two experiments. The first experiment is composed of two test cases to demonstrate the appearance and correctness of the model. The first test case is a two-way crossing and the second test case is an eight-way crossing. The second experiment demonstrates the performance compared to the equivalent multi-core CPU version [4,26].

For the first experiment, all the test cases are set up in a similar way. Multiple associated start and end regions are chosen, such that people are spawned in a start region with a target in an associated end region. Random spawn locations are chosen so that there is no overlap with other people within a certain time period based on person size and speed. Within a simulation, each agent has a goal location to aim for. The agent's desired velocity is in the direction of the goal location, scaled to the walking speed. Once a person reaches the goal location they are removed from the simulation. Once all people have reached their goal the simulation ends.

The first test case was a 2-way crossing, with the two crowds attempting to pass through each other to reach their destinations. Two variations of this were simulated. The first involves all people with the same size and speed parameters. The second version varies the size and speed parameters of each individual. Figure 3 shows the first variation for 2 . × people.
The starting region of one group of people is the same area as the goal region of the other, forcing the two groups to navigate past each other. All agents have the same parameters, namely radius = 0 . . .
In the second variation of the first test case, people have different sizes and different maximum speeds. Figure 4 illustrates this. In this example all people have an equal chance of being of radius 0 . . 75m or 1m (shown by person size in the figure, as well as S, M and L on their tops) and, independently, an equal chance of a desired speed of 1m/s, 1 . or 2m/s. People moving in the positive x direction (left to right) have red tops, and people moving in the negative x direction (right to left) have blue tops. Brighter shaded tops indicate the largest desired speed, 2m/s, and the darkest tops indicate the slowest speed, 1m/s.

Fig. 4: Visualization of 2,500 people in Unreal. Two crowds navigate past each other, one heading from left to right (blue clothes) and the other from right to left (red clothes). People are color coded according to their maximum speeds (using three shades of red or blue, respectively) and have varying radii (indicated by their actual size and also using S, M and L on their tops). Top: scene view from above; one pedestrian is highlighted with a green circle. Inset: view from the perspective of the pedestrian in the green circle.

The second test case was an 8-way crossing, visualized in figure 5. Each crowd must navigate 135 degrees across the environment, resulting in a vortex-like pattern around the center.

Fig. 5: Visualization of 10,000 people in Unreal. Eight crowds attempt to navigate to the opposite end of the environment. Different colors are used for each crowd. Top: scene view from above; one pedestrian is highlighted with a green circle. Inset: view from the perspective of the pedestrian in the green circle.

The second experiment was designed to test the performance of the GPU implementation in comparison to the multi-core CPU implementation. Figure 6 shows the effect that varying numbers of people have on the frame time. Various test cases (e.g. 2-way and 8-way crossings) with different agent parameters were run, and the timings averaged between them. In this experiment no visualization was used, to ensure the timings were due to the algorithm only. The GPU solution gives speed increases of up to 30 times compared to the multi-core CPU implementation. Results for the single-core CPU version are not given because, for any sizeable number of agents, the multi-core CPU implementation always outperforms the single-core CPU implementation. This is due to better utilization of the CPU device. The colored bars of figure 6 correspond to the primary (left) vertical axis, which uses a logarithmic scale. The relative time taken by the two implementations corresponds to the secondary (right) vertical axis, which uses a linear scale.

The results show that the relative speed-up of the GPU implementation increases with the number of people. Greater relative speed-up occurs for even larger numbers of people, but the frame rate then falls below real time. The GPU simulations ran at close to 30 frames a second (33ms per frame) for up to 5 × agents. The CPU version performs better for smaller numbers of agents, with a crossover occurring at approximately 2 × agents. This is due to the GPU device not being fully utilized for smaller simulations, so that its reduced throughput is outperformed by the CPU.

The experiments were run on an NVIDIA GTX 970 GPU card with 4GB dedicated memory and a 4-core/8-thread Intel i7-4790K with 16GB RAM. The GPU was connected by PCI-E 2.0. The GPU software was developed with NVIDIA CUDA 8.0 on Windows 10. On the GPU tested, there was a limit on the amount of usable memory of 4GB, which corresponded to approximately 5 × people. It is expected that relative performance increases will continue to be obtained for larger numbers of people on GPUs with larger memory capacity.

Fig. 6: Frame time (in ms) for multi-core CPU and GPU ORCA models with varying numbers of people. Logarithmic scale on primary (left) vertical axis. Relative timing is given on the secondary (right) vertical axis, in linear scale. Simulation time only, without visualization of the pedestrians.
We have introduced a GPU-optimized version of the ORCA model. It shows substantial performance increases for large numbers of people compared to the multi-core CPU version. We demonstrated the performance gains through real-time visualizations that would not be possible on CPU hardware of a similar level.

Our model is currently limited in the number of people that can be simulated by the available GPU memory. The model uses large amounts of memory for storing the ORCA half-planes of each person. Memory usage could be reduced by having each person consider fewer neighboring people. This would reduce the memory needed per person but may result in less realistic motion with a greater chance of collisions. A possible solution to the lack of memory lies in Maxwell and later architectures, which can use managed memory [19] to page information from CPU to GPU on demand. This would allow many more people to be simulated, up to the computer's RAM capacity. It is expected that greater relative speedups between multi-core CPU and GPU will continue to be obtained for even larger numbers of simulated people.

It is expected that more computationally expensive steering models would include more realistic motion such as side-stepping, more realistic densities, and a lower probability of collisions. In comparison, it is expected that the model in this paper would have greater performance and support larger numbers of simulated people.

The current work involves writing the data from the simulation to a file before visualization using Unreal. The data is copied from the GPU to the CPU, then loaded into Unreal and copied back to the GPU in Unreal for visualization. This is expensive. Future work will look at how to use the Unreal Engine to visualize a simulation as it is calculated, which could be done by sharing GPU buffer information between the simulation program and the Unreal Engine.
Acknowledgement.
This research was supported by the Transport Systems Catapult and the National Council of Science and Technology in Mexico (Consejo Nacional de Ciencia y Tecnología, CONACYT).
References
1. Abe, Y., Yoshiki, M.: Collision avoidance method for multiple autonomous mobile agents by implicit cooperation. In: Proceedings 2001 IEEE/RSJ International Conference on Intelligent Robots and Systems. Expanding the Societal Role of Robotics in the the Next Millennium (Cat. No.01CH37180). vol. 3, pp. 1207–1212 (Oct 2001). https://doi.org/10.1109/IROS.2001.977147
2. Barut, O., Haciomeroglu, M., Sezer, E.A.: Combining GPU-generated linear trajectory segments to create collision-free paths for real-time ambient crowds. Graphical Models, 31–45 (Sep 2018). https://doi.org/10.1016/j.gmod.2018.07.002
3. van den Berg, J., Lin, M., Manocha, D.: Reciprocal Velocity Obstacles for real-time multi-agent navigation. In: IEEE International Conference on Robotics and Automation, 2008. ICRA 2008. pp. 1928–1935 (May 2008). https://doi.org/10.1109/ROBOT.2008.4543489
4. van den Berg, J., Guy, S.J., Lin, M., Manocha, D.: Reciprocal n-Body Collision Avoidance. In: Robotics Research, pp. 3–19. Springer, Berlin, Heidelberg (2011). https://doi.org/10.1007/978-3-642-19457-3_1
5. Bleiweiss, A.: Multi agent navigation on the gpu. White paper, GDC (2009)
6. Blue, V., Adler, J.: Emergent Fundamental Pedestrian Flows From Cellular Automata Microsimulation. Transportation Research Record: Journal of the Transportation Research Board, 29–36 (1998). https://doi.org/10.3141/1644-04
7. Blue, V., Adler, J.: Cellular Automata Microsimulation of Bidirectional Pedestrian Flows. Transportation Research Record: Journal of the Transportation Research Board, 135–141 (Jan 1999). https://doi.org/10.3141/1678-17
8. Charlton, J., Maddock, S., Richmond, P.: Two-dimensional batch linear programming on the GPU. Journal of Parallel and Distributed Computing, 152–160 (Apr 2019). https://doi.org/10.1016/j.jpdc.2019.01.001
9. Fickett, M., Zarko, L.: GPU Continuum Crowds. CIS Final Project Final report, University of Pennsylvania (2007)
10. Fiorini, P., Shiller, Z.: Motion Planning in Dynamic Environments Using Velocity Obstacles. The Int'l Journal of Robotics Research (7), 760–772 (Jul 1998). https://doi.org/10.1177/027836499801700706
11. Fulgenzi, C., Spalanzani, A., Laugier, C.: Dynamic Obstacle Avoidance in uncertain environment combining PVOs and Occupancy Grid. In: Proceedings 2007 IEEE International Conference on Robotics and Automation. pp. 1610–1616. IEEE, Rome, Italy (Apr 2007). https://doi.org/10.1109/ROBOT.2007.363554
12. Guy, S.J., Chhugani, J., Kim, C., Satish, N., Lin, M., Manocha, D., Dubey, P.: ClearPath: Highly Parallel Collision Avoidance for Multi-agent Simulation. In: Proceedings of the 2009 ACM SIGGRAPH/Eurographics Symposium on Computer Animation. pp. 177–187. SCA '09, ACM, New York, NY, USA (2009). https://doi.org/10.1145/1599470.1599494
13. He, L., Pan, J., Narang, S., Wang, W., Manocha, D.: Dynamic Group Behaviors for Interactive Crowd Simulation. arXiv:1602.03623 [cs] (Feb 2016)
14. Helbing, D., Molnár, P.: Social force model for pedestrian dynamics. Phys. Rev. E (5), 4282–4286 (May 1995). https://doi.org/10.1103/PhysRevE.51.4282
15. Karmakharm, T., Richmond, P.: Agent-based Large Scale Simulation of Pedestrians With Adaptive Realistic Navigation Vector Fields. EG UK Theory and Practice of Computer Graphics (2010)
16. Kluge, B., Prassler, E.: Recursive Probabilistic Velocity Obstacles for Reflective Navigation. In: Yuta, S., Asama, H., Prassler, E., Tsubouchi, T., Thrun, S. (eds.) Field and Service Robotics: Recent Advances in Reserch and Applications, pp. 71–79. Springer Tracts in Advanced Robotics, Springer Berlin Heidelberg, Berlin, Heidelberg (2006). https://doi.org/10.1007/10991459_8
17. Li, B., Mukundan, R.: A Comparative Analysis of Spatial Partitioning Methods for Large-Scale, Real-Time Crowd Simulation. Václav Skala - UNION Agency (2013)
18. Narain, R., Golas, A., Curtis, S., Lin, M.C.: Aggregate Dynamics for Dense Crowd Simulation. In: ACM SIGGRAPH Asia 2009 Papers. pp. 122:1–122:8. SIGGRAPH Asia '09, ACM, New York, NY, USA (2009). https://doi.org/10.1145/1661412.1618468
19. Nvidia: Tuning CUDA Applications for Maxwell. http://docs.nvidia.com/cuda/maxwell-tuning-guide/index.html (2018)
20. Pettré, J., Kallmann, M., Lin, M.C.: Motion Planning and Autonomy for Virtual Humans. In: ACM SIGGRAPH 2008 Classes. pp. 42:1–42:31. SIGGRAPH '08, ACM, New York, NY, USA (2008). https://doi.org/10.1145/1401132.1401193
21. Pettré, J., Pelechano, N.: Introduction to Crowd Simulation. In: Bousseau, A., Gutierrez, D. (eds.) EG 2017 - Tutorials. The Eurographics Association (2017). https://doi.org/10.2312/egt.20171029
22. Richmond, P.: Flame Gpu Technical Report and User Guide. Department of Computer Science Technical Report CS-11-03, University of Sheffield (2011)
23. Richmond, P., Romano, D.M.: A high performance framework for agent based pedestrian dynamics on gpu hardware. Proceedings of EUROSIS ESM (2008)
24. Schönfisch, B., de Roos, A.: Synchronous and asynchronous updating in cellular automata. Biosystems (3), 123–143 (Sep 1999). https://doi.org/10.1016/S0303-2647(99)00025-8
25. Seidel, R.: Small-dimensional linear programming and convex hulls made easy. Discrete Comput Geom (3), 423–434 (Sep 1991). https://doi.org/10.1007/BF02574699
26. Snape, J.: Optimal Reciprocal Collision Avoidance (C++). Contribute to snape/RVO2 development by creating an account on GitHub (Mar 2019)
27. Thalmann, D.: Populating Virtual Environments with Crowds. In: Proceedings of the 2006 ACM International Conference on Virtual Reality Continuum and Its Applications. pp. 11–11. VRCIA '06, ACM, New York, NY, USA (2006). https://doi.org/10.1145/1128923.1128925
28. Wang, Y., Davidson, A., Pan, Y., Wu, Y., Riffel, A., Owens, J.D.: Gunrock: A high-performance graph processing library on the GPU. In: Proceedings of the 21st ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming. p. 11. ACM (2016)
29. Xu, M.L., Jiang, H., Jin, X., Deng, Z.: Crowd Simulation and Its Applications: Recent Advances. Journal of Computer Science and Technology 29