A Cooperative Dynamic Task Assignment Framework for COTSBot AUVs

Abstract

This paper presents a cooperative dynamic task assignment framework for a class of Autonomous Underwater Vehicles (AUVs) employed to control outbreaks of Crown-Of-Thorns Starfish (COTS) in Australia's Great Barrier Reef. The problem of monitoring and controlling the COTS is transcribed into a constrained task assignment problem in which eradicating a cluster of COTS, by the injection system of COTSbot AUVs, is considered as a task. A probabilistic map of the operating environment, including seabed terrain, clusters of COTS, and coastlines, is constructed. Then, a novel heuristic algorithm called Heuristic Fleet Cooperation (HFC) is developed so that the COTSbot AUVs cooperatively inject the maximum possible number of COTS in an assigned mission time. Extensive simulation studies together with quantitative performance analysis are conducted to demonstrate the effectiveness and robustness of the proposed cooperative task assignment algorithm in eradicating the COTS in the Great Barrier Reef.


Note to Practitioners: This research is motivated by the need to control outbreaks of COTS in Australia's Great Barrier Reef by employing a class of AUVs equipped with an injection arm. A new evolutionary algorithm, called HFC, is designed upon a hierarchical priority evolution to solve the multivehicle dynamic task assignment problem. The HFC is able to rapidly prototype multiple optimal solutions for efficient grouping and accomplishment of the tasks spread over a large operation area.

Index Terms: Cooperation, COTSbot AUVs, Crown-Of-Thorns Starfish, Task assignment

I. INTRODUCTION

In recent years, autonomous underwater vehicles (AUVs) have been used extensively in a wide range of oceanic surveys and applications such as mine detection, offshore infrastructure manipulation and maintenance, mapping, data collection and sampling, and monitoring [1], [2]. For example, for environmental scientists, an AUV can be a valuable tool for collecting underwater data on marine organisms and processes, such as sediment transport in degraded ecosystems [3], [4]. Due to the unique underwater ecology of Australia, monitoring undersea terrains and biological organisms has always been important for the Australian Government and environmentalists. One of the challenging issues that the Queensland Government has been dealing with in recent years is the destructive impact of COTS, which have destroyed more than 50% of the coral in the Great Barrier Reef [5]. Different measures have been taken to address this problem; however, traditional approaches to removing starfish, undertaken with human divers, are neither operationally effective nor logistically economical. Thus, Queensland University of Technology (QUT) developed a particular class of AUV, called the COTSbot AUV, equipped with vision-based technology and an injection arm to eradicate COTS in the Great Barrier Reef automatically [6]. Even though the employment of this AUV was promising and more effective in protecting corals against COTS populations than the traditional approaches, its performance can be further improved by employing a group of COTSbot AUVs working cooperatively in the Great Barrier Reef region.

The main purpose of this paper is to use COTSbot AUVs in the context of multi-robot task assignment (MRTA) to enhance operational performance while saving operation cost and time for the particular problem of COTS control in the Great Barrier Reef. MRTA systems are usually utilized to perform missions that are time-consuming or difficult for a single robot/vehicle to carry out, as they are more capable of handling large-scale and complex missions and are more fault-tolerant than a single-robot system [7]. The main objectives of using MRTA technology are to maximize operational performance and to minimize operation time/cost; these are achieved by organizing the underlying mission in the context of task allocation and assignment, in which clustering, classifying, prioritizing, and accomplishing tasks are key elements. For example, in [8], a group of Unmanned Aerial Vehicles (UAVs) was used in a detect-and-treat mission to first identify the palm trees infested by weevils and then treat them with pesticide. The mission was defined as an MRTA problem, and a bio-inspired algorithm based on bacteria foraging behavior was employed to solve it. In [9], a search-and-rescue mission was carried out by multiple UAVs, in which the performance impact algorithm was proposed to define a dynamic grouping allocation to deal with communication disruptions and to avoid conflicts in the task assignment of the UAVs. In [10], an MRTA problem was defined for a team of AUVs, in which a workload balance algorithm was developed for the investigation of suspicious objects in an underwater environment in the presence of obstacles and wave disturbance. In [11], an MRTA problem that considers the time utility and energy consumption of a team of robots was mathematically formulated as a multi-objective optimization problem and solved by a multi-objective PSO algorithm. In [12], a team of wheeled mobile robots was employed to carry items from storage racks to the packing dock in a factory. The items had

Amin Abbasi is with the Department of Electrical Engineering, Azad University of Khoemeinishar, Esfahan, Iran (e-mail: [email protected]).

Somaiyeh MahmoudZadeh is with the School of IT, Deakin University, Geelong, VIC 3220, Australia (e-mail: [email protected]).

Amirmehdi Yazdani is with the College of Science, Health, Engineering and Education, Murdoch University, Perth, WA 6150, Australia (e-mail: [email protected]).

Amin Abbasi, Somaiyeh MahmoudZadeh, Amirmehdi Yazdani

II. PROBLEM FORMULATION

Figure 1 illustrates the concept of employing multiple COTSbot AUVs for COTS control. The COTSbot AUV is equipped with a robotic arm and injection mechanism with an onboard vision-based controller [6] to coordinate arm movement and vehicle position according to the COTS location. A partial map of the Great Barrier Reef region with the latitude of 〈 〉 to 〈 〉 and longitude of 〈 〉 to 〈 〉 is provided for the AUVs' operation, presented in Fig. 2.

Fig. 1. Concept of employing multiple AUVs equipped with the injection system for COTS control.

Cooperative operation of multiple vehicles is a useful way to minimize the cost of deployment, launch and recovery, and to increase the efficiency of undersea missions restricted by a vehicle's battery capacity. This research aims to use the COTSbot AUVs to identify COTS within complex reef environments and perform injection to eradicate them. The AUVs should cooperatively inject the maximum possible number of COTS in the assigned mission time. The vehicles should be able to update each other about the COTS areas already cleaned up, to increase the effectiveness of coverage and to avoid duplicating the treatment.

Remark 1: The AUVs are equipped with acoustic navigation aids such as digital ultra-short baseline (DUSBL), and are therefore able to share their localization information and the coordinates of COTS areas with each other at a fixed sample time via acoustic communication. This exchange of information between the AUVs contributes to the effectiveness of coverage and avoids duplicating the treatment.

Assumption 1: It is assumed that the AUVs use constant thrust power during the mission and therefore the average vehicle velocity is constant.

Fig. 2. A snapshot of the selected map area in Queensland's Great Barrier Reef region with the latitude of 〈 〉 to 〈 〉 and longitude of 〈 〉 to 〈 〉.

The problem of controlling COTS is transcribed into a cooperative task assignment problem in which eradicating the COTS (via the injection system mounted on the AUVs) is defined as a task. The AUVs are given a probabilistic map of the undersea environment prior to the mission, and they use the developed vision-based technology [6] to accurately distinguish the COTS and to perform the injection process. During the mission, each vehicle should exchange its pose, the COTS coordinates, and the number of tasks completed with the other vehicles. If a vehicle aborts its mission for any reason (e.g., it runs out of battery), the closest vehicle undertakes the incomplete mission, or the mission is rearranged between several vehicles and a new mission scenario for all vehicles is planned. In the subsequent sections, the detailed mathematical representations of this problem are provided.

A. Modelling the operation field and distribution of the COTS

Prior knowledge of the terrain, such as coastal areas, forbidden operation zones, and the coordinates of the start and end points, enhances the AUVs' capability for robust motion planning. Even though preparing a perfect offline map is rarely possible in undersea operations, AUVs can take advantage of any partially constructed map to gain a rough a priori perception of the operating field. The seabed terrain is modelled using a numerical estimated model of the field, given by (1):

$$\tau_{x,y}^{\mathbb{V}} = \begin{cases} \tau_{x}^{\mathbb{V}} = \dfrac{(\mathfrak{T}_y - \mathfrak{T}_y^{\mathcal{O}})\left(e^{-(\mathbb{V}-\mathcal{O})^2 𝓇^{-2}} - 1\right)}{2\pi(\mathbb{V}-\mathcal{O})} \\[2ex] \tau_{y}^{\mathbb{V}} = \dfrac{(\mathfrak{T}_x - \mathfrak{T}_x^{\mathcal{O}})\left(1 - e^{-(\mathbb{V}-\mathcal{O})^2 𝓇^{-2}}\right)}{2\pi(\mathbb{V}-\mathcal{O})} \end{cases} \tag{1}$$

where $\tau_{x,y}$ represents the non-uniform distribution of COTS in a 2D plane of volume $\mathbb{V}$; $\mathcal{O}$ and $𝓇$ are the centre and radius of a high-density COTS area; and $\mathfrak{T}$ corresponds to the density of the distribution around each centre ($\mathcal{O}$). Information on coastline and island locations is known a priori and included in the map. The probability of COTS distribution in the $\mathbb{V}: \langle \{1 \times 1\ \text{km}\}_{x-y} \rangle$ area is depicted in Fig. 3.

Fig. 3. Probability of COTS distribution in the operation environment captured from Australia's Great Barrier Reef.

The COTS spots are represented by the set $\mathcal{C} = \{\mathcal{C}_1, \dots, \mathcal{C}_j, \dots, \mathcal{C}_n\}$, where each node $\mathcal{C}_j$ is a spot to be injected. In this context, the AUV mission planner seeks to determine the optimum order of COTS spots to be injected, described mathematically as follows:

$$\mathcal{C} = \{\mathcal{C}_1, \dots, \mathcal{C}_j, \dots, \mathcal{C}_n\};\quad \forall \mathcal{C}_j,\ \exists\, \rho_{\mathcal{C}_j} \sim \mathbb{U}(1,100);\quad \forall \mathcal{C}_j,\ \exists\, t_{\mathcal{C}_j} \sim \mathbb{U}(60,90) \tag{2}$$

where $t_{\mathcal{C}_j}$ is the time required to complete the injection, which depends on the COTS density in the area and ranges between 60 and 90 seconds, and $\rho_{\mathcal{C}_j}$ is the task priority rank. The high-intensity areas are associated with a greater number of tasks (killing COTS) that should be completed.

B. Mathematical Model of Multiple Vehicles Operation

Assume there are $𝓀$ identical AUVs in the fleet $\mathcal{A} = \langle \mathcal{A}_1, \dots, \mathcal{A}_{𝓀} \rangle$, with six degrees of freedom for translational and rotational motion in the NED $\{n\}$ and Body $\{b\}$ frames. The physical model of $\mathcal{A}_i$ moving in a 3D volume is described as follows [19, 20]:

$$\forall \mathcal{A}_i:\quad \{n\} \rightarrow \exists\, \eta_i = [x_i, y_i, z_i, \varphi_i, \theta_i, \psi_i]^T, \qquad \{b\} \rightarrow \exists\, v_i = [v_{i,x}, v_{i,y}, v_{i,z}, p_i, q_i, r_i]^T \tag{3}$$

$$\begin{cases} v_{i,x} = |v_i| \cos\theta_i \cos\psi_i \\ v_{i,y} = |v_i| \cos\theta_i \sin\psi_i \\ v_{i,z} = |v_i| \sin\theta_i \end{cases} \tag{4}$$

where $\eta_i$ denotes the state vector of $\mathcal{A}_i$ in NED, including the position in North $x_i$, East $y_i$, and Down $z_i$, and the Euler angles of roll $\varphi_i$, pitch $\theta_i$, and yaw $\psi_i$. The components $\langle v_{i,x}, v_{i,y}, v_{i,z} \rangle$ of $v_i$ form the translational velocity vector of $\mathcal{A}_i$ along the surge, sway and heave directions, while $\langle p_i, q_i, r_i \rangle$ form its rotational velocity vector. The vehicle rotations about the z-axis (yaw angle) and y-axis (pitch angle) are obtained via (5)–(6).

$$\psi_i(t) = \arctan\left(\frac{\Delta y_i(t)}{\Delta x_i(t)}\right) \tag{5}$$

$$\theta_i(t) = \arctan\left(\frac{-\Delta z_i(t)}{\sqrt{(\Delta y_i(t))^2 + (\Delta x_i(t))^2}}\right) \tag{6}$$

Assumption 2: The vehicle rotation about the x-axis (roll angle) is assumed to be negligible in this study.

The distance travelled by vehicle $\mathcal{A}_i$ from spot $\mathcal{C}_j$ to $\mathcal{C}_l$ is calculated via (7).

$$\mathcal{D}^{\mathcal{A}_i}_{\mathcal{C}_{j,l}}(t) = \sqrt{(\Delta x_{\mathcal{A}_i}(t))^2 + (\Delta y_{\mathcal{A}_i}(t))^2 + (\Delta z_{\mathcal{A}_i}(t))^2} \tag{7}$$

The objective of the multi-AUV mission planning system is to find an optimal route $\Re$ that maximizes the total number of injected COTS (over the widest possible area) within a restricted mission time for each vehicle ($\mathcal{T}^{\nabla}_{\mathcal{A}_i}$), while optimizing performance indices such as operation time and travel distance. To this end, the vehicles move through the high-density COTS areas following the route generated by the mission planning system, where an on-time visit to the target station is the main concern of the framework. Due to energy restrictions and the extensive number of COTS distributed over a large operation field, completing all tasks in one mission is not feasible for a limited number of vehicles. Therefore, an impact factor $\rho$ is assigned to the COTS centres to prioritize the order of tasks that should be completed and to guide the vehicles toward the destination.
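As a rough illustration of Eqs. (4)–(7), the following Python sketch computes yaw and pitch from a displacement, the velocity components from a speed and the two angles, and the straight-line distance between two spots. The function names are illustrative and not from the paper, and `atan2` is used in place of a plain arctangent, which is an assumption made here so that the quadrant is resolved when the x-displacement is zero or negative.

```python
import math

def heading_angles(dx, dy, dz):
    """Yaw and pitch (radians) from a displacement, following Eqs. (5)-(6).

    atan2 replaces arctan (an assumption of this sketch) to avoid division by zero.
    """
    yaw = math.atan2(dy, dx)
    pitch = math.atan2(-dz, math.hypot(dx, dy))
    return yaw, pitch

def velocity_components(speed, yaw, pitch):
    """Velocity components (v_x, v_y, v_z) for a given speed, following Eq. (4)."""
    vx = speed * math.cos(pitch) * math.cos(yaw)
    vy = speed * math.cos(pitch) * math.sin(yaw)
    vz = speed * math.sin(pitch)
    return vx, vy, vz

def leg_distance(p, q):
    """Euclidean distance between two 3D spots, following Eq. (7)."""
    return math.dist(p, q)

yaw, pitch = heading_angles(3.0, 4.0, 0.0)
distance = leg_distance((0.0, 0.0, 0.0), (3.0, 4.0, 12.0))
```

Under Assumption 1 (constant average speed), the travel time of a leg is simply `leg_distance(p, q) / speed`.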
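In this framework, any arbitrary route $\Re$ travelled by $\mathcal{A}_i$ is characterized by the corresponding time $\mathcal{T}^{\mathcal{M}}_{\mathcal{A}_i}$ required for travelling $\Re_{\mathcal{A}_i}$ and completing the injection process on each $\mathcal{C}_j$, which is modelled by:

$$\begin{aligned} &\Re_{\mathcal{A}_i}: \langle S^{s_i}_{xyz}, \dots, \mathcal{C}_{j,xyz}, \mathcal{C}_{l,xyz}, \dots, S^{G_i}_{xyz} \rangle \\ &\forall\, \mathcal{C}_{j,xyz}, \mathcal{C}_{l,xyz};\ \exists\, (\mathcal{D}_{\mathcal{C}_{j,l}}, \mathcal{T}_{\mathcal{C}_{j,l}}, \rho_{\mathcal{C}_j}, \rho_{\mathcal{C}_l}) \\ &\mathcal{D}_{\mathcal{C}_{j,l}} = \sqrt{(\mathcal{C}_{l,x} - \mathcal{C}_{j,x})^2 + (\mathcal{C}_{l,y} - \mathcal{C}_{j,y})^2 + (\mathcal{C}_{l,z} - \mathcal{C}_{j,z})^2} \\ &\forall\, \mathcal{D}_{\mathcal{C}_{j,l}};\ \exists\, \mathcal{T}_{\mathcal{C}_{j,l}} = \mathcal{D}_{\mathcal{C}_{j,l}} \times |v_{\mathcal{A}_i}|^{-1} + t_{\mathcal{C}_j} + t_{\mathcal{C}_l} \end{aligned} \tag{8}$$

$$\begin{aligned} &\forall\, \Re_{\mathcal{A}_i},\ \exists\, \mathcal{T}^{\mathcal{M}}_{\mathcal{A}_i} = \sum_{\substack{j=0 \\ l \neq j}}^{n} \left(\alpha \times \min(\mathcal{D}_{\mathcal{C}_{j,l}}, 1) \times \mathcal{T}_{\mathcal{C}_{j,l}}\right),\quad \alpha \in \{0,1\} \\ &Cost_{\Re_{\mathcal{A}_i}} = \lambda_1 \left|\mathcal{T}^{\mathcal{M}}_{\mathcal{A}_i} - \mathcal{T}^{\nabla}_{\mathcal{A}_i}\right| + \lambda_2 \left(\sum_{j=1}^{n} \alpha_{\mathcal{C}_j} \times \rho_{\mathcal{C}_j}\right)^{-1} + \lambda_3\, \gamma_{\Re_{\mathcal{A}_i}} \\ &\gamma_{\Re_{\mathcal{A}_i}} = \varepsilon \times \max\left(0;\ \mathcal{T}^{\mathcal{M}}_{\mathcal{A}_i} - \mathcal{T}^{\nabla}_{\mathcal{A}_i}\right) \\ &Cost_{total} = \sum_{i=1}^{𝓀} Cost_{\Re_{\mathcal{A}_i}} \end{aligned} \tag{9}$$

where $\mathcal{T}^{\mathcal{M}}_{\mathcal{A}_i}$ is the mission time completed by $\mathcal{A}_i$; $\mathcal{T}^{\nabla}_{\mathcal{A}_i}$ is the total battery time for $\mathcal{A}_i$; $S^{s_i}_{xyz}$ and $S^{G_i}_{xyz}$ correspond to the start and goal stations for $\mathcal{A}_i$; $\mathcal{D}_{\mathcal{C}_{j,l}}$ is the distance from the COTS spot $\mathcal{C}_j$ to $\mathcal{C}_l$; and $\Re_{\mathcal{A}_i}$ is the route travelled by $\mathcal{A}_i$ with a ground-referenced velocity of $|v_{\mathcal{A}_i}|$. The selection variable $\alpha$ indicates the selected COTS spots in the network, while each COTS spot $\mathcal{C}_j$ is weighted in advance by a priority value $\rho_{\mathcal{C}_j}$. The total number of injected COTS in a mission should be maximized, and the mission time should approach the total available time for $\mathcal{A}_i$, which is represented by $Cost_{\Re_{\mathcal{A}_i}}$. The distribution pattern changes over time as the vehicles eradicate the COTS. $\gamma_{\Re_{\mathcal{A}_i}}$ is the time-overdue violation term that enforces on-time completion of the mission before $\mathcal{A}_i$ runs out of battery, and $\varepsilon$ is a coefficient denoting the impact of the violation in the cost calculation. Finally, $\lambda_i$ ($i = 1,2,3$) are weighting factors that balance or highlight the corresponding mission terms in the cost function (9). These weighting factors are tuned according to the importance of the corresponding terms in the COTS eradication mission, for example the magnitude of the difference between mission time and battery time.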
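To make the structure of the cost in (9) concrete, here is a minimal Python sketch under stated assumptions: the route is passed as an ordered list of 3D points that already contains only the selected spots (so the selection variable is implicit), each spot's injection time is counted once, and the weights and violation coefficient are placeholder values rather than the tuned values of the study. The names `route_time` and `route_cost` are illustrative.

```python
import math

def route_time(route, inject_time, speed):
    """Mission time of a route: travel time of every leg plus injection times.

    route       -- ordered 3D points, start station first, goal station last
    inject_time -- injection time per point (0 for the two stations)
    speed       -- constant average speed (Assumption 1)
    """
    travel = sum(math.dist(a, b) for a, b in zip(route, route[1:])) / speed
    return travel + sum(inject_time)

def route_cost(t_mission, t_battery, priorities, lam=(1.0, 1.0, 1.0), eps=10.0):
    """Three-term cost with the same structure as Eq. (9).

    priorities -- priority values of the selected spots (must be non-empty)
    lam, eps   -- placeholder weights and violation coefficient (assumptions)
    """
    overdue = eps * max(0.0, t_mission - t_battery)   # violation term
    inverse_priority = 1.0 / sum(priorities)          # rewards high-priority spots
    return (lam[0] * abs(t_mission - t_battery)
            + lam[1] * inverse_priority
            + lam[2] * overdue)

t = route_time([(0, 0, 0), (3, 4, 0), (3, 4, 12)], [0.0, 90.0, 0.0], speed=1.0)
cost = route_cost(t, t_battery=100.0, priorities=[50.0])
```

In this toy case the route takes 107 s against a 100 s budget, so both the first and third terms penalize the 7 s overrun.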
In this particular mission, the first and third terms in the cost function (9) are of more importance for the designer and mission.

III. HEURISTIC FLEET COOPERATION (HFC)

Considering the key requirements for solving the constrained MRTA (C-MRTA) problem, a Heuristic Fleet Cooperation (HFC) algorithm, a population-based evolutionary approach, is presented for the first time in this work. This method comprises four operators, Clustering, Ordering, Screening, and Cooperation, specifically designed for solving the C-MRTA problem. In the first stage, the proposed HFC algorithm uses an automatic subdivision mechanism through the clustering process to categorize the most similar tasks into groups of confined areas. In the second stage, the solutions are iteratively evolved through the ordering process until the shortest route with an exclusive set of tasks is generated for each cluster, where no tasks remain unattended. In the third stage, the screening mechanism discards the most distant, lower-priority tasks to fit each individual route to the defined time constraint, that is, the battery life of each AUV. In this way, the algorithm aims to complete the maximum possible number of highest-priority tasks for each vehicle within the given time threshold. Finally, in the fourth stage, the cooperation mechanism allows each vehicle to use its residual time to assist the other vehicles after completing its own tasks, which leads to maximum use of all vehicles' battery life. The algorithm's control parameters can be adjusted iteratively, which enhances the convergence rate of the algorithm. The proposed method is capable of rapid prototyping of multiple optimal solutions for efficient grouping and accomplishment of the existing tasks spread over a large operation area. The detailed mechanism of the algorithm is explained in the following steps.

Clustering Operator

In this section, the initial population is generated, where each individual comprises a random sequence of tasks drawn with uniform probability. The solution space (task sequences) is identical for each cluster and improves iteratively through the evolution operators of ordering and screening. Let us assume that $𝓀$ AUVs are deployed in the operating field, as given by (10). The working environment should be divided into $c = 𝓀$ exclusive groups of tasks to avoid multi-vehicle mission overlap. The K-means and Fuzzy C-means (FCM) clustering methods are utilized in this study to divide the tasks between the AUVs.

$$\forall \mathcal{A}_i \in \langle \mathcal{A}_1, \dots, \mathcal{A}_{𝓀} \rangle,\ \exists\, c_j \in \langle c_1, \dots, c_c \rangle \tag{10}$$

Here the task data is partitioned into $c$ clusters defined by $c_j \in \{c_1, \dots, c_c\}$. The FCM algorithm attempts to partition a finite collection of $n$ elements $\partial = \{\partial_1, \dots, \partial_n\}$ into a collection of $c$ fuzzy clusters with respect to the given criteria. The FCM aims to minimize the objective function (11):

$$\underset{C}{\arg\min} \sum_{i=1}^{n} \sum_{j=1}^{c} w_{ij}^{m} \left\|\partial_i - c_j\right\|^2, \qquad w_{ij} = \frac{1}{\sum_{k=1}^{c} \left(\dfrac{\|\partial_i - c_j\|}{\|\partial_i - c_k\|}\right)^{\frac{2}{m-1}}} \tag{11}$$

Given a finite set of data, the algorithm returns a list of $c$ cluster centres and a partition matrix $W = w_{i,j} \in [0,1]$, where each element $w_{i,j}$ gives the degree to which element $\partial_i$ belongs to cluster $c_j$ ($w_{i,j}$ is also called the membership value). The fuzzifier $m \geq 1$, $m \in \mathbb{R}$, determines the level of cluster fuzziness. The FCM method assigns $c$ membership values to each task; therefore, each task belongs to all $c$ clusters, but with different degrees of membership $w_{ij}$. The membership of a task in a cluster depends on its distance from the centre of that cluster. In the K-means method, Lloyd's algorithm [21] is applied to determine the cluster centres, and each task belongs to the closest cluster centre in the environment. Clusters are refined iteratively and converge when a saturation phase emerges in which no further changes in cluster assignment occur. K-means clustering also attempts to minimize a squared error function, namely the objective function in (11) without the membership values $w_{ij}$ and the fuzzifier $m$.

Figure 4 illustrates the performance of different clustering approaches in space decomposition for a team of three AUVs. As shown in Fig. 4, 90 tasks are randomly distributed in a non-uniform environment, and three methods are applied to divide the tasks into three groups. Fig. 4(a) shows the performance of the K-means method, where all three clusters are completely separated. In Fig. 4(b), the FCM method is applied and the maximum membership value is the selection criterion for allocating each task to its cluster. In this method, the border areas between clusters are not strictly determined, which is likely to give rise to overlap among the clusters.

Fig. 4. Clustering the COTS distribution using the K-means method (a); the FCM method with the Max operator (b); and FCM with the roulette wheel operator (c).

Ordering Operator

The ordering operator plays a pivotal role in the evolution process of route planning. In this stage, for each cluster, the catching order of the tasks is changed with the aim of minimizing the route length and mission time. The process starts with randomly selecting a number of feasible solution vectors from the initial population of each cluster. A cost function is defined by (9) to validate the solutions’ quality during the evolution process. To change the placement sequence of the tasks in the routes, three conventional mechanisms namely Swap, Insertion, and Reversion are applied [22], [23]. Fig. 5 shows an example of the three mentioned methods used for the ordering stage.
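The three re-ordering mechanisms can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: routes are lists of task indices with fixed start and goal entries, at least two interior tasks are assumed, route length stands in for the full cost of (9), and the helper names (`do_swap`, `do_insertion`, `do_reversion`, `order_step`) are chosen here for readability.

```python
import math
import random

def do_swap(route, rng):
    """Exchange two interior tasks; start and goal stay in place."""
    r = route[:]
    i, j = rng.sample(range(1, len(r) - 1), 2)
    r[i], r[j] = r[j], r[i]
    return r

def do_insertion(route, rng):
    """Remove one interior task and re-insert it at another interior position."""
    r = route[:]
    i, j = rng.sample(range(1, len(r) - 1), 2)
    r.insert(j, r.pop(i))
    return r

def do_reversion(route, rng):
    """Reverse the order of a contiguous block of interior tasks."""
    r = route[:]
    i, j = sorted(rng.sample(range(1, len(r) - 1), 2))
    r[i:j + 1] = r[i:j + 1][::-1]
    return r

def route_length(route, coords):
    """Total Euclidean length of a route given the coordinates of each index."""
    return sum(math.dist(coords[a], coords[b]) for a, b in zip(route, route[1:]))

def order_step(route, coords, rng):
    """Apply one random operator and keep the result only if it is not longer."""
    op = rng.choice((do_swap, do_insertion, do_reversion))
    candidate = op(route, rng)
    if route_length(candidate, coords) <= route_length(route, coords):
        return candidate
    return route

coords = [(0, 0), (10, 0), (0, 10), (10, 10), (5, 5), (0, 0)]
rng = random.Random(1)
route = [0, 1, 2, 3, 4, 5]
improved = route
for _ in range(200):
    improved = order_step(improved, coords, rng)
```

Accepting a candidate only when it is no longer than the current route mirrors the replacement rule in Algorithm (1).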

Fig. 5. Mechanism of ordering operation.

Assuming two tasks of

Algorithm (1) – Pseudocode of Ordering

Input: ℜ^N_{𝒜_j}: ⟨S_{s_j}, …, 𝒞_i, …, S_{G_j}⟩   // Take N input individuals as "routes" for each vehicle 𝒜_j, where S_{s_j}, S_{G_j} are the start and goal stations
n = size(ℜ_{𝒜_j} = ℜ_j)   // Number of tasks in route ℜ_{𝒜_j}
For j = 1 to N   // For all routes in the population
  For i = 1 to n
    L(ind(i)) = Σ_{j=1}^{n} dist(ind(i, j))   // Calculate the length of the route
    r = rand(1, 2, 3)   // Select a random ordering method
    if r = 1
      ind_o(i) = doSwap(ind(i))   // Do Swap
    else-if r = 2
      ind_o(i) = doInsertion(ind(i))   // Do Insertion
    else-if r = 3
      ind_o(i) = doReversion(ind(i))   // Do Reversion
    end if
    L(ind_o(i)) = Σ_{j=1}^{n} dist(ind_o(i, j))   // Calculate the length of the ordered route
    if L(ind_o(i)) ≤ L(ind(i))
      ind(i) = ind_o(i)   // Replace the route with the ordered one
    end if
  end For
end For
Output: ordered individuals

Screening Operator

The screening operation is a mechanism designed in this study to eliminate the less important tasks from the AUV's route to ensure accurate mission timing. The pseudocode in Algorithm (2) describes the mechanism of the screening operator. Let $T_{available}$ be the maximum operation time assigned for the AUVs' mission (see (12)); if the time is not sufficient to cover all the tasks in a cluster, the time threshold is violated, and some tasks should be abandoned to eliminate the violation.

$$T_{available} = \sum_{j=1}^{𝓀} \mathcal{T}^{\nabla}_{\mathcal{A}_j}, \qquad T_{diff}(j) = \mathcal{T}^{\nabla}_{\mathcal{A}_j} - \mathcal{T}^{\mathcal{M}}_{\Re(j)} \tag{12}$$

where $\mathcal{T}^{\nabla}_{\mathcal{A}_j}$ is the total battery time for $\mathcal{A}_j$, $\mathcal{T}^{\mathcal{M}}_{\mathcal{A}_j}$ is the mission time completed by $\mathcal{A}_j$, and $T_{diff}$ is the residual battery time. The vehicles should complete the highest-priority tasks in such a way that the largest possible area is covered and the smallest number of far-off (distant) tasks needs to be abandoned to meet the defined time constraint. In other words, the screening approach should balance two factors: the maximum width (expanse) of the covered area, and the minimum number of tasks to be abandoned. The screening mechanism does not depend on the position of the tasks or their distance from the initial point; therefore, the tasks are abandoned uniformly from the whole area, and distant parts of the cluster are not left untouched. As a result, the widest possible area is covered by the deployed vehicles. As shown in Algorithm (2), in each iteration a certain number of individuals are randomly selected to be screened. The number of screened individuals ($nSc$) should be selected carefully, as a high screening rate may cause premature convergence of the algorithm. It was found empirically that the number of screening candidates should not be more than one percent of the population size, i.e., $nSc \leq 0.01N$.

Cooperation Operator

In highly non-uniform task distribution environments, the size of the clusters is usually non-uniform as well. In this case, some agents abandon several tasks to accomplish their mission without violating the time threshold constraint. On the other hand, in the smaller clusters, an AUV may complete all the tasks while some battery time is still left; this can be used for completing the remaining tasks in other clusters. The main role of the cooperation operator is to employ idle AUVs to accomplish the remaining tasks of any nearby cluster. This contributes to equitable use of time among the AUVs and enhances productivity. This process should be repeated until the unused time is consumed. It should be noted that during the route planning process, the travel time from the last accomplished task to the pre-determined rendezvous point should be taken into account. Algorithm (3) illustrates the pseudocode of the cooperation stage. As expressed in the algorithm, the idle vehicles are determined by checking the remaining time of the AUVs. Each idle AUV receives the position information of the abandoned tasks from the other AUVs and, after calculating the distance between the last task of its route and all the abandoned tasks, takes the closest one into its route (after taking over the new task, the ordering operator re-orders the cluster in the presence of the new task). The cooperation is repeated until the remaining time is consumed. Equation (13) gives the distance of the $i'$-th abandoned task from the latest task in the route.

$$\mathcal{D}_{\mathcal{C}_{i'}} = \sqrt{\left(\mathcal{C}_{i',x} - \mathcal{C}^{\mathcal{A}_j}_{end,x}\right)^2 + \left(\mathcal{C}_{i',y} - \mathcal{C}^{\mathcal{A}_j}_{end,y}\right)^2 + \left(\mathcal{C}_{i',z} - \mathcal{C}^{\mathcal{A}_j}_{end,z}\right)^2} \tag{13}$$

where $\mathcal{C}_{i',x}, \mathcal{C}_{i',y}, \mathcal{C}_{i',z}$ are the coordinates of the $i'$-th abandoned task, and $(\mathcal{C}^{\mathcal{A}_j}_{end,x}, \mathcal{C}^{\mathcal{A}_j}_{end,y}, \mathcal{C}^{\mathcal{A}_j}_{end,z})$ are the coordinates of the last task executed by the idle AUV $\mathcal{A}_j$. The distance should be obtained for all abandoned tasks. The described operators are iteratively applied to the population until the optimal solution is achieved.
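The cooperation step described above can be sketched as a greedy loop in Python. This is a simplified illustration under assumptions: tasks are identified by their 3D coordinates, the re-ordering that follows each takeover is omitted, and a task is accepted only if the vehicle can still reach the rendezvous point afterwards. The function name `cooperate` and its arguments are illustrative.

```python
import math

def cooperate(route, abandoned, inject_time, speed, t_used, t_battery, rendezvous):
    """Greedily add the nearest abandoned task to an idle vehicle's route.

    route       -- points already visited by the idle vehicle (last one is its position)
    abandoned   -- points of tasks abandoned by other vehicles
    inject_time -- dict mapping a task point to its injection time
    t_used      -- time already consumed up to the last task of the route
    Returns the extended route, the tasks still abandoned, and the time consumed.
    """
    route = list(route)
    remaining = list(abandoned)
    while remaining:
        last = route[-1]
        # Distance of Eq. (13): from the route's last task to each abandoned task
        nearest = min(remaining, key=lambda c: math.dist(c, last))
        extra = math.dist(last, nearest) / speed + inject_time[nearest]
        back = math.dist(nearest, rendezvous) / speed
        if t_used + extra + back > t_battery:
            break  # not enough battery time left to take the task and return
        route.append(nearest)
        remaining.remove(nearest)
        t_used += extra
    return route, remaining, t_used

new_route, left_over, used = cooperate(
    route=[(0, 0, 0)],
    abandoned=[(10, 0, 0), (100, 0, 0)],
    inject_time={(10, 0, 0): 90.0, (100, 0, 0): 90.0},
    speed=1.0, t_used=0.0, t_battery=250.0, rendezvous=(0, 0, 0),
)
```

In this toy case the nearer task fits within the budget, while the farther one would leave too little time for the return leg and stays abandoned.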
Algorithm (4) is the general pseudocode of the proposed HFC algorithm.

Algorithm (2) – Pseudocode of Screening Stage

Input: the individual to be screened
For j = 1 to 𝓀   // Take N input individuals as "routes" for each vehicle 𝒜_j, where S_{s_j}, S_{G_j} are the start and goal stations
  ℜ^N_{𝒜_j}: ⟨S_{s_j}, …, 𝒞_i, …, S_{G_j}⟩
  n′ = size(ℜ_{𝒜_j} = ℜ_j)   // Number of tasks in route ℜ_{𝒜_j}
  l = {1, 2, …, n}   // Allocated task indices
  For i = 1 to n   // For each task with index i
    While T^M_{𝒜_j} > T^∇_{𝒜_j}   // Check the time violation
      ℜ_j ∖ {l} = {𝒞_i : 𝒞_i ∈ ℜ_j, ∼(𝒞_i ∈ {l})}   // Exclude task 𝒞_i from route ℜ_j
      mask = ones(1, n)   // Create a 1-by-n neutral matrix as mask
      mask(i) = 0   // Disable the i-th element in the mask
      SR(𝒞_i) = mask × ℜ_j   // Disable 𝒞_i by applying the mask
      Cost_{ℜ_j}(SR(𝒞_i))   // Calculate the cost in the absence of 𝒞_i
      T^M_{𝒜_j} = (D_{ℜ_j} − D_{ℜ_j}(SR(𝒞_i))) / |v_{𝒜_j}| − t_{𝒞_i}   // Calculate the mission time T^M_{𝒜_j} in the absence of task 𝒞_i
    end While
    if Cost_{ℜ_j}(𝒞_i) > Cost_{ℜ_j}(SR(𝒞_i))   // Find the order number of the minimum cost
      ℜ = ℜ_j(SR(𝒞_i))   // Exclude task 𝒞_i from ℜ_j
    else-if Cost_{ℜ_j}(𝒞_i) < Cost_{ℜ_j}(SR(𝒞_i))
      ℜ = ℜ_j(𝒞_i)   // Include task 𝒞_i in ℜ_j
    end if
  end For
  ScreenedRoute = ℜ   // Find the best screened route
end For
Output: screened individuals
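A compact way to see the effect of screening is the following Python sketch. It is not a transcription of Algorithm (2): the removal criterion used here (drop the task that saves the most mission time per unit of priority) is an assumption chosen for brevity, whereas the paper compares route costs with and without each task. Tasks are identified by their coordinates, and `mission_time` and `screen` are illustrative names.

```python
import math

def mission_time(route, inject, speed):
    """Travel time over all legs plus injection time of the interior tasks."""
    travel = sum(math.dist(a, b) for a, b in zip(route, route[1:])) / speed
    return travel + sum(inject[p] for p in route[1:-1])

def screen(route, inject, priority, speed, t_battery):
    """Drop interior tasks one at a time until the route fits the battery time.

    Assumed criterion: remove the task with the largest time saving per unit
    of priority, so distant low-priority tasks are abandoned first.
    """
    route = list(route)
    while len(route) > 2 and mission_time(route, inject, speed) > t_battery:
        current = mission_time(route, inject, speed)

        def saving_per_priority(i):
            shorter = route[:i] + route[i + 1:]
            return (current - mission_time(shorter, inject, speed)) / priority[route[i]]

        drop = max(range(1, len(route) - 1), key=saving_per_priority)
        del route[drop]
    return route

start = goal = (0, 0, 0)
a, b = (10, 0, 0), (200, 0, 0)
screened = screen(
    [start, a, b, goal],
    inject={a: 90.0, b: 90.0},
    priority={a: 100.0, b: 1.0},
    speed=1.0, t_battery=300.0,
)
```

Here the full route needs 580 s, so the distant low-priority task `b` is dropped and the remaining route takes 110 s.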

Algorithm (3) – Pseudocode of Cooperation Stage

- Get the labels of the abandoned tasks: q = {𝒞_{i′}, …, 𝒞_{i′′}}
- Get the positions of the abandoned tasks: {𝒞_{i′,xyz}, …, 𝒞_{i′′,xyz}}
- Find the idle vehicle: 𝒜_j = 𝒜_idle
- Get the absolute velocity of the idle vehicle: |v_{𝒜_j}|

ℜ_j = ℜ_{𝒜_idle}   // Take the idle vehicle's route as "route"
While T^M_{𝒜_j} < T^∇_{𝒜_j}   // T^∇_{𝒜_j} is the total battery time for 𝒜_j
  n′ = size(q)   // Number of abandoned tasks
  For j′ = 1 to n′
    l′ = ind{q}   // Abandoned task indices
    For i′ ∈ l′ do   // For each abandoned task
      get 𝒞_{i′,xyz}   // Get the position of the abandoned task
      D_{𝒞_{i′}} = dist(𝒞_{i′,xyz}, 𝒞^{𝒜_j}_{end,xyz})   // Distance of the abandoned task 𝒞_{i′} from the route's endpoint 𝒞^{𝒜_j}_{end}
    end For
    ∀ i′ ∈ l′; 𝒞_{i′}(best) = min(D_{𝒞_{i′}})   // Find the order number of the minimum distance
    T_{𝒞_{i′}}(best) = Σ_{j′=1}^{n′} (D_{𝒞_{i′}}(best) × |v_{𝒜_j}|^{-1}) + t_{𝒞_{i′}(best)}   // Time for completing the abandoned task 𝒞_{i′}(best)
    ℜ_{j′} = [ℜ_j + 𝒞_{i′}(best)]   // Take over the nearest abandoned task
  end For   // Repeat the process
  ∀ ℜ_{j′}; ∃ T^M_{𝒜_j} = (T_{ℜ_j}) + T_{𝒞_{i′}}(best)   // Calculate the mission time T^M_{𝒜_j} for ℜ_{𝒜_j}
end While
Output: BestCooperation = [ℜ_{j′}]

According to Algorithm (4), the HFC has a loop corresponding to the number of vehicles ($𝓀$). Hence, analysis of the algorithm itself may suggest that it converges with a complexity of $O(𝓀Nt)$. However, considering the complexity of the MRTA problem over all possible instances, the complexity of HFC also relies on the implementation of the evolution operators (Ordering, Screening, and Cooperation), the design and encoding of the individuals in the population, and the complexity of the cost function, which can significantly impact the convergence speed of the HFC. Hence, given the possible choices for evolution operators, in the extreme condition the algorithm complexity is obtained as follows:

$$O\left(𝓀 N n t \left(2 + \frac{1}{n}\right)\right) = O\Big(\big(\overbrace{𝓀 N n}^{\text{Ordering}} + \overbrace{𝓀 N n}^{\text{Screening}} + \overbrace{𝓀 N}^{\text{Cooperation}}\big) \times t\Big) \tag{14}$$

where $N$ is the population size and $n$ is the size of the individuals (solution vectors). The cost function complexity is ignored in this computation as it depends on the application.

IV. RESULTS AND DISCUSSIONS

The AUVs' mission planner in this research uses a priori information on the COTS distribution (described in Section II (A)), the maximum operation time, and the battery capacity of each vehicle to compute the most appropriate order of tasks (from the beginning toward the destination). Perception of the operating field is achieved via the AUVs' navigation aids, such as the Horizontal Acoustic Doppler Current Profiler (H-ADCP) and Doppler Velocity Logger (DVL), and their situational awareness modules. Information sharing and exchange between the AUVs are carried out via acoustic communication using DUSBLs. In this study, all computations were performed on a desktop PC with an Intel i7 3.20 GHz quad-core processor in MATLAB® 2019a. In the subsequent sections, different mission scenarios are defined, and the performance of the proposed algorithm in eradicating the maximum number of COTS is evaluated in detail.

A. Qualitative assessment of HFC-based mission planning

To evaluate the performance of the proposed algorithm on multi-vehicle cooperative task assignment and completion, the following modes are defined:

i. Non-Cooperative mode 1 (NCM1): only the ordering operator is switched on, while the screening and cooperation operators remain off;
ii. Non-Cooperative mode 2 (NCM2): the ordering and screening operators are switched on, and the cooperation operator remains off;
iii. Cooperative mode (CM): the ordering, screening and cooperation operators are all switched on.

In the first scenario, the HFC-based mission planner's performance is investigated in the NCM1 mode, where the vehicles do not communicate with each other; however, the catching order of the tasks for each cluster is changed with the aim of minimizing the route length and mission time. In this mode, each vehicle plans its mission individually, regardless of the others' missions. The time constraint is not considered in NCM1 mode, and each vehicle aims at completing all the existing tasks in the assigned cluster by taking the shortest route and minimizing the route time. Consequently, large time violations are expected in this scenario. The second scenario is an augmented version of the first one, in which the screening operator is also enabled to identify and decide the tasks that should be abandoned, while considering

Algorithm (4) – Pseudocode of HFC algorithm

Inputs: Population size (𝑁), population index (𝑃), maximum iteration (𝑡), number of vehicles (𝓀), maximum available time (𝒯^∇_{𝒜_j}) for vehicle 𝒜_j, number of screened individuals (𝑛𝑆𝑐), number of tasks in a route (𝑛)

// Initialization
define tasks: 〈𝒞_{i,xyz}〉; i ∈ {1, …, n}
For i = 1 to N
    ind(i) = cluster(tasks, 𝓀)
    For j = 1 to 𝓀
        get(ind(i, j))
        Cost_ℜ(i, j) = cost(ind(i, j))
        ℜ(i, j) = ind(i, j)
    end For
end For

// HFC main loop
For k = 1 to t
    // Ordering
    For i = 1 to N
        For j = 1 to 𝓀
            ℜ_ord(i, j) = order(ℜ(i, j))
            Cost_ord(i, j) = Cost(ℜ_ord(i, j))
            if Cost_ord(i, j) < Cost_ℜ(i, j)
                ℜ(i, j) = ℜ_ord(i, j)
            end if
        end For
    end For
    Output ordered individuals ℜ_ord

    // Screening
    Sc = randomSample(ℜ_ord, nSc)
    For l = 1 to Sc
        For j = 1 to 𝓀
            if 𝒯^ℳ_{𝒜_j} > 𝒯^∇_{𝒜_j}
                ℜ_scr(l, j) = screen(ℜ_ord(l, j))
                if Cost_{ℜ_j}(𝒞_i) > Cost_{ℜ_j}(SR(𝒞_i))
                    ℜ_j = ℜ_scr
                else-if Cost_{ℜ_j}(𝒞_i) < Cost_{ℜ_j}(SR(𝒞_i))
                    ℜ = ℜ_ord
                end if
            end if
        end For
    end For
    Output the screened individuals {ℜ}

    // Cooperation
    For j = 1 to 𝓀
        T_diff(j) = 𝒯^∇_{𝒜_j} − 𝒯^ℳ_{ℜ(j)}
        if T_diff(j) > 0
            ℜ′(j) = cooperate(ℜ(j))
            ℜ_best(j) = order(ℜ′(j))
        end if
    end For
end For

Output:

The shortest route ℜ for each 𝒜_𝓀 with minimum T_diff and best order of maximum completed tasks.

non-cooperative, and the main drawback of this mode is that if any vehicle completes its mission with considerable remaining time, it cannot use this time to help other vehicles complete their tasks. On the other hand, the CM enables the vehicles to communicate with each other to compensate for battery restrictions and to cover multiple missions, while managing their endurance time to handle more COTS injection tasks. In this case, if any vehicle is discarded from the team (aborts its mission for any reason) or runs out of battery, the closest vehicle(s) with the most similar configuration undertake(s) the incomplete mission, or the mission is divided between the operational vehicles and the planner module re-plans a new mission scenario for them. This improves mission coverage range and area, system resilience, and risk management.

To evaluate the performance of the proposed algorithm, three homogeneous COTSbot AUVs, all configured with the same settings, are deployed to a coastal environment. The battery lifetime threshold for each vehicle is set to 3.6×10³ sec, and the vehicles are assumed to operate at a constant average velocity of 1 m/s. Ninety tasks (starfish spots) are non-uniformly distributed in an area of (1000×1000) m². The injection time (task completion time) for each COTS spot is 90 sec. The AUVs start their mission from the same initial position, and after completing the mission they should arrive at the rendezvous point (a coordinate known a priori). Figure 6 (a-1, b-1, c-1) shows a contour map of the operating field, including the tasks, the route lines, the start point, and the rendezvous point.
The colormap represents the density of the COTS-covered area, with the distribution probability ranging from 0 to 1 (from least to most probable starfish mass in the area). In this figure, the lighter colors, tending toward yellow, correspond to the higher-density COTS areas, and the darkest parts, shown in black, are the shore lands. On closer inspection, the tasks are concentrated around the deeper spots denoted by lighter colors, and the shallow areas contain no tasks. To demonstrate the effect of the algorithm’s operators, the operators are enabled step by step, and their contributions are displayed in different contour maps. As demonstrated in Fig.6, clusters (1) and (3) encircle the higher number of tasks, while the 2nd cluster contains fewer tasks. We use this difference to show how the cooperative mode enables vehicles to undertake others’ tasks in their spare time.

In the first scenario (NCM1), given by Fig.6 (a-1) and (a-2), the time constraint is not taken into account, and the goal is to find the shortest route through the tasks. In this scenario, the main purpose of the ordering operator is to maximize the number of completed tasks regardless of the time threshold, which is achieved, as the pattern of completed tasks in Fig.6 (a-2) shows. In the second scenario, given by Fig.6 (b-1) and (b-2), NCM2 is applied, in which both the screening and ordering operators are enabled to maximize the completed tasks and minimize the route length while taking the time constraint into account. As shown in Fig.6 (b-2), some tasks are abandoned by the 1st and 3rd AUVs due to the applied time threshold. The 2nd AUV, in contrast, only has to accomplish the small number of tasks in its smaller cluster; in addition, all three AUVs arrive at the rendezvous point without violating the time threshold.
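The text does not spell out how the ordering operator rearranges a route, so the following is only a minimal sketch of one common way to shorten a route with fixed endpoints, namely a 2-opt pass. The function names `route_length` and `two_opt` are hypothetical, and 2-opt is a stand-in technique here, not the ordering operator of HFC itself.

```python
import math


def route_length(points):
    """Total Euclidean length of a route given as a list of (x, y) points."""
    return sum(math.dist(a, b) for a, b in zip(points, points[1:]))


def two_opt(points):
    """Reorder the interior tasks with 2-opt segment reversals.

    The first point (start) and the last point (rendezvous) stay fixed.
    This is an assumed stand-in for the ordering step, not the paper's operator.
    """
    best = list(points)
    improved = True
    while improved:
        improved = False
        for i in range(1, len(best) - 2):
            for j in range(i + 1, len(best) - 1):
                # Reverse the segment best[i..j] and keep it if the route gets shorter.
                candidate = best[:i] + best[i:j + 1][::-1] + best[j + 1:]
                if route_length(candidate) < route_length(best) - 1e-9:
                    best = candidate
                    improved = True
    return best


# Toy route: start, three tasks visited in a poor order, rendezvous.
toy_route = [(0, 0), (10, 0), (5, 0), (15, 0), (20, 0)]
ordered = two_opt(toy_route)
print(ordered, route_length(ordered))
```

In NCM1 only a step of this kind is active, so the route gets shorter but nothing prevents it from exceeding the available time.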
Although the time threshold is not violated by the vehicles, the 2nd vehicle could effectively use its residual time to assist the other vehicles after completing its own tasks, which is addressed by the cooperation operator. The CM is applied in the third scenario, and the complete performance of the HFC algorithm is illustrated by Fig.6 (c-1) and (c-2), where the ordering, screening and cooperation operators are enabled. Considering the task completion pattern in the CM (see Fig.6 (c-2)), the 2nd vehicle successfully completes all the existing tasks in cluster (2) in the given time and devotes its remaining time to cooperating with the 3rd vehicle to accomplish the abandoned tasks in cluster (3). In contrast, there is no such cooperation in the first two scenarios, as all vehicles concentrate only on their assigned cluster without being aware of the other vehicles’ operation details. This is a good indication of how cooperation improves task completion performance when the vehicles communicate and update each other on their completed tasks. Ultimately, all three vehicles terminate their mission at the rendezvous point.

Figure 7 demonstrates the algorithm performance for all three scenarios based on the mission cost metric, which is a direct function of the number of completed tasks and the time violation. It is worth mentioning that, since the evaluation criterion for deciding the best solution is the total cost of the three vehicles, in some iterations one vehicle may experience a slight increase in cost compared with the previous iterations, but the sum of the three costs is always less than in the previous iteration. As shown in Fig.7 (a), (b), the total cost follows a decreasing trend and settles at about 2.27×10⁵ after 150 iterations.
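The cooperation step described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' operator: mission time is assumed to be travel time at 1 m/s plus 90 sec of injection per task, with the 3600 sec limit from the text, and the helper names `mission_time` and `cooperate`, as well as the greedy nearest-task rule, are hypothetical.

```python
import math

SPEED = 1.0         # m/s, constant average velocity stated in the text
INJECT_TIME = 90.0  # sec per COTS spot, stated in the text
T_MAX = 3600.0      # sec, battery lifetime threshold per vehicle


def mission_time(points):
    """Assumed time model: travel time plus injection time for each task.

    `points` is [start, task_1, ..., task_n, rendezvous].
    """
    travel = sum(math.dist(a, b) for a, b in zip(points, points[1:]))
    return travel / SPEED + (len(points) - 2) * INJECT_TIME


def cooperate(route, abandoned, t_max=T_MAX):
    """Greedily add abandoned tasks to a route while it stays within t_max.

    Hypothetical rule: always try the abandoned task nearest to the last
    visited point and insert it just before the rendezvous point.
    """
    route, remaining, taken = list(route), list(abandoned), []
    while remaining:
        last_visited = route[-2]
        nearest = min(remaining, key=lambda p: math.dist(last_visited, p))
        candidate = route[:-1] + [nearest, route[-1]]
        if mission_time(candidate) > t_max:
            break  # no residual time left for the nearest task
        route = candidate
        remaining.remove(nearest)
        taken.append(nearest)
    return route, taken


# Toy example: a vehicle with spare time and two tasks abandoned by a neighbour.
own_route = [(0, 0), (100, 0), (0, 0)]
new_route, taken = cooperate(own_route, [(200, 0), (5000, 0)], t_max=1000.0)
print(taken, mission_time(new_route))
```

The residual time T_diff of the helping vehicle shrinks with every task it takes over, which matches the qualitative behaviour reported for the 2nd vehicle in the CM.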
Even though the number of completed tasks in the first scenario is high, the screening operator is off, which means the planner cannot manage the time violation well. Therefore, the mission planning cost in the first scenario (Fig.7 (a)) is considerably higher than in the other two scenarios. In the second scenario, the screening operator is enabled along with the ordering operator, meaning that the planner tends to maximize the number of completed tasks and minimize the route length while taking the time constraint into account. The time constraint forces the vehicles to abandon some of the tasks in the crowded clusters. Considering the total cost trend in Fig.7 (b), the algorithm effectively suppresses the total cost, which settles at 0.71×10⁵, remarkably less than the cost in Fig.7 (a). Although the total cost in Fig.7 (b) is significantly reduced compared to the first scenario, it is still almost twice the total cost in Fig.7 (c), because each vehicle is only responsible for its assigned cluster. It is inferred from Fig.7 (c) that the best performance in minimizing the cost belongs to the 3rd scenario (0.38×10⁵), in which the vehicles can cooperate with each other.

Fig.6. (a-1) and (a-2) The task completion pattern by multiple vehicles in NCM1, where the ordering operator is on and the screening and cooperation operators remain off; (b-1) and (b-2) the task completion pattern in NCM2, where the ordering and screening operators are functioning but the cooperation operator is disabled; (c-1) and (c-2) CM, where the ordering, screening and cooperation operators are functioning.
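How a screening step can trade completed tasks for feasibility is sketched below. It assumes the same simple time model (travel at 1 m/s plus 90 sec per task) and a hypothetical rule that repeatedly drops the task whose removal saves the most time; the paper's actual screening policy may select tasks differently, and the names `mission_time` and `screen` are illustrative only.

```python
import math

SPEED = 1.0         # m/s, from the text
INJECT_TIME = 90.0  # sec per task, from the text
T_MAX = 3600.0      # sec per vehicle, from the text


def mission_time(points):
    """Assumed model: travel time plus injection time; endpoints are not tasks."""
    travel = sum(math.dist(a, b) for a, b in zip(points, points[1:]))
    return travel / SPEED + (len(points) - 2) * INJECT_TIME


def screen(points, t_max=T_MAX):
    """Drop tasks one at a time until the route fits into t_max.

    Hypothetical rule: remove the interior task whose removal gives the
    shortest remaining mission. Start and rendezvous are never removed.
    """
    route, dropped = list(points), []
    while mission_time(route) > t_max and len(route) > 2:
        k = min(range(1, len(route) - 1),
                key=lambda i: mission_time(route[:i] + route[i + 1:]))
        dropped.append(route.pop(k))
    return route, dropped


# Toy cluster with one distant task that makes the route infeasible.
toy_route = [(0, 0), (100, 0), (1000, 0), (200, 0), (0, 0)]
feasible, dropped = screen(toy_route, t_max=1000.0)
print(feasible, dropped, mission_time(feasible))
```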

The total cost and the cost of individual vehicles are reduced iteratively, meaning that the population effectively converges toward the optimum solution, i.e., making the best use of the available mission time. To further assess the performance of the proposed algorithm on the given scenarios, qualitative assessment is undertaken with the performance indices of time violation and time difference between the battery lifetime (3600 sec) and the actual operation time of each vehicle. Figure 8 demonstrates the evolution trend of the proposed algorithm in the mentioned scenarios. Fig.8 (a-2) shows that the 1st and 3rd AUVs take negative values of time difference (T_diff), and they cannot entirely eliminate the violation (see Fig.8 (a-1)); this means that these two AUVs run short of time for accomplishing their missions. Although the 2nd AUV does not violate the time limit (due to the small size of cluster (2)), the corresponding T_diff is significantly high; this time difference could be used to reduce the load of the other vehicles and help them eliminate their time violations.

Fig.7. Mission performance in terms of total and individual vehicles’ cost variation over 150 iterations, where: (a) only the ordering operator is functioning while the screening and cooperation operators are disabled (NCM1); (b) the ordering and screening operators are functioning but the cooperation operator is disabled (NCM2); (c) the ordering, screening and cooperation operators are switched on (CM).

Fig.8. Mission performance in terms of individual vehicles’ time violations in three scenarios: (a-1) only the ordering operator is functioning while the screening and cooperation operators are disabled (NCM1); (b-1) the ordering and screening operators are functioning but the cooperation operator is disabled (NCM2); (c-1) the ordering, screening and cooperation operators are functioning (CM). Mission time difference (T_diff) for individual vehicles over 150 iterations, where: (a-2) NCM1; (b-2) NCM2; (c-2) CM.

Referring to Fig.8 (b-1) and (b-2), the performance of the algorithm is improved by enabling the screening operator, which manages the time violation of the operations. However, due to the large size of clusters (1) and (3), the vehicles are forced to ignore some of the tasks to satisfy the time constraint and consequently eliminate the violation. Meanwhile, the 2nd vehicle ends up with a large positive time difference, which could be used to handle unaddressed tasks of the nearby clusters. Figure 8 (c-1) and (c-2) demonstrate the ultimate performance of the algorithm when the cooperation operator is enabled. In this scenario, the 2nd AUV devotes its remaining time to the other AUVs to cooperate in handling tasks. The time difference in Fig.8 (c-2) and the violation graphs in Fig.8 (c-1) support the performance of the proposed algorithm. As illustrated in Fig.8 (c-1), the 2nd AUV does not violate the time constraint over the 150 iterations, while the 1st and 3rd AUVs experience a high level of violation at early iterations; however, the algorithm gradually reduces the violation value, which touches the zero line after 100 iterations. Tracking the cost trend for each vehicle in Fig.7 (c), Fig.8 (c-1), and Fig.8 (c-2) reveals that the algorithm drives the solutions toward the least cost and eliminates the violation within 150 iterations in the CM. Table I provides a detailed numerical comparison of the algorithm performance for the mentioned scenarios. As indicated in Table I, in NCM1, vehicles (1) and (3) terminate the mission with large negative time differences of V1 = −2,002 sec and V3 = −2,433 sec, respectively. This means that the planner in NCM1 does not consider the time restriction, as the screening operator is disabled.
Consequently, the mission planner in NCM1 can complete the maximum number of tasks, since no restriction forces the planner to avoid a negative time difference. Due to the smaller size of cluster (2), which includes only 13 tasks, the 2nd vehicle can complete all 13 tasks in its mission with a positive time difference of V2 = +806 sec; this time could be used to take over part of the load of the other vehicles and reduce their violation. It should be noted that the cost value is directly related to the violation; therefore, the cost value for the first scenario is considerably higher than for the other two modes. In the second scenario, there is no violation, and T_diff is positive for all vehicles; however, the operation cost of 0.714×10⁵ is still larger than that of the third scenario, with a total cost of 0.386×10⁵. The reason is that the second scenario only enables the ordering and screening operators, which help the vehicles individually optimize the route length and the number of tasks with no time violation. This forces the vehicles in larger clusters to ignore many of their tasks to meet the time restriction, whereas the vehicles in smaller clusters can complete all their tasks in a small portion of their available time and finish their mission without considering the nearby clusters. As a result, only 52 tasks out of 90 are completed, which leads to a higher cost compared to the cooperative mode. Ultimately, Table I shows that the operation time for all vehicles in the CM is smaller than the total available time, and the violation value for all solutions is equal to zero. This confirms the feasibility of the produced routes. The total T_diff for the cooperative operation is only +124.81 sec out of the total available time of 10800 sec (3×3.6×10³) for three vehicles.
The total operation cost for the cooperative mode (0.386×10⁵) is significantly less than in the other two scenarios due to the use of all three operators: ordering (maximizes the number of completed tasks), screening (satisfies the time restriction), and cooperation (facilitates efficient mission time management), confirming the superior performance of the CM. As can be seen in the CM, the 2nd AUV completes all 13 tasks in its cluster and allocates its remaining time to cooperating with the 3rd AUV to handle 6 more tasks; this ends up with a small time difference of +93.07 sec. Referring to the results in Table I, the operation cost for the CM decreases by 83.02% compared to NCM1 and by 45.94% compared to NCM2. The planner in NCM1 completes 100% of the existing tasks because it has no time restriction, while this number decreases to 57.78% (52 of 90) for NCM2 due to the added time restriction. In the third scenario, 67.78% (61 of 90) of the existing tasks are completed, a decline of 32.22 percentage points with respect to NCM1 due to the time restriction and an improvement of 10 percentage points compared to NCM2 due to the cooperation operator. Moreover, the residual time for the CM is reduced to 14.89% of the residual time for NCM2, which means that the CM makes the best use of the total available time to complete the maximum possible number of tasks.
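The derived percentages in this comparison follow from the Table I entries by simple arithmetic, as the short check below illustrates. The dictionary layout and function names are hypothetical; the numbers are transcribed from Table I and the text, and the NCM1 total cost uses the rounded value of about 2.27×10⁵, so its cost reduction is only approximate.

```python
N_TASKS = 90  # number of COTS spots in the scenario

# Values transcribed from Table I and the surrounding text.
# The NCM1 cost is the rounded "about 2.27e5" figure.
TABLE_I = {
    "NCM1": {"cost": 2.27e5, "completed": [41, 13, 36],
             "t_diff": [-2002.0, 806.0, -2433.0]},
    "NCM2": {"cost": 0.714e5, "completed": [19, 13, 20],
             "t_diff": [12.97, 817.03, 8.16]},
    "CM": {"cost": 0.386e5, "completed": [21, 19, 21],
           "t_diff": [17.48, 93.07, 14.26]},
}


def completion_rate(mode):
    """Share of the 90 tasks completed by all vehicles, in percent."""
    return 100.0 * sum(TABLE_I[mode]["completed"]) / N_TASKS


def cost_reduction(mode, baseline):
    """Relative decrease of the total cost of `mode` versus `baseline`, in percent."""
    old, new = TABLE_I[baseline]["cost"], TABLE_I[mode]["cost"]
    return 100.0 * (old - new) / old


def residual_ratio(mode, baseline):
    """Total residual time of `mode` as a percentage of that of `baseline`."""
    return 100.0 * sum(TABLE_I[mode]["t_diff"]) / sum(TABLE_I[baseline]["t_diff"])


for mode in TABLE_I:
    print(mode, round(completion_rate(mode), 2))
print(round(cost_reduction("CM", "NCM2"), 2), round(residual_ratio("CM", "NCM2"), 2))
```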

TABLE I. NUMERICAL ASSESSMENT OF THE ALGORITHM PERFORMANCE IN NCM1, NCM2, AND CM.

| Scenario | Active operators | Vehicle | Operation cost | Operation duration (sec) | Time difference (sec) | Completed tasks (No.) | Violation |
|---|---|---|---|---|---|---|---|
| NCM1 | Ord: ON, Scr: OFF, Col: OFF | V1 | ≅ … | 5,602.00 | −2,002 | 41 | 0.4081 |
| | | V2 | ≅ … | 2,794.00 | +806 | 13 | 0.0000 |
| | | V3 | ≅ … | 6,033.00 | −2,433 | 36 | 0.3630 |
| | | Total | ≅ 2.27×10⁵ | … | V1, V3: Failed; V2 = +806 | 90 / 90 | 0.7711 |
| NCM2 | Ord: ON, Scr: ON, Col: OFF | V1 | ≅ … | 3,587.03 | +12.97 | 19 | 0.0000 |
| | | V2 | ≅ … | 2,782.97 | +817.03 | 13 | 0.0000 |
| | | V3 | ≅ … | 3,591.84 | +08.16 | 20 | 0.0000 |
| | | Total | ≅ 0.714×10⁵ | … | +838.16 | 52 / 90 | … |
| CM | Ord: ON, Scr: ON, Col: ON | V1 | ≅ … | 3,582.56 | +17.48 | 21 | 0.0000 |
| | | V2 | ≅ … | 3,506.93 | +93.07 | 19 | 0.0000 |
| | | V3 | ≅ … | 3,585.74 | +14.26 | 21 | 0.0000 |
| | | Total | ≅ 0.386×10⁵ | … | +124.81 | 61 / 90 | … |

B. Quantitative assessment of HFC-based mission planning

To assess the robustness and reliability of the designed cooperative mission planner, 30 execution runs are performed in a Monte Carlo simulation for each vehicle in the CM. Figs.9-12 demonstrate the quantitative performance of the cooperative mission planner in dealing with deformation of the problem space, measured against several significant mission metrics including mission cost, time difference (T_diff), and number of successfully completed tasks (number of killed COTS spots). The COTS distribution and the topology of the graph are changed randomly based on a Gaussian distribution over the problem search space. The total battery life is set to 3.6×10³ sec for each vehicle. The desired performance is defined as the maximum number of COTS kills subject to the vehicles’ battery capacity/mission time (∑_{j=1}^{𝓀} 𝒯^∇_{𝒜_j}). A time violation occurs when any of the AUVs goes beyond the specified time threshold, so that its operation takes longer than the battery capacity allows.

Fig.9. (a) Mission cost variation in 30 Monte Carlo simulations in dealing with space deformation; (b) time violation over the given time threshold of 3.6×10³ sec for each vehicle in 30 experiments.

Figure 9 (a) shows that the median of the mission cost is about 2.3 for all three vehicles, with a maximum range of variation of about 0.5 under the Monte Carlo experiments; this reveals the robustness of the planner against variations. It should be noted that within the 30 Monte Carlo runs there are only four failures (13% failure rate), indicated by outliers in Fig.9 (a). Fig.9 (b) shows that the mission planner is able to meet the specified constraint, as the time violation for all three vehicles in the 30 missions is centered on zero, which is a good indication of the system’s time management capability. The violation diagram shows how the generated solutions respect the defined restriction (restriction of mission time to available time and feasibility criteria).
This means that the mission planner drives the solutions for all three vehicles toward the least cost and eliminates the violation, as the range of violation variations converges to zero in all Monte Carlo experiments. Figure 10 illustrates the results of the Monte Carlo trials against performance metrics such as route length, mission time, and number of completed tasks.

Fig.10. (a) Average distance travelled by each vehicle in 30 Monte Carlo simulations; (b) average operation time for each vehicle; (c) average number of completed tasks by each vehicle.
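The structure of such a Monte Carlo study can be sketched as below. The sketch assumes that the deformation is modelled as zero-mean Gaussian noise added to every task position inside the 1000 m × 1000 m field; the noise level, the stand-in planner `toy_planner`, and all function names are hypothetical and only illustrate the evaluation loop and the box-plot style summary, not the HFC planner itself.

```python
import random
import statistics


def perturb(tasks, sigma, rng, lo=0.0, hi=1000.0):
    """Shift every task by zero-mean Gaussian noise, clamped to the field."""
    def clamp(v):
        return min(hi, max(lo, v))
    return [(clamp(x + rng.gauss(0.0, sigma)), clamp(y + rng.gauss(0.0, sigma)))
            for x, y in tasks]


def monte_carlo(planner, tasks, runs=30, sigma=25.0, seed=0):
    """Run `planner` on `runs` perturbed copies of `tasks` and collect its outputs."""
    rng = random.Random(seed)
    return [planner(perturb(tasks, sigma, rng)) for _ in range(runs)]


def summarize(values):
    """Box-plot style summary: median, interquartile range, and extremes."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    return {"median": statistics.median(values), "iqr": q3 - q1,
            "min": min(values), "max": max(values)}


def toy_planner(tasks):
    """Stand-in planner (hypothetical): counts tasks within 400 m of the field centre."""
    return sum(1 for x, y in tasks if (x - 500) ** 2 + (y - 500) ** 2 <= 400 ** 2)


layout_rng = random.Random(1)
base_tasks = [(layout_rng.uniform(0, 1000), layout_rng.uniform(0, 1000))
              for _ in range(90)]
completed = monte_carlo(toy_planner, base_tasks)
print(summarize(completed))
```

Replacing `toy_planner` with a real planner that returns cost, operation time, or completed tasks would give the per-metric distributions that the box plots report.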

Fig.10 (a) shows that the minimum and maximum statistical ranges of travelled route length belong to the second vehicle (~700 m) and the third vehicle (~800 m), respectively, while all three vehicles have an approximate median route length of 1.5 km. The boxplots in Fig.10 (b) indicate that the variation of the operation time for all three vehicles is confined to a very narrow band of 3.5×10³ to 3.6×10³ sec in all executions. This confirms that all three vehicles tend to use their maximum battery time (3.6×10³ sec) efficiently in the CM. In addition, Fig.10 (b) indicates that the planner robustly balances the available time and the actual operation time, as the average variations for all executions lie in a similar range, very close to the upper bound of 3.6×10³ sec. This confirms efficient cooperation of the vehicles in handling the tasks, such that none of them is left with a considerable time difference (residual battery time). Also, Fig.10 (c) demonstrates that the vehicles in the CM are able to accomplish the COTS killing task with a 70% success rate (sum of medians of 63 out of 90), which is consistent with the completion rate of 67.78% indicated in Table I. This is another indication of the robustness of the algorithm under dynamicity and uncertainty of the operating environment.

Fig.11. Time management performance for each vehicle in the CM.

Figure 11 shows the time management performance of the mission planner for all three vehicles under 30 Monte Carlo trials. The operation commences with the equal time constraint of 𝒯^∇_{𝒜_i} = 3600 sec for each vehicle (presented by the red dashed horizontal line in Fig.11). As shown in Fig.11, T_diff is very small for all three vehicles in all missions, indicating optimal use of the mission time (𝒯^ℳ_{𝒜_i}) in the CM even under variations of the operating field (a good indication of the robustness of the planner’s time management). Fig.12 provides a zoomed-in view of the time difference index for all three vehicles under 30 Monte Carlo trials. As shown in Fig.12, the planner manages the time reasonably, as T_diff in most cases takes a tiny positive value, meaning that the vehicles accomplish their mission before running out of battery. To be more specific, the best performance belongs to experiment … T_diff …. Also, in a number of experiments, negligible delays are found; for example, experiment … sec for vehicle V…, experiment … sec for V…, experiment … sec for V…, and experiment … sec for V…, which can be ignored as they are insignificant compared to the total available time of 3600 sec.

C. Comparative Study

The purpose of this section is to evaluate the performance of the proposed HFC algorithm against an existing state-of-the-art benchmark method. To this end, the benchmark method presented in [24] is used. That study proposed a GA-based algorithm to solve a cooperative task allocation problem for a multi-robot system. In that work, the operation space was decomposed via the k-means clustering method, and the GA was employed to optimize the route length of each robot in its assigned cluster. The partially mapped crossover (PMC) method [25] was utilized for implementing the crossover term, and the mutation term played a role analogous to the ordering operator of the proposed HFC.
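For readers unfamiliar with partially mapped crossover, the following is a minimal sketch of the textbook operator on permutation-encoded routes. It is not taken from [24] or [25]; the function name `pmx` and the choice to pass the cut points explicitly are assumptions made for illustration.

```python
def pmx(parent_a, parent_b, cut1, cut2):
    """Partially mapped crossover for permutations.

    The child keeps parent_a[cut1:cut2] in place and takes every other gene
    from parent_b, relocating conflicting genes through the segment mapping.
    """
    size = len(parent_a)
    child = [None] * size
    child[cut1:cut2] = parent_a[cut1:cut2]
    segment = set(parent_a[cut1:cut2])
    position_in_b = {gene: idx for idx, gene in enumerate(parent_b)}

    # Genes of parent_b's segment that the child does not contain yet
    # are moved to the position obtained by following the mapping.
    for idx in range(cut1, cut2):
        gene = parent_b[idx]
        if gene in segment:
            continue
        pos = idx
        while cut1 <= pos < cut2:
            pos = position_in_b[parent_a[pos]]
        child[pos] = gene

    # All remaining positions are copied directly from parent_b.
    for idx in range(size):
        if child[idx] is None:
            child[idx] = parent_b[idx]
    return child


# Two visiting orders over nine task indices.
route_a = [1, 2, 3, 4, 5, 6, 7, 8, 9]
route_b = [9, 3, 7, 8, 2, 6, 5, 1, 4]
child = pmx(route_a, route_b, 3, 7)
print(child)
```

The child is always a valid permutation, so every task still appears exactly once in the offspring route.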

Fig.12. Variations of the time difference (T_diff) between the total available time 𝒯^∇_{𝒜_i} and the vehicles’ actual operation time 𝒯^ℳ_{𝒜_i}.

Fig.13. The task completion pattern by multiple vehicles using the GA-based task assignment method.

Figure 13 demonstrates the simulation results of the GA-based task assignment method [24] for the mission scenario introduced in Section IV. As shown in the figure, the GA-based task assignment method is able to provide an optimal route for each vehicle. The number of accomplished tasks is 52, which is equal to that of the NCM2 mode of the proposed HFC. However, in each cluster the area near the start point has been completely cleared of COTS, while the area far from the initial position has remained untouched. In contrast, the screening policy in the HFC algorithm takes tasks uniformly from the whole allocated area, which contributes to reducing the density of the COTS over a wider coverage. Furthermore, the GA-based task assignment method has no mechanism to use the remaining time of the smaller cluster to assist the other ones, while in the cooperative mode of the HFC algorithm, by using the cooperation operator, the vehicle of the 2nd cluster devotes its extra time to taking over more tasks from nearby clusters.

Fig.14. Mission performance criteria evaluation using the GA-based task assignment method.

As shown in Fig.14 (a) and (b), the algorithm succeeds in finding the optimum solution in almost 170 iterations. Fig.14 (c) illustrates that by leaving out the last ordered task in each iteration the algorithm is finally able to eliminate the violation, while the time difference of the 2nd AUV (Fig.14 (d)) is not reduced due to the lack of a cooperation mechanism in the GA. Numerically, the total time difference is 1103 seconds with the GA-based task assignment method, while in the two modes of the HFC (NCM2 and CM) this parameter is 838 and 124 seconds, respectively. The number of completed tasks with the GA is 52, which is equal to NCM2 in HFC, while in the CM of HFC this number is noticeably increased to 61 tasks. Overall, by employing the HFC algorithm in the cooperative mode, a 14.7% improvement in the COTS eradication performance is achieved compared to the GA-based task assignment method.

V. CONCLUSION

This paper proposed a new approach to address the environmental problem of controlling COTS in Australia’s Great Barrier Reef. To this end, a novel cooperative mission planner algorithm for a certain class of underwater vehicles, namely COTSbot AUVs, was developed, and its performance was investigated in extensive simulation studies. The problem of COTS control was transcribed into a constrained task assignment problem, and cooperative operation of the AUVs was used to maximize the number of completed tasks. The simulation results indicate the effectiveness of this approach and the planner algorithm in comparison with the non-cooperative or individual-based AUV COTS killing operation. The robustness of the proposed planner in the cooperative operation was examined with Monte Carlo simulations by applying topological variations to the COTS clusters. The results of the Monte Carlo analysis confirmed the stability and robustness of the planner in dealing with random deformation of the COTS distribution and environmental topology.
Moreover, the result of the comparative study demonstrated the superior performance of the HFC algorithm against the benchmark GA-based task assignment method. Future extension of this study includes field trials and evaluation of the planner in practice.

REFERENCES

[1] R. Ferretti, M. Bibuli, G. Bruzzone, M. Caccia, A. Odetti, M. Coltorti, “Automatic Posidonia Oceanica monitoring by means of Autonomous Underwater Vehicles to study the effects of anthropogenic impacts on marine ecosystems,” in Geophysical Research Abstracts, vol. 21, 2019.
[2] I.V. Lygin, V.A. Lygin, T.I. Lygina, “Deep Sea Near Bottom Gravity and Magnetic Acquisition with Autonomous Underwater Vehicles,” in Engineering and Mining Geophysics 2019 15th Conference and Exhibition, vol. 2019, pp. 1–5, 2019.
[3] M. M. van Katwijk, A. Thorhaug, N. Marbà, R. J. Orth, C. M. Duarte, G. A. Kendrick, H. J. Althuizen, E. Balestri, G. Bernard, M. L. Cambridge, A. Cunha, C. Durance, W. Giesen, Q. Han, S. Hosokawa, W. Kiswara, T. Komatsu, C. Lardicci, K. S. Lee, A. Meinesz, M. Nakaoka, K. R. O'Brien, E. I. Paling, C. Pickerell, A. M. A. Ransijn, J. J. Verduin, “Global analysis of seagrass restoration: the importance of large‐scale planting,” J. Appl. Ecol., vol. 53, pp. 567–578, 2016.
[4] K. McMahon, K. J. Van Dijk, L. Ruiz-Montoya, G. A. Kendrick, S. L. Krauss, M. Waycott, J. Verduin, R. Lowe, J. Statton, “The movement ecology of seagrasses,” Proc. R. Soc. B Biol. Sci., vol. 281, 2014.
[5] F. Dayoub, M. Dunbabin, and P. Corke, “Robotic detection and tracking of crown-of-thorns starfish,” in 2015 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 1921–1928, 2015.
[6] R. Clement, M. Dunbabin, and G. Wyeth, “Toward robust image detection of crown-of-thorns starfish for autonomous population monitoring,” in Australasian Conference on Robotics and Automation 2005, 2005.
[7] H. Zhao, Z. Nie, and X. Wang, “Design and Analysis of Multi-robot Grouping Aggregation Algorithm,” J. Robot. Netw. Artif. Life, vol. 6, pp. 60–65, 2019.
[8] H. Kurdi, M. F. AlDaood, S. Al-Megren, E. Aloboud, A. S. Aldawood, “Adaptive task allocation for multi-UAV systems based on bacteria foraging behaviour,” Appl. Soft Comput., vol. 83, pp. 105643, 2019.
[9] X. Chen, P. Zhang, G. Du, and F. Li, “A distributed method for dynamic multi-robot task allocation problems with critical time constraints,” Rob. Auton. Syst., vol. 118, pp. 31–46, 2019.
[10] M. Chen and D. Zhu, “A workload balanced algorithm for task assignment and path planning of inhomogeneous autonomous underwater vehicle system,” IEEE Trans. Cogn. Dev. Syst., 2018.
[11] J. Chen, J. Wang, Q. Xiao, and C. Chen, “A Multi-Robot Task Allocation Method Based on Multi-Objective Optimization,” in 2018 15th International Conference on Control, Automation, Robotics and Vision (ICARCV), pp. 1868–1873, 2018.
[12] C. Sarkar, H. S. Paul, and A. Pal, “A scalable multi-robot task allocation algorithm,” in 2018 IEEE International Conference on Robotics and Automation (ICRA), pp. 1–9, 2018.
[13] X. Bai, M. Cao, W. Yan, and S. S. Ge, “Efficient Routing for Precedence-Constrained Package Delivery for Heterogeneous Vehicles,” IEEE Trans. Autom. Sci. Eng., vol. 17, pp. 248–260, 2019.
[14] Y. Liu, R. Song, R. Bucknall, and X. Zhang, “Intelligent multi-task allocation and planning for multiple unmanned surface vehicles (USVs) using self-organising maps and fast marching method,” Inf. Sci. (Ny)., vol. 496, pp. 180–197, 2019.
[15] X. Cao, H. Sun, and G. E. Jan, “Multi-AUV cooperative target search and tracking in unknown underwater environment,” Ocean Eng., vol. 150, pp. 1–11, 2018.
[16] L. Zhong, Q. Luo, D. Wen, S. D. Qiao, J. M. Shi, and W. M. Zhang, “A task assignment algorithm for multiple aerial vehicles to attack targets with dynamic values,” IEEE Trans. Intell. Transp. Syst., vol. 14, no. 1, pp. 236–248, 2013.
[17] D. Zhu, X. Cao, B. Sun, and C. Luo, “Biologically inspired self-organizing map applied to task assignment and path planning of an AUV system,” IEEE Trans. Cogn. Dev. Syst., vol. 10, no. 2, pp. 304–313, 2018.
[18] D. Zhu, H. Huang, and S. X. Yang, “Dynamic task assignment and path planning of multi-AUV system based on an improved self-organizing map and velocity synthesis method in three-dimensional underwater workspace,” IEEE Trans. Cybern., vol. 43, no. 2, pp. 504–514, 2013.
[19] S. MahmoudZadeh, D.M.W. Powers, R.B. Zadeh, “Autonomy and Unmanned Vehicles: Augmented Reactive Mission–Motion Planning Architecture for Autonomous Vehicles,” Springer Nature (2018), Cognitive Science and Technology, ISBN 978-981-13-2245-7, Series ISSN: 2195-3988. DOI: 10.1007/978-981-13-2245-7.
[20] A.M. Yazdani, K. Sammut, O.A. Yakimenko, A. Lammas, Y. Tang, S. MahmoudZadeh, “IDVD-based trajectory generator for autonomous underwater docking operations,” Journal of Robotics and Autonomous Systems, Elsevier, vol. 92, pp. 12–29, 2017.
[21] G. Hamerly and J. Drake, “Accelerating Lloyd’s algorithm for k-means clustering,” in Partitional clustering algorithms, Springer, pp. 41–78, 2015.
[22] F. Janati, F. Abdollahi, S. Ghidary, M. Jannatifar, J. Baltes, and S. Sadeghnejad, “Multi-robot Task Allocation Using Clustering Method,” vol. 447, pp. 549, 2016.
[23] F. Jolai, M. Rabiee, and H. Asefi, “A novel hybrid meta-heuristic algorithm for a no-wait flexible flow shop scheduling problem with sequence dependent setup times,” Int. J. Prod. Res., vol. 50, pp. 1–20, 2012.
[24] F. Janati, F. Abdollahi, S. S. Ghidary, M. Jannatifar, J. Baltes, “Multi-robot task allocation using clustering method,” in Robot Intelligence Technology and Applications 4, Springer, pp. 233–247, 2017.
[25] A. Hussain and I. Ahmad, “Development a New Crossover Scheme for Traveling Salesman Problem by aid of Genetic Algorithm,” in Intelligent Systems and Applications 2019, pp. 46–52, 2019.

Amin Abbasi received his Master’s degree in Control Engineering from Azad University of Khomeinishahr, Esfahan, Iran. His research field is motion control and path planning of mobile robots. He is interested in intelligent algorithms and machine learning, and their application in autonomous engineering.

Somaiyeh MahmoudZadeh is Assistant Professor of Data Science at the School of IT, Deakin University, Australia. She was employed as a Postdoctoral Research Fellow by Monash University from 2017 to 2018. Her research area includes Computational Intelligence, Autonomous Decision Making, and Situational Awareness in Unmanned Vehicles Mission-Motion Planning.

Amirmehdi Yazdani (S’14-M’18) received his PhD degree in Electrical-Control Engineering from Flinders University, SA, Australia, in 2017. From 2017 to 2018, he was employed as a Postdoctoral Research Associate by Flinders University. He is currently working as a Lecturer in Electrical Engineering in the College of Science, Health, Engineering and Education at Murdoch University, WA, Australia. He is serving as an Academic Chair of Engineering Technology, Electrical Power Engineering, and Renewable Energy Engineering at Murdoch University. He is also the Vice-Chair of the IEEE Industrial Electronics Society, WA Chapter. His areas of research specialization are guidance and control of robotic, autonomous, and mechatronic systems, optimal control and state estimation theory, and intelligent control applications.