Optimized flocking of autonomous drones in confined environments

See allHide authors and affiliations

Science Robotics  18 Jul 2018:
Vol. 3, Issue 20, eaat3536
DOI: 10.1126/scirobotics.aat3536


We address a fundamental issue of collective motion of aerial robots: how to ensure that large flocks of autonomous drones seamlessly navigate in confined spaces. The numerous existing flocking models are rarely tested on actual hardware because they typically neglect some crucial aspects of multirobot systems. Constrained motion and communication capabilities, delays, perturbations, or the presence of barriers should be modeled and treated explicitly because they have large effects on collective behavior during the cooperation of real agents. Handling these issues properly results in additional model complexity and a natural increase in the number of tunable parameters, which calls for appropriate optimization methods to be coupled tightly to model development. In this paper, we propose such a flocking model for real drones incorporating an evolutionary optimization framework with carefully chosen order parameters and fitness functions. We numerically demonstrated that the induced swarm behavior remained stable under realistic conditions for large flock sizes and notably for large velocities. We showed that coherent and realistic collective motion patterns persisted even around perturbing obstacles. Furthermore, we validated our model on real hardware, carrying out field experiments with a self-organized swarm of 30 drones. This is the largest of such aerial outdoor systems without central control reported to date exhibiting flocking with collective collision and object avoidance. The results confirmed the adequacy of our approach. Successfully controlling dozens of quadcopters will enable substantially more efficient task management in various contexts involving drones.


Groups of gregarious animals often display an interesting and spectacular collective pattern (1): They establish ordered structures without collisions in a limited amount of time (2, 3). They can also react extremely fast to environmental changes, such as the sudden appearance of a predator or an obstacle (4, 5). Although these systems are enormously complex, they are also perfectly optimized, and thus, their expressed motion patterns remain gracefully natural (6). When these systems are modeled, one tends to focus on the replication of the smooth optimal motion patterns by making idealistic assumptions about the underlying complexity. This simultaneous simplification of the “input” and “output” explains why so many different statistical physical models of swarm behavior can be efficient in reproducing the same natural collective motion patterns with abstract mathematical formalism.

According to early microscopic agent-based models (7), establishing and maintaining collision-free cohesive flocking require only three simple interactions between idealistic agents: repulsion in short range, velocity alignment in middle range, and attraction in long range. On the basis of these general rules, hundreds of models have emerged to describe the synchronized collective motion of animals, humans, or even migrating cells (810). We call these systems self-organized because interactions in them are local and decisions are made by the agents themselves.

One of the recent applications of self-organizing flocking models is in collective robotics (11, 12), where decentralized control algorithms for groups of autonomous drones can be developed on the basis of these interactions, as a prerequisite for safe operation. Driving the behavior of such systems toward some desirable pattern is highly nontrivial. First, the agents (robots and drones) are autonomous and imperfect. That is, every agent has (i) its own onboard computer for performing the calculations needed for controlling its own actions, (ii) its own sensor system for measuring relative positions and velocities, and (iii) its own communication device for data exchange with neighboring agents. These features reflect the current definition of sensory and reactive autonomy described in (13). Second, these systems should work without central control. That is, although agents can observe each other and may exchange information, they do not send and receive direct control commands because there is no leader within the group, nor is there an external supervisor such as a base station or human overseer.

In developing decentralized control algorithms for swarms of flying robots in stochastic environments where communication outages and delays are common, one soon faces a set of severe challenges that are rarely targeted by previous idealistic agent-based models. As an example, 32 representative microscopic flocking models were selected and compared out of more than 100 (9). Fine and Shell state that “there is no consensus on the precise details of the motions needed to produce rich flocking motions under realistic sensing models, actuation, and dynamics constraints”; most works lack completeness and precision in terms of repeatable modeling and validation; only a few included motion constraints and collision avoidance [e.g., (7, 14, 15)]; and none handled motion constraints explicitly. Finally, only one investigated bounded space with obstacles (16).

While aiming for a stable and scalable flocking model for real flying robots, some serious design challenges need to be addressed:

(1) Reality gap. Flocking models that are stable in simulation under idealistic conditions tend to oscillate and destabilize quickly under real-life conditions when delays, uncertainties, and kinematic constraints are present (1721).

(2) Adaptivity. Flocking models developed for open space or periodic boundary conditions do not necessarily work in confined spaces and with obstacles in the way (9, 22).

(3) Scalability. Flocking models that are developed for a specific speed or group size might not be scalable; that is, for higher velocities or larger groups, motion patterns may become unstable (16, 23, 24).

(4) High dimensionality. Flocking models that work well in real life generally have a substantial number of parameters with complex nonlinear interactions that need to be tuned for a wide range of conditions in reasonable time (6).

The largest drone swarms so far were developed for show business by Intel (25) and by Ehang (26) with more than 1000 drones each; however, these drones were individually programmed for predefined trajectories or were centrally controlled and did not satisfy the above criteria of autonomy. The music band Metallica recently included dozens of drones in their concerts that seemed to exhibit some kind of partially autonomous swarming behavior by using a dedicated indoor positioning system and central control mechanisms (27). The U.S. military is also experimenting with fixed-wing drone swarms called Perdix (28). The press release stated that the system of 103 autonomous drones performed adaptive formation flying. The published video suggests that the drones received a set of predefined targets, chose one with a collective decision, and followed that individually. Drones also loitered around a common point, although at different heights. Unfortunately, there are no public details about the control mechanism, the communication scheme, or possible collision avoidance behavior for a reliable assessment of the work.

Autonomous drone swarms also appear in the scientific literature, using indoor motion capture–based (29, 30), outdoor Global Positioning System (GPS)–based (24, 3133), or even vision-assisted (34, 35) navigation. These systems typically have a much smaller flock size than preprogrammed drone swarms. Although motion capture–based indoor systems (with 20 minidrones and 49 nanodrones in the mentioned citations) are remarkably accurate and dynamic, they represent a very different type of system because they do not have to tolerate profound imperfections such as meter-level positioning, external wind turbulence, or long-range communication decay. The mentioned GPS-based outdoor swarms consisted of no more than 10 drones, except for (32), where 50 fixed-wing unmanned aerial vehicles (UAVs) were flown but at different altitudes, without any explicit collision avoidance mechanism. Vision-based solutions have used only a few drones as the state of the art.

In this article, we build on our previous results (24), where an outdoor drone swarm of 10 agents were presented as a proof of concept with flocking and formation flight capabilities. Although the previous work included preliminary results of closed-area flocking, trajectories were quite oscillatory even though they were executed in the simplest arena: a circular one that actually helped to develop smooth turns. Furthermore, the system—due to the improper treatment of acceleration limits—was not scalable to speeds higher than 4 m/s.

Creating a large decentralized outdoor drone swarm with synchronized flocking behavior using autonomous collision and object avoidance in a bounded area is as yet an unresolved task. We filled this gap by presenting real flights of 30 autonomous quadcopters performing tight and stable flocking in a bounded and cluttered environment. To achieve this goal we used a scalable, optimized control framework, based on realistic dynamic modeling and the explicit treatment of motion constraints in the flocking equations.

The overall descriptors that specified a given setup of our system were the number of drones and the predefined flocking speed. The desired swarm behavior was defined as being collision-free and coherent, that is, with strongly correlated velocity values of the individual drones, and exhibiting a velocity close to the flocking speed. Furthermore, we aimed for stable swarm behavior with persistent global collective motion patterns resembling those of natural systems with collective intelligence.

The explicit treatment of motion constraints was based on a special concept for the velocity alignment interaction. The key idea was to abandon the generally used fixed spatial boundaries of the local interactions. Instead, the alignment interaction range (and magnitude) was determined dynamically, based on the expected optimal relation between distance and velocity difference. Because the acceleration of agents is limited, they need time and space to brake and avoid collisions. Consequently, the amount of allowed velocity difference must be distance-dependent: Close agents should align perfectly, whereas distant agents are allowed to have larger velocity difference up to a certain limit. To find the upper bound of velocity difference for a given distance, we used an acceleration-limited braking curve. The goal of the alignment was thus to reduce velocity difference below this distance-dependent threshold. This workflow was easy to calculate and provided optimal foundations of scalability in the velocity domain because it took into account the limited acceleration of agents, the source of many undesired oscillations.

The model has many independent parameters with which a broad range of emerging behaviors and visually pleasing collective patterns could be generated. However, our requirements of stability and coherence provide quantifiable criteria for the instantiation of the general model with suitable parameter values. This implies a highly nontrivial optimization problem because of the large number of parameters, their complex nonlinear interactions, and the noisy relations between parameter values and collective motion patterns.

An important element of our approach is the focus on model instances, that is, on models together with specific values of their parameters. The rationale is grounded in our interest in system behavior. Having a model is not enough to generate and study motion patterns; to make a model executable, it must be instantiated by parameter values. Blatantly disregarding theoretical benefits of models, we could say that any model is worth as much as its best instance. Therefore, we considered optimizing the parameter setup as an essential part of the model generation.

This view is missing from current flocking models and the realization of the corresponding robotic swarms, although it stands to reason that natural systems operate at the optimal values of their “tunable” parameters (in the spirit of the Darwinian theory). As the complexity of artificial intelligence increases, we will be forced to include more and more optimization into model design.

To solve the optimization problem, we used evolutionary algorithms, population-based stochastic search methods inspired by natural evolution that have proven competitive in solving hard problems in the face of challenging characteristics such as nondifferentiability, discontinuities, multiple local optima, noise, and nonlinear interactions among the variables (36). The family of evolutionary algorithms contains several variants of the main principles, including genetic algorithms, evolution strategies, differential evolution, and particle swarm optimization (37). Evolution strategies, particularly the covariance matrix adaptation evolution strategy (CMA-ES) (38), are considered to be excellent optimizers in continuous parameter spaces; therefore, we used this algorithm to find good settings for our model.

The main contributions of this paper are (i) a flocking model that explicitly treats motion constraints by maintaining an improved balance between distance and velocity difference; (ii) a method to design individual drone controllers by optimizing self-organized group-level behavior in a complex, noisy, real-world system; (iii) simulations of this system for presenting its scalability for wide velocity ranges and group sizes; and (iv) the demonstration of the framework with a fleet of 30 quadcopters, performing fully autonomous, synchronized outdoor flights with collective collision and obstacle avoidance in a confined space.


Flocking model instantiation through evolutionary optimization

Our generic flocking model included proper evaluation metrics, that is, order parameters and fitness functions (see Materials and Methods for detailed equations). The model was instantiated with proper parameter values first in simulation. Evolutionary optimization has been used to find parameter values that maximize flock coherence and speed while minimizing collisions. First, we optimized the parameters under conditions suitable for real experiments: using a square-shaped obstacle-free arena with a side length of Larena = 250 m and three different flocking speed (vflock) values: 4, 6, and 8 m/s. The corresponding values for the maximum allowed speed (vmax) were 6, 8, and 10 m/s, respectively. We examined the behavior of 100 simulated agents in all cases. For each of the vflock values, we performed at least three independent, randomly initialized optimization processes to identify possible multiple local optima in the parameter space but found very similar solutions and convergence in the alternative evolutionary runs. Therefore, below, we will only refer to the best (highest fitness) evolutionary run for each flocking speed.

The populations in our evolutionary algorithm consisted of parameter vectors whose fitness was determined by a 10-min-long realistic simulation of the system. In all runs, we used a population size of 100 and terminated after 150 generations. The 15,000 fitness evaluations turned out to be sufficient in all cases. The optimization was performed on the Atlasz supercomputer cluster of Eötvös University, Budapest, Hungary (39); the overall execution time of a single evolutionary run varied between 2 and 6 days.

The evaluation of the phenotypes was based on a single fitness value that was created as the product of six independent normalized partial fitness values (corresponding to minimized collision risk, minimized collision with walls, maximized velocity correlation, velocity magnitude as close as possible to flocking speed, maximized cluster size, and minimized number of disconnected agents). Each partial fitness, as well as the final fitness value, takes values between 0 (worst case) and 1 (ideal case).

The final fitness of the best solutions after optimization converged to 0.92, 0.87, and 0.8 for vflock = 4, 6, and 8 m/s, respectively. In these best stochastic simulation instances, four of the six partial fitnesses were exactly 1 (corresponding to a perfectly collisionless and fully connected flock), and only the velocity correlation and the average velocity reduced the overall fitness. This is a natural and inevitable tendency because hitting the wall in a bounded area requires the flock to change direction, and this cannot be performed without temporarily reducing the speed and velocity correlation. It is worth noting that these high fitness values have been reached under harsh realistic conditions with a 1-s communication delay and substantial noise, which generally act against perfect synchronization. Optimized parameter values are given in table S1.

The stability of the optimized models was investigated next by executing 100 parallel stochastic simulations for each speed. Detailed results about the statistical evaluations can be found in table S2. The average fitnesses naturally became somewhat lower than the maxima: 0.812 ± 0.101 (SD), 0.776 ± 0.086, and 0.728 ± 0.075 for vflock = 4, 6, and 8 m/s, respectively, with the appearance of occasional collisions. Note that in simulation, the partial fitness of collisions must be a continuous and not-too-steep function; otherwise, the optimizer cannot find the direction of gradients from suboptimal solutions. This means that having a few collisions can be banned only with a limited decrease of fitness, and the optimizer will not devote a special priority to entirely collisionless solutions; it will aim for a maximally dense flock implicitly to increase connectedness and velocity correlation, which narrows that stability range with regard to collisions. On the other hand, in reality, collisions have to be eliminated completely at all times as a first rule. Several workarounds exist for this problem: (i) running the optimizer with a larger radius of collision or a larger communication delay to optimize to a solution where minimum interagent distances become larger and (ii) increasing interagent distance (repulsion) manually after optimization with the harmless compromise of reducing overall velocity correlation in the enlarged flock. This time, we chose the second method because, in real flights, one needs to start from an oversecured parameter setup anyways, compared with the optimum with highest possible fitness but lower stability.

Evolutionary optimization produced a huge number of stochastic fitness evaluations, which also contain precious information about the reasonable parameter ranges where fitness is expected to be high. These working ranges are summarized in table S1 to provide tuning information for real drones as well. We also listed all model parameters in table S3 with a detailed explanation on meaning and usage.

Finally, let us note a surprising benefit of evolving parameter settings. The evolutionary algorithm found unexpected parameter settings in both the repulsive and alignment interactions between agents:

(1) Instead of a strong hard-core repulsion (as expected intuitively), a spatially more extended and smoother repulsion is preferable according to the evolutionary optimization.

(2) The velocity alignment between close-by agents should be maximal and mostly distance-independent, allowing only a certain, relatively small velocity difference slack, mainly to speed up the collective turning process.

Overall, we achieved our first goal: The general model could be instantiated well with suitable parameter values in simulation; the optimized setup displayed a stable and efficient flock in the investigated velocity regime, which can serve as the basis for the real field experiments. Comparing our new results to our previous work (40), we can see a substantial increase in flock stability and coherence (see Fig. 1). Movies S1 and S2 show corresponding motion of old and new simulations at 4 m/s.

Fig. 1 Comparing previous simulation work with current study.

Sample timelines of two order parameters (right, velocity correlation at top and normalized velocity magnitude at bottom) from our previous work (40) (algorithm A) and our novel flocking model (algorithm B). Trajectories corresponding to the gray sections of the timelines are shown for both models on the left, with color mapped to time. Corresponding motion can be seen in movies S1 and S2. Algorithm B performs much better and has a lower transient time. We used the following interaction parameter set for algorithm A: Cfrict = 30 m2, Embedded Image m, Embedded Image m, and prep = 1 s−1 [for details on parameters, see (40)]. For algorithm B, we used the optimized parameter set for vflock = 4 m/s. Using an average from 10 simulations with the same parameter setup, the order parameters averaged over time were Embedded Image and Embedded Image m/s for algorithm A and Embedded Image and Embedded Image m/s for algorithm B.

Scalability in velocity

The most important feature of the acceleration-limited velocity alignment term in the agent-agent and wall-agent interactions is its inherent scalability in the velocity regime. To demonstrate this, we performed further optimizations with higher flocking speed values: 16 and 32 m/s, beyond most bird migration speeds (41). We changed two parameters in the environmental setup: We increased the communication range from 80 to 160 and 320 m and the size of the arena from 250 to 500 and 1000 m for the two speed values, respectively. The first change was needed because the communication delay remained at 1 s, which created a much larger positional uncertainty and braking distance at higher speeds. This can be compensated only if agents have information about each other at higher distances. The second change was a consequence of the first: With such a large communication delay and speed, the interagent distances became larger, and thus, the flock could not fit into a smaller arena with enough freedom for nice flocking behavior.

The optimized solutions obtained a high fitness again, with a maximum of 0.91 and 0.89 and a statistical average of 0.79 ± 0.12 and 0.63 ± 0.23 for 16 and 32 m/s, respectively. Detailed fitness values of the statistical evaluations are summarized in table S2. Note that for the highest speed of 32 m/s, the lower average of the fitness is mostly a result of the slightly increased number of collisions (3.53 ± 3.61). We investigated the role of communication range and delay in this case and found that collisions disappeared when we reduced the delay below 1 s, assuming that the communication range was large enough (see Fig. 2 for details). The first part of movie S3 shows the optimized and stable flocking behavior of 100 agents in simulation for 16 m/s.

Fig. 2 Distribution of the number of collisions and the average closest-neighbor distance as a function of communication range and delay.

Every bin is the average of 20 simulations with the optimized parameter setup for a flocking speed of 32 m/s. As can be seen, safe flocking can be achieved with small enough delay (<1 s) and large enough communication range (>240 m) for this setup.

Scalability in agent number with collective obstacle avoidance

Because of the locality of the communication and the interactions, the proposed flocking model provides the foundations of scalability in agent number. However, when more agents synchronize in such a nonequilibrium system, the overall momentum of the flock also scales with flock size, which creates increased “pressure” of agents when the flock bumps into walls. In similar situations, human crowds are prone to injuries or even death during panic events (16) or, for example, around mosh pits at heavy metal concerts (42). To provide collisionless solutions with higher agent numbers, one needs to prevent accumulating pressure of agents, for example, using obstacles inside the arena. Obstacles can be introduced with the same type of interactions as surrounding walls (see Materials and Methods for details).

Without going into statistical details, movie S3 shows some examples of the realistic simulation with flock sizes between 30 and 1000 and flocking speeds between 4 and 32 m/s, with different types of obstacles in the way. Overall, we see that the presented flocking model can be used as a general framework to handle flocking-type motion in a confined area, with a large number of agents, large flocking speeds, and obstacles.

Experimental results with outdoor flying robots

We implemented the described model in our custom-built multidrone framework as a control algorithm. Details of the drone setup are given in Materials and Methods.

We performed two-dimensional experiments with 30 drones flocking at the same altitude, with a horizontal speed of vflock = 4, 6, and 8 m/s and Larena = 200 to 260 m. Parameter values were initialized mostly within the working range of the corresponding evolutionary optimized results (table S1) with some notable changes from simulation optima based on the following precautious and preventive safety considerations: (i) repulsion strength was somewhat increased (larger gain) to minimize the chance of collisions (compromise: sparser flock); (ii) coefficient of alignment was increased to reduce possible oscillations (compromise: more sluggish motion); (iii) shill interaction strength was reduced (smaller shill velocity), but range was extended (larger distance offset) to avoid very high interaction terms at walls while maintaining overall strength. The final parameter values used in the experiments and detailed comments on their possible changes from simulation optima are summarized in table S4.

To assess the quality of the flights, we calculated a set of order parameters that describe different aspects of the motion. We calculated the cluster-dependent velocity correlation (ϕcorr), the average velocity (ϕvel), the average and minimum of interagent distances (minrij and Embedded Image), and the average normalized velocity expressed in local angle polar (LAP) coordinates (43):Embedded Image(1)where rx and ry represent the average position of agents in the horizontal plane relative to the center of the arena and vx and vy are the average velocity components of the agents in the horizontal plane. ϕLAP is a simple descriptor of rotational behavior: An instantaneous value of zero means no correlated motion tangentially, whereas a value of 1 or −1 represents correlated circular flight in the counterclockwise and clockwise directions, respectively.

Figure 3 shows the order parameters as a function of time for a selection of full 10- to 15-min stable flights (flight time depended on battery and wind conditions, with a maximum tolerated wind of about 40 km/hour). Flights were selected to represent the most common emerging collective patterns, namely, repetitive circular and diagonal flights in an obstacle-free arena and a lively random collective flight with obstacles. Note that the emergent rotational pattern is a universal one (44), appearing in a large variety of flocking systems ranging from elongated rods (45) through locusts (2) and fish (5) to humans (46, 47). A 10-min part of the trajectories from the same flights can be seen in Fig. 4, a long-exposure photo of a shorter section is shown in Fig. 5, whereas movies S4 to S6 show the corresponding dynamic flight log visualizations for the three selected flights. Finally, movie S7 shows a summary of the results including actual footage of the flights, too.

Fig. 3 Order parameters as a function of time for different vflock values during real experiments with 30 drones.

ϕcorr is the cluster-dependent velocity correlation, ϕvel/vflock is the average normalized velocity, and Embedded Image represents the average of the closest neighbors, whereas min(rij) is the minimum of the closest neighbors. The depicted region corresponds to the middle 5 min of Fig. 4. There are two typical, mostly stable behaviors in a square-shaped arena without obstacles: (A) shows mostly linear motion along the main diagonals with a cyclic expansion and shrinking of the flock (cyclic red and orange curves) and sudden turns at corners (blue and green curves dropping to zero), whereas (C) shows circular motion within the boundaries (nearly constant order parameters at all times). (B) The repetitive, trivial patterns were broken and became livelier due to obstacles in the way. Correspondingly, velocity correlation and average velocity magnitude drops, whereas minimal interagent distance remains the same, showing the stability of the flight even in this obstructed case.

Fig. 4 Multidrone flight trajectories and corresponding order parameters.

Ten-minute trajectory sections of 30 drones in the horizontal plane for (A) 4 m/s, (B) 6 m/s and (C) 8 m/s flights, representing a selection of typical flight patterns. Blue squares show the boundaries of the virtual arena. (A) The trajectories show diagonal linear motion of the flock, bouncing back from the right-angled corners. Trajectory colors represent speed in the horizontal plane, whereas a random single trajectory is highlighted in gray scale. (B) The motion is still locally correlated, but the obstacles (red shapes) induce a very rich dynamic pattern, resembling lively flocks of birds or other animals. (C) The trajectories show a highly correlated close-to-circular flight. Colors and line styles are mapped to individual drones here; black dots show terminal positions of drones. (D) Comparison of the three qualitatively different behaviors of (A) to (C) with the timeline of a dedicated order parameter: the average normalized velocity, expressed in local angle polar coordinates (ϕLAP).

Fig. 5 Long-exposure photo of a flight with multiple drones. [Credit: Zsolt Bézsenyi]

As can be seen from the experiments, real drones performed well within the whole tested flocking speed regime. Namely, there are no critical oscillations or collisions with each other, with the wall, or with obstacles. Furthermore, motion is smooth and lively during collective turns when the flock hits the walls, flies into the right-angled corner of the arena, or splits when obstacles are in the way. On the basis of Fig. 3, we can see that, for the flocking speed regime of 4 to 8 m/s, the average closest-neighbor distance varied between 12 and 30 m and the minimum interagent distance remained between about 5 and 15 m. With Global Navigation Satellite System (GNSS) positioning errors in the range of 2 and 3 m and possible communication outages in the order of 1 s, we believe that these results show a tight and stable flock.


We have presented a tunable distributed flocking model for a large group of autonomous flying robots with which they were capable of maintaining stable, collision-free collective motion in a closed space with or without obstacles, within a large velocity regime. The solution is based on the simplest force-based rules presented by the earliest self-propelled particle models, repulsion and alignment, but uses a form of alignment that takes into account the desired acceleration regime of agents. The model works in a noisy environment, with inaccurate sensors and short-range communication devices, and in the presence of substantial communication delay and with possible local communication outages—these are usual features of current outdoor multidrone experimental setups. The model produces a very rich dynamic of motion, especially in an obstructed space, with a variety of emergent collective motion patterns, resembling lively natural flocks.

The model has 11 tunable parameters that call for automated and efficient optimization methods, such as the CMA-ES. With the introduced single-objective fitness function taking into account several important order parameters, we could instantiate our model to find working ranges and optimal parameter setups for a wide range of velocities. Both our optimized simulation results with 30 to 1000 drones and 4 to 32 m/s flocking speeds and our real experiments on 30 drones and 4 to 8 m/s flocking speeds showed stable flocking behavior of agents.

Because of the concept behind the new alignment term, we believe that much higher velocity regimes can be targeted with the same approach, if needed. For this, one would need to have larger interagent distances, larger radii of interactions, and thus proper large-distance communication methods.

Limitations of the general usage of the model could arise from further scaling in the velocity regime and in the number of drones. With 30 drones, we demonstrated an order-of-magnitude scalability relative to the smallest drone swarms of only a few agents; however, further scaling in numbers also implies an increase in emergent pressure among frontal agents facing walls and obstacles, such as that in human crowds, resulting in smaller, possibly dangerous interagent distances. This issue has to be solved in systems with a very high number of agents. A related limiting factor is the large number of necessary parameters that need to be optimized for every system separately. Although we have selected the parameters with special care to have independent meaning and significance, in such a complex system, a deep understanding of the rich dynamic behavior and substantial experience is needed before safely applying the results to other vehicles with different characteristics. A final shortcoming of the present study is the lack of rigorous quantitative analysis of stability, because it is not straightforward to do in such a high-dimensional parameter space. We avoided this by analyzing the fitness evaluations statistically and gave approximate, independent ranges for the parameters above which fitness is expected to be high. For certain applications, though, a more sophisticated analysis would be more appropriate. Despite these limitations, we believe that the presented concept of velocity alignment, the model in general, together with the fitness evaluation method, can be used optimally in a wide range of multidrone scenarios requiring sophisticated cooperation and/or collaboration.


A tunable self-propelled flocking model

On the basis of statistical-physical methods, a basic flocking model was introduced in (40) and (24). This is a minimal realistic approach of flocking behavior that was demonstrated to work with up to 10 flying robots with a maximum flocking speed of 4 m/s. In this model, the three interaction terms yield a momentary desired velocity vector vd, which has to be achieved by the agents. Of course, maintaining the desired velocity is generally hard due to several robot-specific deficiencies such as communication delays and reaction times, inaccuracy of the onboard sensors, effects of wind, sensor signal outages, inertia etc. The question we examine here is whether there are interactions that can guarantee more stable behavior than previously published attempts under these conditions for larger flocking speeds and also in confined spaces. In the subsections below, we present the exact mathematical formulation of our novel generic flocking model, taking into account the realistic limitations of autonomous flying agents with the explicit treatment of motion constraints in the equations.


For local repulsion, we choose a simple half-spring model, that is, a linear distance-dependent central velocity term with a cutoff at a maximum interaction range, Embedded Image, under which agents start to repulse each other:Embedded Image(2)In the equation above, prep is the linear gain of the pairwise repulsion and rij = |rirj| is the distance between agents i and j. The total repulsion term calculated for agent i with respect to the other agents isEmbedded Image(3)where j is iterated for all other agents. Note that we have experimented with more complex repulsion functions in the v(r) plane (40, 48), but according to our experience, the half-spring model is sufficiently simple and effective at the same time.

Velocity alignment

Pairwise velocity alignment can be obtained with a velocity term that depends on the difference of the velocity vectors of nearby agents. Previous works typically used a power law of the velocity difference of the interacting agents decaying in space asymptotically to zero (49, 50). These models work fine in some specific conditions, within a lower velocity regime. However, our objectives regarding the velocity alignment are complex. It is the very term that synchronizes motion to achieve collective flocking behavior, but it also has to serve as a damping medium, reducing self-excited oscillations emerging due to the delayed and noisy response to for example, repulsion. It has to be local, but it also has to be scalable for large velocities (and therefore large possible velocity differences) at the same time. This last condition implies that if the acceleration of the agents is limited, large velocity differences should be relaxed at larger distances to avoid collisions.

To fulfill all the requirements above, as a theoretical basis for the velocity alignment term, we have chosen an ideal braking curve, that is, a smooth velocity decay function in space [denoted by D(.)], with constant acceleration at high speeds and exponential approach in time at low speeds (51):Embedded Image(4)where r is the distance between an agent and an expected stopping point, a is the preferred acceleration, and p is a linear gain also determining the crossover point between the two phases of deceleration.

The rationale behind our velocity alignment term is to prohibit two agents having a larger velocity difference at a given distance than what is allowed by this ideal braking curve and, thus, to serve as kind of a motion planning term in the otherwise momentary force-based equations (see Fig. 6 for a visual representation):Embedded Image(5)Embedded Image(6)In the equations above, Cfrict is a linear coefficient of the velocity alignment error reduction, vfrict is a velocity slack to allow for certain amount of velocity difference independently of interagent distance, Embedded Image is the distance of the stopping point for agent i relative to and in front of agent j, pfrict and africt are the linear gain and the acceleration parameters of the pairwise alignment, and vij = |vivj| is the amplitude of the velocity difference between agents i and j. The total velocity alignment term calculated for agent i with respect to the other agents—similarly to the repulsion term—isEmbedded Image(7)where j is iterated for all other agents. Note that the superscript “frict” comes from the concept that velocity alignment should be a strong local velocity-damping term, analogous to viscous friction (24).

Fig. 6 Visual explanation of the interaction terms.

The blue line depicts repulsion between agents as a function of interagent distance. The green line is the maximum allowed velocity difference between agents as a function of interagent distance. The velocity alignment term is proportional to the difference between this and the actual velocity difference between agents (red dashed line). All exemplary parameter values are in SI units.

In addition, the locality condition of the velocity alignment in this form is only implicitly included: The interaction range is upper bounded by the distance where D(.) = 2vmax. On the other hand, this solution allows for flexible scalability in the velocity domain. If the flocking speed is much higher, then it is obviously preferable to start reducing velocity difference at a much larger distance (as an analogy, compare the deceleration behavior and braking distances of a toy drone against an object and a large manned aircraft reaching its destination).

Interaction with walls and obstacles

Long-range attraction (7) is not explicitly part of our flocking system. To keep agents together, we instead define a bounded flight arena for the agents surrounded with soft repulsive virtual walls. One of the ideal ways to define such repulsion is to define virtual “shill” agents near the arena walls (52). These virtual agents are heading toward the arena with a certain speed, vshill. The real agents close to the walls should relax their velocity to the velocity of the shill agents. We do this here with the velocity alignment term introduced before:Embedded Image(8)Embedded Image(9)These equations are very similar to Eqs. 5 and 6, with two simplifications: We do not allow velocity slack for the wall and keep the error proportional term (Cshill) at 1 to have the strongest shill alignment possible. In the equations above, the index s refers to the shill agents defined for all wall polygon edges separately; ris = |rirs|, where rs is the position of the shill agent, located at the closest point of the given edge of an arbitrarily shaped convex wall polygon relative to agent i; vis = |vivs|, where vs is the velocity of the shill agent, pointing perpendicularly to the wall polygon edge inward the arena, with magnitude vshill.

Convex obstacles inside the arena can be avoided with the same concept, but with shill agents moving outward from the obstacle, not inward, as described above for the arena. Another difference is that, whereas all wall polygon edges generate a separate shill agent inside the arena, obstacles are represented with a single shill agent located at the closest point of the obstacle polygon relative to the agent. Thus, for every agent i and obstacle s, we can define a velocity component Embedded Image similarly to Eq. 9, using the same shill parameters as for the wall.

Self-propelling term

Besides the agent-agent and agent-wall interactions introduced above, a simple self-propelling term is added to the desired velocity of the agents. For the ith agent, this term is parallel with the actual velocity vector, vi, and has a certain constant magnitude, vflock.

Final equation of desired velocity

To calculate the desired velocity, we take the vectorial sum of all the interaction terms introduced before:Embedded Image(10)After this superposition, we also introduced a cutoff at vmax, keeping the direction of the desired velocity but reducing its magnitude if it is over the limit:Embedded Image(11)In the flocking model above, we have introduced a substantial number of parameters to give the necessary degree of freedom to the general model. To help readers understand complex model behavior, we provide an overview of the parameters with detailed descriptions on meaning and usage (table S3).

Tuning the above model means that we choose an optimal set of parameters for a fixed flocking speed vflock and maximal speed vmax for a given arena with characteristic size Larena. The other parameters (namely, Embedded Image, prep, Embedded Image, Cfrict, vfrict, pfrict, africt, Embedded Image, vshill, pshill, and ashill) have to be optimized. Note that the parameter space is 11-dimensional; therefore, manual tuning, global optimization methods, or parameter sweeping would be generally too time-consuming.

General model of a flying robot

For testing any flocking algorithm in a realistic environment before actual flights, we used a simulation framework, which was originally developed for modeling special features of flying robots based on second-order ordinary differential equations. In this subsection, we present only the main features of this framework, without details. For further details, see (40) or download the simulation framework from The following general features of flying robots can be taken into account with our framework:

(1) Communication delay. The position and velocity data received by an agent from neighboring agents are old due to the necessary time for data transmission and processing. In the simplest case, we modeled this effect with a constant time delay tdel.

(2) Inertia. A flying robot cannot change its velocity immediately because of its mass, aerodynamic effects, and specific features of its low-level control algorithm. We assumed that the real velocity vi converges to the desired velocity Embedded Image exponentially with a characteristic time τCTRL. A maximal acceleration of the units (amax) is also assumed.

(3) Refresh rate of the sensors. The agents cannot update their sensory data continuously, only with a nonzero time period ts. For simplicity, in the simulation framework, this parameter is constant and uniform for all agents.

(4) Locality of the communication. If two agents are too far from each other, they cannot exchange messages; that is, they do not see each other. This is a common feature of any decentralized, radio-based communication device. For modeling this effect, a maximum communication range rc is defined in our approach.

(5) Inaccuracy of the onboard sensors. We also had to model the fluctuating behavior of measured positions and velocities. This behavior can be described as a stochastic process. For the ith agent, this process can be chosen as a fictive Langevin equation with a Gaussian noise term and a parabolic potential centered at ri. The noise can be characterized by its SD σs.

(6) Outer noises. To take into account the environmental effects such as wind compensation of the low-level control algorithm, we added a delta-correlated Gaussian noise term with SD σ to the acceleration of the robots.

According to the list above, one can define a simulated realistic homogeneous multirobot system by giving certain values to all elements of the set {tdel, τCTRL, amax, rc, ts, σs, σ}. We presented the optimization through a realistic example based on measurements performed with quadcopters [for further details, see (24, 40)]; we chose the values listed in table S5.

Note that we prefer and tend to overestimate the errors compared with their real mean value to simulate worst-case scenarios. This makes model selection and optimization harder, but once a proper solution is found, it will ensure a larger stability range in real experiments.

Order parameters

In this subsection, the quantitative requirements of stability and coherence of a flock will be discussed. To this end, we used three measures: coherence, collision avoidance, and obstacle avoidance. Coherence is frequently described by the spatial velocity correlation, which is a commonly accepted order parameter of collective motion. However, on a large area, correlated subflock clusters without global coherence can also be treated as a reasonably good solution for flocking. Therefore, it is practical to measure the correlation function only within connected clusters. To define clusters, we defined a communication graph that contains the agents as nodes. An edge exists between two nodes if the agents referred by the nodes are closer to each other than a given rcluster, typically defined by the range of interactions. We used the valueEmbedded Image(12)where Embedded Image is the braking distance r for which D(r, a, p) = v.

Let N be the number of agents, Ni be the number of agents in the cluster that contains the ith agent, and let Ji refer to the set of indices of agents that are in the same cluster as the ith agent. On the basis of these notations, the expression of cluster-dependent velocity correlation takes the following form:Embedded Image(13)This value needs to be maximized. Besides high velocity correlation inside clusters, one can characterize the flock with properties of the communication graph itself. For example, the number of disconnected points (Ndisc, referring to agents that cannot communicate with any other agents) can be measured, or a minimum cluster size (Nmin) can be defined as an error threshold for avoiding the situation where only very small groups of agents are present in the system. Of course, the minimum cluster size should depend on the total number of agents. For intuitive reasons, we chose Nmin > N/5 as an experimentally good lower threshold.

The next important requirement is collision-free motion. We defined a characteristic distance rcoll = 3 m, which refers to a dangerous zone around agents. If two agents are closer to each other than rcoll, a dangerous situation (collision) occurs. During algorithm tuning, the number of collisions should be minimized, which is similar to minimizing the following parameter, the so-called collision risk:Embedded Image(14)where Θ(.) is the Heaviside step function.

With the wall-agent interaction velocity term defined as Eq. 9, one can restrict the motion of the flocking agents into a closed space, which can be viewed as a method for maintaining the cohesiveness of the group (this is a general criterion of flocking behavior) but also can be treated as a general approach of obstacle avoidance, which is a common task in collective robotics. Below, we define an order parameter for calculating the possible collisions with the walls of the arena or obstacles (this parameter has to be minimized):Embedded Image(15)where Embedded Image is the signed form of ris, taking a positive value outside, and a negative value inside, the arena (and the opposite for obstacles), assuring that the calculation of the average is performed only at the points where the agents are outside of the arena (or inside obstacles).

With the parameters presented above, one can define a quantitative criterion for safe flocking behavior for the simulated (or real) robots, namely, ϕcorr → 1, ϕcoll → 0, ϕwall → 0, and Nmin > N/5.

Finally, we also require the flock to move with a certain flocking speed:Embedded Image(16)

Fitness function

Instead of parameter sweeping or any relatively slow global optimization method, the search for the optimal values of the 11 model parameters (Embedded Image, prep, Embedded Image, Cfrict, vfrict, pfrict, africt, Embedded Image, vshill, pshill, and ashill) is known to be much more efficient by using evolutionary optimization. Within such a framework with the defined set of order parameters, one can choose between single- or multiobjective evolutionary algorithms. We chose using a single-valued fitness function that contains several criteria about the order parameters presented in the previous subsection. According to these assumptions, a global fitness function can be defined using three different types of transfer functions (the value of these functions should be between 0 and 1.

The first type is a monotonically growing function, F1(ϕ), which converges to 1 with increasing ϕ:Embedded Image(17)where S(x, x0, d) is a sigmoid function with a smooth sinusoidal decay from x0d to x0:Embedded Image(18)The second transfer function is derived from the probability density of the normal distribution, with a single maximum at ϕ = 0 and a smooth decay around it:Embedded Image(19)Finally, the third transfer function is a sharp peak, which gives a harsh constraint to the fitness around ϕ = 0:Embedded Image(20)A visual illustration of the shape of the three transfer functions is given in Fig. 7. With these transfer functions, we can construct a single-objective fitness function that takes into account all necessary requirements of safe flocking behavior:Embedded Image(21)whereEmbedded Image(22)and vtol, atol, and rtol are tolerance values for speed, collision risk, and collisions with walls and obstacles, respectively. These tolerance values can be chosen arbitrarily, depending on the absolute and relative importance of the partial fitness components in the optimization. We chose Embedded Image m/s, atol = 0.00003, and rtol = 2 m, which gave a balanced weighting to the overall fitness function. Note that the order parameter ϕcorr is present in the fitness function only as a multiplicative term with a cutoff at 0 because its values are originally between −1 and 1.

Fig. 7 Three different types of transfer functions with a codomain of [0,1].

A global single-objective fitness value can be defined as the multiplication of several partial fitness functions based on these transfer functions.

With this method, we created a single-objective optimization scenario, which can be solved using state-of-the-art evolutionary algorithms such as the CMA-ES (38, 53). To perform this task, we used an open-source Python library (54), with default settings (see table S6). Parameters were initialized at mid-value, with an initial SD of one-sixth of their allowed range.

Drone setup

Our quadcopters use the Pixhawk autopilot (55) for controlling the rotors with a slightly modified ArduPilot controller [see our open-source code that runs on the Pixhawk system at (56)]. We also used an onboard, Linux-based companion minicomputer (Odroid C1+) through which we gave desired velocity commands at 20 Hz to the autopilot. The desired velocity was calculated onboard using the flocking model presented above as the control algorithm.

We used two independent, parallel wireless modules for interagent communication in the 2.4-GHz range, both broadcasting the same status packets. One is an XBee module broadcasting through its own proprietary protocol at 1 Hz; the other one is a small universal serial bus (USB) wifi dongle (Odroid Wifi Module 0) transmitting user datagram protocol (UDP) packets through a local ad hoc wireless network at 10 Hz. The two modules are complementary in bandwidth and range (XBee being small bandwidth and longer range and Wifi being large bandwidth but shorter range). Packets contained an absolute time stamp, geodetic position, and velocity principally measured by onboard GNSS receivers and other safety-related status info about the actual state of the drone that was not relevant to the main control algorithm. Relative position and velocity were calculated by the differences of GNSS-based absolute measurements. The net payload size of a status packet was 46 bytes.

Because of the properties of the wireless media and the broadcasting transport protocols, packet delivery was unreliable; at a transmission frequency of 10 + 1 status packets sent per second per UAV, typically only around 40 to 80% of them were received by close-by peers on average in a general flight with 30 drones, where the reception rate was dependent on many factors, such as the bandwidth usage of other nearby wireless solutions or the details of the surrounding environment. Because signal power decays quickly with space [ideally at 1/r2 in open space (57)], reception from nearby agents (<20 m) was nearly perfect, whereas large communication outages started to occur at larger distances (>50 m). As an illustration of the actual communication characteristics, a detailed log of the distance dependence of the reception quality from a 5-min sample flight with 32 drones can be seen in Fig. 8. Note that the observed spatial decay fits naturally into our targeted distributed communication approach for two reasons: (i) the most critical information comes from the closest agents, which is always the most reliable, and (ii) the communication network naturally becomes scalable with flock size as the bandwidth overlap decays with distance.

Fig. 8 Probability distribution of the communication outages as a function of distance.

The database was gathered from a 5-min section of a general flight with 32 drones in a remote open-air setting. Each drone logged a 5-Hz sampling of the elapsed time since the reception of the last status packet from all other drones. This value [we call it timeout for simplicity but it actually also contains a small (<0.2 s) processing delay] was matched later with the position of the drones recorded accurately by each drone onboard. The distribution shows logs from all drones (1,349,490 data points in total), and it is normalized for each row (distance) separately. Color indicates timeout probability in each bin for a given distance. Average timeouts with SD and with the number of data points are indicated on the right for 50-m distance binning, whereas the black line on the plot indicates average and SD of timeout for each distance bin of 10 m. Database is very sparse and thus less accurate above 150 m, but as a general tendency, communication is most stable between close-by drones, whereas outages were more frequent and longer with increasing distance.

There is a trivial reality gap in the communication aspects: Simulations contain constant delay and communication range, whereas the real setup was more stochastic and distance-dependent. Furthermore, in the real setup, we compensated for communication outages to some extent with linearly extrapolating neighboring drone positions using the global time stamps and velocity. With this approach, our overall aim was to have the safest real system possible and an underestimated communication quality through the model design phase to be prepared for a worst-case communication scenario. As a result, with this setup, we did not experience any mission-critical communication outage so far.


Movie S1. Simulation of the old flocking model (algorithm A) with 100 agents.

Movie S2. Simulation of the new flocking model (algorithm B) after evolutionary optimization with 100 agents.

Movie S3. Simulation of flocking for different speeds (4 to 32 m/s), flock sizes (30 to 1000 agents), and scenarios.

Movie S4. Flight log visualization of 30 drones at 4 m/s in a diagonal flight pattern.

Movie S5. Flight log visualization of 30 drones at 6 m/s with obstacles.

Movie S6. Flight log visualization of 30 drones at 8 m/s in a circular flight pattern.

Movie S7. Summarizing documentary with simulation, flight log visualization, and footage on real flights.

Table S1. Optimized model parameter values and working ranges in simulation.

Table S2. Statistic evaluation of optimized simulations.

Table S3. Explanation of flocking model parameters.

Table S4. Model parameter values used on real drones.

Table S5. Environmental parameters of the realistic setup.

Table S6. Parameter settings of the evolutionary optimization.


Acknowledgments: The long-exposure photo was taken by Zs. Bézsenyi. We are grateful for the help of other members of the robotics team, namely, B. Badár, I. Donkó, and N. Gál, who are not present here as authors, but have made valuable contribution to make our drones fly. Funding: This work was partly supported by the following grants: U.S. Air Force grant no. FA9550-17-1-0037, K_16 Research Grant of the Hungarian National Research, Development and Innovation Office (K 119467), and János Bolyai Research Scholarship of the Hungarian Academy of Sciences (BO/00219/15/6). Author contributions: The project was conceived by T.V. Cs.V., G.V., and T.V. developed the concept and algorithms. Cs.V. and G.V. developed the simulation. T.N. and G.V. developed the software for drones. G.S. and G.V. developed the drone hardware. A.E.E. and G.V. implemented the evolutionary optimization framework. T.N., G.S., Cs.V., and G.V. performed drone experiments. The paper was written by A.E.E., Cs.V., G.V., and T.V. Competing interests: The authors declare that they have no competing interests. Data and materials availability: The project website is Flight logs related to the results presented are available at Source code of our robotic simulation framework is available at All other data needed to evaluate the conclusions in the paper are present in the paper or the Supplementary Materials.
View Abstract

Navigate This Article