Having recently completed my latest M.Eng block on "Natural and Artificial Intelligence", I became aware of research over the past decade into a new paradigm of network traffic engineering. This model turns its back on traditional destination-based solutions (OSPF, EIGRP, MPLS) to the combinatorial problem of decision making in network routing, favouring instead a constructive greedy heuristic that uses stochastic combinatorial optimisation. Put in more accessible terms, it leverages the emergent ability of systems composed of quite basic autonomous elements, working together, to perform a variety of complicated tasks with great reliability and consistency.
In 1986, the computer scientist Craig Reynolds set out to investigate this phenomenon through computer simulation. The mystery and beauty of a flock or swarm is perhaps best described in the opening words of his classic paper on the subject:
"The motion of a flock of birds is one of nature’s delights. Flocks and related synchronized group behaviors such as schools of fish or herds of land animals are both beautiful to watch and intriguing to contemplate. A flock ... is made up of discrete birds yet overall motion seems fluid; it is simple in concept yet is so visually complex, it seems randomly arrayed and yet is magnificently synchronized. Perhaps most puzzling is the strong impression of intentional, centralized control. Yet all evidence indicates that flock motion must be merely the aggregate result of the actions of individual animals, each acting solely on the basis of its own local perception of the world."
The way ant colonies function suggests an analogy: their emergent ability to optimise paths reliably and consistently could be leveraged to improve how the combinatorial optimisation problem of path selection in complex networks is solved.
The fundamental difference between modelling a complex telecommunications network and more commonplace problems of combinatorial optimisation, such as the travelling salesman problem (TSP), is that the state of a network such as the internet changes from moment to moment. In the TSP, the towns, the routes between them and the associated distances don’t change. Network routing, however, is a dynamic problem. It is dynamic in space, because the shape of the network (its topology) may change: switches and nodes may break down and new ones may come on line. It is also dynamic in time, and quite unpredictably so. The amount of network traffic varies constantly: some switches may become overloaded, there may be local bursts of activity that make parts of the network very slow, and so on. So network routing is a very difficult problem of dynamic optimisation. Finding fast, efficient and intelligent routing algorithms is a major headache for telecommunications engineers.
So how, you may ask, could ants help here? Individual ants are behaviourally very unsophisticated insects. They have a very limited memory and exhibit individual behaviour that appears to have a large random component. Acting as a collective, however, ants manage to perform a variety of complicated tasks with great reliability and consistency, for example finding the shortest route from their nest to a food source.
These behaviours emerge from the interactions between large numbers of individual ants and their environment. In many cases, the principle of stigmergy is used. Stigmergy is a form of indirect communication through the environment. Like other insects, ants typically produce specific actions in response to specific local environmental stimuli, rather than as part of the execution of some central plan. If an ant's action changes the local environment in a way that affects one of these specific stimuli, this will influence the subsequent actions of ants at that location. The environmental change may take either of two distinct forms. In the first, the physical characteristics may be changed as a result of carrying out some task-related action, such as digging a hole or adding a ball of mud to a growing structure. The subsequent perception of the changed environment may cause the next ant to enlarge the hole, or deposit its ball of mud on top of the previous ball. In this type of stigmergy, the cumulative effects of these local task-related changes can guide the growth of a complex structure. This type of influence has been called sematectonic. In the second form, the environment is changed by depositing something which makes no direct contribution to the task, but is used solely to influence subsequent task-related behaviour. This sign-based stigmergy has been highly developed by ants and other exclusively social insects, which use a variety of highly specific volatile hormones, or pheromones, to provide a sophisticated signalling system. It is primarily this second mechanism, sign-based stigmergy, that has been successfully simulated with computer models and applied to network traffic engineering.
In the traditional network model, packets move around the network completely deterministically. A packet arriving at a given node is forwarded by the device, which simply consults its routing table and selects the optimum path for the packet's destination. There is no element of probability: the values in the routing table represent not probabilities but the relative desirability of moving to other nodes.
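To make the deterministic lookup concrete, here is a minimal sketch in Python. The table layout, node names and cost values are all made up for illustration, and "desirability" is assumed to be a cost where lower is better; real protocols such as OSPF are far more involved.

```python
# Hypothetical routing table held by a single node:
# destination -> {neighbour: cost of reaching the destination via that neighbour}.
# All names and numbers are invented for this sketch.
ROUTING_TABLE = {
    "D": {"B": 4, "C": 2},
    "E": {"B": 1, "C": 5},
}


def next_hop(table, destination):
    """Return the neighbour with the lowest cost towards `destination`.

    The same table and destination always give the same answer:
    nothing here is random.
    """
    costs = table[destination]
    return min(costs, key=costs.get)
```

With this table, a packet for "D" is always sent via "C" and a packet for "E" always via "B", however many times the lookup is repeated.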
In the ant colony optimisation model, virtual ants also move around the network, their task being to constantly adjust the routing tables according to the latest information about network conditions. For an ant, the values in the table are probabilities that its next move will be to a certain node. The progress of an ant around the network is governed by the following informal rules:
- Ants start at random nodes.
- They move around the network from node to node, using the routing table at each node as a guide to which link to cross next.
- As it explores, an ant ages, the age of each individual being related to the length of time elapsed since it set out from its source. However, an ant that finds itself at a congested node is delayed, and thus made to age faster than ants moving through less choked areas.
- As an ant crosses a link between two nodes, it deposits pheromone. However, it leaves it not on the link itself, but on the entry for that link in the routing table of the node it left. Other 'pheromone' values in that column of the node's routing table are decreased, in a process analogous to pheromone decay.
- When an ant reaches its final destination, it is presumed to have died and is deleted from the system. R.I.P.
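The rules above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the algorithm from any particular paper: the three-node network, the `delays` mapping, the choice to update the column for the ant's own destination, and the deposit size `0.1 / (1 + age)` are all invented here. The renormalisation step is what makes the other entries in the column shrink when one is reinforced.

```python
import random

# Hypothetical three-node network: node -> list of neighbours.
LINKS = {"A": ["B", "C"], "B": ["A", "C"], "C": ["A", "B"]}


def uniform_tables(links):
    """Build tables[node][destination] = {neighbour: probability}, all equal."""
    return {
        node: {
            dest: {n: 1.0 / len(nbrs) for n in nbrs}
            for dest in links if dest != node
        }
        for node, nbrs in links.items()
    }


def choose_next(column, rng):
    """Pick a neighbour at random, weighted by the probabilities in the column."""
    neighbours = list(column)
    weights = [column[n] for n in neighbours]
    return rng.choices(neighbours, weights=weights, k=1)[0]


def deposit(column, chosen, delta):
    """Add `delta` to the chosen entry, then renormalise so the column sums to 1.

    Dividing by (1 + delta) reduces every other entry, which plays the
    role of pheromone decay.
    """
    for n in column:
        bonus = delta if n == chosen else 0.0
        column[n] = (column[n] + bonus) / (1.0 + delta)


def run_ant(tables, delays, source, destination, rng, max_hops=100):
    """Walk one ant towards its destination, updating tables as it goes.

    Returns the ant's age on arrival, or None if it gave up after max_hops.
    `delays` maps congested nodes to extra ageing (an assumption of this sketch).
    """
    node, age = source, 0
    for _ in range(max_hops):
        if node == destination:
            return age  # the ant "dies" here and is removed
        column = tables[node][destination]
        nxt = choose_next(column, rng)
        age += 1 + delays.get(node, 0)  # congested nodes age the ant faster
        deposit(column, nxt, delta=0.1 / (1.0 + age))  # older ants deposit less
        node = nxt
    return None
```

Starting ants at random nodes, as the first rule asks, would simply mean drawing `source` and `destination` with `rng.choice(list(LINKS))` before each call to `run_ant`. Making the deposit shrink with age is one simple way to let ants that were delayed by congestion have less influence on the tables.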
Testing the ant colony optimisation system and measuring its performance against a number of other well-known routing techniques produced good results: the system outperformed all of the established mechanisms. However, there are potential problems of the kind that constantly plague all dynamic optimisation algorithms. The most significant is that, after a long period of stability and equilibrium, the ants become locked into their accustomed routes. They are then unable to break out of these patterns to explore new routes that could meet the new conditions following a sudden change in the network. This can be mitigated by introducing an element of purely random behaviour to the ants, in the same way that evolutionary computation introduces mutation to explore new possibilities.
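One simple way to picture that random element is sketched below in Python. The `noise` parameter and the function name are assumptions made for illustration: with probability `noise` the ant ignores the routing table and picks any neighbour uniformly at random, otherwise it follows the table's probabilities as usual.

```python
import random


def choose_with_noise(column, noise, rng):
    """Pick the next neighbour, occasionally ignoring the routing table.

    column: {neighbour: probability} for the ant's destination.
    noise:  chance (0.0 to 1.0) of a purely random move; hypothetical parameter.
    """
    neighbours = list(column)
    if rng.random() < noise:
        # Exploration: every neighbour is equally likely, even "forgotten" ones.
        return rng.choice(neighbours)
    # Exploitation: follow the pheromone-like probabilities in the table.
    weights = [column[n] for n in neighbours]
    return rng.choices(neighbours, weights=weights, k=1)[0]
```

With a fully locked-in column such as `{"B": 1.0, "C": 0.0}`, a noise of 0 means the link to "C" is never tried again, whereas any positive noise gives it an occasional chance to be rediscovered.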
AntNet routing has been tested on models of US and Japanese communications networks, using a variety of different possible traffic patterns. The algorithm worked at least as well as, and in some cases much better than, four of the best-performing conventional routing algorithms. Its results were even comparable to those of an idealised ‘daemon’ algorithm, with instantaneous and complete knowledge of the current state of the network.
It would seem we have not heard the last of these routing antics... (sorry, couldn't resist).