MP in Action‎ > ‎Other Applications‎ > ‎

Modelling Telecommunications Networks

This article is based on the 1996 Blackett Memorial Lecture given by Professor Frank Kelly of the Statistical Laboratory, Cambridge.

Managing a Telecoms Network

There are two related problems which a telecoms company faces in managing its network:

  • deciding the capacity of the links between nodes, i.e. the number of calls which can be transmitted simultaneously along them;
  • managing that capacity in real time, i.e. deciding how to route a call.

Although a link may have a capacity of C simultaneous calls (circuits), it is reasonable to expect it to handle many more than C calls because calls occur randomly and last a finite time.

The fundamental result in this area is Erlang's formula, published in 1917. Telephone calls are modelled as a random (Poisson) process in which calls arrive at a rate a. A call is blocked and lost if all C circuits are occupied. Otherwise the call is accepted and occupies one circuit for a random holding time with unit mean. Then the proportion of calls lost is

This formula has been used for 80 years to size individual components in a telephone network and is used today to size Web servers. In the 1970s and 1980s the analysis was extended to calculate the probability of blocking in a network where calls could be routed in different ways. BT's network is a dual-parent network in which each node (local exchange) is connected to at least two parents in the fully-connected core (see Figure 1).

Figure 1: A Dual-Parent Network with a Fully-connected Core

Although a fully-connected network offers the ultimate in flexibility, there remain issues in deciding how to use that flexibility. It is obviously best to route calls between two nodes, A and B, by the direct route AB if that is available. If it is not, the call can be routed via any other node, C, say (the tandem node),  provided capacity exists on the routes AC and CB.

Congestion in Networks

Leaving aside for the moment the issue of selecting the tandem node C, consider what happens as the traffic in the network increases. To start with, calls are routed directly. But as traffic increases, more and more calls are routed via tandem nodes, using two circuits instead of one. As this happens, it becomes even more likely that the direct route between two nodes is full and so all calls take two circuits. This is an example of positive feedback which occurs in congested networks and which can give rise to counter-intuitive results.

A classic instance of this was first observed by Braess with road-traffic networks. Drivers are trying to get from west to east across a city. There are two routes, the northern and the southern, with journey times for each leg as shown in Figure 2, where f is the flow rate along the leg.

Figure 2: Braess' Paradox

Given 6 units of traffic, drivers divide themselves evenly among the routes so that each driver takes 30 + 53 = 83 minutes to traverse the network. The system is in equilibrium and appears optimal to each user because if one driver switches route his journey will take longer.

Now suppose that an extra road is built from B to C. Drivers at B see the empty road to C and the prospect of reaching their destination more quickly by going via C. They take it and so the system moves to a new equilibrium with 4 units of traffic along AB and CD and 2 units of traffic along AC, BD and BC. As a result all drivers now face a journey of 92 minutes!

What is happening here is that drivers' attempts at local optimization are resulting in a system which is far from the global optimum because individual users are not receiving pricing signals which reflect the costs they are imposing on others. In order for a local optimization technique to work, such external costs must be priced.

With a telecoms network this is achieved by trunk reservation. Priority is given on links to directly routed calls. An indirectly routed call is allowed onto a link only if more than s% (e.g. 5%) of the circuits are free. This approach is highly successful in preventing runaway congestion and consequent breakdown of the network.

Dynamic Alternative Routing

The problem remains of selecting the tandem node. The conventional approach has been to hard-wire it or define increasingly complex rules. Inevitably these come up against Murphy's Law and sometimes worsen the congestion. Note that there is an intrinsic difficulty in selecting the tandem node. The decision has to be made locally and dynamically, with (at best) out-of-date knowledge of the load at other nodes.

The solution has been Dynamic Alternative Routing, developed by researchers at Cambridge and BT's laboratory at Martlesham. In this, at any time for each route AB the node A has a "current tandem" C to which it is sending calls destined for B which it has to route indirectly. Trunk reservation is used and when an indirectly routed call to B is rejected, it selects another node C' at random as the "current tandem" for AB.

This algorithm had been shown to be extremely effective in simulations and trials, approaching the theoretical maximum capacity of a network. In effect it performs simulated annealing on the hardware of the network itself, using a greedy probabilistic hill-climbing algorithm. But this was not enough to secure the necessary approval of BT's board for it to be implemented nationwide. That took a JCB driver who cut through one of the main fibre-optic links. The difficulties which the static rerouting algorithms had were enough to secure the go-ahead. The JCB driver was the toast of Cambridge and Martlesham!

Global Routing and Coalitions

Within a national telecoms network, indirect routing enables more calls to be carried between two nodes than there is direct capacity. Similarly at the international level, indirect routing offers scope to increase the capacity between two countries without laying more cables. This is particularly true when considering countries in different time zones, e.g. USA; western Europe; and the Far East / Australasia. The different times of day mean that when traffic between two zones is heaviest, the third zone is asleep and so there is plenty of spare capacity available to route calls indirectly through it.

Although savings of some 20 - 25% are available, little progress has been made in taking advantage of them. Most national telecoms companies are monopolies and control how much they charge both their own nationals for originating calls and other national telecoms companies for receiving calls. It seems that global takeovers and mergers, such as BT's with MCI, are easier to negotiate than reciprocal international transit fees.

Broadband Traffic and Statistical Sharing

With the convergence between computers and telecoms, the characteristics of traffic on telecoms networks are changing. There is a greater spectrum of data intensity and timeliness with voice, e-mail, Web browsing and video. This diverse traffic is being brought to a common format through the emerging standard of Asynchronous Transfer Mode, in which everything is reduced to 48-byte packets.

With conventional telecoms traffic, if we look at the number of packets of data flowing per unit time, the statistical structure changes as we focus on smaller and smaller time-scales. At fine time resolutions, voice calls are either on or off. This is what makes it possible to load so many calls into a single circuit without any one suffering perceptible degradation.

But with new broadband networks, the statistical structure is present at all time-scales, as in fractals. This means that conventional statistical analysis based on the peak and the mean is no longer sufficient and that in designing networks we need to consider the tails of distributions.

Tied up with designing the network is the issue of how to charge for it. One has only to consider the Internet today to realise the importance of having a charging mechanism which conveys information to the user about the congestion he is causing and to the network provider about the value of providing a better service.

It turns out that there is a such a tariff, or rather, family of tariffs. This is to charge:

am * t + bm * u

where am, bm are a pair of constants, t is the duration of the call and u the number of bytes in the call. As with mobile phone tariffs today, the network provider may offer the user a family of constants from which he selects the pair which he considers best value. Users with a high peak data rate will select a tariff with a low bm but a high am while those with a low peak data rate will prefer one the other way round. In this way users indicate to network providers in advance the likely nature of their load, thereby enabling the the network provider to size the network cost-effectively.

Related articles include Pricing Mobile Phone Tariffs. To find other articles, refer to the MP in Action page.

Comments