5.3 Discrete Variables

Integer Variables

Integer variables are ones which take the values 0, 1, 2, ... N. An upper bound, N, must always be specified. These have obvious application where some material must be used in an integer multiple, e.g. bags of various items to be mixed together to make animal feed.

In practice models are never exact and it is therefore spurious to seek the purported accuracy of identifying whether a decision variable whose value in the continuous solution is 20.45 should actually be 20 or 21 (or possibly some other value). Some IP codes, such as XPRESS-MP, recognise this and enable the user to define partial integer variables, which take integer values up to some limit and continuous ones thereafter. For myself, I believe in the counting system of the Yancos people of the Amazon: "one, two, three, lots", but you may wish to count a little further before reaching "lots".

Note that within an MP model you cannot use a single integer variable as an index to a function: y = f(x), where x takes values 1, 2, 3, ... N. This can indeed be represented, but by a series of binary variables, preferably within a Special Ordered Set of type 1.

Binary Variables

By far the most frequent occurrence of an integer variable is the binary variable. Here only two values are permitted: 0 or 1. This is used to represent a yes/no decision, 1 representing "yes" and 0 "no". This is also the hardest type for an LP-based technology such as Integer Programming to handle. This is because in the continuous solution a binary variable is constrained simply to lie between 0 and 1 and so if the continuous optimum value does not happen to lie at 0 or 1 the IP algorithm must do all the work of deciding which extreme to push it to.

Contrast this with a general integer variable. If we have a general integer variable with an upper bound of 10 and it takes the value 5.5 in the continuous solution, the chances are that the optimum integer value will be 5 or 6. Further, pushing the variable to either of those values is unlikely to have as much effect on other decision variables in the solution as pushing a binary variable from, say, 0.3 to 0 or to 1. Even pushing a binary variable from 0.01 to 0 can have an enormous effect (indeed it may make the problem infeasible) because it is eliminating the activities associated with that variable entirely.

There are very many uses of binary variables. Examples of the use of individual binary variables (as opposed to binary variables grouped together into Special Ordered Sets) include:

  • whether to do something or not (indicator variables);
  • fixed costs;
  • representing logical conditions;
  • non-mutually-exclusive sets.

Binary variables are conventionally represented by the Greek letter δ (delta).

Using a Binary Variable

Within an MP model we may wish to define a binary variable, δ, which is to take the value 1 if some activity is carried out, e.g. a new factory is built, and 0 otherwise. In order to do this we must do two things:

  • define the decision variable δ as a binary variable;
  • specify constraints which relate the value of δ to the consequences of building or not building the factory.
That is, we must provide constraints within the model which, coupled with the requirement that the variable δ takes the value 0 or 1, suffice to characterise the behaviour of the system.

Suppose that we have decision variables, xr, for the quantity of raw material r processed by the factory. There will be other constraints which relate to the outputs from the factory and to its capacity, but we need not concern ourselves with those here. In order to give effect to the δ variable we need to implement a constraint which says that there can only be non-zero xr's if δ = 1. In order to do this we need to know the maximum possible quantity of raw material which could be processed. Suppose that for each material r this is XMAXr. Then we write the constraints:

xr - XMAXr δ ≤ 0 for all r

When δ takes the value 1, this permits each xr to take any value up to XMAXr. But when δ takes the value 0, each xr must take the value 0. Given that the xr's were in any case limited to taking values less than or equal to XMAXr, these constraints permit the xr's to take any acceptable value if δ = 1 and force them to be zero if δ = 0.

There are several points to note here:

  • we could have used any upper bound on the value of xr as the coefficient of δ instead of XMAXr;
  • we could have implemented the restriction as a single constraint on the total quantity of raw material processed: Σr xr - XTOTMX δ ≤ 0 where XTOTMX is the maximum total quantity of raw material which can be processed;
  • we have not forced δ to be 0 if all the xr's are zero.

There is a good reason for using as tight a bound as the coefficient of the δ as we can obtain. We are constructing the δ variable for a purpose, e.g. to impose a fixed cost, FIXCST, on it for building the factory. When we come to solve the problem, the first stage is to solve the problem as an LP. The model may decide that it wants to use some of the xr's. If it does, it will want to incur as little of the cost of building the factory as possible. It will set δ to the smallest possible value which is consistent with the quantities of the xr's which it is processing, i.e.:

δ = max { xr / XMAXr }

Using a higher limit instead of XMAXr will reduce the value of δ in the continuous solution and therefore increase the distance between the continuous and the integer solutions.

What would be the effect of implementing the relationship between the δ variable and the xr's by a single constraint relating δ to Σr xr instead of to the xr's individually? It depends on how XTOTMX compares with the XMAXr's. Suppose for the sake of argument that there are 10 r's and that each XMAXr = 10. Then if XTOTMX = ΣXMAXr = 100, the model would be tighter using separate constraints. For suppose that the model wanted x1 = 4 and x2 = 6. Then the individual constraints will result in δ = 0.6 while the aggregate constraint will result in δ = 0.1. But if XTOTMX = 10, the aggregate constraint is tighter, yielding δ = 1.0.

Thus the tightness of the individual constraints compared with the aggregate constraint depends on the relationship between the limits which apply in each case. But there is a complication: the continuous solution will reflect the constraints which are included in the model. Thus if the optimum solution without any fixed cost would be x1 = 4, x2= 6, x3 = x4 = ... = x10 = 0 it does not follow that this would be the solution with the separate constraints. That would incur 0.6 of the fixed cost. There might be only a small preference for raw materials 1 and 2 over the others, in which case you might well find that the continuous solution was x1 = x2 = x3 = ... = x10 = 0.1 with δ = 0.1. In that case any value of XTOTMX less than 100 would make the aggregate constraint tighter than the separate ones.

What about including the individual constraints and the aggregate constraint? Surely that gives the best of both worlds? Certainly it does, in terms of tightness (although the aggregate constraint is redundant if XTOTMX = ΣXMAXr). But there is a trade-off to be struck between tightness of the model and its size. The larger the model, the longer it takes to solve it as an LP and the longer it takes to solve each sub-problem in the IP algorithm (see section 5.7). In this case these disadvantages would probably outweigh the benefits. In fact, the disadvantages of increased size would probably be such that you would do better to use the aggregate constraint alone provided XTOTMX ≤ 50.

In this model there would be no point in imposing a constraint which forced δ = 0 if all the xr's are zero. The fact that we are imposing a cost on the δ variable means that the optimization will set it to zero anyway.

What if this were not so? We should need some form of constraint

δ - M Σr xr ≤ 0

where M is some large constant. Strictly speaking, this does not impose the constraint we require. It states that δ must be 0 if Σr xr < 1 / M, and that it can be 1 if Σr xr ≥ 1 / M. This is the best we can achieve: it is impossible to force something to happen exactly at zero; it must be at a point very slightly above zero.

In practice if you find yourself needing to impose constraints both ways on some logical relationship, it is usually possible, and better, to reformulate it using a Special Ordered Set of Type 2.

previous contents next

Comments