MP in Action‎ > ‎How to Use MP‎ > ‎

How to Succeed with Integer Programming

This article continues the occasional series on the use of extensions to Linear Programmng.

Previous articles have surveyed the various structures which can be used in integer programming - integer variables, special ordered sets, etc - and the types of problem which can be addressed. This article looks at the way in which integer programming problems are solved and how you can work with the IP code to make the algorithms more effective.

Branch-and-Bound

The most widely-used algorithm for solving integer programming problems is Branch-and-Bound. This was described in Using Idle PCs to Solve Scheduling Problems, from which Figure 1 is borrowed.

Figure 1: Branch-and-Bound Tree

For our purposes it will suffice to recall three key points:

  • we first solve the problem as an LP, i.e. without the requirement that the integer structures be obeyed. Typically this solution will contain a number of integer infeasibilities (integer variables which are not at integer values, etc);
  • we then seek to find an integer feasible solution by successively creating and solving sub-problems in which we try to force the integer structures to be obeyed. In so doing we create a tree of sub-problems which grows exponentially in size;
  • once we have found an integer feasible solution we are not finished but must continue to search the tree for better solutions. We use the value of our integer feasible solution as a cut-off to eliminate sub-problems which are worse than it. Only when we have traversed the entire tree can we declare that the best solution which we have found is (one of) the global optimum solution(s).

Why do we need to know how Branch-and-Bound works? Although many of us have learned the Simplex Method, few know how LP codes really work. It is because Branch-and-Bound is a much weaker algorithm and is much more in need of our assistance that we can gain from such an understanding.

The key difficulty with the Branch-and-Bound algorithm is that there is no good way of estimating what will happen to the objective function as we force variables to integer values, or indeed, whether the sub-problem will remain feasible. This means that the Branch-and-Bound "search" is often more like a random walk. It is because of this that IP codes provide numerous mechanisms to enable you to guide the search. Using these effectively can make the difference between success and failure.

Guiding the Search

With most "industrial-strength" IP codes you can do the following things:

  • use more powerful integer structures such as Special Ordered Sets;
  • specify priorities on integer variables and structures;
  • control the "shape" of the search, i.e. whether the search is broad across many nodes or proceeds deep quickly;
  • use cut-offs to prune the search, possibly at the expense of not finding the best solution.

All of these will help. In addition there may be options which do not require you to participate actively and which will improve the IP code's performance:

  • an "integer presolve", which tightens bounds on variables to reflect the fact that certain variables must take integer values;
  • automatic cut generation, which adds extra constraints automatically.

In practice these tend to help a bit but they won't usually make a dramatic difference. Formulating your problem better and participating actively with the Branch-and-Bound algorithm is much more likely to transform its performance.

Improving the Formulation

This is potentially the area where the greatest improvements can be made. It is also the hardest in which to provide general advice. Some points may be made, however:

  • use Special Ordered Sets wherever possible: they enable the IP code to branch on several variables simultaneously and the reference row provides much better estimation;
  • try to capture the intersection of decisions within the definition of the decision variables rather than in the constraints;
  • try to work out how the LP solution will "cheat" and add extra rows and columns which will reduce the scope for "cheating".

In purely LP models you should keep the model as small as possible. With integer models you should still keep models small but this must be balanced against the improvements in the performance of Branch-and-Bound from having a "tighter" formulation. On the other hand do not go too far in tightening the formulation, because extra rows and columns slow down the solution of the original problem and all the sub-problems. Also be on the lookout for tightenings which are more theoretical than practical: if the effect of adding extra rows is to push the solution very slightly aside, you may be better off without them.

Priorities

Priorities are a way of telling the IP code which are the important integer structures. It will branch on these first. If you would tackle the real-world problem by making certain decisions first and others later, this will probably be a good strategy for the IP code. You should set the priorities accordingly.

There is one big problem with priorites. It doesn't ensure that an integer variable with a high priority won't find itself being moved a long way down the branching tree. The trouble is that the Branch-and-Bound algorithm doesn't seek to apply constraints to variables which happen to be at integer values already. So you can have a high-priority decision which appears to have been made by the LP, but which is actually free to be changed as a result of consequential branchings. When this happens, the new value is almost certainly wrong but IP codes tend to plunge on downwards regardless. You can mitigate this effect with a broad search (see below) but what is really required is for the IP code to back-track and choose again.

What this means is that if you have a true hierarchy of decisions, it can well be more effective to solve the problem with only the high-priority variables required to be integer and then fix these variables at their optimal values and re-solve the problem with the lower-priority variables required to be integer.

A by-product of this approach is that it yields a tighter bound on the value of the optimum integer solution. If you require only the high-priority variables to be integer, you create a much simpler problem for the IP code. You are therefore far more likely to be able to complete the Branch-and-Bound search. If you do, the value of this partial-integer solution provides a tighter lower bound (for a minimization) than the continuous optimum on the value of any full integer feasible solution. Such a lower bound is of interest in itself and it can guide you when setting a cut-off.

Controlling the Shape of the Search

By default most IP codes try to search downwards as fast as possible. That is, they try to find an integer feasible solution quickly. Having made a branching decision they will only revisit it if the resulting sub-problem is infeasible as an LP. Once they have found an integer feasible solution they will go back and make a free choice of which sub-problem to develop further before plunging headlong downwards with that one.

This approach is often effective in finding integer feasible solutions quickly. But it does mean that a wrong turn early in the tree cannot be rectified until one finds both of a pair of sub-problems infeasible as LPs. Even then, the estimation process tends to bias it against going back to a node early in the tree. So it is really a case of; "if it works, fine; if not, it probably won't produce anything much however long you wait".

The main alternative is the "broad search". This forces the IP code to consider all the nodes at each stage rather than just the deepest. It is thus much less serious if a wrong decision is made early in the tree. Conversely it takes much longer to reach any integer feasible solutions. But once the integer feasible solutions are reached, they come quickly and you are far more likely to complete the Branch-and-Bound search than with a depth-first search.

One disadvantage of the broad search is that if there are many low-priority integer variables, it can take an enormous amount of time to reach any integer feasible solution. Often what is required is a broad search on the high-priority integer entities followed by a deep search on the low-priority ones.

Cut-offs

Before they find an integer feasible solution, IP codes will by default plough on regardless of the value of the objective function of the sub-problems they are exploring. If you know an integer feasible solution or are only interested in solutions better than some value, you should set a cut-off. This tells the IP code when to give up on one branch and try another.

Once an integer feasible solution is found the cut-off is automatically reset to its value. This means that the IP code will look only for solutions which are better, albeit infinitesimally so. If you are concerned only to find a good integer feasible solution you can be more aggressive by resetting the cutoff to look only for solutions which are significantly better than the one you have just found.

The disadvantage of aggressive pruning is that you may not find the best solution. On the other hand, beware of the concept of the best solution: there are very often many solutions with the same objective. When you reach a solution, try to understand which elements have been chosen because they are necessary and which are random selections from among many equally good options.

Related articles include But my Problem Isn't Linear! and What Can Integer Programming Do?. To find other articles, refer to the MP in Action page.

Comments