Understanding “Big-O” for Algorithm Running Time

How long will it take to get a computational result? This is an important question when choosing an algorithm to solve a problem. A poor choice will result in bad performance.

A more specific form of question is how the algorithm will behave as its parameters get larger or more complex. If sorting a hundred items takes ten microseconds, will sorting ten thousand of them take a thousand microseconds? More? Less?

You’ll hear discussions with comments like “The time this algorithm needs is on the order of n^2.” What that says is that for a relevant value n, the time needed is proportional to n^2. The variable n might be the number being operated on, the length of a string, the number of significant digits required, or the number of items in a set.

Bachmann-Landau notation

There’s a more formal way of putting it, with a more precise meaning. It’s called Bachmann-Landau notation. Paul Bachmann invented the notation, and Edmund Landau helped to promote its use. It’s also called big-O notation, because a capital O represents the idea of order.

You’ll often see the notation in the form of an equation, though it isn’t one. It looks like this:

f(x) = O(g(x)) as x -> ∞

You can read that as saying that some measure of the performance of a function f(x) is asymptotically proportional to a function g(x) as x approaches infinity. In this context, f isn’t an abstract mathematical function but an implementation of it. That is, it’s an algorithm or a specific piece of computer code.

Understanding the big O

Several points need clarification here. The first is that the big-O function is a proportion, not a definite quantity. How long f will take to run will depend on the underlying hardware and software, such as a virtual machine or interpreter. Those details are abstracted away. We’re only considering how the required time grows, not its absolute value.

The expression describes the asymptotic behavior of an algorithm. That is, the function g(x) comes closer and closer to being the actual proportion as x moves toward a limit, usually infinity. We’ll discuss this some more in a moment.

It describes the worst-case behavior. Most sets of parameters produce running times close to the worst case, while a few special cases allow better performance, so this is a reasonable approach.

Big-O functions can measure the growth of any kind of resource. In some cases, it’s important to know how much memory a process will require as its parameters grow, or how much stack depth it could use up. If you have a solid understanding of the concept, you’ll understand these cases as well. For the present, we’re focusing on execution time.

Different algorithms to achieve the same result might have different Big-O orders. That’s the whole point. If g(x) = x, that says the time grows linearly with x. That’s pretty good in a lot of situations. If g(x) = x^2, then the time grows with the square of x. That may be a problem. And if g(x) = 2^x, that says the time grows exponentially with x. Expect seriously slow performance if x is likely to get large.

Asymptotic behavior

The fact that the big-O function describes asymptotic behavior is important. It simplifies the expression when focusing on large values. For instance, the proportionality factor for an algorithm might be more precisely expressed as x^2 + x + 10. This expression would give a better estimate of the performance with small values, taking the irreducible overhead into account. But as x grows larger, the second and third parts of the sum become steadily less important. As x grows to infinity, x^2 by itself is an increasingly accurate approximation, with the relative error going toward zero.

Whenever the time to execute an algorithm is proportional to the sum of several factors, the one that grows fastest defines the asymptotic behavior.

Big-O and big-theta

A related measure of order of performance is called big-theta (ϴ). Big-O actually gives the upper bound of an algorithm’s behavior. It says it won’t be any worse than proportional to a given function. Big-theta defines a stricter limit, specifying both the upper and the lower bounds of the behavior.

For example, a sorting function might have O(n^2), where n is the number of items to sort. If the items already happen to be in order, the performance might actually be proportional to n rather than n^2, which is the best case it can encounter.

Calculating the best-case behavior of an algorithm can be difficult, and generally the upper bound is more important. Most discussions of the time for an algorithm concern big-O rather than big-Theta.

Putting it into context

The order of an algorithm is a major consideration in many cases, but it’s not always the driving factor. It’s important when several conditions apply:

The algorithm will be heavily used.
It will be applied to difficult cases, such as many data points and large parameters, which require a lot of computation.
Performance is important.
Other factors, such as memory usage, don’t have overriding importance.

When writing code that will just get occasional use and won’t affect system performance, then getting the job done without a lot of extra effort is the main consideration. If the design restricts the parameters to simple cases — for instance, a string operation will never take a parameter of more than 200 characters — the function with the better big-O performance may not offer an advantage. It may even be worse, if it has a high set-up cost.

Some algorithms which have very good asymptotic performance time have other costs, such as memory requirements. Some algorithms for calculating pi can produce any number of digits in a finite amount of memory (not counting the output itself). With some others, the amount of memory required goes to infinity along with the number of digits calculated. Once it reaches a certain point, it will have to swap memory on and off the disk drive. It won’t achieve the speed which the abstract description of the algorithm predicts. If you’re trying to set a world record for digits of pi, you need to take memory requirements into account. If you just want a reasonable value for calculations, you won’t run out of memory quickly.

Developers choosing a way to implement calculations need to understand the big-O order of the methods available to them. They need to recognize when it’s an overriding consideration and when other factors are more important. This lets them choose implementations that will provide good performance.