C++ Notes: Algorithms: Big-Oh Notation

How resource use grows as the problem becomes larger

It's useful to estimate the CPU time or memory an algorithm requires. "Complexity analysis" attempts to express the relationship between the number of data elements and resource usage (time or space) with a simple formula. This is especially valuable when you test a program with a small amount of data, but later production runs will use large data sets.

Dominant Term

Big-oh (the "O" stands for "order of") notation is concerned with what happens for very large values of N, therefore only the largest term in the formula is needed. For example, the number of operations in some sorts is N² - N. For large values of N, the N term is insignificant compared to the N² term, therefore the algorithm is described as O(N²) and the N term is dropped.
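
As a concrete illustration, the function below (a minimal sketch; countGreaterPairs is a hypothetical helper chosen only for illustration, not taken from any particular sort) performs exactly N² - N comparisons for N elements, yet it is classified simply as O(N²).

    #include <cstddef>
    #include <iostream>
    #include <vector>

    // For every ordered pair of distinct positions (i, j), counts how
    // often v[i] > v[j]. The outer loop runs N times, and the comparison
    // executes N - 1 times per pass (it is skipped when i == j), so the
    // total work is N^2 - N comparisons: O(N^2).
    std::size_t countGreaterPairs(const std::vector<int>& v) {
        std::size_t count = 0;
        for (std::size_t i = 0; i < v.size(); ++i) {
            for (std::size_t j = 0; j < v.size(); ++j) {
                if (i != j && v[i] > v[j]) {
                    ++count;
                }
            }
        }
        return count;
    }

    int main() {
        std::vector<int> data = {5, 3, 8, 1, 9, 2};
        std::cout << countGreaterPairs(data) << '\n';
    }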

Why it Matters

Here is a table of typical cases. Logarithms to base 2 (as used here) differ from logarithms in any other base only by a constant factor, so the choice of base doesn't affect the big-oh classification.
N     O(1)      O(log N)     O(N)    O(N log N)  O(N²)      O(N³)
      constant  logarithmic  linear              quadratic  cubic
1     1         1            1       1           1          1
2     1         1            2       2           4          8
4     1         2            4       8           16         64
8     1         3            8       24          64         512
16    1         4            16      64          256        4,096
1024  1         10           1,024   10,240      1,048,576  1,073,741,824
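
The difference between O(N) and O(log N) in the last row is dramatic: a linear scan of about a million elements may examine every one of them, while a binary search on sorted data needs only about 20 steps. Here is a minimal sketch using the standard library (the data and target values are arbitrary).

    #include <algorithm>
    #include <cstddef>
    #include <iostream>
    #include <vector>

    int main() {
        // 1,048,576 sorted elements (the last row of the table above).
        std::vector<int> v(1024 * 1024);
        for (std::size_t i = 0; i < v.size(); ++i)
            v[i] = static_cast<int>(i);

        int target = 999999;

        // O(N): std::find examines elements one by one --
        // up to about 1,000,000 steps.
        bool foundLinear = std::find(v.begin(), v.end(), target) != v.end();

        // O(log N): std::binary_search halves the range at each step --
        // about 20 steps for a million elements.
        bool foundBinary = std::binary_search(v.begin(), v.end(), target);

        std::cout << foundLinear << ' ' << foundBinary << '\n';
    }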

Best, worst, and average cases

You should always be clear about which case big-oh notation describes. The best, worst, and average cases of an algorithm can be very different. For example, quicksort is O(N log N) on average but O(N²) in the worst case.
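
Linear search makes the distinction concrete; the sketch below (linearSearch is a hypothetical helper, not a standard function) annotates each case.

    #include <cstddef>
    #include <vector>

    // Linear search: returns the index of target, or -1 if not found.
    // Best case:    target is the first element     -> O(1)
    // Worst case:   target is last or not present   -> O(N)
    // Average case: about N/2 elements are examined -> O(N)
    int linearSearch(const std::vector<int>& v, int target) {
        for (std::size_t i = 0; i < v.size(); ++i) {
            if (v[i] == target) return static_cast<int>(i);
        }
        return -1;
    }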

Why big-oh notation isn't always useful

Complexity analysis can be very useful, but it has limitations.

Benchmarks are better

Big-oh notation can give a very good idea of performance for large amounts of data, but the only way to know for sure is to try it with large data sets. There may be performance issues that big-oh notation doesn't take into account, e.g., the effect on paging as virtual memory usage grows.
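
A simple timing harness shows the idea: run the code at several sizes and watch how the time grows. This is a minimal sketch (std::sort and the doubling sizes are arbitrary choices for illustration).

    #include <algorithm>
    #include <chrono>
    #include <cstddef>
    #include <cstdlib>
    #include <iostream>
    #include <vector>

    int main() {
        // Time std::sort at doubling sizes. If the time slightly more
        // than doubles from row to row, growth is consistent with
        // O(N log N); real-world effects such as paging would show up
        // here but not in the big-oh formula.
        for (std::size_t n = 1 << 16; n <= (1u << 22); n *= 2) {
            std::vector<int> v(n);
            for (int& x : v) x = std::rand();

            auto start = std::chrono::steady_clock::now();
            std::sort(v.begin(), v.end());
            auto stop = std::chrono::steady_clock::now();

            auto ms = std::chrono::duration_cast<std::chrono::milliseconds>(stop - start);
            std::cout << "N = " << n << "  time = " << ms.count() << " ms\n";
        }
    }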