2015-05-22

Sorting is computationally harder than playing chess

I will argue below that sorting of a big array by comparison is inherently harder than solving the problem of choosing the best reply move on a chess game.

One could say that it is not fair to put in competition a big, however finite, problem like chess with an arbitrary big problem like sorting n elements. Fair enough; the interresting thing is that sorting becomes inherently harder than chess for an "n" that is way smaller than you would expect, like sorting 3000 elements. At least in theory...

Sorting takes a minimum of n*log2(n) comparisons, 

more exactly log2(n!)


This is about sorting "by comparison only" and can be proven in 2 step. You can skip to the "chess" part if you already know this.

1. Sorting can be done in log2(n!) comparisons, that is just slightly less than n*log2(n).
We are used to consider sorting an O(n*log2(n)) problem. Actually there are algorithms that can do the sorting with little less than n*log2(n) comparisons. The sorting by insertion in an increasingly sorted array takes log2(n!) comparisons.

Sorting by insertions: we start with a vector containing only one element, then use binary insertion for each of the following elements. It will take log2(1)+log2(2)+log2.(3)+...+log2(n) = log2(1*2*3*...*n) = log2(n!) comparisons. This algorithm is not practical because you need many additional operations for moving the elements while inserting, however this is a prove that we can do sorting with only log2(n!) comparisons. Actually, this is not much better than n*log2(n), as n*log2(n)/log2(n!) -> 1 when n->infinity
2. Sorting by comparison cannot be done in less than log2(n!) comparisons
We can actually prove that log2(n!) is the minimum number of comparisons that are needed in worst case scenario to do a comparison-based sorting. After I realised this, I found that it is just a consequence of the Shannon theorem. However, this particular case can be explained much simpler.
Let's suppose that there is an algorithm that could sort any array of n elements in less than log2(n!) comparisons, for example log2(n!)-1. Such algorithm should be also able to sort all the permutations of the numbers 1..n in less than log2(n!)-1 comparisons. 
The solution for a certain permutation of the n! possible inputs is actually the reverse permutation that will put the numbers in the right position (1,2,....,n). The input of sorting can be any of the n! permutations of (1,2,....,n), so the solution can be any of the n! reverse permutations.
Remember, we cannot do other tests with the numbers except comparison. As all elements are distinct and there is no logic in comparing one element with himself, all comparisons will have only 2 possible results (a[i]>a[j] or a[i]<a[j]). If we denote the 2 cases by 0 respectively 1, we see that each comparison will add only a bit of information. After log2(n!) comparisons, we would have had 2^log2(n!) = n! distinct comparison results.
However, if we have less than log2(n!) comparisons, we will have strictly less than n! distinct results. For each of this results, the algorithm should be able to identify each of the possible n! reverse permutations that will make the array to become sorted. But this is impossible in our hypothesis, we cannot select all n! possible permutations based on log2(n!)-1 comparisons. Therefore, by contradiction, we need a minimum of log2(n!) comparisons, in worst case scenario, when doing sorting by comparison.

The best chess move can be chosen, theoretically, in only hundreds of bit comparisons


Choosing the best next move in a chess game is a function from the set of valid configurations to a set of valid moves. Usually chess is considered an exponential problem, as the number of possible table configuration increases exponentially with the number of moves that are evaluated. But let's approach the problem in some other way:

Chess configuration encoding
There are multiple ways to encode a chess configurations. For example FEN encoding will use something like "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1", that takes tens of bytes. However there are other chess encoding that takes in worst case 32bytes (256 bits). The encoding can be further compressed to 204 bitsFor this argument it is enough to know that any chess configuration can be represented in a reasonable number of bits (hundreds).

Next move encoding
The move can be encoded usually in 2-4 characters (like Qxd5), however it could be future compressed. For this argument, 4 characters is small enough.

With this encoding, a perfect chess playing algorithm should take as input a string of 2-300 of bits and output like 32 bits that encodes the best perfect move (or one of them).

This logic can be implemented as a simple binary tree with 2-300 levels. When receiving a certain input (a string of 0's and 1's), for each bit we will go left (if '0') or right (if '1'). In the leaf we will find the best next move, as a string of 32 bits. The whole search will take 2-300 bit comparisons. This could work reasonably fast even if we store the tree in a network of spinning hard drives. Of course, this is not feasible, at least for 2 reasons: we don't have so much space, and we cannot calculate all the best moves in the first place. However, let's stay a bit in the theoretical space, where we can imagine things that are not feasible today.

We seen that we could have an algorithm that can answer the best move in a really small number of comparisons. This algorithm would be incredibly huge to represent, but if we would have it on a huge hard drive it would play chess very fast.

What is interesting is that solving the best chess next move does not depend at all on the history of the game and not even on the exponential number of possible developments of the game. A chess oracle could directly have the best next move for each configuration.

Another way to put this is that you only need a function to measure the likelihood of winning for any board configuration. If you know this, you only need to chose the next move that will produce the configuration with the best likelihood of winning.

Can the best chess move be actually computed?
As long as we talk theoretically, this problem is actually solvable. We have a board with 64 squares, 32 pieces, so a finite number of possible configurations, let's say C. After maximum C moves, you will eventually cycle and this could be considered a draw. For all the valid C next moves, an algorithm can compute the value of the move considering all possible responses and responses to responses... Not all the moves are legal, some branches goes to a clear checkmate. However, the remaining are probably yet impossible to calculate with the current technology. This doesn't mean that is is impossible in theory.

A bit more realistic approach
Actually, you can use the best known algorithm as "backend" and the above tree as a cache. When you encounter a new configuration, you go to the classical chess algorithm. After many games, most of the configurations will be in the tree cache, and the algorithm will be able to answer in only 2-300 comparisons.

Back to sorting
We saw that, in theory, you could create an algorithm that answers a chess problem in 2-300 comparisons. On the other hand, we sow that sort is inherently hard, while chess could be, in theory, solved in a small number of comparisons. This means that a sorting will start to take inherently more comparisons as soon as the number of elements will go over 300*log2(300) < 300 * 9 = 2700.

Sorting by comparison is thus proven to be inherently hard, while chess can have, a constant complexity - however impractical. Chess just needs an unrealistically huge algorithm in order to be able to answer in a very short number of comparisons. On the other hand, sort will remain log2(n!), that is close to n*log2(n), with any imaginable huge algorithm we could made.

Even if we know the numbers to be sorted in advance and we store all their permutations in memory, we still need at least log2(n!) comparisons, worst case scenarion, in order to be able to select the right answer for the sorting problem.

Of course, the situation is usually different in the real life, you can sort an array with 2700 elements in a reasonable time, but cannot calculate the best chess move in the same time period. My intuitions tells me that the difference resides somewhere in the Kolmogorov complexity, but this might be a subject for a future reflection.

The space complexity of the above chess algorithm is huge, way over sorting. However, there is still possible to create a way more simple function that computes the same. We already have heuristics like "the sum of the value of all pieces", the question is how more complex this function must be to do the perfect evaluation. As you know now that it is possible, you might be even more motivated to find chess algorithms that are close to perfection in reasonable time. Or, conversely, prove that the space complexity expands exponentially as you reduce the time complexity.

P.S. These results are more philosophical than practical at this time. However, I find it very interesting from a theoretical point of view. Even if you found flaws in my argumentation, I might still be right! ;)

No comments:

Post a Comment