Master thesis in data compression

(Generation and compression of endgame tables in chess with fast random access using OBDDs)

As my master thesis I'm working on endgame tables in chess. My objective is to maximise the number of endgame combinations that can be stored in RAM, while still being able to access the tables fast enough to be used during full depth searches.

My original goal was to use data mining methods to improve the strength of an AI chess player, but I didn't have any concrete ideas, and after implementing a functional engine my interest shifted to endgame tables and how to compress them (motivated by having just completed the course data compression).

2005.02.15: The thesis has just been handed in (available from here).
2005.03.15: The defence, where I was also examined in the course Data Compression.

My supervisor is Peter Bro Miltersen



Before I started working on the endgame tables, I implemented a decent chess engine (an alpha-beta search with hash tables, quiescence search, iterative deepening). Also a lot of work was put in making functionality for e.g. loading games (in pgn or fen format).

I started doing some experiments with the nontrivial KBBK endgame (king and 2 bishops (one on black square and one on white) versus king).
This is explained here, together with some basic introduction to endgame tables in chess.

This experiment showed that the KBBK endgame table of size 10*32*32*64*2 (1310720) bytes could be compressed down to 181353 bytes using gzip.

However, for this purpose it is not an option to decompress an entire file, just for accessing one single value. Hence the compressed data format should allow for fast retrieval of a single value. That's why I have focused on compression based on ordered binary decision diagrams which should allow decompression of a single entry approximately as fast as a cpu engine evaluates a position.


Results

The thesis

Presentation at the defence (in danish, 20 pages pdf, 587.998 bytes)

Presentation in the course Strategic Game Playing (in danish)


Links:

Rules, formats, definitions

Legality of positions of simple chess endgames.
Wikipedia: Chess, Rules of chess, Chess terminology, Computer Chess, Computer Olympiad.
Specification of Forsyth-Edwards Notation for a chess position

Binary Decision Diagrams

Dynamic reordering by sifting (slides)

Endgames:

online search in Nalimov's 3-5 men endgames (relevant frame).
Download Nalimov's Tablebases (Often too many users)
Download Nalimov's Tablebases (Only 3-5 men endgames available)
Space-efficient indexing of chess endgame tables (gzipped .ps)
DarkThought. Contains most (all?) of the book "Scalable Search in Computer Chess".
An Approach to Data Compression in Coding Chess Positions. How many bits are needed to represent a typical chess position?

Chess programming:

In best first order:
  • Lecture notes: Strategy and board game programming
  • Chess Programming Theory
  • Using bitboards
    XBoard (a graphic interface used by many chess engines - including mine).
    Chesslib (here I have downloaded a file containing 2.000.000+ games)

    Chess programs:

    See Tim Mann's Chess Pages for the programs below:
  • GNU Chess. C source code available.
  • Evaluation function used by deep thought.
  • Beowulf project. C source code available.

    Other stuff:

    Chinook, a checkers program
    Arimaa, an even greater programming challenge.

    Data mining:

    Backgammon, neural net

    Programming

    Threads, concurrency, etc. in C
    Using and Porting the GNU Compiler Collection (GCC)

    Links to more links

    Articles about computer chess