Sudoku is a neat little puzzle played on a 9x9 grid, further divided into 9 3x3 subgrids. A small number of cells are initialized with digits, and the digits 1 to 9 are to be filled into the remaining blank cells so that each column, row, and subgrid contains each of the digits 1-9 exactly once.
I got kind of interested in learning about them a while ago and wrote a couple of programs to solve them, one in emacs and one in C.
The emacs one is probably nicer, but it requires that you run emacs. It sets up an environment that helps you solve the puzzles by doing the bookkeeping for you, to let you concentrate on the logic of solving. It also has undo, checkpointing (in case you want to follow a guess out), imports several different file formats, can give hints or solve the puzzles for you (showing the logic behind its choices), and even supply you with new very hard puzzles, if you download the data from Gordon Royle.
The emacs version is open source and available on my emacs page.
I started to think about how to solve it, and came up with a really nice and efficient data structure that is very efficient, fast, and can solve any solvable puzzle. It took my free time at work for about a week, but here it is. It solves 4x4, 9x9, 16x16, and 25x25 puzzles.
|Download the 3 files (tar or zip), compile them (the file sudoku_debug.c is not needed unless you want access to extra functions in a debugger), and run them. Full instructions are given in the files.
A quick description: the key is not the numbers that are in the cells, but the sets of numbers that are not yet eliminated from the empty cells. When all but one number is eliminated clearly the other cells in the row, column, and square cannot have that digit.
This generalizes to more than one digit as well. For instance, if in a row one cell must be 1 or 2, and a second cell also can be shown to be 1 or 2, even without knowing which is which, 1 and 2 can be eliminated from every other cell in that row.
For more than 2 it also holds, but there is no need for any single cell to have all the bits set. For instance, if in a row (or column, or square) one cell has 1 or 2, a second has 2 or 3, a third has 3 or 4, a fourth has 4 or 5, and a fifth has 5 or 1, this creates a cycle, and the remaining 4 cells cannot have the values 1 through 5.
The program keeps a list of cells that potentially can eliminate possibilities sorted by level. A cell is put on the list 3 times, once for each direction, and the linked list is kept sorted by level. The level starts out as the number of possible values of the cell. When the smallest level is taken off the list to be checked, 2 things can happen: either it is in a cycle, or it is not. If it is in a cycle the row/column/square removes their values from the remaining cells, and each of the cells in the cycle is removed from the list, since it has already made its contribution. If it is not in a cycle then the level is incremented by 1, and the cell is shuffled up the list.
When a cell has 1 or more bits cleared it potentially carries new information, so it is put back onto the list 3 times (once for each direction) according to its level.
Usually this process will solve the entire puzzle. If not, the program will recursively guess a cell, and see if that allows it to be solved. The process will continue until a solution has been found, or it has been demonstrated that no solution is possible. None of the published puzzles that I tested required guessing, to test that part I needed to remove information from them.
The algorithm works nicely on different size grids. It will solve 4x4, 9x9, 16x16, and 25x25 puzzles. The only limit is I use a bit mask to hold the potential values, so it is limited to squares <= 32. It would not be too hard to write subroutines to remove this limitation, but I have not seen any 100x100 puzzles on the web anywhere, and until I do this will suffice.