Difficulty of a Su Doku puzzle

For what I have read about Su Duko puzzle difficulty, there are no established rules and one must know how a puzzle can be solved before he can assess its difficulty. I have decided to experiment with some measures I have developed in the “Template as a help for solving a Su Doku puzzle” in the perspective of assessing difficulty.

Some principles behind the “Template…”

Part of the template is a table named “global performance” that indicates for each cell not containing a clue the number of different integers that could be placed in that cell. These numbers are the sums of the 9 “integer performance tables” where a 1(one) indicate that the given integer could occupy that position after applying the basic rules to the given puzzle.

A “global” number of 1 means that there is only one digit that can occupy that cell; there is no was around it and we have proposed a procedure using these “ones” in conjunction with summary measures of the integer tables for solving the puzzle. We have also pointed to the fact that there are puzzles that can be solved entirely while using these simple mandatory assignments of digits to cells and that in such case the solution is independent of the order in which the operations are conducted; those puzzles have a “direct solution”. The puzzles that cannot be solved directly with the proposed use-all-the-1’s procedure require intervention(s) of the player who must choose between alternative possibilities. Each choice brings the solution a step closer and I will refer to these puzzles as requiring a “stepped solution”.

A basic measure : degrees of freedom

The numbers larger than 1 once reduced by one are the number of free choices that have to be considered for a given cell for a given step. If we find that 4 and 7 can be placed in a given cell (the “global” number is 2), if we try 4 and that it does not work, then 7 must be placed in that cell, there is only one free choice. Statistical procedures use often a similar notion of degree of freedom, the number of parameters that can be set freely and independently to define uniquely a situation. I propose to name the sum of all the “number-minus-one”s in the global table the initial “degrees of freedom” (dof) of a puzzle.

That measure does not reveal however if a puzzled has a direct or stepped solution. One must try solving it to find it out, and the template I offer is an easy and relatively quick way to do it. I propose to deal as a start only with direct puzzles and to see the relationship between “assessed” difficulty (i.e. the level that is most often given by the puzzle author) and “measured initial” dof.

“Direct” puzzle examples

I prepared a sample from various origins (2 books and 2 web facilities) each one with its own ranking scale, some with 3 levels, others with 4 or 5. The calculated dof varied from about 40 to slightly almost 165. The first diagram is a scattergram  for the 70 puzzles in the sample showing the relationship between the number of clues and the degrees of freedom and one can see that there is a well established inverse relationship. The relatively small number of cases may explained that the spread of dof is not very visible for the largest numbers of clues; there is a bulge towards the center (around 29 clues) and a narrowing for the smallest numbers of clues that is probably due to the fact that it become difficult to find puzzles with small number of clues that do not required a stepped solution.

The second diagram is the same data but presented with the levels of difficulties as given by the “authors” as a basis for the class symbols. I have used the codes N1 for the easiest level to N4 is the hardest. In a book, there were only 3 levels, thus no N4 from it.

On the basis of their degrees of freedom, the levels are somewhat mixed up, mainly N3 (small circles)  that overlaps both its neighbors. If an objective was to bring some order in those different classifications, one possibility could be to adopt a dof scale with 3 reference values: 100 (below 100 easy and possibly below 80 very easy) 120 (between 100 and 120 medium) 140 (120-140 hard and above 140 very hard). That simple scheme will clear up existing confusion by reassigning some puzzles to the right class in the center of the range as the following table shows:

 “authors” level Proposed levels L1 L2 L3 L4 N1 8 8 N2 3 16 11 30 N3 18 7 25 N4 7 7 11 16 29 14 70

The previous exploration was intended to show that the use of the calculated initial degrees of freedom could become an objective basis to define some “universal” scale of difficulty, but not before learning more about the distribution of dofs at the extremities of the scale, mainly with low numbers of clues, and expanding the scale to include “stepped” puzzles. But that will require dealing first with the notion of the difficulty of a puzzle.

Extension to “Stepped” puzzles

I have dealt only with “direct” puzzles what can be solved by the automatic application of the basic rules and the dof measures the amount of choices available initially. The “solver” does not have to make any decision other than the order in which he will choose filling in the puzzle “mechanically”, order that has no impact on the solution. “Dof” indicates in this case the apparent multiplicity of the choices initially offered by the puzzle, its “complexity”.

We can add incidentally here that “direct” puzzles offer only one solution. Ambiguous puzzles offering more than one solution must be of the stepped variety to allow alternative choices that can lead to different acceptable solutions.

When the procedure used to fill cells in an automatic way (assigning integers on the basis of the 1s present in global or integer performance tables) stops before the puzzle is finished (or cannot even start with some very rare puzzles), the “solver” is faced with having to exercise some choice and follow it to the end. This will be the first step in the solution of the puzzle. His choice could be between 2 integers in one position (a 2 in that position of the global performance table) or between two positions for one integer (a 2 in a group, column or row total of an integer performance table). Of course, if there are no 2s, the reasoning can be extended to a higher value.

If the choice made leads directly to the solution, we conclude that the puzzle is a one-step puzzle and that the user was faced with a decision situation offering a difficulty measured by the dof recorded at that phase, i.e. at the end of the automatic assignment stage. We propose that the difficulty of a one-step puzzle be measured by the sum of the initial dof and the “step” dof.

If the choice does not lead to a solution, there are two outcomes: if an impossibility is detected, the alternative choice must be used and we are back to the previous situation (direct solution or not); or we have encountered another step. The description of an algorithm for finding the solution is not the purpose of this paper and would be too cumbersome because the validation of a step implies that entire branches be explored until encountering impossibility (that invalidates the entire branch) or final solution (that solves the puzzle); if another step is encountered in a branch all sub-branches must be explored until encountering impossibility or solution.

A general measure

We will assume that a solution has been found and that its “path” can be described as a series of steps (specific choices of an integer in a given position followed most generally by automatic assignations of integers) with their corresponding dofs. We propose as a measure of difficulty the sum of the initial dof and of all the step dofs.

I believe that the real difficulty of a puzzle lies essentially in the numbers of choices to be made (steps) but measuring difficulty only by the number of steps will not be a sufficiently detailed measure to establish various levels among the “direct” puzzles. Using the degrees of freedom as suggested above provides a unique measure combining “complexity” (dof) and number of steps by the repetition of a part of the previous dof. That proposed measure can be used to design a pertinent scale of difficulty.

Questions to explore before making a final recommendation

Many questions must be answered before such a proposal could be adopted. I see them belonging to two areas. One group of questions is centered around the notion of “solution path”; is there only one path for a given puzzle? If more than one path each with different total dofs, which total path dof should be chosen? Etc.

The second area deals with the numerical impact of the calculated total path dof. I expect to see two particular consequences, one is a clear differentiation between direct and stepped puzzles (my first impression being that stepped puzzles are among the highest initial dofs and that adding step dof will give them clearly higher values) and the other is that the upper values that can be encountered will be much higher than the highest for a direct puzzle (I expect to see values reaching 500 and may be exceeding that).

I continue exploring these questions.

21.02.2006