The mathematical analysis of Sudoku falls into two main areas: analyzing the properties of a) completed grids and b) puzzles. Grid analysis has largely focused on counting (enumerating) possible solutions for different variants. Puzzle analysis centers on the initial given values. The techniques used in either are largely the same: combinatorics and permutation group theory, augmented by the dexterous application of programming tools.
There are many Sudoku variants, (partially) characterized by the size (N) and shape of their regions. For classic Sudoku, N=9 and the regions are 3x3 squares (blocks). A rectangular Sudoku uses rectangular regions of row-column dimension R×C. For R×1 (and 1×C), i.e. where the region is a row or column, Sudoku becomes a Latin square.
Other Sudoku variants also exist, such as those with irregularly-shaped regions or with additional constraints (hypercube) or different (Samunamupure) constraint types. See Sudoku - Variants for a discussion of variants and Sudoku terms and jargon for an expanded listing.
Mathematics uses elegant and succinct symbolism, often meaningful only with formal training. For this article, an abstract mathematical statement is used at the upper levels. Detail descriptions are deferred to sub-sections, usually with the term 'detail' in the section heading, and can be skipped where the condensed mathematical statement is (deemed) sufficient. The extended explanations can also be used as a practical introduction to the mathematical concepts involved.
The mathematics of Sudoku is a relatively new area of exploration, mirroring the popularity of Sudoku itself. NP-completeness was documented late 2002 , enumeration results began appearing in May 2005 This article is incomplete and will undoubtedly remain so as these efforts continue.
In contrast with the two main mathematical approaches of Sudoku mentioned above, an approach resting on mathematical logic and dealing with the resolution of the puzzles from the viewpoint of a player has recently been proposed in Denis Berthier's book "The Hidden Logic of Sudoku". This uncovers and formalizes all the generalized mathematical symmetries of the game and elicits new resolution rules based on them, such as hidden xy-chains.
Solving Sudoku puzzles can be expressed as a graph coloring problem. Let us talk about the 9 × 9 = 3^{2} × 3^{2} case. The aim of the puzzle in its standard form is to construct a proper 9-coloring of a particular graph, given a partial 9-coloring. The graph in question has 81 vertices, one vertex for each cell of the grid. The vertices can be labeled with the ordered pairs (x, y), where x and y are integers between 1 and 9. In this case, two distinct vertices labeled by (x, y) and (x′, y′) are joined by an edge if and only if:
The puzzle is then completed by assigning an integer between 1 and 9 to each vertex, in such a way that vertices that are joined by an edge do not have the same integer assigned to them.
A valid Sudoku solution grid is also a Latin square. The relationship between the two theories is now known precisely: it has been shown by D. Berthier that a first order formula that does not mention regions (also called blocks) is valid for Sudoku if and only if it is valid for Latin Squares.
There are significantly fewer valid Sudoku solution grids than Latin squares because Sudoku imposes the additional regional constraint. Nonetheless, the number of valid Sudoku solution grids for the standard 9×9 grid was calculated by Bertram Felgenhauer in 2005 to be 6,670,903,752,021,072,936,960 . This number is equal to 9! × 72^{2} × 2^{7} × 27,704,267,971, the last factor of which is prime. The result was derived through logic and brute force computation. The derivation of this result was considerably simplified by analysis provided by Frazer Jarvis and the figure has been confirmed independently by Ed Russell. Russell and Jarvis also showed that when symmetries were taken into account, there were 5,472,730,538 solutions . The number of valid Sudoku solution grids for the 16×16 derivation is not known.
The maximum number of givens that can be provided while still not rendering the solution unique, regardless of variation, is four short of a full grid; if two instances of two numbers each are missing and the cells they are to occupy are the corners of an orthogonal rectangle, and exactly two of these cells are within one region, there are two ways the numbers can be added. The inverse of this—the fewest givens that render a solution unique—is an unsolved problem, although the lowest number yet found for the standard variation without a symmetry constraint is 17, a number of which have been found by Japanese puzzle enthusiasts , and 18 with the givens in rotationally symmetric cells.
Take for example $mathbb\{Z\}\_\{n\}oplusmathbb\{Z\}\_\{n\}$ the group of pairs, adding each component separately modulo some $n$. By omitting one of the components we suddenly find ourselves in $mathbb\{Z\}\_\{n\}$ (and this mapping is obviously compatible with the respective additions, i.e. it is a group homomorphism). One also says that the latter is a quotient group of the former, because some once different elements become equal in the new group. However, it is also a subgroup, because we can simply fill the missing component with $0$ to get back to $mathbb\{Z\}\_\{n\}oplusmathbb\{Z\}\_\{n\}$.
Under this view we write down the example $n=3$: {{Sudoku 9x9 grid|title=The addition table in $mathbb\{Z\}\_\{3\}oplusmathbb\{Z\}\_\{3\}$|align=center|cellsize=24pt|= |s1=(0,0) |s2=(0,1) |s3=(0,2) |s4=(1,0) |s5=(1,1) |s6=(1,2) |s7=(2,0) |s8=(2,1) |s9=(2,2) |= |s10=(0,1)|s11=(0,2)|s12=(0,0)|s13=(1,1)|s14=(1,2)|s15=(1,0)|s16=(2,1)|s17=(2,2)|s18=(2,0)|= |s19=(0,2)|s20=(0,0)|s21=(0,1)|s22=(1,2)|s23=(1,0)|s24=(1,1)|s25=(2,2)|s26=(2,0)|s27=(2,1)|= |s28=(1,0)|s29=(1,1)|s30=(1,2)|s31=(2,0)|s32=(2,1)|s33=(2,2)|s34=(0,0)|s35=(0,1)|s36=(0,2)|= |s37=(1,1)|s38=(1,2)|s39=(1,0)|s40=(2,1)|s41=(2,2)|s42=(2,0)|s43=(0,1)|s44=(0,2)|s45=(0,0)|= |s46=(1,2)|s47=(1,0)|s48=(1,1)|s49=(2,2)|s50=(2,0)|s51=(2,1)|s52=(0,2)|s53=(0,0)|s54=(0,1)|= |s55=(2,0)|s56=(2,1)|s57=(2,2)|s58=(0,0)|s59=(0,1)|s60=(0,2)|s61=(1,0)|s62=(1,1)|s63=(1,2)|= |s64=(2,1)|s65=(2,2)|s66=(2,0)|s67=(0,1)|s68=(0,2)|s69=(0,0)|s70=(1,1)|s71=(1,2)|s72=(1,0)|= |s73=(2,2)|s74=(2,0)|s75=(2,1)|s76=(0,2)|s77=(0,0)|s78=(0,1)|s79=(1,2)|s80=(1,0)|s81=(1,1)|= |c55=yellow|c56=yellow|c57=yellow|c64=yellow|c65=yellow|c66=yellow|= |c73=yellow|c74=yellow|c75=yellow|= }}
Each Sudoku region looks the same on the second component (namely like the subgroup $mathbb\{Z\}\_\{3\}$), because these are added regardless of the first one. On the other hand, the first components are equal in each block, and if we imagine each block as one cell, these first components show the same pattern (namely the quotient group $mathbb\{Z\}\_\{3\}$). As already outlined in the article of Latin squares, this really is a Latin square of order $9$!
Now, to yield a Sudoku, let us permute the rows (or equivalently the columns) in such a way, that each block is redistributed exactly once into each block - for example order them $1,4,7,2,5,8,3,6,9$. This of course preserves the Latin square property. Furthermore, in each block the lines have distinct first component by construction and each line in a block has distinct entries via the second component, because the blocks' second components originally formed a Latin square of order $3$ (from the subgroup $mathbb\{Z\}\_\{3\}$).
Thus we really get a Sudoku (Rename the pairs to numbers 1...9 if you wish)! With the example and the row permutation above we yield:
For this method to work, one generally does not need a product of two equally-sized groups. A so-called short exact sequence of finite groups of appropriate size already does the job! Try for example the group $mathbb\{Z\}\_\{4\}$ with quotient- and subgroup $mathbb\{Z\}\_\{2\}$. It seems clear (already from enumeration arguments), that not all Sudokus can be generated this way!
A proper puzzle has a unique solution. The constraint 'each digit appears in each row, column and region' is called the One Rule.
See basic terms or the List of Sudoku terms and jargon for an expanded list of terminology.
The size ordering of Sudoku puzzle can be used to define an integer series, e.g. for square Sudoku, the integer series of possible solutions ().
Sudoku with square NxN regions are more symmetrical than immediately obvious from the One Rule. Each row and column intersects N regions and shares N cells with each. The number of bands and stacks also equals N. Rectangular Sudoku do not have these properties. The "3x3" Sudoku is additionally unique: N is also the number of row-column-region constraints from the One Rule (i.e. there are N=3 types of units).
See the List of Sudoku terms and jargon for an expanded list and classification of variants.
The number of solutions depends on the grid dimensions, rules applied and the definition of distinct solutions. For the 3×3 region grid, the conventions for labeling the rows, columns, blocks (boxes) are as shown. Bands are numbered top to bottom, stacks left to right. By extension the labeling scheme applies to any rectangular Sudoku grid.
Term labels for box-row and box-column triplets are also shown.
The {a, b, c} notation also reflects that fact a triplet is a subset of the allowed digits. For regions of arbitrary dimension, the related object is known as a minicol(umn) or minirow.
These operations define a symmetry relation between equivalent grids. Excluding relabeling, and with respect to the 81 grid cell values, the operations form a subgroup of the symmetric group S_{81}, of order 3!^{8}×2 = 3359232.
Since then, enumeration results for many Sudoku variants have been calculated: these are summarised below.
Dimensions | Nr Grids | Attribution | Verified? | Rel Err |
---|---|---|---|---|
1x? | see Latin squares | n/a | ||
2×2 | 288 | various | Yes | -11.1% |
2×3 | 28200960 = c. 2.8×10^{7} | Pettersen | Yes | -5.88% |
2×4 | 29136487207403520 = c. 2.9×10^{16} | Russell | Yes | -1.91% |
2×5 | 1903816047972624930994913280000 = c. 1.9×10^{30} | Pettersen | Yes | -0.375% |
2×6 | 38296278920738107863746324732012492486187417600000 = c. 3.8×10^{49} | Pettersen | No | -0.238% |
3×3 | 6670903752021072936960 = c. 6.7×10^{21} | Felgenhauer/Jarvis | Yes | -0.207% |
3×4 | 81171437193104932746936103027318645818654720000 = c. 8.1×10^{46} | Pettersen/Silver | No | -0.132% |
3×5 | unknown, estimated c. 3.5086×10^{84} | Silver | n/a | |
4×4 | unknown, estimated c. 5.9584×10^{98} | Silver | n/a | |
4×5 | unknown, estimated c. 3.1764×10^{175} | Silver | n/a | |
5×5 | unknown, estimated c. 4.3648×10^{308} | Silver/Pettersen | n/a |
The band counts for problems whose full Sudoku grid-count is unknown are listed below. As in the previous section, "Dimensions" are those of the regions.
Dimensions | Nr Bands | Attribution | Verified? |
---|---|---|---|
2×C | (2C)! (C!)^{2} | (obvious result) | Yes |
3×C | $(3C)!\; (C!)^6\; sum\_\{k=0ldots\; C\}\; \{C\; choose\; k\}^3$ | Pettersen | Yes |
4×C | (long expression: see below) | Pettersen | Yes |
4×4 | 16! × 4!^{12} × 1273431960 = c. 9.7304×10^{38} | Silver | Yes |
4×5 | 20! × 5!^{12} × 879491145024 = c. 1.9078×10^{55} | Russell | Yes |
4×6 | 24! × 6!^{12} × 677542845061056 = c. 8.1589×10^{72} | Russell | Yes |
4×7 | 28! × 7!^{12} × 563690747238465024 = c. 4.6169×10^{91} | Russell | Yes |
(calculations up to 4×100 have been performed by Silver, but are not listed here) | |||
5×3 | 15! × 3!^{20} × 324408987992064 = c. 1.5510×10^{42} | Silver | Yes^{#} |
5×4 | 20! × 4!^{20} × 518910423730214314176 = c. 5.0751×10^{66} | Silver | Yes^{#} |
5×5 | 25! × 5!^{20} × 1165037550432885119709241344 = c. 6.9280×10^{93} | Pettersen/Silver | No |
5×6 | 30! × 6!^{20} × 3261734691836217181002772823310336 = c. 1.2127×10^{123} | Pettersen/Silver | No |
5×7 | 35! × 7!^{20} × 10664509989209199533282539525535793414144 = c. 1.2325×10^{154} | Pettersen/Silver | No |
5×8 | 40! × 8!^{20} × 39119312409010825966116046645368393936122855616 = c. 4.1157×10^{186} | Pettersen/Silver | No |
5×9 | 45! × 9!^{20} × 156805448016006165940259131378329076911634037242834944 = c. 2.9406×10^{220} | Pettersen/Silver | No |
5×10 | 50! × 10!^{20} × 674431748701227492664421138490224315931126734765581948747776 = c. 3.2157×10^{255} | Pettersen/Silver | No |
The expression for the 4×C case is: $(4C)!(C!)^\{12\}sum\_\{a,\; b,\; c\}\; \{left(frac\{C!^2\}\{a!\; b!\; c!\}\; *\; sum\_\{begin\{matrix\}k\_\{12\},k\_\{13\},k\_\{14\},k\_\{23\},k\_\{24\},k\_\{34\}end\{matrix\}\}\; \{bchoose\; k\_\{13\}\}\{c\; choose\; k\_\{14\}\}\{c\; choose\; k\_\{23\}\}\{b\; choose\; k\_\{24\}\}\{a\; choose\; k\_\{34\}\}\; \}\; right)^2\; \}$
where:
Type | Nr Grids | Attribution | Verified? |
---|---|---|---|
3doku | 104015259648 | Stertenbrink | Yes |
Disjoint Groups | 201105135151764480 | Russell | Yes |
Hypercube | 37739520 | Stertenbrink | Yes |
Magic Sudoku | 5971968 | Stertenbrink | Yes |
Sudoku X | 55613393399531520 | Russell | Yes |
NRC Sudoku | 6337174388428800 | Brouwer | Yes |
All Sudokus remain valid (no repeated numbers in any row, column or region) under the action of the Sudoku preserving symmetries (see also Jarvis). Some Sudokus are special in that some operations merely have the effect of relabelling the digits; several of these are enumerated below.
Transformation | Nr Grids | Attribution | Verified? |
---|---|---|---|
Transposition | 10980179804160 | Russell | Indirectly |
Quarter Turn | 4737761280 | Indirectly | |
Half Turn | 56425064693760 | Indirectly | |
Band cycling | 5384326348800 | Indirectly | |
Within-band row cycling | 39007939461120 | Indirectly |
Additional constraints (here, on 3×3 Sudokus) lead to a smaller minimum number of clues.
A variant on Miyuki Misawa's web site replaces sums with relations: the clues are symbols =, < and > showing the relative values of (some but not all) adjacent region sums. She demonstrates an example with only eight relations. It is not known whether this is the best possible.
The strategy begins by analyzing the permutations of the top band used in valid solutions. Once the Band1 symmetries and equivalence class for the partial solutions are identified, the completions of the lower two bands are constructed and counted for each equivalence class. Summing the completions over the equivalence classes gives the total number of solutions as 6,670,903,752,021,072,936,960 (c. 6.67×10^{21}).
1 2 3 4 5 6 7 8 9
The permutations for Band1 are 9! × 56 × 6^{6 } = 9! × 2612736 ~ 9.48 × 10^{11}.
(Note: Conditional combinations becomes an increasingly difficult as the computation progresses through the grid. At this point the impact is minimal.)
Case 0: No Overlap. The choices for the triplets can be determined by elimination. r21 can't be r11 or r12 so it must be = r13; r31 must be = r12 etc.
The Case 0 diagram shows this configuration, where the pink cells are triplet values that can be arranged in any order within the triplet. Each triplet has 3! = 6 permutations. The 6 triplets contribute 6^{6} permutations.
Case 3: 3 Digits Match: triplet r21 = r12. The same logic as case 0 applies, but with a different triplet usage. Triplet r22 must be = r13, etc. The number of permutations is again 6^{6}. (Felgenhauer/Jarvis call the cases 0 and 3 the pure match case. Case 1: 1 Match for r21 from r12
In the Case 1 diagram, B1 cells show canonical values, which are color coded to show their row-wise distribution in B2 triplets. Colors reflect distribution but not location or values. For this case: the B2 top row triplet (r21) has 1 value from B1 middle triplet, the other colorings can now be deduced. E.g. the B2 bottom row triplet (r23) coloring is forced by r21: the other 2 B1 middle values must go to bottom, etc. Fill in the number of B2 options for each color, 3..1, beginning top left. The B3 color coding is omitted since the B3 choices are row-wise determined by B1, B2. B3 always contributes 3! permutations per row triplet, or 6^{3} for the block.
For B2, the triplet values can appear in any position, so a 3! permutation factor still applies, for each triplet. However, since some of the values were paired relative to their origin, using the raw option counts would overcount the number of permutations, due to interchangeability within the pairing. The option counts need to be divided by the permuted size of their grouping (2), here 2!.= 2 (See n Choose k) The pair in each row cancels the 2s for the B2 option counts, leaving a B2 contribution of 3^{3} × 6^{3}. The B2×B3 combined contribution is 3^{3} × 6^{6}. Case 2: 2 Matches for r21 from r12. The same logic as case 1 applies, but with the B2 option count column groupings reversed. Case 3 also contributes 3^{3}×6^{6} permutations.
Totaling the 4 cases for Band1 B1..B3 gives 9! × 2 × (3^{3} + 1) × 6^{6 } = 9! 56 × 6^{6} permutations.
A symmetry in mathematics is an operation that preserves a quality of an object. For a Sudoku solution grid, a symmetry is a transformation whose result is also a solution grid. The following symmetries apply independently for the top band:
Combined, the symmetries give 9! × 6^{5} = 362880 × 7776 equivalent permutations for each Band1 solution.
A symmetry defines an equivalence relation, here, between the solutions, and partitions the solutions into a set of equivalence classes. The Band1 row, column and block symmetries divide the 56*6^6 permutations into (not less than) 336 (56×6) equivalence classes with (up to) 6^{5} permutations in each, and 9! relabeling permutations for each class. (Min/Max caveats apply since some permutations may not yield distinct elements due to relabeling.)
Since the solution for any member of an equivalence class can be generated from the solution of any other member, we only need to enumerate the solutions for a single member in order to enumerate all solutions over all classes. Let
The total number of solutions N is then:
Counting symmetry partitions valid Band1 permutations into classes that place the same completion constraints on lower bands; all members of a band counting symmetry equivalence class must have the same number of grid completions since the completion constraints are equivalent. Counting symmetry constraints are identified by the Band1 column triplets (a column value set, no implied element order). Using band counting symmetry, a minimal generating set of 44 equivalence classes was established.
The following sequence demonstrates mapping a band configuration to a counting symmetry equivalence class. Begin with a valid band configuration. | |
Build column triplets by ordering the column values within each column. This is not a valid Sudoku band, but does place the same constraints on the lower bands as the example. | |
Construct an equivalence class ID from the B2, B3 column triplet values. Use column and box swaps to achieve the lowest lexicographical ID. The last figure shows the column and box ordering for the ID: 124 369 578 138 267 459. All Band1 permutations with this counting symmetry ID will have the same number of grid completions as the original example. An extension of this process can be used to build the largest possible band counting symmetry equivalence classes. |
The Band1 (6^{5}) symmetries divide the (56×6^{6}) Band1 valid permutations into (not less than) 336 (56×6) equivalence classes with (up to) 6^5 permutations each. The not less than and up to caveats are necessary, since some combinations of the transformations may not produce distinct results, when relabeling is required (see below). Consequently some equivalence classes may contain less than 6^{5} distinct permutations and the theoretical minimum number of classes may not be achieved.
Each of the valid Band1 permutations can be expanded (completed) into a specific number of solutions with the Band2,3 permutations. By virtue of their similarity, each member of an equivalence class will have the same number of completions. Consequently we only need to construct the solutions for one member of each equivalence class and then multiply the number of solutions by the size of the equivalence class. We are still left with the task of identifying and calculating the size of each equivalence class. Further progress requires the dexterous application of computational techniques to catalogue (classify and count) the permutations into equivalence classes.
Felgenhauer/Jarvis catalogued the Band1 permutations using lexicographical ordered IDs based on the ordered digits from blocks B2,3. Block 1 uses a canonical digit assignment and is not needed for a unique ID. Equivalence class identification and linkage uses the lowest ID within the class.
Application of the (2×6^{2}) B2,3 symmetry permutations produces 36288 (28×6^{4}) equivalence classes, each of size 72. Since the size is fixed, the computation only needs to find the 36288 equivalence class IDs. (Note: in this case, for any Band1 permutation, applying these permutations to achieve the lowest ID provides an index to the associated equivalence class.)
Application of the rest of the block, column and row symmetries provided further reduction, i.e. allocation of the 36288 IDs into fewer, larger equivalence classes. When the B1 canonical labeling is lost through a transformation, the result is relabeled to the canonical B1 usage and then catalogued under this ID. This approach generated 416 equivalence classes, somewhat less effective than the theoretical 336 minimum limit for a full reduction.
Application of counting symmetry patterns for duplicate paired digits achieved reduction to 174 and then to 71 equivalence classes. The introduction of equivalence classes based on band counting symmetry (subsequent to Felgenhauer/Jarvis by Russell) reduced the equivalence classes to a minimum generating set of 44.
The diversity of the ~2.6×10^{6}, 56×6^{6} Band1 permutations can be reduced to a set of 44 Band1 equivalence classes. Each of the 44 equivalence classes can be expanded to millions of distinct full solutions, but the entire solution space has a common origin in these 44. The 44 equivalence classes play a central role in other enumeration approaches as well, and speculation will return to the characteristics of the 44 classes when puzzle properties are explored later.
The computation required for the lower band solution search can be minimised by the same type of symmetry application used for Band1. There are 6! (720) permutations for the 6 values in column 1 of Band2,3. Applying the lower band (2) and row within band (6×6) permutations creates 10 equivalence classes of size 72. At this point, completing 10 sets of solutions for the remaining 48 cells with a recursive descent, backtracking algorithm is feasible with 2 GHz class PC so further simplification is not required to carry out the enumeration.
Using this approach, the number of ways of filling in a blank Sudoku grid was shown in May 2005 (original announcement) to be 6,670,903,752,021,072,936,960 (6.67×10^{21}). The paper Enumerating possible Sudoku grids by Felgenhauer and Jarvis, describes the calculation.
The result, as confirmed by Russell, also contains the distribution of solution counts for the 44 equivalence classes. The listed values are before application of the 9! factor for labeling and the two 72 factors (72^{2} = 5184) for each of Stack 2,3 and Band2,3 permutations. The number of completions for each class is consistently on the order of 100,000,000, while the number of Band1 permutations covered by each class however varies from 4 – 3240. Within this wide size range, there are clearly two clusters. Ranked by size, the lower 33 classes average ~400 permutations/class, while the upper 11 average ~2100. The disparity in consistency between the distributions for size and number of completions or the separation into two clusters by size is yet to be examined.
It has been postulated that no proper sudoku can have clues limited to the range of positions in the pattern below.
The largest rectangular orthogonal "hole" (region with no clues) in a proper sudoku is believed to be a rectangle of 30 cells (a 5 x 6 rectangular area). One such example is the following with 22 clues:
The largest total number of empty groups (rows, columns, and squares) in a sudoku is believed to be nine. One example is the following; a sudoku which has 3 empty squares, 3 empty rows, and 3 empty columns and gives 22 clues.