Appendix: Sudoku Strategies

Most (but not all) Sudoku puzzles can be solved using a series of logical deductions [What is Sudoku? n.d.]. These deductions have been organized into a number of common patterns, which we have grouped by difficulty.

The strategies have been classed into tiers between 0 and 5 based upon the general consensus of many sources on their level of complexity (for example, see Johnson [n.d.] and Sudoku Strategy [n.d.]).

In this work, we have used what seem to be the most commonly occurring and accessible strategies together with some simple backtracking.

There are, of course, many more advanced strategies, but since our existing strategies suffice to solve almost all puzzles that we consider, we choose to ignore the more advanced ones.

0. Tier 0 Strategies

• Naked Single: A Naked Single exists in the cell (i, j) if cell (i, j) on the board has no entry, but the corresponding entry (i, j) on the Sudoku Solution Graph has one and only one possible value. For example, in Figure A1, we see that cell (2, 9) is empty. Furthermore, the corresponding Sudoku Solution Graph entry in (2, 9) can only contain the number 9, since the numbers 1 through 8 are already assigned to cells in row 2. Therefore, since cell (2, 9) in the corresponding Sudoku Solution Graph has only one (naked) value, we can assign that value to cell (2, 9) on the Sudoku board.

1 2 3 4 5 6 7 8 ?

Figure A1. Example for Naked Single strategy.

Application Enumeration: Since a Naked Single could occur in any empty cell, this is just the number of empty cells; checking whether a given empty cell is a Naked Single requires constant time.
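The Naked Single test can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Board` dictionary, the helper names `peers`, `candidates` and `naked_singles`, and the simplification that a cell's Sudoku Solution Graph entry is just the set of values not yet used in its row, column, or region are ours, not part of the solver described in this paper.

```python
from typing import Dict, List, Optional, Set, Tuple

Cell = Tuple[int, int]             # (row, column), 1-indexed as in the text
Board = Dict[Cell, Optional[int]]  # hypothetical board type: None marks an empty cell


def peers(cell: Cell) -> Set[Cell]:
    """All other cells sharing a row, column, or 3x3 region with `cell`."""
    i, j = cell
    top, left = 3 * ((i - 1) // 3) + 1, 3 * ((j - 1) // 3) + 1
    same_row = {(i, c) for c in range(1, 10)}
    same_col = {(r, j) for r in range(1, 10)}
    same_region = {(r, c) for r in range(top, top + 3) for c in range(left, left + 3)}
    return (same_row | same_col | same_region) - {cell}


def candidates(board: Board, cell: Cell) -> Set[int]:
    """Values not used by any peer (our simplified Sudoku Solution Graph entry)."""
    used = {board[p] for p in peers(cell) if board[p] is not None}
    return set(range(1, 10)) - used


def naked_singles(board: Board) -> List[Tuple[Cell, int]]:
    """Empty cells whose candidate set holds exactly one value."""
    found = []
    for cell in sorted(board):
        if board[cell] is None:
            cands = candidates(board, cell)
            if len(cands) == 1:
                found.append((cell, next(iter(cands))))
    return found


# Figure A1: row 2 holds 1..8 in columns 1..8, and cell (2, 9) is empty.
board: Board = {(r, c): None for r in range(1, 10) for c in range(1, 10)}
for col in range(1, 9):
    board[(2, col)] = col

print(naked_singles(board))  # [((2, 9), 9)]
```

On the board of Figure A1, the only cell reported is (2, 9) with value 9, matching the argument above.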

• Hidden Single: A Hidden Single occurs in a given cell (i, j) when:

(a) (i, j) has no entry on the Sudoku board

(b) (i, j) contains the value k (among other values) on the Sudoku Solution Graph

(c) no other cell in some group containing (i, j) has k as a possible value on the Sudoku Solution Graph

Once we find a hidden single in (i, j) with value k, we assign k to (i, j) on the Sudoku board. The logic behind hidden singles is that, in any group, each of the numbers 1 through 9 must appear exactly once.

If we know cell (i, j) is the only cell that could contain the value k in a given row, then we know that it must hold value k on the actual Sudoku board. We can consider the example in Figure A2.

We look at cell (1, 1). First, (1, 1) does not have an entry, and we can see that its corresponding entry in the Sudoku Solution Graph contains {1, 2, 7, 8, 9}. However, we see that the other cells in region

3 4 1

5 6 1

Figure A2. Example for Hidden Single strategy.

Application Enumeration: Since a Hidden Single could occur in any empty cell, this is just the number of empty cells; checking whether a given empty cell is a Hidden Single requires constant time (inspecting the other cells in the same group).
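A Hidden Single search over one group might look like the following sketch. The input format (a dictionary from each empty cell of the group to its set of possible values) and the function name `hidden_singles` are assumptions made for illustration; the candidate sets in the example are invented and are not the data of Figure A2, apart from the entry {1, 2, 7, 8, 9} for cell (1, 1).

```python
from collections import defaultdict
from typing import Dict, Set, Tuple

Cell = Tuple[int, int]  # (row, column), 1-indexed as in the text


def hidden_singles(group: Dict[Cell, Set[int]]) -> Dict[Cell, int]:
    """Map each cell that is the only possible home of some value k to k.

    `group` holds the candidate sets of the empty cells of one row, column,
    or region (hypothetical input format).
    """
    homes = defaultdict(list)  # value k -> cells of the group that may hold k
    for cell, cands in group.items():
        for k in cands:
            homes[k].append(cell)
    return {cells[0]: k for k, cells in homes.items() if len(cells) == 1}


# Illustrative region: (1, 1) has five candidates, but only it can hold 1.
region = {
    (1, 1): {1, 2, 7, 8, 9},
    (1, 2): {2, 7, 8},
    (2, 3): {7, 8, 9},
    (3, 1): {2, 9},
}
print(hidden_singles(region))  # {(1, 1): 1}
```

Although cell (1, 1) has five possible values, the value 1 has no other home in the region, so it is assigned to (1, 1).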

1. Tier 1 Strategies

• Naked Double: A Naked Double occurs when two cells on the board in the same group g do not have values assigned, and both their corresponding cells in the Sudoku Solution Graph have only the same two values k1 and k2 assigned to them. A naked double in (i1, j1) and (i2, j2) does not immediately give us the values contained in either (i1, j1) or (i2, j2), but it does allow us to eliminate k1 and k2 from the Sudoku Solution Graph of all cells in g besides (i1, j1) and (i2, j2).

Application Enumeration: For each row, column, and region, we sum up $\binom{n}{2}$, where n is the number of empty cells in each group, since a Naked Double requires two empty cells in the same group.
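As a rough illustration of both the elimination and the enumeration for a Naked Double, consider the sketch below. The group representation (a dictionary from empty cells to candidate sets) and the name `apply_naked_doubles` are assumptions for this example only.

```python
from itertools import combinations
from math import comb
from typing import Dict, Set, Tuple

Cell = Tuple[int, int]  # (row, column), 1-indexed as in the text


def apply_naked_doubles(group: Dict[Cell, Set[int]]) -> int:
    """Remove naked-double values from the other cells of one group, in place.

    Returns the number of candidates eliminated.
    """
    removed = 0
    for a, b in combinations(sorted(group), 2):
        if len(group[a]) == 2 and group[a] == group[b]:
            pair = set(group[a])  # copy, so the pair itself is never modified
            for cell in group:
                if cell not in (a, b):
                    removed += len(group[cell] & pair)
                    group[cell] -= pair
    return removed


# Illustrative row 4: cells (4, 1) and (4, 2) form a Naked Double on {3, 7}.
row = {
    (4, 1): {3, 7},
    (4, 2): {3, 7},
    (4, 5): {1, 3, 9},
    (4, 8): {5, 7, 9},
    (4, 9): {1, 5},
}
removed = apply_naked_doubles(row)
print(removed, row[(4, 5)], row[(4, 8)])

# Application Enumeration for this group: n = 5 empty cells give C(5, 2) pairs.
print(comb(len(row), 2))  # 10
```

Here 3 is removed from cell (4, 5) and 7 from cell (4, 8), so two candidates are eliminated, while the group contributes ten candidate pairs to the enumeration.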

• Hidden Double: A Hidden Double occurs in two cells (i1, j1) and (i2, j2) in the same group g when:

(a) (i1, j1) and (i2, j2) have no values assigned on the board

(b) (i1, j1) and (i2, j2) share two entries k1 and k2 (and contain possibly more) in the Sudoku Solution Graph

(c) k1 and k2 do not appear in any other cells of g in the Sudoku Solution Graph

A hidden double does not allow us to immediately assign values to (i1, j1) or (i2, j2), but it does allow us to eliminate all entries other than k1 and k2 in the Sudoku Solution Graph for cells (i1, j1) and (i2, j2).

Application Enumeration: For each row, column, and region, we sum up $\binom{n}{2}$, where n is the number of empty cells in each group, since a Hidden Double requires two empty cells in the same group.
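The Hidden Double elimination can be sketched as follows, again assuming a group is given as a dictionary from empty cells to candidate sets. The function name `apply_hidden_doubles` and the example column are hypothetical.

```python
from itertools import combinations
from typing import Dict, Set, Tuple

Cell = Tuple[int, int]  # (row, column), 1-indexed as in the text


def apply_hidden_doubles(group: Dict[Cell, Set[int]]) -> int:
    """Strip all other candidates from cells that hide a pair, in place.

    A pair (k1, k2) is hidden when both values are possible in exactly the
    same two cells of the group. Returns the number of candidates eliminated.
    """
    values = sorted(set().union(*group.values()))
    removed = 0
    for k1, k2 in combinations(values, 2):
        homes1 = {c for c, s in group.items() if k1 in s}
        homes2 = {c for c, s in group.items() if k2 in s}
        if homes1 == homes2 and len(homes1) == 2:
            for cell in homes1:
                removed += len(group[cell] - {k1, k2})
                group[cell] &= {k1, k2}
    return removed


# Illustrative column 6: values 2 and 5 are possible only in (1, 6) and (3, 6).
column = {
    (1, 6): {2, 4, 5, 8},
    (3, 6): {2, 5, 8, 9},
    (5, 6): {4, 9},
    (8, 6): {4, 8, 9},
    (9, 6): {4, 8},
}
removed = apply_hidden_doubles(column)
print(removed, column[(1, 6)], column[(3, 6)])
```

Both cells (1, 6) and (3, 6) are reduced to {2, 5}, which in turn exposes them as a Naked Double for later passes.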

• Locked Candidates: A Locked Candidate occurs if we have cells (for simplicity, suppose we only have two: (i1, j1) and (i2, j2)) such that:

(a) (i1, j1) and (i2, j2) have no entries on the board

(b) (i1, j1) and (i2, j2) share two groups, g1 and g2 (i.e., both cells are in the same row and region, or the same column and region)

(c) (i1, j1) and (i2, j2) share some value k in the Sudoku Solution Graph

(d) ∃ g3, a group of the same type as g1, g1 ≠ g3, such that k occurs in cells of g2 ∩ g3

(e) k does not occur elsewhere in g3 besides g3 ∩ g2

(f) k does not occur in g2 aside from (g2 ∩ g1) ∪ (g2 ∩ g3)

Then, since k must occur at least once in g3, we know k must occur in g2 ∩ g3. However, since k can only occur once in g2, k cannot occur in g2 ∩ g1, so we can eliminate k from the Sudoku Solution Graph cells corresponding to (i1, j1) and (i2, j2). A locked candidate can also occur with three cells.
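The row version of this argument can be sketched as below, with g3 a row and g2 a region. We assume that `cands` lists every empty cell of the board with its possible values; the names `region_of` and `locked_candidates_rows` are ours. The column version would be analogous.

```python
from typing import Dict, List, Set, Tuple

Cell = Tuple[int, int]  # (row, column), 1-indexed as in the text


def region_of(cell: Cell) -> Tuple[int, int]:
    """Index of the 3x3 region containing `cell`."""
    i, j = cell
    return ((i - 1) // 3, (j - 1) // 3)


def locked_candidates_rows(cands: Dict[Cell, Set[int]]) -> List[Tuple[Cell, int]]:
    """If k in row g3 is confined to one region g2, remove k from the rest of g2.

    `cands` must map every empty cell to its candidate set (an assumption of
    this sketch). Eliminations are made in place and also returned.
    """
    eliminated = []
    for row in range(1, 10):
        for k in range(1, 10):
            homes = [c for c in cands if c[0] == row and k in cands[c]]
            regions = {region_of(c) for c in homes}
            if len(regions) != 1:
                continue  # k is absent from this row or spans several regions
            (region,) = regions
            for cell in sorted(cands):
                if region_of(cell) == region and cell[0] != row and k in cands[cell]:
                    cands[cell].discard(k)
                    eliminated.append((cell, k))
    return eliminated


# Illustrative data: in row 1, the value 5 is possible only inside region 1,
# so 5 is eliminated from cells (2, 1) and (2, 2) of the same region.
cands = {
    (1, 1): {2, 5}, (1, 3): {5, 8}, (1, 7): {2, 8},
    (2, 1): {5, 9}, (2, 2): {5, 7}, (2, 5): {3, 5}, (2, 8): {3, 7, 9},
}
print(locked_candidates_rows(cands))  # [((2, 1), 5), ((2, 2), 5)]
```

In the notation above, row 1 plays the role of g3, row 2 the role of g1, and the upper-left region the role of g2.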

Application Enumeration: For every row i, we examine each three-cell subset rs_ij formed as the intersection with some region j; there are twenty-seven such subsets. Out of those twenty-seven, we denote the number of subsets that have two or three empty cells as r_l. We define c_l for columns analogously, so this is just the sum r_l + c_l.

2. Tier 2 Strategies

• Naked Triple: A Naked Triple occurs when three cells on the board, (i1, j1), (i2, j2) and (i3, j3), in the same group g do not have values assigned, and all three of their corresponding cells in the Sudoku Solution Graph share only the same three possible values, k1, k2 and k3. However, each cell of a Naked Triple does not have to have all three values; e.g., (i1, j1) can have values k1, k2 and k3, (i2, j2) can have k2 and k3, and (i3, j3) can have k1 and k3 on the Sudoku Solution Graph. We can remove k1, k2 and k3 from all cells except for (i1, j1), (i2, j2) and (i3, j3) in the Sudoku Solution Graph that are also in group g; the logic is similar to that of the Naked Double strategy.

Application Enumeration: For each row, column, and region, we sum up $\binom{n}{3}$, where n is the number of empty cells in each group, since a Naked Triple requires three empty cells in the same group.
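Because the cells of a Naked Triple need not each contain all three values, a convenient test is whether the union of the three candidate sets has exactly three elements. The sketch below takes that approach for a general subset size; the name `apply_naked_subsets` and the example region are assumptions for illustration, with k1 = 2, k2 = 5 and k3 = 8 mirroring the example in the text.

```python
from itertools import combinations
from typing import Dict, Set, Tuple

Cell = Tuple[int, int]  # (row, column), 1-indexed as in the text


def apply_naked_subsets(group: Dict[Cell, Set[int]], size: int) -> int:
    """Eliminate naked subsets of the given size from one group, in place.

    `size` cells form a naked subset when the union of their candidate sets
    has exactly `size` values. Returns the number of candidates eliminated.
    """
    removed = 0
    for cells in combinations(sorted(group), size):
        union = set().union(*(group[c] for c in cells))
        if len(union) != size:
            continue
        for other in group:
            if other not in cells:
                removed += len(group[other] & union)
                group[other] -= union
    return removed


# Illustrative region: the first three cells hold {2,5,8}, {5,8} and {2,8}.
region = {
    (4, 4): {2, 5, 8},
    (4, 5): {5, 8},
    (5, 6): {2, 8},
    (6, 4): {1, 2, 9},
    (6, 6): {1, 5, 9},
}
removed = apply_naked_subsets(region, 3)
print(removed, region[(6, 4)], region[(6, 6)])
```

Calling the same function with `size` set to 2 or 4 would cover the Naked Double and Naked Quad cases under the same assumptions.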

• Hidden Triple: A Hidden Triple occurs in three cells (i1, j1), (i2, j2) and (i3, j3) in the same group g when:

(a) (i1, j1), (i2, j2) and (i3, j3) have no values assigned on the board

(b) Values k1, k2 and k3 appear among (i1, j1), (i2, j2) and (i3, j3) in their SSG

(c) k1, k2 and k3 do not appear in any other cells of g in the SSG

Then, we can eliminate all values besides k1, k2 and k3 in the SSG of cells (i1, j1), (i2, j2) and (i3, j3). The reasoning is the same as for the Hidden Double strategy.

Application Enumeration: For each row, column, and region, we sum up $\binom{n}{3}$, where n is the number of empty cells in each group, since a Hidden Triple requires three empty cells in the same group.

• X-Wing: Given a value k, an X-Wing occurs if:

(a) ∃ two rows, r1 and r2, such that the value k appears in the SSG for exactly two cells each of r1 and r2

(b) ∃ distinct columns c1 and c2 such that, in rows r1 and r2, k only appears in the Sudoku Solution Graph in the set (r1 ∩ c1) ∪ (r1 ∩ c2) ∪ (r2 ∩ c1) ∪ (r2 ∩ c2)

Then, we can eliminate the value k as a possible value for all cells in c1 and c2 that are not also in r1 and r2, since k can only appear in one of the two possible cells in each of rows r1 and r2, and those two appearances must occupy both c1 and c2. Similarly, the X-Wing strategy can also be applied if we have a value k that is constrained in columns c1 and c2 in exactly the same two rows.
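A row-based X-Wing search for a single value k might be sketched as follows. The candidate map `cands` (every empty cell with its possible values) and the name `x_wing_rows` are assumptions; the sketch makes a single pass and does not re-scan after eliminating.

```python
from itertools import combinations
from typing import Dict, List, Set, Tuple

Cell = Tuple[int, int]  # (row, column), 1-indexed as in the text


def x_wing_rows(cands: Dict[Cell, Set[int]], k: int) -> List[Cell]:
    """Apply row-based X-Wings for value k in place; return the cells changed."""
    cols_by_row: Dict[int, Set[int]] = {}  # row -> columns where k is possible
    for (r, c), values in cands.items():
        if k in values:
            cols_by_row.setdefault(r, set()).add(c)
    pair_rows = sorted(r for r, cols in cols_by_row.items() if len(cols) == 2)

    eliminated = []
    for r1, r2 in combinations(pair_rows, 2):
        if cols_by_row[r1] != cols_by_row[r2]:
            continue  # the two rows do not share the same two columns
        for cell in sorted(cands):
            r, c = cell
            if c in cols_by_row[r1] and r not in (r1, r2) and k in cands[cell]:
                cands[cell].discard(k)
                eliminated.append(cell)
    return eliminated


# Illustrative data: 4 is confined to columns 3 and 7 in both row 2 and row 6.
cands = {
    (2, 3): {4, 6}, (2, 7): {4, 9},
    (6, 3): {1, 4}, (6, 7): {4, 8},
    (4, 3): {4, 5, 6}, (4, 5): {4, 5},
    (8, 1): {2, 4}, (8, 7): {2, 4},
}
print(x_wing_rows(cands, 4))  # [(4, 3), (8, 7)]
```

The value 4 is removed from the cells of columns 3 and 7 that lie outside rows 2 and 6, namely (4, 3) and (8, 7).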

Application Enumeration: For each value k, 1 through 9, we count the number of rows that contain k exactly twice in the SSG of their empty cells, r_k. Since we need two such rows to form an X-Wing for any one number, we take $\binom{r_k}{2}$. We also count the number of columns that contain k exactly twice in the SSG of their cells, c_k, and similarly take $\binom{c_k}{2}$. We sum over all values k, so this value is $\sum_k \left[ \binom{r_k}{2} + \binom{c_k}{2} \right]$.

3. Tier 3 Strategies

• Naked Quad: A Naked Quad is similar to a Naked Triple; it occurs when four unfilled cells in the same group g contain only elements of a set K of at most four possible values in their SSG. In this case, we can remove all values in K from all other cells in group g, since the values in K must belong only to the four unfilled cells.

Application Enumeration: For each row, column, and region, we sum up $\binom{n}{4}$, where n is the number of empty cells in each group, since a Naked Quad requires four empty cells in the same group.

• Hidden Quad: A Hidden Quad is analogous to a Hidden Triple. It occurs when we have four cells (i1, j1), (i2, j2), (i3, j3) and (i4, j4) in the same group g such that:

(a) (i1, j1), (i2, j2), (i3, j3) and (i4, j4) share (among other elements) elements of the set K of at most four possible values in their SSG

(b) No values of K appear in the SSG of any other cell in g

Then we can eliminate all values that cells (i1, j1), (i2, j2), (i3, j3) and (i4, j4) take on other than values in K from their corresponding cells in the Sudoku Solution Graph. The reasoning is analogous to the Hidden Triple strategy.

Application Enumeration: For each row, column, and region, we sum up $\binom{n}{4}$, where n is the number of empty cells in each group, since a Hidden Quad requires four empty cells in the same group.

• Swordfish: The Swordfish Strategy is the three-row analogue to the X-Wing Strategy. Suppose we have three rows, r1, r2 and r3, such that the value k has not been assigned to any cell in r1, r2 or r3. If the cells of r1, r2 and r3 that have k as a possibility in their corresponding SSG are all in the same three columns c1, c2 and c3, then no other cells in c1, c2 and c3 can take on the value k, so we may eliminate the value k from the corresponding cells in the SSG. (This strategy can also be applied if we have columns that restrict the occurrence of k to three rows.)
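The row-based Swordfish can be sketched with a size parameter, so that the same routine also covers the X-Wing (size 2) under the same assumptions. The candidate map `cands` (every empty cell with its possible values) and the name `fish_rows` are ours, and the pattern test used here (the chosen rows confine k to exactly `size` columns) is one common way to state the condition rather than necessarily the check used in our solver.

```python
from itertools import combinations
from typing import Dict, List, Set, Tuple

Cell = Tuple[int, int]  # (row, column), 1-indexed as in the text


def fish_rows(cands: Dict[Cell, Set[int]], k: int, size: int) -> List[Cell]:
    """Apply row-based fish patterns of the given size for value k, in place.

    size=3 is the Swordfish. Returns the cells from which k was eliminated.
    """
    cols_by_row: Dict[int, Set[int]] = {}  # row -> columns where k is possible
    for (r, c), values in cands.items():
        if k in values:
            cols_by_row.setdefault(r, set()).add(c)

    eliminated = []
    for rows in combinations(sorted(cols_by_row), size):
        cols = set().union(*(cols_by_row[r] for r in rows))
        if len(cols) != size:
            continue  # these rows do not confine k to `size` columns
        for cell in sorted(cands):
            r, c = cell
            if c in cols and r not in rows and k in cands[cell]:
                cands[cell].discard(k)
                eliminated.append(cell)
    return eliminated


# Illustrative data: in rows 1, 5 and 9 the value 7 is confined to columns 2, 5, 8.
cands = {
    (1, 2): {3, 7}, (1, 5): {4, 7},
    (5, 5): {1, 7}, (5, 8): {6, 7},
    (9, 2): {2, 7}, (9, 8): {5, 7},
    (3, 2): {7, 9}, (3, 4): {3, 7},
    (7, 6): {7, 8}, (7, 8): {1, 7},
}
print(fish_rows(cands, 7, 3))  # [(3, 2), (7, 8)]
```

Note that no single row of the example contains 7 in all three columns; as with the Naked Triple, only the union of the columns matters.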

Application Enumeration: For each value k, 1 through 9, we count the number of rows that contain k exactly two or three times in the SSG of their empty cells, r_k. Since we need three such rows to form a Swordfish for any one number, we take $\binom{r_k}{3}$. We also count the number of columns that contain k two or three times in the SSG of their cells, c_k, and similarly take $\binom{c_k}{3}$. We sum over all values k, so this value is $\sum_k \left[ \binom{r_k}{3} + \binom{c_k}{3} \right]$.

4. Tier 4 Strategies

• Jellyfish: The Jellyfish Strategy is analogous to the Swordfish and X-Wing strategies. We apply similar reasoning to four rows r1, r2, r3 and r4 in which some value k is restricted to the same four columns c1, c2, c3 and c4. If the appearance of k in cells of r1, r2, r3 and r4 in the Sudoku Solution Graph is restricted to four specific columns, then we can eliminate k from any cells in c1, c2, c3 and c4 that are not in one of r1, r2, r3 or r4. Like the Swordfish strategy, the Jellyfish strategy may also be applied to columns instead of rows.

Application Enumeration: For each value k, 1 through 9, we count the number of rows that contain k exactly two, three or four times in the SSG of their empty cells, r_k. Since we need four such rows to form a Jellyfish for any one number k, we take $\binom{r_k}{4}$. We also count the number of columns that contain k two, three or four times in the

version of complete search. When cell (i, j) has no assigned value, but exactly 2 possible values k1, k2 in its SSG, the solver will assign a test value (assume k1) to cell (i, j) and continue solving the puzzle using only Tier 0 strategies.

There are three possible results. If the solver arrives at a contradiction, he deduces that k2 is in cell (i, j). If the solver completes the puzzle using the test value, this is the unique solution and the puzzle is solved. Otherwise, if the solver cannot proceed further but has not solved the puzzle completely, backtracking has failed and the solver must start a different strategy.

Application Enumeration: Since we only apply Backtracking to cells with exactly two values in their SSG, this is just the number of empty cells that have exactly two values in their SSG.
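The three possible results can be made concrete with the following sketch. It is a simplified stand-in rather than our actual solver: the board type, the names `candidates`, `tier0_step`, `solve_tier0` and `backtrack_once`, the choice to try only the first two-valued cell, and the simplified SSG (values unused by a cell's row, column, and region) are all assumptions. The test grid `full` is a standard pattern-generated solution used only for illustration.

```python
from typing import Dict, List, Set, Tuple

Cell = Tuple[int, int]   # (row, column), 1-indexed as in the text
Board = Dict[Cell, int]  # hypothetical board type: 0 marks an empty cell

CELLS: List[Cell] = [(r, c) for r in range(1, 10) for c in range(1, 10)]
ROWS = [[(r, c) for c in range(1, 10)] for r in range(1, 10)]
COLS = [[(r, c) for r in range(1, 10)] for c in range(1, 10)]
REGIONS = [[(r, c) for r in range(a, a + 3) for c in range(b, b + 3)]
           for a in (1, 4, 7) for b in (1, 4, 7)]
GROUPS = ROWS + COLS + REGIONS


def candidates(board: Board, cell: Cell) -> Set[int]:
    """Simplified SSG entry: values unused by the cell's row, column, region."""
    used = {board[p] for g in GROUPS if cell in g for p in g if p != cell}
    return set(range(1, 10)) - used


def tier0_step(board: Board) -> str:
    """Place one Naked or Hidden Single; report progress, stuck, or contradiction."""
    cands = {c: candidates(board, c) for c in CELLS if board[c] == 0}
    if any(not s for s in cands.values()):
        return "contradiction"
    for cell, s in cands.items():
        if len(s) == 1:  # Naked Single
            board[cell] = next(iter(s))
            return "progress"
    for group in GROUPS:
        placed = {board[c] for c in group}
        for k in sorted(set(range(1, 10)) - placed):
            homes = [c for c in group if board[c] == 0 and k in cands[c]]
            if not homes:
                return "contradiction"
            if len(homes) == 1:  # Hidden Single
                board[homes[0]] = k
                return "progress"
    return "stuck"


def solve_tier0(board: Board) -> str:
    """Repeat Tier 0 steps; return 'solved', 'stuck', or 'contradiction'."""
    while any(board[c] == 0 for c in CELLS):
        status = tier0_step(board)
        if status != "progress":
            return status
    return "solved"


def backtrack_once(board: Board) -> str:
    """Test the smaller value of the first two-valued cell on a copy of the board."""
    for cell in CELLS:
        if board[cell] == 0:
            values = sorted(candidates(board, cell))
            if len(values) == 2:
                k1, k2 = values
                trial = dict(board)
                trial[cell] = k1
                outcome = solve_tier0(trial)
                if outcome == "solved":
                    board.update(trial)   # the test value led to a full solution
                elif outcome == "contradiction":
                    board[cell] = k2      # k1 is impossible, so k2 is forced
                return outcome
    return "stuck"  # no cell with exactly two values


# Pattern-generated full grid (illustrative), with three cells blanked out.
full: Board = {(r, c): (3 * ((r - 1) % 3) + (r - 1) // 3 + c - 1) % 9 + 1
               for (r, c) in CELLS}
board = dict(full)
for blank in [(1, 1), (1, 2), (9, 2)]:
    board[blank] = 0

print(backtrack_once(board), board[(1, 2)])  # contradiction 2
```

In this example cell (1, 2) has the two possible values 1 and 2; testing 1 leaves cell (1, 1) with no possible value, so the solver deduces that (1, 2) holds 2.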

References

AoPS Inc. 2007. Combinatorics and sequences. http://www.artofproblemsolving.com/Forum/viewtopic.php?t=88383.

Caine, Allan, and Robin Cohen. 2006. MITS: A Mixed-Initiative Intelligent Tutoring System for Sudoku. Advances in Artificial Intelligence 550–561.

Cox, Kenneth C., Stephen G. Eick, Graham J. Wills, and Ronald J. Brachman. 1997. Brief application description; visual data mining: Recognizing telephone calling fraud. Data Mining and Knowledge Discovery 225–331.

Emery, Michael Ray. 2007. Solving Sudoku puzzles with the COUGAAR agent architecture. Thesis. http://www.cs.montana.edu/techreports/2007/MichaelEmery.pdf.

Eppstein, David. 2005. Nonrepetitive paths and cycles in graphs with application to Sudoku. http://www.citebase.org/abstract?id=oai:arXiv.org:cs/0507053.

Felgenhauer, Bertram, and Frazer Jarvis. 2005. Enumerating possible Sudoku grids. http://www.afjarvis.staff.shef.ac.uk/sudoku/sudoku.pdf.

Goodman, Leo A., and William H. Kruskal. 1954. Measures of association for cross classifications. Journal of the American Statistical Association, 49 (December 1954) 732–764.

GraphPad Software. n.d. QuickCalcs: Online calculators for scientists. http://www.graphpad.com/quickcalcs/PValue1.cfm.

Hanssen, Vegard. n.d. Sudoku puzzles. http://www.menneske.no/sudoku/eng/.

Hayes, Brian. 2006. Unwed numbers: The mathematics of Sudoku, a puzzle that boasts “No math required!” American Scientist Online. http://www.americanscientist.org/template/AssetDetail/assetid/48550?print=yes.

Johnson, Angus. n.d. Solving Sudoku. http://www.angusj.com/sudoku/hints.php.

Knuth, Donald Ervin. 2000. Dancing links. In Millennial Perspectives in Computer Science: Proceedings of the 1999 Oxford-Microsoft Symposium in Honour of Professor Sir Antony Hoare, edited by Jim Davies, Bill Roscoe, and Jim Woodcock, 187–214. Basingstoke, U.K.: Palgrave Macmillan. http://www-cs-faculty.stanford.edu/~uno/preprints.html.

Lenz, Moritz. n.d. Sudoku Garden. http://sudokugarden.de/en.

Lewis, Rhyd. 2007. Metaheuristics can solve sudoku puzzles. Journal of Heuristics 13 (4): 387–401.

Lynce, Inês, and Joël Ouaknine. 2006. Sudoku as a SAT Problem. http://sat.inesc-id.pt/~ines/publications/aimath06.pdf.

Mantere, Timo, and Janne Koljonen. 2006. Solving and rating Sudoku puzzles with genetic algorithms. In Proceedings of the 12th Finnish Artificial Intelligence Conference STeP. http://www.stes.fi/scai2006/proceedings/step2006-86-solving-and-rating-sudoku-puzzles.pdf.

Simonis, Helmut. 2005. Sudoku as a constraint problem. In Modelling and Reformulating Constraint Satisfaction, edited by Brahim Hnich, Patrick Prosser, and Barbara Smith, 13–27. http://homes.ieu.edu.tr/~bhnich/mod-proc.pdf#page=21.

Sudoku. n.d. Times Online. http://entertainment.timesonline.co.uk/tol/arts_and_entertainment/games_and_puzzles/sudoku/.

Sudoku solver. http://www.scanraid.com/sudoku.htm.

Sudoku strategy. n.d. Sudoku Dragon. http://www.sudokudragon.com/sudokustrategy.htm.

Web Sudoku. n.d. http://www.websudoku.com/.

What is Sudoku? n.d. http://www.sudokuaddict.com.

Yato, Takayuki. 2003. Complexity and completeness of finding another solution and its application to puzzles. Thesis, January 2003. http://

Zhou Fan, Christopher Chang, and Yi Sun.

Taking the Mystery Out of Sudoku Difficulty: An Oracular Model

Sarah Fletcher Frederick Johnson David R. Morrison

Harvey Mudd College Claremont, CA

Advisor: Jon Jacobsen

Summary

In the last few years, the 9-by-9 puzzle grid known as Sudoku has gone from being a popular Japanese puzzle to a global craze. As its popularity has grown, so has the demand for harder puzzles whose difficulty level has been rated accurately.

We devise a new metric for gauging the difficulty of a Sudoku puzzle.

We use an oracle to model the growing variety of techniques prevalent in the Sudoku community. This approach allows our metric to reflect the difficulty of the puzzle itself rather than the difficulty with respect to some particular set of techniques or some perception of the hierarchy of the techniques. Our metric assigns a value in the range [0, 1] to a puzzle.

We also develop an algorithm that generates puzzles with unique solutions across the full range of difficulty. While it does not produce puzzles of a specified difficulty on demand, it produces the various difficulty levels frequently enough that, as long as the desired score range is not too narrow, it is reasonable simply to generate puzzles until one of the desired difficulty is obtained. Our algorithm has exponential running time, necessitated by the fact that it solves the puzzle it is generating to check for uniqueness. However, we apply an algorithm known as Dancing Links to produce a reasonable runtime in all practical cases.

Introduction

The exact origins of the Sudoku puzzle are unclear, but the first modern “Sudoku” puzzle showed up under the name “Number Place” in a 1979 puzzle magazine put out by Dell Magazines. Nikoli Puzzles introduced the puzzle to Japan in 1984, giving it the name “Suuji wa dokushin ni kagiru,” which was eventually shortened to the current “Sudoku.” In 1986, Nikoli added two new constraints to the creation of the puzzle: There should be no more than 30 clues (or givens), and these clues must be arranged symmetrically. With a new name and a more esthetically pleasing board, the game immediately took off in Japan. In late 2004, Sudoku was introduced to the London Times; and by the summer of 2005, it had infiltrated many major American newspapers and become the latest puzzle craze [Wikipedia 2008b].

Sudopedia is a Website that collects and organizes electronic information on Sudoku, including solving techniques, from how to deal with “Fishy Cycles” and “Squirmbags” to identifying “Skyscrapers” and what to do if you discover that you have a “Broken Wing.” It even explains the possibilities for what has happened if you find yourself hopelessly buried in a “Bivalue Universal Grave.” Some techniques are more logically complex than others, but many of similar complexity seem more natural to different players or are more powerful in certain situations. This situation makes it difficult to use specific advanced techniques in measuring the difficulty of a puzzle.

Our goal is a metric to rate Sudoku puzzles and an algorithm to generate them. A useful metric should reflect the difficulty as perceived by humans, so we analyze how humans approach the puzzle and use the conclusions as the basis for the metric. In particular, we introduce the concept of an “oracle” to model the plethora of complicated techniques. We also devise a normalized scoring technique, which allows our metric to be extended to a variety of difficulty levels.

We devise a generation algorithm to produce puzzles with unique solutions that span all difficulty levels, as measured by our metric. To ensure uniqueness, our generation algorithm must solve the puzzle (multiple times) to check for extra solutions. Since solving a Sudoku puzzle is an NP-complete problem [Wikipedia 2008b], our algorithm has exponential running time at best.
