Data Structure and Algorithm II Homework #5
Due: 10am, Monday, June 13, 2011
=== Homework submission instructions ===
• Submit the answers for writing problems through the CEIBA system (electronic copy) or to the TA in R204 (hard copy). Please write down your name and school ID in the header of your documents. You also need to submit your programming assignment (problem 1) to the Judgegirl System (http://katrina.csie.ntu.edu.tw/judgegirl/).
• Each student may only choose to submit the homework in one way; either all as hard copies or all through CEIBA except the programming assignment. If you submit your homework partially in one way and partially in the other way, you might only get the score of the part submitted as hard copies or the part submitted through CEIBA (the part that the TA chooses).
• If you choose to submit the answers of the writing problems through CEIBA, please combine the answers of all writing problems into only one file in the doc/docx or pdf format, with the file name in the format of “hw5 [student ID].{pdf,docx,doc}” (e.g. “hw5 b98902010.pdf”); otherwise, you might only get the score of one of the files (the one that the TA chooses).
• For each problem, please list your references (they can be the names of the classmates you discussed the problem with, the URL of the information you found on the Internet, or the names of the books you read). The TA can deduct up to 100% of the score assigned to the problems where you don’t list your references.
Problem 1. (40%) The n-puzzle problem. The n-puzzle is a sliding puzzle which consists of a frame of numbered square tiles in random order with one tile missing. If the size is 3x3, then the puzzle is called the 9-puzzle and if 4x4, the puzzle is called the 16-puzzle. The goal of the puzzle is to place the tiles in order by making sliding moves (horizontally or vertically, but not diagonally) that use the empty space.
The following example shows a sequence of legal moves from an initial board configuration to the desired board configuration for a 9-puzzle.
    _ 1 3      1 _ 3      1 2 3      1 2 3      1 2 3
    4 2 5  =>  4 2 5  =>  4 _ 5  =>  4 5 _  =>  4 5 6
    7 8 6      7 8 6      7 8 6      7 8 6      7 8 _
(_ marks the empty space.)
Finding one solution is relatively easy. However, it has been shown that finding the “shortest” solution (the solution with the minimum number of moves) is NP-hard. In this programming assignment, we ask you to write a program to solve this NP-hard problem.
We will now describe a classic solution to the problem, which illustrates a general artificial intelligence methodology known as the A* algorithm. In the A* algorithm, we always evaluate next the board configuration that seems most likely to lead to the shortest solution. For the n-puzzle, it works as follows. First, insert the starting board configuration into a priority queue. Then, delete the board configuration with the minimum priority from the queue. Moving a tile into the empty space turns it into one of several new board configurations; insert every board configuration reachable in one move into the priority queue. Repeat this procedure until the board configuration dequeued is the desired board configuration.
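The procedure above can be sketched in Python as follows. This is only a minimal sketch under stated assumptions, not a reference solution: the names `neighbors` and `a_star`, the use of 0 for the empty space, the flat-tuple board representation, and the `best` table (which skips configurations already reached in fewer moves) are all choices made for illustration. The priority is passed in as a function, and the demo call simply uses the number of moves so far.

```python
import heapq
from itertools import count


def neighbors(board, k):
    """Yield every board reachable by sliding one tile into the empty cell (0)."""
    z = board.index(0)
    r, c = divmod(z, k)
    for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        nr, nc = r + dr, c + dc
        if 0 <= nr < k and 0 <= nc < k:
            j = nr * k + nc
            nxt = list(board)
            nxt[z], nxt[j] = nxt[j], nxt[z]
            yield tuple(nxt)


def a_star(start, goal, k, priority):
    """Return the number of moves of the path found, or None if the queue empties.

    priority(board, moves) is the key used in the priority queue.
    """
    tie = count()  # tie-breaker so the heap never has to compare boards
    heap = [(priority(start, 0), next(tie), 0, start)]
    best = {start: 0}  # fewest moves seen so far for each board (an assumption)
    while heap:
        _, _, moves, board = heapq.heappop(heap)
        if board == goal:
            return moves
        if moves > best.get(board, moves):
            continue  # stale entry: this board was reached in fewer moves
        for nxt in neighbors(board, k):
            if moves + 1 < best.get(nxt, float("inf")):
                best[nxt] = moves + 1
                heapq.heappush(
                    heap, (priority(nxt, moves + 1), next(tie), moves + 1, nxt)
                )
    return None


start = (0, 1, 3, 4, 2, 5, 7, 8, 6)  # the example board, 0 = empty space
goal = (1, 2, 3, 4, 5, 6, 7, 8, 0)
print(a_star(start, goal, 3, lambda board, moves: moves))  # prints 4
```

With the move count alone as the priority, the search expands boards in breadth-first order, which is fine for this small example but slow on larger boards.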
The key question is how we determine the likelihood of one board configuration being the one that leads to the shortest solution. A heuristic based on the Manhattan distance between the current board configuration and the desired board configuration can be used for this purpose, and can be calculated for a particular board configuration c as follows:
$$h(c) = \sum_{i=0}^{n-1} \left( |\bar{x}_i - x_i| + |\bar{y}_i - y_i| \right) \qquad (1)$$
where $\bar{x}_i$ and $\bar{y}_i$ are the x and y coordinates of the i-th tile in the desired board configuration, and $x_i$ and $y_i$ are the x and y coordinates of the i-th tile in the current board configuration. Note that h(c) is a lower bound on the number of moves from the current board configuration c to the desired board configuration. Then the “priority” of a board configuration when it is inserted into the priority queue is given by:
$$p(c) = h(c) + \epsilon(c) \qquad (2)$$
where $\epsilon(c)$ is the actual number of moves (or, cost) used by the program to reach board configuration c from the starting board configuration.
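As a rough illustration of equations (1) and (2), the sketch below computes h(c) and p(c) for a board stored as a flat tuple. Several things here are assumptions rather than part of the assignment: the function names `manhattan` and `priority`, the use of 0 for the empty space, and the desired configuration having tiles 1 to n − 1 in row-major order with the empty space last (as in the example above). The sum also skips the empty space and counts only the numbered tiles, which seems to be what the lower-bound remark requires.

```python
def manhattan(board, k):
    """h(c): sum of Manhattan distances of the numbered tiles to their goal cells."""
    total = 0
    for pos, tile in enumerate(board):
        if tile == 0:
            continue  # assumption: the empty space does not contribute
        r, c = divmod(pos, k)        # current row and column of the tile
        gr, gc = divmod(tile - 1, k)  # row and column in the desired board
        total += abs(gr - r) + abs(gc - c)
    return total


def priority(board, k, moves):
    """p(c) = h(c) + (number of moves used to reach c from the start)."""
    return manhattan(board, k) + moves


start = (0, 1, 3, 4, 2, 5, 7, 8, 6)   # example board, 0 = empty space
after_one_move = (1, 0, 3, 4, 2, 5, 7, 8, 6)
print(manhattan(start, 3))                 # prints 4
print(priority(after_one_move, 3, 1))      # prints 4  (h = 3, one move used)
```

For the example board, tiles 1, 2, 5 and 6 are each one cell away from their desired positions, so h(c) = 4, which equals the length of the 4-move solution shown earlier.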
One thing to note is that not all problems are solvable, i.e., some initial board configurations cannot be altered to reach the desired board configuration by any number of moves. However, we will make sure that all the input data sets that we use to test your program are solvable.
Your program should follow the input and output formats specified below.
Input
The first line has $\sqrt{n}$. The board is $\sqrt{n} \times \sqrt{n}$ in size, where $3^2 \le n \le 10^2$. The next $\sqrt{n}$ lines show the initial board configuration. The numbers are separated by a single space character ’ ’. The tiles are numbered from 1 to n − 1. The empty tile is represented by a single space character ’ ’.
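Because the empty tile and the separator are both a single space, a plain whitespace split cannot tell where the empty tile sits in a row. The sketch below shows one possible way to read the format character by character. It is only an illustration based on a literal reading of the description above: the names `parse_row` and `parse_board` are hypothetical, 0 is used internally for the empty tile, and a row that ends early is assumed to have the empty tile in its last cell. It does not work if leading spaces have been stripped from a line.

```python
def parse_row(line, k):
    """Parse one row of k cells; the empty tile is returned as 0."""
    cells = []
    i = 0
    for _ in range(k):
        if i >= len(line) or line[i] == " ":
            cells.append(0)  # a space (or a missing last cell) is the empty tile
            i += 1           # consume the space that stands for the empty tile
        else:
            j = i
            while j < len(line) and line[j].isdigit():
                j += 1
            cells.append(int(line[i:j]))  # raises ValueError on unexpected input
            i = j
        i += 1  # skip the single separator after this cell
    return cells


def parse_board(text):
    """Return (k, board) where k is the side length and board is a flat tuple."""
    lines = text.split("\n")
    k = int(lines[0])
    board = []
    for row in lines[1:1 + k]:
        board.extend(parse_row(row.rstrip("\r"), k))
    return k, tuple(board)


print(parse_row("1   3", 3))  # prints [1, 0, 3]
```

Reading the rows without trimming them first matters here, since a leading space may be the empty tile itself.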
Output
Output a single line containing the number m, where m is the number of moves in the shortest solution.
Sample Input and Output
(input)
3
  1 3
4 2 5
7 8 6
(output)
4
Please write a program to solve the n-puzzle problem (25%). Please also submit a report (10%) which includes a technical specification of your program, which helps the TA to understand your program. In addition, please answer the following questions in your report.
• Suppose that we replace the Manhattan distance heuristic with a different one, which is no longer a lower bound of the number of moves from the current board configuration to the desired board configuration. Can we still guarantee that the A* algorithm will generate a shortest solution? Please explain. (5%)
• (bonus problem) Can you modify your algorithm to make sure that in the case that the input given to your program is not solvable, your program will be able to output -1 (indicating that the problem is not solvable) in polynomial time? Please explain. (bonus 10%)
Problem 2. (20%) Reading assignment: Read Painless Functional Specifications part II through IV (we have talked about part I in the class). They can be found here:
• Part I: http://www.joelonsoftware.com/articles/fog0000000036.html
• Part II: http://www.joelonsoftware.com/articles/fog0000000035.html
• Part III: http://www.joelonsoftware.com/articles/fog0000000034.html
• Part IV: http://www.joelonsoftware.com/articles/fog0000000033.html
... Did you read them? No, you didn’t. :P (You’ll see that’s what the author of these articles said too.) Go and read them. It is part of this assignment. We will have some questions in the final about this.
Then write a functional specification of your Huffman file compressor/decompressor (HW4-2.1). Note that this is different from the report for HW4-2.1, which is supposed to be a technical specification. Since most people had probably not started on HW4-2.1 when this homework came out, we encourage you to finish the functional specification before you start the implementation. If you have already finished HW4-2.1 when you read this, that is fine too. Just bear in mind that people usually write functional specifications before they start working on the implementation. The main objective of this assignment is for you to practice writing a functional spec for a program you will create (or, in the other case, a program you have already created).
For teams of two students, each student should be in charge of the spec for either the compressor or the decompressor. For teams of three students, the third student can pick either one to write the spec for; just make sure that both the compressor and the decompressor are covered by the team.
Problem 3. (10%) Show that the ≤P relation is a transitive relation on languages. That is, show that if L1 ≤P L2 and L2 ≤P L3, then L1 ≤P L3 (problem 34.3-2).
Problem 4. (30%) Solve problem 34.3 on p.1103 in the textbook.