• 沒有找到結果。

Database Systems Assignment 3: Storage and Indexing

N/A
N/A
Protected

Academic year: 2021

Share "Database Systems Assignment 3: Storage and Indexing"

Copied!
2
0
0

加載中.... (立即查看全文)

全文

(1)

Database Systems

Instructor: Hao-Hua Chu Fall Semester, 2005

Assignment 3: Storage and Indexing

Deadline: At the end of TA’s office hour on Tuesday, Nov. 8, 2005

This is an individual assignment, that is, no group submissions are allowed.

Cheating Policy: If you are caught cheating, your grade is 0.

Late Policy: You may hand in your late assignment at TA’s office hour on Wed. (11/9/2005) for 80% of original grade, or at TA’s office hour on Thurs. (11/10/2005) for 70%. We will not accept any assignment submissions after Thursday.

Questions

1. Consider the following relations:

Emp( eid: integer, ename: varchar, sal: integer, age: integer, did: integer) Dept( did: integer, budget: integer, floor: integer, mgr_eid: integer)

Salaries range from $10,000 to $100,000, ages vary from 20 to 80, each department has about five employees on average, there are 10 floors, and budgets vary from $10,000 to $1 million.

For each of the following queries, which of the listed index choices would you choose to speed up the query? Please also compute the number of page I/Os for each indexing choices: There are 100 data pages when records are packed onto pages with no wasted space and 100 records per page. The fan-out of B+ tree is 100. As the assumption in the textbook, each data entry in the index is one tenth the size of a data record. In B+ tree files, the pages are at 67% occupancy; in hash files, the pages are at 80% occupancy. Note that clustered implies Alternative 1 (page 17 in 10/31 slide).

If your database system does not consider index-only plans (i.e., data records are always retrieved even if enough information is available in the index entry), how would your answer change? Explain briefly.

(1) Query: Print ename, age, and sal for all employees.

(a) Clustered hash index on <ename, age, sal> fields of Emp. (b) Unclustered hash index on <ename, age, sal> fields of Emp. (c) Clustered B+ tree index on <ename, age, sal> fields of Emp.

(2)

(d) Unclustered hash index on <eid, did> fields of Emp. (e) No index.

(2)Query: Find the dids of departments that are on the 10th floor and have a budget of less than $15,000. (Note: we assume applying the hash function to a record allows us to identify and retrieve the page containing the record with 1 I/O and that there are no overflow chains. We also assume that the matched records are all in the same page. )

(a) Clustered hash index on the floor field of Dept. (b) Unclustered hash index on the floor field of Dept.

(c) Clustered B+ tree index on <floor, budget> fields of Dept. (d) Clustered B+ tree index on the budget field of Eept. (e) No index.

2. Consider the page format for variable-length records that uses a slot directory. (1) One approach to managing the slot directory is to use a maximum size (i.e., a maximum number of slots) and allocate the directory array when the page is created. Discuss the pros and cons of this approach with respect to the approach discussed in the text.

(2) Suggest a modification to this page format that would allow us to sort records (according to the value in some field) without moving records and without changing the record ids.

3. Modern disk drives store more sectors on the outer tracks than the inner tracks. Since the rotation speed is constant, the sequential data transfer rate is also higher on the outer tracks. The seek time and rotational delay are unchanged. Given this information, explain good strategies for placing files with the following kinds of access patterns: (1) Frequent, random accesses to a small file (e.g., catalog relations).

(2) Sequential scans of a large file (e.g., selection from a relation with no index). (3) Random accesses to a large file via an index (e.g., selection from a relation via the index).

(4) Sequential scans of a small file.

Submission

Hand in PAPER PRINTOUT that contains your answers to the three questions. Please include your name and ID.

參考文獻

相關文件

It would be game-changing to form a class atmosphere that encourage diversity and discussion, formed of groups with different microculture that optimized the sense of belonging

Let us consider the numbers of sectors read and written over flash memory when n records are inserted: Because BFTL adopts the node translation table to collect index units of a

– File and index layers organize records on files, and manage the indexing data

• Each row corresponds to one truth assignment of the n variables and records the truth value of φ under that truth assignment. • A truth table can be used to prove if two

(2)Ask each group to turn to different page and discuss the picture of that page.. (3)Give groups a topic, such as weather, idols,

 Promote project learning, mathematical modeling, and problem-based learning to strengthen the ability to integrate and apply knowledge and skills, and make. calculated

Now, nearly all of the current flows through wire S since it has a much lower resistance than the light bulb. The light bulb does not glow because the current flowing through it

(C) Some researchers say that if a person’s brain is right, she or he can easily have good musical skills.. (D) According to some researchers, the right brain helps us to improve