Database Systems
Instructor: Winston Hsu TA: Daniel Hsieh, Ryan Chang
November 17, 2014
DUE DATE 12/03/2014, Hand in PAPER PRINTOUT before THE END of the class
LATE POLICY We will not accept any assignment submissions.
CHEATING POLICY If you are caught cheating, your grade is 0.
SUBMISSION Please include your name and ID along with your answers to questions.
Assignment 4: Storage and Indexing Overview, Disks and Files 1. (9%) Answer the following questions about storage:
(a) Why does a database system store data on external storage?
(b) Why we should pay much attention on I/O costs in a database system?
(c) How do the buffer manager and the disk space manager work in a database system?
2. (16%) Consider these five file organizations implemented in a database system: Heap files, Sorted files, Clustered B+ tree files, Heap files with unclustered tree index, Heap files with unclustered hash index, answer the following questions and expain the reason of your answer:
(a) Can any of these file organizations be superior in all situations?
(b) Which one will you choose for a file if searching for records based on a range of field values is the mose frequent operation?
(c) Which one will you choose for a file if searching for a record based on a particular field value is the mose frequent operation?
(d) Which one will you choose for a file if performing inserts and scans are the most frequent operations, where the order of records does not matter?
3. (8%) Explain the terms:
(a) (1%) Storage nonvolatile
(b) (1%) MTTF (mean-time-to-failure)
(c) (2%) Random access and sequential access (Hint: Why pages placement policy impact on performance of a database?)
(d) (2%) Prefetching
(e) (2%) Data striping and parity scheme
1
4. (6%) Please discuss RAID 0+1 and RAID 1+0 based on performance and reliability. (Figures are shown below)
5. (4%) What are the trade-offs between them?
(a) Two possible record formats (b) Two possible page formats
6. (7%) Consider a disk with a sector size of 512 bytes, 2000 tracks per surface, 50 sectors per track, five double-sided platters, average seek time of 10 msec and block size of 1024 bytes is chosen. Suppose that a file containing 100,000 records of 100 bytes each is to be stored on such a disk and that no record is allowed to span two block. (Remember next block concept!)
(a) (1%) How many blocks are required to store the entire file? If the file is arranged sequentially on disk, how many surfaces are needed?
(b) (1%) How many records of 100 bytes each can be stored using this disk?
(c) (2%) If pages are stored sequentially on disk, with page 1 on block 1 of track 1, what page is stored on block 1 of track 1 on the next disk surface? How would your answer change if the disk were capable of reading and writing from all heads in parallel?
(d) (3%) What is the time required to read (platter rotates at 5400 rpm) a file containing 100,000 records of 100 bytes each in a random order? To read a record, the block containing the record has to be fetched from disk. Assume that each block request incurs the average seek time and rotational delay.
2