• 沒有找到結果。

Database Systems

N/A
N/A
Protected

Academic year: 2022

Share "Database Systems"

Copied!
2
0
0

加載中.... (立即查看全文)

全文

(1)

Database Systems

Instructor: Winston Hsu TA: Daniel Hsieh, Ryan Chang

December 22, 2014

DUE DATE 1/7/2015, Hand in PAPER PRINTOUT

LATE POLICY We will not accept any assignment submissions.

CHEATING POLICY If you are caught cheating, your grade is 0.

SUBMISSION Please include your name and ID along with your answers to questions.

Assignment 6: Query Evaluation, External Sorting and Join Evaluation Algorithms 1. (20%) Consider the following schema:

Sailor(sid :integer, sname :string, rating :integer, age :real)

Note that sid is the primary key. Assume that each tuple of Sailors is 50 bytes long, that a page can hold 80 Sailors tuples, and that we have 500 pages of such tuples. For each of the following selection conditions, estimate the number of pages retrieved, given the catalog information in the question.

(a) Assume that we have a B+-tree index T on the search key Sailors.sid , and assume that IHeight(T) = 4, INPages(T) = 50, Low(T) = 1, and High(T) = 100,000.

i. σSailors.sid<50,000(Sailors) ii. σSailors.sid=50,000(Sailors)

(a) Assume that we have a hash index T on the search key Sailors.sid , and assume that IHeight(T) = 2, INPages(T) = 50, Low(T) = 1, and High(T) = 100,000.

i. σSailors.sid<50,000(Sailors) ii. σSailors.sid=50,000(Sailors)

2. (5%) Answer the following questions about query optimization:

(a) (1%) What are the benefits of storing the system catalogs as relations?

(b) (2%) What is the goal of query optimization? Why is optimization important?

(c) (2%) What role do statistics gathered from the database play in query optimization?

3. (5%) Explain the terms:

(a) (2%) What is external sorting?

(b) (3%) What is double buffering? What is the motivation for using it?

1

(2)

4. (10%) Answer the following questions for each of these scenarios, assuming that our most general external sorting algorithm is used: (1) A file with 10,000 pages and three available buffer pages.

(2) A file with 20,000 pages and five available buffer pages. (3) A file with 2,000,000 pages and 17 available buffer pages.

(a) (2%) How many runs will you produce in the first pass?

(b) (2%) How many passes will it take to sort the file completely?

(c) (3%) What is the total I/O cost of sorting the file?

(d) (3%) How many buffer pages do you need to sort the file completely in just two passes?

5. (10%) Consider the join R∞R.a=S.bS, given the following information about the relations to be joined. The cost metric is the number of page I/Os unless otherwise noted, and the cost of writing out the result should be uniformly ignored.

Relation R contains 10,000 tuples and has 10 tuples per page.

Relation S contains 2,000 tuples and also has 10 tuples per page.

Attribute b of relation S is the primary key for S.

Both relations are stored as simple heap files.

Neither relation has any indexes built on it.

52 buffer pages are available.

(a) What is the cost of joining R and S using a page-oriented simple nested loops join?

(b) What is the cost of joining R and S using a block nested loops join?

(c) What is the cost of joining R and S using a sort-merge join?

(d) What is the cost of joining R and S using a hash join?

(e) How many tuples does the join of R and S produce at most?

2

參考文獻

相關文件

showed that an o(n)-bit auxiliary string suffices to support the following queries in O(1) time: parent, depth, subtree-size, and the rank and select queries for nodes in pre-order

– Each time a file is opened, the content of the directory entry of the file is moved into the table.. • File Handle (file descriptor, file control block): an index into the table

If both a compute-bound process and an I/O-bound process are waiting for a time slice, which should be given

Answer: Yes, it is possible to simplify ARIES. There is no need to perform REDO on any of the actions from committed transactions because their updates have all been forced into

In this article channel behavior under various scenarios is observed for ZigBee nodes placed throughout a midsize sedan.In general, our experiments and measured results indicate

These humanistic qualities of Buddhism are expressed in the following three dichotomous principles: “the good and the most good,” “the expedient and the transcendental,” and

Directions: In this part, you will hear 10 questions or statements, read the three possible answers and choose the best response to the question.. The best answer

Find the probability for having 3 or more accidents in one day’s rush hours (morning and evening included).. (16%) Suppose that the time required for different cashiers form