Database Systems-Assignment 5: Hash-based Indexing & Query Evaluation Overview

(1)

Database Systems

Instructor: Hao-Hua Chu Fall Semester, 2005 Assignment 5: Hash-based Indexing & Query Evaluation Overview

Answers 1. a. (10%) b. (10%) 3 000 001 011 010 2 1 100 101 110 111 5, 17, 29 2 3, 11, 19 3 7, 23, 31 3 2 00 01 11 10 2, 12 1 5, 17, 29 2 3, 11, 19 2

2. (10%) Hash indices enable us to perform point lookup (eg.σA=r(relation)) operations very fast, but for range searches the B+-tree index would be much more efficient.

(10%) If there is a range query to be evaluated, and only a hash index is available, the better strategy might be to perform a file scan rather than using that index.

3. a. (10%) Use the index to locate the first tuple whose branch-city field has value “Brooklyn”. From this tuple, follow the pointer chains till the end, retrieving all the tuples.

b. (10%) For this query, the index serves no purpose. We can scan the file sequentially and select all tuples whose branch-city field is anything other than “Brooklyn”.

c. (10%) This query is equivalent to the query

σ(branch-city>=“Brooklyn” AND assets >= 5000)(branch)

Using the branch-city index, we can retrieve all tuples with branch-city value greater than or equal to “Brooklyn” by following the pointer chains from the first “Brooklyn” tuple. We also apply the additional criteria of assets >= 5000 on every tuple.

(2)

4. r1needs 800 pages, and r2needs 1500 pages a. Index Nested Loop (15%)

Since r1 is smaller, we use it as the build relation and r2 as the probe relation. The total cost

= Cost of Scanning r1 + #r1 tuples*(Cost of retrieving index page + Cost of retrieving data page) = 800 + 20000*(1 + 1)

= 40800 page I/Os. b. Sort-Merge (15%)

The total cost

= Cost of sorting r1 & r2 + Cost of merging

= 2*800*(ceiling(log100 ceiling(800/101)) + 1) + 2*1500*(ceiling(log100 ceiling(1500/101)) + 1) + (800 + 1500)