1
Database Systems
( 資料庫系統
)
October 24, 2005
Lecture #5
2
Course Administration
• Graded assignment #1s are returned
today.
– Pick up outside the TA’s office (336/338)
• Assignment #2 is out on the home
webpage.
– It is due one week from today.
• Next week reading:
– Chapter 8: Overview of Storage and Indexing
3
Kitchen of the Future?
(MIT Media Lab)
4
Reflection: DB design
• Last lecture:
– Query language: how to ask questions
about the [relational] database?
– Mathematical query language:
Relational
Algebra
• This lecture
5
Review: Relational Algebra
• A query is applied to table(s), and
the result of a query is also a table.
• Find the names of sailors who have
reserved boat 103
6
Review: Relational Algebra
• Basic relational algebra operators:
–Selection
(σ, pronounced sigma):
Select a subset of rows from a table.
–
Projection
(π): Delete unwanted
columns from a table.
–
Cross-product
( X ): Combine two
tables.
–
Set-difference
( - ): Tuples in table 1,
but not in table 2.
7
Review: Relational Algebra
(more)
• Additional relational algebra operators:
– Intersection (∩) : Tuples in tables 1 and 2. – Join (∞): conditional cross product
– Division (/) – Renaming (p)
• Operations composed into complex
query expr
• English translation?
π
sid(
σ age > 20 Sailors)
–8
Relational Algebra to SQL
• Relational operators → SQL commands
Relational Algebra:
π
sname(
σ
bid = 103(Sailors
∞
Reserves))
SQL:
SELECT
S.sname
FROM
Sailors S, Reserves R
WHERE
S.sid=R.sid
AND
R.bid=103
9
SQL: Queries, Constraints,
Triggers
10
Lecture Outline
• Basic Query – SELECT • Set Constructs – UNION, INTERSECT,EXCEPT, IN, ANY, ALL, EXISTS
• Nested Queries
• Aggregate Operators
– COUNT, SUM, AVG,
MAX, MIN, GROUP BY, HAVING
• Null Values
• Integrity
Constraints
– CHECK, CREATE ASSERTION• Triggers
– CREATE TRIGGER,11
Example Table Definitions
Sailors(sid: integer, sname: string, rating: integer, age: real) Boats(bid: integer, bname: string, color: string) Reserves(sid: integer, bid: integer, day: date)
• Find names of sailors who’ve reserved
boat #103
SELECT S.sname
FROM Sailors S, Reserves R
12
Basic SQL Query
SELECT [DISTINCT] target-list
FROM relation-list
WHERE qualification
• Relation-list: A list of relation names (possibly
with range-variable after each name).
• Target-list: A list of attributes of relations in relation-list
• Qualification: conditions on attributes (<, >, =, and, or, not, etc.)
• DISTINCT: optional keyword for duplicate removal.
13
How to evaluate a query?
SELECT [DISTINCT] target-list FROM relation-list
WHERE qualification
• Conceptual query evaluation using relational
operators:
1) Compute the cross-product of relation-list.
2) Discard resulting tuples if they fail qualifications. 3) Delete attributes that are not in target-list.
(called column-list)
4) If DISTINCT is specified, eliminate duplicate rows.
• Only conceptual because of inefficiency
computation
14
Example of Conceptual
Evaluation (1)
si
d
sna
me
ratin
g
age
22 dusti
n
7
45.
0
31 lubb
er
8
55.
5
58 rusty 10
35.
0
(1) Compute the
cross-product of
relation-list.
SELECT S.sname
FROM Sailors S, Reserves R
WHERE S.sid=R.sid AND R.bid=103
Sailors
Reserves
sid bid day
22 101 10/10/9
6
58 103 11/12/9
6
15
Example of Conceptual
Evaluation (2)
(2) Discard tuples if
they fail
qualifications.
(sid) sname rating age (sid) bid day
22 dustin 7 45.0 22 101 10/10/96 22 dustin 7 45.0 58 103 11/12/96 31 lubber 8 55.5 22 101 10/10/96 31 lubber 8 55.5 58 103 11/12/96 58 rusty 10 35.0 22 101 10/10/96 58 rusty 10 35.0 58 103 11/12/96
SELECT S.sname
FROM Sailors S, Reserves R
WHERE S.sid=R.sid AND R.bid=103
16
Example of Conceptual
Evaluation (3)
(3) Delete attribute
columns that not
in target-list.
(sid) sname rating age (sid) bid day
22 dustin 7 45.0 22 101 10/10/96 22 dustin 7 45.0 58 103 11/12/96 31 lubber 8 55.5 22 101 10/10/96 31 lubber 8 55.5 58 103 11/12/96 58 rusty 10 35.0 22 101 10/10/96 58 rusty 10 35.0 58 103 11/12/96
Sailors X Reserves
SELECT S.sname
FROM Sailors S, Reserves R
17
A Note on Range Variables
• Really needed range variables only if the same relation appears twice in the FROM
clause.
SELECT S.sname
FROM Sailors as S, Reserves R
WHERE S.sid=R.sid AND bid=103 SELECT sname
FROM Sailors, Reserves WHERE Sailors.sid=Reserves.sid AND bid=103 OR SELECT sname FROM Sailors S, Reserves R1, Reserves R2
WHERE S.sid = R1.sid AND
S.sid = R2.sid AND
18
Find the sids of sailors who’ve
reserved at least one boat
(sid) sname rating age (sid) bid day
22 dustin 7 45.0 22 101 10/10/96 22 dustin 7 45.0 58 103 11/12/96 31 lubber 8 55.5 22 101 10/10/96 31 lubber 8 55.5 58 103 11/12/96 58 rusty 10 35.0 22 101 10/10/96 58 rusty 10 35.0 58 103 11/12/96 Sailors X Reserves SELECT
FROM WHERE SELECT
S.sid
FROM
Sailors S, Reserves R
19
DISTINCT
• Find the names and
ages of all sailors
SELECT S.names, S.ages
FROM Sailors S
• Add DISTINCT to this query?
• Replace S.sid by
S.sname in the SELECT clause?
• Add DISTINCT to the above? Sid Snam e Rating Age 22 Dustin 7 45.0 29 Brutus 1 33.0 31 Lubbe r 8 55.5 32 Andy 8 25.5 58 Rusty 10 35.0 64 Horati o 7 35.0 71 Zorba 10 16.0 74 Horati o 9 35.0 85 Art 3 25.5 95 Bob 3 63.5
20
Find sailors whose names begin
and end with B and contain at
least three characters.
SELECT S.age, age1=S.age-5, 2*S.age AS age2 FROM Sailors S
WHERE S.sname LIKE ‘B_ %B’
• AS and = are two ways to name fields in result.
• LIKE for string matching.
– `_’ for one character – `%’ for 0 or more characters. Sid Snam e Rating Age 22 Dusti n 7 45.0 29 Brutu s 1 33.0 31 Lubbe r 8 55.5 74 Horati o 9 35.0 85 Art 3 25.5 95 Bob 3 20 Ag e Age1 Age2 20 15 40
21
Find sid’s of sailors who’ve reserved
a red
or
a green boats.
SELECT DISTINCT S.sid
FROM Sailors S, Boats B, Reserves R WHERE S.sid=R.sid AND R.bid=B.bid
AND (B.color=‘red’ OR B.color=‘green’)
• UNION: work on two union-compatible sets of
tuples
SELECT S.sid
FROM Sailors S, Boats B, Reserves R
WHERE S.sid=R.sid AND R.bid=B.bid AND B.color=‘red’
UNION
SELECT S.sid
FROM Sailors S, Boats B, Reserves R
WHERE S.sid=R.sid AND R.bid=B.bid AND B.color=‘green’
22
Find sid’s of sailors who’ve
reserved a red and a green boat
SELECT S.sid
FROM Sailors S, Boats B, Reserves R
WHERE S.sid=R.sid AND R.bid=B.bid AND
B.color=‘red’
INTERSECT
SELECT S.sid
FROM Sailors S, Boats B, Reserves R
WHERE S.sid=R.sid AND R.bid=B.bid AND
B.color=‘green’
• What do we get if we replace INTERSECT by EXCEPT?
– (A Except B) returns tuples in A but not in B.
– Find sids of all sailors who have reserved a red boat but not a green boat.
23
SET Construct:
UNION
ALL
• UNION, INTERSECT, and EXCEPT delete
duplicate by default.
• To retain duplicates, use UNION ALL,
INTERSECT ALL, or EXCEPT ALL.
Sid Snam e 71 Zorba 74 Horati o 74 Horati o 95 Bob Sid Snam e 22 Dustin 71 Zorba 74 Horati o 74 Horati o INTERSEC T ALL = Sid Snam e 71 Zorba 74 Horati o 74 Horati o24
Nested Queries
• WHERE clause can itself contain an SQL subquery. (Actually, so can FROM and HAVING clauses.)
• Find names of sailors who’ve reserved boat #103: SELECT S.sname
FROM Sailors S
WHERE S.sid IN (SELECT R.sid
FROM Reserves R WHERE R.bid=103)
• (x IN B) returns true when x is in set B.
– To find sailors who’ve not reserved #103, use NOT IN.
• Nested loops Evaluation
– For each Sailors tuple, check the qualification by computing the subquery.
– Does the subquery result change for each new Sailor row?
Subquery: finds sids who have reserved bid 103
25
Nested Queries with
Correlation
SELECT S.sname FROM Sailors S
WHERE EXISTS (SELECT *
FROM Reserves R
WHERE R.bid=103 AND S.sid=R.sid )
• EXISTS
is another set comparison operator, like
IN
.
– (EXISTS S)
returns true when
Sis not empty.
• What is the above query in English?
– Find sailors who have reserved boat #103• In case of correlation, subquery must be
re-computed for each Sailors tuple.
Correlation: subquery finds all reservations for
bid 103 from current sid
26
Nested Queries with
UNIQUE
Sailors(sid: integer, sname: string, rating: integer, age: real) Boats(bid: integer, bname: string, color: string) Reserves(sid: integer, bid: integer, day: date)
• (UNIQUE S) returns true if S has no duplicate tuples or S is empty.
SELECT S.sname FROM Sailors S
WHERE UNIQUE (SELECT R.bid
FROM Reserves R
WHERE R.bid=103 AND S.sid=R.sid)
• What is the above query in English?
– Finds sailors with at most one reservation for boat #103.
• Replace R.bid with *?
– Finds sailors with at most one reservation for boat #103 in a given day. – (Simplify -> find all sailors)
27
More on Set-Comparison
Operators
• Have seen IN, EXISTS and UNIQUE. Can also use NOT IN, NOT EXISTS, and NOT UNIQUE.
• Also available: op ANY, op ALL, where op can be >, <, =,
≠, ≤, ≥
– (a > ANY B) returns true when a is greater than any one element in set B.
– (a > ALL B) returns true when a is greater than all elements in set B.
SELECT *
FROM Sailors S
WHERE S.rating > ANY (SELECT S2.rating
FROM Sailors S2
WHERE S2.sname=‘Horatio’)
• What is the above query in English?
– Find sailors whose rating is greater than that of some sailor called Horatio.
28
Find sid’s of sailors who’ve
reserved a red and a green boat
SELECT S.sidFROM Sailors S, Boats B, Reserves R
WHERE S.sid=R.sid AND R.bid=B.bid AND
B.color=‘red’
INTERSECT
SELECT S.sid
FROM Sailors S, Boats B, Reserves R
WHERE S.sid=R.sid AND R.bid=B.bid AND
B.color=‘green’
• Rewrite INTERSECT with IN.
29
Rewriting INTERSECT Using
IN
• Find sid’s of Sailors who’ve reserved
red but not green boats (EXCEPT)
– Replace IN with NOT IN.
SELECT S.sid
FROM Sailors S, Boats B, Reserves R
WHERE S.sid=R.sid AND R.bid=B.bid AND B.color=‘red’
AND S.sid
IN
(SELECT S2.sid
FROM Sailors S2, Boats B2, Reserves R2
WHERE S2.sid=R2.sid AND R2.bid=B2.bid
AND B2.color=‘green’)
Find sids
who’ve
reserved a
green boat
30
Division in SQL
• Find sailors who’ve
reserved all boats.
– Find all boats that have been reserved by a
sailor
– Compare with all boats – Do the sailor’s reserved
boats include all boats?
• Yes → include this sailor • No → exclude this sailor
• Can you do it the hard
way, without EXCEPT
& with NOT EXISTS?
SELECT S.sname FROM Sailors S
WHERE NOT EXISTS
((SELECT B.bid FROM Boats B) EXCEPT (SELECT R.bid FROM Reserves R WHERE R.sid=S.sid)) (A EXCEPT B) returns tuples in A but not in B.
31
Aggregate Operators
• COUNT (*)
• COUNT ( [DISTINCT] A)
– A is a column
• SUM ( [DISTINCT] A)
• AVG ( [DISTINCT] A)
• MAX (A)
• MIN (A)
• Count the number of sailors
SELECT COUNT (*)
FROM Sailors S
32
Find the average age of
sailors with rating = 10
Sailors(sid: integer, sname: string,
rating: integer, age: real)
SELECT
AVG (S.age)
FROM Sailors S
33
Count the number of
different sailor names
Sailors(sid: integer, sname: string,
rating: integer, age: real)
SELECT COUNT (DISTINCT S.sname)
FROM Sailors S
34
Find the age of the
oldest sailor
Sailors(sid: integer, sname: string,
rating: integer, age: real)
SELECT MAX(S.AGE)
35
Find name and age of the
oldest sailor(s)
SELECT S.sname, MAX (S.age) FROM Sailors S
• This is illegal, but why?
– Cannot combine a column with a value (unless we use GROUP BY)
SELECT S.sname, S.age FROM Sailors S WHERE S.age = (SELECT MAX (S2.age) FROM Sailors S2)
• Okay, but not supported in every system
– Convert a table (of a single aggregate value) into a single value for comparison
36
GROUP BY
and
HAVING
• So far, aggregate operators are applied to all (qualifying) tuples.
– Can we apply them to each of several groups of tuples?
• Example: find the age of the youngest sailor for each rating level.
– In general, we don’t know how many rating levels
exist, and what the rating values for these levels are!
– Suppose we know that rating values go from 1 to
10; we can write 10 queries that look like this:SELECT MIN (S.age) FROM Sailors S
WHERE S.rating = i
37
Find the age of the youngest
sailor for each rating level
SELECT S.rating, MIN
(S.age) as age
FROM Sailors S
GROUP BY S.rating
(1) The sailors tuples are
put into “same rating”
groups.
(2) Compute the Minimum
age for each rating group.
Sid Sna me Rating Age 22 Dusti n 7 45.0 31 Lubb er 8 55.5 85 Art 3 25. 5 32 Andy 8 25. 5 95 Bob 3 63. 5 Ratin g Age 3 25.5 3 63.5 7 45.0 8 55.5 8 25.5 Rati ng Age 3 25. 5 7 45. 0 8 25. 5 (1) (2)
38
Find the age of the youngest
sailor for each rating level
that
has at least 2 members
SELECT S.rating, MIN
(S.age) as age
FROM Sailors S
GROUP BY S.rating
HAVING COUNT(*) > 1
1. The sailors tuples are put
into “same rating” groups.
2. Eliminate groups that
have < 2 members.
3. Compute the Minimum
age for each rating group.
Sid Sna me Rating Age 22 Dusti n 7 45.0 31 Lubb er 8 55.5 85 Art 3 25. 5 32 Andy 8 25. 5 95 Bob 3 63. 5 Rati ng Age 3 25. 5 3 63. 5 7 45. 0 8 55. 5 8 25. 5 Rati ng Age 3 25. 5 8 25. 5
39
Queries With
GROUP BY
and
HAVING
SELECT [DISTINCT] target-list FROM relation-list
WHERE qualification GROUP BY grouping-list
HAVING group-qualification
• The target-list contains (i) attribute names (ii)
terms with aggregate operations (e.g., AVG (S.age)).
– The attribute list (i) in target-list must be in grouping-list. – The attributes in group-qualification must be in
grouping-list.
40
Say if Attribute list is not in
grouping-list
SELECT S.sname, S.rating, AVG (S.age) as age FROM Sailors S GROUP BY S.rating HAVING COUNT(*) > 1 Snam e Rating Age Dustin 3 25.5 Lubbe r 3 63.5 Art 7 45.0 Andy 8 55.5 Bob 8 25.5 Sna me Rating Age ? 3 44. 5 ? 8 40. 5 Sid Snam e Rating Age 22 Dusti n 7 45.0 31 Lubbe r 8 55.5 85 Art 3 25.5 32 Andy 8 25.5 95 Bob 3 63.541
Say if Group qualification is
not in grouping-list
SELECT S.rating,
AVG (S.age) as
age
FROM Sailors S
GROUP BY S.rating
HAVING
S.sname
≠
‘Dustin’
Snam e Rating Age Dustin 3 25.5 Lubbe r 3 63.5 Art 7 45.0 Andy 8 55.5 Bob 8 25.5 Rati ng Age Not in group-list ? Sid Snam e Rating Age 22 Dusti n 7 45.0 31 Lubbe r 8 55.5 85 Art 3 25.5 32 Andy 8 25.5 95 Bob 3 63.542
Conceptual Evaluation
• Without GROUP BY and HAVING:
– Compute cross-product of relation-list
– Remove tuples that fail qualification
– Delete unnecessary columns
• With GROUP BY and HAVING, continue with
– Partition remaining tuples into groups by the value of attributes in grouping-list (specified in GROUP-BY
clause)
– Remove groups that fail group-qualification
(specified in HAVING clause).
43
For each red boat, find the
number of reservations for
this boat
SELECT B.bid, COUNT(*) AS num_reservations FROM Boats B, Reserves R WHERE R.bid=B.bid AND B.color=‘red’ GROUP BY B.bid
SELECT B.bid, COUNT (*) AS num_reservations FROM Boats B, Reserves
R
WHERE R.bid=B.bid GROUP BY B.bid
HAVING B.color=‘red’
• Illegal, why?
– B.color does not appear in group-list
44
Find the age of the youngest
sailor with age > 18 for each
rating with
at least 2 sailors (of
any age)
SELECT S.rating, MIN(S.age) FROM Sailors S WHERE S.age > 18 GROUP BY S.rating HAVING COUNT(S) > 1 • What is wrong? – COUNT(*) is counting tuples after the
qualification (S.age > 18).
– Eliminate groups with multiple sailors but only one sailor with age > 18.
• How to fix it?
– Use subquery in the HAVING
clause.
SELECT S.rating, MIN (S.age) FROM Sailors S
WHERE S.age > 18 GROUP BY S.rating HAVING
1 < ANY (SELECT COUNT (*) FROM Sailors
S2
WHERE S.rating=S2.rating)
45
Find rating(s) for (which the
average age is the minimum)
over all rating groups
SELECT S.rating FROM Sailors S WHERE S.age =
(SELECT MIN (AVG (S2.age)) FROM Sailors S2 GROUP BY S2.rating) • What’s wrong? – Aggregate operations cannot be nested
• How to fix it?
SELECT Temp.rating
FROM (SELECT S.rating, AVG (S.age) AS avgage FROM Sailors S GROUP BY S.rating) AS Temp WHERE Temp.avgage = (SELECT MIN (Temp.avgage) FROM Temp) A temp table (rating, avg age)
46
Table Constraints
• Specify constraints over a single table
– Useful when more
general ICs than keys are involved.
• Constraints can be named.
CREATE TABLE Sailors ( sid INTEGER, sname CHAR(10), rating INTEGER, age REAL,
PRIMARY KEY (sid), CHECK ( rating >= 1
AND rating <= 10 )
CREATE TABLE Reserves ( sname CHAR(10), bid INTEGER,
day DATE,
PRIMARY KEY (bid,day),
CONSTRAINT noInterlakeRes CHECK (`Interlake’ ≠ ( SELECT B.bname FROM Boats B WHERE B.bid=bid))) The boat ‘Interlake’ cannot be reserved
47
Assertions: Constraints Over
Multiple Tables
CREATE TABLE Sailors ( sid INTEGER,
sname CHAR(10), rating INTEGER, age REAL,
PRIMARY KEY (sid), CHECK
( (SELECT COUNT (S.sid) FROM Sailors S)
+ (SELECT COUNT (B.bid) FROM Boats B) < 100 ) • Awkward and wrong! – If Sailors is empty, the number of Boats tuples can be anything! • ASSERTION is the right solution; not associated with either table.
CREATE ASSERTION smallClub CHECK
( (SELECT COUNT (S.sid) FROM Sailors S)
+ (SELECT COUNT (B.bid) FROM Boats B) < 100 ) Number of boats
plus number of sailors is < 100
48
Triggers
• Trigger: procedure that starts automatically if specified changes occur to the DBMS
• A trigger has three parts:
– Event (activates the trigger)
– Condition (tests whether the triggers should run)
– Action (what happens if the trigger runs)
CREATE TRIGGER incr_count AFTER INSERT ON Students // Event
WHEN (new.age < 18) // Condition FOR EACH ROW
BEGIN // ACTION: a procedure in Oracle’s PL/SQL syntax count := count + 1