• 沒有找到結果。

[08]CategoriesofOLAP

N/A
N/A
Protected

Academic year: 2021

Share "[08]CategoriesofOLAP"

Copied!
55
0
0

加載中.... (立即查看全文)

全文

(1)

Categories of OLAP

„

Categories of OLAP tools

„ MOLAP, ROLAP, HOLAP, DOLAP

„

OLAP extension to SQL

„ ROLLUP, CUBE, RANK() OVER, Windowing

(2)

2

Categories of OLAP Tools

„ OLAP tools are categorized

according to the architecture used to store and process

multi-dimensional data.

„ There are four main categories:

„ Multi-dimensional OLAP (MOLAP) „ Relational OLAP (ROLAP)

„ Hybrid OLAP (HOLAP) „ Desktop OLAP (DOLAP)

(3)

Multi-dimensional OLAP

(MOLAP)

„ Use specialized data structures and

multi-dimensional Database

Management Systems (MDDBMSs) to organize, navigate, and analyze data.

„ Data is typically aggregated and

stored according to predicted usage to enhance query performance.

(4)

4

Multi-dimensional OLAP

(MOLAP)

„ Use array technology and efficient

storage techniques that minimize the disk space requirements

through sparse data management.

„ Provides excellent performance

when data is used as designed, and the focus is on data for a specific

(5)

Multi-dimensional OLAP

(MOLAP)

„ Traditionally, require a tight

coupling with the application layer and presentation layer.

„ Recent trends segregate the OLAP

from the data structures through the use of published application programming interfaces (APIs).

(6)

6

Typical Architecture for

MOLAP Tools

Query on multidimensional database, cubes More storage, less time

(7)

MOLAP Tools - Development

Issues

„ Underlying data structures are

limited in their ability to support multiple subject areas and to

provide access to detailed data.

„ Navigation and analysis of data is

limited because the data is

designed according to previously determined requirements.

(8)

8

MOLAP Tools - Development

Issues

„ MOLAP products require a different

set of skills and tools to build and maintain the database, thus

increasing the cost and complexity of support.

(9)

Relational OLAP (ROLAP)

„ Fastest-growing style of OLAP

technology due to requirements to analyze ever-increasing amounts of data and the realization that

users cannot store all the data they require in MOLAP databases.

(10)

10

Relational OLAP (ROLAP)

„ Supports RDBMS products using a

metadata layer - avoids need to create a static multi-dimensional data structure - facilitates the

creation of multiple

multi-dimensional views of the two-dimensional relation.

(11)

Relational OLAP (ROLAP)

„ To improve performance, some

products use SQL engines to

support the complexity of

multi-dimensional analysis, while others recommend, or require, the use of highly denormalized database

(12)

12

Typical Architecture for

ROLAP Tools

Query on fact and dimension tables Less storage, more time

(13)

ROLAP Tools - Development

Issues

„ Performance problems associated

with the processing of complex

queries that require multiple passes through the relational data.

„ Middleware to facilitate the

development of multi-dimensional

applications. (Software that converts the two-dimensional relation into a multi-dimensional structure).

(14)

14

ROLAP Tools - Development

Issues

„ Development of an option to create

persistent, multi-dimensional

structures with facilities to assist in the administration of these

(15)

Hybrid OLAP (HOLAP)

„ Provide limited analysis capability,

either directly against RDBMS products, or by using an intermediate MOLAP

server.

„ Deliver selected data directly from the DBMS or via a MOLAP server to the

desktop (or local server) in the form of a data cube, where it is stored, analyzed, and maintained locally.

(16)

16

Hybrid OLAP (HOLAP)

„ Promoted as being relatively

simple to install and administer with reduced cost and

(17)

Typical Architecture for

HOLAP Tools

(18)

18

HOLAP Tools - Development

Issues

„ Architecture results in significant data redundancy and may cause problems for networks that support many users.

„ Ability of each user to build a custom data cube may cause a lack of data consistency among users.

„ Only a limited amount of data can be efficiently maintained.

(19)

Desktop OLAP (DOLAP)

„ Store the OLAP data in

client-based files and support multi-dimensional processing using a client multi-dimensional engine.

„ Requires that relatively small

extracts of data are held on client machines. They may be distributed in advance, or created on demand (possibly through the Web).

(20)

20

Desktop OLAP (DOLAP)

„ As with multi-dimensional

databases on the server, OLAP data may be held on disk or in RAM, however, some DOLAP

products allow only read access.

„ Most vendors of DOLAP exploit the

power of desktop PC to perform some, if not most,

(21)

Desktop OLAP (DOLAP)

„ The administration of a DOLAP

database is typically performed by a central server or processing

routine that prepares data cubes or sets of data for each user.

„ Once the basic processing is done,

each user can then access their portion of the data.

(22)

22

Typical Architecture for

DOLAP Tools

(23)

DOLAP Tools

-Development Issues

„ Provision of appropriate security controls to support all parts of the

DOLAP environment. Since the data is physically extracted from the system, security is generally implemented by limiting the information compiled into each cube. Once each cube is uploaded to the user's desktop, all additional

meta data becomes the property of the local user.

(24)

24

DOLAP Tools

-Development Issues

„ Reduction in the effort involved in

deploying and maintaining the DOLAP tools. Some DOLAP vendors now

provide a range of alternative ways of deploying OLAP data such as through e-mail, the Web or using traditional

client/server architecture.

„ Current trends are towards thin client machines.

(25)

OLAP Extensions to SQL

„ Advantages of SQL include that it is easy to learn, non-procedural,

free-format, DBMS-independent, and that it is a recognized international standard. „ However, major limitation of SQL is the

inability to answer routinely asked

business queries such as computing the percentage change in values between this month and a year ago or to

(26)

26

OLAP Extensions to SQL

„ Answer is ANSI adopted a set of

OLAP functions as an extension to SQL to enable these calculations as well as many others that used to

be impossible or even impractical within SQL.

„ IBM and Oracle jointly proposed

these extensions early in 1999 and they now form part of the current SQL standard, namely SQL: 2003.

(27)

OLAP Extensions to SQL

-RISQL

„ The extensions are collectively referred to as the ‘OLAP package’ and are described as follows:

„ Feature T431, ‘Extended Grouping capabilities’

„ SELECT..GROUP BY ROLLUP(columnlist) „ SELECT..GROUP BY CUBE(columnlist)

„ Feature T611, ‘Extended OLAP operators’

„ RANK() OVER (ORDER BY columnlist)

„ DENSE_RANK() OVER (ORDER BY columnlist)

(28)

28

Extended Grouping

Capabilities

„ Aggregation is a fundamental part of

OLAP. To improve aggregation capabilities the SQL standard

provides extensions to the GROUP BY clause such as the ROLLUP and CUBE functions.

(29)

Extended Grouping

Capabilities

„ ROLLUP supports calculations using

aggregations such as SUM, COUNT, MAX, MIN, and AVG at increasing levels of

aggregation, from the most detailed up to a grand total.

„ CUBE is similar to ROLLUP, enabling a

single statement to calculate all possible combinations of aggregations. CUBE can generate the information needed in cross-tabulation reports with a single query.

(30)

30

Extended Grouping

Capabilities

„ ROLLUP and CUBE extensions specify

exactly the groupings of interest in the GROUP BY clause and produces a single result set that is equivalent to a UNION ALL of differently grouped rows.

(31)

Extended Grouping

Capabilities

„ ROLLUP Extension to GROUP BY

„ enables a SELECT statement to calculate

multiple levels of subtotals across a specified group of dimensions. ROLLUP appears in the GROUP BY clause in a SELECT statement using the following format:

(32)

32

Extended Grouping

Capabilities

„ ROLLUP creates subtotals that roll up from the

most detailed level to a grand total, following a column list specified in the ROLLUP clause.

„ ROLLUP first calculates the standard

aggregate values specified in the GROUP BY clause and then creates progressively higher level subtotals, moving from right to left

through the column list until finally completing with a grand total.

(33)

Extended Grouping

Capabilities

„ ROLLUP creates subtotals at n + 1 levels, where n is the number of

grouping columns. For instance, if a query specifies ROLLUP on grouping columns of propertyType, yearMonth, and city (n = 3), the result set will

(34)

34

Example - Using the ROLLUP

Group Function

„

Show the totals for sales of flats or

houses by branch offices located in

Aberdeen, Edinburgh, or Glasgow

for the months of September and

October of 2004.

(35)

Example - Using the

ROLLUP Group Function

SELECT propertyType, yearMonth, city, SUM(saleAmount) AS sales

FROM Branch, PropertyFor Sale, PropertySale

WHERE Branch.branchNo = PropertySale.branchNo AND PropertyForSale.propertyNo =

PropertySale.propertyNo

AND PropertySale.yearMonth IN ('2004-08', '2004-09') AND Branch.city IN (‘Aberdeen’, ‘Edinburgh’,

‘Glasgow’)

(36)

36

Example - Using the

(37)

Extended Grouping

Capabilities

„ CUBE Extension to GROUP BY

„ CUBE takes a specified set of grouping

columns and creates subtotals for all of the possible combinations. CUBE

appears in the GROUP BY clause in a SELECT statement using the following format:

(38)

38

Extended Grouping

Capabilities

„ CUBE generates all the subtotals that

could be calculated for a data cube with the specified dimensions.

„ CUBE can be used in any situation

requiring cross-tabular reports. The data needed for cross-tabular reports can be generated with a single SELECT using

CUBE. Like ROLLUP, CUBE can be helpful in generating summary tables.

(39)

Extended Grouping

Capabilities

„ CUBE is typically most suitable

in queries that use columns

from multiple dimensions rather than columns representing

different levels of a single dimension.

(40)

40

Example - Using the CUBE

Group Function

„

Show all possible subtotals for

sales of properties by branches

offices in Aberdeen, Edinburgh,

and Glasgow for the months of

September and October of 2004.

(41)

Example - Using the CUBE

Group Function

SELECT propertyType, yearMonth, city, SUM(saleAmount) AS sales

FROM Branch, PropertyFor Sale, PropertySale WHERE Branch.branchNo = PropertySale.branchNo AND PropertyForSale.propertyNo = PropertySale.propertyNo AND PropertySale.yearMonth IN ('2004-08', '2004-09')

AND Branch.city IN (Aberdeen, Edinburgh, Glasgow)

(42)

42

Example - Using the CUBE Group Function

(43)

Elementary OLAP Operators

„ Supports a variety of operations such as rankings and window calculations. „ Ranking functions include

cumulative distributions, percent rank, and N-tiles.

„ Windowing allows the calculation of cumulative and moving aggregations using functions such as SUM, AVG, MIN, and COUNT.

(44)

44

Elementary OLAP Operators

„ Ranking Functions

„ Computes the rank of a record compared

to other records in the dataset based on the values of a set of measures. There are various types of ranking functions, including RANK and DENSE_RANK. The syntax for each ranking function is:

RANK( ) OVER (ORDER BY columnList) DENSE_RANK( ) OVER (ORDER BY

(45)

Elementary OLAP Operators

„ The difference between RANK

and DENSE_RANK is that

DENSE_RANK leaves no gaps in the sequential ranking sequence when there are ties for a

(46)

46

Example - Using the RANK

and DENSE_RANK Functions

„ Rank the total sales of properties for

branch offices in Edinburgh.

SELECT branchNo, SUM(saleAmount) AS sales,

RANK() OVER (ORDER BY SUM(saleAmount)) DESC AS ranking,

DENSE_RANK() OVER (ORDER BY

SUM(saleAmount)) DESC AS dense_ranking

FROM Branch, PropertySale WHERE Branch.branchNo =

PropertySale.branchNo

AND Branch.city = Edinburgh

(47)

Example - Using the RANK

and DENSE_RANK Functions

(48)

48

Elementary OLAP Operators

„ Supports a variety of operations such as rankings and window calculations. „ Ranking functions include

cumulative distributions, percent rank, and N-tiles.

„ Windowing allows the calculation of cumulative and moving aggregations using functions such as SUM, AVG, MIN, and COUNT.

(49)

Elementary OLAP Operators

„ Windowing Calculations

„ Can be used to compute

cumulative, moving, and centered aggregates. They return a value for each row in the table, which depends on other rows in the

(50)

50

Elementary OLAP Operators

„ Windowing Calculations

„ Can be used to compute cumulative,

moving, and centered aggregates. They return a value for each row in the table, which depends on other rows in the

corresponding window.

„ These aggregate functions provide

access to more than one row of a table without a self-join and can be used only in the SELECT and ORDER BY clauses of the query.

(51)

Example - Using Windowing

Calculations

„

Show the monthly figures and

three-month moving averages

and sums for property sales at

branch office B003 for the first

six months of 2004.

(52)

52

Example - Using Windowing

Calculations

SELECT yearMonth, SUM(saleAmount) AS monthlySales, AVG(SUM(saleAmount)) OVER (ORDER BY yearMonth, ROWS 2

PRECEDING) AS 3-month moving avg,

SUM(SUM(salesAmount)) OVER (ORDER BY yearMonth ROWS 2 PRECEDING)

AS 3-month moving sum FROM PropertySale

WHERE branchNo = B003

AND yearMonth BETWEEN ('2004-01' AND '2004-06)

GROUP BY yearMonth ORDER BY yearMonth;

(53)

Example - Using

(54)

54

Storage Comparison of

MOLAP & ROLAP

Cube

Items

Locations

Tim es

Sales Tim_ID Item_ID L_ID Sales

1 1 1 1795 1 2 1 1006 1 3 1 993 1 1 2 1237 1 2 2 1059 1 3 2 998 1 1 3 987 1 2 3 689 1 3 3 589 2 1 1 963 2 2 1 789 Time_ID Year Item_ID ItemName Location_ID Location 1 2005 2 2006 3 2007 1 Game 2 PC 3 Phone 1 DC 2 NY 3 SF Dimension tables

(55)

Storage Comparison of

MOLAP & ROLAP

Time_ID Year Quarter month

Dimension tables ROLAP MOLAP 50 rows Year 2005 2006 Quarter 1 1 Month 1 2

Class Elect Elect Subclass Home Home

Name PC PC

Country USA USA

L_name NY NY

Sales 1003 987

Item_ID Class subclass Name 50

L_ID Country L_Name

50

Fact table

Time_ID Item_ID L_ID Sales

參考文獻

相關文件

After students have had ample practice with developing characters, describing a setting and writing realistic dialogue, they will need to go back to the Short Story Writing Task

• helps teachers collect learning evidence to provide timely feedback & refine teaching strategies.. AaL • engages students in reflecting on & monitoring their progress

Robinson Crusoe is an Englishman from the 1) t_______ of York in the seventeenth century, the youngest son of a merchant of German origin. This trip is financially successful,

fostering independent application of reading strategies Strategy 7: Provide opportunities for students to track, reflect on, and share their learning progress (destination). •

Strategy 3: Offer descriptive feedback during the learning process (enabling strategy). Where the

How does drama help to develop English language skills.. In Forms 2-6, students develop their self-expression by participating in a wide range of activities

These include new curriculum and standards frameworks, a range of professional development programmes, autonomy for schools to select and manage staff, flexibility in the use

Study the following statements. Put a “T” in the box if the statement is true and a “F” if the statement is false. Only alcohol is used to fill the bulb of a thermometer. An