Prof. Michael Tsai 2013/2/19
What is a data structure?
• An organization of information, usually in memory, for better algorithm efficiency.
• Or, a way to store and organize data in order to facilitate access and modifications.
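• Data structures can be further divided into:
• Linear data structure: elements must be accessed sequentially (e.g., linked list, stack, queue)
• Non-linear data structure: elements can be accessed non-sequentially (e.g., tree, graph)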
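Example: the polynomial $2x^5 - 3x^2 + 11x + 27$ stored as an array of coefficients indexed by exponent:
index:        0   1   2   3   4   5   6
coefficient: 27  11  -3   0   0   2   0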
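The coefficient array above can be evaluated directly. A minimal sketch, assuming the coefficients are stored as long and using Horner's rule; the names example_coef and poly_eval are hypothetical and not from the slides:

```c
#include <assert.h>
#include <stddef.h>

/* Coefficients indexed by exponent: coef[k] multiplies x^k.
   This is the example array: 27 + 11x - 3x^2 + 2x^5. */
const long example_coef[7] = {27, 11, -3, 0, 0, 2, 0};

/* Evaluate a polynomial with n coefficients at x using Horner's rule.
   poly_eval is a hypothetical helper name, not from the slides. */
long poly_eval(const long coef[], size_t n, long x)
{
    long result = 0;
    assert(coef != NULL);
    /* Work from the highest exponent down to the constant term. */
    for (size_t k = n; k > 0; --k)
        result = result * x + coef[k - 1];
    return result;
}
```

Calling poly_eval(example_coef, 7, 2) computes 2·32 − 3·4 + 11·2 + 27 = 101. Storing every coefficient, including the zeros, keeps indexing trivial at the cost of space for sparse polynomials.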
What is an algorithm?
• An algorithm is any well-defined computational procedure that takes some value, or set of values, as input and produces some value, or set of values, as output.
• An algorithm is a tool for solving a well-specified computational problem.
• A computational problem specifies the desired input/output relationship.
• The algorithm describes a specific computational procedure for achieving that input/output relationship.
• Example:
• Sorting problem:
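• Input: A sequence of n numbers $\langle a_1, a_2, \ldots, a_n \rangle$
• Output: A permutation (reordering) $\langle a'_1, a'_2, \ldots, a'_n \rangle$ of the input sequence such that $a'_1 \le a'_2 \le \cdots \le a'_n$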
• An instance of the sorting problem:
• A sorting algorithm should return as output the sequence .
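All algorithms must satisfy the following criteria:
• Input: information supplied from outside (zero or more items)
• Output: the result produced (at least one)
• Definiteness: every instruction is clear and unambiguous
• Finiteness: in all cases (for every input), the algorithm terminates after a finite number of steps
• Effectiveness: every instruction must be simple enough to be carried out directly (it must be feasible)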
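As an illustration of the five criteria, here is a sketch of Euclid's algorithm with each criterion marked in the comments. Euclid's algorithm is not part of the slides; it is chosen here only as a small, familiar example, and the function name gcd is an assumption.

```c
#include <assert.h>

/* Euclid's algorithm, annotated with the five criteria.
   Input:  two non-negative integers a and b, not both zero.
   Output: their greatest common divisor. */
unsigned gcd(unsigned a, unsigned b)
{
    assert(a != 0 || b != 0);   /* gcd(0, 0) is left undefined in this sketch */
    while (b != 0) {            /* Finiteness: b strictly decreases on every pass */
        unsigned r = a % b;     /* Definiteness and effectiveness: b != 0, so % is well defined */
        a = b;
        b = r;
    }
    return a;
}
```

For example, gcd(48, 18) goes through the pairs (48, 18), (18, 12), (12, 6), (6, 0) and returns 6.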
Example
• Statement 1: “Is n=2 the largest value of n for which there exist positive integers x, y, and z such that $x^n + y^n = z^n$ has a solution?”
• Statement 2: “Store 5 divided by zero into x and go to statement ㄅ.”
• Which criterion do they violate?
• Input
• Output
• Definiteness
• Finiteness
• Effectiveness
• Answer: Definiteness and Effectiveness
Why are algorithms/data structure important?
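• They are used in every aspect of daily life.
• Q: If computers were infinitely fast and memory were free, would we still need to study data structures and algorithms?
• A: Yes. We would still need to make sure that our solution terminates (does not run forever) and produces the correct answer every time.
• In that hypothetical world any correct solution would do, so we would usually pick the one that is easiest to implement.
• But in the real world:
• Computers are not infinitely fast (computation takes time)
• Memory is not free (storing data takes space)
• So we need to learn how to use these resources well to solve problems!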
How do we describe an algorithm?
• Human language (English, Chinese, …)
• Programming language
• A mix of the above
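1. Get a frying pan
2. Get the salad oil
   1. Do we have oil?
      1. If yes, pour a teaspoon of salad oil into the pan
      2. If not, do we want to buy oil?
         1. If yes, go to PX Mart (全聯) and buy a bottle of salad oil
         2. If not, we will have to skip cooking for now.
3. Turn on the stove, …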
Example: Selection Sort
• Integers are stored in an array, list. The i-th integer is stored in list[i], 0 ≤ i < n.
• Solution: From those integers that are currently unsorted, find the smallest and place it next in the sorted list.
[Figure: successive states of the array as the smallest unsorted integer is swapped into place]
ㄅ ㄆ 1
1 ㄆ 2 ㄅ
1 2 ㄆ ㄅ
Sorting problem:
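Input: A sequence of n numbers $\langle a_1, a_2, \ldots, a_n \rangle$
Output: A permutation (reordering) $\langle a'_1, a'_2, \ldots, a'_n \rangle$ of the input sequence such that $a'_1 \le a'_2 \le \cdots \le a'_n$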
First attempt:
for (i=0; i<n; ++i) {
    /* Task 1 */ Examine list[i] to list[n-1] and suppose that the smallest integer is at list[min];
    /* Task 2 */ Interchange list[i] and list[min];
}
Task 2:
void swap(int *x, int *y) {
    int temp = *x;
    *x = *y;
    *y = temp;
}
Or
#define SWAP(x,y,t) ((t)=(x), (x)=(y), (y)=(t))
Task 1:
min=i;
for (j=i; j<n; ++j)
    if (list[j]<list[min]) min=j;
#include <stdio.h>
#include <stdlib.h>
#define MAX_SIZE 101
#define SWAP(x,y,t) ((t) = (x), (x) = (y), (y) = (t))
void sort(int [], int); /* selection sort */
int main(void) {
    int i, n;
    int list[MAX_SIZE];
    printf("Enter the number of numbers to generate: ");
    if (scanf("%d", &n) != 1 || n < 1 || n > MAX_SIZE) {
        fprintf(stderr, "Improper value of n\n");
        exit(EXIT_FAILURE);
    }
    for (i = 0; i < n; i++) { /* randomly generate numbers */
        list[i] = rand() % 1000;
        printf("%d ", list[i]);
    }
    printf("\n");
    sort(list, n);
    for (i = 0; i < n; i++) /* print the sorted numbers */
        printf("%d ", list[i]);
    printf("\n");
    return 0;
}
void sort(int list[], int n) {
    int i, j, min, temp;
    for (i = 0; i < n-1; i++) {
        min = i; /* index of the smallest integer seen so far */
        for (j = i+1; j < n; j++)
            if (list[j] < list[min])
                min = j;
        SWAP(list[i], list[min], temp);
    }
}
How do we prove that it is correct?
• [Theorem] Function sort(list,n) correctly sorts a set of n ≥ 1 integers. The result remains in list[0], …, list[n-1] such that list[0] ≤ list[1] ≤ … ≤ list[n-1].
• Proof:
When the outer for loop completes its iteration for i=q, we have list[q] ≤ list[r] for q < r < n. Further, on subsequent iterations, i>q and list[0] through list[q] are unchanged. Hence following the last iteration of the outer for loop (i.e., i=n-2), we have list[0] ≤ list[1] ≤ … ≤ list[n-1].
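The proof's key claim can also be checked at run time. The sketch below is a selection sort that asserts, after each outer iteration i=q, that list[q] is no larger than any element to its right. The helper names is_min_of_suffix, sort_checked, and is_sorted are hypothetical, and the assertion is a testing aid rather than a substitute for the proof.

```c
#include <assert.h>

/* Returns 1 if list[q] <= list[r] for every r with q < r < n, else 0. */
int is_min_of_suffix(const int list[], int q, int n)
{
    for (int r = q + 1; r < n; r++)
        if (list[q] > list[r])
            return 0;
    return 1;
}

/* Selection sort that checks the proof's claim after each outer iteration. */
void sort_checked(int list[], int n)
{
    for (int i = 0; i < n - 1; i++) {
        int min = i;
        for (int j = i + 1; j < n; j++)
            if (list[j] < list[min])
                min = j;
        int temp = list[i];
        list[i] = list[min];
        list[min] = temp;
        assert(is_min_of_suffix(list, i, n));  /* the claim for q = i */
    }
}

/* Returns 1 if list[0] <= list[1] <= ... <= list[n-1], else 0. */
int is_sorted(const int list[], int n)
{
    for (int i = 0; i + 1 < n; i++)
        if (list[i] > list[i + 1])
            return 0;
    return 1;
}
```

After sort_checked returns without an assertion failure, is_sorted(list, n) should report 1, which is the theorem's conclusion for that particular input.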
Example: Binary Search
• Input:
• searchnum: the number to be found
• list: sorted array, size n, and list[0] ≤ list[1] ≤ … ≤ list[n-1]
• Output:
• -1 if searchnum is not found in list
• the index of searchnum in list[] if searchnum is found
Example: searchnum=13;
value: 1 3 4 4 6 7 11 13 13 13 18 19
index: 0 1 2 3 4 5  6  7  8  9 10 11
left, middle, and right mark the range currently being searched:
middle=(left+right)/2;
left=middle+1;
Example: searchnum=5; (same list)
return -1;
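/* COMPARE yields -1, 0, or 1 when x is less than, equal to, or greater than y */
#define COMPARE(x,y) (((x) < (y)) ? -1 : ((x) == (y)) ? 0 : 1)
int binsearch(int list[], int searchnum, int left, int right) {
    int middle;
    while (left <= right) {
        middle = (left+right)/2;
        switch (COMPARE(list[middle], searchnum)) {
            case -1: left = middle+1; break;   /* search the right half */
            case 0: return middle;             /* found */
            case 1: right = middle-1;          /* search the left half */
        }
    }
    return -1;
}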
list: the array holding the sorted numbers
searchnum: the number to find
left, right: the left and right boundaries of the range currently being searched
What is a data type?
• A data type is a collection of objects and a set of operations that act on those objects.
• Each data type occupies a certain amount of memory and has a range of values it can represent
• Data types in C
• char, int, float, long, double (unsigned, signed, …)
• Array
• Structure (user-defined)
struct {
    int a;
    int b;
    char str[16];
    int *iptr;
} blah;
int iarray[16];
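The memory size of each type can be queried with sizeof. A small sketch, assuming the struct is given the hypothetical tag blah_t so it can be named; the actual byte counts depend on the platform and compiler, so none are stated here.

```c
#include <assert.h>
#include <stddef.h>

/* The struct from the slide, with a hypothetical tag added. */
struct blah_t {
    int a;
    int b;
    char str[16];
    int *iptr;
};

/* Size in bytes of the whole struct (may include padding). */
size_t blah_size(void)
{
    return sizeof(struct blah_t);
}

/* Size in bytes of an array of count ints. */
size_t int_array_size(size_t count)
{
    assert(count > 0);
    return count * sizeof(int);
}
```

The struct is at least as large as the sum of its members, and the compiler may add padding between them, which is one reason code should use sizeof instead of a hand-computed number.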
Operations of Data Types
• Operations
• +, -, *, /, %, ==
• =, +=, -=
• ? :
• sizeof, - (negative)
• giligulu(int a, int b)
Data Type
• Representation of the objects of the data type
• Example: char
• char blah='A'; ('A': ASCII code is 65 (dec), or 0x41 (hex))
Q: The maximum number which can be represented with a char variable?
A: 255 for an 8-bit unsigned char; 127 if char is signed (whether plain char is signed is implementation-defined).
• How about char, int, long, float?
1 byte of memory: 01000001
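The bit pattern and the limits can be inspected from C. This is a sketch assuming an ASCII execution character set; bit_of_char, max_plain_char, and max_unsigned_char are hypothetical helper names, while CHAR_BIT, CHAR_MAX, and UCHAR_MAX are the standard macros from <limits.h>.

```c
#include <assert.h>
#include <limits.h>

/* Returns bit k (0 = least significant) of the byte stored in c. */
int bit_of_char(char c, int k)
{
    assert(k >= 0 && k < CHAR_BIT);
    return ((unsigned char)c >> k) & 1;
}

/* Largest values a char-sized variable can hold, by signedness.
   The actual numbers are defined by the platform in <limits.h>. */
int max_plain_char(void)    { return CHAR_MAX; }
int max_unsigned_char(void) { return UCHAR_MAX; }
```

For 'A' (01000001), bits 0 and 6 are set and all other bits are clear.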
• Q: Do we need to know the representation of a data type?
• A: Not necessarily.
• Knowing the representation may let us design a more efficient algorithm.
• But once the representation of a data type is changed, the program may have to be re-verified, fixed, or completely rewritten.
• Porting to a different platform (x86, ARM, embedded system, …)
• Changing the specification of a program or library (e.g., 16-bit int → 32-bit long)
Abstract Data Type
• “Abstract Data Type” (ADT):
• Separate the specifications from the representation and the implementation
[Figure: User | Specification (Interface) | Representation and Implementation]
• Specifications:
• Operations:
• Name of the function and the description of what the function does
• The type of the argument(s)
• The type of the result(s) (return value)
• Data (usually hidden)
• Function categories:
• Creator/constructor
• Transformers
• Observer/reporter
Example
ADT NaturalNumber is
  objects: an ordered subrange of the integers starting at zero and ending at the maximum integer (INT_MAX) on the computer
  functions:
    for all x, y ∈ NaturalNumber; TRUE, FALSE ∈ Boolean
    and where +, -, <, and == are the usual integer operations
    NaturalNumber Zero() ::= 0
    Boolean IsZero(x) ::= if (x) return FALSE else return TRUE
    Boolean Equal(x, y) ::= if (x == y) return TRUE else return FALSE
    NaturalNumber Successor(x) ::= if (x == INT_MAX) return x else return x + 1
    NaturalNumber Add(x, y) ::= if ((x + y) <= INT_MAX) return x + y else return INT_MAX
    NaturalNumber Subtract(x, y) ::= if (x < y) return 0 else return x - y
end NaturalNumber
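One possible C realization of this specification is sketched below. Representing NaturalNumber as an int and Boolean as an int (1 for TRUE, 0 for FALSE) is an assumption of this sketch, since the ADT deliberately leaves the representation open.

```c
#include <assert.h>
#include <limits.h>

typedef int NaturalNumber;   /* assumed representation: an int in 0..INT_MAX */

NaturalNumber Zero(void) { return 0; }
int IsZero(NaturalNumber x) { return x == 0; }
int Equal(NaturalNumber x, NaturalNumber y) { return x == y; }

NaturalNumber Successor(NaturalNumber x)
{
    assert(x >= 0);
    return (x == INT_MAX) ? x : x + 1;
}

NaturalNumber Add(NaturalNumber x, NaturalNumber y)
{
    assert(x >= 0 && y >= 0);
    /* Tests x + y <= INT_MAX without computing x + y, which could overflow in C. */
    return (x <= INT_MAX - y) ? x + y : INT_MAX;
}

NaturalNumber Subtract(NaturalNumber x, NaturalNumber y)
{
    assert(x >= 0 && y >= 0);
    return (x < y) ? 0 : x - y;
}
```

Note that Add rearranges the specification's test: the condition (x + y) <= INT_MAX is fine as mathematics, but evaluating x + y on int operands could overflow, so the sketch compares x against INT_MAX - y instead. Users of the ADT see only the six functions and need not know that an int sits underneath.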
How do we evaluate whether a program is well written?
1. Does the program meet the original specifications of the task?
2. Does it work correctly?
3. Does the program contain documentation that shows how to use it and how it works?
4. Does the program effectively use functions to create logical units?
5. Is the program’s code readable?
6. Does the program efficiently use primary and secondary storage?
Primary storage: memory?
Secondary storage: Hard drive, flash disk, etc.
7.Is the program’s running time acceptable for the task?
Example: Network intrusion detection system
(1) 99.8% detection rate, 50 minutes to finish analysis of a minute of traffic
(2) 85% detection rate, 20 seconds to finish analysis of a minute of traffic
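Criteria 6 and 7 correspond to the two complexity measures:
6. Does the program use primary and secondary storage efficiently? → Space complexity
7. Is the program's running time acceptable for the task? → Time complexity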
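Space and time complexity
• Space complexity of a program:
• The total space (memory) the program needs to run to completion
• Time complexity of a program:
• The (execution) time the program needs to run to completion
• Goal: find out how the running time / space usage grows as the input size increases (how fast it grows)
• What is the input size?
• The "number of elements" in the input given by the problem, e.g.:
• Size of an array
• Degree of the highest-order term of a polynomial
• Dimensions of a matrix
• Number of bits in a binary number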
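Space complexity
• Space required by a program:
1. Fixed space
• Independent of the size and content of the input/output
2. Variable space
• Depends on the particular input instance I of the problem P being solved
• Depends on the extra space used by recursive functions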
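The difference between fixed and variable space shows up when summing an array iteratively versus recursively. A sketch under the assumption that each recursive call occupies one stack frame; sum_iter, sum_rec, and the level/depth bookkeeping are hypothetical names added to make the recursion depth observable.

```c
#include <assert.h>
#include <stddef.h>

/* Iterative sum: a fixed number of local variables regardless of n. */
long sum_iter(const int list[], int n)
{
    long total = 0;
    assert(n >= 0);
    for (int i = 0; i < n; i++)
        total += list[i];
    return total;
}

/* Recursive sum: one call per element plus the base case, so the extra
   space grows with n. level is the current call depth (start at 1);
   depth, if non-NULL, receives the deepest level reached. */
long sum_rec(const int list[], int n, int level, int *depth)
{
    assert(n >= 0);
    if (depth != NULL && level > *depth)
        *depth = level;
    if (n == 0)
        return 0;
    return sum_rec(list, n - 1, level + 1, depth) + list[n - 1];
}
```

For an array of 5 integers, sum_rec reaches a depth of 6 calls (one per element plus the base case), while sum_iter uses the same few variables for any n.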
Time complexity
• Time required by a program P:
• Compile time
• Execution time (run time)
• Compile time: fixed. (Exceptions?)
• C (and other compiled programming languages): One Compilation → Multiple Executions
• Run time:
• Depends on the characteristics of the input instance!
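To see how run time depends on the input instance and not only on its size, the sketch below counts comparisons in a sequential search. Sequential search is used here only as a simple illustration; seqsearch_count and its comparisons parameter are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Sequential search that reports how many comparisons it made. */
int seqsearch_count(const int list[], int n, int searchnum, int *comparisons)
{
    assert(comparisons != NULL);
    *comparisons = 0;
    for (int i = 0; i < n; i++) {
        ++*comparisons;
        if (list[i] == searchnum)
            return i;
    }
    return -1;
}
```

On the same 6-element array, finding the first element takes 1 comparison, while finding the last element or a missing value takes 6, even though the input size is identical.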
Reading:
• Cormen Chapter 1
• Horowitz Chapter 1.3-1.4