String Matching
Hsuan-Tien Lin
Dept. of CSIE, NTU
June 19, 2012
String Matching
find the position that pat (first) shows up in string for i ← 0 to len(string) − 1 do
if pat matches strings[i, i + len(pat) − 1]
matching found end if
end for
matching not found
the IF takes O(m) for m = len(pat)
—can use heuristic on comparing begin and end first, but still O(m) in the worst case
so total O(n ∗ m) for n = len(string)
H.-T. Lin (NTU CSIE) String Matching 06/19/2012 1 / 6
Slow (Naive) Pattern Matching
i, j ← 0
while i < len(string) and j < len(pat) do if pat[j] == string[i]
i ← i + 1, j ← j + 1 (continue matching) else
i ← i − j + 1 j ← 0
(fail and totally go back) end if
end while
check matching status
see demo
“Jump” Pattern Matching
i, j ← 0
while i < len(string) and j < len(pat) do if pat[j] == string[i]
i ← i + 1, j ← j + 1 (continue matching) else
i ← i − min(jump, j) + 1 j ← 0
(fail and go to next possible starting point) end if
end while
check matching status
see demo
H.-T. Lin (NTU CSIE) String Matching 06/19/2012 3 / 6
Fast (Knuth-Morris-Pratt) Pattern Matching
i, j ← 0
while i < len(string) and j < len(pat) do if pat[j] == string[i]
i ← i + 1, j ← j + 1 (continue matching) else
i ← i
decrease j such that pat[0, j − 1] matches string[i − j, i − 1]
(fail but continue partially) end if
end while
check matching status
see demo
Donald Knuth 高德納
Ph.D., Caltech Math
Professor Emeritus, Stanford 1974 ACM A. M. Turing Award
(who is Turing and what is Turing Award?) 1995 IEEE John von Neumann Medal (who is von Neumann?)
For his major contributions to the analysis of algorithms and the design of programming languages, and in particular for his contributions to
“The Art of Computer Programming” through his well-known books in a continuous series by this title
H.-T. Lin (NTU CSIE) String Matching 06/19/2012 5 / 6
KMP Pattern Matching
number of increase i = O(len(string)) = number of increase j number of decrease j = O( number of increase j) because j ≥ 0 total: O(len(string)) IF the decrease step is O(1)