KMP string matching algorithm
KMP 算法本质
This is a very good KMP algorithm article if you can read Chinese: KMP algorithm. But it still doesn't clear state what known information is used to reduce comparison. I figured this out by myself. The "known information" is the longest unique substring.
Last Step
The naive solution what it fails to match ' ' with 'D', it will reset j = 0, and i = 5 to re-match the whole string from beginning. Therefore the "known information" S[5] == P[1] == B, S[6] == P[2] == C,
S[7] == P[3] == D, and S[8] == P[4] == A is discarded. The KMP algorithm is utilizing this info to reduce comparison. Since we know P[0] != P[1] != P[2] != P[3] and P[0] == P[4] == A, there is no necessary to compare P[0] with S[5], S[6], S[7], S[8]. Why? Because we know S[5] == P[1] == B, and P[0] != P[1], therefore, P[0] != S[5], similarly, P[0] cannot be S[6], S[7] either. Since P[0] == P[4] == A, we cannot skip S[8], that's why we start with S[8] as the next comparison point. The next array used to calculate the S[8] position.
This is a very good KMP algorithm article if you can read Chinese: KMP algorithm. But it still doesn't clear state what known information is used to reduce comparison. I figured this out by myself. The "known information" is the longest unique substring.
Last Step
The naive solution what it fails to match ' ' with 'D', it will reset j = 0, and i = 5 to re-match the whole string from beginning. Therefore the "known information" S[5] == P[1] == B, S[6] == P[2] == C,
S[7] == P[3] == D, and S[8] == P[4] == A is discarded. The KMP algorithm is utilizing this info to reduce comparison. Since we know P[0] != P[1] != P[2] != P[3] and P[0] == P[4] == A, there is no necessary to compare P[0] with S[5], S[6], S[7], S[8]. Why? Because we know S[5] == P[1] == B, and P[0] != P[1], therefore, P[0] != S[5], similarly, P[0] cannot be S[6], S[7] either. Since P[0] == P[4] == A, we cannot skip S[8], that's why we start with S[8] as the next comparison point. The next array used to calculate the S[8] position.



Comments
Post a Comment