STRING MATCHING USING HORSPOOL'S ALGORITHM

Introduction

In string matching algorithm, we check whether a text T contains a pattern P(length of T>P) i.e to check whether P is a substring of T or not. Since we are trying to check a fixed string P, sometimes these algorithms are called exact string matching algorithms.

In horspool's method, we shift the pattern for matching by looking at the character(say c) of the text that is aligned against the last character of the pattern. This is the case even if character c itself matches its counterpart in the pattern. The pattern match checking occurs from right to left in this method.

Working Procedure

Do refer the code available the end of this section to understand the following theory.

  1. Let m and n be the lengths of pattern P and text T respectively. Initially the checking starts from mth position in T with mth position of P(last letter of P).
  2. General Possibilities (Say char ch1 from text is being compared with char ch2 from pattern).
    1. If ch1 doesn't occur in T then we can safely shift the pattern by it's length.
    2. If ch1 doesn't match with ch2 but it has occurrences in m-1 positions of P, then shift the pattern such that the rightmost occurrence of ch1 in the pattern aligns with the ch1 in the text.
    3. If ch1 matches with ch2 and other characters don't match and there are no other occurrences of ch1 in T then this case is similar to first case i.e. shift the pattern by it's entire length.
    4. If ch1 matches with ch2 and other characters don't match and there are other occurrences of ch1 in T then shift the pattern similar to case 2.
  3. Shift Table:
    1. Initialize 0 to all 128 characters in an array, table[].
    2. Loop from i=0 to m-1
    3. For the index, j=pattern[i]; assign m-i-1 in the table. That is table[ pattern[i] ] = m-i-1.
    4. This determines the amount of positions to shifted for each character in the pattern based on the right most occurrence of that character in the pattern
  4. String Matching:
    1. Loop from i=m-1 to n-1
      1. Loop until from k=0 to m-1.
        If pattern[m-k-1]==text[i-k]
        then increment k,
        else break the loop.
        (This itself is the comparison from right to left.)
      2. If k==m then the pattern was matched and the position(= i-m+1) can be returned.
      3. Else increment i by the value in table corresponding to the current character (from the text) being compared with the pattern.
    2. If you arrived at this step then it means that the pattern wasn't found, so return -1.

Time Complexity: O(n*m)

Code file

Please open this in your pc or with a compatible app in your mobile.

C++ Implementation for STRING MATCHING USING HORSPOOL'S ALGORITHM

That's it from this blog post. If you liked it then do share this blog with your friends or people who wanna get into programming world. Thank You!

Copyright © NStF Blogs 2021