User:Scl758/draft-of-quicksearch-algorithm

Introduction
The Quick-Search algorithm is a simplification of, and an improvement on, the Boyer-Moore algorithm. It is a string-matching algorithm that compares a pattern with a text and determines whether the pattern is a substring of that text.

Description
The Quick-Search algorithm is built around the Quick-Search Bad-Character (QSBC) table. The comparison starts by aligning the first character of the pattern with the first character of the text and comparing the two from left to right. Whenever a character mismatch occurs, the pattern is shifted to the right by at least one character and by at most m+1 characters, where m is the number of characters in the pattern. The algorithm has two main parts: preprocessing, which builds the QSBC table, and searching, which compares the pattern with the text.

Preprocessing Phase: Quick-Search Bad-Character Table (QSBC)

In this phase, a QSBC table is built from the input pattern P. For each character x in P, the table records the position of the rightmost occurrence of x, counted from the right end of the pattern, so the last character of P has position 1. A character that does not occur in P is assigned the value m+1.
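The table construction can be sketched in Python as follows. This is a minimal illustration, not a reference implementation: the names `build_qsbc` and `shift_for` are hypothetical, and a dictionary with a default of m+1 stands in for a full table over the alphabet.

```python
def build_qsbc(pattern):
    """Map each character of `pattern` to its Quick-Search shift value."""
    m = len(pattern)
    table = {}
    for i, ch in enumerate(pattern):
        # Later (more rightward) occurrences overwrite earlier ones, so the
        # stored value is the position of the rightmost occurrence of `ch`,
        # counted from the right end of the pattern (last character = 1).
        table[ch] = m - i
    return table


def shift_for(table, ch, m):
    """Characters that do not occur in the pattern get the value m + 1."""
    return table.get(ch, m + 1)


table = build_qsbc("ACABTGC")
print(table)                     # {'A': 5, 'C': 1, 'B': 4, 'T': 3, 'G': 2}
print(shift_for(table, "H", 7))  # 8
```

Storing only the characters that occur in the pattern keeps the table small when the alphabet is large; an array indexed by character code would serve the same purpose.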

Searching Phase

Pattern P is compared with text T, starting from the left end of both. At each alignment the characters are compared from left to right. Whenever there is a mismatch, P is shifted to the right by the QSBC value of T[k+m], the text character immediately after the current alignment, where k is the position in T at which P currently starts. The shift is therefore at least 1 and at most m+1.

==Algorithm ==

'''Preprocessing Phase: Quick-Search Bad-Character Table (QSBC)'''

 QSBC[c] = min{ 0 < i ≤ m | P[m-i] = c }    if c exists in P
 QSBC[c] = m+1                              if c doesn't exist in P

Searching Phase:

 Input: pattern P, string of text T
 MatchFound ← false
 k ← 0
 m ← # of characters in P
 n ← # of characters in T

 while MatchFound = false AND k+m ≤ n
     i ← 0
     while i < m AND P[i] = T[k+i]
         i ← i + 1
     if i = m
         MatchFound ← true
     else if k+m < n
         k ← k + QSBC[T[k+m]]
     else
         k ← k + 1    (no character follows the pattern in T, so the search ends)
 if MatchFound = true
     return P is a substring of T
 else
     return P is not a substring of T
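The searching phase can be sketched in Python as follows. This is an illustrative version under a few assumptions: the function name `quick_search` is hypothetical, it returns the position of the first match (or -1) instead of a true/false answer, and it stops when no character follows the current alignment so that T[k+m] is never read past the end of the text.

```python
def quick_search(pattern, text):
    """Return the index of the first occurrence of pattern in text, or -1."""
    m, n = len(pattern), len(text)
    if m == 0:
        return 0  # assumption: the empty pattern matches at position 0
    # QSBC table: rightmost occurrence of each character, counted from the right.
    qsbc = {ch: m - i for i, ch in enumerate(pattern)}
    k = 0
    while k + m <= n:
        i = 0
        while i < m and pattern[i] == text[k + i]:
            i += 1
        if i == m:
            return k
        if k + m == n:
            break  # no character follows the alignment, nothing left to try
        # Shift by the QSBC value of the character just past the alignment.
        k += qsbc.get(text[k + m], m + 1)
    return -1


print(quick_search("ACABTGC", "BACGCHACABTGCHAA"))  # 6
print(quick_search("ACABTGC", "BACGCHAA"))          # -1
```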

Example
Inputs: P ← ACABTGC, T ← BACGCHACABTGCHAA
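The inputs above can be traced with a short Python sketch. The generator `trace_quick_search` is a hypothetical helper written for this example; it reports, for each attempt, the starting position k, the number of characters matched before the attempt ended, and the shift that was applied.

```python
def trace_quick_search(pattern, text):
    """Yield (k, matched, shift) per attempt; shift is None on the final attempt."""
    m, n = len(pattern), len(text)
    qsbc = {ch: m - i for i, ch in enumerate(pattern)}
    k = 0
    while k + m <= n:
        i = 0
        while i < m and pattern[i] == text[k + i]:
            i += 1
        if i == m or k + m == n:
            yield k, i, None  # match found, or no character left to shift on
            return
        step = qsbc.get(text[k + m], m + 1)
        yield k, i, step
        k += step


for k, matched, step in trace_quick_search("ACABTGC", "BACGCHACABTGCHAA"):
    print(k, matched, step)
# 0 0 1
# 1 2 5
# 6 7 None
```

At k = 0 the first characters differ, and the character after the alignment is C, which gives a shift of 1. At k = 1 two characters match before the mismatch, and the character after the alignment is A, which gives a shift of 5. At k = 6 all seven characters match, so P is a substring of T.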



Correctness
The QSBC table is constructed during the preprocessing phase and is used in the searching phase to decide how far to shift pattern P to the right. After an attempt at position k, every shift of between 1 and m positions places some character of P under T[k+m], the text character immediately after the current alignment. Such an alignment can only be a match if that character of P equals T[k+m], so shifting P until the rightmost occurrence of T[k+m] in P lines up with it cannot skip a match. If T[k+m] does not occur in P at all, no alignment that covers it can match, so P is shifted by m+1 positions, completely past that character. Skipping these alignments avoids comparisons that cannot succeed and reduces the search time.

Time Complexity
The preprocessing phase, which builds the QSBC table, has a running time of O(m+σ), where σ is the size of the alphabet: every entry is first set to m+1, and each of the m characters of pattern P is then processed once.

The searching phase has a worst-case time complexity of O(m•n), where n is the number of characters in T. The comparison begins at the start of text T. Whenever there is a mismatch, the QSBC table is consulted to shift the entire pattern P to the right. The number of shifts depends on the values in the QSBC table.
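The effect of the shift values on the running time can be illustrated by counting character comparisons. The sketch below is an assumption-laden illustration: `count_comparisons` is a hypothetical helper, and the two inputs were chosen by hand to show a slow case (every shift is 1) and a fast case (every shift is m+1).

```python
def count_comparisons(pattern, text):
    """Count character comparisons made by a Quick-Search style scan."""
    m, n = len(pattern), len(text)
    qsbc = {ch: m - i for i, ch in enumerate(pattern)}
    k = comparisons = 0
    while k + m <= n:
        i = 0
        while i < m:
            comparisons += 1
            if pattern[i] != text[k + i]:
                break
            i += 1
        if i == m or k + m == n:
            break  # match found, or no character follows the alignment
        k += qsbc.get(text[k + m], m + 1)
    return comparisons


# Slow case: each attempt matches two characters, then shifts by only 1.
print(count_comparisons("aaba", "a" * 20))  # 51
# Fast case: each attempt fails at once, then shifts by m + 1 = 5.
print(count_comparisons("aaba", "x" * 20))  # 4
```

In the slow case the work per attempt grows with m and the number of attempts grows with n, which is where the O(m•n) bound comes from. In the fast case only about n/(m+1) characters of the text are compared against the pattern.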