KMP string matching

O(n + m)

Naive string search restarts the pattern from scratch after every mismatch, re-comparing characters it already matched. KMP avoids that. First it builds the LPS table (longest proper prefix that is also a suffix) for each prefix of the pattern — shown as small numbers under the pattern. Then it scans the text with two pointers: on a match both advance; on a mismatch, instead of moving the text pointer back, it slides the pattern forward by consulting the LPS, which says how much of the matched prefix is still usable. Because the text pointer never retreats, it runs in O(n + m). The LPS is the whole trick — it encodes the pattern's self-similarity so no comparison is ever wasted. Matches are highlighted green as they're found.

Text & pattern
Edit the input and press Play

How it works

  1. Precompute the LPS table — for each pattern prefix, its longest prefix that's also a suffix.
  2. Scan the text and pattern together; on a match, advance both.
  3. On a mismatch, slide the pattern forward using the LPS (don't move back in the text).
  4. When the whole pattern matches, record the occurrence and slide on to find more.

Pseudocode

1kmp(text, pattern):                 # O(n + m)2  build lps[]: lps[i] = length of the longest3    proper prefix of pattern[0..i] that is also a suffix4  i ← 0 (text), j ← 0 (pattern)5  while i < n:6    if text[i] = pattern[j]: i++, j++; if j = m: report match, j ← lps[j-1]7    else if j > 0: j ← lps[j-1]      # slide, keep matched prefix8    else: i++