These are a type of dynamic programming problem where we’re looking for common sequences of subelements within two or more elements. Most commonly, chars within a string.
Longest common substring
Given two strings , , return the largest substring present in both.
Solution
We’ll first solve the longest common suffix problem. Let be any strings with having the longest common suffix . Now, what is the longest common substring of ?
- If , then is the longest common suffix.
- Else, is the longest common substring, they have no common suffix. Let’s define the elements of our table.
- two-dimensional table , is index of first string, index of second
Pseudocode
Considering the problem in terms of suffices is easier than in substrings. The table stores suffix lengths, and the longest suffix of a prefix is the longest substring.
Longest common subsequence
Take the example “abcbdab”, = “bdcaba”. Subsequences are not contiguous, so for the example, subsequences “bcba” and “bdab” are valid. Let , , where is the longest common subsequence of and . Then if , and is the longest common subsequence. If , then or and is still the longest common subsequence.
Solution
Let’s define our recurrence as the longest common subsequence of and . If either of or is zero, we want zero because there is no possible subsequence. If two letters are equal, we want to add that character so we take the LCS of both strings without that character and add one for that character. Otherwise, the LCS is the max of the LCSs with one less character.
We take the max since if both letters are different, it may increase a previous LS in one, but not both.
Pseudocode
The runtime here is because of the table, where and .
Longest palindromic subsequence
Suppose you have as input one string and we want to find the longest palindromic subsequence. This is the same as .
Solution
We can construct the dp array with = LPS from . We have the base cases where empty strings and single chars are palindromic, so . If has the fact that , then it could be the ends of a palindrome, depending on , so we obtain the recurrence:
Pseudocode
The runtime is because it’s a two-dimensional table.