These are a type of dynamic programming problem where we’re looking for common sequences of subelements within two or more elements. Most commonly, chars within a string.

Longest common substring

Given two strings , , return the largest substring present in both.

Solution

We’ll first solve the longest common suffix problem. Let be any strings with having the longest common suffix . Now, what is the longest common substring of ?

  • If , then is the longest common suffix.
  • Else, is the longest common substring, they have no common suffix. Let’s define the elements of our table.
  • two-dimensional table , is index of first string, index of second

Pseudocode

def lcsubstring(x, y):
	dp = [0; len(x) + 1][0; len(y) + 1]
	maximum = 0
	maxpos = (0, 0)
 
	for i in 1..(len(x) + 1):
		for j in 1..(len(y) + 1):
			if x[i-1] = y[j-1]:
				dp[i, j] = dp[i - 1, j - 1] + 1
			else
				dp[i, j] = 0
			if maximum < dp[i, j]:
				maximum = dp[i, j]
				maxpos = i, j

Considering the problem in terms of suffices is easier than in substrings. The table stores suffix lengths, and the longest suffix of a prefix is the longest substring.

Longest common subsequence

Take the example “abcbdab”, = “bdcaba”. Subsequences are not contiguous, so for the example, subsequences “bcba” and “bdab” are valid. Let , , where is the longest common subsequence of and . Then if , and is the longest common subsequence. If , then or and is still the longest common subsequence.

Solution

Let’s define our recurrence as the longest common subsequence of and . If either of or is zero, we want zero because there is no possible subsequence. If two letters are equal, we want to add that character so we take the LCS of both strings without that character and add one for that character. Otherwise, the LCS is the max of the LCSs with one less character.

We take the max since if both letters are different, it may increase a previous LS in one, but not both.

Pseudocode

def lcs(x, y):
	dp = [0; len(x)+1][0; len(y)+1]
 
	for i in 1..(len(x) + 1):
		for j in 1..(len(y) + 1):
			if x[i] = y[j]:
				dp[i, j] = dp[i-1, j-1] + 1
			else:
				dp[i, j] = dp[i, j-1]
 

The runtime here is because of the table, where and .

Longest palindromic subsequence

Suppose you have as input one string and we want to find the longest palindromic subsequence. This is the same as .

Solution

We can construct the dp array with = LPS from . We have the base cases where empty strings and single chars are palindromic, so . If has the fact that , then it could be the ends of a palindrome, depending on , so we obtain the recurrence:

Pseudocode

def lps(x):
	dp = [0; |x|][0; |x|]
 
	for i in 1..|x|:
		dp[i][i] = 1
 
	for s in range 1..|x|:
		for i in range |x| - s:
			j = i+s
			if x[i] = x[j]:
				dp[i][j] = 2 + dp[i+1][j-1]
			else:
				dp[i][j] = max(dp[i+1][j], dp[i][j-1])

The runtime is because it’s a two-dimensional table.