# Longest Common String Subsequence

# Problem:

Write a function that takes two string arrays as input and returns the longest contiguous sequence of strings that appear in both.

For example, given the following two arrays:

`a = `**[**“/home”, “/register”, “/login”, “/user”, “/one”, “/two”**]**

b = **[**“/home”, “/red”, “/login”, “/user”, “/one”, “/pink”**]**

You should return the following:

**[**“/login”, “/user”, “/one”**]**

# Approach:

The key insight comes from asking: “Is there a recursive substructure to the problem that we can exploit?” and “Can we cache intermediate results of the recursive calls?”

- Consider locations
`i`

and`j`

in arrays`a`

and`b`

respectively. How can we find the longest common subsequence of arrays*starting at*`a[i]`

and`b[j]`

? - if
`a[i] == b[j]`

then the length of optimal subsequence is one plus the length of the longest common subsequence of arrays*starting at*`a[i+1]`

and`b[j+1]`

. - Otherwise, the optimal subsequence is empty.
- We can use
**Dynamic Programming**to cache intermediate computation of the recursive call in a hash table to be reused later.

Once you have the recursive function, the longest common subsequence can be found by repeatedly calling it on the two arrays:

# Iterative Implementation with Compact Loops:

What is the complexity of the above implementation? We call the `longest_recursive`

function for every `(i, j)`

pair of indices of the original arrays. Hence, the complexity is at least `O(MN)`

where `M`

and `N`

are the sizes of the two arrays. Although we are making further recursive calls to `longest_recursive`

for each `(i, j)`

, note that all but one of those calls return immediately by using a constant-time lookup in the table, and hence the total complexity is `O(MN)`

.

This complexity analysis underscores the limitations of the recursive implementation (although it’s pretty good to have, and easy to code). Here are a few others:

- The function signature of
`longest_recursive`

is long, and is a maintenance headache. A shorter function signature is always preferable. - The
`unordered_map`

does not guarantee a collision-free lookup, and in worst case, the map lookup might not be constant time, resulting in degradation of performance. - Runtime errors in recursive (especially stack depth exceeding/bad base cases) are hard to debug.
- For this problem, a single iterative function is sufficient to compute the
`max_count`

and update the table as well, making the recursive implementation slightly longer.

We can modify the `longest()`

method above and replace the recursive call as follows:

# Testing:

This problem demonstrates the use of `ElementsAre`

method of Gunit tests.

A few simple tests for empty lists:

Simple tests for identical arrays and for arrays with no common elements

And, finally the complex example in the problem statement: