Subsequence

3 min readJan 24, 2020

We have 1) subset 2) subarray and 3) subsequence problems. In this article, I am going to summary some subsequence problems.

792. Number of Matching Subsequences

https://leetcode.com/problems/number-of-matching-subsequences/

class Solution:
    def numMatchingSubseq(self, S: str, words: List[str]) -> int:
        s_i = collections.defaultdict(list)
        for i,ch in enumerate(S):
            s_i[ch].append(i)
        from functools import lru_cache
        @lru_cache(None)
        def good(i,j,word):
            if not word:return True
            w = word[0]
            if w not in s_i:return False
            begin_i = bisect.bisect(s_i[w], j)
            valid_idx = s_i[w][begin_i:]
            if not valid_idx:return False
            return good(i+1,valid_idx[0], word[1:])
            
        return sum(good(0,-1,w) for w in words)

Above solution is my first version naive DFS solution and got TLE

28 / 49 test cases passed.Status:

Time Limit Exceeded

Similar idea this code works

class Solution:
    def numMatchingSubseq(self, S: str, words: List[str]) -> int:
        def isMatch(word, w_i, d_i):
            if w_i == len(word): return True
            l = dict_idxs[word[w_i]]
            if len(l) == 0 or d_i > l[-1]: return False
            i = l[bisect.bisect_left(l, d_i)]
            return isMatch(word, w_i + 1, i + 1)dict_idxs = collections.defaultdict(list)
        for i in range(len(S)):
            dict_idxs[S[i]].append(i)
        return sum(isMatch(word, 0, 0) for word in words)

The above code is from

Simple Python solution - LeetCode Discuss

Last Edit: October 16, 2018 8:42 AM 4.3K VIEWS The idea is to use the String S and build a dictionary of indexes for…

leetcode.com

Finally, I realize that my original code is on the edge of AC.

This version is TLE

class Solution:
    def numMatchingSubseq(self, S: str, words: List[str]) -> int:
        s_i = collections.defaultdict(list)
        for i,ch in enumerate(S):
            s_i[ch].append(i)
        #from functools import lru_cache
        #@lru_cache(None)
        def good(i,j,word):
            if i==len(word):return True
            w = word[i]
            d = s_i[w]
            if not d or d[-1]<=j:return False
            valid_i = bisect.bisect(d, j)
            valid_idx = d[valid_i:]
            if not valid_idx:return False
            return good(i+1,valid_idx[0], word)
            #return good(i+1,d[valid_i], word)
            
        return sum(good(0,-1,w) for w in words)

If we don’t generate a valid_idx list. Instead, if we directly use valid_i, we will get AC. From this example, one lesson is learned. Do not slice a list from a large list, it is kind of time and memory consuming.

class Solution:
    def numMatchingSubseq(self, S: str, words: List[str]) -> int:
        s_i = collections.defaultdict(list)
        for i,ch in enumerate(S):
            s_i[ch].append(i)
        #from functools import lru_cache
        #@lru_cache(None)
        def good(i,j,word):
            if i==len(word):return True
            w = word[i]
            d = s_i[w]
            if not d or d[-1]<=j:return False
            valid_i = bisect.bisect(d, j)
            """
            valid_idx = d[valid_i:]
            if not valid_idx:return False
            return good(i+1,valid_idx[0], word)
            """
            return good(i+1,d[valid_i], word)
            
        return sum(good(0,-1,w) for w in words)

runtime

Runtime: 812 ms, faster than 33.27% of Python3 online submissions for Number of Matching Subsequences.Memory Usage: 53.2 MB, less than 25.00% of Python3 online submissions for Number of Matching Subsequences.

This problem can also be solved by using another elegant way. See Stefan's post

https://leetcode.com/problems/number-of-matching-subsequences/discuss/117634/Efficient-and-simple-go-through-words-in-parallel-with-explanation

class Solution:
    def numMatchingSubseq(self, S: str, words: List[str]) -> int:
        waiting = collections.defaultdict(list)
        for it in map(iter, words):
            waiting[next(it)].append(it)
        for c in S:
            for it in waiting.pop(c, ()):
                waiting[next(it, None)].append(it)
        return len(waiting[None])

Complexity analysis

Since ‘The length of S will be in the range of [1, 50000]’, in the worst case, we will have the occurrence of the letter occurring the most as 50000.

Stefan’s solution complexity is O(len(S) + O(numbers of letters in all words))
This one in the worst case:
O(len(S) + O(numbers of letters in all words)*log(len(S)))
For reference, math.log(50000, 2) = 15.609640474436812

Subsequence

792. Number of Matching Subsequences

Time Limit Exceeded

Simple Python solution - LeetCode Discuss

Last Edit: October 16, 2018 8:42 AM 4.3K VIEWS The idea is to use the String S and build a dictionary of indexes for…

Written by Jimmy (xiaoke) Shen

No responses yet