LC 139 - Word Break

Question

Given a string s and a dictionary of strings wordDict, return true if s can be segmented into a space-separated sequence of one or more dictionary words.

Note that the same word in the dictionary may be reused multiple times in the segmentation.

Example 1:

Input: s = "leetcode", wordDict = ["leet","code"]
Output: true

Explanation: Return true because “leetcode” can be segmented as “leet code”.

Example 2:

Input: s = "applepenapple", wordDict = ["apple","pen"]
Output: true

Explanation: Return true because “applepenapple” can be segmented as “apple pen apple”. Note that you are allowed to reuse a dictionary word.

Example 3:

Input: s = "catsandog", wordDict = ["cats","dog","sand","and","cat"]
Output: false

Constraints:

  • 1 <= s.length <= 300
  • 1 <= wordDict.length <= 1000
  • 1 <= wordDict[i].length <= 20
  • s and wordDict[i] consist of only lowercase English letters.
  • All the strings of wordDict are unique.

Solution

concept

top down (memoization)

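We run a DFS over start indices: dfs(i) answers whether the remainder s[i:] can be segmented. The key part is the loop for j in range(i, len(s)) inside each DFS call, which tries every end position j so that each candidate word s[i:j+1] is checked. Whenever a candidate is in the dictionary, we recurse on dfs(j+1). Caching the result for each i means every suffix is solved only once.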
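To see why the cache matters, here is a minimal sketch that counts how many times the DFS is entered with and without memoization. The helper name count_dfs_calls and the test input are made up for illustration; the input is chosen so that no segmentation exists and the plain DFS has to explore every split.

```python
from typing import Dict, List


def count_dfs_calls(s: str, word_dict: List[str], memoize: bool) -> int:
    """Return how many times dfs is entered while deciding if s can be segmented."""
    word_set = set(word_dict)
    cache: Dict[int, bool] = {}
    calls = 0

    def dfs(i: int) -> bool:
        nonlocal calls
        calls += 1
        if i == len(s):
            return True
        if memoize and i in cache:
            return cache[i]
        result = False
        # try every end position j for a word starting at i
        for j in range(i, len(s)):
            if s[i:j + 1] in word_set and dfs(j + 1):
                result = True
                break
        if memoize:
            cache[i] = result
        return result

    dfs(0)
    return calls


s = "a" * 12 + "b"  # the trailing "b" makes every attempt fail
words = ["a", "aa", "aaa"]
print("without memo:", count_dfs_calls(s, words, memoize=False))
print("with memo:   ", count_dfs_calls(s, words, memoize=True))
```

With the cache, each start index is expanded at most once, so the call count grows roughly linearly with len(s) instead of exponentially.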

bottom up

We solve from the back of the string. The subproblem is: can the remainder of the string starting at index i be broken up using wordDict?
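As a small illustration of the subproblem, the sketch below fills the table from the back for Example 1 and prints the indices where the suffix is breakable. The function name suffix_table is an assumption for this example, and it uses str.startswith with a start offset instead of slicing.

```python
from typing import List


def suffix_table(s: str, word_dict: List[str]) -> List[bool]:
    """dp[i] is True when s[i:] can be split into dictionary words."""
    dp = [False] * (len(s) + 1)
    dp[len(s)] = True  # the empty suffix is trivially breakable
    for i in range(len(s) - 1, -1, -1):
        for word in word_dict:
            # startswith(word, i) is False when the word runs past the end of s
            if s.startswith(word, i) and dp[i + len(word)]:
                dp[i] = True
                break
    return dp


table = suffix_table("leetcode", ["leet", "code"])
print([i for i, ok in enumerate(table) if ok])  # [0, 4, 8]
```

Index 8 is the base case, index 4 is reachable because "code" ends at 8, and index 0 is reachable because "leet" ends at 4, so the answer dp[0] is True.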

code

from collections import defaultdict
from typing import List


class Solution:
    """
    brute force: DFS that tries every end index j
    TLE
    """
    def wordBreak(self, s: str, wordDict: List[str]) -> bool:
        word_set = set(wordDict)

        def dfs(i):
            if i == len(s):
                return True
            
            for j in range(i, len(s)):
                if s[i:j+1] in word_set:
                    if dfs(j+1):
                        return True
            return False

        return dfs(0)
        
class Solution:
    """
    top down (memoization)
    """
    def wordBreak(self, s: str, wordDict: List[str]) -> bool:
        word_set = set(wordDict)
        cache = {len(s): True}  # cache[i]: whether s[i:] can be segmented

        def dfs(i):
            if i == len(s):
                return cache[i]
            if i in cache:
                return cache[i]
            
            for j in range(i, len(s)):
                if s[i:j+1] in word_set:
                    if dfs(j+1):
                        cache[i] = True
                        return cache[i] # True
            cache[i] = False
            return cache[i] # False

        return dfs(0)
        
class Solution:
    """
    brute force
    TLE
    similar to the above, but tries each word in wordDict at index i
    instead of iterating through every end index;
    this is slightly more efficient
    """
    def wordBreak(self, s: str, wordDict: List[str]) -> bool:

        def dfs(i):
            if i == len(s):
                return True

            for w in wordDict:
                if ((i + len(w)) <= len(s) and
                     s[i : i + len(w)] == w
                ):
                    if dfs(i + len(w)):
                        return True
            return False

        return dfs(0)
        
class Solution:
    """
    top down (memoization) of the above solution
    """
    def wordBreak(self, s: str, wordDict: List[str]) -> bool:
        memo = {len(s) : True}
        def dfs(i):
            if i in memo:
                return memo[i]

            for w in wordDict:
                if ((i + len(w)) <= len(s) and
                     s[i : i + len(w)] == w
                ):
                    if dfs(i + len(w)):
                        memo[i] = True
                        return True
            memo[i] = False
            return False

        return dfs(0)

class Solution:
    """
    brute force, with the dictionary grouped by word length
    """
    def wordBreak(self, s: str, wordDict: List[str]) -> bool:
        len_dict = defaultdict(set)
        for w in wordDict:
            len_dict[len(w)].add(w) # len -> set of words
        
        def dfs(i):
            if i == len(s):
                return True
            if i > len(s):
                return False

            for j in range(i, len(s)):
                piece = s[i:j+1]
                if piece in len_dict.get(len(piece), ()):  # .get avoids creating empty entries
                    if dfs(j+1):
                        return True
            return False
        
        return dfs(0)

class Solution:
    """
    top down (memoization) of the above solution
    """
    def wordBreak(self, s: str, wordDict: List[str]) -> bool:
        len_dict = defaultdict(set)
        for w in wordDict:
            len_dict[len(w)].add(w)
        cache = {}
        
        def dfs(i):
            if i in cache:
                return cache[i]
            if i == len(s):
                return True
            if i > len(s):
                return False

            for j in range(i, len(s)):
                piece = s[i:j+1]
                if piece in len_dict.get(len(piece), ()):  # .get avoids creating empty entries
                    if dfs(j+1):
                        cache[i] = True
                        return True
            cache[i] = False
            return False
        
        return dfs(0)

class Solution:
    """
    bottom up solution
    """
    def wordBreak(self, s: str, wordDict: List[str]) -> bool:
        # dp[i]: can s[i:] (from index i to the end of the string) be segmented using wordDict?
        dp = [False] * (len(s) + 1) # one extra space for base case, i.e. if you reach the end
        dp[-1] = True

        for i in range(len(s)-1, -1, -1):
            for word in wordDict:
                if i + len(word) <= len(s) and s[i:i+len(word)] == word:
                    dp[i] = dp[i + len(word)]  # may still be False here; a later word can set it to True
                if dp[i]: # slight optimisation
                    break

        return dp[0]

Complexity

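time: $O(nmt)$ for the solutions that iterate over wordDict at each index, where $n$ is the length of the string s, $m$ is the number of words in wordDict, and $t$ is the maximum length of a word in wordDict
space: $O(n)$ for the dp array or memo (plus the recursion stack in the top-down versions)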
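Since $t$ is at most 20 by the constraints, the index-iterating DFS does not have to scan all the way to the end of s: no candidate longer than the longest dictionary word can match. The following is a minimal sketch of that idea, not one of the solutions above; the names word_break_bounded and max_len are assumptions for this example.

```python
from typing import Dict, List


def word_break_bounded(s: str, word_dict: List[str]) -> bool:
    """Memoized DFS that only tries candidates up to the longest word's length."""
    word_set = set(word_dict)
    max_len = max(len(w) for w in word_dict)  # wordDict is non-empty by the constraints
    memo: Dict[int, bool] = {len(s): True}

    def dfs(i: int) -> bool:
        if i in memo:
            return memo[i]
        memo[i] = False
        # end is exclusive; stop at i + max_len or the end of s, whichever is first
        for end in range(i + 1, min(len(s), i + max_len) + 1):
            if s[i:end] in word_set and dfs(end):
                memo[i] = True
                break
        return memo[i]

    return dfs(0)


print(word_break_bounded("leetcode", ["leet", "code"]))  # True
print(word_break_bounded("catsandog", ["cats", "dog", "sand", "and", "cat"]))  # False
```

Each index then tries at most $t$ candidates, each costing up to $O(t)$ to slice and hash, which gives roughly $O(nt^2)$ time regardless of how many words are in wordDict.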

This post is licensed under CC BY 4.0 by the author.