Problem
Given a string, find the length of the longest substring without repeating characters.
Examples:
Given "abcabcbb", the answer is "abc", with the length of 3.
Given "bbbbb", the answer is "b", with the length of 1.
Given "pwwkew", the answer is "wke", with the length of 3. Note that the answer must be a substring; "pwke" is a subsequence, not a substring.
In short: given a string, find its longest substring without repeating characters and return that substring's length.
My View
My first idea was brute force: two loops that check whether the condition holds, which led to the code below.
class Solution {
    public int lengthOfLongestSubstring(String s) {
        if(s.equals("")){
            return 0;
        }else{
            int count;
            int ecount = 0;
            for(int i = 0;i<s.length(); i++){
                count = 1;
                for(int j = i+1; j < s.length(); j++){
                    if(s.substring(i,j).contains(String.valueOf(s.charAt(j)))){
                        break;
                    }
                    count++;
                }
                if(count > ecount){
                    ecount = count;
                }
            }
            return ecount;
        }
    }
}
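Unfortunately, the result was 982/983: the last test case is so long that the run exceeded the time limit, so the submission was rejected.
My first reaction was to use a HashMap, because the best solution to the first problem I had just finished used one. But after a long time I still could not work it out. I know too little about HashMap... It feels like I learned Java for nothing, so I am leaving this as a to-do and going back to review.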
Solution
solution1
Here is the official solution.
To enumerate all substrings of a given string, we enumerate the start and end indices of them. Suppose the start and end indices are i and j, respectively. Then we have 0 ≤ i < j ≤ n (here end index j is exclusive by convention). Thus, using two nested loops with i from 0 to n - 1 and j from i + 1 to n, we can enumerate all the substrings of s.
To check if one string has duplicate characters, we can use a set. We iterate through all the characters in the string and put them into the set one by one. Before putting one character, we check if the set already contains it. If so, we return false. After the loop, we return true.
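In other words: build a double loop to obtain every substring, then pass each substring to a checking function to see whether it satisfies the condition. The checking function stores the characters seen so far in a set and tests each new character against that set.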
import java.util.HashSet;
import java.util.Set;

public class Solution {
    public int lengthOfLongestSubstring(String s) {
        int n = s.length();
        int ans = 0;
        // enumerate every substring s[i, j)
        for (int i = 0; i < n; i++)
            for (int j = i + 1; j <= n; j++)
                if (allUnique(s, i, j)) ans = Math.max(ans, j - i);
        return ans;
    }

    // returns true if s[start, end) contains no repeated character
    public boolean allUnique(String s, int start, int end) {
        Set<Character> set = new HashSet<>();
        for (int i = start; i < end; i++) {
            Character ch = s.charAt(i);
            if (set.contains(ch)) return false;
            set.add(ch);
        }
        return true;
    }
}
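Looking at the allUnique function: it creates a set, then walks the range being checked. A character that is not yet in the set is added; if the character is already there, the function returns false immediately.
Math.max then keeps track of the largest length found.
Because there are effectively three nested loops, the complexity is O(n^3), so the result is naturally [Time Limit Exceeded].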
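To make the O(n^3) claim concrete, here is a rough count, assuming allUnique(s, i, j) inspects at most j - i characters per call (an upper bound, since it can return early):

```latex
\sum_{i=0}^{n-1} \sum_{j=i+1}^{n} (j - i)
  = \sum_{m=1}^{n} \frac{m(m+1)}{2}
  = \frac{n(n+1)(n+2)}{6}
  = O(n^3)
```

For n = 2 this gives 4, matching the three substrings of lengths 1, 2 and 1.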
solution2
Next, the official second solution.
Sliding Window
The naive approach is very straightforward. But it is too slow. So how can we optimize it?
It opens with that line: obvious mockery...
In the naive approaches, we repeatedly check a substring to see if it has a duplicate character. But it is unnecessary. If a substring s_{ij} from index i to j - 1 is already checked to have no duplicate characters, we only need to check if s[j] is already in the substring s_{ij}.
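The point here is that once a substring has been checked and found to contain no repeated characters, there is no need to check it again. With the double loop, we were in fact re-checking substrings that had already been checked.
By using HashSet as a sliding window, checking whether a character is in the current window can be done in O(1).
This is the proposed fix: use a HashSet as a sliding window over the string. It lowers the complexity, and it is the key to the whole problem.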
A sliding window is an abstract concept commonly used in array/string problems.
So the sliding window is an abstract concept, mostly used for array and string problems.
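Back to our problem. We use HashSet to store the characters in the current window [i, j) (j == i initially). Then we slide the index j to the right. If s[j] is not in the HashSet, we slide j further, and keep doing so until s[j] is already in the HashSet. At this point, we have found the longest substring without duplicate characters that starts at index i. If we do this for all i, we get our answer.
Back to the problem: we use a hash set to hold the substring in the interval [i, j) as a window, then move the right edge. If the new character is not in the set, we keep going, until we hit a character that the set already contains, and record the duplicate-free length at that point. Moving the left edge of the window to the right then lets us cover every starting position, which finally gives the maximum.
The code follows.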
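import java.util.HashSet;
import java.util.Set;

public class Solution {
    public int lengthOfLongestSubstring(String s) {
        // length of the string
        int n = s.length();
        // the HashSet holds the characters of the current window
        Set<Character> set = new HashSet<>();
        int ans = 0, i = 0, j = 0;
        // walk through the string
        while (i < n && j < n) {
            // try to extend the range [i, j)
            // if the set does not contain the new character,
            // add it to the set
            if (!set.contains(s.charAt(j))){
                set.add(s.charAt(j++));
                // update the maximum length found so far
                ans = Math.max(ans, j - i);
            }
            else {
                // otherwise this window is done; move its left edge to the right
                set.remove(s.charAt(i++));
            }
        }
        return ans;
    }
}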
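Clearly this algorithm runs in O(n) time, since i and j each advance at most n times. The space complexity depends on how much the HashSet holds, which is bounded by the length of the string (and by the size of the character set).
This shows how useful hashing is for array and string problems: it can take the place of a nested loop and remove its cost. Below, the official solution also gives an improved version of the sliding window: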
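As a rough check of the time complexity, the sketch below instruments the same sliding-window loop with a step counter. The class name SlidingWindowSteps and the helper solveAndCount are hypothetical names made up for this illustration; they are not part of the official solution. Each iteration advances either i or j, so the counter should never exceed 2n.

```java
import java.util.HashSet;
import java.util.Set;

final class SlidingWindowSteps {
    // Hypothetical helper: returns {answer, number of loop iterations}.
    static int[] solveAndCount(String s) {
        int n = s.length();
        Set<Character> window = new HashSet<>();
        int ans = 0, i = 0, j = 0, steps = 0;
        while (i < n && j < n) {
            steps++; // every iteration moves exactly one of i or j
            if (!window.contains(s.charAt(j))) {
                window.add(s.charAt(j++));
                ans = Math.max(ans, j - i);
            } else {
                window.remove(s.charAt(i++));
            }
        }
        return new int[] {ans, steps};
    }

    public static void main(String[] args) {
        for (String s : new String[] {"abcabcbb", "bbbbb", "pwwkew"}) {
            int[] r = solveAndCount(s);
            System.out.println(s + ": length=" + r[0]
                    + ", steps=" + r[1] + ", 2n=" + 2 * s.length());
        }
    }
}
```

Running main prints the answer, the step count, and 2n for each of the three examples from the problem statement, so the bound can be compared by eye.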
import java.util.HashMap;
import java.util.Map;

public class Solution {
    public int lengthOfLongestSubstring(String s) {
        int n = s.length(), ans = 0;
        // maps each character to the index just after its latest occurrence
        Map<Character, Integer> map = new HashMap<>();
        // try to extend the range [i, j]
        for (int j = 0, i = 0; j < n; j++) {
            if (map.containsKey(s.charAt(j))) {
                // jump i past the previous occurrence, but never move it backward
                i = Math.max(map.get(s.charAt(j)), i);
            }
            ans = Math.max(ans, j - i + 1);
            map.put(s.charAt(j), j + 1);
        }
        return ans;
    }
}
Too deep for now; I will work through it slowly.
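One way to work through it is to trace the window step by step. The sketch below keeps the same logic as the HashMap version but records the window after every character. The class name JumpingWindowTrace, the method solve, and the trace format are hypothetical, invented only for this walkthrough. The input "abba" is chosen because it shows why Math.max is needed: when the second 'a' arrives, the map still points to index 1, which would move i backward without the guard.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class JumpingWindowTrace {
    // Same logic as the HashMap solution; appends one trace line per character.
    static int solve(String s, List<String> trace) {
        // character -> index just after its latest occurrence
        Map<Character, Integer> nextStart = new HashMap<>();
        int ans = 0;
        for (int j = 0, i = 0; j < s.length(); j++) {
            char c = s.charAt(j);
            Integer candidate = nextStart.get(c);
            if (candidate != null) {
                // jump the left edge, but never to the left of where it already is
                i = Math.max(candidate, i);
            }
            ans = Math.max(ans, j - i + 1);
            nextStart.put(c, j + 1);
            trace.add(c + ": window=[" + i + "," + j + "] ans=" + ans);
        }
        return ans;
    }

    public static void main(String[] args) {
        List<String> trace = new ArrayList<>();
        int result = solve("abba", trace);
        for (String line : trace) {
            System.out.println(line);
        }
        System.out.println("result=" + result);
    }
}
```

For "abba" the trace lines are "a: window=[0,0] ans=1", "b: window=[0,1] ans=2", "b: window=[2,2] ans=2", and "a: window=[2,3] ans=2". On the last step the stored value for 'a' is 1, but i stays at 2, so the window never stretches back over the repeated 'b'.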