Preface
Okio is a lightweight I/O framework built by Square, the reigning champion of the Android world, and it is the foundation of the well-known networking framework OkHttp. Okio combines ideas from java.io and java.nio, provides both blocking and non-blocking I/O, and optimizes underlying structures such as buffering, letting you acquire, store and process data more easily and efficiently.
This article is a thorough walkthrough of the core parts of the Okio framework. Since Okio is small and compact—the core is only about 5,000 lines of code—the analysis proceeds bottom-up. We first discuss the shortcomings of Java IO and give an overview of Okio's overall structure, then examine each module in detail, including the buffering module and the timeout module. After that we trace the execution flow of blocking and non-blocking I/O through the source code, and finally summarize Okio's optimization ideas and design essence.
I am taking this opportunity to introduce this elegant I/O framework and to discuss some design questions with you. I hope this article gives you a good understanding of Okio—perhaps even enough to abandon the native Java IO classes and adopt this framework as your daily tool.
If you are not familiar with the basic I/O models (blocking I/O, non-blocking I/O, synchronous I/O, asynchronous I/O, multiplexing, BIO, NIO, AIO), here is some good background reading:
Linux IO模式及 select、poll、epoll詳解
Java NIO Tutorial
Java NIO - Ron Hitchens
Source code:
https://github.com/square/okio
Some of the images in this article may be hard to read; click an image to view the original.
The article is fairly long, so here is the overall outline first:
- Preface
- Starting from Java IO
- Okio framework structure
- The buffer structure
- The timeout mechanism
- The custom string class ByteString
- Flow analysis
- Summary
Starting from Java IO
A flood of independently extended decorators leads to class explosion
Anyone who has used Java IO knows how clumsy and heavyweight the streams feel. The main reason is that the Java IO hierarchy is built and extended with the decorator pattern. The resulting hierarchy is huge and complex: there are four base abstractions alone (InputStream, OutputStream, Reader, Writer), and to support every combination a large number of independently extended subclasses were created—one class per I/O need—so the number of subclasses explodes.
Below is a piece of Java IO code. Even a trivial requirement takes this much boilerplate; I suspect most of us have long been unhappy about it.
// Java IO
public static void writeTest(File file) {
try {
FileOutputStream fos = new FileOutputStream(file);
OutputStream os = new BufferedOutputStream(fos);
DataOutputStream dos = new DataOutputStream(os);
dos.writeUTF("write string by utf-8.\n");
dos.writeInt(1234);
dos.flush();
dos.close();
} catch (Exception e) {
e.printStackTrace();
}
}
Implementing the same functionality with Okio is clearly much easier. Okio's classes are deliberately designed to support chained calls; used properly, chaining produces concise, elegant, readable code. Many modern frameworks are designed this way—it has become something of a trend.
// Okio
public static void writeTest(File file) {
try {
Okio.buffer(Okio.sink(file))
.writeUtf8("write string by utf-8.\n")
.writeInt(1234).close();
} catch (Exception e) {
e.printStackTrace();
}
}
The bottleneck of blocking IO
The blocking nature of traditional Java sockets used to be one of the biggest constraints on the scalability of Java programs. Each socket connection required its own thread to manage it, which caused heavy thread switching and dragged performance down sharply. With non-blocking I/O, a single thread can manage all connections; non-blocking I/O is the foundation of many complex, high-performance programs.
Servers usually reach for non-blocking socket channels because they make it much easier to manage many sockets at once. But one or a few non-blocking socket channels can also be useful on the client side: with them, a GUI program can stay responsive to user input while maintaining sessions with one or more servers. Non-blocking mode is useful in many kinds of programs.
To address this, Java 1.4 added the nio package, which introduced concepts such as Buffer, Channel and Selector and implemented a non-blocking, multiplexed I/O model.
Okio takes a different path: it wraps the native Java streams and designs its own mechanism for non-blocking calls (the watchdog). As for why it builds on the native streams rather than Channels, I can only guess at the author's intent: Okio was designed mainly for network communication, and TCP/IP is itself a stream protocol, so the native Java streams remain the underlying layer. Using a watchdog instead of a Selector keeps the I/O machinery lighter, which suits mobile better.
Okio framework structure
Without further ado, here is the class diagram. It shows the core classes in Okio (some decorator and utility classes are omitted). If the image is hard to read, click it to enlarge.
As you can see, Okio's class diagram is very simple, which is precisely why Okio is so lightweight.
There are only two fundamental interfaces: Sink and Source, roughly playing the roles that OutputStream and InputStream play in the native API. They define only the most basic I/O operations.
The BufferedSink and BufferedSource interfaces extend Sink and Source respectively, enriching them with all kinds of read and write methods.
public interface BufferedSink extends Sink {
Buffer buffer();
BufferedSink write(ByteString byteString) throws IOException;
BufferedSink write(byte[] source) throws IOException;
BufferedSink write(byte[] source, int offset, int byteCount) throws IOException;
long writeAll(Source source) throws IOException;
BufferedSink write(Source source, long byteCount) throws IOException;
BufferedSink writeUtf8(String string) throws IOException;
BufferedSink writeUtf8(String string, int beginIndex, int endIndex) throws IOException;
BufferedSink writeString(String string, int beginIndex, int endIndex, Charset charset)
throws IOException;
BufferedSink writeByte(int b) throws IOException;
BufferedSink writeShort(int s) throws IOException;
BufferedSink writeShortLe(int s) throws IOException;
BufferedSink writeInt(int i) throws IOException;
BufferedSink writeIntLe(int i) throws IOException;
BufferedSink writeLong(long v) throws IOException;
BufferedSink writeLongLe(long v) throws IOException;
BufferedSink writeDecimalLong(long v) throws IOException;
BufferedSink writeHexadecimalUnsignedLong(long v) throws IOException;
@Override void flush() throws IOException;
BufferedSink emit() throws IOException;
BufferedSink emitCompleteSegments() throws IOException;
OutputStream outputStream();
}
public interface BufferedSource extends Source {
Buffer buffer();
boolean exhausted() throws IOException;
void require(long byteCount) throws IOException;
boolean request(long byteCount) throws IOException;
byte readByte() throws IOException;
short readShort() throws IOException;
short readShortLe() throws IOException;
int readInt() throws IOException;
int readIntLe() throws IOException;
long readLong() throws IOException;
long readLongLe() throws IOException;
long readDecimalLong() throws IOException;
long readHexadecimalUnsignedLong() throws IOException;
void skip(long byteCount) throws IOException;
ByteString readByteString() throws IOException;
ByteString readByteString(long byteCount) throws IOException;
int select(Options options) throws IOException;
byte[] readByteArray() throws IOException;
byte[] readByteArray(long byteCount) throws IOException;
int read(byte[] sink) throws IOException;
void readFully(byte[] sink) throws IOException;
int read(byte[] sink, int offset, int byteCount) throws IOException;
void readFully(Buffer sink, long byteCount) throws IOException;
long readAll(Sink sink) throws IOException;
String readUtf8() throws IOException;
String readUtf8(long byteCount) throws IOException;
@Nullable String readUtf8Line() throws IOException;
String readUtf8LineStrict() throws IOException;
String readUtf8LineStrict(long limit) throws IOException;
int readUtf8CodePoint() throws IOException;
String readString(Charset charset) throws IOException;
String readString(long byteCount, Charset charset) throws IOException;
long indexOf(byte b) throws IOException;
long indexOf(byte b, long fromIndex) throws IOException;
long indexOf(byte b, long fromIndex, long toIndex) throws IOException;
long indexOf(ByteString bytes) throws IOException;
long indexOf(ByteString bytes, long fromIndex) throws IOException;
long indexOfElement(ByteString targetBytes) throws IOException;
long indexOfElement(ByteString targetBytes, long fromIndex) throws IOException;
boolean rangeEquals(long offset, ByteString bytes) throws IOException;
boolean rangeEquals(long offset, ByteString bytes, int bytesOffset, int byteCount)
throws IOException;
InputStream inputStream();
}
Buffer implements both BufferedSink and BufferedSource. It is the class where everything comes together, and it also adds a number of data-processing operations, making it a readable, writable, data-processing buffer. Buffer's data operations rely on the ByteString class, which works alongside it. For reasons of space, only the declarations of the methods Buffer adds are listed below; you can read the implementations in the source yourself.
public final class Buffer implements BufferedSource, BufferedSink, Cloneable {
@Nullable Segment head;
long size;
public long size();
public Buffer copyTo(OutputStream out) throws IOException;
public Buffer copyTo(OutputStream out, long offset, long byteCount) throws IOException;
public Buffer copyTo(Buffer out, long offset, long byteCount);
public Buffer writeTo(OutputStream out) throws IOException;
public Buffer writeTo(OutputStream out, long byteCount) throws IOException;
public Buffer readFrom(InputStream in) throws IOException;
public Buffer readFrom(InputStream in, long byteCount) throws IOException;
private void readFrom(InputStream in, long byteCount, boolean forever) throws IOException;
public byte getByte(long pos);
int selectPrefix(Options options);
public void clear();
Segment writableSegment(int minimumCapacity);
List<Integer> segmentSizes();
public ByteString md5();
public ByteString sha1();
public ByteString sha256();
public ByteString sha512();
private ByteString digest(String algorithm);
public ByteString hmacSha1(ByteString key);
public ByteString hmacSha256(ByteString key);
public ByteString hmacSha512(ByteString key);
private ByteString hmac(String algorithm, ByteString key);
public ByteString snapshot();
public ByteString snapshot(int byteCount);
}
RealBufferedSink and RealBufferedSource are the implementations of BufferedSink and BufferedSource. They implement every interface method and each holds a Buffer internally; they are the classes that actually perform buffered reading and writing.
The Okio class acts as a simple factory: it is the public entry point and can produce all kinds of Sinks and Sources.
Buffer's storage container is not an array but a circular linked list of Segment objects. Segment uses the flyweight pattern, with SegmentPool managing the Segment instances.
The timeout module consists mainly of Timeout and its subclass AsyncTimeout.
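To make the structure concrete, here is the read-side counterpart of the earlier write example—a minimal sketch built only from the factory methods and interfaces just described (the file is whatever you pass in):
// Okio: read an entire file as a UTF-8 string.
public static String readTest(File file) throws IOException {
  // Okio.source(file) wraps a FileInputStream; Okio.buffer(...) adds the Buffer-backed decorator.
  try (BufferedSource source = Okio.buffer(Okio.source(file))) {
    return source.readUtf8(); // reads until exhausted and decodes as UTF-8
  }
}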
The buffer structure
The buffer is the most important part of Okio; many of its optimization ideas live here and are well worth studying. Okio's buffer design strikes a balance between CPU usage and memory usage—a time-versus-space trade-off—and is both clever and efficient.
The buffering module consists of three classes: Buffer, Segment and SegmentPool, whose relationship is shown in the figure below. The container that actually stores data inside a Buffer is a circular linked list of Segments. Segments that are temporarily unused are kept by SegmentPool in a singly linked list, which avoids frequent GC and memory churn and increases reuse and efficiency.
Segment is the basic unit of storage and also a node in the linked list. Its source is as follows.
final class Segment {
static final int SIZE = 8192;
static final int SHARE_MINIMUM = 1024;
final byte[] data;
int pos;
int limit;
boolean shared;
boolean owner;
Segment next;
Segment prev;
Segment() {
this.data = new byte[SIZE];
this.owner = true;
this.shared = false;
}
Segment(Segment shareFrom) {
this(shareFrom.data, shareFrom.pos, shareFrom.limit);
shareFrom.shared = true;
}
Segment(byte[] data, int pos, int limit) {
this.data = data;
this.pos = pos;
this.limit = limit;
this.owner = false;
this.shared = true;
}
public @Nullable Segment pop() {
Segment result = next != this ? next : null;
prev.next = next;
next.prev = prev;
next = null;
prev = null;
return result;
}
public Segment push(Segment segment) {
segment.prev = this;
segment.next = next;
next.prev = segment;
next = segment;
return segment;
}
public Segment split(int byteCount) {
if (byteCount <= 0 || byteCount > limit - pos) throw new IllegalArgumentException();
Segment prefix;
if (byteCount >= SHARE_MINIMUM) {
prefix = new Segment(this);
} else {
prefix = SegmentPool.take();
System.arraycopy(data, pos, prefix.data, 0, byteCount);
}
prefix.limit = prefix.pos + byteCount;
pos += byteCount;
prev.push(prefix);
return prefix;
}
public void compact() {
if (prev == this) throw new IllegalStateException();
if (!prev.owner) return; // Cannot compact: prev isn't writable.
int byteCount = limit - pos;
int availableByteCount = SIZE - prev.limit + (prev.shared ? 0 : prev.pos);
if (byteCount > availableByteCount) return; // Cannot compact: not enough writable space.
writeTo(prev, byteCount);
pop();
SegmentPool.recycle(this);
}
public void writeTo(Segment sink, int byteCount) {
if (!sink.owner) throw new IllegalArgumentException();
if (sink.limit + byteCount > SIZE) {
// We can't fit byteCount bytes at the sink's current position. Shift sink first.
if (sink.shared) throw new IllegalArgumentException();
if (sink.limit + byteCount - sink.pos > SIZE) throw new IllegalArgumentException();
System.arraycopy(sink.data, sink.pos, sink.data, 0, sink.limit - sink.pos);
sink.limit -= sink.pos;
sink.pos = 0;
}
System.arraycopy(data, pos, sink.data, sink.limit, byteCount);
sink.limit += byteCount;
pos += byteCount;
}
}
A Segment is divided into three regions by pos and limit, as shown in the figure below. The red region holds data that has already been read and is now stale; the green region holds data that has been written but not yet read; the yellow region is unused and can receive new data. This design is modeled on the buffers in java.nio, but it is more convenient: java.nio buffers require extra bookkeeping calls—switching from writing to reading means calling flip(), for example—so users have to understand the buffer's internals before they can use it. Okio's design is transparent: you can use it without knowing anything about the underlying structure.
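The arithmetic behind those three regions is simple. The following sketch just spells it out with made-up index values (Segment itself is package-private, so this is illustration, not API):
// Hypothetical values for one segment's indices.
final int SIZE = 8192;            // Segment.SIZE
int pos = 1024;                   // start of unread data
int limit = 4096;                 // end of unread data / start of free space
int consumed = pos;               // [0, pos): already read, effectively garbage
int readable = limit - pos;       // [pos, limit): data waiting to be read  -> 3072
int writable = SIZE - limit;      // [limit, SIZE): room left for new writes -> 4096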
Segment provides the following operations:
public Segment push(Segment segment)
Inserts segment after the node on which it is called and returns the newly inserted node.
public @Nullable Segment pop()
Removes the calling node from the doubly linked list and returns its successor, or null if it was the only node left (the list becomes empty).
public Segment split(int byteCount)
Splits one node into two: the first node gets the data in [pos, pos + byteCount) of the original node, and the second keeps [pos + byteCount, limit); the first node is returned, as shown in the figure below.
Note the trick here. The first node is newly created: if it would hold at least SHARE_MINIMUM (1024) bytes, the copy constructor is used, which performs a shallow copy—both nodes reference the same data array—so no memory allocation or copying is needed. If it would hold less, a node is taken from SegmentPool and the bytes are actually copied. In the author's own words: "Avoid short shared segments. These are bad for performance because they are readonly and may lead to long chains of short segments." Clearly a deliberate trade-off.
public void compact()
Merges nodes. If the predecessor is not shared and the two nodes' combined data fits within a single segment (SIZE, 8192 bytes), the calling node's data is written into the predecessor and the node itself is recycled.
public void writeTo(Segment sink, int byteCount)
Writes byteCount readable bytes from the calling node into sink. If there is not enough room after sink.limit, sink's data is first shifted to the front of its array (moved forward by sink.pos positions) to make space.
SegmentPool is very simple: it keeps temporarily unused Segments in a singly linked list. The pool is capped at 64 KiB, i.e. at most 8 Segments. It exposes two synchronized methods for taking and recycling Segments.
final class SegmentPool {
static final long MAX_SIZE = 64 * 1024; // 64 KiB.
static @Nullable Segment next;
static long byteCount;
private SegmentPool() {
}
static Segment take() {
synchronized (SegmentPool.class) {
if (next != null) {
Segment result = next;
next = result.next;
result.next = null;
byteCount -= Segment.SIZE;
return result;
}
}
return new Segment(); // Pool is empty. Don't zero-fill while holding a lock.
}
static void recycle(Segment segment) {
if (segment.next != null || segment.prev != null) throw new IllegalArgumentException();
if (segment.shared) return; // This segment cannot be recycled.
synchronized (SegmentPool.class) {
if (byteCount + Segment.SIZE > MAX_SIZE) return; // Pool is full.
byteCount += Segment.SIZE;
segment.next = next;
segment.pos = segment.limit = 0;
next = segment;
}
}
}
The place where Segments are actually split and merged is Buffer's write(Buffer source, long byteCount) method, which moves the first byteCount bytes of the given source Buffer into the Buffer on which it is called. Since both Buffers store their data in circular linked lists, the write proceeds by taking nodes off the source list from head to tail, appending them to the destination list, and then checking whether the newly appended node can be merged with its predecessor. If only part of a Segment needs to be written, that Segment is split first so that exactly the required bytes can be moved.
public final class Buffer implements BufferedSource, BufferedSink, Cloneable {
// ...
@Override
public void write(Buffer source, long byteCount) {
if (source == null) throw new IllegalArgumentException("source == null");
if (source == this) throw new IllegalArgumentException("source == this");
checkOffsetAndCount(source.size, 0, byteCount);
while (byteCount > 0) {
// Is a prefix of the source's head segment all that we need to move?
if (byteCount < (source.head.limit - source.head.pos)) {
Segment tail = head != null ? head.prev : null;
if (tail != null && tail.owner
&& (byteCount + tail.limit - (tail.shared ? 0 : tail.pos) <= Segment.SIZE)) {
// Our existing segments are sufficient. Move bytes from source's head to our tail.
source.head.writeTo(tail, (int) byteCount);
source.size -= byteCount;
size += byteCount;
return;
} else {
source.head = source.head.split((int) byteCount);
}
}
// Remove the source's head segment and append it to our tail.
Segment segmentToMove = source.head;
long movedByteCount = segmentToMove.limit - segmentToMove.pos;
source.head = segmentToMove.pop();
if (head == null) {
head = segmentToMove;
head.next = head.prev = head;
} else {
Segment tail = head.prev;
tail = tail.push(segmentToMove);
tail.compact();
}
source.size -= movedByteCount;
size += movedByteCount;
byteCount -= movedByteCount;
}
}
}
At this point, Okio's buffer structure should be quite clear.
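Before moving on to timeouts, here is a minimal sketch of Buffer used on its own: since it implements both BufferedSink and BufferedSource, whatever you write into it can be read straight back out (the values are arbitrary):
// Buffer as a standalone, in-memory sink + source.
Buffer buffer = new Buffer();
buffer.writeUtf8("write string by utf-8.\n");
buffer.writeInt(1234);
System.out.println(buffer.readUtf8Line()); // "write string by utf-8."
System.out.println(buffer.readInt());      // 1234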
The timeout mechanism
The base class: Timeout
Okio uses the Timeout class to put time limits on I/O operations. The mechanism supports two ways of expressing a limit—a duration (timeout) and an absolute point in time (deadline)—and you can use either. Let's look at the source.
public class Timeout {
private boolean hasDeadline;
private long deadlineNanoTime;
private long timeoutNanos;
// ...
public void throwIfReached() throws IOException {
if (Thread.interrupted()) {
throw new InterruptedIOException("thread interrupted");
}
if (hasDeadline && deadlineNanoTime - System.nanoTime() <= 0) {
throw new InterruptedIOException("deadline reached");
}
}
public final void waitUntilNotified(Object monitor) throws InterruptedIOException {
try {
boolean hasDeadline = hasDeadline();
long timeoutNanos = timeoutNanos();
if (!hasDeadline && timeoutNanos == 0L) {
monitor.wait(); // There is no timeout: wait forever.
return;
}
// Compute how long we'll wait.
long waitNanos;
long start = System.nanoTime();
if (hasDeadline && timeoutNanos != 0) {
long deadlineNanos = deadlineNanoTime() - start;
waitNanos = Math.min(timeoutNanos, deadlineNanos);
} else if (hasDeadline) {
waitNanos = deadlineNanoTime() - start;
} else {
waitNanos = timeoutNanos;
}
// Attempt to wait that long. This will break out early if the monitor is notified.
long elapsedNanos = 0L;
if (waitNanos > 0L) {
long waitMillis = waitNanos / 1000000L;
monitor.wait(waitMillis, (int) (waitNanos - waitMillis * 1000000L));
elapsedNanos = System.nanoTime() - start;
}
// Throw if the timeout elapsed before the monitor was notified.
if (elapsedNanos >= waitNanos) {
throw new InterruptedIOException("timeout");
}
} catch (InterruptedException e) {
throw new InterruptedIOException("interrupted");
}
}
}
Timeout's handling of time limits is fairly simple. First, there are three instance fields:
private boolean hasDeadline; // whether an absolute deadline has been set
private long deadlineNanoTime; // the absolute deadline, in nanoseconds
private long timeoutNanos; // the timeout duration, in nanoseconds
Then there is a pile of getters and setters that need no explanation (they were omitted from the listing above for brevity). Two methods do the actual timeout handling:
public void throwIfReached() throws IOException
Throws an InterruptedIOException if the current thread has been interrupted or the deadline has been reached.
public final void waitUntilNotified(Object monitor) throws InterruptedIOException
First handles the special case of no limit at all, in which case it waits indefinitely until notified. If a timeout or deadline is set, it computes how long to wait and waits at most that long; if the wait elapses before the monitor is notified, an InterruptedIOException is thrown.
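Using the base class is just a matter of configuring the Timeout attached to a sink before doing I/O. A minimal sketch (the path is a placeholder; for a plain file sink only the deadline is enforced, via throwIfReached() before each segment is written):
BufferedSink sink = Okio.buffer(Okio.sink(new File("/tmp/okio-demo.txt")));
sink.timeout().deadline(1, TimeUnit.SECONDS); // sets hasDeadline and deadlineNanoTime
sink.writeUtf8("some data");
sink.close(); // a write attempted after the deadline throws InterruptedIOException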
The asynchronous timeout class: AsyncTimeout
The class that actually implements asynchronous timeouts is AsyncTimeout, a subclass of Timeout. Its main logic is illustrated in the figure below. The class maintains a queue of pending AsyncTimeout events ordered by remaining time (implemented as a singly linked list), so the node that will time out first sits at the head. It also defines an inner class, Watchdog, which runs as a daemon thread in the background, repeatedly taking the head element and checking whether its deadline has arrived; if it has, the watchdog invokes that AsyncTimeout's timedOut() method. timedOut() is empty by default and is meant to be overridden by subclasses.
AsyncTimeout has two methods for wrapping input and output, source and sink, both of which return proxy objects. As the source shows, the wrapped source and sink first call enter() to put the timeout event into the queue, then invoke the real object's read or write; if an exception occurs, or the operation finishes before the timeout, exit() is called to handle it.
public class AsyncTimeout extends Timeout {
// ...
static @Nullable AsyncTimeout head;
private boolean inQueue;
private @Nullable AsyncTimeout next;
private long timeoutAt;
protected void timedOut() {
}
public final Source source(final Source source) {
return new Source() {
@Override
public long read(Buffer sink, long byteCount) throws IOException {
boolean throwOnTimeout = false;
enter();
try {
long result = source.read(sink, byteCount);
throwOnTimeout = true;
return result;
} catch (IOException e) {
throw exit(e);
} finally {
exit(throwOnTimeout);
}
}
@Override
public void close() throws IOException {
boolean throwOnTimeout = false;
try {
source.close();
throwOnTimeout = true;
} catch (IOException e) {
throw exit(e);
} finally {
exit(throwOnTimeout);
}
}
@Override
public Timeout timeout() {
return AsyncTimeout.this;
}
// ...
};
}
public final Sink sink(final Sink sink) {
return new Sink() {
@Override
public void write(Buffer source, long byteCount) throws IOException {
checkOffsetAndCount(source.size, 0, byteCount);
while (byteCount > 0L) {
// Count how many bytes to write. This loop guarantees we split on a segment boundary.
long toWrite = 0L;
for (Segment s = source.head; toWrite < TIMEOUT_WRITE_SIZE; s = s.next) {
int segmentSize = s.limit - s.pos;
toWrite += segmentSize;
if (toWrite >= byteCount) {
toWrite = byteCount;
break;
}
}
// Emit one write. Only this section is subject to the timeout.
boolean throwOnTimeout = false;
enter();
try {
sink.write(source, toWrite);
byteCount -= toWrite;
throwOnTimeout = true;
} catch (IOException e) {
throw exit(e);
} finally {
exit(throwOnTimeout);
}
}
}
@Override
public void flush() throws IOException {
boolean throwOnTimeout = false;
enter();
try {
sink.flush();
throwOnTimeout = true;
} catch (IOException e) {
throw exit(e);
} finally {
exit(throwOnTimeout);
}
}
@Override
public void close() throws IOException {
boolean throwOnTimeout = false;
enter();
try {
sink.close();
throwOnTimeout = true;
} catch (IOException e) {
throw exit(e);
} finally {
exit(throwOnTimeout);
}
}
@Override
public Timeout timeout() {
return AsyncTimeout.this;
}
// ...
};
}
}
The enter() method puts the node into the timeout queue; the actual insertion is done by scheduleTimeout(AsyncTimeout node, long timeoutNanos, boolean hasDeadline). This method is synchronized. If the queue is empty it creates the head node and starts the watchdog daemon thread; it then computes when the event should fire and inserts the node into the queue in sorted order. If the new node ends up at the front of the queue, the watchdog is woken up so it can check whether that event has timed out.
public class AsyncTimeout extends Timeout {
// ...
public final void enter() {
if (inQueue) throw new IllegalStateException("Unbalanced enter/exit");
long timeoutNanos = timeoutNanos();
boolean hasDeadline = hasDeadline();
if (timeoutNanos == 0 && !hasDeadline) {
return; // No timeout and no deadline? Don't bother with the queue.
}
inQueue = true;
scheduleTimeout(this, timeoutNanos, hasDeadline);
}
private static synchronized void scheduleTimeout(
AsyncTimeout node, long timeoutNanos, boolean hasDeadline) {
// Start the watchdog thread and create the head node when the first timeout is scheduled.
if (head == null) {
head = new AsyncTimeout();
new Watchdog().start();
}
long now = System.nanoTime();
if (timeoutNanos != 0 && hasDeadline) {
node.timeoutAt = now + Math.min(timeoutNanos, node.deadlineNanoTime() - now);
} else if (timeoutNanos != 0) {
node.timeoutAt = now + timeoutNanos;
} else if (hasDeadline) {
node.timeoutAt = node.deadlineNanoTime();
} else {
throw new AssertionError();
}
// Insert the node in sorted order.
long remainingNanos = node.remainingNanos(now);
for (AsyncTimeout prev = head; true; prev = prev.next) {
if (prev.next == null || remainingNanos < prev.next.remainingNanos(now)) {
node.next = prev.next;
prev.next = node;
if (prev == head) {
AsyncTimeout.class.notify(); // Wake up the watchdog when inserting at the front.
}
break;
}
}
}
private long remainingNanos(long now) {
return timeoutAt - now;
}
}
Exception handling involves the following methods; essentially they remove the event from the queue and throw an appropriate exception.
public class AsyncTimeout extends Timeout {
// ...
final void exit(boolean throwOnTimeout) throws IOException {
boolean timedOut = exit();
if (timedOut && throwOnTimeout) throw newTimeoutException(null);
}
final IOException exit(IOException cause) throws IOException {
if (!exit()) return cause;
return newTimeoutException(cause);
}
public final boolean exit() {
if (!inQueue) return false;
inQueue = false;
return cancelScheduledTimeout(this);
}
// Returns true if the timeout occurred.
private static synchronized boolean cancelScheduledTimeout(AsyncTimeout node) {
// Remove the node from the linked list.
for (AsyncTimeout prev = head; prev != null; prev = prev.next) {
if (prev.next == node) {
prev.next = node.next;
node.next = null;
return false;
}
}
// The node wasn't found in the linked list: it must have timed out!
return true;
}
protected IOException newTimeoutException(@Nullable IOException cause) {
InterruptedIOException e = new InterruptedIOException("timeout");
if (cause != null) {
e.initCause(cause);
}
return e;
}
}
The watchdog repeatedly calls a synchronized method to take the head of the queue. If the queue is empty it sleeps for IDLE_TIMEOUT_MILLIS (60 seconds); if the queue is still empty after that, the thread exits. Otherwise it checks the head element's deadline: if it has not arrived yet, the watchdog sleeps for the remaining time; if it has, the watchdog calls the element's timedOut() callback and removes it from the queue. The watchdog is designed to be very cheap: when there is nothing to do it is either asleep or gone.
public class AsyncTimeout extends Timeout {
private static final long IDLE_TIMEOUT_MILLIS = TimeUnit.SECONDS.toMillis(60);
private static final long IDLE_TIMEOUT_NANOS = TimeUnit.MILLISECONDS.toNanos(IDLE_TIMEOUT_MILLIS);
private static final class Watchdog extends Thread {
Watchdog() {
super("Okio Watchdog");
setDaemon(true);
}
public void run() {
while (true) {
try {
AsyncTimeout timedOut;
synchronized (AsyncTimeout.class) {
timedOut = awaitTimeout();
// Didn't find a node to interrupt. Try again.
if (timedOut == null) continue;
// The queue is completely empty. Let this thread exit and let another watchdog thread
// get created on the next call to scheduleTimeout().
if (timedOut == head) {
head = null;
return;
}
}
// Close the timed out node.
timedOut.timedOut();
} catch (InterruptedException ignored) {
}
}
}
}
static @Nullable AsyncTimeout awaitTimeout() throws InterruptedException {
// Get the next eligible node.
AsyncTimeout node = head.next;
// The queue is empty. Wait until either something is enqueued or the idle timeout elapses.
if (node == null) {
long startNanos = System.nanoTime();
AsyncTimeout.class.wait(IDLE_TIMEOUT_MILLIS);
return head.next == null && (System.nanoTime() - startNanos) >= IDLE_TIMEOUT_NANOS
? head // The idle timeout elapsed.
: null; // The situation has changed.
}
long waitNanos = node.remainingNanos(System.nanoTime());
// The head of the queue hasn't timed out yet. Await that.
if (waitNanos > 0) {
long waitMillis = waitNanos / 1000000L;
waitNanos -= (waitMillis * 1000000L);
AsyncTimeout.class.wait(waitMillis, (int) waitNanos);
return null;
}
// The head of the queue has timed out. Remove it.
head.next = node.next;
node.next = null;
return node;
}
}
The custom string class: ByteString
ByteString is a custom byte-string class. Like String, it is designed to be immutable (its data cannot be modified after creation). Of course, Java has no "immutable" keyword, so making an object immutable takes some work:
- Don't provide any methods that modify the object's state
- Make sure the class cannot be extended
- Make all fields final
- Make all fields private
- Ensure exclusive access to any mutable components
Immutable objects have many advantages: they are inherently thread-safe, so they require no synchronization (and none of the locking overhead that comes with it), and their internals can be shared freely. The downside is that you may end up creating a lot of objects.
ByteString is not only immutable, it also keeps two fields internally: the byte[] data and the String form of the same data. This makes converting between bytes and String essentially free, at the cost of holding two references—a clear space-for-time trade-off; Okio does a lot of this for performance. Note that the String field is marked transient, so it does not take part in serialization; after deserialization it is recomputed lazily, which saves work.
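A small sketch of typical ByteString usage (the literal is arbitrary); each view of the same bytes is just another method call:
ByteString bs = ByteString.encodeUtf8("hello okio");
System.out.println(bs.hex());          // hexadecimal form of the bytes
System.out.println(bs.base64());       // Base64 form
System.out.println(bs.sha256().hex()); // digests return another ByteString
System.out.println(bs.utf8());         // decodes back to a String (cached after the first call)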
What ByteString offers is clear at a glance from its method list.
public class ByteString implements Serializable, Comparable<ByteString> {
final byte[] data;
transient int hashCode; // Lazily computed; 0 if unknown.
transient String utf8; // Lazily computed.
ByteString(byte[] data);
public static ByteString of(byte... data);
public static ByteString of(byte[] data, int offset, int byteCount);
public static ByteString of(ByteBuffer data);
public static ByteString encodeUtf8(String s);
public static ByteString encodeString(String s, Charset charset);
public String utf8();
public String string(Charset charset);
public String base64();
public ByteString md5();
public ByteString sha1();
public ByteString sha256();
public ByteString sha512();
private ByteString digest(String algorithm);
public ByteString hmacSha1(ByteString key);
public ByteString hmacSha256(ByteString key);
public ByteString hmacSha512(ByteString key);
private ByteString hmac(String algorithm, ByteString key);
public String base64Url();
public static @Nullable ByteString decodeBase64(String base64);
public String hex();
public static ByteString decodeHex(String hex);
private static int decodeHexDigit(char c);
public static ByteString read(InputStream in, int byteCount) throws IOException;
public ByteString toAsciiLowercase();
public ByteString toAsciiUppercase();
public ByteString substring(int beginIndex);
public ByteString substring(int beginIndex, int endIndex);
public int size();
public byte[] toByteArray();
byte[] internalArray();
public ByteBuffer asByteBuffer();
public void write(OutputStream out) throws IOException;
void write(Buffer buffer);
public boolean rangeEquals(int offset, ByteString other, int otherOffset, int byteCount);
public boolean rangeEquals(int offset, byte[] other, int otherOffset, int byteCount);
public final boolean startsWith(ByteString prefix);
public final boolean startsWith(byte[] prefix);
public final boolean endsWith(ByteString suffix);
public final boolean endsWith(byte[] suffix);
public final int indexOf(ByteString other);
public final int indexOf(ByteString other, int fromIndex);
public final int indexOf(byte[] other);
public int indexOf(byte[] other, int fromIndex);
public final int lastIndexOf(ByteString other);
public final int lastIndexOf(ByteString other, int fromIndex);
public final int lastIndexOf(byte[] other);
public int lastIndexOf(byte[] other, int fromIndex);
@Override public boolean equals(Object o);
@Override public int hashCode();
@Override public int compareTo(ByteString byteString);
@Override public String toString();
static int codePointIndexToCharIndex(String s, int codePointCount);
private void readObject(ObjectInputStream in) throws IOException;
private void writeObject(ObjectOutputStream out) throws IOException;
}
Flow analysis
Blocking calls
Let's go back to the synchronous call from the beginning of the article and see how it actually flows. The code is:
Okio.buffer(Okio.sink(file))
.writeUtf8("write string by utf-8.\n")
.writeInt(1234).close();
Start with Okio.sink(file).
// Okio.java
public static Sink sink(File file) throws FileNotFoundException {
if (file == null) throw new IllegalArgumentException("file == null");
return sink(new FileOutputStream(file));
}
public static Sink sink(OutputStream out) {
return sink(out, new Timeout());
}
private static Sink sink(final OutputStream out, final Timeout timeout) {
if (out == null) throw new IllegalArgumentException("out == null");
if (timeout == null) throw new IllegalArgumentException("timeout == null");
return new Sink() {
@Override public void write(Buffer source, long byteCount) throws IOException {
checkOffsetAndCount(source.size, 0, byteCount);
while (byteCount > 0) {
timeout.throwIfReached();
Segment head = source.head;
int toCopy = (int) Math.min(byteCount, head.limit - head.pos);
out.write(head.data, head.pos, toCopy);
head.pos += toCopy;
byteCount -= toCopy;
source.size -= toCopy;
if (head.pos == head.limit) {
source.head = head.pop();
SegmentPool.recycle(head);
}
}
}
@Override public void flush() throws IOException {
out.flush();
}
@Override public void close() throws IOException {
out.close();
}
@Override public Timeout timeout() {
return timeout;
}
@Override public String toString() {
return "sink(" + out + ")";
}
};
}
The source shows that Okio.sink(file) eventually calls Okio.sink(final OutputStream out, final Timeout timeout). The OutputStream passed in is a freshly created FileOutputStream. So a Sink is just a wrapper around a native Java stream—you can think of it as a proxy that decorates the write operation with some extra handling; the actual writing is ultimately done by the FileOutputStream. The Timeout passed in is created with the default constructor, so no limit is set.
The call returns a Sink whose write(Buffer source, long byteCount) method is overridden in preparation for RealBufferedSink: it writes byteCount bytes from the Buffer into the native stream, updating the Buffer's size and the state of the Segments involved along the way. Note that before each segment is written, timeout.throwIfReached() is called: if a deadline has been set and has already passed, an InterruptedIOException is thrown instead of writing. The write itself is blocking I/O. The returned Sink also overrides close(), flush() and so on, which simply delegate to the native stream.
With the Sink in hand, we move on to Okio.buffer(Sink sink).
// Okio.java
public static BufferedSink buffer(Sink sink) {
return new RealBufferedSink(sink);
}
This method is trivial: it just news up a RealBufferedSink and returns it. The Sink is passed into the constructor, so RealBufferedSink holds it internally—again, you can view RealBufferedSink as a proxy for the Sink, with every operation ultimately acting on it. RealBufferedSink also holds a Buffer, which serves as the container for the buffered data.
The call then reaches RealBufferedSink.writeUtf8(String string).
// RealBufferedSink.java
@Override public BufferedSink writeUtf8(String string) throws IOException {
if (closed) throw new IllegalStateException("closed");
buffer.writeUtf8(string);
return emitCompleteSegments();
}
// Buffer.java
@Override public Buffer writeUtf8(String string) {
return writeUtf8(string, 0, string.length());
}
@Override public Buffer writeUtf8(String string, int beginIndex, int endIndex) {
if (string == null) throw new IllegalArgumentException("string == null");
if (beginIndex < 0) throw new IllegalArgumentException("beginIndex < 0: " + beginIndex);
if (endIndex < beginIndex) {
throw new IllegalArgumentException("endIndex < beginIndex: " + endIndex + " < " + beginIndex);
}
if (endIndex > string.length()) {
throw new IllegalArgumentException(
"endIndex > string.length: " + endIndex + " > " + string.length());
}
// Transcode a UTF-16 Java String to UTF-8 bytes.
for (int i = beginIndex; i < endIndex;) {
int c = string.charAt(i);
if (c < 0x80) {
Segment tail = writableSegment(1);
byte[] data = tail.data;
int segmentOffset = tail.limit - i;
int runLimit = Math.min(endIndex, Segment.SIZE - segmentOffset);
// Emit a 7-bit character with 1 byte.
data[segmentOffset + i++] = (byte) c; // 0xxxxxxx
// Fast-path contiguous runs of ASCII characters. This is ugly, but yields a ~4x performance
// improvement over independent calls to writeByte().
while (i < runLimit) {
c = string.charAt(i);
if (c >= 0x80) break;
data[segmentOffset + i++] = (byte) c; // 0xxxxxxx
}
int runSize = i + segmentOffset - tail.limit; // Equivalent to i - (previous i).
tail.limit += runSize;
size += runSize;
} else if (c < 0x800) {
// Emit a 11-bit character with 2 bytes.
writeByte(c >> 6 | 0xc0); // 110xxxxx
writeByte(c & 0x3f | 0x80); // 10xxxxxx
i++;
} else if (c < 0xd800 || c > 0xdfff) {
// Emit a 16-bit character with 3 bytes.
writeByte(c >> 12 | 0xe0); // 1110xxxx
writeByte(c >> 6 & 0x3f | 0x80); // 10xxxxxx
writeByte(c & 0x3f | 0x80); // 10xxxxxx
i++;
} else {
// c is a surrogate. Make sure it is a high surrogate & that its successor is a low
// surrogate. If not, the UTF-16 is invalid, in which case we emit a replacement character.
int low = i + 1 < endIndex ? string.charAt(i + 1) : 0;
if (c > 0xdbff || low < 0xdc00 || low > 0xdfff) {
writeByte('?');
i++;
continue;
}
// UTF-16 high surrogate: 110110xxxxxxxxxx (10 bits)
// UTF-16 low surrogate: 110111yyyyyyyyyy (10 bits)
// Unicode code point: 00010000000000000000 + xxxxxxxxxxyyyyyyyyyy (21 bits)
int codePoint = 0x010000 + ((c & ~0xd800) << 10 | low & ~0xdc00);
// Emit a 21-bit character with 4 bytes.
writeByte(codePoint >> 18 | 0xf0); // 11110xxx
writeByte(codePoint >> 12 & 0x3f | 0x80); // 10xxxxxx
writeByte(codePoint >> 6 & 0x3f | 0x80); // 10xxyyyy
writeByte(codePoint & 0x3f | 0x80); // 10yyyyyy
i += 2;
}
}
return this;
}
Segment writableSegment(int minimumCapacity) {
if (minimumCapacity < 1 || minimumCapacity > Segment.SIZE) throw new IllegalArgumentException();
if (head == null) {
head = SegmentPool.take(); // Acquire a first segment.
return head.next = head.prev = head;
}
Segment tail = head.prev;
if (tail.limit + minimumCapacity > Segment.SIZE || !tail.owner) {
tail = tail.push(SegmentPool.take()); // Append a new empty segment to fill up.
}
return tail;
}
RealBufferedSink's writeUtf8 delegates to its internal Buffer's writeUtf8, and the String ends up in the Buffer encoded as UTF-8. UTF-8 is a variable-length prefix code—effectively a source-compressed encoding of Unicode.
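A quick sanity check of the variable-length encoding; the characters below are chosen only to illustrate the four possible byte lengths:
Buffer b = new Buffer();
b.writeUtf8("a");            // U+0061  -> 1 byte
b.writeUtf8("¢");            // U+00A2  -> 2 bytes
b.writeUtf8("中");           // U+4E2D  -> 3 bytes
b.writeUtf8("\uD834\uDD1E"); // U+1D11E (a surrogate pair in the Java String) -> 4 bytes
System.out.println(b.size()); // 10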
Note that before every actual write, writableSegment(int minimumCapacity) is called to obtain a segment with enough room for the write.
After the write completes, emitCompleteSegments() is called; let's follow it.
// RealBufferedSink.java
@Override public BufferedSink emitCompleteSegments() throws IOException {
if (closed) throw new IllegalStateException("closed");
long byteCount = buffer.completeSegmentByteCount();
if (byteCount > 0) sink.write(buffer, byteCount);
return this;
}
// Buffer.java
public long completeSegmentByteCount() {
long result = size;
if (result == 0) return 0;
// Omit the tail if it's still writable.
Segment tail = head.prev;
if (tail.limit < Segment.SIZE && tail.owner) {
result -= tail.limit - tail.pos;
}
return result;
}
The logic here is: after the write, compute how many bytes in the Buffer are ready to be emitted. The last Segment may not be full yet, so it is excluded. The Sink's write is then invoked for that many bytes, pushing the data into the FileOutputStream.
So RealBufferedSink really does add buffering on top of Sink: data is first written into the Buffer, and once a write call completes, the complete segments accumulated in the Buffer are flushed to the stream in one go.
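A small sketch of what completeSegmentByteCount() reports (the sizes are arbitrary): nothing counts as "complete" until a whole 8 KiB segment has been filled.
Buffer buffer = new Buffer();
buffer.writeUtf8("hi");
System.out.println(buffer.completeSegmentByteCount()); // 0 -- the only segment is still writable
buffer.write(new byte[9 * 1024]);                      // spills past one full segment
System.out.println(buffer.completeSegmentByteCount()); // 8192 -- one full segment ready to emit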
That completes writing the String to the stream. Writing the int is very similar and needs little comment.
// RealBufferedSink.java
@Override public BufferedSink writeInt(int i) throws IOException {
if (closed) throw new IllegalStateException("closed");
buffer.writeInt(i);
return emitCompleteSegments();
}
// Buffer.java
@Override public Buffer writeInt(int i) {
Segment tail = writableSegment(4);
byte[] data = tail.data;
int limit = tail.limit;
data[limit++] = (byte) ((i >>> 24) & 0xff);
data[limit++] = (byte) ((i >>> 16) & 0xff);
data[limit++] = (byte) ((i >>> 8) & 0xff);
data[limit++] = (byte) (i & 0xff);
tail.limit = limit;
size += 4;
return this;
}
Finally, RealBufferedSink.close() is called to close the stream.
// RealBufferedSink.java
@Override public void close() throws IOException {
if (closed) return;
Throwable thrown = null;
try {
if (buffer.size > 0) {
sink.write(buffer, buffer.size);
}
} catch (Throwable e) {
thrown = e;
}
try {
sink.close();
} catch (Throwable e) {
if (thrown == null) thrown = e;
}
closed = true;
if (thrown != null) Util.sneakyRethrow(thrown);
}
close() first checks whether the Buffer still holds unwritten data and, if so, writes it all to the stream in one go. Without this, the remaining data in the Buffer would never be processed and the now-useless Segments would never be recycled. It then closes the Sink, which simply closes the underlying FileOutputStream.
That completes the walkthrough of the blocking call. As you can see, Okio's blocking I/O is very similar to Java's; the main difference is the optimized buffering.
It is called blocking I/O because the I/O call blocks the thread, which only resumes once the I/O has completed.
Non-blocking calls
Replace the file in the previous example with a socket and we get a non-blocking call.
Okio.buffer(Okio.sink(socket))
.writeUtf8("write string by utf-8.\n")
.writeInt(1234).close();
Again we start with Okio.sink(socket).
// Okio.java
public static Sink sink(Socket socket) throws IOException {
if (socket == null) throw new IllegalArgumentException("socket == null");
AsyncTimeout timeout = timeout(socket);
Sink sink = sink(socket.getOutputStream(), timeout);
return timeout.sink(sink);
}
private static AsyncTimeout timeout(final Socket socket) {
return new AsyncTimeout() {
@Override protected IOException newTimeoutException(@Nullable IOException cause) {
InterruptedIOException ioe = new SocketTimeoutException("timeout");
if (cause != null) {
ioe.initCause(cause);
}
return ioe;
}
@Override protected void timedOut() {
try {
socket.close();
} catch (Exception e) {
logger.log(Level.WARNING, "Failed to close timed out socket " + socket, e);
} catch (AssertionError e) {
if (isAndroidGetsocknameError(e)) {
logger.log(Level.WARNING, "Failed to close timed out socket " + socket, e);
} else {
throw e;
}
}
}
};
}
The sink method first calls timeout(socket) to create an AsyncTimeout whose timedOut() closes the socket on timeout. It then calls sink(final OutputStream out, final Timeout timeout) to create the proxy around the native stream, exactly as before. Finally it calls timeout.sink(sink), which registers each operation in the timeout queue and returns the sink wrapped by the AsyncTimeout. Everything after that is identical to the blocking case, so there is nothing more to analyze.
This I/O is non-blocking in the sense that the thread will not sit waiting for network data forever: an I/O operation that times out is removed from the queue by the watchdog, which invokes its timedOut() callback—that is, closes the socket.
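For the watchdog to ever fire, the timeout has to be armed. A minimal sketch (assuming socket is an already-connected java.net.Socket):
BufferedSink sink = Okio.buffer(Okio.sink(socket));
sink.timeout().timeout(10, TimeUnit.SECONDS); // a stalled write is cut off after 10s: timedOut() closes the socket
sink.writeUtf8("write string by utf-8.\n");
sink.close();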
Summary
That concludes this walkthrough of the Okio framework. Space and time did not allow covering every feature and module—Pipe, and the classes that implement compression and transcoding, for example—but that does no harm: we have seen the core of Okio clearly and absorbed its optimization ideas, which can be summarized as follows:
- Ease of use. Comparing Java IO with Okio, Okio is clearly more pleasant: it supports chained calls and produces concise, elegant code. Buffering and other features are transparent to the user, who can use them without understanding the underlying structures.
- Integrated functionality. With Java IO, different kinds of reads and writes require wrapping different decorator classes; Okio integrates all of these operations, so there is no need to stack up a pile of decorators.
- CPU and memory optimizations. The data container is a circular linked list; Segments avoid copying through splitting, merging and sharing. SegmentPool keeps and reuses temporarily unused Segments, avoiding frequent GC. The watchdog sleeps when it has no work and consumes no CPU. ByteString trades space for time and uses lazy loading to save CPU.
- Rich functionality. Okio supports both blocking and non-blocking I/O and ships a series of handy tools: transparent GZip handling, and support for computing md5, sha1 and other digests over data, which makes data verification very convenient.
Finally, here are some other good articles analyzing Okio; this article drew on them to some extent:
OKio - 重新定義“短小精悍”
大概是最完全的Okio源碼解析文章
深入理解okio的優化思想