2023-09-24 01 Forward Propagation of an RNN

來源:https://hyunhp.tistory.com/448

1. The RNN Cell and the RNN: Intuition

RNN → Recurrent Neural Network

You can think of the recurrent neural network as the repeated use of a single cell; each cell carries out the computations for a single time step.

2. Dimensions of the input x

2.1 Input with n_{x} units

For a single time step of a single input example, x^{(i)<t>} is a one-dimensional input vector.

Using language as an example, a language with a 5000-word vocabulary could be one-hot encoded into a vector that has n_{x}=5000 units, so x^{(i)<t>} would have the shape (5000,).

The notation n_{x} is used here to denote the number of units in a single time step of a single training example.

2.2 Time Steps of size T_{x}

A recurrent neural network has multiple time steps, which you'll index with t.

In the lessons, you saw a single training example x^{(i)} consisting of multiple time steps T_{x}. In this notebook, T_{x} will denote the number of time steps in the longest sequence.

2.3 Batches of size m

Let's say we have mini-batches, each with 20 training examples.

To benefit from vectorization, you'll stack 20 columns of x^{(i)} examples.

For example, with 10 time steps this tensor has the shape (5000, 20, 10).

You'll use m to denote the number of training examples.

So, for a single time step, the shape of a mini-batch is (n_{x}, m).

2.4 3D Tensor of shape (n_{x}, m, T_{x})

The 3-dimensional tensor x of shape (n_{x}, m, T_{x}) represents the input x that is fed into the RNN.

2.5 Take a 2D slice for each time step: x^{<t>}

At each time step, you'll use a mini-batch of training examples (not just a single example).

So, for each time step t, you'll use a 2D slice of shape (n_{x}, m).

This 2D slice is referred to as x^{<t>}. The variable name in the code is xt.
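To make these shapes concrete, here is a minimal sketch (using the illustrative sizes from above: n_x = 5000, m = 20, T_x = 10, all made up for this example) that builds a dummy input tensor and takes the 2D slice for one time step:

import numpy as np

n_x, m, T_x = 5000, 20, 10           # vocabulary size, mini-batch size, number of time steps

x = np.zeros((n_x, m, T_x))          # 3D input tensor of shape (n_x, m, T_x)
x[42, :, 0] = 1                      # e.g. mark word index 42 as the one-hot entry at time step 0

t = 0
xt = x[:, :, t]                      # 2D slice x^{<t>} used at time step t
print(xt.shape)                      # (5000, 20)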

3. Dimensions of the hidden state a

The activation a^{<t>} that is passed to the RNN from one time step to another is called a "hidden state".

3.1 Dimensions of hidden state a

Similar to the input tensor x, the hidden state for a single training example is a vector of length n_{a}.

If you include a mini-batch of m training examples, the shape of a mini-batch is (n_{a}, m).

When you include the time step dimension, the shape of the hidden state is (n_{a}, m, T_{x}).

You'll loop through the time steps with index t, and work with a 2D slice of the 3D tensor.

This 2D slice is referred to as a^{<t>}.

In the code, the variable names used are either a_prev or a_next, depending on the function being implemented.

The shape of this 2D slice is (n_{a}, m).

4. Dimensions of the prediction \hat{y}

Similar to the inputs and hidden states, \hat{y} is a 3D tensor of shape (n_{y}, m, T_{y}).

    ■ n_{y}: number of units in the vector representing the prediction
    ■ m: number of examples in a mini-batch
    ■ T_{y}: number of time steps in the prediction

For a single time step t, a 2D slice \hat{y}^{<t>} has shape (n_{y}, m).

In the code, the variable names are:

    ● y_pred: \hat{y}
    ● yt_pred: \hat{y}^{<t>}
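As a quick sanity check on the hidden-state and prediction shapes described in sections 3 and 4, here is a minimal sketch (the sizes n_a = 5, n_y = 2, m = 20, T_x = 10 are made up for illustration):

import numpy as np

n_a, n_y, m, T_x = 5, 2, 20, 10

a = np.zeros((n_a, m, T_x))          # hidden states for every time step
y_pred = np.zeros((n_y, m, T_x))     # predictions for every time step (here T_y = T_x)

t = 3
a_t = a[:, :, t]                     # a^{<t>}, shape (n_a, m)
yt_pred = y_pred[:, :, t]            # \hat{y}^{<t>}, shape (n_y, m)
print(a_t.shape, yt_pred.shape)      # (5, 20) (2, 20)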

5. Building the RNN

Here is how you can implement an RNN:

Steps:

    ● Implement the calculations needed for one time step of the RNN.
    ● Implement a loop over T_{x} time steps in order to process all the inputs, one at a time.

About the RNN Cell

You can think of the recurrent neural network as the repeated use of a single cell. First, you'll implement the computations for a single time step.

RNN cell versus rnn_cell_forward:

    ● Note that an RNN cell outputs the hidden state a^{<t>}.
        ■ The RNN cell is shown in the figure as the inner box with solid lines.
    ● The function that you'll implement, rnn_cell_forward, also calculates the prediction \hat{y}^{<t>}.
        ■ rnn_cell_forward is shown in the figure as the outer box with dashed lines.

The following figure describes the operations for a single time step of an RNN cell:
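The figure itself is not reproduced here, but the two operations it describes can be read directly off the code below. Written out with the same symbols as the docstring, a single forward step computes:

a^{<t>} = \tanh(W_{ax} x^{<t>} + W_{aa} a^{<t-1>} + b_{a})

\hat{y}^{<t>} = \mathrm{softmax}(W_{ya} a^{<t>} + b_{y})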

The code is as follows:

# UNQ_C1 (UNIQUE CELL IDENTIFIER, DO NOT EDIT)
# GRADED FUNCTION: rnn_cell_forward

def rnn_cell_forward(xt, a_prev, parameters):
    """
    Implements a single forward step of the RNN-cell as described in Figure (2)

    Arguments:
    xt -- your input data at timestep "t", numpy array of shape (n_x, m).
    a_prev -- Hidden state at timestep "t-1", numpy array of shape (n_a, m)
    parameters -- python dictionary containing:
                        Wax -- Weight matrix multiplying the input, numpy array of shape (n_a, n_x)
                        Waa -- Weight matrix multiplying the hidden state, numpy array of shape (n_a, n_a)
                        Wya -- Weight matrix relating the hidden-state to the output, numpy array of shape (n_y, n_a)
                        ba -- Bias, numpy array of shape (n_a, 1)
                        by -- Bias relating the hidden-state to the output, numpy array of shape (n_y, 1)

    Returns:
    a_next -- next hidden state, of shape (n_a, m)
    yt_pred -- prediction at timestep "t", numpy array of shape (n_y, m)
    cache -- tuple of values needed for the backward pass, contains (a_next, a_prev, xt, parameters)
    """
    # Retrieve parameters from "parameters"
    Wax = parameters["Wax"]
    Waa = parameters["Waa"]
    Wya = parameters["Wya"]
    ba = parameters["ba"]
    by = parameters["by"]

    ### START CODE HERE ### (≈2 lines)
    # compute next activation state using the formula given above
    a_next = np.tanh(np.dot(Wax, xt) + np.dot(Waa, a_prev) + ba)
    # compute output of the current cell using the formula given above
    yt_pred = softmax(np.dot(Wya, a_next) + by)
    ### END CODE HERE ###

    # store values you need for backward propagation in cache
    cache = (a_next, a_prev, xt, parameters)

    return a_next, yt_pred, cache
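Note that rnn_cell_forward calls softmax, which is not a NumPy built-in; in the original assignment it is imported from a provided utility module. If you are running the snippet on its own, a minimal, numerically stable stand-in (my own sketch, not the course's exact utility) is:

import numpy as np

def softmax(z):
    # column-wise softmax over an array of shape (n_y, m);
    # subtracting the per-column max keeps np.exp from overflowing
    e = np.exp(z - np.max(z, axis=0, keepdims=True))
    return e / np.sum(e, axis=0, keepdims=True)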

Run the code above:

def rnn_cell_forward_tests(rnn_cell_forward):
    np.random.seed(1)
    xt_tmp = np.random.randn(3, 10)
    a_prev_tmp = np.random.randn(5, 10)
    parameters_tmp = {}
    parameters_tmp['Waa'] = np.random.randn(5, 5)
    parameters_tmp['Wax'] = np.random.randn(5, 3)
    parameters_tmp['Wya'] = np.random.randn(2, 5)
    parameters_tmp['ba'] = np.random.randn(5, 1)
    parameters_tmp['by'] = np.random.randn(2, 1)
    a_next_tmp, yt_pred_tmp, cache_tmp = rnn_cell_forward(xt_tmp, a_prev_tmp, parameters_tmp)
    print("a_next[4] = \n", a_next_tmp[4])
    print("a_next.shape = \n", a_next_tmp.shape)
    print("yt_pred[1] =\n", yt_pred_tmp[1])
    print("yt_pred.shape = \n", yt_pred_tmp.shape)

# UNIT TESTS
rnn_cell_forward_tests(rnn_cell_forward)

6. RNN Forward Pass

A recurrent neural network (RNN) is a repetition of the RNN cell that you've just built.

    ● If your input sequence of data is 10 time steps long, then you will re-use the RNN cell 10 times.

Each cell takes two inputs at each time step:

    ● a^{<t-1>}: the hidden state from the previous cell
    ● x^{<t>}: the current time step's input data

It has two outputs at each time step:

    ● a hidden state (a^{<t>})
    ● a prediction (\hat{y}^{<t>})

The weights and biases (W_{aa}, b_{a}, W_{ax}, b_{x}) are reused at each time step.

    ● They are maintained between calls to rnn_cell_forward in the 'parameters' dictionary.

Note: the code above does not actually mention b_{x}; only b_{a} and b_{y} appear.

# UNQ_C2 (UNIQUE CELL IDENTIFIER, DO NOT EDIT)
# GRADED FUNCTION: rnn_forward

def rnn_forward(x, a0, parameters):
    """
    Implement the forward propagation of the recurrent neural network described in Figure (3).

    Arguments:
    x -- Input data for every time-step, of shape (n_x, m, T_x).
    a0 -- Initial hidden state, of shape (n_a, m)
    parameters -- python dictionary containing:
        Waa -- Weight matrix multiplying the hidden state, numpy array of shape (n_a, n_a)
        Wax -- Weight matrix multiplying the input, numpy array of shape (n_a, n_x)
        Wya -- Weight matrix relating the hidden-state to the output, numpy array of shape (n_y, n_a)
        ba -- Bias, numpy array of shape (n_a, 1)
        by -- Bias relating the hidden-state to the output, numpy array of shape (n_y, 1)

    Returns:
    a -- Hidden states for every time-step, numpy array of shape (n_a, m, T_x)
    y_pred -- Predictions for every time-step, numpy array of shape (n_y, m, T_x)
    caches -- tuple of values needed for the backward pass, contains (list of caches, x)
    """
    # Initialize "caches" which will contain the list of all caches
    caches = []

    # Retrieve dimensions from shapes of x and parameters["Wya"]
    n_x, m, T_x = x.shape
    n_y, n_a = parameters["Wya"].shape

    ### START CODE HERE ###
    # initialize "a" and "y_pred" with zeros (≈2 lines)
    a = np.zeros((n_a, m, T_x))
    y_pred = np.zeros((n_y, m, T_x))

    # Initialize a_next (≈1 line)
    a_next = a0

    # loop over all time-steps
    for t in range(T_x):
        # Update next hidden state, compute the prediction, get the cache (≈1 line)
        a_next, yt_pred, cache = rnn_cell_forward(x[:, :, t], a_next, parameters)
        # Save the value of the new "next" hidden state in a (≈1 line)
        a[:, :, t] = a_next
        # Save the value of the prediction in y (≈1 line)
        y_pred[:, :, t] = yt_pred
        # Append "cache" to "caches" (≈1 line)
        caches.append(cache)
    ### END CODE HERE ###

    # store values needed for backward propagation in cache
    caches = (caches, x)

    return a, y_pred, caches

Run the code above:

def rnn_forward_test(rnn_forward):
    np.random.seed(1)
    x_tmp = np.random.randn(3, 10, 4)
    a0_tmp = np.random.randn(5, 10)
    parameters_tmp = {}
    parameters_tmp['Waa'] = np.random.randn(5, 5)
    parameters_tmp['Wax'] = np.random.randn(5, 3)
    parameters_tmp['Wya'] = np.random.randn(2, 5)
    parameters_tmp['ba'] = np.random.randn(5, 1)
    parameters_tmp['by'] = np.random.randn(2, 1)
    a_tmp, y_pred_tmp, caches_tmp = rnn_forward(x_tmp, a0_tmp, parameters_tmp)
    print("a[4][1] = \n", a_tmp[4][1])
    print("a.shape = \n", a_tmp.shape)
    print("y_pred[1][3] =\n", y_pred_tmp[1][3])
    print("y_pred.shape = \n", y_pred_tmp.shape)
    print("caches[1][1][3] =\n", caches_tmp[1][1][3])
    print("len(caches) = \n", len(caches_tmp))

# UNIT TEST
rnn_forward_test(rnn_forward)

7. Summary

You've successfully built the forward propagation of a recurrent network from scratch.

Situations when this RNN will perform better:

    ● This will work well enough for some applications, but it suffers from vanishing gradients.
    ● The RNN works best when each output \hat{y}^{<t>} can be estimated using "local" context.
    ● "Local" context refers to information that is close to the prediction's time step t.
    ● More formally, local context refers to inputs x^{<t_j>} and predictions \hat{y}^{<t>} where t_j is close to t.

What you should remember:

    ● The recurrent neural network, or RNN, is essentially the repeated use of a single cell.
    ● A basic RNN reads inputs one at a time, and remembers information through the hidden layer activations (hidden states) that are passed from one time step to the next.
        ■ The time-step dimension determines how many times to re-use the RNN cell.
    ● Each cell takes two inputs at each time step:
        ■ the hidden state from the previous cell
        ■ the current time step's input data
    ● Each cell has two outputs at each time step:
        ■ a hidden state
        ■ a prediction
