2023-09-24 01 Forward Propagation of an RNN

來源:https://hyunhp.tistory.com/448

1. The RNN Cell and the RNN: Intuition

RNN → Recurrent Neural Network

You can think of the recurrent neural network as the repeated use of a single cell; each cell carries out the computations for a single time step.

2. Dimensions of the input x

2.1 Input with n_{x} units

For a single time step of a single input example, x^{(i)<t>} is a one-dimensional input vector.

Using language as an example, a language with a 5000-word vocabulary could be one-hot encoded into a vector that has n_{x}=5000 units, so x^{(i)<t>} would have the shape (5000,).

The notation n_{x} is used here to denote the number of units in a single time step of a single training example.

2.2 Time Steps of size T_{x}

A recurrent neural network has multiple time steps, which you'll index with t.

In the lessons, you saw a single training example x^{(i)} consisting of multiple time steps T_{x}. In this notebook, T_{x} will denote the number of time steps in the longest sequence.

2.3 Batches of size m

Let's say we have mini-batches, each with 20 training examples.

To benefit from vectorization, you'll stack 20 columns of x^{(i)} examples.

For example, with 10 time steps this tensor has the shape (5000, 20, 10).

You'll use m to denote the number of training examples.

So, for a single time step, the shape of a mini-batch is (n_{x}, m).

2.4 3D Tensor of shape (n_{x}, m, T_{x})

The 3-dimensional tensor x of shape (n_{x}, m, T_{x}) represents the input x that is fed into the RNN.

2.5 Take a 2D slice for each time step: x^{<t>}

At each time step, you'll use a mini-batch of training examples (not just a single example).

So, for each time step t, you'll use a 2D slice of shape (n_{x}, m).

This 2D slice is referred to as x^{<t>}. The variable name in the code is xt.
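To make these shapes concrete, here is a minimal sketch (using the illustrative sizes from above: n_x = 5000, m = 20, T_x = 10, all made up for this example) that builds a dummy input tensor and takes the 2D slice for one time step:

import numpy as np

n_x, m, T_x = 5000, 20, 10           # vocabulary size, mini-batch size, number of time steps

x = np.zeros((n_x, m, T_x))          # 3D input tensor of shape (n_x, m, T_x)
x[42, :, 0] = 1                      # e.g. mark word index 42 as the one-hot entry at time step 0

t = 0
xt = x[:, :, t]                      # 2D slice x^{<t>} used at time step t
print(xt.shape)                      # (5000, 20)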

3. Dimensions of the hidden state a

The activation a^{<t>} that is passed to the RNN from one time step to another is called a "hidden state".

3.1 Dimensions of hidden state a

Similar to the input tensor x, the hidden state for a single training example is a vector of length n_{a}.

If you include a mini-batch of m training examples, the shape of a mini-batch is (n_{a}, m).

When you include the time step dimension, the shape of the hidden state is (n_{a}, m, T_{x}).

You'll loop through the time steps with index t, and work with a 2D slice of the 3D tensor.

This 2D slice is referred to as a^{<t>}.

In the code, the variable names used are either a_prev or a_next, depending on the function being implemented.

The shape of this 2D slice is (n_{a}, m).

4. Dimensions of the prediction \hat{y}

Similar to the inputs and hidden states, \hat{y} is a 3D tensor of shape (n_{y}, m, T_{y}).

    ■ n_{y}: number of units in the vector representing the prediction
    ■ m: number of examples in a mini-batch
    ■ T_{y}: number of time steps in the prediction

For a single time step t, a 2D slice \hat{y}^{<t>} has shape (n_{y}, m).

In the code, the variable names are:

    ● y_pred: \hat{y}
    ● yt_pred: \hat{y}^{<t>}
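As a quick sanity check on the hidden-state and prediction shapes described in sections 3 and 4, here is a minimal sketch (the sizes n_a = 5, n_y = 2, m = 20, T_x = 10 are made up for illustration):

import numpy as np

n_a, n_y, m, T_x = 5, 2, 20, 10

a = np.zeros((n_a, m, T_x))          # hidden states for every time step
y_pred = np.zeros((n_y, m, T_x))     # predictions for every time step (here T_y = T_x)

t = 3
a_t = a[:, :, t]                     # a^{<t>}, shape (n_a, m)
yt_pred = y_pred[:, :, t]            # \hat{y}^{<t>}, shape (n_y, m)
print(a_t.shape, yt_pred.shape)      # (5, 20) (2, 20)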

5. Building the RNN

Here is how you can implement an RNN:

Steps:

    ● Implement the calculations needed for one time step of the RNN.
    ● Implement a loop over T_{x} time steps in order to process all the inputs, one at a time.

About the RNN Cell

You can think of the recurrent neural network as the repeated use of a single cell. First, you'll implement the computations for a single time step.

RNN cell versus rnn_cell_forward:

    ● Note that an RNN cell outputs the hidden state a^{<t>}.
        ■ The RNN cell is shown in the figure as the inner box with solid lines.
    ● The function that you'll implement, rnn_cell_forward, also calculates the prediction \hat{y}^{<t>}.
        ■ rnn_cell_forward is shown in the figure as the outer box with dashed lines.

The following figure describes the operations for a single time step of an RNN cell:
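The figure itself is not reproduced here, but the two operations it describes can be read directly off the code below. Written out with the same symbols as the docstring, a single forward step computes:

a^{<t>} = \tanh(W_{ax} x^{<t>} + W_{aa} a^{<t-1>} + b_{a})

\hat{y}^{<t>} = \mathrm{softmax}(W_{ya} a^{<t>} + b_{y})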

The code is as follows:

# UNQ_C1 (UNIQUE CELL IDENTIFIER, DO NOT EDIT)
# GRADED FUNCTION: rnn_cell_forward

def rnn_cell_forward(xt, a_prev, parameters):
    """
    Implements a single forward step of the RNN-cell as described in Figure (2)

    Arguments:
    xt -- your input data at timestep "t", numpy array of shape (n_x, m).
    a_prev -- Hidden state at timestep "t-1", numpy array of shape (n_a, m)
    parameters -- python dictionary containing:
                        Wax -- Weight matrix multiplying the input, numpy array of shape (n_a, n_x)
                        Waa -- Weight matrix multiplying the hidden state, numpy array of shape (n_a, n_a)
                        Wya -- Weight matrix relating the hidden-state to the output, numpy array of shape (n_y, n_a)
                        ba -- Bias, numpy array of shape (n_a, 1)
                        by -- Bias relating the hidden-state to the output, numpy array of shape (n_y, 1)

    Returns:
    a_next -- next hidden state, of shape (n_a, m)
    yt_pred -- prediction at timestep "t", numpy array of shape (n_y, m)
    cache -- tuple of values needed for the backward pass, contains (a_next, a_prev, xt, parameters)
    """
    # Retrieve parameters from "parameters"
    Wax = parameters["Wax"]
    Waa = parameters["Waa"]
    Wya = parameters["Wya"]
    ba = parameters["ba"]
    by = parameters["by"]

    ### START CODE HERE ### (≈2 lines)
    # compute next activation state using the formula given above
    a_next = np.tanh(np.dot(Wax, xt) + np.dot(Waa, a_prev) + ba)
    # compute output of the current cell using the formula given above
    yt_pred = softmax(np.dot(Wya, a_next) + by)
    ### END CODE HERE ###

    # store values you need for backward propagation in cache
    cache = (a_next, a_prev, xt, parameters)

    return a_next, yt_pred, cache
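Note that rnn_cell_forward calls softmax, which is not a NumPy built-in; in the original assignment it is imported from a provided utility module. If you are running the snippet on its own, a minimal, numerically stable stand-in (my own sketch, not the course's exact utility) is:

import numpy as np

def softmax(z):
    # column-wise softmax over an array of shape (n_y, m);
    # subtracting the per-column max keeps np.exp from overflowing
    e = np.exp(z - np.max(z, axis=0, keepdims=True))
    return e / np.sum(e, axis=0, keepdims=True)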

Run the code above:

def rnn_cell_forward_tests(rnn_cell_forward):
    np.random.seed(1)
    xt_tmp = np.random.randn(3, 10)
    a_prev_tmp = np.random.randn(5, 10)
    parameters_tmp = {}
    parameters_tmp['Waa'] = np.random.randn(5, 5)
    parameters_tmp['Wax'] = np.random.randn(5, 3)
    parameters_tmp['Wya'] = np.random.randn(2, 5)
    parameters_tmp['ba'] = np.random.randn(5, 1)
    parameters_tmp['by'] = np.random.randn(2, 1)
    a_next_tmp, yt_pred_tmp, cache_tmp = rnn_cell_forward(xt_tmp, a_prev_tmp, parameters_tmp)
    print("a_next[4] = \n", a_next_tmp[4])
    print("a_next.shape = \n", a_next_tmp.shape)
    print("yt_pred[1] =\n", yt_pred_tmp[1])
    print("yt_pred.shape = \n", yt_pred_tmp.shape)

# UNIT TESTS
rnn_cell_forward_tests(rnn_cell_forward)

6. RNN Forward Pass

A recurrent neural network (RNN) is a repetition of the RNN cell that you've just built.

    ● If your input sequence of data is 10 time steps long, then you will re-use the RNN cell 10 times.

Each cell takes two inputs at each time step:

    ● a^{<t-1>}: the hidden state from the previous cell
    ● x^{<t>}: the current time step's input data

It has two outputs at each time step:

    ● a hidden state (a^{<t>})
    ● a prediction (\hat{y}^{<t>})

The weights and biases (W_{aa}, b_{a}, W_{ax}, b_{x}) are reused at each time step.

    ● They are maintained between calls to rnn_cell_forward in the 'parameters' dictionary.

Note: the code above does not actually mention b_{x}; only b_{a} and b_{y} appear.

# UNQ_C2 (UNIQUE CELL IDENTIFIER, DO NOT EDIT)
# GRADED FUNCTION: rnn_forward

def rnn_forward(x, a0, parameters):
    """
    Implement the forward propagation of the recurrent neural network described in Figure (3).

    Arguments:
    x -- Input data for every time-step, of shape (n_x, m, T_x).
    a0 -- Initial hidden state, of shape (n_a, m)
    parameters -- python dictionary containing:
        Waa -- Weight matrix multiplying the hidden state, numpy array of shape (n_a, n_a)
        Wax -- Weight matrix multiplying the input, numpy array of shape (n_a, n_x)
        Wya -- Weight matrix relating the hidden-state to the output, numpy array of shape (n_y, n_a)
        ba -- Bias, numpy array of shape (n_a, 1)
        by -- Bias relating the hidden-state to the output, numpy array of shape (n_y, 1)

    Returns:
    a -- Hidden states for every time-step, numpy array of shape (n_a, m, T_x)
    y_pred -- Predictions for every time-step, numpy array of shape (n_y, m, T_x)
    caches -- tuple of values needed for the backward pass, contains (list of caches, x)
    """
    # Initialize "caches" which will contain the list of all caches
    caches = []

    # Retrieve dimensions from shapes of x and parameters["Wya"]
    n_x, m, T_x = x.shape
    n_y, n_a = parameters["Wya"].shape

    ### START CODE HERE ###
    # initialize "a" and "y_pred" with zeros (≈2 lines)
    a = np.zeros((n_a, m, T_x))
    y_pred = np.zeros((n_y, m, T_x))

    # Initialize a_next (≈1 line)
    a_next = a0

    # loop over all time-steps
    for t in range(T_x):
        # Update next hidden state, compute the prediction, get the cache (≈1 line)
        a_next, yt_pred, cache = rnn_cell_forward(x[:, :, t], a_next, parameters)
        # Save the value of the new "next" hidden state in a (≈1 line)
        a[:, :, t] = a_next
        # Save the value of the prediction in y (≈1 line)
        y_pred[:, :, t] = yt_pred
        # Append "cache" to "caches" (≈1 line)
        caches.append(cache)
    ### END CODE HERE ###

    # store values needed for backward propagation in cache
    caches = (caches, x)

    return a, y_pred, caches

Run the code above:

def rnn_forward_test(rnn_forward):
    np.random.seed(1)
    x_tmp = np.random.randn(3, 10, 4)
    a0_tmp = np.random.randn(5, 10)
    parameters_tmp = {}
    parameters_tmp['Waa'] = np.random.randn(5, 5)
    parameters_tmp['Wax'] = np.random.randn(5, 3)
    parameters_tmp['Wya'] = np.random.randn(2, 5)
    parameters_tmp['ba'] = np.random.randn(5, 1)
    parameters_tmp['by'] = np.random.randn(2, 1)
    a_tmp, y_pred_tmp, caches_tmp = rnn_forward(x_tmp, a0_tmp, parameters_tmp)
    print("a[4][1] = \n", a_tmp[4][1])
    print("a.shape = \n", a_tmp.shape)
    print("y_pred[1][3] =\n", y_pred_tmp[1][3])
    print("y_pred.shape = \n", y_pred_tmp.shape)
    print("caches[1][1][3] =\n", caches_tmp[1][1][3])
    print("len(caches) = \n", len(caches_tmp))

# UNIT TEST
rnn_forward_test(rnn_forward)

7. Summary

You've successfully built the forward propagation of a recurrent network from scratch.

Situations when this RNN will perform better:

    ● This will work well enough for some applications, but it suffers from vanishing gradients.
    ● The RNN works best when each output \hat{y}^{<t>} can be estimated using "local" context.
    ● "Local" context refers to information that is close to the prediction's time step t.
    ● More formally, local context refers to inputs x^{<t_j>} and predictions \hat{y}^{<t>} where t_j is close to t.

What you should remember:

    ● The recurrent neural network, or RNN, is essentially the repeated use of a single cell.
    ● A basic RNN reads inputs one at a time, and remembers information through the hidden layer activations (hidden states) that are passed from one time step to the next.
        ■ The time-step dimension determines how many times to re-use the RNN cell.
    ● Each cell takes two inputs at each time step:
        ■ the hidden state from the previous cell
        ■ the current time step's input data
    ● Each cell has two outputs at each time step:
        ■ a hidden state
        ■ a prediction
