The previous post looked at some issues with neural-network classification boundaries. This time, let's see what kind of decision boundary a neural network can achieve on non-linear data. It has been proven that a multi-layer feed-forward network with a single hidden layer containing enough neurons can approximate any continuous function to arbitrary precision.
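(As a quick reminder, not quoted from the previous post: one standard statement of this result, due to Cybenko and Hornik, says that for any continuous $f$ on a compact set $K \subset \mathbb{R}^d$ and any $\varepsilon > 0$ there is a single-hidden-layer network

$$
F(x) = \sum_{i=1}^{N} v_i\,\sigma\!\left(w_i^{\top} x + b_i\right)
\qquad\text{with}\qquad
\sup_{x \in K}\bigl|F(x) - f(x)\bigr| < \varepsilon,
$$

where $\sigma$ is a sigmoid-type activation and the required number of hidden units $N$ depends on $f$ and $\varepsilon$.)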
To make the decision boundary easy to draw, I use two-dimensional features and two classes. For simplicity, the data are generated at random around sin x. [smile]
500 points are generated at random with sin x and split into two classes according to which side of the curve each point falls on. The data distribution looks like this:
500 points generated at random with sin x
Note: the red line in the figure is the true curve, $y = \sin x$.
[A gripe: how are you supposed to insert a short inline formula into the text in this markdown editor [sad]]
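The training script below reads the generated points from data-sinx-x.txt and data-sinx-y.txt, and the MATLAB script at the end reads a combined data-sinx.txt, but the post does not show how those files were produced. Here is a minimal sketch of one way to generate them, assuming uniformly sampled points labelled by whether they lie above or below sin x (the sampling ranges and the random seed are my own choices, not the author's):

```python
import numpy as np

np.random.seed(0)

n = 500
x1 = np.random.uniform(-2 * np.pi, 2 * np.pi, n)  # horizontal coordinate
x2 = np.random.uniform(-3.0, 3.0, n)              # vertical coordinate

# Class 1 if the point lies above sin(x1), class 0 otherwise.
labels = (x2 > np.sin(x1)).astype(int)

x_data = np.stack([x1, x2], axis=1)  # features, shape (500, 2)
y_data = np.eye(2)[labels]           # one-hot labels, shape (500, 2)

# File names match those loaded by the training script below ...
np.savetxt('data-sinx-x.txt', x_data)
np.savetxt('data-sinx-y.txt', y_data)
# ... and the combined file (x, y, class) loaded by the MATLAB plotting code.
np.savetxt('data-sinx.txt', np.column_stack([x1, x2, labels]))
```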
I again use TensorFlow to implement a simple three-layer neural network. With the number of hidden nodes set to 20, the decision boundary evolves as follows while the network parameters are updated:
Iteration 10,000
Iteration 30,000
Iteration 50,000
As the number of iterations grows, the fitted curve gets closer and closer to the true boundary.
Setting the number of hidden nodes to 500, the result at iteration 50,000 is:
Iteration 50,000
It looks almost the same as the figure above. However, an overly complex structure can bring problems such as over-fitting, and after increasing the number of hidden nodes from 20 to 500 the training time goes up noticeably: with the same configuration and 50,000 iterations, 2 hidden nodes took 26 seconds, 20 hidden nodes took 32 seconds, and 500 hidden nodes took 250 seconds.
TensorFlow training code
```python
import numpy as np
import tensorflow as tf

# Parameters
learning_rate = 0.5  # step size used by the gradient-descent optimizer below
batch_size = 100     # (unused: the whole data set is fed at every step)
display_step = 1     # (unused)
# model_path = "/home/lei/TensorFlow-Examples-master/examples/4_Utils/model.ckpt"

# Network Parameters
n_hidden_1 = 500  # number of hidden nodes (20 in the first experiment, 500 in the second)
n_input = 2       # two input features (x1, x2)
n_classes = 2     # two classes, one-hot encoded

# tf Graph input
xs = tf.placeholder("float", [None, n_input])
ys = tf.placeholder("float", [None, n_classes])

# Create model
def multilayer_perceptron(x, weights, biases):
    # Hidden layer with sigmoid activation
    layer_1 = tf.add(tf.matmul(x, weights['h1']), biases['b1'])
    layer_1 = tf.sigmoid(layer_1)
    # Output layer with sigmoid activation
    out_layer = tf.add(tf.matmul(layer_1, weights['out']), biases['out'])
    out_layer = tf.sigmoid(out_layer)
    return out_layer

# Store layers weight & bias
weights = {
    'h1': tf.Variable(tf.random_normal([n_input, n_hidden_1])),
    'out': tf.Variable(tf.random_normal([n_hidden_1, n_classes]))
}
biases = {
    'b1': tf.Variable(tf.random_normal([n_hidden_1])),
    'out': tf.Variable(tf.random_normal([n_classes]))
}

# Construct model
prediction = multilayer_perceptron(xs, weights, biases)

# x_data = np.array([[0,0],[1,0],[1,1],[0,2],[2,2]])
# y_data = np.array([[1, 0],[0, 1],[0, 1],[0, 1],[0,1]])
x_data = np.loadtxt('data-sinx-x.txt')
y_data = np.loadtxt('data-sinx-y.txt')
# x_data = np.linspace(-1,1,300)[:, np.newaxis]
# noise = np.random.normal(0, 0.05, x_data.shape)
# y_data = np.square(x_data) - 0.5 + noise

# Define the loss: mean squared error between the prediction and the real labels
loss = tf.reduce_mean(tf.reduce_sum(tf.square(ys - prediction),
                                    reduction_indices=[1]))

# Choose an optimizer that minimizes the loss
train_step = tf.train.GradientDescentOptimizer(learning_rate).minimize(loss)

# Important step: initialize all variables
# init = tf.initialize_all_variables()  # older TensorFlow API
init = tf.global_variables_initializer()
sess = tf.Session()
# Nothing defined above is actually computed until sess.run is called
sess.run(init)

# Train for N iterations; train_step and loss are built on placeholders,
# so the data are passed in through feed_dict
N = 50000
for i in range(N):
    sess.run(train_step, feed_dict={xs: x_data, ys: y_data})
    if i % 10000 == 0:
        print(i)
        # print(sess.run(loss, feed_dict={xs: x_data, ys: y_data}))
        # print(sess.run(weights), sess.run(biases))
        # Save intermediate weights so the boundary at this iteration can be plotted later
        w = sess.run(weights)
        b = sess.run(biases)
        w1 = np.transpose(w['h1'])
        b1 = b['b1']
        w2 = np.transpose(w['out'])
        b2 = b['out']
        np.savetxt("w1_" + str(i) + ".txt", w1)
        np.savetxt("b1_" + str(i) + ".txt", b1)
        np.savetxt("w2_" + str(i) + ".txt", w2)
        np.savetxt("b2_" + str(i) + ".txt", b2)

# Save the final weights after N iterations
w = sess.run(weights)
b = sess.run(biases)
w1 = np.transpose(w['h1'])
b1 = b['b1']
w2 = np.transpose(w['out'])
b2 = b['out']
np.savetxt("w1_" + str(N) + ".txt", w1)
np.savetxt("b1_" + str(N) + ".txt", b1)
np.savetxt("w2_" + str(N) + ".txt", w2)
np.savetxt("b2_" + str(N) + ".txt", b2)
```
MATLAB plotting code
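For reference, the curve that the nn(i) function below hands to ezplot is the zero level set of the difference between the two output units rebuilt from the saved weights:

$$
o(x) = W_2\,\sigma(W_1 x + b_1) + b_2,
\qquad
\text{decision boundary: } o_1(x) - o_2(x) = 0 .
$$

Dropping the output sigmoid (the commented-out line O = sigmoid(W2*Ksi + b2)) does not change the plotted curve: the sigmoid is strictly increasing, so $\sigma(o_1) > \sigma(o_2)$ exactly when $o_1 > o_2$, and the zero level set is the same.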
```matlab
function sinxborder()
    % Scatter the data points coloured by class and overlay the learned boundary
    load data-sinx.txt
    a = data_sinx(:,1);
    b = data_sinx(:,2);
    c = data_sinx(:,3);
    % syms x y
    % h = ezplot('sin(x)-y');
    % set(h,'Color','red')
    ezplot(nn(50000));
    hold on
    scatter(a,b,'filled', 'cdata', c);
end

function eq = nn(i)
    % Rebuild the network symbolically from the weights saved at iteration i;
    % the boundary is where the two output units are equal
    syms x y
    X = [x; y];
    W1 = load(['data/w1_', num2str(i), '.txt']);
    b1 = load(['data/b1_', num2str(i), '.txt']);
    W2 = load(['data/w2_', num2str(i), '.txt']);
    b2 = load(['data/b2_', num2str(i), '.txt']);
    Ksi = sigmoid(W1*X + b1);
    % O = sigmoid(W2*Ksi + b2);
    O = W2*Ksi + b2;
    eq = O(1,1) - O(2,1);
end

function A = sigmoid(x)
    A = 1./(1+exp(-x));
end
```