Learning TensorFlow: Recognizing MNIST with a CNN

This post does one thing only: it builds a CNN with TensorFlow to recognize the handwritten digits in MNIST.

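If you are already comfortable with this, you may want to read my other post:
Learning TensorFlow: building your own image dataset and recognizing it with a CNN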

TensorFlow

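TensorFlow™ is an open source software library for numerical computation using data flow graphs. Nodes in the graph represent mathematical operations, while the edges represent the multidimensional arrays (tensors) passed between those nodes. This flexible architecture lets you deploy computation to one or more CPUs or GPUs in a desktop, server, or mobile device through a single API. TensorFlow was originally developed by researchers and engineers on the Google Brain team within Google's Machine Intelligence research organization for machine learning and deep neural network research, but the system is general enough to be applied in many other domains as well. (Quoted from tensorflow.google.cn)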
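
To make the "data flow graph" idea concrete, here is a minimal pure-Python sketch. It is not how TensorFlow is implemented; the `Node` class and the `evaluate` function are hypothetical names invented for this illustration. Nodes stand for operations, and the values travelling along the edges stand in for tensors.

```python
# Toy data flow graph: a hypothetical illustration, not TensorFlow internals.
import operator


class Node:
    """An operation node; `inputs` are the incoming edges."""

    def __init__(self, op=None, inputs=(), value=None):
        self.op = op          # callable applied to the input values, or None for a constant
        self.inputs = inputs  # upstream nodes
        self.value = value    # only used by constant nodes


def evaluate(node):
    """Recursively evaluate a node by first evaluating its inputs."""
    if node.op is None:
        return node.value
    return node.op(*(evaluate(n) for n in node.inputs))


# Build the graph for (a + b) * c; nothing is computed until evaluate() is called.
a = Node(value=2.0)
b = Node(value=3.0)
c = Node(value=4.0)
total = Node(op=operator.add, inputs=(a, b))
product = Node(op=operator.mul, inputs=(total, c))

result = evaluate(product)
print(result)  # 20.0
```

Separating graph construction from evaluation is the same pattern the TensorFlow 1.x code in this post follows: the network is described first, and values only flow when `sess.run` is called.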

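In short:

TensorFlow is one of the most popular neural network frameworks today,
it is one of the frameworks with the most open-source code available on GitHub,
and it is backed by a team at Google,
so if you do research on neural networks, TensorFlow is well worth mastering.
There are plenty of TensorFlow tutorials online; here are a few.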

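The official getting-started Tutorials
Jikexueyuan (极客学院): Chinese translation of the official TensorFlow documentation
Jikexueyuan (极客学院): download link for the Chinese documentation

MNIST
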
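When learning to program, the first program is usually that great greeting, hello world!
In the TensorFlow world there is no "hello tensorflow!"; its place is taken by recognizing the handwritten digits of MNIST.
Put simply, MNIST is a packaged collection of images of handwritten digits, each paired with a label. As loaded through TensorFlow's input_data helper, it provides 55,000 training images with labels, 5,000 validation images, and 10,000 test images with labels. Four samples are shown in the figure below: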

[Figure: four sample MNIST digits (MNIST数据.png)]
In code, the dataset can be loaded with the following lines:

from tensorflow.examples.tutorials.mnist import input_data
# Load the data (labels are returned as one-hot vectors)
mnist = input_data.read_data_sets('MNIST_data', one_hot=True)

This downloads the data automatically. Afterwards, the MNIST_data directory next to the script contains four new files:

t10k-images-idx3-ubyte.gz
t10k-labels-idx1-ubyte.gz
train-images-idx3-ubyte.gz
train-labels-idx1-ubyte.gz

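This is the downloaded MNIST data. Below, we use the 55,000 training samples for supervised training and then test on images from the 10,000-sample test set.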
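
As a rough illustration of what `one_hot=True` and the flattened image format mean, the NumPy sketch below uses made-up labels and a zero-filled stand-in image rather than real MNIST data. The helper name `to_one_hot` is hypothetical and not part of TensorFlow.

```python
import numpy as np


def to_one_hot(labels, num_classes=10):
    """Turn integer class labels into one-hot rows (hypothetical helper)."""
    labels = np.asarray(labels)
    one_hot = np.zeros((labels.size, num_classes), dtype=np.float32)
    one_hot[np.arange(labels.size), labels] = 1.0
    return one_hot


# Made-up labels, standing in for real MNIST labels.
labels = [5, 0, 4]
encoded = to_one_hot(labels)
print(encoded.shape)               # (3, 10)
print(np.argmax(encoded, axis=1))  # [5 0 4], argmax recovers the digit

# Each image arrives as a flat vector of 28 * 28 = 784 floats;
# reshaping restores the 2-D picture (plus a channel axis for the CNN).
flat_image = np.zeros(784, dtype=np.float32)  # stand-in for one MNIST image
image = flat_image.reshape(28, 28, 1)
print(image.shape)                 # (28, 28, 1)
```

This is why the network later uses placeholders of width 784 and 10, and why accuracy is computed by comparing the argmax of the prediction with the argmax of the label.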

CNN

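For the fundamentals of CNNs there is a very good blog post; the author's whole series is excellent and recommended reading. I will not repeat the material here, so head over to that blog if you want the background.
零基础入门深度学习(4) - 卷积神经网络 (Deep Learning from Zero, Part 4: Convolutional Neural Networks)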
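
For readers who only need the shape bookkeeping, here is a small sketch of how 'SAME' padding and 2x2 pooling change the spatial size of a 28x28 image. The function names are hypothetical; the rule that 'SAME' padding yields ceil(input / stride) outputs follows TensorFlow's documented padding behaviour.

```python
import math

import numpy as np


def same_output_size(size, stride):
    """Spatial output size under 'SAME' padding: ceil(size / stride)."""
    return math.ceil(size / stride)


def max_pool_2x2_numpy(feature_map):
    """2x2 max pooling with stride 2 on a (height, width) array with even sides."""
    h, w = feature_map.shape
    blocks = feature_map.reshape(h // 2, 2, w // 2, 2)
    return blocks.max(axis=(1, 3))


size = 28
size = same_output_size(size, 1)  # convolution, stride 1 -> 28
size = same_output_size(size, 2)  # pooling, stride 2     -> 14
size = same_output_size(size, 1)  # convolution, stride 1 -> 14
size = same_output_size(size, 2)  # pooling, stride 2     -> 7
print(size, size * size * 64)     # 7 3136

demo = np.arange(16).reshape(4, 4)
print(max_pool_2x2_numpy(demo))
# [[ 5  7]
#  [13 15]]
```

With 64 feature maps after the second pooling step, the flattened vector has 7 * 7 * 64 = 3136 entries, which is where the size of the first fully connected layer comes from.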

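Implementation

The code below builds a CNN to recognize the MNIST data.
Most of the code is commented. If anything in the post is unclear, please leave a comment; if anything is wrong, corrections are welcome.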

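import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data

# TensorFlow 1.x ships a ready-made loader for MNIST.
# Arrays such as mnist.test.images come back as NumPy arrays that can be fed directly.
# The next post uses custom data instead.
# Load the data
mnist = input_data.read_data_sets('MNIST_data', one_hot=True)
# The line above downloads the files into the MNIST_data folder next to this script;
# if they are already there, the download is skipped.

# Compute accuracy; used during testing
def compute_accuracy(v_xs, v_ys):  # test samples and their labels
    # prediction, sess, xs, ys and keep_prob are module-level names defined further down.
    # They are only read here, so no `global` declaration is needed.
    # Run the network with dropout disabled (keep_prob = 1) to get the predictions
    y_pre = sess.run(prediction, feed_dict={xs: v_xs, keep_prob: 1})
    # argmax(a, axis) returns the index of the largest value along the given axis, e.g.
    #   test = np.array([[1, 2, 3], [2, 3, 4], [5, 4, 3], [8, 7, 2]])
    #   np.argmax(test, 0)   # array([3, 3, 1])
    #   np.argmax(test, 1)   # array([2, 2, 0, 0])
    # Note: the ops below are added to the graph again on every call.
    correct_prediction = tf.equal(tf.argmax(y_pre, 1), tf.argmax(v_ys, 1))
    # tf.cast converts the boolean tensor to float32 so it can be averaged
    accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
    result = sess.run(accuracy, feed_dict={xs: v_xs, ys: v_ys, keep_prob: 1})
    return result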


# Create a weight variable
def weight_variable(shape):
    # Truncated normal initialization; stddev is the standard deviation
    initial = tf.truncated_normal(shape, stddev=0.1)
    return tf.Variable(initial)


# Create a bias variable
def bias_variable(shape):
    # Start from a small positive constant
    initial = tf.constant(0.1, shape=shape)
    return tf.Variable(initial)


# Convolution
def conv2d(x, W):
    # strides = [1, stride_height, stride_width, 1] for the default NHWC layout
    # tf.nn.conv2d(input, filter, strides, padding, ...)
    return tf.nn.conv2d(x, W, strides=[1, 1, 1, 1], padding='SAME')


# 2x2 max pooling
def max_pool_2x2(x):
    return tf.nn.max_pool(x, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME')


# Define the placeholders
# read_data_sets already scales pixel values to [0, 1], so no division by 255 is needed
xs = tf.placeholder(tf.float32, [None, 784])
ys = tf.placeholder(tf.float32, [None, 10])
keep_prob = tf.placeholder(tf.float32)  # dropout keep probability, used against overfitting
x_image = tf.reshape(xs, [-1, 28, 28, 1])

# Layer 1: convolution + pooling
W_conv1 = weight_variable([5, 5, 1, 32])
b_conv1 = bias_variable([32])
# Convolution
h_conv1 = tf.nn.relu(conv2d(x_image, W_conv1) + b_conv1)  # 28*28*32
# Pooling
h_pool1 = max_pool_2x2(h_conv1)  # 14*14*32

# Layer 2: convolution + pooling
W_conv2 = weight_variable([5, 5, 32, 64])
b_conv2 = bias_variable([64])
# Convolution
h_conv2 = tf.nn.relu(conv2d(h_pool1, W_conv2) + b_conv2)  # 14*14*64
# Pooling
h_pool2 = max_pool_2x2(h_conv2)  # 7*7*64

# Layer 3: fully connected
# If you change the image size, update the 7 here: it comes from the
# convolution and pooling steps above (28 -> 14 -> 7).
W_fc1 = weight_variable([7 * 7 * 64, 1024])
b_fc1 = bias_variable([1024])
# [n_samples, 7, 7, 64] ->> [n_samples, 7*7*64]
h_pool2_flat = tf.reshape(h_pool2, [-1, 7 * 7 * 64])
h_fc1 = tf.nn.relu(tf.matmul(h_pool2_flat, W_fc1) + b_fc1)
h_fc1_drop = tf.nn.dropout(h_fc1, keep_prob)  # dropout against overfitting
# Layer 4: fully connected output layer
W_fc2 = weight_variable([1024, 10])
b_fc2 = bias_variable([10])
prediction = tf.nn.softmax(tf.matmul(h_fc1_drop, W_fc2) + b_fc2)

# Loss: cross-entropy between the one-hot labels and the softmax output
cross_entropy = tf.reduce_mean(-tf.reduce_sum(ys * tf.log(prediction), axis=1))
# Optimize with Adam (learning rate 1e-4)
train_step = tf.train.AdamOptimizer(1e-4).minimize(cross_entropy)
sess = tf.Session()
# Initialize the variables
init = tf.global_variables_initializer()
sess.run(init)
# Training: each iteration is one mini-batch step of 100 images, not a full epoch
NUM_STEPS = 1000
for step in range(NUM_STEPS):
    batch_xs, batch_ys = mnist.train.next_batch(100)
    sess.run(train_step, feed_dict={xs: batch_xs, ys: batch_ys, keep_prob: 0.5})
    if step % 50 == 0:
        # Evaluate on the first 1000 test images
        accuracy = compute_accuracy(
            mnist.test.images[:1000], mnist.test.labels[:1000])
        print("step: %d  acc: %f" % (step + 1, accuracy))
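
One caveat about the loss in the listing above: `tf.log(prediction)` can return `-inf` if a softmax output underflows to 0, which turns the loss into NaN. As a sketch of the usual remedy, the NumPy code below computes the same cross-entropy directly from the raw pre-softmax outputs (the logits) with the log-sum-exp trick, together with the argmax-based accuracy. The function names and the toy numbers are assumptions made for illustration. In TensorFlow 1.x, `tf.nn.softmax_cross_entropy_with_logits` plays the same role when it is given the pre-softmax outputs.

```python
import numpy as np


def softmax_cross_entropy(logits, one_hot_labels):
    """Mean cross-entropy computed from logits in a numerically stable way."""
    shifted = logits - logits.max(axis=1, keepdims=True)  # avoid overflow in exp
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return float(-(one_hot_labels * log_probs).sum(axis=1).mean())


def accuracy(logits, one_hot_labels):
    """Fraction of rows whose largest logit matches the label."""
    return float(np.mean(np.argmax(logits, axis=1) == np.argmax(one_hot_labels, axis=1)))


# Toy batch of two samples and three classes (made-up numbers).
logits = np.array([[2.0, 1.0, 0.1],
                   [1000.0, 0.0, -1000.0]])  # extreme values would break a naive log(softmax)
labels = np.array([[1.0, 0.0, 0.0],
                   [0.0, 1.0, 0.0]])

loss = softmax_cross_entropy(logits, labels)
acc = accuracy(logits, labels)
print(np.isfinite(loss), acc)  # True 0.5
```

The second row is confidently wrong, so its loss is large but still finite, whereas taking the log of an underflowed softmax probability would have produced infinity.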

Fate brought you here; let's stay connected.