Deep Learning: Recognizing Cats and Dogs with TensorFlow
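For the past few days I have spent time with my cat on a regular daily schedule, turning over all sorts of questions in my head and growing rather worried in the process. The years have left fine wrinkles on my face and a few strands of silver at my forehead, and there is a touch of melancholy between my brows, yet I have held on to that unruly, free-spirited way of living.
Cats vs. dogs is a classic case in deep-learning image classification. Cats and dogs differ visibly in body size, limb length, facial features, and fur, and the human eye tells them apart easily from these outward traits.
So how do we get a machine to recognize cats and dogs? This is where a convolutional neural network comes in.
Plenty of people online have already written up this example, so I am giving it a try as practice too.
Full code: https://github.com/ADlead/Dogs-Cats.git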
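1. Preparing the dataset
The cat and dog photos can be downloaded directly from the Kaggle site (https://www.kaggle.com/c/dogs-vs-cats). After unzipping, you will find a training dataset and a test dataset.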

The train folder looks like this:

The test1 folder looks like this:

Write code to store the images as batch files:
import cv2 as cv
import os
import numpy as np
import random
import pickle
import time

start_time = time.time()
data_dir = './data'
batch_save_path = './batch_files'
# Create the folder that will hold the batch files
os.makedirs(batch_save_path, exist_ok=True)
# All images are resized to 100 x 100
# Training set, 20000 images: 100 batch files of 200 images each
# Validation set, 5000 images: one test file, read as 100 batches of 50 images at test time
# List the image files in the training directory
all_data_files = os.listdir(os.path.join(data_dir, 'train/'))
# print(all_data_files)
# Shuffle the file order
random.shuffle(all_data_files)
all_train_files = all_data_files[:20000]
all_test_files = all_data_files[20000:]
train_data = []
train_label = []
train_filenames = []
test_data = []
test_label = []
test_filenames = []
# Training set
for each in all_train_files:
    # Flag 1 loads the image as 3-channel color (OpenCV uses BGR order)
    img = cv.imread(os.path.join(data_dir, 'train/', each), 1)
    resized_img = cv.resize(img, (100, 100))
    img_data = np.array(resized_img)
    train_data.append(img_data)
    # The label comes from the file name: 0 = cat, 1 = dog
    if 'cat' in each:
        train_label.append(0)
    elif 'dog' in each:
        train_label.append(1)
    else:
        raise Exception('%s is wrong train file' % (each))
    train_filenames.append(each)
# Test set: the held-out 5000 images, also read from the labeled train folder
for each in all_test_files:
    img = cv.imread(os.path.join(data_dir, 'train/', each), 1)
    resized_img = cv.resize(img, (100, 100))
    img_data = np.array(resized_img)
    test_data.append(img_data)
    if 'cat' in each:
        test_label.append(0)
    elif 'dog' in each:
        test_label.append(1)
    else:
        raise Exception('%s is wrong test file' % (each))
    test_filenames.append(each)
print(len(train_data), len(test_data))
# Create the 100 batch files
start = 0
end = 200
for num in range(1, 101):
    batch_data = train_data[start: end]
    batch_label = train_label[start: end]
    batch_filenames = train_filenames[start: end]
    batch_name = 'training batch {} of 100'.format(num)
    all_data = {
        'data': batch_data,
        'label': batch_label,
        'filenames': batch_filenames,
        'name': batch_name
    }
    with open(os.path.join(batch_save_path, 'train_batch_{}'.format(num)), 'wb') as f:
        pickle.dump(all_data, f)
    start += 200
    end += 200
# Create the test file
all_test_data = {
    'data': test_data,
    'label': test_label,
    'filenames': test_filenames,
    'name': 'test batch 1 of 1'
}
with open(os.path.join(batch_save_path, 'test_batch'), 'wb') as f:
    pickle.dump(all_test_data, f)
end_time = time.time()
print('Done, took {} seconds'.format(end_time - start_time))
After running the program, the batch files are ready.
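To show how these batch files can be read back, here is a minimal sketch of a loader that unpickles the dictionaries and serves mini-batches. The class name `BatchData`, the helper `train_batch_paths`, and the exact `next_batch` behavior are assumptions, modeled on the `next_batch(TRAIN_SIZE)` calls used in the training code of this post; the loader in the linked repository may well differ.

```python
import os
import pickle
import random


def train_batch_paths(batch_dir='./batch_files', count=100):
    """Paths of train_batch_1 .. train_batch_<count> (hypothetical helper)."""
    return [os.path.join(batch_dir, 'train_batch_{}'.format(n))
            for n in range(1, count + 1)]


class BatchData:
    """Hypothetical loader for the pickled batch files (not the author's code)."""

    def __init__(self, file_paths, shuffle=True):
        self.data, self.labels, self.filenames = [], [], []
        for path in file_paths:
            with open(path, 'rb') as f:
                batch = pickle.load(f)
            self.data.extend(batch['data'])
            self.labels.extend(batch['label'])
            self.filenames.extend(batch['filenames'])
        self.shuffle = shuffle
        self.order = list(range(len(self.data)))
        self.cursor = 0
        if self.shuffle:
            random.shuffle(self.order)

    def next_batch(self, batch_size):
        """Return (data, labels, filenames) for the next batch_size samples."""
        if batch_size > len(self.order):
            raise ValueError('batch_size is larger than the dataset')
        if self.cursor + batch_size > len(self.order):
            # Not enough samples left in this pass: start a new pass
            # (the few leftover samples of the old pass are skipped)
            self.cursor = 0
            if self.shuffle:
                random.shuffle(self.order)
        idx = self.order[self.cursor:self.cursor + batch_size]
        self.cursor += batch_size
        return ([self.data[i] for i in idx],
                [self.labels[i] for i in idx],
                [self.filenames[i] for i in idx])
```

With the files produced above, something like `BatchData(train_batch_paths())` would serve the training set and `BatchData(['./batch_files/test_batch'], shuffle=False)` the held-out set. Returning three values keeps the file names available, which is handy later when sorting images by predicted class.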

2. Writing the neural network
The CNN is written as follows, with code for the convolutional, pooling, and fully connected layers:
conv1_1 = tf.layers.conv2d(x, 16, (3, 3), padding='same', activation=tf.nn.relu, name='conv1_1')
conv1_2 = tf.layers.conv2d(conv1_1, 16, (3, 3), padding='same', activation=tf.nn.relu, name='conv1_2')
pool1 = tf.layers.max_pooling2d(conv1_2, (2, 2), (2, 2), name='pool1')
conv2_1 = tf.layers.conv2d(pool1, 32, (3, 3), padding='same', activation=tf.nn.relu, name='conv2_1')
conv2_2 = tf.layers.conv2d(conv2_1, 32, (3, 3), padding='same', activation=tf.nn.relu, name='conv2_2')
pool2 = tf.layers.max_pooling2d(conv2_2, (2, 2), (2, 2), name='pool2')
conv3_1 = tf.layers.conv2d(pool2, 64, (3, 3), padding='same', activation=tf.nn.relu, name='conv3_1')
conv3_2 = tf.layers.conv2d(conv3_1, 64, (3, 3), padding='same', activation=tf.nn.relu, name='conv3_2')
pool3 = tf.layers.max_pooling2d(conv3_2, (2, 2), (2, 2), name='pool3')
conv4_1 = tf.layers.conv2d(pool3, 128, (3, 3), padding='same', activation=tf.nn.relu, name='conv4_1')
conv4_2 = tf.layers.conv2d(conv4_1, 128, (3, 3), padding='same', activation=tf.nn.relu, name='conv4_2')
pool4 = tf.layers.max_pooling2d(conv4_2, (2, 2), (2, 2), name='pool4')
flatten = tf.layers.flatten(pool4)
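fc1 = tf.layers.dense(flatten, 512, tf.nn.relu)
fc1_dropout = tf.nn.dropout(fc1, keep_prob=keep_prob)
# Feed the dropout outputs forward; otherwise dropout is computed but never used
fc2 = tf.layers.dense(fc1_dropout, 256, tf.nn.relu)
fc2_dropout = tf.nn.dropout(fc2, keep_prob=keep_prob)
# The last layer outputs raw logits for the two classes (0 = cat, 1 = dog)
fc3 = tf.layers.dense(fc2_dropout, 2, None)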
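As a sanity check on the layer sizes, the feature-map shape can be traced by hand. The sketch below assumes a 100 x 100 input (the size used when the batch files were made) and relies on two TensorFlow 1.x behaviors: stride-1 convolutions with `padding='same'` keep the spatial size, and `tf.layers.max_pooling2d` defaults to `padding='valid'`, so a 2 x 2 pool with stride 2 rounds odd sizes down. The helper names are made up for illustration.

```python
def pool_out(size, pool=2, stride=2):
    """Output size of a 'valid' max-pooling layer along one dimension."""
    return (size - pool) // stride + 1


def trace_shapes(image_size=100, filters=(16, 32, 64, 128)):
    """Return (height, width, channels) after each of the four pooling stages."""
    shapes = []
    size = image_size
    for channels in filters:
        # The two 'same' 3x3 convolutions leave the spatial size unchanged,
        # so only the pooling layer shrinks the feature map
        size = pool_out(size)
        shapes.append((size, size, channels))
    return shapes


shapes = trace_shapes()
h, w, c = shapes[-1]
flatten_units = h * w * c
fc1_params = flatten_units * 512 + 512  # weights plus biases of the first dense layer

print(shapes)         # [(50, 50, 16), (25, 25, 32), (12, 12, 64), (6, 6, 128)]
print(flatten_units)  # 4608
print(fc1_params)     # 2359808
```

Under these assumptions the first fully connected layer holds most of the weights in the network, which is one reason dropout is applied around the dense layers.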
3. Building the TensorFlow computation graph
Next, build the TensorFlow computation graph, starting with the placeholders for the inputs:
self.x = tf.placeholder(tf.float32, [None, IMAGE_SIZE, IMAGE_SIZE, 3], 'input_data')
self.y = tf.placeholder(tf.int64, [None], 'output_data')
self.keep_prob = tf.placeholder(tf.float32)
# Feed the images through the network
fc = self.conv_net(self.x, self.keep_prob)
self.loss = tf.losses.sparse_softmax_cross_entropy(labels=self.y, logits=fc)
self.y_ = tf.nn.softmax(fc)  # probability of each class
self.predict = tf.argmax(fc, 1)
self.acc = tf.reduce_mean(tf.cast(tf.equal(self.predict, self.y), tf.float32))
self.train_op = tf.train.AdamOptimizer(LEARNING_RATE).minimize(self.loss)
self.saver = tf.train.Saver(max_to_keep=1)
The saver on the last line is used to save the trained model to disk.
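To make the loss and accuracy ops concrete, here is a small pure-Python sketch of what they compute for one batch. It is a re-derivation for illustration, not TensorFlow's implementation. It assumes the default behavior of `tf.losses.sparse_softmax_cross_entropy` with uniform weights, which averages the per-example losses, and the function names are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over one row of logits."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sparse_softmax_cross_entropy(labels, logits):
    """Mean of -log(probability of the true class) over the batch."""
    losses = [-math.log(softmax(row)[label])
              for label, row in zip(labels, logits)]
    return sum(losses) / len(losses)


def accuracy(labels, logits):
    """Fraction of rows whose largest logit sits at the label index."""
    predictions = [row.index(max(row)) for row in logits]
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)


# Two made-up examples: label 0 = cat, label 1 = dog
logits = [[2.0, 0.0], [0.0, 0.0]]
labels = [0, 1]
print(softmax(logits[0]))
print(sparse_softmax_cross_entropy(labels, logits))
print(accuracy(labels, logits))
```

The labels are plain integers rather than one-hot vectors, which is why the graph uses the sparse variant of the loss and an `int64` placeholder for `self.y`.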
4. Training and testing the model
Then write the training code; training runs for 10,000 steps:
acc_list = []
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for i in range(TRAIN_STEP):
        train_data, train_label, _ = self.batch_train_data.next_batch(TRAIN_SIZE)
        eval_ops = [self.loss, self.acc, self.train_op]
        eval_ops_results = sess.run(eval_ops, feed_dict={
            self.x: train_data,
            self.y: train_label,
            self.keep_prob: 0.7  # keep 70% of the units while training
        })
        loss_val, train_acc = eval_ops_results[0:2]
        acc_list.append(train_acc)
        # Log the training metrics every 100 steps
        if (i+1) % 100 == 0:
            acc_mean = np.mean(acc_list)
            print('step:{0},loss:{1:.5},acc:{2:.5},acc_mean:{3:.5}'.format(
                i+1, loss_val, train_acc, acc_mean
            ))
        # Evaluate on the held-out set every 1000 steps
        if (i+1) % 1000 == 0:
            test_acc_list = []
            for j in range(TEST_STEP):
                test_data, test_label, _ = self.batch_test_data.next_batch(TRAIN_SIZE)
                acc_val = sess.run(self.acc, feed_dict={
                    self.x: test_data,
                    self.y: test_label,
                    self.keep_prob: 1.0  # no dropout at test time
                })
                test_acc_list.append(acc_val)
            print('[Test ] step:{0}, mean_acc:{1:.5}'.format(
                i+1, np.mean(test_acc_list)
            ))
    # Save the trained model (inside the session, after the training loop)
    os.makedirs(SAVE_PATH, exist_ok=True)
    self.saver.save(sess, os.path.join(SAVE_PATH, 'my_model.ckpt'))
The training output looks like this:

After 10,000 training steps, the model's mean test accuracy is 0.82.
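One detail worth keeping in mind when reading the log: `acc_list` is never cleared, so the `acc_mean` printed every 100 steps is the mean over all training steps so far, including the low accuracies from the very start. If a more recent picture is wanted, one option (an assumption, not what the training code does) is to average only the last 100 steps with a bounded deque. The accuracy values below are made up for illustration.

```python
from collections import deque


def running_means(accuracies, window=100):
    """Return (cumulative_mean, windowed_mean) after the last step."""
    recent = deque(maxlen=window)  # keeps only the newest `window` values
    total = 0.0
    for acc in accuracies:
        total += acc
        recent.append(acc)
    return total / len(accuracies), sum(recent) / len(recent)


# Made-up curve: 100 steps at 0.5 accuracy followed by 100 steps at 0.9
fake_acc = [0.5] * 100 + [0.9] * 100
cumulative, windowed = running_means(fake_acc)
print(round(cumulative, 3), round(windowed, 3))
```

The cumulative mean lags behind the model's current performance, while the windowed mean tracks the most recent steps.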
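5. Recognition and classification
Finally, with training complete, the model is run over the whole test dataset, and each image is saved into a preset folder according to its predicted class.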
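The sorting step can be sketched as follows. This is a minimal illustration, not the code from the repository: the function name `sort_by_prediction`, the `cat` and `dog` folder names, and the input format (a list of image path and predicted label pairs) are all assumptions, with label 0 meaning cat and 1 meaning dog as in the batch files.

```python
import os
import shutil

# Assumed mapping from predicted label to output folder name
CLASS_NAMES = {0: 'cat', 1: 'dog'}


def sort_by_prediction(predictions, output_dir):
    """Copy each image into output_dir/<class name>/ based on its predicted label.

    predictions: iterable of (image_path, label) pairs, label in {0, 1}.
    Returns the number of images copied per class.
    """
    counts = {name: 0 for name in CLASS_NAMES.values()}
    for image_path, label in predictions:
        class_name = CLASS_NAMES[label]
        target_dir = os.path.join(output_dir, class_name)
        os.makedirs(target_dir, exist_ok=True)
        shutil.copy(image_path, target_dir)
        counts[class_name] += 1
    return counts
```

The labels would come from evaluating `self.predict` on the resized test images with `keep_prob` set to 1.0 after restoring the saved checkpoint. Copying rather than moving leaves the original test folder untouched, so the step can be rerun after retraining.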
Images classified as dogs:

Images classified as cats:

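As you can see, a small number of images are misclassified. The model still has room for improvement, for example by tuning the network weights, refining the training strategy, and applying data augmentation.
After all that tinkering, my hands and feet are a bit tired. Time to pet the cat and go back to slacking off.
For more content, come learn together on the WeChat official account [ 浪学 ].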
