GoogLeNet-Inception: Introduction and Practice

Posted by Fangjuntao on 2022-01-03

==Main reference:== https://blog.csdn.net/abc13526222160/article/details/95472241

Code reference:
https://github.com/calmisential/InceptionV4_TensorFlow2

A Brief Introduction to GoogLeNet

GoogLeNet-Inception V1 (2014)

  1. Why Inception was proposed
    The simplest, most brute-force way to improve a network is to increase its depth and width, i.e. add hidden layers and more neurons per layer. But this brute-force approach has problems:

    ① Too many parameters: with limited training data, the model easily overfits;
    ② The larger the network, the higher the computational cost, making it hard to deploy;
    ③ The deeper the network, the more easily gradients vanish as they propagate backward, making the model hard to optimize (BN had not been proposed yet, so optimization was extremely difficult).

    Hence the goal: improve the utilization of computational resources by increasing the network's width and depth while keeping the amount of computation constant. The authors argue that the way out is to replace full connections with sparse ones, in the convolutional layers as well. But irregular sparse computation is numerically inefficient, since hardware is optimized for dense matrices. So we want to find an optimal local sparse structure that a convolutional network can approximate, and that can be implemented with existing dense-matrix hardware. The result is Inception.

  2. The Inception module
    image
    (b) improves on (a) by using 1×1 convolutions for dimensionality reduction:

  • ① Reduce the number of channels, relieving the computational bottleneck.
  • ② Add network layers, increasing the network's expressive power.
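As a back-of-the-envelope illustration of point ①, here is the parameter count of a 5×5 convolution with and without a 1×1 bottleneck. The channel counts below are made-up round numbers for illustration, not taken from the paper:

```python
# Parameters of a 5x5 conv from 192 channels to 32 channels,
# with and without a 1x1 bottleneck (biases ignored).
# Channel counts here are illustrative, not from the paper.
c_in, c_out, c_mid = 192, 32, 16

direct = 5 * 5 * c_in * c_out                              # plain 5x5 conv
bottleneck = 1 * 1 * c_in * c_mid + 5 * 5 * c_mid * c_out  # 1x1 reduce, then 5x5

print(direct, bottleneck)  # 153600 15872 -> roughly a 10x reduction
```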
  3. The GoogLeNet-Inception V1 architecture
    image

GoogLeNet-Inception V2

The main contribution of this paper is Batch Normalization; it also slightly refines the Inception module.

  1. Batch Normalization
    1. The algorithm
    image

    2. The essence of BN:
      As I understand it, BN's main effects are:
    • ① Speeding up network training
    • ② Preventing vanishing gradients

    image
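A minimal NumPy sketch of the normalization step in the algorithm above (gamma and beta are the learned scale and shift; the running statistics used at inference time and the backward pass are omitted):

```python
import numpy as np

def batch_norm(x, gamma=1.0, beta=0.0, eps=1e-5):
    # normalize each feature over the batch dimension, then scale and shift
    mu = x.mean(axis=0)
    var = x.var(axis=0)
    x_hat = (x - mu) / np.sqrt(var + eps)
    return gamma * x_hat + beta

rng = np.random.default_rng(0)
x = rng.normal(loc=3.0, scale=5.0, size=(64, 8))  # shifted, badly scaled activations
y = batch_norm(x)
print(round(y.mean(), 4), round(y.std(), 4))  # close to 0 and 1
```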

  2. The Inception V2 structure:
    Larger kernels bring larger receptive fields but also more parameters: a 5x5 kernel has 25/9 ≈ 2.78× as many parameters as a 3x3 kernel. The authors therefore propose replacing a single 5x5 convolutional layer with a small network of two consecutive 3x3 convolutional layers (stride = 1); this is the Inception V2 structure. The same idea appears in the VGG paper. This substitution has two advantages:

    • ① Fewer parameters while keeping the same receptive field
    • ② Stronger non-linear expressive power
      image
      image
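Both claims are easy to verify with a quick count (assuming, for illustration, the same number of channels `c` into and out of every layer, biases ignored):

```python
# Replacing one 5x5 conv with two stacked 3x3 convs (stride 1),
# assuming c channels in and out at every step (illustrative).
c = 64
params_5x5 = 5 * 5 * c * c
params_two_3x3 = 2 * (3 * 3 * c * c)
print(params_5x5 / params_two_3x3)  # 25/18, about 1.39x fewer parameters

# Receptive field: each extra 3x3 (stride 1) layer grows it by 2,
# so two 3x3 layers cover 1 + 2 + 2 = 5, same as a single 5x5.
rf = 1 + 2 * (3 - 1)
print(rf)  # 5
```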

GoogLeNet-Inception V3

Any large kernel can be replaced by a series of 3x3 convolutions; can we decompose it even further? Inception V2 replaced the 5x5 kernel with two 3x3 kernels.
++Another approach is to replace any nxn convolution with a 1xn convolution followed by an nx1 convolution, lowering the computation further++. However, this second factorization performs poorly on large feature maps; it works well on feature maps of size roughly 12 to 20. The asymmetric factorization has several advantages:

  • ① Saves a large number of parameters
  • ② Adds an extra layer of non-linearity, improving the model's expressive power
  • ③ Can handle richer spatial features, increasing feature diversity
    image
    image
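The savings from the asymmetric factorization are also easy to count (again with made-up channel sizes, biases ignored):

```python
# An n x n conv vs a 1 x n conv followed by an n x 1 conv,
# with c channels throughout (illustrative numbers).
n, c = 7, 64
full = n * n * c * c
factored = (1 * n * c * c) + (n * 1 * c * c)
print(factored / full)  # 2/n = 2/7, roughly 71% fewer parameters
```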

GoogLeNet-Inception V4

  1. This paper has no formulas; it is diagrams of network structures from start to finish. The main idea is simple: Inception performs well, and the wildly popular ResNet performs well too, so find a way to combine them.

image
image

PS: ==Note that the Inception-v4 network itself does NOT combine Inception with ResNet==; Inception-ResNet is the variant that combines the two.

Inception-v4:
image

Inception-ResNet modules
image

  2. A few takeaways the authors distilled from their experiments:

    • ① Residual connections: the authors argue that residual connections are not essential for deep networks (PS: the ResNet authors call residual connections the standard for deep networks); networks without them are not hard to train, thanks to good initialization and Batch Normalization, but residual connections do greatly speed up training.

    • ② Residual Inception Block:
      image
      In the circled part, the 1×1 convolutional layer has no activation function; its role is dimension matching.

    • ③ Scaling of the residuals: when the number of filters exceeds 1000, problems appear and the network can "die": everything before the average pooling layer becomes 0. Lowering the learning rate or adding BN layers does not help. Scaling down the residuals before the activation keeps training stable, as in the figure below:
      image

    • Why accuracy improves: residual connections only speed up convergence; what really improves accuracy is still "larger network scale".
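Point ③ amounts to a one-line change at the merge point. A NumPy stand-in for the actual TF layers (`scale` in the 0.1-0.3 range is what the paper reports as stabilizing):

```python
import numpy as np

def scaled_residual_add(shortcut, residual, scale=0.1):
    # scale the residual branch down before adding it to the shortcut,
    # so very wide blocks (>1000 filters) do not blow up the activations
    return shortcut + scale * residual

shortcut = np.ones((2, 4))
residual = np.full((2, 4), 10.0)  # an unusually large residual activation
out = scaled_residual_add(shortcut, residual)
print(out[0, 0])  # 1 + 0.1 * 10 = 2.0
```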

TensorFlow 2.0 in Practice: GoogLeNet_inception_V4

https://github.com/calmisential/InceptionV4_TensorFlow2/blob/master/inception_modules.py

  1. inception_modules.py
import tensorflow as tf


class BasicConv2D(tf.keras.layers.Layer):
    """Conv2D -> BatchNorm -> ReLU."""
    def __init__(self, filters, kernel_size, strides, padding):
        super(BasicConv2D, self).__init__()
        self.conv = tf.keras.layers.Conv2D(filters=filters, kernel_size=kernel_size,
                                           strides=strides, padding=padding)
        self.bn = tf.keras.layers.BatchNormalization()

    def call(self, inputs, training=None, **kwargs):
        x = self.conv(inputs)
        x = self.bn(x, training=training)
        x = tf.nn.relu(x)
        return x

class Stem(tf.keras.layers.Layer):
    def __init__(self):
        super(Stem, self).__init__()
        self.conv1 = BasicConv2D(filters=32, kernel_size=(3, 3), strides=2, padding="valid")
        self.conv2 = BasicConv2D(filters=32, kernel_size=(3, 3), strides=1, padding="valid")
        self.conv3 = BasicConv2D(filters=64, kernel_size=(3, 3), strides=1, padding="same")
        self.b1_maxpool = tf.keras.layers.MaxPool2D(pool_size=(3, 3), strides=2, padding="valid")
        self.b2_conv = BasicConv2D(filters=96, kernel_size=(3, 3), strides=2, padding="valid")
        self.b3_conv1 = BasicConv2D(filters=64, kernel_size=(1, 1), strides=1, padding="same")
        self.b3_conv2 = BasicConv2D(filters=96, kernel_size=(3, 3), strides=1, padding="valid")
        self.b4_conv1 = BasicConv2D(filters=64, kernel_size=(1, 1), strides=1, padding="same")
        self.b4_conv2 = BasicConv2D(filters=64, kernel_size=(7, 1), strides=1, padding="same")
        self.b4_conv3 = BasicConv2D(filters=64, kernel_size=(1, 7), strides=1, padding="same")
        self.b4_conv4 = BasicConv2D(filters=96, kernel_size=(3, 3), strides=1, padding="valid")
        self.b5_conv = BasicConv2D(filters=192, kernel_size=(3, 3), strides=2, padding="valid")
        self.b6_maxpool = tf.keras.layers.MaxPool2D(pool_size=(3, 3), strides=2, padding="valid")

    def call(self, inputs, training=None, **kwargs):
        x = self.conv1(inputs, training=training)
        x = self.conv2(x, training=training)
        x = self.conv3(x, training=training)
        branch_1 = self.b1_maxpool(x)
        branch_2 = self.b2_conv(x, training=training)
        x = tf.concat(values=[branch_1, branch_2], axis=-1)
        branch_3 = self.b3_conv1(x, training=training)
        branch_3 = self.b3_conv2(branch_3, training=training)
        branch_4 = self.b4_conv1(x, training=training)
        branch_4 = self.b4_conv2(branch_4, training=training)
        branch_4 = self.b4_conv3(branch_4, training=training)
        branch_4 = self.b4_conv4(branch_4, training=training)
        x = tf.concat(values=[branch_3, branch_4], axis=-1)
        branch_5 = self.b5_conv(x, training=training)
        branch_6 = self.b6_maxpool(x)  # pooling layers take no training argument
        x = tf.concat(values=[branch_5, branch_6], axis=-1)
        return x


class InceptionBlockA(tf.keras.layers.Layer):
    def __init__(self):
        super(InceptionBlockA, self).__init__()
        self.b1_pool = tf.keras.layers.AveragePooling2D(pool_size=(3, 3), strides=1, padding="same")
        self.b1_conv = BasicConv2D(filters=96, kernel_size=(1, 1), strides=1, padding="same")
        self.b2_conv = BasicConv2D(filters=96, kernel_size=(1, 1), strides=1, padding="same")
        self.b3_conv1 = BasicConv2D(filters=64, kernel_size=(1, 1), strides=1, padding="same")
        self.b3_conv2 = BasicConv2D(filters=96, kernel_size=(3, 3), strides=1, padding="same")
        self.b4_conv1 = BasicConv2D(filters=64, kernel_size=(1, 1), strides=1, padding="same")
        self.b4_conv2 = BasicConv2D(filters=96, kernel_size=(3, 3), strides=1, padding="same")
        self.b4_conv3 = BasicConv2D(filters=96, kernel_size=(3, 3), strides=1, padding="same")

    def call(self, inputs, training=None, **kwargs):
        b1 = self.b1_pool(inputs)
        b1 = self.b1_conv(b1, training=training)

        b2 = self.b2_conv(inputs, training=training)

        b3 = self.b3_conv1(inputs, training=training)
        b3 = self.b3_conv2(b3, training=training)

        b4 = self.b4_conv1(inputs, training=training)
        b4 = self.b4_conv2(b4, training=training)
        b4 = self.b4_conv3(b4, training=training)

        return tf.concat(values=[b1, b2, b3, b4], axis=-1)


class ReductionA(tf.keras.layers.Layer):
    def __init__(self, k, l, m, n):
        super(ReductionA, self).__init__()
        self.b1_pool = tf.keras.layers.MaxPool2D(pool_size=(3, 3), strides=2, padding="valid")
        self.b2_conv = BasicConv2D(filters=n, kernel_size=(3, 3), strides=2, padding="valid")
        self.b3_conv1 = BasicConv2D(filters=k, kernel_size=(1, 1), strides=1, padding="same")
        self.b3_conv2 = BasicConv2D(filters=l, kernel_size=(3, 3), strides=1, padding="same")
        self.b3_conv3 = BasicConv2D(filters=m, kernel_size=(3, 3), strides=2, padding="valid")

    def call(self, inputs, training=None, **kwargs):
        b1 = self.b1_pool(inputs)

        b2 = self.b2_conv(inputs, training=training)

        b3 = self.b3_conv1(inputs, training=training)
        b3 = self.b3_conv2(b3, training=training)
        b3 = self.b3_conv3(b3, training=training)

        return tf.concat(values=[b1, b2, b3], axis=-1)


class InceptionBlockB(tf.keras.layers.Layer):
    def __init__(self):
        super(InceptionBlockB, self).__init__()
        self.b1_pool = tf.keras.layers.AveragePooling2D(pool_size=(3, 3), strides=1, padding="same")
        self.b1_conv = BasicConv2D(filters=128, kernel_size=(1, 1), strides=1, padding="same")
        self.b2_conv = BasicConv2D(filters=384, kernel_size=(1, 1), strides=1, padding="same")
        self.b3_conv1 = BasicConv2D(filters=192, kernel_size=(1, 1), strides=1, padding="same")
        self.b3_conv2 = BasicConv2D(filters=224, kernel_size=(1, 7), strides=1, padding="same")
        self.b3_conv3 = BasicConv2D(filters=256, kernel_size=(1, 7), strides=1, padding="same")
        self.b4_conv1 = BasicConv2D(filters=192, kernel_size=(1, 1), strides=1, padding="same")
        self.b4_conv2 = BasicConv2D(filters=192, kernel_size=(1, 7), strides=1, padding="same")
        self.b4_conv3 = BasicConv2D(filters=224, kernel_size=(7, 1), strides=1, padding="same")
        self.b4_conv4 = BasicConv2D(filters=224, kernel_size=(1, 7), strides=1, padding="same")
        self.b4_conv5 = BasicConv2D(filters=256, kernel_size=(7, 1), strides=1, padding="same")

    def call(self, inputs, training=None, **kwargs):
        b1 = self.b1_pool(inputs)
        b1 = self.b1_conv(b1, training=training)

        b2 = self.b2_conv(inputs, training=training)

        b3 = self.b3_conv1(inputs, training=training)
        b3 = self.b3_conv2(b3, training=training)
        b3 = self.b3_conv3(b3, training=training)

        b4 = self.b4_conv1(inputs, training=training)
        b4 = self.b4_conv2(b4, training=training)
        b4 = self.b4_conv3(b4, training=training)
        b4 = self.b4_conv4(b4, training=training)
        b4 = self.b4_conv5(b4, training=training)

        return tf.concat(values=[b1, b2, b3, b4], axis=-1)


class ReductionB(tf.keras.layers.Layer):
    def __init__(self):
        super(ReductionB, self).__init__()
        self.b1_pool = tf.keras.layers.MaxPool2D(pool_size=(3, 3), strides=2, padding="valid")
        self.b2_conv1 = BasicConv2D(filters=192, kernel_size=(1, 1), strides=1, padding="same")
        self.b2_conv2 = BasicConv2D(filters=192, kernel_size=(3, 3), strides=2, padding="valid")
        self.b3_conv1 = BasicConv2D(filters=256, kernel_size=(1, 1), strides=1, padding="same")
        self.b3_conv2 = BasicConv2D(filters=256, kernel_size=(1, 7), strides=1, padding="same")
        self.b3_conv3 = BasicConv2D(filters=320, kernel_size=(7, 1), strides=1, padding="same")
        self.b3_conv4 = BasicConv2D(filters=320, kernel_size=(3, 3), strides=2, padding="valid")

    def call(self, inputs, training=None, **kwargs):
        b1 = self.b1_pool(inputs)

        b2 = self.b2_conv1(inputs, training=training)
        b2 = self.b2_conv2(b2, training=training)

        b3 = self.b3_conv1(inputs, training=training)
        b3 = self.b3_conv2(b3, training=training)
        b3 = self.b3_conv3(b3, training=training)
        b3 = self.b3_conv4(b3, training=training)

        return tf.concat(values=[b1, b2, b3], axis=-1)


class InceptionBlockC(tf.keras.layers.Layer):
    def __init__(self):
        super(InceptionBlockC, self).__init__()
        self.b1_pool = tf.keras.layers.AveragePooling2D(pool_size=(3, 3), strides=1, padding="same")
        self.b1_conv = BasicConv2D(filters=256, kernel_size=(1, 1), strides=1, padding="same")
        self.b2_conv = BasicConv2D(filters=256, kernel_size=(1, 1), strides=1, padding="same")
        self.b3_conv1 = BasicConv2D(filters=384, kernel_size=(1, 1), strides=1, padding="same")
        self.b3_conv2 = BasicConv2D(filters=256, kernel_size=(1, 3), strides=1, padding="same")
        self.b3_conv3 = BasicConv2D(filters=256, kernel_size=(3, 1), strides=1, padding="same")
        self.b4_conv1 = BasicConv2D(filters=384, kernel_size=(1, 1), strides=1, padding="same")
        self.b4_conv2 = BasicConv2D(filters=448, kernel_size=(1, 3), strides=1, padding="same")
        self.b4_conv3 = BasicConv2D(filters=512, kernel_size=(3, 1), strides=1, padding="same")
        self.b4_conv4 = BasicConv2D(filters=256, kernel_size=(3, 1), strides=1, padding="same")
        self.b4_conv5 = BasicConv2D(filters=256, kernel_size=(1, 3), strides=1, padding="same")

    def call(self, inputs, training=None, **kwargs):
        b1 = self.b1_pool(inputs)
        b1 = self.b1_conv(b1, training=training)

        b2 = self.b2_conv(inputs, training=training)

        b3 = self.b3_conv1(inputs, training=training)
        b3_1 = self.b3_conv2(b3, training=training)
        b3_2 = self.b3_conv3(b3, training=training)

        b4 = self.b4_conv1(inputs, training=training)
        b4 = self.b4_conv2(b4, training=training)
        b4 = self.b4_conv3(b4, training=training)
        b4_1 = self.b4_conv4(b4, training=training)
        b4_2 = self.b4_conv5(b4, training=training)

        return tf.concat(values=[b1, b2, b3_1, b3_2, b4_1, b4_2], axis=-1)
  2. inception_v4.py
import tensorflow as tf
from inception_modules import Stem, InceptionBlockA, InceptionBlockB, \
    InceptionBlockC, ReductionA, ReductionB

NUM_CLASSES = 10


def build_inception_block_a(n):
    block = tf.keras.Sequential()
    for _ in range(n):
        block.add(InceptionBlockA())
    return block


def build_inception_block_b(n):
    block = tf.keras.Sequential()
    for _ in range(n):
        block.add(InceptionBlockB())
    return block


def build_inception_block_c(n):
    block = tf.keras.Sequential()
    for _ in range(n):
        block.add(InceptionBlockC())
    return block


class InceptionV4(tf.keras.Model):
    def __init__(self):
        super(InceptionV4, self).__init__()
        self.stem = Stem()
        self.inception_a = build_inception_block_a(4)
        self.reduction_a = ReductionA(k=192, l=224, m=256, n=384)
        self.inception_b = build_inception_block_b(7)
        self.reduction_b = ReductionB()
        self.inception_c = build_inception_block_c(3)

        self.avgpool = tf.keras.layers.AveragePooling2D(pool_size=(8, 8))

        self.dropout = tf.keras.layers.Dropout(rate=0.2)
        self.flat = tf.keras.layers.Flatten()
        self.fc = tf.keras.layers.Dense(units=NUM_CLASSES,
                                        activation=tf.keras.activations.softmax)

    def call(self, inputs, training=True, mask=None):
        x = self.stem(inputs, training=training)
        x = self.inception_a(x, training=training)
        print(x)  # debug output
        x = self.reduction_a(x, training=training)
        x = self.inception_b(x, training=training)
        x = self.reduction_b(x, training=training)
        x = self.inception_c(x, training=training)
        print('inception_c:', x)  # debug output
        x = self.avgpool(x)
        x = self.dropout(x, training=training)
        x = self.flat(x)
        x = self.fc(x)
        return x

PS: ==You cannot train on the MNIST dataset with the network above as-is, because of its input-size requirements==
image
image
MNIST images are 28×28 = 784 pixels, smaller than required
(the input must be at least 299×299)

Note: 299×299×1 also works
image
image

Note: the following, however, does not work:
image
image

Analysis:
image
image

By this point in the network, our 200×200×1 input:
image
has become: image
so the 8×8 Average Pooling can no longer be applied

So, to train on the MNIST dataset, you must change the network's parameters, or even its structure
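You can check this with a quick trace of the spatial size through the strided VALID convolutions and poolings of the modules listed earlier. This is a simplified trace: "same"-padded layers keep the size and are skipped, and the kernel/stride pairs below are read off the stem, ReductionA, and ReductionB code:

```python
def valid_out(size, k, s):
    # spatial size after a VALID conv/pool: floor((size - k) / s) + 1
    return (size - k) // s + 1

def final_feature_size(size):
    # the size-changing (VALID) ops along the main path:
    # stem convs/pools, then ReductionA and ReductionB
    for k, s in [(3, 2), (3, 1), (3, 2), (3, 1), (3, 2), (3, 2), (3, 2)]:
        size = valid_out(size, k, s)
    return size

print(final_feature_size(299))  # 8 -> matches AveragePooling2D(pool_size=(8, 8))
print(final_feature_size(200))  # 4 -> smaller than the 8x8 pooling window
```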

  3. My erroneous code (the MNIST dataset is too small for this network):
    inception_V4_mnist.py:
import tensorflow as tf
from tensorflow.python.keras.api._v2.keras import layers, optimizers, datasets, Sequential
import tensorflow.keras as keras
import numpy as np
import inception_v4

import os
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'
tf.random.set_seed(22)

batchsize = 512

def preprocess(x, y):  # data preprocessing
    x = tf.cast(x, dtype=tf.float32) / 255. - 0.5
    y = tf.cast(y, dtype=tf.int32)
    return x, y

(x_train, y_train), (x_test, y_test) = datasets.mnist.load_data()

print(x_train.shape, y_train.shape)

# # [b, 28, 28] => [b, 28, 28, 1]
# x_train, x_test = np.expand_dims(x_train, axis=3), np.expand_dims(x_test, axis=3)

# training set pipeline
db_train = tf.data.Dataset.from_tensor_slices((x_train, y_train))  # build the dataset; conversion to tensors is automatic
db_train = db_train.map(preprocess).shuffle(10000).batch(batchsize)

# test set pipeline
db_test = tf.data.Dataset.from_tensor_slices((x_test, y_test))  # build the dataset
db_test = db_test.map(preprocess).shuffle(10000).batch(batchsize)

db_iter = iter(db_train)
sample = next(db_iter)
print("batch: ", sample[0].shape, sample[1].shape)

# instantiate Inception
model = inception_v4.InceptionV4()  # remember the parentheses; otherwise this is the class, not an instance, and build() below will fail
print(model)
# derive input shape for every layer
model.build(input_shape=(None, 299, 299, 1))
model.summary()

optimizer = optimizers.Adam(learning_rate=1e-3)
# note: the model already ends in softmax, so from_logits should arguably be False here
criteon = keras.losses.CategoricalCrossentropy(from_logits=True)  # classification loss

acc_meter = keras.metrics.Accuracy()

for epoch in range(100):

    for step, (x, y) in enumerate(db_train):

        with tf.GradientTape() as tape:
            # print(x.shape, y.shape)
            # [b, 10]
            logits = model(x)
            # [b] vs [b, 10]
            loss = criteon(tf.one_hot(y, depth=10), logits)

        grads = tape.gradient(loss, model.trainable_variables)
        optimizer.apply_gradients(zip(grads, model.trainable_variables))

        if step % 20 == 0:
            print(epoch, step, 'loss:', loss.numpy())

    # evaluate on the test set
    acc_meter.reset_states()
    for x, y in db_test:
        # [b, 10]
        logits = model(x, training=False)
        # [b, 10] => [b]
        pred = tf.argmax(logits, axis=1)
        # [b] vs [b, 10]
        acc_meter.update_state(y, pred)

    print(epoch, 'evaluation acc:', acc_meter.result().numpy())