Data Mining (DM) - Python - CNN (backup No. 18)

The ".NET 開発基盤部会 Wiki" is operated by the "Open棟梁Project" and the "OSSコンソーシアム .NET開発基盤部会".


Contents

Overview
Details
- Main thread
  - Handwritten digit image recognition
  - Automatic product tagging
- Common topics
References
- scikit-learn
- TensorFlow / Keras

Overview

CNN

Details

Main thread

Handwritten digit image recognition

A DNN (not CNN) counterpart of the linked CNN example, written with TensorFlow / Keras.

Dataset

Load

from keras.datasets import mnist
(x_train_org, y_train_org), (x_test_org, y_test_org) = mnist.load_data()
print(x_train_org.shape, x_test_org.shape)
print(y_train_org.shape, y_test_org.shape)

Check
The function is defined in the common section on checking images and labels below.

show_image_info(x_train_org, y_train_org, [0,1,2,3,4,5,6,7,8,9], 10)

Transform

Convert X to the dtype Keras expects (float32)

x_train_std = x_train_org.astype('f')
x_test_std = x_test_org.astype('f')

Normalize the pixels of X to the range 0.0-1.0
```
x_train_std /= 255
x_test_std /= 255
```

One-hot encoding of the labels
The categorical_crossentropy loss used below expects one-hot label vectors.
- Encoding

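from keras.utils import np_utils  # assumed import for older Keras; newer releases expose keras.utils.to_categorical instead
y_train = np_utils.to_categorical(y_train_org, num_classes=10).astype('i')
y_test = np_utils.to_categorical(y_test_org, num_classes=10).astype('i')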

- Decoding (argmax recovers the original labels)

print((y_train.argmax(axis=1) == y_train_org).all())
print((y_test.argmax(axis=1) == y_test_org).all())
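
The encode/decode round trip can also be reproduced with NumPy alone. The following is a minimal sketch, not the Keras implementation; the helper name one_hot and the sample labels are assumptions made for illustration.

```python
import numpy as np

def one_hot(labels, num_classes):
    """Return an (n, num_classes) int array with a single 1 per row (hypothetical helper)."""
    labels = np.asarray(labels)
    encoded = np.zeros((labels.size, num_classes), dtype=int)
    # Set column `label` to 1 in each row.
    encoded[np.arange(labels.size), labels] = 1
    return encoded

labels = np.array([5, 0, 4, 1, 9])       # sample digit labels (made up)
encoded = one_hot(labels, num_classes=10)
decoded = encoded.argmax(axis=1)         # decoding: position of the 1 in each row
print((decoded == labels).all())
```

Decoding with argmax works because each row contains exactly one 1, at the column index equal to the label.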

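Model definition
Stack layers as follows to define a multilayer perceptron (MLP) model.

Flatten
Flatten turns each image matrix into a vector.

Dense
The vector is fed into fully connected (Dense) layers.

Activation (activation function)
Module for activation functions (linear, ReLU, tanh, sigmoid).

Linear activation:
The decision boundary is no different from that of a perceptron.
Non-linear activation:
- The decision boundary becomes curved, so classes can be separated better.
- With a large number of parameters, a neural network can form complex decision boundaries.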

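Optimizers
Selects the gradient descent algorithm.
- SGD (Momentum SGD, NAG)
- RMSprop
- Adam
- AdaDelta
- AdaGrad

batch_size
Number of samples in each mini-batch.

Number of epochs
Number of passes over the training data.
- Limited by the available compute resources
- More epochs generally give higher accuracy
- Too many epochs can lead to overfitting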

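Activation (softmax function)
- Used in the computation that produces predictions (forward propagation)
- softmax: an activation function that converts the inputs to the output layer into probabilities

Computation for learning by backpropagation (backward propagation)

Cross-entropy error function (loss function):
- Cross-entropy is the standard loss function for classification with neural networks.
- The loss paired with softmax is softmax cross-entropy.
- With backpropagation, the updates reach all the way back to the earliest layers.

Mathematically, a neural network is one large composite function,
so the chain rule (the rule for differentiating composite functions) lets the parameters of earlier layers be adjusted.

Stochastic gradient descent:
A gradient descent method that speeds up batch steepest descent by computing each update on sampled data.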
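from keras.models import Sequential
from keras.layers import Flatten, Dense, Activation
from keras.optimizers import SGD

# Create the model instance
model = Sequential()

# Add layers one by one with add().

# Flatten: converts the input; specify the input size
model.add(Flatten(input_shape=(28, 28)))

# First hidden layer
# Dense: fully connected (linear) layer; specify the output size
model.add(Dense(900))
# Activation: activation function (ReLU)
model.add(Activation('relu'))

# More hidden layers: repeat the same pattern
model.add(Dense(1000))
model.add(Activation('relu'))
model.add(Dense(500))
model.add(Activation('relu'))

# Output layer:
# Dense: fully connected (linear) layer; specify the output size
model.add(Dense(10))
# Activation: activation function (softmax)
model.add(Activation('softmax'))

# Compile with a loss function, an optimizer and evaluation metrics
# - loss     : categorical_crossentropy (standard for classification; for regression, e.g. RMSE)
# - optimizer: SGD (basic stochastic gradient descent)
# - metrics  : accuracy
model.compile(loss='categorical_crossentropy',
              optimizer=SGD(),
              metrics=['accuracy'])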

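Training
Run training with the following parameters.

Training inputs and labels
Test inputs and labels (passed as validation data)
batch_size: batch size; epochs: number of epochs

verbose: output during training
- 0: silent
- 1: progress bar
- 2: one line per epoch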

batch_size = 100
n_epoch = 20
# fit() on a Keras Model returns a History object
hist = model.fit(x_train_std, y_train,
                 validation_data=(x_test_std, y_test),
                 batch_size=batch_size,
                 epochs=n_epoch,
                 verbose=1
                 )

Note: batch_size and epochs are described in the model definition notes above.
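
As a rough guide to what these two settings mean in terms of work, the sketch below counts the parameter updates for the values used here. It assumes the standard MNIST training set of 60,000 images; the helper name is hypothetical.

```python
import math

def updates_per_epoch(n_samples, batch_size):
    """Number of mini-batches (parameter updates) in one pass over the data."""
    return math.ceil(n_samples / batch_size)

n_train = 60000      # size of the MNIST training set
batch_size = 100
n_epoch = 20

steps = updates_per_epoch(n_train, batch_size)
total_updates = steps * n_epoch
print(steps, total_updates)
```

A smaller batch_size means more updates per epoch, and each additional epoch repeats the same number of updates.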

Inference
The function is defined in the common section on checking images and labels below.

index = 10
show_image_info(x_test_org, y_test_org, [0,1,2,3,4,5,6,7,8,9], index)

predict = model.predict(x_test_std[index].reshape(1, 28, 28)).argmax()
answer  = y_test_org[index]

print('predict: ', predict)
print('answer : ', answer)

if predict == answer:
    print('correct')
else:
    print('incorrect')

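Evaluation

Evaluate on the test data

model.evaluate(x_test_std, y_test)  # returns [loss, accuracy]

Evaluate on the training data (training history)
Accuracy (accuracy) and loss (loss) for each epoch.

Keys prefixed with val_ correspond to the test data (passed as validation_data);
the other keys correspond to the training data.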
```
hist.history
```
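
As a rough illustration of how this dictionary can be read, the sketch below uses a hand-written dictionary with the same keys. The numbers are made up for illustration and are not results from this model.

```python
# Hypothetical history values; real ones come from hist.history after fit().
history = {
    'loss':         [0.90, 0.45, 0.30, 0.22],
    'accuracy':     [0.75, 0.88, 0.92, 0.94],
    'val_loss':     [0.50, 0.35, 0.33, 0.36],
    'val_accuracy': [0.86, 0.90, 0.91, 0.90],
}

# Epoch (0-based) with the lowest validation loss.
val_loss = history['val_loss']
best_epoch = min(range(len(val_loss)), key=val_loss.__getitem__)

# Gap between validation and training loss; a growing gap hints at overfitting.
gaps = [v - t for t, v in zip(history['loss'], history['val_loss'])]
print(best_epoch, gaps)
```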

Plotting
- Function definitions

import matplotlib.pyplot as plt

# Plot the loss over epochs
def plot_history_loss(hist):
    plt.plot(hist.history['loss'],label="loss for training")
    plt.plot(hist.history['val_loss'],label="loss for validation")
    plt.title('model loss')
    plt.xlabel('epoch')
    plt.ylabel('loss')
    plt.legend(loc='best')
    plt.show()
    
# Plot the accuracy over epochs
def plot_history_acc(hist):
    plt.plot(hist.history['accuracy'],label="accuracy for training")
    plt.plot(hist.history['val_accuracy'],label="accuracy for validation")
    plt.title('model accuracy')
    plt.xlabel('epoch')
    plt.ylabel('accuracy')
    plt.legend(loc='best')
    plt.ylim([0, 1])
    plt.show()

- Show the graphs

plot_history_loss(hist)
plot_history_acc(hist)

Confusion matrix
Shows which pairs of digits are easily confused (1/7, 2/7, 3/8, 3/9, 4/9, 7/9).

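from sklearn.metrics import confusion_matrix as cm  # assumed import; cm is not defined on this page

result = model.predict(x_test_std).argmax(axis=1)
confmat = cm(y_test_org, result)  # pass the labels from before one-hot encoding
confmat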
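
For reference, a confusion matrix can be built with NumPy alone. This is a minimal sketch with made-up labels and predictions, not the function used above; rows are the true classes and columns the predicted classes.

```python
import numpy as np

def confusion_matrix_np(y_true, y_pred, n_classes):
    """Count how often true class i (row) was predicted as class j (column)."""
    m = np.zeros((n_classes, n_classes), dtype=int)
    np.add.at(m, (y_true, y_pred), 1)   # unbuffered add, so repeated pairs accumulate
    return m

y_true = np.array([1, 7, 7, 3, 8, 9, 4])   # made-up answers
y_pred = np.array([1, 1, 7, 8, 8, 9, 9])   # made-up predictions

m = confusion_matrix_np(y_true, y_pred, n_classes=10)
accuracy = np.trace(m) / m.sum()            # correct predictions sit on the diagonal
print(m[7, 1], m[3, 8], m[4, 9], accuracy)
```

Off-diagonal cells such as m[7, 1] are the confusions: here a 7 that was predicted as a 1.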

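Metrics derived from the confusion matrix
print_metrics is a helper that is not defined on this page.

print_metrics(y_test_org, result)  # pass the labels from before one-hot encoding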

Show the misclassified images
The function is defined in the common section on checking images and labels below.

index = (y_test_org != result)
for i, val in enumerate(index):
    if val:
        print('predict: ', result[i])
        print('answer : ', y_test_org[i])
        show_image_info(x_test_org, y_test_org, [0,1,2,3,4,5,6,7,8,9], i)

Automatic product tagging

Dataset

Load

When downloading the files from a site:

import os
from urllib import request
os.makedirs('./datasets', exist_ok=True)  # does not fail if the directory already exists
url = 'https://.../train.pickle'
request.urlretrieve(url, './datasets/train.pickle')
url = 'https://.../test.pickle'
request.urlretrieve(url, './datasets/test.pickle')
url = 'https://.../label.pickle'
request.urlretrieve(url, './datasets/label.pickle')

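It seems to be available among the Keras datasets as well (note that cifar10 has 10 classes, whereas the pickled data here has 5).

from keras.datasets import cifar10
(x_train, y_train), (x_test, y_test) = cifar10.load_data()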

Check

Define a function

def unpickle(file):
    import pickle
    with open(file, 'rb') as f:
        return pickle.load(f, encoding='bytes')

Load with the function

train = unpickle('./datasets/train.pickle')
test = unpickle('./datasets/test.pickle')
label = unpickle('./datasets/label.pickle')

Assign to variables

x_train_org = train['data']
y_train_org = train['label']
x_test_org = test['data']
y_test_org = test['label']
print(x_train_org.shape)
print(y_train_org.shape)
print(x_test_org.shape)
print(y_test_org.shape)

Transform

Reorder the axes
(OpenCV and Keras expect channels-last arrays)

# Change to (samples, height, width, channels)
x_train = x_train_org.transpose([0, 2, 3, 1])
x_test = x_test_org.transpose([0, 2, 3, 1])
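
The effect of this transpose can be checked on a small dummy array. The shapes below are made up for illustration; the real data has shape (samples, 3, 32, 32) before the transpose.

```python
import numpy as np

# Dummy batch in (samples, channels, height, width) order.
x_nchw = np.arange(2 * 3 * 4 * 5).reshape(2, 3, 4, 5)

# Move the channel axis to the end: (samples, height, width, channels).
x_nhwc = x_nchw.transpose([0, 2, 3, 1])

print(x_nchw.shape, '->', x_nhwc.shape)
# The same pixel is now addressed with the channel index last.
print(x_nhwc[1, 2, 3, 0] == x_nchw[1, 0, 2, 3])
```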

Convert X to the dtype Keras expects (float32)

x_train_std = x_train.astype('f')
x_test_std = x_test.astype('f')

Normalize the pixels of X to the range 0.0-1.0
```
x_train_std /= 255
x_test_std /= 255
```

One-hot encoding of the labels
The categorical_crossentropy loss used below expects one-hot label vectors.
- Encoding

y_train = np_utils.to_categorical(y_train_org, num_classes=5).astype('i')
y_test = np_utils.to_categorical(y_test_org, num_classes=5).astype('i')

- Decoding (argmax recovers the original labels)

print((y_train.argmax(axis=1) == y_train_org).all())
print((y_test.argmax(axis=1) == y_test_org).all())

Check an image and its label
The function is defined in the common section on checking images and labels below.
```
show_image_info(x_train, y_train_org, label, 1300)
```

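Model definition
A CNN with three convolution + pooling blocks and two fully connected layers.

input_shape: shape of the input tensor
filters: number of kernels = number of filters = number of feature maps
kernel_size: kernel size (pixels per side)
strides: number of pixels the kernel moves per step

padding: width of the border added around the image (with kernel_size=(3, 3), a padding of (1, 1), i.e. 'same', keeps the output the same size as the input)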
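
The parameters above determine the size of each feature map. The sketch below applies the usual output-size rules for 'same' and 'valid' padding to the three blocks of the model that follows. The helper name is hypothetical, and the pooling layers are assumed to use their default stride (equal to pool_size) and 'valid' padding.

```python
import math

def conv_output_size(size, kernel, stride=1, padding='same'):
    """Output length along one spatial axis for a convolution or pooling layer."""
    if padding == 'same':
        return math.ceil(size / stride)
    if padding == 'valid':
        return (size - kernel) // stride + 1
    raise ValueError("padding must be 'same' or 'valid'")

size = 32                      # input images are 32 x 32
sizes = []
for _ in range(3):
    size = conv_output_size(size, kernel=4, stride=1, padding='same')   # Conv2D
    size = conv_output_size(size, kernel=2, stride=2, padding='valid')  # MaxPool2D
    sizes.append(size)

flattened = size * size * 128  # the last block has 128 filters
print(sizes, flattened)
```

With 'same' padding the convolutions keep the size, so only the pooling layers halve it; the length of the vector entering the first Dense layer follows from the last feature map.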

from keras.models import Sequential
from keras.layers import Conv2D, MaxPool2D, Flatten, Dense, Activation
from keras.optimizers import SGD

model = Sequential()

# Convolution + pooling block 1
model.add(Conv2D(input_shape=(32, 32, 3), filters=64, kernel_size=(4, 4), strides=(1, 1), padding='same')) # convolution layer
# (batch normalization would be added here)
model.add(MaxPool2D(pool_size=(2, 2))) # pooling layer
model.add(Activation('relu'))

# Convolution + pooling block 2
model.add(Conv2D(filters=128, kernel_size=(4, 4), strides=(1, 1), padding='same')) # convolution layer
# (batch normalization would be added here)
model.add(MaxPool2D(pool_size=(2, 2))) # pooling layer
model.add(Activation('relu'))

# Convolution + pooling block 3
model.add(Conv2D(filters=128, kernel_size=(4, 4), strides=(1, 1), padding='same')) # convolution layer
# (batch normalization would be added here)
model.add(MaxPool2D(pool_size=(2, 2))) # pooling layer
model.add(Activation('relu'))

model.add(Flatten())

# Fully connected layer 1
model.add(Dense(512))
model.add(Activation('relu'))

# (Dropout would be added here)

# Fully connected layer 2
model.add(Dense(5))
model.add(Activation('softmax'))

# Compile
model.compile(loss='categorical_crossentropy',
              optimizer=SGD(0.01), # learning rate: 0.01
              metrics=['accuracy'])

Training

batch_size = 500
n_epoch = 30
hist = model.fit(x_train_std , y_train,
                 validation_data=(x_test_std, y_test),
                 batch_size=batch_size,
                 epochs=n_epoch,
                 verbose=1)

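Inference
The function is defined in the common section on checking images and labels below.

index = 10
show_image_info(x_test, y_test_org, label, index)  # show the test image that is being predicted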

predict = model.predict(x_test_std[index].reshape(1, 32, 32, 3)).argmax()
answer  = y_test_org[index]

print('predict: ', predict)
print('answer : ', answer)

if predict == answer:
    print('correct')
else:
    print('incorrect')

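Evaluation

Evaluate on the test data
Same as in the handwritten digit example.

Evaluate on the training data (training history)
- Same as in the handwritten digit example.
- The curves show that the model is undertrained.

Confusion matrix
Same as in the handwritten digit example.

Metrics
Same as in the handwritten digit example.

Show the misclassified images
Almost the same as in the handwritten digit example.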

index = (y_test_org != result)
for i, val in enumerate(index):
    if val:
        print('predict: ', result[i])
        print('answer : ', y_test_org[i])
        show_image_info(x_test, y_test_org, label, i)

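Improving performance

Compensate for the undertraining
Adjust batch_size and epochs (n_epoch).

Tune the hyperparameters

Other CNN models
A web search turns up many. Given the amount of computation, a GPU is practically required.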

Common topics

Checking images and labels

Get an overview from a list of images.

Check an image together with its label.

import numpy as np
import matplotlib.pyplot as plt

def show_image_info(x, y, label, index):
    print(label[y[index]])
    plt.imshow(x[index].astype(np.uint8))
    plt.show()

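Check matches and mismatches

Countering overfitting in CNNs

Countermeasures for DNNs in general

Data normalization

Weight initialization

Early stopping

Batch normalization

Dropout

CNN-specific countermeasures

See the countermeasures for DNNs in general above

Data normalization (images)
See "Normalize the pixels of X to the range 0.0-1.0" in the main thread

Data augmentation

Batch normalization (in the model definition)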
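
Of the countermeasures listed, early stopping is simple enough to sketch in plain Python. The function below is an illustration of the idea with a patience counter, not the Keras callback; the validation losses are made-up values.

```python
def early_stopping_epoch(val_losses, patience=2, min_delta=0.0):
    """Return the 0-based epoch at which training would stop.

    Training stops once the validation loss has failed to improve on the
    best value seen so far for `patience` consecutive epochs.
    """
    best = float('inf')
    wait = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best - min_delta:
            best = loss      # improvement: remember it and reset the counter
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                return epoch
    return len(val_losses) - 1   # never triggered: ran all epochs

val_losses = [0.90, 0.70, 0.60, 0.62, 0.61, 0.65, 0.70]  # made-up values
stop = early_stopping_epoch(val_losses, patience=2)
print(stop)
```

A larger patience tolerates noisy validation curves at the cost of a few extra epochs.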

Image preprocessing (various transformations)

Uses OpenCV.

lenna.png

Download

url = 'https://upload.wikimedia.org/wikipedia/en/7/7d/Lenna_%28test_image%29.png'
request.urlretrieve(url, 'lenna.png')

Load and check

import cv2

img = cv2.imread('lenna.png')
print(type(img))
print(img.shape)
plt.imshow(img) # OpenCV loads images as BGR, so the colors look bluish here.

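Convert and save

img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB) # convert to RGB order
plt.imshow(img) # now displayed with the original colors
cv2.imwrite('new_lenna.jpg', cv2.cvtColor(img, cv2.COLOR_RGB2BGR)) # imwrite expects BGR, so convert back before saving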

Various transformations

Define a comparison function

def diff_image_info(img1, img2):
    print(img1.shape)
    print(img2.shape)
    plt.subplot(1, 2, 1)
    plt.imshow(img1)
    plt.subplot(1, 2, 2)
    plt.imshow(img2)

Resize
Images must all have the same size when they are fed to training.
```
img2 = cv2.resize(img, (224, 224))
diff_image_info(img, img2)
```

Crop
Cutting out part of an image is done with array slicing.

By pixels

img2 = img[100:400,100:400,:]
diff_image_info(img, img2)

By ratio

h, w, c = img.shape
img2 = img[:, int(w * (1/5)): int(w *(4/5)), :]
diff_image_info(img, img2)

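Brightness adjustment

Grayscale conversion
When color carries little information, convert to a monochrome image to reduce the amount of computation.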
```
grayed = cv2.cvtColor(img, cv2.COLOR_RGB2GRAY)
diff_image_info(img, grayed)
```

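Binarization
Converts the image to two colors, black and white, with no gray.
The result resembles screentone and reduces the amount of computation further.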
```
th, binary = cv2.threshold(grayed, 125, 255, cv2.THRESH_BINARY)
diff_image_info(grayed, binary)
```

Smoothing
Blurs the image to remove noise (such as graininess).
```
blurred = cv2.GaussianBlur(binary, (11, 11), 0)
diff_image_info(binary, blurred)
```

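Histogram equalization
A tone conversion that makes the histogram of pixel values roughly flat.
Stretching the brightness range makes contours easier to recognize even in bright and dark regions.
The code here uses CLAHE, an adaptive variant that equalizes each tile separately.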
```
clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8,8))
clahed = clahe.apply(grayed)
diff_image_info(grayed, clahed)
```
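
For comparison, plain (global) histogram equalization can be written with NumPy alone. This is a sketch of the basic technique described above, not of CLAHE; the helper name and the synthetic low-contrast image are assumptions for illustration.

```python
import numpy as np

def equalize_hist_np(gray):
    """Global histogram equalization for a 2-D uint8 image."""
    hist = np.bincount(gray.ravel(), minlength=256)
    cdf = hist.cumsum()                       # cumulative pixel counts
    cdf_min = cdf[cdf > 0][0]                 # count at the darkest value present
    denom = max(gray.size - cdf_min, 1)       # guard against a constant image
    # Map each gray level through the normalized CDF to spread values over 0-255.
    lut = np.clip(np.round((cdf - cdf_min) / denom * 255), 0, 255).astype(np.uint8)
    return lut[gray]

rng = np.random.default_rng(0)
low_contrast = rng.integers(100, 140, size=(64, 64), dtype=np.uint8)  # values in 100..139
equalized = equalize_hist_np(low_contrast)
print(low_contrast.min(), low_contrast.max(), '->', equalized.min(), equalized.max())
```

The darkest value present maps to 0 and the brightest to 255, which is the stretching of the brightness range mentioned above.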

Mean Subtraction
A standardization-like scaling step that is combined with 0-1 normalization.
Like BatchNormalization, it is said to help suppress covariate shift.
```
img2 = img.astype('f')
img2 /= 255 # 0-1 normalization
img2 -= np.mean(img2) # subtract the mean
diff_image_info(img, img2)
```
Note: this can be applied to the whole array x rather than to a single img.
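
Applied to a whole dataset, one common variation is to subtract a separate mean per color channel. The sketch below shows that variation on random dummy data; it is an assumption about how the step might be applied to x, not the procedure used on this page.

```python
import numpy as np

rng = np.random.default_rng(0)
# Dummy dataset: 8 images of 32 x 32 pixels with 3 channels (made-up data).
x = rng.integers(0, 256, size=(8, 32, 32, 3)).astype('f')
x /= 255                                    # 0-1 normalization

# One mean per channel, computed over samples, height and width.
channel_mean = x.mean(axis=(0, 1, 2))
x_centered = x - channel_mean               # broadcasts over the last axis

print(channel_mean.shape, x_centered.mean(axis=(0, 1, 2)))
```

To avoid leakage, the mean would be computed on the training data only and then reused unchanged for the test data.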

Normalizing the pixel values of an image

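img2 = img.astype('f')
img2 -= img2.min() # subtract the minimum
img2 /= img2.max() # divide by the new maximum (original max - min)
diff_image_info(img, img2)

Note: this can be applied to the whole array x rather than to a single img.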

Data augmentation

Horizontal flip

flipped = cv2.flip(img, 1)
diff_image_info(img, flipped)

Rotation
- Function definition

def opencv_rotate(img, angle=30):
    h, w = img.shape[:2]
    center = (w // 2, h // 2)  # OpenCV points are (x, y)
    rotation_matrix = cv2.getRotationMatrix2D(center, angle, 1.0)
    return cv2.warpAffine(img, rotation_matrix, (w, h))  # dsize is (width, height)

- Run the rotation

rotated = opencv_rotate(img, 30)
diff_image_info(img, rotated)

Translation (shift)
- Function definition

def opencv_move(img, h=100, v=50):
    rows, cols, channels = img.shape
    M = np.float32([[1,0,h],[0,1,v]])
    return cv2.warpAffine(img, M, (cols, rows))

- Run the translation

moved = opencv_move(img, 200, 100)
diff_image_info(img, moved)

Zoom in
- Function definition

def opencv_zoomin(img, h=2.0, v=2.0):
    zoomed = cv2.resize(img, None, fx=h, fy=v)
    height_1, width_1, channel_1 = img.shape
    height_2, width_2, channel_2 = zoomed.shape
    x =  int((width_2 - width_1) / 2)
    y =  int((height_2 - height_1) / 2)
    return zoomed[y:y+height_1, x:x+width_1]

- Run the zoom

zoomed = opencv_zoomin(img, 2, 3)
diff_image_info(img, zoomed)

Gamma correction
Makes the whole image brighter or darker.
- Function definition

def opencv_gamma(img, gamma=0.5):
    look_up_table = np.zeros((256, 1), dtype='uint8')
    for i in range(256):
        look_up_table[i][0] = 255 * pow(float(i) / 255, 1.0 / gamma)
    return cv2.LUT(img, look_up_table)

- Apply the gamma correction

img_gamma = opencv_gamma(img, 0.3)
diff_image_info(img, img_gamma)

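Gaussian noise
Adds grain-like noise of the kind that appears when images are processed.
- Function definition

def opencv_gaussian(img, loc=0.0, scale=5.0):
    row, col, ch = img.shape
    noise = np.random.normal(loc, scale, (row, col, ch))
    noised = np.clip(img + noise, 0, 255)  # keep the values in the valid pixel range
    noised /= 255  # scale to 0-1 so imshow can display the float image
    return noised

- Add the Gaussian noise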

img_gaussian = opencv_gaussian(img, 50, 100)
diff_image_info(img, img_gaussian)

Example functions
get_changed and get_augmented are defined separately,
because get_augmented is applied only to the training data and not to the test data.

Batch conversion

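def get_changed(img):
    # Grayscale conversion
    ...

    # Histogram equalization
    ...

    # Smoothing
    ...

    # The image is no longer in color, so a dimension has been lost;
    # add the channel axis back.
    return blurred[:,:,np.newaxis]

Batch augmentation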

def get_augmented(img):
    # Horizontal flip
    if np.random.rand() > 0.5:
        img = cv2.flip(img, 1)
    # Random rotation
    if np.random.rand() > 0.5:
        h, w = img.shape[:2]
        center = (w // 2, h // 2)  # OpenCV points are (x, y)
        angle = np.random.randint(-45, 45) # from -45 to +44 (the upper bound is exclusive)
        rotation_matrix = cv2.getRotationMatrix2D(center, angle, 1.0)
        img = cv2.warpAffine(img, rotation_matrix, (w, h))  # dsize is (width, height)
    return img
plt.imshow(get_augmented(x_train[1300]).astype(np.uint8))

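Transfer learning and fine-tuning

Load a pretrained model without its fully connected part
- weights: initial weights (None: random, 'imagenet': pretrained weights)
- include_top: whether to include the fully connected layers (True: include, False: exclude)
- input_tensor: input tensor, here with shape (height, width, channels)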
```
base_model = VGG16(weights='imagenet', include_top=False, input_tensor=Input(shape=(32, 32, 3)))
# base_model.output_shape => (None, 1, 1, 512)
```

Define the fully connected part to add

n_class = 5
top_model = Sequential()
top_model.add(Flatten(input_shape=base_model.output_shape[1:]))
top_model.add(Dense(256))
top_model.add(Activation('relu'))
top_model.add(Dropout(0.5))
top_model.add(Dense(n_class))
top_model.add(Activation('softmax'))

Attach the fully connected part to the pretrained model
```
model = Model(inputs=base_model.input, outputs=top_model(base_model.output))
```

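In fine-tuning, part of the existing (pretrained) network is retrained.
- Freeze the parameters of the early convolutional layers, then compile.
- Reduce the learning rate to about 10^-2 times the usual value.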
```
for layer in model.layers[:15]:
    layer.trainable = False
model.compile(loss='categorical_crossentropy',
              optimizer=SGD(learning_rate=0.0001),
              metrics=['accuracy'])
```

Checking and trying out the model

Check
```
model.summary()
```

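Preprocessing

Function definition

def preprocess_vgg16(img):
    # Resize and convert to float so the means can be subtracted in place
    img = cv2.resize(img, (32, 32)).astype('f')
    # Subtract the per-channel values specified for VGG
    # (equivalent to mean subtraction; the values are in B, G, R order)
    img[:, :, 0] -= 103.939
    img[:, :, 1] -= 116.779
    img[:, :, 2] -= 123.68
    return img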

Run the preprocessing
To prevent leakage, convert and augment the data only after it has been split.
For get_augmented and get_changed, see the image preprocessing section.

x_train_list = []
for img in x_train:
    x_train_list.append(preprocess_vgg16(get_augmented(get_changed(img))))
x_train_aug = np.array(x_train_list)
x_test_list = []
for img in x_test:
    x_test_list.append(preprocess_vgg16(get_changed(img)))  # no augmentation for the test data
x_test_aug = np.array(x_test_list)

Trial run
Adjust the settings and run fit (same procedure as above).

batch_size = 100
n_epoch = 1 # just a trial, so use few epochs

References

scikit-learn