Image Classification with Convolutional Neural Networks (CNN)

Aug 11, 2020

Image Classification

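In its per-pixel form, image classification means deciding what each of an image's millions of pixels represents. The result is an image divided into colored regions, where each color stands for a different kind of feature such as houses, roads, or trees. The exercise later in this post uses the simpler whole-image form, where each picture gets a single label.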

Convolutional Neural Network (CNN)

A CNN is one of the learning methods used in Machine Learning. The trained model can be used to classify images, video, text, and audio, among other things. It can also find patterns or distinctive features in an image in order to recognize things such as human faces, objects, and backgrounds.
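To make "finding distinctive features" concrete, here is a minimal pure-Python sketch of the sliding-window operation that a convolutional layer is built on. The helper name convolve2d, the tiny 4x4 image, and the hand-picked kernel are all illustrative assumptions. A real CNN learns its kernel values during training and computes this far more efficiently.

```python
def convolve2d(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1) and return the response map.

    Like deep learning libraries, this does not flip the kernel.
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    output = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            total = 0
            for i in range(kh):
                for j in range(kw):
                    total += image[r + i][c + j] * kernel[i][j]
            row.append(total)
        output.append(row)
    return output


# Toy image: dark left half, bright right half
image = [
    [0, 0, 9, 9],
    [0, 0, 9, 9],
    [0, 0, 9, 9],
    [0, 0, 9, 9],
]
# Hand-picked kernel that responds where brightness rises from left to right
kernel = [
    [-1, 1],
    [-1, 1],
]

feature_map = convolve2d(image, kernel)
for row in feature_map:
    print(row)
```

The response is non-zero only in the middle column, where the window straddles the dark-to-bright edge. That is the sense in which a kernel "detects" a feature.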

Try it out

We will try out a Deep Learning model built with Python and Tensorflow in a Jupyter Notebook. The task is to predict which of 4 kinds of fruit appears in a picture, and to see whether the model predicts the pictures we feed it correctly.

The overall steps are:

Data preparing
Create model
Train Model
Save
Load
Prediction

Before starting, the following must be installed:

CUDA
cuDNN for CUDA
Tensorflow, Tensorflow-gpu

Once everything is installed, open Jupyter Notebook and check that the machine is ready for Image Classification.

This command checks whether the graphics card can be used:

!nvidia-smi -L

output : one line for each GPU that the driver detects

This code configures Tensorflow to run on the GPU:

import tensorflow as tf
config = tf.compat.v1.ConfigProto()
config.gpu_options.allow_growth = True
sess = tf.compat.v1.Session(config=config)
print( 'Tensorflow Version:', tf.__version__)
print("GPU Available::", tf.config.list_physical_devices('GPU'))

If all of the commands above ran successfully, move on to the next step.

Data preparing

Start by collecting pictures of the 4 kinds of fruit that will make up our Dataset. Download the pictures and save them as .jpg files. This example uses dragon fruit, lemon, pineapple, and watermelon, with 30 pictures of each.
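Before training, it can help to confirm that every class has the expected number of pictures. The sketch below assumes one subfolder per fruit, which is the layout implied by the `*/*.jpg` pattern and the `watermelon` folder used later in this post. The helper name count_images_per_class is hypothetical, and the temporary folder of empty placeholder files exists only so the example runs anywhere.

```python
import pathlib
import tempfile


def count_images_per_class(root, pattern="*.jpg"):
    """Return {class_folder_name: number of matching files} for each subfolder of root."""
    root = pathlib.Path(root)
    return {
        folder.name: len(list(folder.glob(pattern)))
        for folder in sorted(root.iterdir())
        if folder.is_dir()
    }


# Build a throwaway dataset of empty placeholder files (not real images)
with tempfile.TemporaryDirectory() as tmp:
    for fruit in ["dragonfruit", "lemon", "pineapple", "watermelon"]:
        class_dir = pathlib.Path(tmp) / fruit
        class_dir.mkdir()
        for i in range(30):
            (class_dir / f"Image_{i}.jpg").touch()
    counts = count_images_per_class(tmp)

print(counts)
```

On a real dataset, call count_images_per_class with the dataset folder instead of the temporary one. A class with far fewer pictures than the others is worth fixing before training.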

Back in Jupyter Notebook, import the packages we need:

import PIL
import time
import numpy as np
import matplotlib.pyplot as plt
import plotly
import plotly.graph_objs as go
from tensorflow.keras.datasets import mnist
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout, Flatten, Conv2D, MaxPooling2D
from tensorflow.keras.models import load_model
from tensorflow.keras.utils import plot_model
from tensorflow.keras.models import model_from_json
from sklearn.metrics import classification_report
from sklearn.metrics import confusion_matrix
import pickle as p
from tensorflow.keras import layers
to_categorical = tf.keras.utils.to_categorical

Next, point the notebook at the Dataset:

import pathlib
data_set_dir = "E:/AI/dataset_image"
data_dir = pathlib.Path(data_set_dir)   
image_count = len(list(data_dir.glob('*/*.jpg')))  
print("Number of .jpg images:", image_count)

data_set_dir is the path of the Dataset prepared earlier.

image_count is the size of the dataset.

output : the total number of .jpg image files

Next, set a common size for all pictures before Training:

batch_size = 10 
img_height = 150
img_width = 150

batch_size is the number of pictures read at a time.

img_height is the height of each picture in pixels.

img_width is the width of each picture in pixels.
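As a rough illustration of what reading a fixed number of pictures at a time means, the sketch below cuts a list into consecutive batches. The helper name make_batches and the 25 file names are hypothetical. Tensorflow's dataset pipeline does its own batching, so this only shows the idea.

```python
def make_batches(items, batch_size):
    """Yield consecutive slices of `items`, each holding at most `batch_size` elements."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


file_names = [f"Image_{i}.jpg" for i in range(25)]  # hypothetical file names
batches = list(make_batches(file_names, batch_size=10))
batch_lengths = [len(batch) for batch in batches]
print(batch_lengths)
```

When the number of pictures is not a multiple of batch_size, the last batch is simply smaller than the others.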

Next, build the training and validation datasets before creating the Model:

train_ds = tf.keras.preprocessing.image_dataset_from_directory(
  data_dir,
  validation_split=0.2,  # split the data: 80% for training, 20% for validation
  subset="training",
  seed=123,
  image_size=(img_height, img_width),
  batch_size=batch_size)

val_ds = tf.keras.preprocessing.image_dataset_from_directory(
  data_dir,
  validation_split=0.2,
  subset="validation",
  seed=123,
  image_size=(img_height, img_width),
  batch_size=batch_size)

validation_split sets the share of the data held out for validation: here 80% is used for training and 20% for validation.
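The two dataset calls use the same seed, and the sketch below suggests why: with an identical seeded shuffle, the training and validation subsets never overlap. This is a simplified stand-in for the idea, not Keras's actual implementation. The helper name split_train_validation is hypothetical, and the integers stand in for 4 classes of 30 pictures.

```python
import random


def split_train_validation(samples, validation_split, seed):
    """Shuffle a copy of `samples` reproducibly, then hold out the last fraction."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    num_val = int(validation_split * len(shuffled))
    cut = len(shuffled) - num_val
    return shuffled[:cut], shuffled[cut:]


samples = list(range(120))  # stand-ins for 4 classes x 30 pictures
train_a, val_a = split_train_validation(samples, validation_split=0.2, seed=123)
train_b, val_b = split_train_validation(samples, validation_split=0.2, seed=123)

print(len(train_a), len(val_a))          # sizes of the two subsets
print(set(train_a).isdisjoint(val_a))    # no picture lands in both subsets
print(val_a == val_b)                    # same seed, same split
```

If the two calls used different seeds, each would shuffle differently and some pictures could end up in both subsets.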

Next, check which classes were found:

class_names = train_ds.class_names
print(class_names)

output : the class names, which match the fruit names

Next, display 9 pictures from the Dataset together with their fruit names:

import matplotlib.pyplot as plt
plt.figure(figsize=(10, 10))
for images, labels in train_ds.take(1):
  for i in range(9):
    ax = plt.subplot(3, 3, i + 1)
    plt.imshow(images[i].numpy().astype("uint8"))
    plt.title(class_names[labels[i]])
    plt.axis("off")

Next, check that the data has the size we configured:

for image_batch, labels_batch in train_ds:  # image shape: (batch, height, width, color channels)
  print(image_batch.shape)
  print(labels_batch.shape)
  break

output : (batch_size, img_height, img_width, 3) for the images and (batch_size,) for the labels

Next, normalize all pictures. The Rescaling layer multiplies every value by 1/255, which converts the color values in each channel from the 0–255 range to 0–1. This is a linear rescaling, not a sigmoid function.

normalization_layer = layers.experimental.preprocessing.Rescaling(1./255)
normalized_ds = train_ds.map(lambda x, y: (normalization_layer(x), y))
image_batch, labels_batch = next(iter(normalized_ds))
first_image = image_batch[0]
print(np.min(first_image), np.max(first_image))
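The difference between rescaling and a sigmoid is easy to see on a few pixel values. The sketch below uses plain Python on four sample values. The function names rescale and sigmoid are illustrative, and the sigmoid is included only for contrast, since the Rescaling layer does not apply one.

```python
import math


def rescale(pixel):
    """Linear rescaling: what multiplying by 1/255 does to one pixel value."""
    return pixel * (1.0 / 255)


def sigmoid(x):
    """Logistic sigmoid, shown only for contrast."""
    return 1.0 / (1.0 + math.exp(-x))


pixels = [0, 64, 128, 255]
rescaled = [rescale(p) for p in pixels]
squashed = [sigmoid(p) for p in pixels]

print([round(v, 3) for v in rescaled])
print([round(v, 3) for v in squashed])
```

Rescaling keeps the values evenly spread between 0 and 1, so differences in brightness are preserved. A sigmoid applied to raw pixel values would push almost every non-zero pixel to nearly 1 and erase those differences.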

Create model

num_classes = 4
epochs= 15
model = Sequential([
  layers.experimental.preprocessing.Rescaling(1./255, input_shape=(img_height, img_width, 3)),
  layers.Conv2D(16, 3, padding='same', activation='relu'),
  layers.MaxPooling2D(),
  layers.Conv2D(32, 3, padding='same', activation='relu'),
  layers.MaxPooling2D(),
  layers.Conv2D(64, 3, padding='same', activation='relu'),
  layers.MaxPooling2D(),
  layers.Flatten(),
  layers.Dense(128, activation='relu'),
  layers.Dense(num_classes)
])

epochs is the number of passes made over the training data.

num_classes is the number of classes, which is 4 here.

The layers listed in model define how the network will classify the pictures.
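The size of this network can be worked out by hand, which is a useful cross-check against model.summary(). The sketch below assumes the layers exactly as listed: 3x3 kernels with padding='same', default 2x2 pooling with stride 2, and RGB input of 150x150. The helper names conv_params and dense_params are hypothetical.

```python
def conv_params(kernel_size, in_channels, filters):
    """Weights plus one bias per filter for a square-kernel Conv2D layer."""
    return (kernel_size * kernel_size * in_channels + 1) * filters


def dense_params(inputs, units):
    """Weights plus one bias per unit for a Dense layer."""
    return (inputs + 1) * units


size, channels = 150, 3  # img_height == img_width == 150, RGB input
total = 0
for filters in (16, 32, 64):
    total += conv_params(3, channels, filters)  # padding='same' keeps the spatial size
    channels = filters
    size //= 2  # default MaxPooling2D halves each side, rounding down

flattened = size * size * channels
total += dense_params(flattened, 128)  # Dense(128)
total += dense_params(128, 4)          # Dense(num_classes) with 4 classes

print("feature map side after pooling:", size)
print("flattened length:", flattened)
print("trainable parameters:", total)
```

Almost all of the parameters sit in the first Dense layer, because it connects every flattened feature to each of its 128 units. The Rescaling layer adds none.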

Next, compile the model so that Accuracy is reported during Training:

model.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=['accuracy']) 
model.summary()
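The loss chosen above works on integer class labels and raw model outputs (logits), which is why the last Dense layer has no activation. The sketch below shows, in plain Python and for a single sample, the standard calculation this kind of loss performs: a softmax followed by the negative log-probability of the true class. The logits are made-up numbers and the function names are illustrative; the Keras class also averages over a batch.

```python
import math


def softmax(logits):
    """Convert raw scores to probabilities that sum to 1 (shifted for numerical stability)."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sparse_categorical_crossentropy(label, logits):
    """Loss for one sample: negative log-probability of the true class index."""
    return -math.log(softmax(logits)[label])


logits = [2.0, 0.5, 0.1, -1.0]  # made-up raw scores for 4 classes
loss_if_correct = sparse_categorical_crossentropy(0, logits)  # true class has the top score
loss_if_wrong = sparse_categorical_crossentropy(3, logits)    # true class has the lowest score
print(round(loss_if_correct, 3), round(loss_if_wrong, 3))
```

The loss is small when the true class already has the highest score and grows as the true class is ranked lower, which is the signal the optimizer uses to adjust the weights.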

Train Model

start = time.time()
his = model.fit(
  train_ds,
  validation_data = val_ds,
  epochs = epochs
)

The History object returned by fit, which holds the loss and accuracy for each epoch, is stored in the variable his.
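The per-epoch values can be read from his.history, which is a dictionary of lists with keys such as loss and val_loss. As a small sketch of how that dictionary might be used, the code below finds the epoch with the lowest validation loss. The numbers are invented for illustration and are not results from the fruit model, and the helper name best_epoch is hypothetical.

```python
# Illustrative numbers only, not results from the fruit model
history = {
    "loss":     [1.30, 0.80, 0.45, 0.25, 0.12],
    "val_loss": [1.10, 0.75, 0.60, 0.66, 0.81],
}


def best_epoch(history, key="val_loss"):
    """Return the 1-based epoch with the lowest value for `key`."""
    values = history[key]
    return values.index(min(values)) + 1


epoch = best_epoch(history)
print("lowest val_loss at epoch", epoch)
```

When the training loss keeps falling while val_loss starts to rise, as in these invented numbers, the model is likely starting to overfit.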

Save

Save the History and the Model's Weights so the model can be reused right away without Training again.

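# Save the training history (a dict of per-epoch metrics)
with open('history_model', 'wb') as file:
    p.dump(his.history, file)

# Save the full model (architecture + weights) in one file
filepath = 'model1.h5'
model.save(filepath)

# Also save the architecture as JSON and the weights separately
filepath_model = 'model1.json'
filepath_weights = 'weights_model.h5'
model_json = model.to_json()
with open(filepath_model, "w") as json_file:
    json_file.write(model_json)
model.save_weights(filepath_weights)
print("Saved model to disk")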
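The history file is an ordinary pickle of a dictionary, so it can be checked with a quick round trip. The sketch below writes placeholder values to a temporary folder and reads them back. The values are not real training results, and the temporary folder is used only to keep the example from touching real files.

```python
import os
import pickle
import tempfile

history = {"loss": [0.9, 0.4], "val_loss": [1.0, 0.6]}  # placeholder values

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "history_model")
    # Write the dictionary in binary mode, as the save step does
    with open(path, "wb") as file:
        pickle.dump(history, file)
    # Read it back, as the load step does
    with open(path, "rb") as file:
        restored = pickle.load(file)

print(restored == history)
```

Only load pickle files that you created yourself, because unpickling a file from an untrusted source can run arbitrary code.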

Load

Load the History, plot the loss and val_loss curves, and load the Model's Weights.

# Load the saved training history
with open('history_model', 'rb') as file:
    his = p.load(file)

filepath = 'model1.h5'
filepath_model = 'model1.json'
filepath_weights = 'weights_model.h5'

# Plot training loss and validation loss per epoch
h1 = go.Scatter(y=his['loss'],
                mode="lines",
                line=dict(width=2, color='blue'),
                name="loss")
h2 = go.Scatter(y=his['val_loss'],
                mode="lines",
                line=dict(width=2, color='red'),
                name="val_loss")

data = [h1, h2]
layout1 = go.Layout(title='Loss',
                    xaxis=dict(title='epochs'),
                    yaxis=dict(title=''))
fig1 = go.Figure(data, layout=layout1)
plotly.offline.iplot(fig1, filename="testfruit")

# Option 1: load the full model (architecture + weights) from the .h5 file
predict_model = load_model(filepath)
predict_model.summary()

# Option 2: rebuild the architecture from JSON, then load the weights
with open(filepath_model, 'r') as f:
    loaded_model_json = f.read()
predict_model = model_from_json(loaded_model_json)
predict_model.load_weights(filepath_weights)
print("Loaded model from disk")

output : the Loss plotted against the number of training epochs

Prediction

Classify a fruit with the trained Model, using a picture prepared in advance.

Pass the picture's Path to the Model and have it Predict which fruit the picture shows, using this code:

import requests
from IPython.display import Image
from io import BytesIO
from tensorflow import keras

test_path = "E:/AI/dataset_image/watermelon/Image_28.jpg"
img = keras.preprocessing.image.load_img(
    test_path, target_size=(img_height, img_width)
)
img_array = keras.preprocessing.image.img_to_array(img)
img_array = tf.expand_dims(img_array, 0) # Create a batch
predictions = predict_model.predict(img_array)
score = tf.nn.softmax(predictions[0])
print("dragonfruit",score[0])
print("lemon",score[1])
print( "pineapple",score[2])
print( "watermelon",score[3])
display(Image(filename=test_path,width=150, height=150))

test_path is the path of the picture to Predict.

In the output, compare the computed scores: watermelon has the highest score, so the model reports the picture as watermelon with a score of 100%.
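Instead of printing all four scores and comparing them by eye, the highest one can be picked programmatically. The sketch below does this in plain Python. The logits are made-up numbers rather than output from the fruit model, and the helper name top_prediction is hypothetical. The class order follows the alphabetical folder order that image_dataset_from_directory uses.

```python
import math


def softmax(logits):
    """Convert raw scores to probabilities that sum to 1 (shifted for numerical stability)."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_prediction(logits, class_names):
    """Return the most likely class name and its softmax probability."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return class_names[best], probs[best]


class_names = ["dragonfruit", "lemon", "pineapple", "watermelon"]
logits = [-3.2, -1.5, 0.4, 9.8]  # made-up model outputs for one picture
label, confidence = top_prediction(logits, class_names)
print(f"{label}: {confidence:.1%}")
```

A displayed score of 100% is usually a rounded value slightly below 1, since softmax assigns every class some probability greater than zero.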

Reference

https://blog.pjjop.org/deep-learning/