How to save/restore a model after training in keras?

Published Aug 24, 2021. Last updated Feb 19, 2022.
Hello, my name is Alex Polymath.
This is another post about neural network fundamentals: saving and loading
the weights of a trained neural network (NN).

You can run the code in Google Colab or on your own computer.

Overview

In this tutorial we will use the MNIST dataset from Kaggle.

  1. Prepare the data for training
  2. Train the neural network
  3. Save it
  4. Load it
  5. Test it on test data

1. Download data from Kaggle.

There will be 2 files

  • train.csv.zip
  • test.csv.zip
    I have no idea why, but the test file doesn't make much sense,
    since there are no labels in it.

https://www.kaggle.com/oddrationale/mnist-in-csv
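
If you work on your own computer rather than in Colab, the archive can also be unpacked from Python. The following is a minimal sketch using only the standard library; the helper name extract_zip and the tiny demo archive are assumptions made for illustration, not part of the original tutorial.

```python
import os
import tempfile
import zipfile


def extract_zip(zip_path, dest_dir):
    """Extract every member of zip_path into dest_dir and return the member names."""
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest_dir)
        return archive.namelist()


# Self-contained demo: build a tiny stand-in archive so the sketch runs anywhere.
# With the real download you would pass the path to train.csv.zip instead.
demo_dir = tempfile.mkdtemp()
demo_zip = os.path.join(demo_dir, "train.csv.zip")
with zipfile.ZipFile(demo_zip, "w") as archive:
    archive.writestr("train.csv", "label,pixel0\n5,0\n")

extracted = extract_zip(demo_zip, demo_dir)
print(extracted)
```

After extraction, pass the resulting CSV path to pd.read_csv in place of the /content/... path used for Colab.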

If you are using Google Colab:

Drag'n'drop the train.csv.zip file into the Files panel 😃

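!unzip train.csv.zip
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
plt.style.use('ggplot')
import keras

from sklearn.preprocessing import LabelBinarizer
from sklearn.model_selection import train_test_split
# PlotLossesKeras is used later in model.fit; it is assumed to come from the
# livelossplot package (install it first with: !pip install livelossplot)
from livelossplot import PlotLossesKeras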

Cook the data

train_df = pd.read_csv('/content/train.csv')  # adjust the path if your file is elsewhere
train_labels = train_df['label']  # Y values (labels)
train_labels = train_labels.to_numpy()  # nothing smart, just convert to a NumPy array
del train_df['label']  # drop the label column so the rest can be used as X
train_data = train_df.to_numpy()


# The categorical_crossentropy loss used below expects one-hot labels,
# not raw class numbers such as 0, 1, 2, ...
# So we encode them as [1,0,0,...], [0,1,0,...], ...
y = LabelBinarizer().fit_transform(train_labels)


# Split into train and test data
X_train, X_test, y_train, y_test = train_test_split(train_data, y, test_size=0.1)
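
To make the one-hot step concrete, here is a small NumPy-only sketch. It is not the LabelBinarizer call itself; it assumes ten digit classes and builds the equivalent encoding by indexing an identity matrix. The sample labels are made up for illustration.

```python
import numpy as np

# Hypothetical digit labels, standing in for train_labels
labels = np.array([3, 0, 9, 3])

# Row i of the 10x10 identity matrix is the one-hot vector for class i
one_hot = np.eye(10, dtype=int)[labels]
print(one_hot[0])  # the 1 sits at index 3

# argmax along each row turns one-hot vectors back into class numbers
recovered = one_hot.argmax(axis=1)
print(recovered)
```

The same argmax trick is what later turns the network's softmax output into a predicted digit.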

2. Compile and train neural network

# Define a sequential model
model = keras.Sequential()

# First hidden layer: takes the 784 pixel values of a flattened 28x28 image
model.add(keras.layers.Dense(128, activation="relu", input_shape=(784,)))
# Three more hidden layers; Keras infers their input shape from the previous layer
model.add(keras.layers.Dense(128, activation="relu"))
model.add(keras.layers.Dense(128, activation="relu"))
model.add(keras.layers.Dense(128, activation="relu"))

# Output layer: softmax over the 10 digit classes
model.add(keras.layers.Dense(10, activation='softmax'))

# Compile the model
model.compile('adam', loss='categorical_crossentropy', metrics=['accuracy'])

# Fit the model on the training split only, so that X_test stays unseen for validation.
# callbacks=[PlotLossesKeras()] is a single magic line of code which draws a live chart
model.fit(X_train, y_train, epochs=10, validation_data=(X_test, y_test), callbacks=[PlotLossesKeras()], verbose=0)

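3. Save the model

# model.save stores the whole model (architecture, weights and optimizer state),
# not only the weights.
# Note: Keras 3 requires the path to end in .keras or .h5, e.g. '/content/mynn.keras'
model.save('/content/mynn')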

4. Load the model

# Use the same path that was passed to model.save
test_model = keras.models.load_model('/content/mynn')
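
A quick way to check that saving and loading worked is to compare predictions before and after the round trip. The sketch below is only an illustration under assumptions: it uses a tiny untrained stand-in model, random inputs, a temporary directory, and the hypothetical file names mynn.keras and mynn.weights.h5. It also shows the weights-only variant (save_weights / load_weights), which needs the architecture to be rebuilt in code first.

```python
import os
import tempfile

import numpy as np
import keras


def build_model():
    """Small stand-in network with the tutorial's input and output sizes (784 -> 10)."""
    return keras.Sequential([
        keras.Input(shape=(784,)),
        keras.layers.Dense(16, activation="relu"),
        keras.layers.Dense(10, activation="softmax"),
    ])


model = build_model()
model.compile("adam", loss="categorical_crossentropy", metrics=["accuracy"])

# Random inputs are enough here: we only compare outputs, not accuracy
x = np.random.rand(5, 784).astype("float32")

with tempfile.TemporaryDirectory() as tmp:
    # Whole model: architecture + weights + optimizer state
    full_path = os.path.join(tmp, "mynn.keras")
    model.save(full_path)
    restored = keras.models.load_model(full_path)

    # Weights only: the architecture must be recreated before loading
    weights_path = os.path.join(tmp, "mynn.weights.h5")
    model.save_weights(weights_path)
    rebuilt = build_model()
    rebuilt.load_weights(weights_path)

before = model.predict(x, verbose=0)
same_full = np.allclose(before, restored.predict(x, verbose=0), atol=1e-6)
same_weights = np.allclose(before, rebuilt.predict(x, verbose=0), atol=1e-6)
print(same_full, same_weights)
```

If both comparisons hold, the restored models behave like the original. Saving the whole model is the simpler choice when you just want to reuse it later; weights-only files are handy when the architecture already lives in your code.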

5. Test it on test data

test_df = pd.read_csv('/content/mnist_test.csv')  # adjust the path if your file is elsewhere
test_labels = test_df['label']  # Y values (labels)
test_labels = test_labels.to_numpy()  # nothing smart, just convert to a NumPy array
del test_df['label']  # drop the label column so the rest can be used as X
test_data = test_df.to_numpy()

Let's visualize one item from the test set

img = test_data[3].reshape(28,28)
plt.imshow(img)


# predict works on a batch of inputs, so we wrap the single item
# in an array to get shape (1, 784)
y_proba = test_model.predict(np.array([test_data[3]]))
y_classes = y_proba.argmax(axis=-1)
y_classes
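
The batching trick and the argmax step can be tried without a trained model. This NumPy-only sketch uses random stand-in images and a made-up probability matrix with three classes (both are assumptions for illustration) to show the shapes involved and how accuracy could be computed from predictions for many items at once.

```python
import numpy as np

rng = np.random.default_rng(0)
images = rng.integers(0, 256, size=(6, 784))  # stand-in for test_data

single = images[3]                # shape (784,): one flattened image
batch_a = np.array([single])      # shape (1, 784): the wrapping used above
batch_b = images[3:4]             # same batch via slicing
batch_c = single[np.newaxis, :]   # same batch via a new leading axis
print(single.shape, batch_a.shape)

# Made-up softmax outputs for three items and three classes
y_proba = np.array([
    [0.1, 0.7, 0.2],
    [0.8, 0.1, 0.1],
    [0.2, 0.2, 0.6],
])
true_labels = np.array([1, 0, 1])

predicted = y_proba.argmax(axis=-1)           # most probable class per row
accuracy = (predicted == true_labels).mean()  # fraction of correct predictions
print(predicted, accuracy)
```

With the real model, test_model.predict(test_data) would produce one probability row per test image, and comparing its argmax with test_labels gives the test accuracy.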

Follow me on Twitter
@alexpolymath
