Basic regression with Keras - Simulated Data / Big Batch Size
In this series we will work with a simple CSV data file whose structure is:
| indf | ndvi | dnbr |
| --- | --- | --- |
These data are simulated: $ndvi$ is drawn at random and $dnbr=3\,ndvi+1+\epsilon$ with $\epsilon\sim \mathcal{N}(0,0.1)$ and $ndvi\sim \mathcal{U}[0,1]$.
As for indf, it is our primary key column. Three shuffled files of indf values allow us to train, validate and test our algorithm.
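The original simulation script is not shown here; the following is only a minimal sketch of how such a file could be produced (the sample count, the split proportions and the reading of 0.1 as a standard deviation are assumptions):
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
n = 4096                                            # arbitrary sample count
ndvi = rng.uniform(0.0, 1.0, n)                     # ndvi ~ U[0, 1]
dnbr = 3.0 * ndvi + 1.0 + rng.normal(0.0, 0.1, n)   # dnbr = 3*ndvi + 1 + eps
df = pd.DataFrame({"indf": np.arange(n), "ndvi": ndvi, "dnbr": dnbr})
df.to_csv("test_data.csv", index=False)
# the three id files can then be obtained by shuffling and splitting the indf column
ids = df["indf"].sample(frac=1.0, random_state=0)
ids.iloc[: int(0.8 * n)].to_csv("training.csv", index=False)
ids.iloc[int(0.8 * n): int(0.9 * n)].to_csv("validation.csv", index=False)
ids.iloc[int(0.9 * n):].to_csv("test.csv", index=False)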
The next few lines are just a trick to center the final plot layout:
from IPython.core.display import HTML as Center
Center(""" <style>
.output_png {
display: table-cell;
text-align: center;
vertical-align: middle;
height:1600;
width:600;
}
</style> """)
Let’s begin by loading the necessary libraries:
import os
import re
import random
from matplotlib import pyplot as plt
from numpy.polynomial.polynomial import polyfit
from scipy.stats import gaussian_kde
import pandas as pd
import numpy as np
from tensorflow import keras
from tensorflow.keras import backend as K
from tensorflow.keras import layers
from generator_from_one_file import DataGenerator
Then we can prepare the data based on the three files mentioned earlier:
data_file = "test_data.csv"
training_ids_file="training.csv"
#training_ids_file=data_path+"train.csv"
test_ids_file="test.csv"
#test_ids_file=data_path+"extra.csv"
validation_ids_file="validation.csv"
#validation_ids_file=data_path+"val.csv"
We make heavy use of generators (see generator_from_one_file.py):
batch_size=32
training=DataGenerator(training_ids_file,data_file,batch_size)
training_count=training.get_infos()
test=DataGenerator(test_ids_file,data_file,batch_size)
test_count=test.get_infos()
validation=DataGenerator(validation_ids_file,data_file,batch_size)
validation_count=validation.get_infos()
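generator_from_one_file.py is not reproduced here; below is a minimal sketch of what such a generator could look like, assuming it subclasses keras.utils.Sequence, filters the data file on the ids it is given, and that get_infos() returns the number of samples of the split (assumptions, not the original implementation):
import pandas as pd
from tensorflow import keras

class DataGeneratorSketch(keras.utils.Sequence):  # hypothetical stand-in for DataGenerator
    def __init__(self, ids_file, data_file, batch_size):
        ids = pd.read_csv(ids_file)
        data = pd.read_csv(data_file)
        # keep only the rows whose primary key belongs to this split
        self.data = data[data['indf'].isin(ids.indf)]
        self.batch_size = batch_size

    def get_infos(self):
        # assumed to return the number of samples available to this split
        return len(self.data)

    def __len__(self):
        return len(self.data) // self.batch_size

    def __getitem__(self, idx):
        batch = self.data.iloc[idx * self.batch_size:(idx + 1) * self.batch_size]
        # reshape the inputs to (batch_size, 1) to match Input(shape=(1,))
        return batch['ndvi'].to_numpy().reshape(-1, 1), batch['dnbr'].to_numpy()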
Now it is time to implement the network: a very simple one, a single layer with one neuron and no activation function:
def build_model():
    model = keras.Sequential([
        layers.Input(shape=(1,)),
        layers.Dense(1, activation="linear"),
    ])
    opt = keras.optimizers.Adam()
    model.compile(optimizer=opt, loss="mse", metrics=["mae"])
    return model
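With no activation, this single neuron simply computes $$\widehat{dnbr}=w\cdot ndvi+b,$$ so its two trainable parameters $w$ and $b$ should converge towards the simulated slope $3$ and intercept $1$.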
We write a simple callback that will fill a dictionary with the weight and the bias at the end of each epoch:
class GetWeights(keras.callbacks.Callback):
    def __init__(self):
        super(GetWeights, self).__init__()
        self.weight_dict = {}

    def on_epoch_end(self, epoch, logs=None):
        w = self.model.layers[0].get_weights()[0]
        b = self.model.layers[0].get_weights()[1]
        if epoch == 0:
            # create arrays to hold the weight and bias histories
            self.weight_dict['w_'] = w
            self.weight_dict['b_'] = b
        else:
            # append the new weight to the previously created weights array
            self.weight_dict['w_'] = np.dstack(
                (self.weight_dict['w_'], w))
            # append the new bias to the previously created biases array
            self.weight_dict['b_'] = np.dstack(
                (self.weight_dict['b_'], b))

gw = GetWeights()
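After training, the recorded trajectories can be flattened for inspection; a small sketch (the np.dstack calls above produce arrays of shape (1, 1, n_epochs)):
w_history = np.ravel(gw.weight_dict['w_'])  # weight value at the end of each epoch
b_history = np.ravel(gw.weight_dict['b_'])  # bias value at the end of each epoch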
Let’s define the callbacks that will allow us to analyse our model:
callbacks = [
    keras.callbacks.ModelCheckpoint('one_file.keras',
                                    save_best_only=True),
    keras.callbacks.TensorBoard(log_dir="./logs", histogram_freq=1),
    gw
]
Build the model and train it:
model=build_model()
model.summary()
history = model.fit(training,
                    batch_size=batch_size,
                    steps_per_epoch=int(training_count/batch_size),
                    epochs=50,
                    validation_data=validation,
                    validation_steps=int(validation_count/batch_size),
                    callbacks=callbacks,
                    use_multiprocessing=True,
                    workers=30,
                    )
2022-04-25 09:47:13.855388: I tensorflow/core/platform/cpu_feature_guard.cc:151] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 AVX512F FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-04-25 09:47:14.948320: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 93 MB memory: -> device: 0, name: Quadro P2000, pci bus id: 0000:3b:00.0, compute capability: 6.1
2022-04-25 09:47:14.949095: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /job:localhost/replica:0/task:0/device:GPU:1 with 4188 MB memory: -> device: 1, name: Quadro P2000, pci bus id: 0000:af:00.0, compute capability: 6.1
Model: "sequential"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
dense (Dense) (None, 1) 2
=================================================================
Total params: 2
Trainable params: 2
Non-trainable params: 0
_________________________________________________________________
Epoch 1/50
104/104 [==============================] - 4s 23ms/step - loss: 9.4443 - mae: 2.8659 - val_loss: 9.2228 - val_mae: 2.8292
Epoch 2/50
104/104 [==============================] - 3s 26ms/step - loss: 8.5376 - mae: 2.7154 - val_loss: 8.3244 - val_mae: 2.6775
Epoch 3/50
104/104 [==============================] - 3s 26ms/step - loss: 7.6955 - mae: 2.5666 - val_loss: 7.5172 - val_mae: 2.5340
Epoch 4/50
104/104 [==============================] - 3s 24ms/step - loss: 6.9355 - mae: 2.4259 - val_loss: 6.7525 - val_mae: 2.3898
Epoch 5/50
104/104 [==============================] - 3s 22ms/step - loss: 6.2222 - mae: 2.2859 - val_loss: 6.0568 - val_mae: 2.2514
Epoch 6/50
104/104 [==============================] - 3s 24ms/step - loss: 5.5676 - mae: 2.1495 - val_loss: 5.4126 - val_mae: 2.1153
Epoch 7/50
104/104 [==============================] - 3s 25ms/step - loss: 4.9664 - mae: 2.0170 - val_loss: 4.8338 - val_mae: 1.9860
Epoch 8/50
104/104 [==============================] - 3s 24ms/step - loss: 4.4226 - mae: 1.8888 - val_loss: 4.2967 - val_mae: 1.8579
Epoch 9/50
104/104 [==============================] - 4s 28ms/step - loss: 3.9290 - mae: 1.7661 - val_loss: 3.8142 - val_mae: 1.7356
Epoch 10/50
104/104 [==============================] - 3s 24ms/step - loss: 3.4722 - mae: 1.6438 - val_loss: 3.3736 - val_mae: 1.6166
Epoch 11/50
104/104 [==============================] - 3s 23ms/step - loss: 3.0685 - mae: 1.5302 - val_loss: 2.9784 - val_mae: 1.5029
Epoch 12/50
104/104 [==============================] - 3s 24ms/step - loss: 2.6997 - mae: 1.4194 - val_loss: 2.6192 - val_mae: 1.3938
Epoch 13/50
104/104 [==============================] - 3s 22ms/step - loss: 2.3711 - mae: 1.3153 - val_loss: 2.2957 - val_mae: 1.2902
Epoch 14/50
104/104 [==============================] - 3s 24ms/step - loss: 2.0744 - mae: 1.2159 - val_loss: 2.0112 - val_mae: 1.1948
Epoch 15/50
104/104 [==============================] - 3s 24ms/step - loss: 1.8146 - mae: 1.1273 - val_loss: 1.7588 - val_mae: 1.1066
Epoch 16/50
104/104 [==============================] - 3s 24ms/step - loss: 1.5816 - mae: 1.0441 - val_loss: 1.5352 - val_mae: 1.0249
Epoch 17/50
104/104 [==============================] - 3s 24ms/step - loss: 1.3773 - mae: 0.9679 - val_loss: 1.3363 - val_mae: 0.9498
Epoch 18/50
104/104 [==============================] - 3s 24ms/step - loss: 1.1992 - mae: 0.8990 - val_loss: 1.1642 - val_mae: 0.8824
Epoch 19/50
104/104 [==============================] - 3s 24ms/step - loss: 1.0432 - mae: 0.8363 - val_loss: 1.0150 - val_mae: 0.8215
Epoch 20/50
104/104 [==============================] - 3s 24ms/step - loss: 0.9090 - mae: 0.7804 - val_loss: 0.8866 - val_mae: 0.7670
Epoch 21/50
104/104 [==============================] - 3s 23ms/step - loss: 0.7949 - mae: 0.7307 - val_loss: 0.7756 - val_mae: 0.7177
Epoch 22/50
104/104 [==============================] - 3s 26ms/step - loss: 0.6964 - mae: 0.6850 - val_loss: 0.6807 - val_mae: 0.6742
Epoch 23/50
104/104 [==============================] - 3s 24ms/step - loss: 0.6128 - mae: 0.6446 - val_loss: 0.6026 - val_mae: 0.6375
Epoch 24/50
104/104 [==============================] - 3s 24ms/step - loss: 0.5436 - mae: 0.6101 - val_loss: 0.5365 - val_mae: 0.6049
Epoch 25/50
104/104 [==============================] - 3s 22ms/step - loss: 0.4857 - mae: 0.5796 - val_loss: 0.4808 - val_mae: 0.5762
Epoch 26/50
104/104 [==============================] - 3s 24ms/step - loss: 0.4378 - mae: 0.5533 - val_loss: 0.4344 - val_mae: 0.5513
Epoch 27/50
104/104 [==============================] - 3s 24ms/step - loss: 0.3980 - mae: 0.5304 - val_loss: 0.3960 - val_mae: 0.5297
Epoch 28/50
104/104 [==============================] - 3s 25ms/step - loss: 0.3663 - mae: 0.5114 - val_loss: 0.3650 - val_mae: 0.5114
Epoch 29/50
104/104 [==============================] - 3s 27ms/step - loss: 0.3397 - mae: 0.4948 - val_loss: 0.3397 - val_mae: 0.4959
Epoch 30/50
104/104 [==============================] - 3s 24ms/step - loss: 0.3180 - mae: 0.4804 - val_loss: 0.3184 - val_mae: 0.4821
Epoch 31/50
104/104 [==============================] - 3s 22ms/step - loss: 0.3005 - mae: 0.4684 - val_loss: 0.3015 - val_mae: 0.4707
Epoch 32/50
104/104 [==============================] - 3s 24ms/step - loss: 0.2863 - mae: 0.4579 - val_loss: 0.2868 - val_mae: 0.4602
Epoch 33/50
104/104 [==============================] - 3s 24ms/step - loss: 0.2741 - mae: 0.4488 - val_loss: 0.2743 - val_mae: 0.4508
Epoch 34/50
104/104 [==============================] - 3s 24ms/step - loss: 0.2636 - mae: 0.4403 - val_loss: 0.2644 - val_mae: 0.4430
Epoch 35/50
104/104 [==============================] - 3s 25ms/step - loss: 0.2542 - mae: 0.4323 - val_loss: 0.2546 - val_mae: 0.4349
Epoch 36/50
104/104 [==============================] - 3s 24ms/step - loss: 0.2459 - mae: 0.4252 - val_loss: 0.2461 - val_mae: 0.4277
Epoch 37/50
104/104 [==============================] - 3s 25ms/step - loss: 0.2382 - mae: 0.4183 - val_loss: 0.2380 - val_mae: 0.4205
Epoch 38/50
104/104 [==============================] - 3s 24ms/step - loss: 0.2307 - mae: 0.4112 - val_loss: 0.2304 - val_mae: 0.4137
Epoch 39/50
104/104 [==============================] - 3s 25ms/step - loss: 0.2237 - mae: 0.4047 - val_loss: 0.2227 - val_mae: 0.4065
Epoch 40/50
104/104 [==============================] - 3s 25ms/step - loss: 0.2166 - mae: 0.3980 - val_loss: 0.2154 - val_mae: 0.3997
Epoch 41/50
104/104 [==============================] - 3s 24ms/step - loss: 0.2097 - mae: 0.3914 - val_loss: 0.2083 - val_mae: 0.3928
Epoch 42/50
104/104 [==============================] - 3s 25ms/step - loss: 0.2024 - mae: 0.3841 - val_loss: 0.2009 - val_mae: 0.3856
Epoch 43/50
104/104 [==============================] - 3s 25ms/step - loss: 0.1956 - mae: 0.3774 - val_loss: 0.1939 - val_mae: 0.3788
Epoch 44/50
104/104 [==============================] - 3s 25ms/step - loss: 0.1883 - mae: 0.3701 - val_loss: 0.1866 - val_mae: 0.3715
Epoch 45/50
104/104 [==============================] - 3s 24ms/step - loss: 0.1813 - mae: 0.3629 - val_loss: 0.1791 - val_mae: 0.3637
Epoch 46/50
104/104 [==============================] - 3s 25ms/step - loss: 0.1740 - mae: 0.3554 - val_loss: 0.1721 - val_mae: 0.3565
Epoch 47/50
104/104 [==============================] - 3s 25ms/step - loss: 0.1669 - mae: 0.3478 - val_loss: 0.1648 - val_mae: 0.3486
Epoch 48/50
104/104 [==============================] - 3s 25ms/step - loss: 0.1598 - mae: 0.3402 - val_loss: 0.1574 - val_mae: 0.3405
Epoch 49/50
104/104 [==============================] - 3s 25ms/step - loss: 0.1524 - mae: 0.3320 - val_loss: 0.1501 - val_mae: 0.3323
Epoch 50/50
104/104 [==============================] - 3s 24ms/step - loss: 0.1454 - mae: 0.3242 - val_loss: 0.1429 - val_mae: 0.3241
Now we load the best saved model and evaluate it on the test set:
##################Checking History############################
model = keras.models.load_model('one_file.keras')
print(f"Test MAE: {model.evaluate(test,batch_size=batch_size,steps=int(test_count/batch_size))[1]:.2f}")
loss = history.history["mae"]
val_loss = history.history["val_mae"]
epochs = range(1, len(loss) + 1)
104/104 [==============================] - 0s 2ms/step - loss: 0.1456 - mae: 0.3255
Test MAE: 0.33
first_layer_weights = model.layers[0].get_weights()[0]
first_layer_biases = model.layers[0].get_weights()[1]
print("w = "+str(first_layer_weights[0][0]))
print("b = "+str(first_layer_biases[0]))
print()
print("History of weight: \n"+str(gw.weight_dict['w_']))
print("History of bias: \n"+str(gw.weight_dict['b_']))
w = 1.7393888
b = 1.6759396
History of weight:
[[[-0.8327197 -0.7327028 -0.63539875 -0.54049414 -0.4482258
-0.35782048 -0.27035686 -0.18447052 -0.10172622 -0.02062045
0.05727153 0.13337713 0.20724966 0.27921426 0.3479225
0.41440862 0.47908387 0.5417002 0.6010756 0.65853363
0.7135724 0.76665 0.81689113 0.86521685 0.91134405
0.955781 0.9980275 1.0383987 1.0768399 1.1138892
1.1493787 1.1832614 1.2163761 1.2475177 1.2785128
1.3089203 1.3387821 1.3679621 1.3970793 1.4261677
1.4557024 1.4855101 1.5153441 1.5449064 1.5759834
1.6071749 1.6399227 1.6719968 1.705449 1.7393888 ]]]
History of bias:
[[[0.10224409 0.20258835 0.3000701 0.39481494 0.48675668 0.57614195
0.6626023 0.7466329 0.82758945 0.9061539 0.9812596 1.0540725
1.1240187 1.191064 1.2548038 1.3154875 1.3734779 1.4284813
1.479668 1.5279312 1.5729469 1.6147243 1.6527137 1.6875445
1.718733 1.7470683 1.7718154 1.7931625 1.8111029 1.8259947
1.8378032 1.8460872 1.8523203 1.8547686 1.8550116 1.8534969
1.8494251 1.8433033 1.8354458 1.8260137 1.8154101 1.8034348
1.7903117 1.7753468 1.760216 1.7445545 1.7287849 1.7115461
1.6937226 1.6759396 ]]]
A function to retrieve the data used in the plots:
###################################################################
# Plot Result function
####################################################################
def get_plot_datas(ids_file, data_file):
    ids_file = pd.read_csv(ids_file)
    data_file = pd.read_csv(data_file)
    data_file = data_file[['indf', 'dnbr', 'ndvi']]
    datas = data_file[data_file['indf'].isin(ids_file.indf)]
    return datas[['ndvi', 'dnbr']].to_numpy()
Let’s have a brief look at the data. NB: what we call the mean model is simply $$y_{pred}=\bar y_{obs}$$ In that case the MSE is the variance of the observations.
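Indeed, for the mean model the MSE reduces to the (population) variance of the observations:
$$MSE_{mean}=\frac{1}{n}\sum_{i}\left(y_{obs,i}-\bar y_{obs}\right)^{2}=s^{2}_{y}$$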
#################Datas summary####################
datas=get_plot_datas(test_ids_file,data_file)
x=datas[:,0]
y=datas[:,1]
print("Observed statistics:")
print()
print("mean dnbr = "+str(np.mean(y)))
print("dnbr variance (mse with mean model) = "+str(np.square(np.std(y))))
print()
print()
y_pred=model.predict(x)
print("mean dnbr predicted = "+str(np.mean(y_pred[:,0])))
print("mae_from_mean_model ="+str(np.sum(np.abs(y-np.mean(y)))/len(y)))
Observed statistics:
mean dnbr = 2.481034895608084
dnbr variance (mse with mean model) = 0.7607728299290712
mean dnbr predicted = 2.5352304
mae_from_mean_model =0.7521578471518485
In the case of the linear model, the MSE being $$f(a,b)=\frac{1}{n}\sum\left(a\,x_{obs}+b-y_{obs}\right)^{2},$$ the gradient coordinates are $$\frac{\partial f}{\partial a}=\frac{2}{n}\sum x_{obs}\left(a\,x_{obs}+b-y_{obs}\right)$$ and $$\frac{\partial f}{\partial b}=\frac{2}{n}\sum\left(a\,x_{obs}+b-y_{obs}\right),$$ the sums being taken over the batch data.
The minimum is thus obtained by solving $$\frac{\partial f}{\partial a}=0$$ and $$\frac{\partial f}{\partial b}=0,$$ which leads to $$a=\frac{\mathrm{cov}(x,y)}{s^{2}_{x}}$$ and $$b=\bar{y}-a\bar{x}$$
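As a quick cross-check of these formulas, the closed-form solution can be computed directly with numpy (a minimal sketch using the x and y arrays built above; both values should match np.polyfit below up to floating-point error):
# closed-form least squares: a = cov(x, y) / var(x), b = ybar - a * xbar
a_closed = np.cov(x, y, bias=True)[0, 1] / np.var(x)
b_closed = np.mean(y) - a_closed * np.mean(x)
print("a_closed =", a_closed)
print("b_closed =", b_closed)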
We compare the performance of our NN with this elementary model in the next few lines:
a,b=np.polyfit(x,y,1)
print("Results from the linear model:")
print()
print("a (supposed to be equivalent to w in the last section) = "+str(a))
print("b = "+str(b))
linear_mae=sum(abs(y-(a*x+b)))/len(y)
linear_mse=sum(np.square(y-(a*x+b)))/len(y)
print("linear_model_mae = "+str(linear_mae))
print("linear_model_mse = "+str(linear_mse))
print()
print("NN model results:")
Results from the linear model:
a (supposed to be equivalent to w in the last section) = 3.001565491043912
b = 0.9982055323841632
linear_model_mae = 0.0803045100039715
linear_model_mse = 0.009984939104477841
NN model results:
Ready to plot:
##############################PLOTS#######################################
abscisses=[0,1]
ordonnees=[0,1]
figure,axis = plt.subplots(3,2,figsize = (20, 20))
xy = np.vstack([x,y])
z = gaussian_kde(xy)(xy)
axis[0,0].scatter(x,y,c=z,s=1)
axis[0,0].plot(x,a*x+b,'-',color='black',label="linear model")
axis[0,0].plot(x,first_layer_weights[0][0]*x+first_layer_biases[0],c="red",label="NN model")
axis[0,0].set_title("DNBR vs NDVI: Both models")
axis[0,0].legend()
axis[0,1].plot(epochs, loss, "bo", label="Training MAE")
axis[0,1].plot(epochs, val_loss, "b", label="Validation MAE")
axis[0,1].set_title("Training and validation MAE")
axis[0,1].legend()
xy = np.vstack([y,y_pred[:,0]])
z = gaussian_kde(xy)(xy)
axis[1,0].scatter(y,y_pred[:,0],c=z,s=1)
axis[1,0].plot(abscisses,ordonnees,label="y=x")
axis[1,0].set_title("NN Prévisions vs observées")
axis[1,0].legend()
res=np.subtract(y_pred[:,0],y)
xy = np.vstack([x,res])
z = gaussian_kde(xy)(xy)
axis[1,1].scatter(x,res,c=z,s=1)
axis[1,1].set_title("Résidus vs NDVI")
axis[2,0].hist(y_pred)
axis[2,0].set_title("Predictions distribution")
axis[2,1].hist(y)
axis[2,1].set_title("dnbr distribution")
plt.show()