Basic regression with Keras - Simulated Data / Big Batch Size
In this series we will work with a simple CSV data file whose structure is:
| indf | ndvi | dnbr |
| --- | --- | --- |
These data are simulated: $ndvi$ is drawn at random and $dnbr=3\,ndvi+1+\epsilon$ with $\epsilon\sim \mathcal{N}(0,0.1)$ and $ndvi\sim \mathcal{U}[0,1]$.
As for indf, it is our primary key column. Three shuffled files of indf values allow us to train, validate and test our algorithm.
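The original simulation script is not shown here; the following is only a minimal sketch of how such a file could be produced (the sample count, the split proportions and the reading of 0.1 as a standard deviation are assumptions):
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
n = 4096                                            # arbitrary sample count
ndvi = rng.uniform(0.0, 1.0, n)                     # ndvi ~ U[0, 1]
dnbr = 3.0 * ndvi + 1.0 + rng.normal(0.0, 0.1, n)   # dnbr = 3*ndvi + 1 + eps
df = pd.DataFrame({"indf": np.arange(n), "ndvi": ndvi, "dnbr": dnbr})
df.to_csv("test_data.csv", index=False)
# the three id files can then be obtained by shuffling and splitting the indf column
ids = df["indf"].sample(frac=1.0, random_state=0)
ids.iloc[: int(0.8 * n)].to_csv("training.csv", index=False)
ids.iloc[int(0.8 * n): int(0.9 * n)].to_csv("validation.csv", index=False)
ids.iloc[int(0.9 * n):].to_csv("test.csv", index=False)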
The next few lines are just a trick to center the final plot layout:
from IPython.core.display import HTML as Center
Center(""" <style>
.output_png {
display: table-cell;
text-align: center;
vertical-align: middle;
height:1600;
width:600;
}
</style> """)
Let’s begin by loading the necessary libraries:
import os
import re
import random
from matplotlib import pyplot as plt
from numpy.polynomial.polynomial import polyfit
from scipy.stats import gaussian_kde
import pandas as pd
import numpy as np
from tensorflow import keras
from tensorflow.keras import backend as K
from tensorflow.keras import layers
from generator_from_one_file import DataGenerator
Then we can prepare the data based on the three files mentioned earlier:
data_file = "test_data.csv"
training_ids_file="training.csv"
#training_ids_file=data_path+"train.csv"
test_ids_file="test.csv"
#test_ids_file=data_path+"extra.csv"
validation_ids_file="validation.csv"
#validation_ids_file=data_path+"val.csv"
We make heavy use of generators (see generator_from_one_file.py):
batch_size=32
training=DataGenerator(training_ids_file,data_file,batch_size)
training_count=training.get_infos()
test=DataGenerator(test_ids_file,data_file,batch_size)
test_count=test.get_infos()
validation=DataGenerator(validation_ids_file,data_file,batch_size)
validation_count=validation.get_infos()
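generator_from_one_file.py is not reproduced here; below is a minimal sketch of what such a generator could look like, assuming it subclasses keras.utils.Sequence, filters the data file on the ids it is given, and that get_infos() returns the number of samples of the split (assumptions, not the original implementation):
import pandas as pd
from tensorflow import keras

class DataGeneratorSketch(keras.utils.Sequence):  # hypothetical stand-in for DataGenerator
    def __init__(self, ids_file, data_file, batch_size):
        ids = pd.read_csv(ids_file)
        data = pd.read_csv(data_file)
        # keep only the rows whose primary key belongs to this split
        self.data = data[data['indf'].isin(ids.indf)]
        self.batch_size = batch_size

    def get_infos(self):
        # assumed to return the number of samples available to this split
        return len(self.data)

    def __len__(self):
        return len(self.data) // self.batch_size

    def __getitem__(self, idx):
        batch = self.data.iloc[idx * self.batch_size:(idx + 1) * self.batch_size]
        # reshape the inputs to (batch_size, 1) to match Input(shape=(1,))
        return batch['ndvi'].to_numpy().reshape(-1, 1), batch['dnbr'].to_numpy()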
Now it is time to implement the network: a very simple one, a single layer with one neuron and no activation function:
def build_model():
    model = keras.Sequential([
        layers.Input(shape=(1,)),
        layers.Dense(1, activation="linear"),
    ])
    opt = keras.optimizers.Adam()
    model.compile(optimizer=opt, loss="mse", metrics=["mae"])
    return model
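With no activation, this single neuron simply computes $$\widehat{dnbr}=w\cdot ndvi+b,$$ so its two trainable parameters $w$ and $b$ should converge towards the simulated slope $3$ and intercept $1$.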
We write a simple callback that will fill a dictionary with the weight and the bias at the end of each epoch:
class GetWeights(keras.callbacks.Callback):
    def __init__(self):
        super(GetWeights, self).__init__()
        self.weight_dict = {}

    def on_epoch_end(self, epoch, logs=None):
        w = self.model.layers[0].get_weights()[0]
        b = self.model.layers[0].get_weights()[1]
        if epoch == 0:
            # create arrays to hold the weight and bias histories
            self.weight_dict['w_'] = w
            self.weight_dict['b_'] = b
        else:
            # append the new weight to the previously created weights array
            self.weight_dict['w_'] = np.dstack(
                (self.weight_dict['w_'], w))
            # append the new bias to the previously created biases array
            self.weight_dict['b_'] = np.dstack(
                (self.weight_dict['b_'], b))

gw = GetWeights()
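After training, the recorded trajectories can be flattened for inspection; a small sketch (the np.dstack calls above produce arrays of shape (1, 1, n_epochs)):
w_history = np.ravel(gw.weight_dict['w_'])  # weight value at the end of each epoch
b_history = np.ravel(gw.weight_dict['b_'])  # bias value at the end of each epoch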
Let’s define the callbacks that will allow us to analyse our model:
callbacks = [
    keras.callbacks.ModelCheckpoint('one_file.keras',
                                    save_best_only=True),
    keras.callbacks.TensorBoard(log_dir="./logs", histogram_freq=1),
    gw
]
Build the model and train it:
model=build_model()
model.summary()
history = model.fit(training,
                    batch_size=batch_size,
                    steps_per_epoch=int(training_count/batch_size),
                    epochs=50,
                    validation_data=validation,
                    validation_steps=int(validation_count/batch_size),
                    callbacks=callbacks,
                    use_multiprocessing=True,
                    workers=30,
                    )
2022-04-25 09:47:13.855388: I tensorflow/core/platform/cpu_feature_guard.cc:151] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 AVX512F FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-04-25 09:47:14.948320: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 93 MB memory: -> device: 0, name: Quadro P2000, pci bus id: 0000:3b:00.0, compute capability: 6.1
2022-04-25 09:47:14.949095: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /job:localhost/replica:0/task:0/device:GPU:1 with 4188 MB memory: -> device: 1, name: Quadro P2000, pci bus id: 0000:af:00.0, compute capability: 6.1
Model: "sequential"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
dense (Dense) (None, 1) 2
=================================================================
Total params: 2
Trainable params: 2
Non-trainable params: 0
_________________________________________________________________
Epoch 1/50
104/104 [==============================] - 4s 23ms/step - loss: 9.4443 - mae: 2.8659 - val_loss: 9.2228 - val_mae: 2.8292
Epoch 2/50
104/104 [==============================] - 3s 26ms/step - loss: 8.5376 - mae: 2.7154 - val_loss: 8.3244 - val_mae: 2.6775
Epoch 3/50
104/104 [==============================] - 3s 26ms/step - loss: 7.6955 - mae: 2.5666 - val_loss: 7.5172 - val_mae: 2.5340
Epoch 4/50
104/104 [==============================] - 3s 24ms/step - loss: 6.9355 - mae: 2.4259 - val_loss: 6.7525 - val_mae: 2.3898
Epoch 5/50
104/104 [==============================] - 3s 22ms/step - loss: 6.2222 - mae: 2.2859 - val_loss: 6.0568 - val_mae: 2.2514
Epoch 6/50
104/104 [==============================] - 3s 24ms/step - loss: 5.5676 - mae: 2.1495 - val_loss: 5.4126 - val_mae: 2.1153
Epoch 7/50
104/104 [==============================] - 3s 25ms/step - loss: 4.9664 - mae: 2.0170 - val_loss: 4.8338 - val_mae: 1.9860
Epoch 8/50
104/104 [==============================] - 3s 24ms/step - loss: 4.4226 - mae: 1.8888 - val_loss: 4.2967 - val_mae: 1.8579
Epoch 9/50
104/104 [==============================] - 4s 28ms/step - loss: 3.9290 - mae: 1.7661 - val_loss: 3.8142 - val_mae: 1.7356
Epoch 10/50
104/104 [==============================] - 3s 24ms/step - loss: 3.4722 - mae: 1.6438 - val_loss: 3.3736 - val_mae: 1.6166
Epoch 11/50
104/104 [==============================] - 3s 23ms/step - loss: 3.0685 - mae: 1.5302 - val_loss: 2.9784 - val_mae: 1.5029
Epoch 12/50
104/104 [==============================] - 3s 24ms/step - loss: 2.6997 - mae: 1.4194 - val_loss: 2.6192 - val_mae: 1.3938
Epoch 13/50
104/104 [==============================] - 3s 22ms/step - loss: 2.3711 - mae: 1.3153 - val_loss: 2.2957 - val_mae: 1.2902
Epoch 14/50
104/104 [==============================] - 3s 24ms/step - loss: 2.0744 - mae: 1.2159 - val_loss: 2.0112 - val_mae: 1.1948
Epoch 15/50
104/104 [==============================] - 3s 24ms/step - loss: 1.8146 - mae: 1.1273 - val_loss: 1.7588 - val_mae: 1.1066
Epoch 16/50
104/104 [==============================] - 3s 24ms/step - loss: 1.5816 - mae: 1.0441 - val_loss: 1.5352 - val_mae: 1.0249
Epoch 17/50
104/104 [==============================] - 3s 24ms/step - loss: 1.3773 - mae: 0.9679 - val_loss: 1.3363 - val_mae: 0.9498
Epoch 18/50
104/104 [==============================] - 3s 24ms/step - loss: 1.1992 - mae: 0.8990 - val_loss: 1.1642 - val_mae: 0.8824
Epoch 19/50
104/104 [==============================] - 3s 24ms/step - loss: 1.0432 - mae: 0.8363 - val_loss: 1.0150 - val_mae: 0.8215
Epoch 20/50
104/104 [==============================] - 3s 24ms/step - loss: 0.9090 - mae: 0.7804 - val_loss: 0.8866 - val_mae: 0.7670
Epoch 21/50
104/104 [==============================] - 3s 23ms/step - loss: 0.7949 - mae: 0.7307 - val_loss: 0.7756 - val_mae: 0.7177
Epoch 22/50
104/104 [==============================] - 3s 26ms/step - loss: 0.6964 - mae: 0.6850 - val_loss: 0.6807 - val_mae: 0.6742
Epoch 23/50
104/104 [==============================] - 3s 24ms/step - loss: 0.6128 - mae: 0.6446 - val_loss: 0.6026 - val_mae: 0.6375
Epoch 24/50
104/104 [==============================] - 3s 24ms/step - loss: 0.5436 - mae: 0.6101 - val_loss: 0.5365 - val_mae: 0.6049
Epoch 25/50
104/104 [==============================] - 3s 22ms/step - loss: 0.4857 - mae: 0.5796 - val_loss: 0.4808 - val_mae: 0.5762
Epoch 26/50
104/104 [==============================] - 3s 24ms/step - loss: 0.4378 - mae: 0.5533 - val_loss: 0.4344 - val_mae: 0.5513
Epoch 27/50
104/104 [==============================] - 3s 24ms/step - loss: 0.3980 - mae: 0.5304 - val_loss: 0.3960 - val_mae: 0.5297
Epoch 28/50
104/104 [==============================] - 3s 25ms/step - loss: 0.3663 - mae: 0.5114 - val_loss: 0.3650 - val_mae: 0.5114
Epoch 29/50
104/104 [==============================] - 3s 27ms/step - loss: 0.3397 - mae: 0.4948 - val_loss: 0.3397 - val_mae: 0.4959
Epoch 30/50
104/104 [==============================] - 3s 24ms/step - loss: 0.3180 - mae: 0.4804 - val_loss: 0.3184 - val_mae: 0.4821
Epoch 31/50
104/104 [==============================] - 3s 22ms/step - loss: 0.3005 - mae: 0.4684 - val_loss: 0.3015 - val_mae: 0.4707
Epoch 32/50
104/104 [==============================] - 3s 24ms/step - loss: 0.2863 - mae: 0.4579 - val_loss: 0.2868 - val_mae: 0.4602
Epoch 33/50
104/104 [==============================] - 3s 24ms/step - loss: 0.2741 - mae: 0.4488 - val_loss: 0.2743 - val_mae: 0.4508
Epoch 34/50
104/104 [==============================] - 3s 24ms/step - loss: 0.2636 - mae: 0.4403 - val_loss: 0.2644 - val_mae: 0.4430
Epoch 35/50
104/104 [==============================] - 3s 25ms/step - loss: 0.2542 - mae: 0.4323 - val_loss: 0.2546 - val_mae: 0.4349
Epoch 36/50
104/104 [==============================] - 3s 24ms/step - loss: 0.2459 - mae: 0.4252 - val_loss: 0.2461 - val_mae: 0.4277
Epoch 37/50
104/104 [==============================] - 3s 25ms/step - loss: 0.2382 - mae: 0.4183 - val_loss: 0.2380 - val_mae: 0.4205
Epoch 38/50
104/104 [==============================] - 3s 24ms/step - loss: 0.2307 - mae: 0.4112 - val_loss: 0.2304 - val_mae: 0.4137
Epoch 39/50
104/104 [==============================] - 3s 25ms/step - loss: 0.2237 - mae: 0.4047 - val_loss: 0.2227 - val_mae: 0.4065
Epoch 40/50
104/104 [==============================] - 3s 25ms/step - loss: 0.2166 - mae: 0.3980 - val_loss: 0.2154 - val_mae: 0.3997
Epoch 41/50
104/104 [==============================] - 3s 24ms/step - loss: 0.2097 - mae: 0.3914 - val_loss: 0.2083 - val_mae: 0.3928
Epoch 42/50
104/104 [==============================] - 3s 25ms/step - loss: 0.2024 - mae: 0.3841 - val_loss: 0.2009 - val_mae: 0.3856
Epoch 43/50
104/104 [==============================] - 3s 25ms/step - loss: 0.1956 - mae: 0.3774 - val_loss: 0.1939 - val_mae: 0.3788
Epoch 44/50
104/104 [==============================] - 3s 25ms/step - loss: 0.1883 - mae: 0.3701 - val_loss: 0.1866 - val_mae: 0.3715
Epoch 45/50
104/104 [==============================] - 3s 24ms/step - loss: 0.1813 - mae: 0.3629 - val_loss: 0.1791 - val_mae: 0.3637
Epoch 46/50
104/104 [==============================] - 3s 25ms/step - loss: 0.1740 - mae: 0.3554 - val_loss: 0.1721 - val_mae: 0.3565
Epoch 47/50
104/104 [==============================] - 3s 25ms/step - loss: 0.1669 - mae: 0.3478 - val_loss: 0.1648 - val_mae: 0.3486
Epoch 48/50
104/104 [==============================] - 3s 25ms/step - loss: 0.1598 - mae: 0.3402 - val_loss: 0.1574 - val_mae: 0.3405
Epoch 49/50
104/104 [==============================] - 3s 25ms/step - loss: 0.1524 - mae: 0.3320 - val_loss: 0.1501 - val_mae: 0.3323
Epoch 50/50
104/104 [==============================] - 3s 24ms/step - loss: 0.1454 - mae: 0.3242 - val_loss: 0.1429 - val_mae: 0.3241
Now we load the best saved model and evaluate it on the test set:
##################Checking History############################
model = keras.models.load_model('one_file.keras')
print(f"Test MAE: {model.evaluate(test,batch_size=batch_size,steps=int(test_count/batch_size))[1]:.2f}")
loss = history.history["mae"]
val_loss = history.history["val_mae"]
epochs = range(1, len(loss) + 1)
104/104 [==============================] - 0s 2ms/step - loss: 0.1456 - mae: 0.3255
Test MAE: 0.33
first_layer_weights = model.layers[0].get_weights()[0]
first_layer_biases = model.layers[0].get_weights()[1]
print("w = "+str(first_layer_weights[0][0]))
print("b = "+str(first_layer_biases[0]))
print()
print("History of weight: \n"+str(gw.weight_dict['w_']))
print("History of bias: \n"+str(gw.weight_dict['b_']))
w = 1.7393888
b = 1.6759396
History of weight:
[[[-0.8327197 -0.7327028 -0.63539875 -0.54049414 -0.4482258
-0.35782048 -0.27035686 -0.18447052 -0.10172622 -0.02062045
0.05727153 0.13337713 0.20724966 0.27921426 0.3479225
0.41440862 0.47908387 0.5417002 0.6010756 0.65853363
0.7135724 0.76665 0.81689113 0.86521685 0.91134405
0.955781 0.9980275 1.0383987 1.0768399 1.1138892
1.1493787 1.1832614 1.2163761 1.2475177 1.2785128
1.3089203 1.3387821 1.3679621 1.3970793 1.4261677
1.4557024 1.4855101 1.5153441 1.5449064 1.5759834
1.6071749 1.6399227 1.6719968 1.705449 1.7393888 ]]]
History of bias:
[[[0.10224409 0.20258835 0.3000701 0.39481494 0.48675668 0.57614195
0.6626023 0.7466329 0.82758945 0.9061539 0.9812596 1.0540725
1.1240187 1.191064 1.2548038 1.3154875 1.3734779 1.4284813
1.479668 1.5279312 1.5729469 1.6147243 1.6527137 1.6875445
1.718733 1.7470683 1.7718154 1.7931625 1.8111029 1.8259947
1.8378032 1.8460872 1.8523203 1.8547686 1.8550116 1.8534969
1.8494251 1.8433033 1.8354458 1.8260137 1.8154101 1.8034348
1.7903117 1.7753468 1.760216 1.7445545 1.7287849 1.7115461
1.6937226 1.6759396 ]]]
A function to retrieve the data used in the plots:
###################################################################
# Plot Result function
####################################################################
def get_plot_datas(ids_file, data_file):
    ids_file = pd.read_csv(ids_file)
    data_file = pd.read_csv(data_file)
    data_file = data_file[['indf', 'dnbr', 'ndvi']]
    datas = data_file[data_file['indf'].isin(ids_file.indf)]
    return datas[['ndvi', 'dnbr']].to_numpy()
Let’s have a brief look at the data. NB: what we call the mean model is simply $$y_{pred}=\bar y_{obs}$$ In that case the MSE is the variance of the observations.
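Indeed, for the mean model the MSE reduces to the (population) variance of the observations:
$$MSE_{mean}=\frac{1}{n}\sum_{i}\left(y_{obs,i}-\bar y_{obs}\right)^{2}=s^{2}_{y}$$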
#################Datas summary####################
datas=get_plot_datas(test_ids_file,data_file)
x=datas[:,0]
y=datas[:,1]
print("Observed statistics:")
print()
print("mean dnbr = "+str(np.mean(y)))
print("dnbr variance (mse with mean model) = "+str(np.square(np.std(y))))
print()
print()
y_pred=model.predict(x)
print("mean dnbr predicted = "+str(np.mean(y_pred[:,0])))
print("mae_from_mean_model ="+str(np.sum(np.abs(y-np.mean(y)))/len(y)))
Observed statistics:
mean dnbr = 2.481034895608084
dnbr variance (mse with mean model) = 0.7607728299290712
mean dnbr predicted = 2.5352304
mae_from_mean_model =0.7521578471518485
In the case of the linear model, the MSE being $$f(a,b)=\frac{1}{n}\sum\left(a\,x_{obs}+b-y_{obs}\right)^{2},$$ the gradient coordinates are $$\frac{\partial f}{\partial a}=\frac{2}{n}\sum x_{obs}\left(a\,x_{obs}+b-y_{obs}\right)$$ and $$\frac{\partial f}{\partial b}=\frac{2}{n}\sum\left(a\,x_{obs}+b-y_{obs}\right),$$ the sums being taken over the batch data.
The minimum is thus obtained by solving $$\frac{\partial f}{\partial a}=0$$ and $$\frac{\partial f}{\partial b}=0,$$ which leads to $$a=\frac{\mathrm{cov}(x,y)}{s^{2}_{x}}$$ and $$b=\bar{y}-a\bar{x}$$
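As a quick cross-check of these formulas, the closed-form solution can be computed directly with numpy (a minimal sketch using the x and y arrays built above; both values should match np.polyfit below up to floating-point error):
# closed-form least squares: a = cov(x, y) / var(x), b = ybar - a * xbar
a_closed = np.cov(x, y, bias=True)[0, 1] / np.var(x)
b_closed = np.mean(y) - a_closed * np.mean(x)
print("a_closed =", a_closed)
print("b_closed =", b_closed)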
We compare the performance of our NN with this elementary model in the next few lines:
a,b=np.polyfit(x,y,1)
print("Results from the linear model:")
print()
print("a (supposed to be equivalent to w in the last section) = "+str(a))
print("b = "+str(b))
linear_mae=sum(abs(y-(a*x+b)))/len(y)
linear_mse=sum(np.square(y-(a*x+b)))/len(y)
print("linear_model_mae = "+str(linear_mae))
print("linear_model_mse = "+str(linear_mse))
print()
print("NN model results:")
Results from the linear model:
a (supposed to be equivalent to w in the last section) = 3.001565491043912
b = 0.9982055323841632
linear_model_mae = 0.0803045100039715
linear_model_mse = 0.009984939104477841
NN model results:
Ready to plot:
##############################PLOTS#######################################
abscisses=[0,1]
ordonnees=[0,1]
figure,axis = plt.subplots(3,2,figsize = (20, 20))
xy = np.vstack([x,y])
z = gaussian_kde(xy)(xy)
axis[0,0].scatter(x,y,c=z,s=1)
axis[0,0].plot(x,a*x+b,'-',color='black',label="linear model")
axis[0,0].plot(x,first_layer_weights[0][0]*x+first_layer_biases[0],c="red",label="NN model")
axis[0,0].set_title("DNBR vs NDVI: Both models")
axis[0,0].legend()
axis[0,1].plot(epochs, loss, "bo", label="Training MAE")
axis[0,1].plot(epochs, val_loss, "b", label="Validation MAE")
axis[0,1].set_title("Training and validation MAE")
axis[0,1].legend()
xy = np.vstack([y,y_pred[:,0]])
z = gaussian_kde(xy)(xy)
axis[1,0].scatter(y,y_pred[:,0],c=z,s=1)
axis[1,0].plot(abscisses,ordonnees,label="y=x")
axis[1,0].set_title("NN Prévisions vs observées")
axis[1,0].legend()
res=np.subtract(y_pred[:,0],y)
xy = np.vstack([x,res])
z = gaussian_kde(xy)(xy)
axis[1,1].scatter(x,res,c=z,s=1)
axis[1,1].set_title("Résidus vs NDVI")
axis[2,0].hist(y_pred)
axis[2,0].set_title("Predictions distribution")
axis[2,1].hist(y)
axis[2,1].set_title("dnbr distribution")
plt.show()