Backpropagation
- Backpropagation is a supervised learning algorithm used for training neural networks.
- Every node in a neural network represents a neuron, so we can say that a neural network is a circuit of neurons.
- A neural network consists of an input layer, a hidden layer and an output layer; let's see this in the diagram.
What is the Role of Backpropagation?
- First of all, if I want to create a neural network, I have to initialize some weights.
- Whatever values I have selected for the weights, I do not know how correct they are.
- To check whether the weight values I have selected are correct or incorrect, I have to calculate the error of the model.
- Suppose my model's error turns out to be too large.
- That means my predicted output is very different from the actual output, so what should I do? I will try to minimize the error.

Note:
- Here we are trying to minimize our error. How will we do this?
- What we really want is to make the model learn to change the weights automatically so that we get the least error.
- As shown in the diagram above, we first calculate the error of our model; if the error is minimal, our model is ready for prediction.
- If the error is not minimal, we update the parameters (weights) and calculate the error again.
- This process runs until the error of our model is minimized; a minimal sketch of this loop is shown below.
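A minimal, self-contained sketch of this error-then-update loop, using a single weight and one training example (the numbers, variable names and the squared-error function here are illustrative assumptions, not taken from this article):

# Minimal sketch of the loop: calculate error -> update weight -> repeat
x_in, target = 2.0, 4.0      # one input value and its desired output
w = 0.0                      # initial (random) weight
lr = 0.1                     # learning rate

for _ in range(1000):
    prediction = w * x_in                     # forward pass
    error = (target - prediction) ** 2        # squared error of the model
    if error < 1e-6:                          # error is minimal -> model is ready
        break
    grad = -2 * (target - prediction) * x_in  # derivative of the error w.r.t. w
    w = w - lr * grad                         # update the weight and try again

print("learned weight:", w)                   # converges close to 2.0, where w * x_in = target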
Gradient Descent
- We have a number of optimizers, but here we are using the gradient descent optimizer.
- Gradient descent works as an optimizer for finding the minimum of a function.
- In our case we update the weights using gradient descent and try to minimize the error function; the update rule is shown below.
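In symbols, the gradient descent update for a weight w with learning rate η (this notation is an assumption, not taken from the article) is:

$$ w_{new} = w_{old} - \eta \cdot \frac{\partial E_{total}}{\partial w} $$

Each step moves the weight a little in the direction that decreases the error.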

Note: Reaching the global minimum of the loss is the goal; backpropagation is the procedure that moves the weights toward it.
How does the backpropagation algorithm work?
Suppose we have a neural network that has an input layer, a hidden layer and an output layer.
Step 1: First, we give random weights to the model.
Step 2: Forward propagation (the normal neural network calculation).
Step 3: Calculate the total error.
Step 4: Backward propagation (gradient descent): update the parameters (weights and biases).
Step 5: Repeat until the error is minimized (the predicted output is approximately equal to the original output).

The formulas that we are using here are listed below.
FORWARD PROPAGATION
1. To calculate the net input of h1
2. To calculate the output of h1
3. To calculate the error of each output neuron
4. To calculate the total error of the model
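The original formula images are not reproduced here. Using the standard notation for a network with inputs i1 and i2, hidden neuron h1, output neuron o1, weights w1…w8 and biases b1, b2 (this notation is an assumption, chosen to match the h1 and w5 names used in this article), the four formulas are:

$$ net_{h1} = w_1 \cdot i_1 + w_2 \cdot i_2 + b_1 $$
$$ out_{h1} = \frac{1}{1 + e^{-net_{h1}}} $$
$$ E_{o1} = \frac{1}{2}(target_{o1} - out_{o1})^2 $$
$$ E_{total} = \sum \frac{1}{2}(target - output)^2 $$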

Now we will propagate backward.
BACKWARD PROPAGATION
- Here we write the process and formulas to update our weight w5.
- For that, we should know how much the total error changes with respect to the weight w5.

1. Calculate the partial derivative of the total error with respect to out_o1
2. Calculate the partial derivative of out_o1 with respect to net_o1
3. Calculate the partial derivative of net_o1 with respect to w5
4. Calculate the updated weight w5
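With the same assumed notation, the chain rule that gives the contribution of w5 to the total error, its three factors, and the weight update with learning rate η are:

$$ \frac{\partial E_{total}}{\partial w_5} = \frac{\partial E_{total}}{\partial out_{o1}} \times \frac{\partial out_{o1}}{\partial net_{o1}} \times \frac{\partial net_{o1}}{\partial w_5} $$
$$ \frac{\partial E_{total}}{\partial out_{o1}} = -(target_{o1} - out_{o1}) $$
$$ \frac{\partial out_{o1}}{\partial net_{o1}} = out_{o1} \cdot (1 - out_{o1}) $$
$$ \frac{\partial net_{o1}}{\partial w_5} = out_{h1} $$
$$ w_5^{new} = w_5 - \eta \cdot \frac{\partial E_{total}}{\partial w_5} $$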

Similarly, we can calculate the other weight values as well (all of this happens behind the scenes in the model).
How is backpropagation implemented?
Initializing variable values
import numpy as np

# Raw input features: two features per training example
x = np.array(([2, 9], [1, 5], [3, 6]), dtype=float)
print("small x:\n", x)
# Original (actual) output
y = np.array(([92], [86], [89]), dtype=float)
# Scale each input feature by its maximum along the first axis
X = x / np.amax(x, axis=0)
# Scale the output into (0, 1) as well, since the sigmoid output of the
# network can never exceed 1
y = y / 100
print("Capital X:\n", X)
Output

#Defining the sigmoid activation function for the output
def sigmoid(x):
    return 1 / (1 + np.exp(-x))

#Derivative of the sigmoid function (x is assumed to already be a sigmoid output)
def derivatives_sigmoid(x):
    return x * (1 - x)

#Variables initialization
epoch = 7000                # setting training iterations
lr = 0.1                    # setting learning rate
inputlayer_neurons = 2      # number of input layer neurons
hiddenlayer_neurons = 3     # number of hidden layer neurons
output_neurons = 1          # number of neurons at the output layer
Note:
- In this code, we have defined the sigmoid function and its derivative.
- As you know, we train the neural network many times on the same data; for that we need the number of epochs.
- Below that we have defined the number of neurons in each layer.
#Defining weights and biases for the hidden and output layers
wh = np.random.uniform(size=(inputlayer_neurons, hiddenlayer_neurons))
bh = np.random.uniform(size=(1, hiddenlayer_neurons))
wout = np.random.uniform(size=(hiddenlayer_neurons, output_neurons))
bout = np.random.uniform(size=(1, output_neurons))
Note:
- Here we have defined random weights and biases.
- As we know, we should first define the weights and biases for the first hidden layer (here we have only one hidden layer).
- After that we define the weights and biases for the output layer.
- Keep in mind that a weight matrix has the shape (number of neurons in the previous layer, number of neurons in the layer for which the weights are defined).
- A bias has the shape (1, number of neurons in the layer for which the biases are defined); a quick shape check is shown below.
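For example, with the layer sizes defined above (this snippet simply continues the code block above), the shapes come out as:

print(wh.shape)    # (2, 3): (neurons in the previous layer, neurons in this layer)
print(bh.shape)    # (1, 3): one bias per hidden neuron
print(wout.shape)  # (3, 1)
print(bout.shape)  # (1, 1)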
#Forward Propagation
for i in range(epoch):
    hinp1 = np.dot(X, wh)               # weighted input to the hidden layer
    hinp = hinp1 + bh
    hlayer_act = sigmoid(hinp)          # hidden layer activation
    outinp1 = np.dot(hlayer_act, wout)  # weighted input to the output layer
    outinp = outinp1 + bout
    output = sigmoid(outinp)            # predicted output
Note:
- Here we are just calculating the output of our model: first for the hidden layer and then for the output layer, which finally gives us the predicted output.
- np.dot is used for the dot product (matrix multiplication) of two matrices; see the small example below.
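A tiny, self-contained illustration of np.dot with the same shapes as above (the numbers are made up):

import numpy as np
a = np.array([[1., 2.], [3., 4.], [5., 6.]])  # shape (3, 2): 3 samples, 2 features
b = np.ones((2, 3))                           # shape (2, 3): weights to 3 hidden neurons
print(np.dot(a, b).shape)                     # (3, 3): one row of hidden values per sample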
#Backpropagation Algorithm
#(this block continues inside the same "for i in range(epoch):" loop as above)
    EO = y - output                              # error at the output layer
    outgrad = derivatives_sigmoid(output)
    d_output = EO * outgrad                      # delta of the output layer
    EH = d_output.dot(wout.T)                    # error propagated back to the hidden layer
    hiddengrad = derivatives_sigmoid(hlayer_act)
    #how much the hidden layer weights contributed to the error
    d_hiddenlayer = EH * hiddengrad
    #Updating weights: dot product of the next layer's error and the current layer's output
    wout += hlayer_act.T.dot(d_output) * lr
    bout += np.sum(d_output, axis=0, keepdims=True) * lr
    wh += X.T.dot(d_hiddenlayer) * lr
    bh += np.sum(d_hiddenlayer, axis=0, keepdims=True) * lr  # also update the hidden layer bias

print("Actual Output: \n" + str(y))
print("Predicted Output: \n", output)
Output

Note:
- In this code we first calculated the error of the output layer and after that the error of the hidden layer.
- As we know from the formulas, we have to find out how much the hidden layer contributes to the total error, and also the contribution of each weight to the total error.
- After that we update our weights and biases, repeating until we get the minimum error.
- X.T is used to take the transpose of the matrix X; a quick error check is shown below.
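As a quick sanity check that training actually reduced the error, one can print the mean squared error after the loop (this line is not part of the original program):

print("MSE:", np.mean(np.square(y - output)))  # should be small after 7000 epochs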