The activation function is used in the hidden layers and the output layer of the network. It takes the weighted sum of a neuron's inputs plus a bias and, based on this value, decides the state of the neuron, i.e. whether the neuron is activated or not. Activation functions typically map this value into a range such as 0 to 1 or -1 to 1.
The activation function checks whether the computed weighted sum is above a required threshold. If it is, the neuron is activated and its output is computed.
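To make the computation concrete, here is a minimal sketch of a single neuron in Python, assuming NumPy; the inputs, weights, bias, and the simple threshold activation are made-up illustrative values, not taken from the text:

```python
import numpy as np

def neuron(x, w, b, activation):
    """Compute a neuron's output: activation(weighted sum + bias)."""
    z = np.dot(w, x) + b   # weighted sum of inputs plus bias
    return activation(z)   # the activation decides the neuron's state/output

# Illustrative values (assumptions for this sketch)
x = np.array([0.5, -1.2, 0.3])   # inputs
w = np.array([0.4, 0.7, -0.2])   # weights
b = 0.1                          # bias

step = lambda z: 1.0 if z >= 0 else 0.0  # threshold check: activate or not
print(neuron(x, w, b, step))
```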
Activation Functions are of two types:
- Linear Activation Function
- Non-linear Activation Function
The role of the activation function is to introduce non-linearity into the output of a neuron: a neural network without activation functions is just a linear regression model, as the sketch below shows.
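A quick way to see this: stacking layers with no activation function between them collapses into a single linear map. A minimal sketch, assuming NumPy and randomly chosen weights:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=3)

# Two "layers" with no activation function between them...
W1, b1 = rng.normal(size=(4, 3)), rng.normal(size=4)
W2, b2 = rng.normal(size=(2, 4)), rng.normal(size=2)
two_layers = W2 @ (W1 @ x + b1) + b2

# ...are equivalent to one linear (regression-style) layer.
W, b = W2 @ W1, W2 @ b1 + b2
one_layer = W @ x + b

print(np.allclose(two_layers, one_layer))  # True: no extra expressive power
```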
Linear Activation Function
A linear activation function has the equation of a straight line:
f(x) = x
Its range is from -infinity to +infinity, and it is typically used at the output layer. The derivative of a linear function is constant, i.e. f'(x) = 1, and the function above is the identity function.
Fig1: Linear Activation Function
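A minimal sketch of the linear (identity) activation and its constant derivative, assuming NumPy:

```python
import numpy as np

def linear(x):
    return x                     # identity: f(x) = x

def linear_derivative(x):
    return np.ones_like(x)       # f'(x) = 1 for every input

x = np.linspace(-3, 3, 7)
print(linear(x))                 # output equals input, range (-inf, +inf)
print(linear_derivative(x))      # constant gradient everywhere
```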
Non-Linear Activation Function
Nonlinear activation functions are the most widely used. A nonlinear function helps the model adapt to the data and distinguish between different outputs.
Fig2: Nonlinear Activation Functions
Nonlinear Activation Functions are as follows:
- Sigmoid Activation Function or Logistic Sigmoid Activation Function
- Tanh Activation Function or Tangent Hyperbolic Function
- Rectified Linear Unit Function/ ReLU
- Softmax
Sigmoid Activation Function / Logistic Activation Function
The sigmoid function is
f(x) = 1 / (1 + e^(-x))
It is basically an S-shaped graph. The range of the sigmoid function is between 0 and 1, and a small change in x near zero makes a large change in y.
Commonly used: it is usually used at the output layer for binary classification, since its output lies between 0 and 1; it can be seen as a smooth version of the binary step function.
Fig3: Sigmoid Activation Function Curve
Function properties: 1. differentiable and 2. monotonic. Differentiable means the slope of the sigmoid curve can be found at any point. The function's derivative is
f'(x) = f(x)(1 - f(x))
The sigmoid function is monotonic, but its derivative is not. It is used in feedforward networks.
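A minimal sketch of the sigmoid and its derivative, assuming NumPy:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))   # S-shaped, output in (0, 1)

def sigmoid_derivative(x):
    s = sigmoid(x)
    return s * (1.0 - s)              # f'(x) = f(x)(1 - f(x))

x = np.array([-5.0, -1.0, 0.0, 1.0, 5.0])
print(sigmoid(x))             # values squashed into (0, 1)
print(sigmoid_derivative(x))  # peaks at x = 0, shrinks for large |x|
```

Note how the derivative is largest near x = 0 and flattens for large |x|: the derivative rises and then falls, which is why the sigmoid is monotonic while its derivative is not.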
Tanh Activation Function / Tangent Hyperbolic Activation Function
This activation function often works better than the sigmoid activation function; mathematically, it is a scaled and shifted version of the sigmoid. The tanh function is
f(x) = tanh(x) = (e^x - e^(-x)) / (e^x + e^(-x))
or, equivalently,
tanh(x) = 2 * sigmoid(2x) - 1
This is also sigmoidal, i.e. S-shaped. The range of the tanh function is between -1 and 1.
Fig4: Tanh Activation Function Curve
Commonly used: usually used in the hidden layers of a neural network. The function's derivative is
f'(x) = 1 - f(x)^2
The function's properties are the same as the sigmoid's, i.e. differentiable and monotonic. It is also used in feedforward networks. It is useful because clearly negative inputs are mapped to clearly negative outputs and clearly positive inputs to clearly positive outputs.
Fig5: Comparison of Sigmoid and Tanh Activation Function
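A minimal sketch of tanh and its derivative, assuming NumPy; it also checks the shifted-sigmoid identity given above:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def tanh(x):
    return np.tanh(x)                 # output in (-1, 1), zero-centred

def tanh_derivative(x):
    return 1.0 - np.tanh(x) ** 2      # f'(x) = 1 - f(x)^2

x = np.linspace(-3, 3, 7)
print(tanh(x))
# tanh is a scaled and shifted sigmoid: tanh(x) = 2*sigmoid(2x) - 1
print(np.allclose(tanh(x), 2 * sigmoid(2 * x) - 1))  # True
```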
ReLU / Rectified Linear Unit Activation Function
ReLU is the most used activation function in neural networks; it appears in almost every convolutional neural network. The ReLU function is
f(x) = max(0, x)
Commonly used: usually used in the hidden layers of a neural network.
The range of the ReLU function is from 0 to infinity.
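A minimal sketch of ReLU and its derivative, assuming NumPy:

```python
import numpy as np

def relu(x):
    return np.maximum(0.0, x)     # f(x) = max(0, x), output in [0, inf)

def relu_derivative(x):
    # 1 for positive inputs, 0 for negative (0 at x = 0 by convention)
    return (x > 0).astype(float)

x = np.array([-2.0, -0.5, 0.0, 1.0, 3.0])
print(relu(x))              # negative inputs are clamped to 0
print(relu_derivative(x))   # gradient passes through only where x > 0
```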