Derivative / Differential: The change along the y-axis with respect to the change along the x-axis. It is also called the slope.
Monotonic: A function that is either entirely non-increasing or entirely non-decreasing.
The activation function is used in the hidden layers and the output layer of the network. Each neuron computes the weighted sum of its inputs and adds a bias; the activation function is then applied to this value and decides the state of the neuron, i.e. whether the neuron is activated or not. Depending on the function, the resulting value lies in a range such as 0 to 1 or -1 to 1.
In its simplest form, the activation function checks whether the computed weighted sum is above a required threshold. If it is, the neuron is activated and its output is passed on.
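The weighted sum and threshold check described above can be sketched in a few lines of Python. This is only an illustration: the input values, weights, bias, and the threshold of 0 are made-up numbers, and `weighted_sum` and `step_activation` are hypothetical helper names.

```python
def weighted_sum(inputs, weights, bias):
    """Return z = w1*x1 + ... + wn*xn + b for a single neuron."""
    return sum(w * x for w, x in zip(weights, inputs)) + bias


def step_activation(z, threshold=0.0):
    """Return 1 (activated) if z is above the threshold, otherwise 0."""
    return 1 if z > threshold else 0


# Illustrative values only (assumptions, not taken from any dataset).
inputs = [1.0, 2.0]
weights = [0.5, -0.25]
bias = 0.1

z = weighted_sum(inputs, weights, bias)
output = step_activation(z)
print(z, output)
```

The smooth activation functions discussed later replace the hard threshold with a gradual transition, so the output changes continuously with the weighted sum.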
Activation Functions are of two types:
- Linear Activation Function
- Non-linear Activation Function
The role of the activation function is to introduce non-linearity into the output of a neuron. A neural network without an activation function reduces to a linear regression model, no matter how many layers it has.
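A small numeric check can make this concrete. The sketch below stacks two layers with no activation function and compares them with a single linear layer; the weights and biases are arbitrary values chosen for illustration, and the function names are hypothetical.

```python
def linear_layer(x, w, b):
    """A one-input, one-output layer with no activation: w * x + b."""
    return w * x + b


def two_linear_layers(x):
    """Two stacked layers without any activation function in between."""
    hidden = linear_layer(x, w=2.0, b=1.0)
    return linear_layer(hidden, w=3.0, b=-4.0)


def collapsed_layer(x):
    """The same mapping as one layer: 3 * (2x + 1) - 4 = 6x - 1."""
    return linear_layer(x, w=6.0, b=-1.0)


samples = [-2.0, 0.0, 1.5]
all_equal = all(two_linear_layers(x) == collapsed_layer(x) for x in samples)
print(all_equal)
```

Because the composition of linear functions is itself linear, adding more such layers adds no expressive power; a nonlinear activation between the layers is what breaks this collapse.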
Linear Activation Function
A linear activation function has the same equation as a straight line: f(x) = x.
The range of the linear function is from -infinity to +infinity. It is used at the output layer. Its derivative is constant, i.e. f'(x) = 1, because it is the identity function.
Fig1: Linear Activation Function
Non-Linear Activation Function
Nonlinear activation functions are the most widely used activation functions. A nonlinear function helps the model adapt to the data and differentiate between outputs.
Fig2: Nonlinear Activation Functions
The nonlinear activation functions covered here are:
- Sigmoid Activation Function or Logistic Sigmoid Activation Function
- Tanh Activation Function or Hyperbolic Tangent Function
- Rectified Linear Unit Function/ ReLU
Sigmoid Activation Function / Logistic Activation Function
The sigmoid function is f(x) = 1 / (1 + e^(-x)). Its graph is S-shaped, and its range is between 0 and 1. Near x = 0, a small change in x produces a comparatively large change in y; far from 0 the curve flattens out.
Commonly used: It is usually used at the output of binary classification models, because its value lies between 0 and 1. It can be seen as a smooth version of the binary step function.
Fig3: Sigmoid Activation Function Curve
Function properties: 1. Differentiable and 2. Monotonic. Differentiable means the slope of the sigmoid curve can be found at any point. The derivative is f'(x) = f(x)(1 - f(x)).
The sigmoid function is monotonic, but its derivative is not. It is used in feedforward networks.
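As a minimal sketch, the sigmoid f(x) = 1 / (1 + e^(-x)) and its derivative f(x)(1 - f(x)) can be written with only the Python standard library. The branch on the sign of x is an implementation choice made here to avoid overflow for large negative inputs; it is not part of the definition.

```python
import math


def sigmoid(x):
    """Logistic sigmoid, written to avoid overflow in math.exp."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)  # safe: x is negative, so e is in (0, 1)
    return e / (1.0 + e)


def sigmoid_derivative(x):
    """Slope of the sigmoid at x, computed as f(x) * (1 - f(x))."""
    s = sigmoid(x)
    return s * (1.0 - s)


for x in (-4.0, -1.0, 0.0, 1.0, 4.0):
    print(x, sigmoid(x), sigmoid_derivative(x))
```

The printed derivative rises towards x = 0 and falls again afterwards, which is why the sigmoid is monotonic while its derivative is not.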
Tanh Activation Function / Hyperbolic Tangent Activation Function
This activation function often works better than the sigmoid activation function in hidden layers. Mathematically, it is a scaled and shifted version of the sigmoid: tanh(x) = 2 * sigmoid(2x) - 1. The tanh function is f(x) = tanh(x) = (e^x - e^(-x)) / (e^x + e^(-x)).
It is also sigmoidal, i.e. S-shaped. The range of the tanh function is between -1 and 1.
Fig4: Tanh Activation Function Curve
Commonly used: Usually used in the hidden layers of a neural network. The derivative is f'(x) = 1 - tanh^2(x).
Its properties are the same as those of the sigmoid, i.e. differentiable and monotonic, and it is also used in feedforward networks. Because its output is centered on zero, negative inputs are mapped to clearly negative outputs and positive inputs to clearly positive ones.
Fig5: Comparison of Sigmoid and Tanh Activation Function
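The comparison can also be made numerically. The sketch below uses `math.tanh` from the Python standard library, computes the derivative as 1 - tanh^2(x), and checks the identity tanh(x) = 2 * sigmoid(2x) - 1. The helper names and sample points are illustrative assumptions.

```python
import math


def sigmoid(x):
    """Logistic sigmoid; fine for the moderate inputs used here."""
    return 1.0 / (1.0 + math.exp(-x))


def tanh_derivative(x):
    """Slope of tanh at x, computed as 1 - tanh(x)^2."""
    t = math.tanh(x)
    return 1.0 - t * t


def tanh_from_sigmoid(x):
    """tanh expressed as a scaled and shifted sigmoid."""
    return 2.0 * sigmoid(2.0 * x) - 1.0


for x in (-2.0, -0.5, 0.0, 0.5, 2.0):
    # Columns: input, sigmoid in (0, 1), tanh in (-1, 1), tanh slope.
    print(x, sigmoid(x), math.tanh(x), tanh_derivative(x))
```

At x = 0 the sigmoid outputs 0.5 while tanh outputs 0, which is the zero-centering that makes tanh attractive for hidden layers.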
ReLU/ Rectified Linear Unit Activation Function
ReLU is the most widely used activation function in neural networks. It is used in almost all convolutional neural networks.
Commonly used: Usually used in the hidden layers of a neural network.
The ReLU function is f(x) = max(0, x). Its range is from 0 to infinity.
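A minimal sketch of ReLU, f(x) = max(0, x), and its slope follows. The slope at exactly x = 0 is mathematically undefined; returning 0 there is a common convention and is an assumption of this sketch.

```python
def relu(x):
    """Rectified linear unit: passes positive values, clips the rest to 0."""
    return max(0.0, x)


def relu_derivative(x):
    """Slope of ReLU: 1 for x > 0, else 0 (value at x == 0 is a convention)."""
    return 1.0 if x > 0 else 0.0


for x in (-3.0, -0.5, 0.0, 0.5, 3.0):
    print(x, relu(x), relu_derivative(x))
```

Unlike sigmoid and tanh, the output is not bounded above, and the slope does not shrink for large positive inputs.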
An artificial neural network:
- is a family of statistical learning models inspired by biological neural networks;
- is a system of interconnected “neurons” that send messages to each other;
- has connections with numeric weights that can be tuned based on experience, which makes the network adaptive to inputs and capable of learning.
For example, a neural network for handwriting recognition is defined by a set of input neurons that may be activated by the pixels of an input image. After being weighted and transformed by a function, the activations of these neurons are passed on to other neurons. This process is repeated until, finally, an output neuron is activated, which determines which character was read.
When building a neural network, the developer can choose the activation function for the hidden layers as well as for the output layer. The elements of a neural network are:
Input Layer: Receives the input features, i.e. the dataset, and passes this information to the hidden layer.
Hidden Layer: The computations performed in this layer are not exposed. All the required computations are performed here on the input features, and the outcome is transferred to the output layer.
Output Layer: The final layer, which provides results based on what the network has learned.
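The flow from input layer to hidden layer to output layer can be sketched as a single forward pass. Everything specific here is an assumption made for illustration: the `dense` helper, the layer sizes (2 inputs, 2 hidden neurons, 1 output), the weights, and the choice of ReLU in the hidden layer with a sigmoid at the output.

```python
import math


def relu(x):
    return max(0.0, x)


def sigmoid(x):
    # Simple form; adequate for the small values used in this sketch.
    return 1.0 / (1.0 + math.exp(-x))


def dense(inputs, weights, biases, activation):
    """Apply one fully connected layer.

    weights[j] holds the incoming weights of neuron j in this layer.
    """
    return [
        activation(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


# Input layer: the raw features (illustrative values).
features = [0.5, -1.0]

# Hidden layer: two neurons with ReLU activation.
hidden = dense(features, [[1.0, -1.0], [0.5, 0.5]], [0.0, 0.1], relu)

# Output layer: one neuron with sigmoid activation.
output = dense(hidden, [[1.0, -2.0]], [0.0], sigmoid)

print(hidden, output)
```

Only `features` and `output` are visible to the user of the network; the values in `hidden` are the internal computations that the hidden layer does not expose.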
Neural Network Configuration Options:
- Number of Hidden Layers
- Number of nodes in each hidden layer
- Activation Function
- Learning Rate
- Iterations and error level
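To show how the learning rate, the number of iterations, and the error level interact, here is a minimal training sketch for a single sigmoid neuron learning the logical OR function. It deliberately leaves out hidden layers, and the function name `train_neuron`, the default values, and the use of mean squared error are all assumptions of this example rather than recommended settings.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def train_neuron(samples, learning_rate=0.5, max_iterations=5000,
                 target_error=0.01):
    """Train one sigmoid neuron with per-sample gradient descent.

    Training stops when the mean squared error of a pass over the data
    falls to target_error, or after max_iterations passes.
    """
    weights = [0.0, 0.0]
    bias = 0.0
    error = float("inf")
    iteration = 0
    for iteration in range(1, max_iterations + 1):
        squared_error = 0.0
        for inputs, target in samples:
            z = sum(w * x for w, x in zip(weights, inputs)) + bias
            out = sigmoid(z)
            diff = out - target
            squared_error += diff * diff
            # Gradient of 0.5 * diff^2 with respect to z.
            grad = diff * out * (1.0 - out)
            weights = [w - learning_rate * grad * x
                       for w, x in zip(weights, inputs)]
            bias -= learning_rate * grad
        error = squared_error / len(samples)
        if error <= target_error:
            break
    return weights, bias, error, iteration


or_samples = [
    ([0.0, 0.0], 0.0),
    ([0.0, 1.0], 1.0),
    ([1.0, 0.0], 1.0),
    ([1.0, 1.0], 1.0),
]
weights, bias, error, iterations = train_neuron(or_samples)
print(weights, bias, error, iterations)
```

A larger learning rate takes bigger steps per update, while a stricter error level generally needs more iterations before the loop stops.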