Which activation function can alleviate the vanishing gradient problem to some extent?

  • Sigmoid
  • ReLU (Rectified Linear Unit)
  • Tanh (Hyperbolic Tangent)
  • Leaky ReLU
The ReLU activation function is known for mitigating the vanishing gradient problem, a common issue when training deep networks. Unlike sigmoid and tanh, whose derivatives saturate toward 0 for large-magnitude inputs, ReLU has a gradient of exactly 1 for all positive inputs, so gradients flow more freely during backpropagation and deep networks become easier to train. (Leaky ReLU, a variant of ReLU, shares this property and additionally passes a small gradient for negative inputs.)
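
As a minimal sketch of this point (not part of the original question), the snippet below uses NumPy to compare the derivatives of sigmoid, tanh, and ReLU at a few sample inputs; the saturating activations' gradients shrink toward 0 for large |x|, while ReLU's stays 1 for positive inputs.

```python
# Compare activation-function gradients to illustrate vanishing gradients.
# Illustrative sketch only; the function names and sample inputs are chosen
# here for demonstration and are not from the original question.
import numpy as np

def sigmoid_grad(x):
    s = 1.0 / (1.0 + np.exp(-x))
    return s * (1.0 - s)          # peaks at 0.25, vanishes for large |x|

def tanh_grad(x):
    return 1.0 - np.tanh(x) ** 2  # peaks at 1.0, vanishes for large |x|

def relu_grad(x):
    return (x > 0).astype(float)  # exactly 1 for every positive input

x = np.array([-5.0, -1.0, 0.5, 1.0, 5.0])
print("x           :", x)
print("sigmoid grad:", np.round(sigmoid_grad(x), 4))
print("tanh grad   :", np.round(tanh_grad(x), 4))
print("relu grad   :", relu_grad(x))
```

Multiplying many small sigmoid/tanh gradients across layers drives the overall gradient toward 0, which is exactly the vanishing gradient effect ReLU helps avoid.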