What is one major drawback of using the sigmoid activation function in deep networks?

  • Prone to vanishing gradient
  • Limited to binary classification
  • Efficiently handles negative values
  • Non-smooth gradient behavior
One major drawback of the sigmoid activation function in deep networks is its susceptibility to the vanishing gradient problem. The sigmoid saturates for inputs of large magnitude, so its derivative (which never exceeds 0.25) becomes nearly zero in those regions. During backpropagation these small factors are multiplied across layers, so gradients shrink rapidly with depth, slowing or stalling learning in the early layers.
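As a quick illustration, the NumPy sketch below (illustrative only, not part of the original question) computes the sigmoid's derivative at a few inputs to show the saturation effect, and then shows how fast even the best-case per-layer factor of 0.25 decays when multiplied across many layers:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_grad(x):
    # Derivative of sigmoid: s * (1 - s); it peaks at 0.25 when x = 0.
    s = sigmoid(x)
    return s * (1.0 - s)

# The derivative is tiny for extreme inputs (the saturation regions):
for x in [0.0, 2.0, 5.0, 10.0]:
    print(f"x = {x:5.1f}   sigmoid'(x) = {sigmoid_grad(x):.6f}")

# By the chain rule, backprop multiplies one such factor per layer.
# Even in the best case (0.25 per layer), the product decays fast:
depth = 20  # hypothetical network depth for illustration
print(f"Upper bound on gradient after {depth} layers: {0.25 ** depth:.2e}")
```

Running this shows the derivative dropping from 0.25 at x = 0 to effectively zero by x = 10, and a 20-layer upper bound on the order of 1e-12, which is why deep sigmoid networks train so slowly.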