User talk:193.116.202.155

Range of $$\tanh$$ on Vanishing Gradient Problem Page
Thanks for trying to fix the apparent error on the vanishing gradient problem page, but I think you may have goofed. The sentence is talking about the gradient of the $$\tanh$$ function, not the $$\tanh$$ function itself, so I thought I would let you know why I reverted it. Happy to discuss... I suspect the sentence is confusing, because the page history shows it being corrected to (0, 1) as recently as this past March. Themumblingprophet (talk) 14:34, 1 June 2020 (UTC)
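To spell out the distinction (this is just standard calculus, not anything taken from the page): the derivative of $$\tanh$$ is

$$\frac{d}{dx}\tanh(x) = 1 - \tanh^2(x) = \operatorname{sech}^2(x)$$

so $$\tanh$$ itself ranges over $$(-1, 1)$$, while its gradient takes values in $$(0, 1]$$, equaling 1 only at $$x = 0$$ and decaying toward 0 as $$|x|$$ grows. That decay is what the vanishing gradient discussion is about.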

Since you seem to know something about activation functions, I thought I would ask what you think of the sentence:

 * traditional activation functions such as the hyperbolic tangent function have gradients in the range (0, 1)...

In my experience $$\tanh$$ is the traditional activation function for RNNs, and it is true that its gradient lies in the range (0, 1). But the sentence implies that this is true of multiple "traditional" activation functions, and I'm not sure that's the case. The logistic sigmoid's gradient has range (0, 0.25), since $$\sigma'(x) = \sigma(x)(1 - \sigma(x))$$ peaks at 0.25 when $$x = 0$$. That is the activation function used in Hochreiter's original analysis, for example, so maybe it also counts as a traditional activation function for RNNs; see the numerical check below. — Preceding unsigned comment added by Themumblingprophet (talk • contribs) 14:56, 1 June 2020 (UTC)
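If it helps, here is a quick numerical sanity check of both gradient maxima (a throwaway NumPy sketch of my own, not anything from the article):

```python
# Numerically verify the maximum gradient of tanh and of the logistic
# sigmoid over a grid that includes x = 0, where both peaks occur.
import numpy as np

x = np.linspace(-10, 10, 100001)  # includes x = 0 exactly

# d/dx tanh(x) = 1 - tanh(x)^2, maximized at x = 0 where it equals 1
tanh_grad = 1.0 - np.tanh(x) ** 2
print(tanh_grad.max())  # -> 1.0

# d/dx sigmoid(x) = sigmoid(x) * (1 - sigmoid(x)),
# maximized at x = 0 where it equals 0.25
sig = 1.0 / (1.0 + np.exp(-x))
sig_grad = sig * (1.0 - sig)
print(sig_grad.max())  # -> 0.25
```

Both maxima occur at $$x = 0$$; away from the origin the gradients decay toward zero, which is exactly the behavior the vanishing gradient sentence is describing.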