Talk:Gated recurrent unit

Fully gated unit picture
Unless I am mistaken, the picture given for the fully gated recurrent unit does not match up with the equation in the article for the hidden state. The 1- node should connect to the product of the output of tanh, not the product with the previous hidden state. In other words, instead of the 1- node being on the arrow above z[t], it should be on the arrow to the right.

--ZaneDurante (talk) 18:21, 2 June 2020 (UTC)


 * Yes, you are right! I also noticed this already in 2016 when I prepared lecture slides based on the formulas and this picture. They do not match. 193.174.205.82 (talk) 14:56, 18 January 2023 (UTC)

Article requires clarification
It is not clear from the article how the cell connects to other cells, to its own layer, or to anything else.

Remove CARU section?
Lots of publicity for a paper by Macao authors from a Macao IP address, with limited relevance for the GRU article. 194.57.247.3 (talk) 11:45, 28 October 2022 (UTC)

Please describe what y_hat(t) is in the figure (it does not appear in the equations). — Preceding unsigned comment added by Geofo (talk • contribs) 11:15, 29 August 2023 (UTC)

$z$ or $1-z$?
Why does this article have $h_t = (1-z_t) \odot h_{t-1} + z_t \odot \hat{h}_t$? The original paper (reference [1]) has $h_t = z_t \odot h_{t-1} + (1-z_t) \odot \hat{h}_t$, which is also the convention used by PyTorch (see this page) and TensorFlow (not documented in the obvious place, but clear if you write some code to test it). — Preceding unsigned comment added by Neil Strickland (talk • contribs) 23:19, 28 January 2024 (UTC)
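
 * For anyone comparing the two forms, here is a minimal sketch of how they relate. It is plain Python, not code taken from PyTorch or TensorFlow. The function names and the random test values are made up for illustration. It relies only on the identity 1 - sigmoid(a) = sigmoid(-a): if the update gate is a sigmoid of some pre-activation, then swapping $z_t$ and $1-z_t$ amounts to negating that pre-activation, so the two conventions should describe the same family of models, with the sign of the update-gate weights and bias flipped.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def update_paper(z, h_prev, h_cand):
    """h_t = z * h_{t-1} + (1 - z) * h_cand (form quoted above from the original paper)."""
    return [zi * hp + (1.0 - zi) * hc for zi, hp, hc in zip(z, h_prev, h_cand)]


def update_article(z, h_prev, h_cand):
    """h_t = (1 - z) * h_{t-1} + z * h_cand (form quoted above from the article)."""
    return [(1.0 - zi) * hp + zi * hc for zi, hp, hc in zip(z, h_prev, h_cand)]


random.seed(0)
n = 4
# Hypothetical values standing in for the gate pre-activation, the previous
# hidden state, and the candidate state (the output of tanh).
pre = [random.uniform(-3, 3) for _ in range(n)]
h_prev = [random.uniform(-1, 1) for _ in range(n)]
h_cand = [math.tanh(random.uniform(-3, 3)) for _ in range(n)]

z = [sigmoid(a) for a in pre]
z_flipped = [sigmoid(-a) for a in pre]  # equals 1 - z up to rounding

paper = update_paper(z, h_prev, h_cand)
article = update_article(z_flipped, h_prev, h_cand)

# The two forms agree once the gate pre-activation is negated.
max_diff = max(abs(x - y) for x, y in zip(paper, article))
print(max_diff < 1e-12)
```

 With the same gate values the two forms give different results, so weights trained under one convention cannot be dropped into the other without the sign change. For example, with z = 1 the paper's form keeps the previous hidden state, while the article's form replaces it with the candidate state.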