Wikipedia:Reference desk/Archives/Mathematics/2020 August 15

= August 15 =

Chain rule with multiple variables
I asked the question given in the link earlier on the mathematics stack exchange site. I was wondering if anyone could help me understand the answer I got in layman's terms. I simply don't understand. The question is asking for clarity on the chain rule with multiple variables.

https://math.stackexchange.com/questions/3790900/chain-rule-with-a-function-depending-on-functions-of-different-variables/3791017?noredirect=1#comment7809514_3791017 — Preceding unsigned comment added by 69.47.2.70 (talk) 06:00, 15 August 2020 (UTC)

 * Let me first get something out of the way. By a (not uncommon, but possibly confusing) abuse of notation, when a mathematical text says something like "Let $$f = f(x)$$", the symbol "$$f$$" is made to do dual duty. In the right-hand side of this definition – which has the form of an equation but really is a definition – it denotes a function; in this case a function of just one variable. In the left-hand side, on the other hand, the symbol is defined to stand for an expression, in this case the expression "$$f(x)$$"; it will serve as an abbreviation. To determine which of the two is intended in any further use requires a mathematical interpretation of the context, but as a rule of thumb, application to an argument, as in "$$f(a-b)$$", tells us it stands for the function, while the lack of an argument, as in $$\frac{df}{dx}$$, tells us that this time it abbreviates the expression, so that this is convenient shorthand notation for $$\frac{df(x)}{dx}$$. Clearly, the expression $$f(x)$$ depends on $$x$$; this dependence is explicit. Therefore "$$f$$" used as an abbreviation also depends on $$x$$, but now the dependence is implicit. This can be tricky or sneaky: a symbol may implicitly depend on another symbol, which in turn implicitly depends on yet another symbol, and so on.
 * The formula you give for the chain-rule derivative brings the potential confusion to an entirely new level by mixing these two roles in a single formula. For simplicity, I'll illustrate the issue with the univariate version of the chain rule. Then, what we have is something like
 * $$\mathrm{Let}~f=f(x)~\mathrm{and}~x=x(u).$$
 * $$\mathrm{Then}~\frac{df}{du} = \frac{df}{dx}\frac{dx}{du}~(1).$$
 * What is going on here can be explained more easily by introducing some extra symbols, as follows:
 * $$\mathrm{Let}~F=f(X),~\mathrm{where}~X=x(u).$$
 * $$\mathrm{Then}~\frac{dF}{du} = \frac{df(x)}{dx}(X)\cdot\frac{dX}{du}.$$
 * So the expression $$\frac{df}{dx}$$ in (1) is not simply the derivative of $$f$$ with respect to $$x$$, which is a function of one argument, but stands here for the result of applying that function to $$x(u)$$. You can guess this from the context: the expression is an operand of a multiplication, so it has to be a number, not a function. You get a number by applying the function to some argument, and the only reasonable candidate for that argument is $$x(u)$$.
 * Now take the definition $$y = y(u,b)$$ and the expression $$\frac{\partial y}{\partial v}$$. As before, this stands for $$\frac{\partial y(u,b)}{\partial v}$$. Unless $$b$$ sneakily implicitly depends on $$v$$ – but for reasons that should be obvious, mathematical etiquette does not allow one to secretly introduce dependences, so in the absence of evidence to the contrary we are allowed to assume no such implicit dependences exist – the term $$y(u,b)$$ in that expression does not depend on $$v$$. As $$v$$ varies, it remains constant. Therefore, $$\frac{\partial y}{\partial v}=0$$.
 * I hope this helps. I am not going to attempt to explain the elaborate answer on the stack exchange, which would take me several hours. --Lambiam 12:41, 15 August 2020 (UTC)
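
The point about $$\frac{df}{dx}$$ being evaluated at $$x(u)$$ can be checked numerically. The following is only a minimal sketch: the functions `f` and `x_of`, the sample point `u0`, and the finite-difference helper `derivative` are hypothetical choices made for illustration, not anything taken from the Stack Exchange post.

```python
import math

def derivative(g, t, h=1e-6):
    """Central-difference approximation of g'(t)."""
    return (g(t + h) - g(t - h)) / (2 * h)

# Hypothetical example functions, chosen only for illustration.
def f(x):
    return math.sin(x)

def x_of(u):
    return u * u

def F(u):
    # F = f(X), where X = x(u)
    return f(x_of(u))

u0 = 1.3
X0 = x_of(u0)

# Left-hand side of the chain rule: dF/du at u0.
lhs = derivative(F, u0)

# Right-hand side: the derivative of f applied to X = x(u0), times dX/du.
rhs = derivative(f, X0) * derivative(x_of, u0)

# Common mistake: applying the derivative of f to u0 instead of x(u0).
wrong = derivative(f, u0) * derivative(x_of, u0)

print("dF/du           :", lhs)
print("f'(x(u)) * x'(u):", rhs)
print("f'(u) * x'(u)   :", wrong)
```

The first two printed values should agree up to finite-difference error, while the third generally differs, because the derivative of `f` has been applied to the wrong argument.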


 * For the two example questions asked in that Stack Exchange post (assuming no unstated relation between variables):
 * $$\mathrm{If}~f=f(x,y)~\mathrm{and}~x=x(u,v)~\mathrm{and}~y=y(u,b),~\mathrm{then}$$:
 * $$\frac{\partial f}{\partial u}=\frac{\partial f}{\partial x}\frac{\partial x}{\partial u}+\frac{\partial f}{\partial y}\frac{\partial y}{\partial u}$$
 * $$\frac{\partial f}{\partial v}=\frac{\partial f}{\partial x}\frac{\partial x}{\partial v}$$
 * $$\frac{\partial f}{\partial b}=\frac{\partial f}{\partial y}\frac{\partial y}{\partial b}$$
 * $$\mathrm{If}~f=f(x,y)~\mathrm{and}~x=x(u,v)~\mathrm{and}~y=y(a,b),~\mathrm{then}$$:
 * $$\frac{\partial f}{\partial u}=\frac{\partial f}{\partial x}\frac{\partial x}{\partial u}$$
 * $$\frac{\partial f}{\partial v}=\frac{\partial f}{\partial x}\frac{\partial x}{\partial v}$$
 * $$\frac{\partial f}{\partial a}=\frac{\partial f}{\partial y}\frac{\partial y}{\partial a}$$
 * $$\frac{\partial f}{\partial b}=\frac{\partial f}{\partial y}\frac{\partial y}{\partial b}$$
 * These are the elements of the matrix multiplication at the end of Philipp's first response.
 * -- ToE 02:10, 17 August 2020 (UTC)
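
As a sanity check on the first set of formulas (the case $$y=y(u,b)$$), here is a small numerical sketch. The concrete functions `f`, `x_of`, `y_of`, the sample point `p`, and the helper `partial` are assumptions made up for this illustration; any smooth functions with the same dependence structure would do.

```python
import math

def partial(g, point, name, h=1e-6):
    """Central-difference partial derivative of g(**point) with respect to `name`."""
    up = dict(point)
    up[name] += h
    dn = dict(point)
    dn[name] -= h
    return (g(**up) - g(**dn)) / (2 * h)

# Hypothetical functions with the dependence structure f(x, y), x(u, v), y(u, b).
def f(x, y):
    return x * x * y + math.sin(y)

def x_of(u, v):
    return u * v + v ** 3

def y_of(u, b):
    return math.exp(u) * b

def composite(u, v, b):
    return f(x_of(u, v), y_of(u, b))

p = {"u": 0.7, "v": -0.4, "b": 1.2}
xy = {"x": x_of(p["u"], p["v"]), "y": y_of(p["u"], p["b"])}

# Partials of f, evaluated at (x(u, v), y(u, b)).
fx = partial(f, xy, "x")
fy = partial(f, xy, "y")

# Partials of the inner functions.
uv = {"u": p["u"], "v": p["v"]}
ub = {"u": p["u"], "b": p["b"]}
xu, xv = partial(x_of, uv, "u"), partial(x_of, uv, "v")
yu, yb = partial(y_of, ub, "u"), partial(y_of, ub, "b")

# (direct numerical derivative of the composite, chain-rule formula)
checks = {
    "u": (partial(composite, p, "u"), fx * xu + fy * yu),
    "v": (partial(composite, p, "v"), fx * xv),
    "b": (partial(composite, p, "b"), fy * yb),
}

for name, (numeric, formula) in checks.items():
    print(name, numeric, formula)
```

For each of `u`, `v` and `b`, the two printed numbers should agree up to finite-difference error.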


 * Note that there is a careless error in the first matrix multiplication, midway through Philipp's first response. The matrices are set up correctly, but the product is incorrect. Where they wrote:
 * $$(D_x f D_u x + D_y f D_v x, D_x f D_u y + D_y f D_v y)$$,
 * they should have written:
 * $$(D_x f D_u x + D_y f D_u y, D_x f D_v x + D_y f D_v y)$$.
 * -- ToE 02:33, 17 August 2020 (UTC)
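
The corrected product can also be checked numerically. This sketch assumes, as in that part of the response, that both $$x$$ and $$y$$ depend on $$(u,v)$$; the concrete functions, the sample point, and the helper `partial` are hypothetical and chosen only for illustration. The row vector $$(D_x f, D_y f)$$ is multiplied by the matrix of inner partial derivatives and compared with a direct derivative of the composite.

```python
def partial(g, args, i, h=1e-6):
    """Central-difference partial derivative of g with respect to its i-th argument."""
    up = list(args)
    up[i] += h
    dn = list(args)
    dn[i] -= h
    return (g(*up) - g(*dn)) / (2 * h)

# Hypothetical functions: f(x, y), x(u, v), y(u, v).
def f(x, y):
    return x * y * y

def x_of(u, v):
    return u + 2 * v

def y_of(u, v):
    return u * v

def composite(u, v):
    return f(x_of(u, v), y_of(u, v))

point = (0.5, 1.5)
inner = (x_of(*point), y_of(*point))

# Row vector (D_x f, D_y f), evaluated at (x(u, v), y(u, v)).
Df = [partial(f, inner, 0), partial(f, inner, 1)]

# Matrix of inner partials: rows are x and y, columns are u and v.
J = [
    [partial(x_of, point, 0), partial(x_of, point, 1)],
    [partial(y_of, point, 0), partial(y_of, point, 1)],
]

# Row vector times matrix: entry j is the sum over i of Df[i] * J[i][j].
product = [sum(Df[i] * J[i][j] for i in range(2)) for j in range(2)]

# The uncorrected expression, which pairs the terms along rows instead of columns.
uncorrected = [
    Df[0] * J[0][0] + Df[1] * J[0][1],
    Df[0] * J[1][0] + Df[1] * J[1][1],
]

# Direct numerical derivatives of the composite with respect to u and v.
direct = [partial(composite, point, 0), partial(composite, point, 1)]

print("direct     :", direct)
print("corrected  :", product)
print("uncorrected:", uncorrected)
```

With these sample functions the corrected product matches the direct derivatives, and the uncorrected expression does not.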


 * Matrix multiplication error aside, I don't think Philipp did you a favor by answering your question in terms of the total derivative and Jacobian matrix. Linear algebra, of which you say you know little, offers useful tools for a terse and elegant representation of multivariate concepts, but it isn't necessary to address your question. As Lambiam explained above,
 * $$\mathrm{Let}~f=f(x,y)~\mathrm{and}~x=x(u,v)~\mathrm{and}~y=y(u,b)$$
 * is the same as saying:
 * $$\mathrm{Let}~f=f(x,y)~\mathrm{and}~x=x(u,v,b)~\mathrm{with}~\frac{\partial x}{\partial b}=0~\mathrm{and}~y=y(u,v,b)~\mathrm{with}~\frac{\partial y}{\partial v}=0.$$
 * $$\mathrm{So}~\frac{\partial f}{\partial v}=\frac{\partial f}{\partial x}\frac{\partial x}{\partial v}+\frac{\partial f}{\partial y}\frac{\partial y}{\partial v}=\frac{\partial f}{\partial x}\frac{\partial x}{\partial v}+\frac{\partial f}{\partial y}\cdot~0=\frac{\partial f}{\partial x}\frac{\partial x}{\partial v}.$$
 * Simple, right? -- ToE 14:27, 17 August 2020 (UTC)