Skip to content

Chain Rule for Multivariable Functions

In this lesson you’ll learn how to differentiate composite functions where the variables themselves depend on other variables. The tree diagram method will become your go-to tool for keeping track of all the dependency paths.

In Calculus 1, the chain rule was straightforward: if y = f(g(x)), then dy/dx = f’(g(x)) g’(x). One function inside another, one path.

In multivariable calculus, a function z = f(x, y) might have x and y that each depend on other variables. Now there are multiple paths from z to the variable you’re differentiating with respect to, and you need to account for all of them.

If z = f(x, y) where x = x(t) and y = y(t), then z ultimately depends on just t through two paths

dzdt=zxdxdt+zydydt\frac{dz}{dt} = \frac{\partial z}{\partial x}\frac{dx}{dt} + \frac{\partial z}{\partial y}\frac{dy}{dt}

Each term represents one path: z changes because x changes with t, and z also changes because y changes with t. Add them up.

If z = f(x, y) where x = x(s, t) and y = y(s, t), then

zs=zxxs+zyys\frac{\partial z}{\partial s} = \frac{\partial z}{\partial x}\frac{\partial x}{\partial s} + \frac{\partial z}{\partial y}\frac{\partial y}{\partial s} zt=zxxt+zyyt\frac{\partial z}{\partial t} = \frac{\partial z}{\partial x}\frac{\partial x}{\partial t} + \frac{\partial z}{\partial y}\frac{\partial y}{\partial t}

The tree diagram is the easiest way to never miss a term. Draw the function at the top, its immediate variables below, and the final variables at the bottom. Each edge gets labeled with the appropriate derivative. To find a total derivative, trace every path from top to bottom and multiply along each path, then add all paths.

You don’t need the tree diagram to use the chain rule. You can always just write out the formula directly. But when the dependencies get complicated, the tree is a cool trick that makes it almost impossible to forget a term.

Example 1: Single parameter

Let z = x² + 3xy + y² where x = t² and y = 3t. Find dz/dt.

First, compute the partial derivatives of z

zx=2x+3y\frac{\partial z}{\partial x} = 2x + 3y zy=3x+2y\frac{\partial z}{\partial y} = 3x + 2y

Then the derivatives of x and y with respect to t

dxdt=2tdydt=3\frac{dx}{dt} = 2t \qquad \frac{dy}{dt} = 3

Apply the chain rule

dzdt=(2x+3y)(2t)+(3x+2y)(3)\frac{dz}{dt} = (2x + 3y)(2t) + (3x + 2y)(3)

At t = 1 (so x = 1, y = 3)

dzdt=(2+9)(2)+(3+6)(3)=22+27=49\frac{dz}{dt} = (2 + 9)(2) + (3 + 6)(3) = 22 + 27 = 49

Example 2: Two parameters with exponentials

Let z = exye^{xy} where x = s + t and y = s - t. Find ∂z/∂s and ∂z/∂t.

The partials of z

zx=yexyzy=xexy\frac{\partial z}{\partial x} = ye^{xy} \qquad \frac{\partial z}{\partial y} = xe^{xy}

The partials of x and y

xs=1xt=1ys=1yt=1\frac{\partial x}{\partial s} = 1 \qquad \frac{\partial x}{\partial t} = 1 \qquad \frac{\partial y}{\partial s} = 1 \qquad \frac{\partial y}{\partial t} = -1

Apply the chain rule for each

The two green circles at the bottom of the tree are s and t, the independent variables we’re differentiating with respect to. Two paths reach s (through x and through y), giving ∂z/∂s. Two paths reach t, giving ∂z/∂t. Multiply along each path, then add

zs=yexy(1)+xexy(1)=(x+y)exy\frac{\partial z}{\partial s} = ye^{xy}(1) + xe^{xy}(1) = (x + y)e^{xy} zt=yexy(1)+xexy(1)=(yx)exy\frac{\partial z}{\partial t} = ye^{xy}(1) + xe^{xy}(-1) = (y - x)e^{xy}

Substituting back: since x + y = 2s and y - x = -2t

zs=2se(s+t)(st)zt=2te(s+t)(st)\frac{\partial z}{\partial s} = 2se^{(s+t)(s-t)} \qquad \frac{\partial z}{\partial t} = -2te^{(s+t)(s-t)}

Example 3: Verifying by direct substitution

We can check Example 1 by substituting first. With x = t² and y = 3t

z=(t2)2+3(t2)(3t)+(3t)2=t4+9t3+9t2z = (t^2)^2 + 3(t^2)(3t) + (3t)^2 = t^4 + 9t^3 + 9t^2 dzdt=4t3+27t2+18t\frac{dz}{dt} = 4t^3 + 27t^2 + 18t

At t = 1: dz/dt = 4 + 27 + 18 = 49. Same answer as the chain rule gave us. Both methods work, but the chain rule is often faster when you don’t want to (or can’t) substitute everything first.

The multivariable chain rule is everywhere:

  • Machine learning uses it as backpropagation: the loss depends on outputs, which depend on weights, which depend on earlier layers. The chain rule traces every path backward through the network.
  • Game engines use it when a character’s hand position depends on shoulder angle, elbow angle, and wrist angle, all changing with time. The chain rule computes the hand’s velocity.
  • Physics uses it for related rates in 3D: pressure depends on temperature and volume, which both depend on time.
  • Engineering uses it for sensitivity analysis: how does a final measurement change when multiple input parameters shift?
For $z = f(x,y)$ with $x = x(t)$ and $y = y(t)$, $dz/dt$ equals
A tree diagram helps you
If $z = e^{xy}$, $x = s + t$, $y = s - t$, then $\partial z / \partial t$ equals
In machine learning, backpropagation is an application of
The most common mistake with the multivariable chain rule is