The last several videos have been about the
idea of a derivative, and before moving on to integrals, I want to take some time to
talk about limits. To be honest, the idea of a limit is not really
anything new. If you know what the word “approach” means, you pretty much already know what a
limit is; you could say the rest is a matter of assigning fancy notation to the intuitive
idea of one value getting closer to another. But there are actually a few reasons to devote
a full video to this topic. For one thing, it’s worth showing how
the way I’ve been describing derivatives so far lines up with the formal definition
of a derivative as it’s typically presented in most courses and textbooks.
I want to give you some confidence that thinking of terms like dx and df as concrete non-zero
nudges is not just some trick for building intuition; it’s actually backed up by the
formal definition of a derivative in all its rigor.
I also want to shed a little light on what exactly mathematicians mean by “approach”,
in terms of something called the "epsilon delta" definition of limits.
Then we’ll finish off with a clever trick for computing limits called L’Hôpital’s
rule.
So first things first, let’s take a look
at the formal definition of the derivative. As a reminder, when you have some function
f(x), to think about the derivative at a particular input, maybe x=2, you start by imagining nudging
that input by some tiny dx, and looking at the resulting change to the output, df.
The ratio df/dx, which can nicely be thought of as the rise-over-run slope between the
starting point on the graph and the nudged point, is almost the derivative. The actual
derivative is whatever this ratio approaches as dx approaches 0.
Just to spell out what is meant here, that nudge to the output “df” is the difference
between f(starting-input + dx) and f(starting-input); the change to the output caused by the nudge
dx.
To express that you want to find what this
ratio approaches as dx approaches 0, you write “l-i-m”, for limit, with “dx arrow 0”
below it. Now, you’ll almost never see terms with
a lowercase d, like dx, inside a limit like this. Instead the standard is to use a different
variable, like delta-x, or commonly “h” for some reason.
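For reference, here is that definition written out in full, using h for the nudge and x₀ as a label for the particular starting input (like x₀ = 2 in the example; the x₀ label is just for this write-up):

$$\frac{df}{dx}(x_0) \;=\; \lim_{h \to 0} \frac{f(x_0 + h) - f(x_0)}{h}$$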
The way I like to think of it is that terms with this lowercase d in the typical derivative
expression have built into them the idea of a limit, the idea that dx is supposed to eventually
approach 0. So in a sense this lefthand side “df/dx”,
the ratio we’ve been thinking about for the past few videos, is just shorthand for
what the righthand side spells out in more detail, writing out exactly what we mean by
df, and writing out the limiting process explicitly. And that righthand side is the formal definition
of a derivative, as you’d commonly see it in any calculus textbook. Now, if you’ll pardon me for a small rant
here, I want to emphasize that nothing about this righthand side references the paradoxical
idea of an “infinitely small” change. The point of limits is to avoid that.
This value h is the exact same thing as the “dx” I’ve been referencing throughout
the series.
It’s a nudge to the input of f with some
nonzero, finitely small size, like 0.001; it’s just that we’re analyzing what happens
for arbitrarily small choices of h. In fact, the only reason people introduce
a new variable name into this formal definition, rather than just using dx, is to be super-extra
clear that these changes to the input are ordinary numbers that have nothing to do with
infinitesimals. You see, there are others who like to interpret
dx as an “infinitely small change”, whatever that would mean, or to just say that dx and
df are nothing more than symbols that shouldn’t be taken too seriously.
But by now in the series, you know that I’m not really a fan of either of those views;
I think you can and should interpret dx as a concrete, finitely small nudge, just so
long as you remember to ask what happens as it approaches 0.
For one thing, and I hope the past few videos have helped convince you of this, that helps
to build a stronger intuition for where the rules of calculus actually come from.
But it’s not just some trick for building intuitions.
Everything I’ve been saying
about derivatives with this concrete-finitely-small-nudge philosophy is just a translation of the formal
definition of derivatives. Long story short, the big fuss about limits
is that they let us avoid talking about infinitely small changes by instead asking what happens
as the size of some change to our variable approaches 0.
And that brings us to goal #2: Understanding exactly what it means for one value to approach
another. For example, consider the function [(2+h)³ – 2³]/h.
This happens to be the expression that pops
out if you unravel the definition for the derivative of x³ at x=2, but let’s just
think of it as any ol’ function with an input h.
Its graph is this nice continuous-looking parabola. But actually, if you think about
what’s going on at h=0, plugging that in you’d get 0/0, which is not defined. Just ask Siri.
So really, this graph has a hole at that point. You have to exaggerate to draw that hole,
often with a little empty circle like this, but keep in mind the function is perfectly
well-defined for inputs as close to 0 as you want.
And wouldn’t you agree that as h approaches 0, the corresponding output, the height of
this graph, approaches 12? And it doesn’t matter which side you come at it from.
That is, the limit of this ratio as h goes to 0 equals 12.
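If you want to see that numerically, here’s a quick sketch (my own illustration, not something from the video) that evaluates the ratio for values of h closing in on 0 from both sides:

```python
# Evaluate ((2 + h)^3 - 2^3) / h for values of h shrinking toward 0
# from both sides; the outputs close in on 12.
def ratio(h):
    return ((2 + h)**3 - 2**3) / h

for h in [0.1, -0.1, 0.01, -0.01, 0.001, -0.001]:
    print(f"h = {h:>7}: ratio = {ratio(h):.4f}")

# h =     0.1: ratio = 12.6100
# h =    -0.1: ratio = 11.4100
# h =    0.01: ratio = 12.0601
# h =   -0.01: ratio = 11.9401
# ...squeezing in on 12 from both sides.
```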
But imagine you’re a mathematician inventing calculus, and someone skeptically asks “well
what exactly do you mean by approach?” That would be an annoying question. I mean,
come on, we all know what it means for one value to get closer to another.
But let me show you a way to answer completely unambiguously.
For a given range of inputs within some distance of 0, excluding the forbidden point 0, look
at the corresponding outputs, all possible heights of the graph above that range.
As that range of input values closes in more and more tightly around 0, the range of output
values closes in more and more closely around 12.
The size of that range of outputs can
be made as small as you want. As a counterexample, consider a function that
looks like this, which is also not defined at 0, but kind of jumps at that point.
As you approach h = 0 from the right, the function approaches 2, but as you come at
0 from the left, it approaches 1. Since there’s not a clear, unambiguous value that this function
approaches as h approaches 0, the limit is simply not defined at that point.
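In standard one-sided limit notation (notation that doesn’t appear on screen, but which captures exactly this situation), the mismatch reads:

$$\lim_{h \to 0^+} f(h) = 2, \qquad \lim_{h \to 0^-} f(h) = 1, \qquad \text{so } \lim_{h \to 0} f(h) \text{ does not exist.}$$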
When you look at any range of inputs around 0, and the corresponding range of outputs,
as you shrink that input range the corresponding outputs don’t narrow in on any specific
value.
Instead those outputs straddle a range that never even shrinks smaller than 1, no
matter how small your input range. This perspective of shrinking an input range
around the limiting point, and seeing whether or not you’re restricted in how much that
shrinks the output range, leads to something called the “epsilon delta” definition
of limits. You could argue this is needlessly heavy-duty
for an introduction to calculus. Like I said, if you know what the word “approach” means,
you know what a limit means, so there’s nothing new on the conceptual level here.
But this is an interesting glimpse into the field of real analysis, and it gives you a
taste for how mathematicians made the intuitive ideas of calculus fully airtight and rigorous.
You’ve already seen the main idea: when a limit exists, you can make this output range
as small as you want; but when the limit doesn’t exist, that output range can’t get smaller
than some value, no matter how much you shrink the input range around the limiting input.
Phrasing that same idea a little more precisely, maybe in the context of this example where
the limiting value was 12, think of any distance away from 12, where for some reason it’s
common to use the Greek letter “epsilon” to denote that distance.
And the intent here
is that that distance be something as small as you want.
What it means for the limit to exist is that you can always find a range of inputs around
our limiting input, some distance delta away from 0, so that any input within a distance
delta of 0 corresponds to an output within a distance epsilon of 12.
The key point is that this is true for any epsilon, no matter how small.
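Packed into symbols, with L standing for the limiting value (12 in this example), that definition reads:

$$\lim_{h \to 0} f(h) = L \quad \Longleftrightarrow \quad \text{for every } \varepsilon > 0 \text{ there is a } \delta > 0 \text{ such that } 0 < |h| < \delta \text{ implies } |f(h) - L| < \varepsilon$$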
In contrast, when a limit doesn’t exist, as in this example, you can find a sufficiently
small epsilon, like 0.4, so that no matter how small you make your range around 0, no
matter how tiny delta is, the corresponding range of outputs is just always too big. There
is no limiting output value that they get arbitrarily close to. So far this is all pretty theory-heavy; limits
being used to formally define the derivative, then epsilons and deltas being used to rigorously
define limits themselves. So let’s finish things off here with a trick for actually
computing limits.
For example, let’s say for some reason you
were studying the function sin(pi*x)/(x²-1). Maybe this models some kind of damped oscillation.
When you plot a bunch of points to graph it, it looks pretty continuous, but there’s
a problematic value, x=1. When you plug that in, sin(pi) is 0, and the
denominator is also 0, so the function is actually not defined there, and the graph
should really have a hole there. This also happens at -1, but let’s just
focus our attention on one of these holes for now.
The graph certainly does seem to approach some distinct value at that point, wouldn’t
you say? So you might ask, how do you figure out what output this approaches as x approaches
1, since you can’t just plug in 1? Well, one way to approximate it would be to
plug in a number very close to 1, like 1.00001.
Doing that, you’d get a number around -1.57.
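Here’s a little sketch of that idea (again, my own illustration rather than something from the video), probing the function at inputs closer and closer to the forbidden point:

```python
import math

# Probe sin(pi*x) / (x^2 - 1) at inputs approaching the forbidden
# point x = 1, where plugging in directly would give 0/0.
def f(x):
    return math.sin(math.pi * x) / (x**2 - 1)

for x in [1.1, 1.01, 1.001, 1.00001]:
    print(f"x = {x}: f(x) = {f(x):.4f}")

# x = 1.1: f(x) = -1.4715
# x = 1.01: f(x) = -1.5627
# x = 1.001: f(x) = -1.5700
# x = 1.00001: f(x) = -1.5708
```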
But is there a way to know exactly what it is? Some systematic process to take an expression
like this one, which looks like 0/0 at some input, and ask what its limit is as x approaches
that input? Well, after limits so helpfully let us write
the definition for a derivative, derivatives can come back to return the favor and help
us evaluate limits.
Let me show you what I mean.
Here’s the graph of sin(pi*x), and here’s the graph of x²-1. That’s kind of a lot
on screen, but just focus on what’s happening at x=1. The point here is that sin(pi*x) and
x²-1 are both 0 at that point, so they cross the x-axis.
In the same spirit as plugging in a specific value near 1, like 1.00001, let’s zoom in
on that point and consider what happens a tiny nudge dx away.
The value of sin(pi*x) is bumped down, and the size of that nudge, which was caused
by the nudge dx to the input, is what we might call d(sin(pi*x)).
From our knowledge of derivatives, using the chain rule, that should be around cos(pi*x)*pi*dx.
Since the starting value was x=1, we plug x=1 into this expression.
In other words, the size of the change to this sin(pi*x) graph is roughly proportional
to dx, with proportionality constant cos(pi)*pi.
Since cos(pi) is exactly -1, we can write
that as -pi*dx. Similarly, the value of this x²-1 graph has changed
by some amount d(x²-1). And taking the derivative, the size of that nudge should be 2*x*dx. Again,
since we started at x=1, that means the size of this change is about 2*1*dx.
So for values of x which are some tiny value dx away from 1, the ratio sin(pi*x)/(x²-1)
is approximately (-pi*dx) / (2*dx).
The dx’s cancel, so that value is -pi/2.
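Chaining those two approximations together, for x = 1 + dx:

$$\frac{\sin(\pi(1 + dx))}{(1 + dx)^2 - 1} \;\approx\; \frac{\cos(\pi) \cdot \pi \cdot dx}{2 \cdot 1 \cdot dx} \;=\; \frac{-\pi \, dx}{2 \, dx} \;=\; -\frac{\pi}{2}$$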
Since these approximations get more and more accurate for smaller and smaller choices of
dx, this ratio -pi/2 actually tells us the precise limiting value as x approaches 1.
Remember, what that means is that the limiting height on our original graph is evidently
exactly -pi/2. What happened there is a little subtle, so
let me show it again, but this time a little more generally. Instead of these two specific
functions, which both equal 0 at x=1, think of any two functions f(x) and g(x), which
are both 0 at some common value x = a.
And these have to be functions where you’re
able to take a derivative of them at x = a, meaning they each basically look like a line
when you zoom in close enough to that value. Even though you can’t compute f divided
by g at the trouble point, since both equal zero, you can ask about this ratio for values
of x very close to a, the limit as x approaches a. And it’s helpful to think of those nearby
inputs as a tiny nudge dx away from a. The value of f at that nudged point is approximately
its derivative, df/dx evaluated at a, times dx. Likewise the value of g at that nudged
point is approximately the derivative of g, evaluated at a, times dx.
So near this trouble point, the ratio between the outputs of f and g is actually about the
same as the derivative of f at a, times dx, divided by the derivative of g at a, times
dx.
These dx’s cancel, so the ratio of f and
g near a is about the same as the ratio between their derivatives.
Since those approximations get more accurate for smaller nudges, this ratio of derivatives
gives the precise value for the limit. This is a really handy trick for computing
a lot of limits. If you come across an expression that seems to equal 0/0 when you plug in some
input, just take the derivative of the top and bottom expressions, and plug in that trouble
input. This clever trick is called “L'Hôpital's
rule”. Interestingly, it was actually discovered by Johann Bernoulli, but L’Hôpital was a
wealthy dude who essentially paid Bernoulli for the rights to some of his mathematical
discoveries.
In a very literal way, it pays to understand
these tiny nudges. You might remember that the definition of
a derivative for any given function comes down to computing the limit of a fraction
that looks like 0/0, so you might think L’Hôpital’s rule gives a handy way to discover new derivative
formulas. But that would be cheating, since presumably
you don’t yet know what the derivative of the numerator here is.
When it comes to discovering derivative formulas, something we’ve been doing a fair amount
of in this series, there is no systematic plug-and-chug method. But that’s a good thing. When creativity
is required to solve problems like these, it’s a good sign you’re doing something
real; something that might give you a powerful tool to solve future problems. Up next, I’ll talk about what an integral
is, as well as the fundamental theorem of calculus, which is another example of where
limits are used to help give a clear meaning to a fairly delicate idea that flirts with
infinity. As you know, most support for this channel
comes through Patreon, and the primary perk for patrons is early access to future series
like this, where the next one will be on Probability.
But for those of you who want a more tangible
way to flag that you’re part of the community, there is also a small 3blue1brown store, links
on the screen and in the description. I’m still debating whether or not to make a
preliminary batch of plushie pi creatures; it kind of depends on how many viewers seem
interested in the store in general, but let me know in the comments what kind of other things
you’d like to see there.