Limits and Continuous Functions
To analyze change and define derivatives, we must first establish the concepts of limits and continuity. Limits allow us to study the behavior of functions as inputs get infinitely close to a specific point, even if the function itself is undefined or behaves chaotically at that exact location. Continuity guarantees that the loss landscapes we navigate are smooth and unbroken, which is an absolute requirement for gradient-based optimization algorithms to function reliably.
The Concept of Nearness and Limits
A limit describes the value that a function approaches as the input to the function approaches a specific target. This is the conceptual tool that allows mathematicians to bridge the gap between discrete, disjointed steps and smooth, continuous movement.
Formal Definition of a Limit
Mathematically, we write the limit of a function $f(x)$ as $x$ approaches $a$ as:
$$\lim_{x \to a} f(x) = L$$
This notation states that as the input $x$ gets arbitrarily close to $a$ (from both the left and right sides), the output $f(x)$ gets arbitrarily close to the value $L$. Crucially, $f(a)$ does not need to be equal to $L$, or even exist, for the limit to exist.
Handling Indeterminate Forms
In calculus, we frequently encounter expressions like $\frac{0}{0}$ or $\frac{\infty}{\infty}$ when analyzing instantaneous rates of change. Limits allow us to evaluate these 'indeterminate forms' by simplifying the expression and analyzing its behavior as we approach the point of interest.
Continuity in Optimization Landscapes
A function is continuous if its graph has no holes, jumps, or vertical asymptotes. In formal terms, a function $f(x)$ is continuous at a point $a$ if three conditions are met: $f(a)$ is defined, the limit of $f(x)$ as $x$ approaches $a$ exists, and that limit is exactly equal to $f(a)$.
Mathematical Continuity
A function is continuous over an interval if it is continuous at every point in that interval. This can be expressed as:
$$\lim_{x \to a} f(x) = f(a)$$
If this condition fails at any point, the function is discontinuous, representing a break or sudden jump in the value of the function.
Why Continuity Matters for Optimizers
Optimization algorithms, such as gradient descent, rely on the assumption that the loss function is continuous and smooth. If a loss landscape is discontinuous, a tiny change in weights could lead to a massive, unpredictable jump in error, making it impossible for the model to find a stable minimum.