Integrals: Calculating Area Under the Curve

While derivatives break a function down to analyze instantaneous change, integrals add its infinitesimal pieces back together to measure accumulation. Integral calculus is the study of areas, volumes, and other accumulated quantities. In machine learning and data science, integration is essential for working with continuous probability distributions, calculating expectations, and reasoning about long-term rewards in reinforcement learning.

Integration as Accumulation

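The core concept of integration is finding the area under a curve. We can approximate this area by dividing the region under the curve into $n$ narrow vertical rectangles of width $\Delta x$ and summing their areas, $\sum_{i=1}^{n} f(x_i) \Delta x$, where $x_i$ is a sample point inside the $i$-th rectangle. This construction is known as a Riemann sum. As the number of rectangles grows without bound and their width approaches zero, the sum converges to the exact integral for any integrable function, which includes every continuous function on a closed interval.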
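The rectangle approximation described above can be sketched in a few lines of Python. This is a minimal illustration, not a production integrator: the function names `riemann_sum` and `square` are hypothetical, and taking the sample point at the midpoint of each rectangle is one choice among several (left or right endpoints would also work).

```python
def riemann_sum(f, a, b, n):
    """Approximate the area under f on [a, b] using n midpoint rectangles."""
    width = (b - a) / n              # every rectangle has the same width
    total = 0.0
    for i in range(n):
        midpoint = a + (i + 0.5) * width   # sample point of the i-th rectangle
        total += f(midpoint) * width       # height times width
    return total


def square(x):
    """Example integrand: f(x) = x^2, whose exact area on [0, 1] is 1/3."""
    return x * x


# More (and therefore narrower) rectangles give a closer approximation.
for n in (10, 100, 1000):
    print(n, riemann_sum(square, 0.0, 1.0, n))
```

As `n` grows, the printed values move toward 1/3, which is the behavior the limiting argument predicts.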

The Definite Integral

A definite integral gives the accumulated area under a function $f(x)$ between two bounds $a$ and $b$. The area is signed: regions where $f(x)$ is negative count against the total. It is denoted as: $$A = \int_{a}^{b} f(x) dx$$ where the $\int$ symbol is an elongated 'S' for sum, and $dx$ represents the infinitesimally thin width of each rectangular slice.

The Fundamental Theorem of Calculus

The Fundamental Theorem of Calculus connects differentiation and integration, showing that they are inverse operations. If $f$ is continuous on $[a, b]$ and $F(x)$ is any antiderivative of $f(x)$ (meaning $F'(x) = f(x)$), then the definite integral can be evaluated as: $$\int_{a}^{b} f(x) dx = F(b) - F(a)$$
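As a short worked example of the theorem (the choice of function and bounds here is purely illustrative), take $f(x) = x^2$ on the interval $[1, 3]$. One antiderivative is $F(x) = \frac{x^3}{3}$, since $F'(x) = x^2$. Then:

$$\int_{1}^{3} x^2 dx = F(3) - F(1) = \frac{27}{3} - \frac{1}{3} = \frac{26}{3} \approx 8.67$$

No limit of rectangle sums has to be computed: evaluating the antiderivative at the two bounds is enough.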

Integrals in Probability and AI

A continuous variable can take uncountably many values, so probabilities and expectations cannot be computed by summing over individual outcomes. Instead, we use integration to calculate probabilities and expected values over continuous ranges.

Probability Density Functions (PDF)

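For a continuous random variable $X$ with probability density function $p(x)$, the probability of a value falling within an interval $[a, b]$ is not calculated by summing discrete points, but by integrating the density: $$P(a \le X \le b) = \int_{a}^{b} p(x) dx$$ A valid density satisfies $p(x) \ge 0$ and integrates to 1 over its whole domain, and the probability of any single exact value is zero. These integrals matter for generative models: in Variational Autoencoders (VAEs) the likelihood of a data point is an integral over latent variables that is generally intractable, and Normalizing Flows are constructed so that the modeled density still integrates to 1 after each transformation.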
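To make the interval probability concrete, the sketch below integrates a density numerically. It assumes a standard normal density as the example (the source does not fix a particular distribution), and the helper names `standard_normal_pdf` and `integrate_trapezoid` are hypothetical. The trapezoid rule stands in for the integral, and the closed form via `math.erf` is used only as a cross-check.

```python
import math


def standard_normal_pdf(x):
    """Density of a normal distribution with mean 0 and standard deviation 1."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def integrate_trapezoid(f, a, b, n=10_000):
    """Approximate the integral of f over [a, b] with n trapezoids."""
    width = (b - a) / n
    total = 0.5 * (f(a) + f(b))      # the two endpoints get half weight
    for i in range(1, n):
        total += f(a + i * width)    # interior points get full weight
    return total * width


# P(-1 <= X <= 1): probability of landing within one standard deviation.
prob_numeric = integrate_trapezoid(standard_normal_pdf, -1.0, 1.0)

# Closed form for the standard normal, used here as a sanity check.
prob_exact = math.erf(1.0 / math.sqrt(2.0))

print(prob_numeric, prob_exact)
```

Both values come out near 0.683, the familiar "68% within one standard deviation" figure. Integrating over a much wider interval such as $[-8, 8]$ returns a value very close to 1, which reflects the normalization of the density.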

Expected Values and RL

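In machine learning, we often need the expected value of a function of a continuous random variable $X$ with density $p(x)$. It is defined as an integral over all values of $x$, with each value of $f$ weighted by how likely it is: $E[f(X)] = \int f(x) p(x) dx$. The same form appears in reinforcement learning with continuous action spaces, where the expected reward under a policy is an integral over actions weighted by the policy's density. In continuous-time settings, the cumulative reward is itself an integral over time.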
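The expectation integral can be estimated in two ways, sketched below under illustrative assumptions: $X$ is taken to be standard normal and $f(x) = x^2$, for which the exact answer is 1 (the variance of $X$). The first function approximates the integral directly with a midpoint rule on a truncated range. The second replaces the integral with an average over random samples, the Monte Carlo approach that is common in practice when the integral has no closed form. All function names are hypothetical.

```python
import math
import random


def standard_normal_pdf(x):
    """Density p(x) of a normal distribution with mean 0 and std 1."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def f(x):
    """Example function whose expectation we want: f(x) = x^2."""
    return x * x


def expectation_by_quadrature(f, pdf, lo=-8.0, hi=8.0, n=20_000):
    """Midpoint-rule estimate of the integral of f(x) * pdf(x) over [lo, hi].

    The infinite range is truncated to [lo, hi]; this assumes the density
    is negligible outside that window, which holds for the standard normal.
    """
    width = (hi - lo) / n
    total = 0.0
    for i in range(n):
        x = lo + (i + 0.5) * width
        total += f(x) * pdf(x)
    return total * width


def expectation_by_sampling(f, n_samples=100_000, seed=0):
    """Monte Carlo estimate: average f over samples drawn from p(x)."""
    rng = random.Random(seed)        # fixed seed for reproducibility
    draws = (f(rng.gauss(0.0, 1.0)) for _ in range(n_samples))
    return sum(draws) / n_samples


print(expectation_by_quadrature(f, standard_normal_pdf))
print(expectation_by_sampling(f))
```

The quadrature estimate is accurate to many digits in one dimension, while the sampling estimate is noisier but does not need a grid over the input space, which is why sampling tends to be preferred when $x$ is high-dimensional.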