Linear Independence and Matrix Rank

More features don't always mean more information. If your dataset has 'Height in Inches' and 'Height in CM', one is perfectly predictable from the other (centimeters are just inches times 2.54), so the second column adds nothing new. The rank of a matrix tells us the number of 'unique' or independent directions in our data.

Linear Independence

A set of vectors is linearly independent if none of them can be written as a linear combination (a weighted sum) of the others. If one vector can be written that way, it is redundant, and the set is linearly dependent.
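One way to test this numerically is to ask whether a least-squares fit can reproduce a vector exactly from the others. The sketch below assumes NumPy; the vectors, the helper name `is_combination`, and the tolerance are made up for illustration.

```python
import numpy as np

# Hypothetical vectors chosen for illustration.
a = np.array([1.0, 0.0, 2.0])
b = np.array([0.0, 1.0, 1.0])
c = 2 * a + 3 * b                 # built from a and b, so it is dependent on them
d = np.array([0.0, 0.0, 1.0])     # not a weighted sum of a and b

def is_combination(target, others, tol=1e-10):
    """Return True if `target` is (numerically) a linear combination of `others`."""
    A = np.column_stack(others)
    # Best-fitting weights in the least-squares sense.
    coeffs, *_ = np.linalg.lstsq(A, target, rcond=None)
    # A near-zero residual means the fit reproduces `target` exactly.
    residual = np.linalg.norm(A @ coeffs - target)
    return bool(residual < tol)

c_is_dependent = is_combination(c, [a, b])
d_is_dependent = is_combination(d, [a, b])
print(c_is_dependent, d_is_dependent)
```

The tolerance is needed because floating-point arithmetic rarely produces a residual of exactly zero; the value used here is an arbitrary choice.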

The Meaning of Rank

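The Rank of a matrix is the maximum number of linearly independent columns, which always equals the maximum number of linearly independent rows. It measures the 'Information Density' of the matrix: how many of its rows or columns carry something that the others do not.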
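As a small illustration, the height example can be checked with NumPy's `np.linalg.matrix_rank`. The measurements below are invented, and the weight column is added only so the matrix has something independent in it.

```python
import numpy as np

# Invented measurements for four people.
height_in = np.array([60.0, 65.0, 70.0, 72.0])
height_cm = height_in * 2.54               # exact multiple of height_in
weight_kg = np.array([55.0, 70.0, 68.0, 90.0])

# Three feature columns, but only two independent directions.
X = np.column_stack([height_in, height_cm, weight_kg])

rank = np.linalg.matrix_rank(X)
rank_of_transpose = np.linalg.matrix_rank(X.T)   # row rank equals column rank
print(X.shape, rank, rank_of_transpose)
```

`matrix_rank` counts singular values above a small tolerance, so it treats the tiny rounding error in the centimeter column as zero.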

Why it Matters

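A square matrix must have 'Full Rank' (rank equal to its number of rows and columns) to be invertible. In data prep, we look for rank-deficient feature matrices because they reveal features that are exact linear combinations of others. Those features can be deleted without losing information, which saves compute and can help limit overfitting.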
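A minimal sketch of how redundant features might be located: walk through the columns and keep each one only if it raises the rank of the columns kept so far. The function name `independent_columns` and the data are hypothetical. This greedy pass is one simple approach, not the only one, and which column of a redundant pair gets dropped depends on column order.

```python
import numpy as np

def independent_columns(X, tol=None):
    """Greedily keep each column that raises the rank of the columns kept so far."""
    kept = []
    for j in range(X.shape[1]):
        candidate = kept + [j]
        # The candidate set is independent only if its rank equals its size.
        if np.linalg.matrix_rank(X[:, candidate], tol=tol) == len(candidate):
            kept.append(j)
    return kept

# Invented data: column 1 is column 0 converted from inches to centimeters.
height_in = np.array([60.0, 65.0, 70.0, 72.0])
weight_kg = np.array([55.0, 70.0, 68.0, 90.0])
X = np.column_stack([height_in, height_in * 2.54, weight_kg])

kept = independent_columns(X)
dropped = [j for j in range(X.shape[1]) if j not in kept]
X_reduced = X[:, kept]
print(kept, dropped)
```

The reduced matrix has the same rank as the original, so no independent direction was thrown away.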

Rank Deficiency

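If Rank < Dimension (here, the number of feature columns), your data exists in a 'collapsed' state: it lies in a lower-dimensional subspace and carries hidden redundancies. Finding the rank tells you how far you can compress the data without losing any real signal. With noisy real-world data the redundancy is usually only approximate, so the compression is then nearly lossless rather than exactly lossless.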
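One common way to do this kind of compression is a truncated singular value decomposition (SVD). The sketch below builds a synthetic matrix that is exactly rank 2 and shows that two components are enough to rebuild it. The sizes, the random seed, and the variable names are assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: 100 samples, 5 features, but built from only 2 hidden factors.
base = rng.normal(size=(100, 2))
mixing = rng.normal(size=(2, 5))
X = base @ mixing

r = np.linalg.matrix_rank(X)

# Thin SVD, then keep only the first r components.
U, s, Vt = np.linalg.svd(X, full_matrices=False)
scores = U[:, :r] * s[:r]        # 100 x r compressed representation
components = Vt[:r, :]           # r x 5 directions needed to decode it

X_rebuilt = scores @ components
stored_numbers = scores.size + components.size
print(r, X.size, stored_numbers, np.allclose(X, X_rebuilt))
```

Here the 500 original numbers are replaced by 210 stored ones, and the reconstruction matches the original up to floating-point error because the matrix is exactly rank deficient.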