Market Basket Analysis (Apriori Algorithm)
Market basket analysis mines transaction data for frequently co-purchased items and derives association rules like “if A then B”, enabling product placement, bundling, and recommendation strategies.
Key Metrics
Association rule mining is driven by three metrics: Support (the fraction of transactions that contain the itemset), Confidence (the estimated probability of Y given X), and Lift (how much more often X and Y occur together than would be expected if they were independent).
Definitions
- Support(X) = transactions containing X / total transactions.
- Confidence(X → Y) = Support(X ∪ Y) / Support(X).
- Lift(X → Y) = Confidence(X → Y) / Support(Y). Lift > 1 means positive association; = 1 means independent; < 1 means negative association.
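The three definitions can be checked by hand on a small example. The sketch below assumes a toy list of five baskets (illustrative data only) and hypothetical helper names support, confidence, and lift; it is a direct transcription of the formulas, not an efficient implementation.

```python
# Toy transactions (illustrative data, not from a real store).
transactions = [
    {"bread", "milk"},
    {"bread", "diapers", "beer", "eggs"},
    {"milk", "diapers", "beer", "cola"},
    {"bread", "milk", "diapers", "beer"},
    {"bread", "milk", "diapers", "cola"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(1 for t in transactions if itemset <= t) / len(transactions)


def confidence(x, y, transactions):
    """Support(X ∪ Y) / Support(X), an estimate of P(Y | X)."""
    return support(set(x) | set(y), transactions) / support(x, transactions)


def lift(x, y, transactions):
    """Confidence(X → Y) / Support(Y)."""
    return confidence(x, y, transactions) / support(y, transactions)


x, y = {"diapers"}, {"beer"}
print(f"support    = {support(x | y, transactions):.2f}")
print(f"confidence = {confidence(x, y, transactions):.2f}")
print(f"lift       = {lift(x, y, transactions):.2f}")
```

On this toy data the rule {diapers} → {beer} has support 0.60, confidence 0.75, and lift 1.25, so beer appears with diapers about 25% more often than independence would predict.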
The Apriori Algorithm
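Apriori exploits the anti-monotone property of support (also called downward closure): every subset of a frequent itemset must also be frequent. Equivalently, if an itemset is infrequent, none of its supersets can be frequent. Apriori therefore builds candidates level by level, from single items to pairs to triples, and discards any candidate that has an infrequent subset before its support is ever counted.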
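To make the pruning step concrete, here is a minimal pure-Python sketch of the level-wise search. The function name apriori_frequent_itemsets, the toy baskets, and the 0.5 threshold are assumptions for illustration; the quadratic join is chosen for readability, not speed.

```python
from itertools import combinations

# Toy transactions (illustrative data only).
transactions = [
    {"bread", "milk"},
    {"bread", "diapers", "beer", "eggs"},
    {"milk", "diapers", "beer", "cola"},
    {"bread", "milk", "diapers", "beer"},
    {"bread", "milk", "diapers", "cola"},
]


def apriori_frequent_itemsets(transactions, min_support):
    """Return {itemset: support} for all itemsets with support >= min_support."""
    baskets = [frozenset(t) for t in transactions]
    n = len(baskets)

    def keep_frequent(candidates):
        # Count support for each candidate and keep only the frequent ones.
        counted = {c: sum(1 for b in baskets if c <= b) / n for c in candidates}
        return {c: s for c, s in counted.items() if s >= min_support}

    # Level 1: frequent single items.
    items = {item for b in baskets for item in b}
    current = keep_frequent({frozenset([item]) for item in items})
    frequent = dict(current)

    k = 2
    while current:
        prev = set(current)
        # Join: combine frequent (k-1)-itemsets whose union has exactly k items.
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        # Prune: a candidate with any infrequent (k-1)-subset cannot be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(sub) in prev for sub in combinations(c, k - 1))
        }
        current = keep_frequent(candidates)
        frequent.update(current)
        k += 1
    return frequent


frequent = apriori_frequent_itemsets(transactions, min_support=0.5)
for itemset, s in sorted(frequent.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), s)
```

On this data, {bread, diapers, beer} is never counted: its subset {bread, beer} is already infrequent, so the prune step removes it.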
Apriori with mlxtend
Interpreting Rules
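Rules with high support, high confidence, and lift > 1 are the most actionable. Sorting by lift surfaces the associations that are strongest relative to the items' base rates, but a high lift on very low support may rest on only a handful of transactions, so apply a minimum support threshold as well.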
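One way to apply these criteria is to filter on all three metrics and then sort by lift. The sketch below assumes a small hypothetical rules table with illustrative values; the column names mirror those produced by mlxtend's association_rules, but the item sets are written as plain strings for readability, and the thresholds are arbitrary choices.

```python
import pandas as pd

# Hypothetical rules table with illustrative values.
rules = pd.DataFrame(
    [
        ("eggs", "beer", 0.2, 1.00, 1.6667),
        ("beer", "diapers", 0.6, 1.00, 1.25),
        ("diapers", "beer", 0.6, 0.75, 1.25),
        ("bread", "milk", 0.6, 0.75, 0.9375),
    ],
    columns=["antecedents", "consequents", "support", "confidence", "lift"],
)

# Illustrative thresholds; tune them to the dataset at hand.
MIN_SUPPORT, MIN_CONFIDENCE = 0.3, 0.7

actionable = rules[
    (rules["support"] >= MIN_SUPPORT)
    & (rules["confidence"] >= MIN_CONFIDENCE)
    & (rules["lift"] > 1)
].sort_values(["lift", "confidence"], ascending=False)

print(actionable)
```

The eggs rule has the highest lift in the table but is dropped because its support is only 0.2, and the bread rule is dropped because its lift is below 1.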
Business Applications
- Product placement: Place frequently co-purchased items near each other.
- Bundle offers: Discount bundles identified by high-confidence rules.
- Recommendation engines: Suggest items frequently bought with what's in the cart.
- Inventory management: Stock frequently co-occurring items together.
FP-Growth: A Faster Alternative
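FP-Growth avoids candidate generation by building a compact prefix tree (FP-tree) and mining it directly, which typically makes it faster than Apriori on large datasets. mlxtend.frequent_patterns.fpgrowth takes the same main arguments as apriori (a one-hot encoded DataFrame, min_support, use_colnames) and returns the same itemsets, so it can usually be swapped in without other changes.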