Gaussian vs. Multinomial Naive Bayes

Naive Bayes comes in several flavours, depending on the distribution assumed for the feature likelihoods. The most common are Gaussian (for continuous data) and Multinomial (for count data); Bernoulli (for binary data) is a third option.

Gaussian Naive Bayes

Gaussian NB assumes each feature follows a Gaussian (normal) distribution within each class. It estimates the mean and variance of each feature per class from training data and uses the Gaussian PDF to compute likelihoods.
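
The estimation and scoring steps described above can be sketched from scratch with NumPy. This is a minimal illustration, not how any particular library implements it: the function names `fit_gaussian_nb` and `predict_gaussian_nb`, the toy data, and the small variance floor of `1e-9` are all assumptions made for this example.

```python
import numpy as np

def fit_gaussian_nb(X, y):
    """Estimate per-class priors, feature means, and feature variances."""
    classes = np.unique(y)
    priors = np.array([np.mean(y == c) for c in classes])
    means = np.array([X[y == c].mean(axis=0) for c in classes])
    # Assumption of this sketch: a tiny constant keeps variances strictly positive.
    variances = np.array([X[y == c].var(axis=0) for c in classes]) + 1e-9
    return classes, priors, means, variances

def predict_gaussian_nb(X, classes, priors, means, variances):
    """Pick the class maximising log prior + summed log Gaussian densities."""
    # Broadcasting gives arrays of shape (n_samples, n_classes, n_features).
    diff = X[:, None, :] - means[None, :, :]
    log_pdf = -0.5 * (np.log(2 * np.pi * variances)[None, :, :]
                      + diff ** 2 / variances[None, :, :])
    # The "naive" independence assumption lets us sum over features.
    log_posterior = np.log(priors)[None, :] + log_pdf.sum(axis=2)
    return classes[np.argmax(log_posterior, axis=1)]

# Hypothetical toy data: two well-separated classes on two continuous features.
X_toy = np.array([[1.0, 2.0], [1.2, 1.8], [0.8, 2.2],
                  [5.0, 6.0], [5.2, 5.8], [4.8, 6.2]])
y_toy = np.array([0, 0, 0, 1, 1, 1])

params = fit_gaussian_nb(X_toy, y_toy)
preds = predict_gaussian_nb(np.array([[1.1, 2.1], [5.1, 5.9]]), *params)
print(preds)
```

Working in log space avoids numerical underflow when many small densities are multiplied together.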

Gaussian NB in Practice

<pre><code class="language-python">from sklearn.naive_bayes import GaussianNB
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

gnb = GaussianNB().fit(X_tr, y_tr)
print(f"Gaussian NB Accuracy: {gnb.score(X_te, y_te):.4f}")</code></pre>

Multinomial Naive Bayes

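Multinomial NB models features as counts (e.g., word frequencies in a document), so it requires non-negative inputs. It is the standard choice for text classification with bag-of-words features. Although the multinomial model formally assumes integer counts, fractional weights such as TF-IDF often work well in practice.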
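
To make the count-based likelihood concrete, here is a rough NumPy sketch of how a multinomial model might score documents. The vocabulary, the count matrix, and the helper name `predict` are hypothetical, and the additive smoothing term `alpha = 1.0` is an assumption added so that words unseen in a class do not produce `log(0)`.

```python
import numpy as np

# Hypothetical word-count matrix: rows are documents, columns are vocabulary terms.
vocab = ["free", "money", "meet", "lunch"]
X_counts = np.array([[2, 1, 0, 0],   # spam
                     [1, 2, 0, 0],   # spam
                     [0, 0, 1, 1],   # ham
                     [0, 0, 2, 1]])  # ham
y = np.array([1, 1, 0, 0])           # 1 = spam, 0 = ham

alpha = 1.0  # assumed additive smoothing constant
classes = np.unique(y)
log_prior = np.log(np.array([np.mean(y == c) for c in classes]))

# Total count of each word per class, smoothed and normalised to probabilities.
word_counts = np.array([X_counts[y == c].sum(axis=0) for c in classes]) + alpha
log_word_prob = np.log(word_counts / word_counts.sum(axis=1, keepdims=True))

def predict(X_new):
    # Each word contributes its log probability once per occurrence.
    scores = log_prior + X_new @ log_word_prob.T
    return classes[np.argmax(scores, axis=1)]

new_docs = np.array([[1, 1, 0, 0],   # counts for "free money"
                     [0, 0, 1, 1]])  # counts for "meet lunch"
preds = predict(new_docs)
print(preds)
```

Because a word's log probability is multiplied by its count, repeated words pull the score further towards the class in which they are frequent.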

Multinomial NB for Text

<pre><code class="language-python">from sklearn.naive_bayes import MultinomialNB
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.pipeline import make_pipeline

docs = ["free money now", "meet me tomorrow", "win cash prize", "lunch at noon"]
labels = [1, 0, 1, 0]  # 1 = spam, 0 = ham

model = make_pipeline(CountVectorizer(), MultinomialNB())
model.fit(docs, labels)
print(model.predict(["win free money", "see you tomorrow"]))</code></pre>

Bernoulli NB: A Third Option

Bernoulli NB is designed for binary feature vectors (word presence/absence rather than counts). Unlike Multinomial NB, which simply ignores words that do not appear, it explicitly penalises the non-occurrence of a word, which can improve performance on short documents where absence is informative.
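
A minimal sketch of Bernoulli NB on text, assuming scikit-learn's `BernoulliNB` and reusing the same toy spam/ham documents as the Multinomial example. Passing `binary=True` to `CountVectorizer` is a choice made here to make the presence/absence encoding explicit.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import BernoulliNB
from sklearn.pipeline import make_pipeline

docs = ["free money now", "meet me tomorrow", "win cash prize", "lunch at noon"]
labels = [1, 0, 1, 0]  # 1 = spam, 0 = ham

# binary=True records only whether each word appears, not how often.
model = make_pipeline(CountVectorizer(binary=True), BernoulliNB())
model.fit(docs, labels)

preds = model.predict(["win free money", "see you tomorrow"])
print(preds)
```

With only four training documents this is purely illustrative; on real data it is worth comparing Bernoulli and Multinomial NB on a held-out set before choosing one.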