Naïve Bayes

Naive Bayes is a machine learning technique designed to solve classification problems. It relies on Bayes’ theorem, which is used to determine conditional probability of events. Naive Bayes assumes that all features used for classification are independent of each other.

\[P(C_k|X) = \frac{P(C_k) \cdot P(X|C_k)}{P(X)}\]

In the formula, we recognize variables \(C_k\) as a class, and \(X\) as a feature. \(P(C_k|X)\) is the posterior probability. \(P(C_k)\) is the class prior probability. \(P(X|C_k)\) is the likelihood of observing features \(X\) given \(C_k\) is true. \(P(X)\) is the feature prior probability. This formula calculates the probability of a class given a feature set.

Objective

The fundamental goals of the Naive Bayes algorithm are classifying data into distinct categories and delivering probabilistic predictions. This is accomplished by assigning class labels to data points grounded in the likelihood of each point belonging to a particular class, incorporating informed decision-making based on past knowledge. Naive Bayes exhibits versatility in accommodating various feature types, including text, continuous variables, or binary data. It also employs Laplace smoothing as a safeguard against the occurrence of zero probabilities in classification tasks.

Variants of Naive Bayes

Gaussian Naive Bayes
- Suitable for continuous data with a Gaussian (normal) distribution. It is commonly used in real-valued datasets where features are assumed to be continuous and follow a bell-shaped curve.
- Example: Medical diagnosis, where attributes such as blood pressure, height, or weight are measured continuously and assumed to be normally distributed.

Multinomial Naive Bayes
- Fitting for text and categorical data. Common in natural language processing (NLP) for text classification tasks.
- Example: Spam email detection, sentiment analysis, and document categorization

Bernoulli Naive Bayes:
- Appropriate for binary data, where features are binary variables that represent the presence or absence of certain attributes.
- Example: Facial recognition systems, where the presence or absence of specific facial features is crucial for classification.