Beyond Independence: Conditions for the Optimality of the Simple Bayesian Classifier

Pedro Domingos, Michael Pazzani
The simple Bayesian classifier (SBC) is often assumed to require attribute independence given the class for optimal performance, but this paper shows that the SBC can be optimal even when this assumption is violated. The key insight is that correct classification can be achieved even if the probability estimates are inaccurate. The paper demonstrates that the previously assumed region of optimality for the SBC is much smaller than the actual region. It derives necessary and sufficient conditions for the SBC's optimality, showing that it can be optimal for learning arbitrary conjunctions and disjunctions, even though these concepts violate the independence assumption. Empirical evidence shows that the SBC performs competitively in domains with significant attribute dependence. The paper also provides a simple example illustrating that the SBC can coincide with the optimal classifier even when the attributes are strongly dependent given the class. It further shows that the SBC is locally optimal for a significant portion of the probability space, and globally optimal for certain concept classes: it is globally optimal for linearly separable problems in symbolic domains, but not for all linearly separable concepts. The paper concludes that the SBC has a broader range of applicability than previously thought, and that its good performance is not contingent on the independence assumption holding. The results suggest that the SBC is a valuable classifier for many real-world problems.
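The conjunction result can be illustrated with a minimal sketch (not code from the paper; all names are hypothetical). Below, the target concept is a conjunction of three Boolean attributes, so the attributes are fully dependent given the positive class, yet a naive Bayes classifier fit with plain maximum-likelihood estimates still labels every example correctly:

```python
from itertools import product

# Target concept: x1 AND x2 AND x3 over three Boolean attributes.
# Given the positive class, every attribute is 1, so the attributes are
# strongly dependent given the class -- yet naive Bayes classifies all
# examples correctly, showing that accurate probability estimates are
# not required for correct classification.
examples = list(product([0, 1], repeat=3))
labels = [int(all(x)) for x in examples]  # 1 only for (1, 1, 1)

def fit(examples, labels):
    """Maximum-likelihood naive Bayes estimates from the full enumeration."""
    model = {}
    for c in (0, 1):
        idx = [i for i, y in enumerate(labels) if y == c]
        prior = len(idx) / len(labels)
        # P(x_j = 1 | class = c), estimated independently per attribute
        cond = [sum(examples[i][j] for i in idx) / len(idx) for j in range(3)]
        model[c] = (prior, cond)
    return model

def predict(model, x):
    def score(c):
        prior, cond = model[c]
        s = prior
        for j, v in enumerate(x):
            s *= cond[j] if v else 1.0 - cond[j]
        return s
    return max((0, 1), key=score)

model = fit(examples, labels)
preds = [predict(model, x) for x in examples]
print(preds == labels)  # True: naive Bayes is optimal on this conjunction
```

The class-conditional estimates here are quite inaccurate as joint probabilities (the independence factorization is wrong for this concept), but the induced decision boundary still matches the true concept, which is exactly the distinction the paper draws between estimation accuracy and classification optimality.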