The Evidence Framework Applied to Classification Networks

1992 | David J. C. MacKay
The paper presents three Bayesian ideas for supervised adaptive classifiers. First, it argues that the output of a classifier should be obtained by marginalizing over the posterior distribution of the parameters, and it proposes an approximation to this integral: "moderating" the outputs of the most probable classifier, which improves performance. Second, it demonstrates that the Bayesian framework for model comparison can be applied to classification problems, successfully choosing the magnitude of weight decay terms and ranking solutions with different numbers of hidden units. Third, an information-based data selection criterion is derived and demonstrated within this framework.

The paper also discusses the validity of the approximations, the importance of moderating classifier outputs, and the evaluation of the evidence for model comparison. Additionally, it explores active learning strategies, including the mean marginal information gain objective function, which measures the expected informativeness of a datum. The paper concludes with a discussion of the applicability of these ideas and open questions.
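As a rough illustration of the moderation idea: under a Gaussian (Laplace) approximation to the posterior, the activation a of a logistic output unit is uncertain with some variance s². Marginalizing the sigmoid over this uncertainty pulls the output toward 0.5, and MacKay uses a closed-form approximation of the form σ(κ(s)·a) with κ(s) = (1 + πs²/8)^(−1/2). The sketch below (function names are illustrative, not from the paper) shows this behavior:

```python
import math

def sigmoid(a):
    """Standard logistic output of the most probable classifier."""
    return 1.0 / (1.0 + math.exp(-a))

def moderated_output(a, s2):
    """Approximate the marginalized ("moderated") classifier output.

    a  : activation of the most probable network
    s2 : variance of the activation under the Gaussian posterior approximation

    Uses the approximation  integral sigma(a') N(a'; a, s2) da'  ~  sigma(kappa * a)
    with kappa = 1 / sqrt(1 + pi * s2 / 8).
    """
    kappa = 1.0 / math.sqrt(1.0 + math.pi * s2 / 8.0)
    return sigmoid(kappa * a)

# With no parameter uncertainty (s2 = 0), moderation changes nothing;
# with large uncertainty, the prediction is pulled toward 0.5, which is
# why moderated outputs are better calibrated far from the training data.
print(moderated_output(2.0, 0.0))   # identical to sigmoid(2.0)
print(moderated_output(2.0, 10.0))  # closer to 0.5 than sigmoid(2.0)
```

Moderation leaves the decision boundary (a = 0) unchanged but softens the confidence of each prediction in proportion to the posterior uncertainty in the weights.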