Divergence Measures Based on the Shannon Entropy

January 1991 | Jianhua Lin, Member, IEEE
A new class of information-theoretic divergence measures based on the Shannon entropy is introduced. Unlike the well-known Kullback-Leibler divergences, these new measures do not require the condition of absolute continuity between the probability distributions involved. They are closely related to the variational distance and the probability of misclassification error, and the bounds established between them are crucial in many applications. The new measures are characterized by nonnegativity, finiteness, semiboundedness, and boundedness.

The paper discusses the limitations of existing divergence measures, such as the I and J divergences, which require absolute continuity and cannot provide bounds for the variational distance and the Bayes probability of error. A new directed divergence measure is introduced that overcomes these limitations; it is closely related to the I divergence and has desirable properties such as nonnegativity and boundedness. A symmetric form of this divergence, the L divergence, is also defined and compared with the I and J divergences. The Jensen-Shannon divergence is then introduced as a generalization of the L divergence.
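As a brief illustration of how the new directed divergence sidesteps the absolute-continuity requirement, the sketch below (in Python; the function names are ours, not the paper's) compares the Kullback-Leibler I divergence with the directed divergence K(p, q) = I(p || (p + q)/2) and its symmetric form L(p, q) = K(p, q) + K(q, p):

```python
import numpy as np

def kl_divergence(p, q):
    """Kullback-Leibler (I) divergence in bits; infinite when q(x) = 0 but p(x) > 0."""
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    support = p > 0
    if np.any(q[support] == 0):
        return np.inf
    return float(np.sum(p[support] * np.log2(p[support] / q[support])))

def k_divergence(p, q):
    """Directed divergence K(p, q) = I(p || (p + q)/2)."""
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    return kl_divergence(p, (p + q) / 2.0)

def l_divergence(p, q):
    """Symmetric L divergence: L(p, q) = K(p, q) + K(q, p)."""
    return k_divergence(p, q) + k_divergence(q, p)

# Two distributions that are not absolutely continuous with respect to each other.
p = [0.5, 0.5, 0.0]
q = [0.0, 0.5, 0.5]

print(kl_divergence(p, q))  # inf: the I divergence blows up
print(k_divergence(p, q))   # 0.5 bit, finite
print(l_divergence(p, q))   # 1.0 bit, finite and symmetric
```

Because the comparison distribution (p + q)/2 is positive wherever p is, K and L remain finite even when neither distribution is absolutely continuous with respect to the other.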
It allows different weights to be assigned to each probability distribution, making it suitable for decision problems where the weights could be prior probabilities. The Jensen-Shannon divergence provides both lower and upper bounds for the Bayes probability of misclassification error. It is further generalized to handle more than two probability distributions, making it useful for multiclass decision-making, and is related to the Jensen difference proposed by Rao in a different context.

The paper also discusses the properties of the new divergence measures, including their boundedness and their relationship to the Shannon entropy. It concludes that the new measures provide a unified definition and characterization of information-theoretic divergence measures based on the Shannon entropy. These measures have theoretical foundations and are useful in various applications, including signal processing, pattern recognition, and decision-making. The results presented fill a gap in the theoretical justification of these measures and provide a foundation for future applications.
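To make the weighted, multi-distribution case concrete, the following sketch (again Python, with illustrative distributions and function names of our own choosing) computes the generalized Jensen-Shannon divergence JS_pi(p_1, ..., p_n) = H(sum_i pi_i p_i) - sum_i pi_i H(p_i) and an equivocation-type upper bound on the Bayes error, P_e <= (1/2)(H(pi) - JS_pi), of the kind established in the paper:

```python
import numpy as np

def shannon_entropy(p):
    """Shannon entropy in bits, ignoring zero-probability outcomes."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]
    return float(-np.sum(p * np.log2(p)))

def generalized_js_divergence(dists, weights):
    """Generalized Jensen-Shannon divergence:
    JS_pi(p_1, ..., p_n) = H(sum_i pi_i p_i) - sum_i pi_i H(p_i),
    where the weights pi_i are nonnegative and sum to 1."""
    dists = np.asarray(dists, dtype=float)
    weights = np.asarray(weights, dtype=float)
    mixture = weights @ dists  # weighted mixture distribution
    mean_entropy = float(np.sum(weights * np.array([shannon_entropy(p) for p in dists])))
    return shannon_entropy(mixture) - mean_entropy

# Three class-conditional distributions with prior probabilities as weights.
dists = [
    [0.7, 0.2, 0.1],
    [0.1, 0.8, 0.1],
    [0.2, 0.2, 0.6],
]
priors = [0.5, 0.3, 0.2]

js = generalized_js_divergence(dists, priors)
print("Generalized JS divergence:", js)

# Equivocation-type upper bound on the Bayes probability of error:
# P_e <= 0.5 * (H(priors) - JS_pi).
print("Upper bound on Bayes error:", 0.5 * (shannon_entropy(priors) - js))
```

With equal weights and two distributions, this quantity reduces to half of the L divergence from the previous sketch.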