Symbolic Data Analysis (SDA) is an umbrella term for data analysis methods that target symbolic data. It was introduced by E. Diday in 1988 at the First Conference of the International Federation of Classification Societies (IFCS). Since then, its theoretical framework has matured, primarily in Europe, and numerous algorithms have been developed.
Symbolic data are defined as the values taken by symbolic variables. Whereas a traditional variable takes a single quantitative or categorical value for each observation, a symbolic variable can take multivalued, interval, or modal values, the last of which may carry probability or uncertainty weights. Traditional data are therefore special cases of symbolic data, which makes SDA an extension of conventional data analysis methods. SDA encompasses descriptive statistics, multivariate analysis techniques, exploratory methods such as clustering, and applications to more complex data sets. Recent work highlights its suitability for real-world data and its advantages in handling large datasets, which has increased research interest in the field.
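The three kinds of symbolic value can be illustrated with a minimal Python sketch. The helper names (`to_interval`, `to_multivalued`, `to_modal`) are hypothetical and are not part of any SDA library. Building each symbolic value by summarizing a group of classical observations is an assumption made for illustration, not the only way such values arise.

```python
from collections import Counter

def to_interval(values):
    """Interval value: the range [min, max] covered by numeric observations."""
    return (min(values), max(values))

def to_multivalued(values):
    """Multivalued value: the set of distinct categories observed."""
    return frozenset(values)

def to_modal(values):
    """Modal value: each category paired with its relative frequency as a weight."""
    counts = Counter(values)
    n = len(values)
    return {category: count / n for category, count in counts.items()}

if __name__ == "__main__":
    ages = [23, 35, 41]                      # hypothetical classical data
    colors = ["red", "blue", "red", "red"]   # hypothetical classical data
    print(to_interval(ages))       # (23, 41)
    print(to_multivalued(colors))  # a frozenset containing 'red' and 'blue'
    print(to_modal(colors))        # {'red': 0.75, 'blue': 0.25}
```

A single classical observation is recovered as a degenerate case: `to_interval([30])` gives `(30, 30)`, and `to_modal(["red"])` gives `{"red": 1.0}`, which matches the statement that traditional data are special cases of symbolic data.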
The small world phenomenon, named after the surprise of discovering a common acquaintance with a stranger, was first explored by social psychologist S. Milgram in the 1960s. Milgram's study found that, on average, two people in the United States are connected through about five intermediaries. This finding sparked interest in social psychology. In 1998, D. Watts, who had recently obtained a Ph.D. in applied mathematics, characterized small worlds as graphs with two key features: a high clustering coefficient (C) and a short average path length (L). Watts demonstrated that randomly rewiring a small fraction of edges in a graph with high C can significantly reduce L, producing a small world. Since then, numerous studies have shown that many networks, including social, natural, and artificial networks, exhibit these characteristics. Small worlds are also noted for the rapid spread of trends and diseases across them and for their resilience against attacks, and they balance edge cost against transmission efficiency.
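The effect of random edge rewiring on C and L in a small world graph can be sketched with the Python standard library alone. This is an illustrative sketch under stated assumptions, not a reproduction of the original experiments: the starting graph is assumed to be a ring lattice, and the function names and the parameters `n`, `k`, and `p` (node count, neighbors per node, rewiring probability) are chosen here for the example.

```python
import random
from collections import deque

def ring_lattice(n, k):
    """Ring of n nodes, each linked to its k nearest neighbors (k even, n > k)."""
    adj = {i: set() for i in range(n)}
    for i in range(n):
        for j in range(1, k // 2 + 1):
            adj[i].add((i + j) % n)
            adj[(i + j) % n].add(i)
    return adj

def rewire(adj, k, p, rng):
    """With probability p, move the far end of each lattice edge to a random node."""
    n = len(adj)
    for j in range(1, k // 2 + 1):
        for i in range(n):
            old = (i + j) % n
            if old in adj[i] and rng.random() < p:
                # Avoid self-loops and duplicate edges.
                candidates = [v for v in range(n) if v != i and v not in adj[i]]
                if candidates:
                    new = rng.choice(candidates)
                    adj[i].discard(old)
                    adj[old].discard(i)
                    adj[i].add(new)
                    adj[new].add(i)
    return adj

def clustering_coefficient(adj):
    """C: mean over nodes of the fraction of neighbor pairs that are linked."""
    total = 0.0
    for nbrs in adj.values():
        d = len(nbrs)
        if d < 2:
            continue  # such nodes contribute 0
        links = sum(1 for a in nbrs for b in nbrs if a < b and b in adj[a])
        total += 2 * links / (d * (d - 1))
    return total / len(adj)

def average_path_length(adj):
    """L: mean shortest-path length over reachable node pairs (BFS per node)."""
    total = pairs = 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs

if __name__ == "__main__":
    rng = random.Random(0)
    lattice = ring_lattice(200, 6)
    print("lattice:", clustering_coefficient(lattice), average_path_length(lattice))
    small_world = rewire(ring_lattice(200, 6), 6, 0.05, rng)
    print("rewired:", clustering_coefficient(small_world), average_path_length(small_world))
```

Running the script prints C and L before and after rewiring. For a small rewiring probability, L is expected to fall sharply while C stays comparatively high, which is the combination that characterizes a small world.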