Understanding Mining Quantitative Association Rules in Large Relational Tables

This paper introduces the problem of mining quantitative association rules in large relational tables that contain both quantitative and categorical attributes. The authors handle quantitative attributes by partitioning their values into intervals and combining adjacent intervals as needed. They introduce a measure of partial completeness to quantify the information lost due to partitioning. Because a direct application of this technique can generate too many similar rules, they use a "greater-than-expected-value" interest measure to identify the interesting ones. The paper presents an algorithm for mining quantitative association rules and reports the results of applying it to a real-life dataset.

Earlier work treats association rule mining as finding associations between the "1" values in a relational table where all attributes are boolean (the Boolean Association Rules problem). However, relational tables in most business and scientific domains have richer attribute types, including quantitative and categorical attributes. The authors define the problem of mining association rules over such attributes in large relational tables, which they call the Quantitative Association Rules problem, and present techniques for discovering such rules.

The authors discuss the challenges of mapping the Quantitative Association Rules problem to the Boolean Association Rules problem, including the "MinSup" and "MinConf" problems: with many small intervals, the support of any single interval can fall below minimum support, while with few large intervals, information is lost and some rules no longer reach minimum confidence. To avoid the "MinSup" problem, they consider ranges over adjacent values or intervals of a quantitative attribute. Considering these ranges increases execution time (the "ExecTime" problem), so they introduce a user-specified "maximum support" parameter that restricts the extent to which adjacent values or intervals may be combined. The partial completeness measure helps decide whether to partition a quantitative attribute and how many partitions to use, and an interest measure based on deviation from expectation prunes out uninteresting rules.
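The idea of combining adjacent values under a "maximum support" cap can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `frequent_ranges`, the thresholds, and the sample ages are hypothetical, and each distinct value is treated as its own base interval for simplicity.

```python
from collections import Counter


def frequent_ranges(values, min_support, max_support):
    """Return (low, high, support) for ranges of adjacent values.

    A range is kept when its support reaches min_support. A range is
    extended to the right only while the combined support stays within
    max_support; a single value is still considered on its own even if
    its support exceeds max_support.
    """
    total = len(values)
    # (value, count) pairs in increasing value order
    counts = sorted(Counter(values).items())
    result = []
    for start in range(len(counts)):
        running = 0
        for end in range(start, len(counts)):
            running += counts[end][1]
            support = running / total
            if end > start and support > max_support:
                break  # combining further would exceed the cap
            if support >= min_support:
                result.append((counts[start][0], counts[end][0], support))
    return result


if __name__ == "__main__":
    ages = [23, 25, 29, 34, 38]  # hypothetical values of one attribute
    for low, high, support in frequent_ranges(ages, 0.4, 0.6):
        print(f"Age: {low}..{high}  support={support:.2f}")
```

Without the cap, every sufficiently wide range would qualify, which is one way to see why the number of candidate ranges, and with it the execution time, grows quickly.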
They describe an algorithm for discovering quantitative association rules that shares the basic structure of the algorithm for finding boolean association rules, with new computational details for generating candidates and counting their supports so that it can be implemented efficiently. The authors evaluate their approach on a real-life dataset with 7 attributes: 5 quantitative and 2 categorical. They find that the number of interesting rules decreases as the partial completeness level increases, and the percentage of rules pruned also decreases. The interest measure is shown to effectively identify interesting rules, with the percentage of rules identified as interesting decreasing as the interest level increases. The algorithm is expected to have near-linear scaleup, and the experimental results confirm this.
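A "greater-than-expected-value" test can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the paper's full definition: it checks support only, the function names `expected_support` and `is_interesting` are hypothetical, and the support figures are made up for the example.

```python
def expected_support(general_support, item_supports, general_item_supports):
    """Estimate the support of a specialized itemset from a generalization.

    The generalization's support is scaled, item by item, by the ratio of
    the specialized item's support to the generalized item's support.
    """
    expected = general_support
    for specific, general in zip(item_supports, general_item_supports):
        expected *= specific / general
    return expected


def is_interesting(actual_support, expected, interest_level):
    """Keep an itemset only if its support is at least interest_level
    times the value expected from its generalization."""
    return actual_support >= interest_level * expected


if __name__ == "__main__":
    # Hypothetical numbers: <Age: 20..40> and <Cars: 1..2> has support 0.08.
    # Narrowing Age to 20..30 halves the Age support (0.5 -> 0.25), so the
    # specialized itemset is expected to have half the support.
    exp = expected_support(0.08, [0.25, 0.3], [0.5, 0.3])
    print(exp, is_interesting(0.06, exp, 1.1))
```

A specialized rule whose support merely tracks what its more general form predicts adds no information, so a test of this kind discards it.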

Mining Quantitative Association Rules in Large Relational Tables

1996 | Ramakrishnan Srikant*, Rakesh Agrawal