Abstract
In this paper, we present an algebraic specification for association rule queries that can form the foundation for integrating data mining and database management. We first define a set of nested algebraic operators needed to specify association rule queries. Association rule discovery is then expressed as a query tree of these operators. We illustrate the expressiveness of the algebra by specifying several variants of association rule queries as query trees; other variants discussed in the literature can be represented in the same way. Constrained association queries (CAQs) have been proposed by researchers to limit the number of rules discovered, and we discuss how CAQs are represented in the algebra. Certain sequences of algebraic operators occur together in most of the query variants. We combine these sequences into modules to simplify the presentation of query trees. While the focus of the paper is the algebraic specification of association rule queries, we briefly discuss the optimization issues that arise in implementing the algebra for association rule mining. Grouping the algebraic operators into modules facilitates the use of existing association rule algorithms in query optimization.
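As an informal illustration of association rule discovery as a composition of operators, the following Python sketch chains four steps: enumerating itemsets per transaction, grouping and counting, selecting by minimum support, and deriving rules by minimum confidence. It is not the algebra defined in the paper; every function name here (`subsets`, `group_count`, `select_frequent`, `derive_rules`, `association_rules`) and the `max_size` parameter are assumptions made for this example only.

```python
from collections import Counter
from itertools import combinations


def subsets(transaction, max_size):
    """Hypothetical 'powerset' step: non-empty itemsets of up to max_size items."""
    items = sorted(transaction)
    for k in range(1, max_size + 1):
        for combo in combinations(items, k):
            yield frozenset(combo)


def group_count(transactions, max_size):
    """Hypothetical 'group and count' step: occurrences of each itemset."""
    counts = Counter()
    for transaction in transactions:
        counts.update(subsets(transaction, max_size))
    return counts


def select_frequent(counts, n_transactions, min_support):
    """Hypothetical 'selection' step: keep itemsets meeting min_support."""
    return {
        itemset: count / n_transactions
        for itemset, count in counts.items()
        if count / n_transactions >= min_support
    }


def derive_rules(frequent, min_confidence):
    """Hypothetical 'join' step: pair each frequent itemset with its subsets.

    Every subset of a frequent itemset is itself frequent, so the lookup
    frequent[lhs] always succeeds.
    """
    rules = []
    for itemset, support in frequent.items():
        for k in range(1, len(itemset)):
            for combo in combinations(sorted(itemset), k):
                lhs = frozenset(combo)
                confidence = support / frequent[lhs]
                if confidence >= min_confidence:
                    rules.append((lhs, itemset - lhs, support, confidence))
    return rules


def association_rules(transactions, min_support, min_confidence, max_size=3):
    """Compose the steps above, read bottom-up like a small query tree."""
    counts = group_count(transactions, max_size)
    frequent = select_frequent(counts, len(transactions), min_support)
    return derive_rules(frequent, min_confidence)
```

In a sketch like this, a constrained association query could plausibly be modeled as one more selection applied to the itemsets or to the resulting rules, and the counting and selection steps together play the role of a module that an existing frequent-itemset algorithm could replace.
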
Original language | English |
---|---|
Pages (from-to) | 77-87 |
Number of pages | 11 |
Journal | Proceedings of SPIE - The International Society for Optical Engineering |
Volume | 4730 |
Publication status | Published - 2002 |
Event | Data Mining and Knowledge Discovery: Theory, Tools, and Technology IV - Orlando, FL, United States; duration: 1 Apr 2002 → 4 Apr 2002 |
Keywords
- Algebraic specification
- Association rule queries
- Constrained association queries
- Nested relational algebra
- Query trees