CS 68191 Masters Seminar / CS 89191 Doctoral Seminar
Spring 2007
Masters Student Presentation
Cost-Based Query Optimization of Frequent Itemset Mining on Multiple Databases
Abdulkareem Alali
Mining frequent patterns across multiple datasets has recently attracted
considerable research interest. We investigate cost-based query
optimization approaches to efficiently evaluate such mining
tasks. Specifically: 1) We present a rich class of queries on mining
frequent itemsets across multiple datasets supported by a SQL-based
mechanism. 2) We present an approach to enumerate all possible query
plans for the mining queries, and develop a dynamic programming
approach and a branch-and-bound approach based on the enumeration
algorithm to find optimal query plans with the least mining cost. 3)
We introduce models to estimate the cost of individual mining
operators. 4) We evaluate our query optimization techniques on both
real and synthetic datasets and show significant performance
improvements.
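
As a rough illustration of the kind of query the abstract refers to, the
sketch below mines frequent itemsets from two small datasets and then
combines the results. It is a naive evaluation under assumed toy data, not
the optimizer or the SQL-based mechanism presented in the talk; the function
name frequent_itemsets, the datasets A and B, and the 0.5 support threshold
are all hypothetical.

```python
from itertools import combinations

def frequent_itemsets(transactions, min_support):
    """Return {itemset: support} for itemsets whose support fraction
    is at least min_support. Naive level-wise search (illustrative only)."""
    n = len(transactions)
    items = sorted({item for t in transactions for item in t})
    result = {}
    for k in range(1, len(items) + 1):
        found_any = False
        for candidate in combinations(items, k):
            support = sum(1 for t in transactions if set(candidate) <= t) / n
            if support >= min_support:
                result[frozenset(candidate)] = support
                found_any = True
        if not found_any:
            # No frequent itemset of size k means none of size k+1 either.
            break
    return result

# Hypothetical toy datasets (each transaction is a set of items).
A = [{"a", "b"}, {"a", "b", "c"}, {"a", "c"}, {"b"}]
B = [{"a", "b"}, {"a"}, {"a", "b", "c"}, {"c"}]

freq_a = frequent_itemsets(A, 0.5)
freq_b = frequent_itemsets(B, 0.5)

# Query 1: itemsets frequent in both A and B.
frequent_in_both = set(freq_a) & set(freq_b)

# Query 2: itemsets frequent in A but not in B.
frequent_only_in_a = set(freq_a) - set(freq_b)
```

Here each dataset is mined in full before the results are combined. A
different evaluation order, for example mining A first and checking only its
frequent itemsets against B, answers the same query at a different cost,
which is the kind of choice a cost-based optimizer would weigh.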