Recommended Readings


Association Rules

  1. R. Agrawal, T. Imielinski, and A. Swami. Mining association rules between sets of items in large databases. SIGMOD, 207-216, 1993.

  2. R. Agrawal and R. Srikant. Fast algorithms for mining association rules. VLDB, 487-499, 1994.

  3. S. Brin, R. Motwani, J. D. Ullman, and S. Tsur. Dynamic itemset counting and implication rules for market basket analysis. SIGMOD, 255-264, 1997.

  4. J. S. Park, M. S. Chen, and P. S. Yu. An effective hash-based algorithm for mining association rules. SIGMOD, 175-186, 1995.

  5. A. Savasere, E. Omiecinski, and S. Navathe. An efficient algorithm for mining association rules in large databases. VLDB, 432-444, 1995.

  6. M. J. Zaki, S. Parthasarathy, M. Ogihara, and W. Li. Parallel algorithm for discovery of association rules. Data Mining and Knowledge Discovery, 1:343-374, 1997.

  7. J. Han, J. Pei, and Y. Yin. Mining frequent patterns without candidate generation. SIGMOD, 1-12, 2000.

  8. R. J. Bayardo. Efficiently mining long patterns from databases. SIGMOD, 85-93, 1998.

  9. N. Pasquier, Y. Bastide, R. Taouil, and L. Lakhal. Discovering frequent closed itemsets for association rules. ICDT, 398-416, 1999.

Scalable Decision Tree Construction

  1. M. Mehta, R. Agrawal, J. Rissanen, SLIQ: A Fast Scalable Classifier for Data Mining, EDBT 1996.
  2. J. Shafer, R. Agrawal, M. Mehta, SPRINT: A Scalable Parallel Classifier for Data Mining, VLDB 1996.
  3. Khaled Alsabti, Sanjay Ranka, Vineet Singh, CLOUDS: A Decision Tree Classifier for Large Datasets, KDD 1998.
  4. Johannes Gehrke, Raghu Ramakrishnan, Venkatesh Ganti, RainForest - A Framework for Fast Decision Tree Construction of Large Datasets, VLDB 1998.
  5. Johannes Gehrke, Venkatesh Ganti, Raghu Ramakrishnan, Wei-Yin Loh, BOAT - Optimistic Decision Tree Construction, SIGMOD 1999.
  6. Ruoming Jin, Gagan Agrawal, Communication and Memory Efficient Parallel Decision Tree Construction, SDM 2003.

Clustering

  1. P. Berkhin, Survey of clustering data mining techniques, 2002.

  2. Ruoming Jin, Anjan Goswami, and Gagan Agrawal, Fast and Exact Out-of-Core and Distributed K-Means Clustering, Knowledge and Information Systems,
    Volume 10, Issue 1 (July 2006). (Early version in ICDM, 2004.)

  3. R. Ng and J. Han. Efficient and effective clustering method for spatial data mining. VLDB, 144-155, 1994.

  4. T. Zhang, R. Ramakrishnan, and M. Livny. BIRCH: an efficient data clustering method for very large databases. SIGMOD, 103-114, 1996.

  5. M. Ester, H.-P. Kriegel, J. Sander, and X. Xu. A density-based algorithm for discovering clusters in large spatial databases. KDD, 226-231, 1996.

  6. S. Guha, R. Rastogi, and K. Shim. CURE: an efficient clustering algorithm for large databases. SIGMOD, 73-84, 1998.

  7. W. Wang, J. Yang, and R. Muntz. STING: a statistical information grid approach to spatial data mining. VLDB, 186-195, 1997.

  8. G. Sheikholeslami, S. Chatterjee, and A. Zhang. WaveCluster: a multi-resolution clustering approach for very large spatial databases. VLDB, 428-439, 1998.