Title

WC-clustering: Hierarchical Clustering Using the Weighted Confidence Affinity Measure

Document Type

Conference Proceeding

Publication Date

10-2007

Abstract

Market-basket data analysis is an important problem that has been well addressed in the literature, especially in the context of finding associations among items in large groups of transactions. Recently, there have been many attempts to cluster market-basket data. However, most of these methods are partitional and require at least one input parameter (e.g., the minimum intra-cluster similarity or the desired number of clusters). In this paper, we propose WC-clustering, a hierarchical clustering approach using vertical data structures. To minimize the impact of low-support items, we devise a weighted confidence (WC) affinity function to calculate the similarity between clusters (or itemsets). Our experimental results show that WC-clustering produces much more compact results than Apriori and that the proposed weighted confidence affinity measure is more accurate than other contemporary affinity measures in the literature.
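
As a rough illustration of the two ingredients named in the abstract, the sketch below builds a vertical layout (each item mapped to the set of transaction ids that contain it) and computes an affinity between two itemsets from it. The abstract does not give the WC formula, so the function here is an assumption: it averages the two directional confidences, weighting each by the support of its antecedent, and the paper's actual definition may differ. The function names and the sample transactions are hypothetical.

```python
def to_vertical(transactions):
    """Map each item to the set of transaction ids (tidset) that contain it."""
    vertical = {}
    for tid, items in enumerate(transactions):
        for item in items:
            vertical.setdefault(item, set()).add(tid)
    return vertical


def tidset(vertical, itemset):
    """Ids of transactions containing every item in the itemset."""
    sets = [vertical.get(item, set()) for item in itemset]
    return set.intersection(*sets) if sets else set()


def weighted_confidence(vertical, a, b):
    """ASSUMED affinity, not taken from the paper: the support-weighted
    average of conf(a -> b) and conf(b -> a)."""
    ta, tb = tidset(vertical, a), tidset(vertical, b)
    if not ta or not tb:
        return 0.0
    both = len(ta & tb)
    conf_ab = both / len(ta)  # confidence of a -> b
    conf_ba = both / len(tb)  # confidence of b -> a
    return (len(ta) * conf_ab + len(tb) * conf_ba) / (len(ta) + len(tb))


# Hypothetical toy data, for illustration only.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]
vertical = to_vertical(transactions)
affinity = weighted_confidence(vertical, {"bread"}, {"butter"})
```

Under this assumed weighting, the direction whose antecedent has lower support contributes less to the score, which is one way a measure could dampen the influence of low-support items. A hierarchical method could then repeatedly merge the pair of clusters with the highest affinity, without asking for a cluster count up front.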