Mining high-utility itemsets with multiple minimum utility thresholds
JCW Lin, W Gan, P Fournier-Viger, TP Hong
Proceedings of the Eighth International C* Conference on Computer Science …, 2015 • dl.acm.org
High-utility itemset mining (HUIM) is an emerging topic in data mining. It consists of discovering high-utility itemsets (HUIs), i.e., groups of items (itemsets) that generate a high profit in transactional databases. Several algorithms have been proposed for this task. However, they share an important limitation: they rely on a single minimum utility threshold as the sole criterion for identifying HUIs. In this paper, we address this issue by introducing a novel framework, HUIM with multiple minimum utility thresholds (HUIM-MMU). Under this framework, the user may specify a different threshold for each item when discovering HUIs. To perform HUIM-MMU, we first present an algorithm named HUI-MMU, which relies on a new sorted downward closure (SDC) property and the least minimum utility threshold (LMU). Furthermore, an improved algorithm, HUI-MMUTID, is proposed based on a TID-index strategy to increase mining performance. Extensive experiments on both real-life and synthetic datasets show that the two proposed algorithms can efficiently and effectively discover the complete set of HUIs in transactional databases while considering multiple minimum utility thresholds.
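
To make the idea of per-item thresholds concrete, here is a minimal brute-force sketch in Python. It is not the HUI-MMU or HUI-MMUTID algorithm: it uses neither the SDC property nor a TID index and simply enumerates every itemset. The abstract does not say how an itemset's threshold is derived from the thresholds of its items, so the sketch assumes it is the smallest threshold among them. The toy database, the profit values, and all function names are hypothetical.

```python
from itertools import combinations

# Hypothetical toy database: each transaction maps item -> purchased quantity.
transactions = [
    {"a": 1, "b": 2},
    {"a": 2, "c": 1},
    {"b": 1, "c": 3},
    {"a": 1, "b": 1, "c": 1},
]
# Hypothetical unit profit and user-specified minimum utility per item.
unit_profit = {"a": 5, "b": 2, "c": 1}
min_utility = {"a": 20, "b": 10, "c": 12}


def utility(itemset, db, profit):
    """Sum of quantity * unit profit over all transactions containing the whole itemset."""
    total = 0
    for transaction in db:
        if all(item in transaction for item in itemset):
            total += sum(transaction[item] * profit[item] for item in itemset)
    return total


def itemset_threshold(itemset, thresholds):
    """ASSUMPTION: an itemset's threshold is the minimum of its items' thresholds."""
    return min(thresholds[item] for item in itemset)


def mine_brute_force(db, profit, thresholds):
    """Enumerate every non-empty itemset and keep those that meet their own threshold."""
    items = sorted(profit)
    result = {}
    for size in range(1, len(items) + 1):
        for itemset in combinations(items, size):
            u = utility(itemset, db, profit)
            if u >= itemset_threshold(itemset, thresholds):
                result[itemset] = u
    return result


if __name__ == "__main__":
    for itemset, u in mine_brute_force(transactions, unit_profit, min_utility).items():
        print(itemset, u)
```

On this toy data the sketch keeps {a}, {a, b} and {a, c}. The itemset {b} alone (utility 8) misses its threshold of 10, while its superset {a, b} (utility 16) passes. Utility is therefore not downward closed, which is why exhaustive enumeration is only viable for tiny inputs and why a pruning property such as the SDC mentioned in the abstract is needed in practice.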