Data Mining, Chapters 6 to 13
Ch. 6
Divide-and-conquer
A good open-source implementation and refinement
of FPGrowth
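The divide-and-conquer idea behind FPGrowth can be sketched without building the FP-tree itself: for each frequent item, recurse on the database projected onto the transactions that contain it. A minimal Python sketch (the function name, the toy transactions, and the plain list projection are illustrative assumptions; real FPGrowth compresses these projections into an FP-tree's conditional pattern bases):

```python
from collections import Counter

def pattern_growth(transactions, minsup, prefix=(), results=None):
    """Depth-first frequent-itemset mining in the divide-and-conquer
    spirit of FPGrowth: grow each frequent item into longer patterns by
    recursing on its conditional (projected) database."""
    if results is None:
        results = {}
    # Support count of each item in the current (conditional) database.
    counts = Counter(item for t in transactions for item in set(t))
    for item in sorted(i for i in counts if counts[i] >= minsup):
        pattern = prefix + (item,)
        results[pattern] = counts[item]
        # Conditional database for `item`: transactions containing it,
        # restricted to items that sort after it (avoids duplicates).
        projected = [[i for i in t if i > item]
                     for t in transactions if item in t]
        pattern_growth(projected, minsup, pattern, results)
    return results
```

For example, mining [['a','b'], ['b','c'], ['a','b','c'], ['b']] with minimum support 2 yields the frequent itemsets {a}, {b}, {c}, {a,b}, and {b,c}.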
Ch. 7
Ch. 8
15. Supervised vs. Unsupervised Learning?
Comparing classifiers:
Confidence intervals
Cost-benefit analysis and ROC Curves
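The cost-benefit trade-off shows up directly in the ROC curve, which plots the true-positive rate against the false-positive rate as the decision threshold sweeps down the ranked classifier scores. A minimal sketch (function names and toy scores are illustrative assumptions; tied scores get no special handling here):

```python
def roc_points(scores, labels):
    """Sweep the threshold down the ranked scores and record the
    (false-positive rate, true-positive rate) after each example."""
    pairs = sorted(zip(scores, labels), reverse=True)
    pos = sum(labels)
    neg = len(labels) - pos
    tp = fp = 0
    points = [(0.0, 0.0)]
    for _, y in pairs:
        if y == 1:
            tp += 1
        else:
            fp += 1
        points.append((fp / neg, tp / pos))
    return points

def auc(points):
    # Trapezoidal area under the ROC curve; 1.0 is a perfect ranking,
    # 0.5 is a random one.
    return sum((x2 - x1) * (y1 + y2) / 2
               for (x1, y1), (x2, y2) in zip(points, points[1:]))
```

A classifier that ranks every positive above every negative traces the curve through (0, 1) and scores an AUC of 1.0.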
Bootstrap
Works well with small data sets
Samples the given training tuples uniformly with
replacement
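Sampling with replacement is a one-liner to sketch; the tuples that are never drawn (about 36.8% of the data on average) fall out as a natural test set. A minimal sketch (the function name and the out-of-bag return value are illustrative assumptions):

```python
import random

def bootstrap_sample(data, seed=None):
    """Draw len(data) tuples uniformly *with replacement* as the
    training set, and return the never-drawn ("out-of-bag") tuples,
    which can serve as test data."""
    rng = random.Random(seed)
    n = len(data)
    picks = [rng.randrange(n) for _ in range(n)]
    chosen = set(picks)
    train = [data[i] for i in picks]
    oob = [data[i] for i in range(n) if i not in chosen]
    return train, oob
```

Repeating this draw many times and averaging the accuracies over the out-of-bag sets gives the bootstrap estimate; it is well suited to small data sets because every tuple gets used.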
Ch. 9
Ch. 10
Ch. 11
Strength
Mixture models are more general than
partitioning and fuzzy clustering
Clusters can be characterized by a small
number of parameters
The results may satisfy the statistical
assumptions of the generative models
Weakness
Converges to a local optimum (overcome: run
multiple times with random initialization)
Computationally expensive if the number of
distributions is large, or the data set contains
very few observed data points
Need large data sets
Hard to estimate the number of clusters
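The local-optimum weakness is handled exactly as the note says: run EM several times from random initializations and keep the run with the best log-likelihood. A minimal 1-D Gaussian-mixture EM sketch (the function name, the toy setup, and the variance floor are illustrative assumptions):

```python
import math
import random

def em_gmm_1d(data, k=2, iters=30, seed=0):
    """EM for a 1-D Gaussian mixture; returns the fitted parameters and
    the log-likelihood trace, which EM increases at every iteration."""
    rng = random.Random(seed)
    n = len(data)
    means = rng.sample(data, k)      # random restart: seed controls init
    var = [1.0] * k
    w = [1.0 / k] * k
    trace = []
    for _ in range(iters):
        # E-step: responsibility r[i][j] = P(component j | x_i).
        r, loglik = [], 0.0
        for x in data:
            dens = [w[j] * math.exp(-(x - means[j]) ** 2 / (2 * var[j]))
                    / math.sqrt(2 * math.pi * var[j]) for j in range(k)]
            s = sum(dens)
            loglik += math.log(s)
            r.append([d / s for d in dens])
        trace.append(loglik)
        # M-step: re-estimate weights, means, variances.
        for j in range(k):
            nj = sum(r[i][j] for i in range(n))
            w[j] = nj / n
            means[j] = sum(r[i][j] * data[i] for i in range(n)) / nj
            var[j] = sum(r[i][j] * (data[i] - means[j]) ** 2
                         for i in range(n)) / nj
            var[j] = max(var[j], 1e-6)   # guard against collapse
    return means, var, w, trace
```

To overcome local optima, call this with several seeds and keep the result whose final log-likelihood entry is largest; note each fit touches every data point every iteration, which is why cost grows with the number of distributions.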
Sophisticated graphs
High dimensionality
Sparsity
58. Two Approaches for Graph Clustering?
Use generic clustering methods for high-dimensional data
Designed specifically for clustering graphs
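The first approach, reusing a generic clusterer on high-dimensional data, can be sketched by embedding each node as its adjacency row and handing the rows to plain k-means. The toy two-triangle graph, the deterministic seed centroids, and the self-loop trick (so a node resembles its own neighborhood) are illustrative assumptions:

```python
def kmeans_rows(rows, centroids, iters=20):
    """Plain Lloyd's k-means over generic feature vectors."""
    for _ in range(iters):
        labels = [min(range(len(centroids)),
                      key=lambda j: sum((a - b) ** 2
                                        for a, b in zip(r, centroids[j])))
                  for r in rows]
        for j in range(len(centroids)):
            members = [r for r, l in zip(rows, labels) if l == j]
            if members:
                centroids[j] = [sum(col) / len(members)
                                for col in zip(*members)]
    return labels

def cluster_graph(n, edges, seeds):
    """Embed node i as row i of the adjacency matrix (plus a self-loop)
    and cluster the rows with k-means, seeded at the given nodes."""
    rows = [[1 if i == j else 0 for j in range(n)] for i in range(n)]
    for u, v in edges:
        rows[u][v] = rows[v][u] = 1
    centroids = [list(rows[s]) for s in seeds]
    return kmeans_rows(rows, centroids)
```

On two triangles joined by a single bridge edge, this recovers the two triangles as clusters; on a large sparse graph the rows become exactly the sparse, high-dimensional vectors the notes warn about.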
Ch. 12
Strength
Detect outliers without requiring any
labeled data
Work for many types of data
Clusters can be regarded as
summaries of the data
Once the clusters are obtained, need
only compare any object against the clusters
CREATED BY GHUASIN_RASHED
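The clustering-based idea can be sketched end to end: cluster the data, treat the centroids as the cluster summaries, and flag any object whose distance to its nearest summary is unusually large. The threshold rule (mean plus two standard deviations of those distances), the toy points, and the fixed initial centroids are illustrative assumptions:

```python
import math

def kmeans(points, centroids, iters=10):
    """Lloyd's k-means on 2-D points with given initial centroids."""
    for _ in range(iters):
        labels = [min(range(len(centroids)),
                      key=lambda j: math.dist(p, centroids[j]))
                  for p in points]
        for j in range(len(centroids)):
            members = [p for p, l in zip(points, labels) if l == j]
            if members:
                centroids[j] = tuple(sum(c) / len(members)
                                     for c in zip(*members))
    return centroids

def clustering_outliers(points, init_centroids, z=2.0):
    """Flag points far from every cluster summary: distance to the
    nearest centroid exceeds mean + z * stdev of all such distances."""
    centroids = kmeans(points, [tuple(c) for c in init_centroids])
    d = [min(math.dist(p, c) for c in centroids) for p in points]
    mean = sum(d) / len(d)
    std = math.sqrt(sum((x - mean) ** 2 for x in d) / len(d))
    return [i for i, x in enumerate(d) if x > mean + z * std]
```

No labeled data is needed, and after clustering each object is compared only against the handful of centroids rather than against every other object.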
Ch. 13