Fuzzy C-means
Summarized notes from Introduction to Data Mining (Int'l ed), Chapter 8.2
Unsupervised learning > clustering
-
assign a weight to each object and each cluster that indicates the degree to which the object belongs to the cluster => \(w_{ij}\) is the weight with which object \(x_i\) belongs to cluster \(C_j\)
-
additional constraints for fuzzy clusters:
- each weight is between 0 and 1
- sum of all weights for point \(x_1\) is 1
- all clusters contain at least one non-zero-weight point, but all points with weight 1
Pseudo code
1 2 3 4 5 6 7 |
|
- random initialization if often used as initial weights \(w_{ij}\)
- choice of initial centroids is relevant, similar to K-means
Algorithm will attempt to minimize the fuzzy version of SSE:
where \(x_i\) is some data point, \(c_j\) is centroid, \(m\) is all points in dataset, \(k\) is number of clusters, and \(p\) determines the influence of the weights and has a value between 1 and \(\infty\).
Considerations for choosing \(p\):
- \(p = 1\) fuzzy c-means behaves like traditional K-means
- when \(p\) increases all the cluster centroids approach the global centroid of all the data points
Fuzzy centroid is similar to traditional centroid except all points are considered with their respective weights, and the contribution of each point to the centroid is weighted by its membership degree.
Computing weight update:
If \(p > 2\), then exponent decreases the weight assigned to clusters that are close to the point. If \(p==1\) the membership weight goes to 1 for the closest cluster and to 0 for all the other clusters.
Analysis
- able to represent the degree to which each point belongs to any particular cluster
- other advantages similar to K-means
- computationally more intensive than K-means