Class | Ai4r::Clusterers::SingleLinkage |
In: | lib/ai4r/clusterers/single_linkage.rb |
Parent: | Clusterer |
Implementation of a hierarchical clusterer with single linkage (Everitt et al., 2001; Johnson, 1967; Jain and Dubes, 1988; Sneath, 1957). Hierarchical clusterers create one cluster per element and then progressively merge clusters until the required number of clusters is reached. With single linkage, the distance between two clusters is the distance between their two closest elements, one taken from each cluster.
D(cx, (ci U cj)) = min(D(cx, ci), D(cx, cj))
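The update rule above can be checked on a toy example. The following is a minimal sketch, not the library's code: the helper names `euclidean` and `single_linkage` are hypothetical, and plain Euclidean distance is assumed only for illustration.

```ruby
# Hypothetical helpers for illustration; not part of the Ai4r API.

# Euclidean distance between two numeric points (assumed metric).
def euclidean(a, b)
  Math.sqrt(a.zip(b).sum { |x, y| (x - y)**2 })
end

# Single-linkage distance: the smallest pairwise distance between
# a member of cluster_a and a member of cluster_b.
def single_linkage(cluster_a, cluster_b)
  cluster_a.product(cluster_b).map { |p, q| euclidean(p, q) }.min
end

cx = [[0.0, 0.0]]
ci = [[3.0, 0.0]]
cj = [[0.0, 4.0], [1.0, 0.0]]

# Distance to the merged cluster, computed directly from its members...
direct = single_linkage(cx, ci + cj)
# ...and through the update rule D(cx, (ci U cj)) = min(D(cx, ci), D(cx, cj)).
updated = [single_linkage(cx, ci), single_linkage(cx, cj)].min
```

Both values agree, which is why the clusterer can update distances after a merge without revisiting the individual elements.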
clusters | [R] | |
data_set | [R] | |
number_of_clusters | [R] |
Build a new clusterer, using the data examples found in data_set. Items will be grouped into "number_of_clusters" different clusters.
Create a partial distance matrix:
[
  [d(1,0)],
  [d(2,0)], [d(2,1)],
  [d(3,0)], [d(3,1)], [d(3,2)],
  ...
  [d(n-1,0)], [d(n-1,1)], [d(n-1,2)], ... , [d(n-1,n-2)]
]
where n is the number of data items in the data set.
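As a rough illustration of this layout, the sketch below builds the lower triangle as nested arrays. It assumes that row i-1 holds d(i,0) through d(i,i-1) and that the metric is squared Euclidean distance; the names `squared_distance` and `partial_distance_matrix` are hypothetical.

```ruby
# Hypothetical helpers for illustration; not part of the Ai4r API.

# Squared Euclidean distance (assumed metric for this sketch).
def squared_distance(a, b)
  a.zip(b).sum { |x, y| (x - y)**2 }
end

# Builds only the lower triangle of the distance matrix.
# Row i-1 holds d(i, 0) .. d(i, i-1), so item 0 has no row of its own.
def partial_distance_matrix(items)
  (1...items.length).map do |i|
    (0...i).map { |j| squared_distance(items[i], items[j]) }
  end
end

items  = [[0, 0], [3, 4], [6, 8]]
matrix = partial_distance_matrix(items)
```

Storing only the lower triangle keeps n(n-1)/2 entries instead of n squared, since d(i,j) = d(j,i) and d(i,i) = 0.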
Returns an array with the indexes of the two closest clusters => [index_cluster_a, index_cluster_b]
Returns the distance between cluster cx and the new cluster (ci U cj), using single linkage.
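One way to find the closest pair is a linear scan over the partial distance matrix. This sketch assumes the triangular layout in which row r stores the distances from cluster r+1 to clusters 0..r; the method name `closest_pair` is hypothetical.

```ruby
# Hypothetical helper for illustration; not part of the Ai4r API.
# Scans a lower-triangular matrix where matrix[r][c] = d(r + 1, c)
# and returns the cluster indexes with the smallest distance.
def closest_pair(matrix)
  best = nil
  pair = nil
  matrix.each_with_index do |row, r|
    row.each_with_index do |distance, c|
      if best.nil? || distance < best
        best = distance
        pair = [r + 1, c] # row r belongs to cluster r + 1
      end
    end
  end
  pair
end

matrix = [[25], [100, 25], [4, 49, 81]]
pair = closest_pair(matrix)
```

Here the smallest entry is 4, stored at row 2, column 0, so the pair is clusters 3 and 0. Ties are resolved in favor of the first entry encountered.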
cluster_a and cluster_b are removed from index_clusters, and a new cluster with all members of cluster_a and cluster_b is added. This modifies the index_clusters array in place.
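The merge step might look like the sketch below. It assumes each cluster is an array of data item indexes and that the merged cluster is appended at the end; the signature is an illustration, not necessarily the library's.

```ruby
# Illustrative sketch; signature and cluster representation are assumptions.
# Each cluster is an array of data item indexes.
def merge_clusters(index_a, index_b, index_clusters)
  merged = index_clusters[index_a] + index_clusters[index_b]
  low, high = [index_a, index_b].minmax
  # Delete the higher position first so the lower one does not shift.
  index_clusters.delete_at(high)
  index_clusters.delete_at(low)
  index_clusters << merged # mutates the caller's array
end

clusters = [[0], [1], [2], [3]]
merge_clusters(3, 0, clusters)
```

After the call, `clusters` holds the two untouched clusters followed by the merged one.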
Returns the distance between element data_item[index_a] and data_item[index_b] using the distance matrix
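Because only the lower triangle is stored, a lookup has to order the two indexes first. A minimal sketch, assuming matrix[i - 1][j] holds d(i, j) for i > j and that the distance from an item to itself is 0; `read_distance` is a hypothetical name.

```ruby
# Hypothetical helper for illustration; not part of the Ai4r API.
# Reads d(index_a, index_b) from a lower-triangular matrix where
# matrix[i - 1][j] = d(i, j) for i > j.
def read_distance(matrix, index_a, index_b)
  return 0 if index_a == index_b # no diagonal is stored
  low, high = [index_a, index_b].minmax
  matrix[high - 1][low]
end

matrix = [[25], [100, 25]]
d_02 = read_distance(matrix, 0, 2)
d_20 = read_distance(matrix, 2, 0)
```

The argument order does not matter, so callers can treat the matrix as symmetric.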
ci and cj are the indexes of the clusters that are going to be merged. We need to remove the distances from/to ci and cj, and add the distances from/to the new cluster (ci U cj).
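The update can be sketched as follows. For clarity this version rebuilds a new triangular matrix rather than editing one in place, which is a simplification and may differ from how the library does it. It assumes matrix[i - 1][j] = d(i, j) for i > j and that the merged cluster takes the last index; `read` and `merge_update` are hypothetical names.

```ruby
# Hypothetical helpers for illustration; not part of the Ai4r API.

# Symmetric lookup into a lower-triangular matrix.
def read(matrix, a, b)
  return 0 if a == b
  low, high = [a, b].minmax
  matrix[high - 1][low]
end

# Returns a new matrix without clusters ci and cj, with the merged
# cluster (ci U cj) appended as the last index.
def merge_update(matrix, ci, cj)
  cluster_count = matrix.length + 1
  keep = (0...cluster_count).to_a - [ci, cj]
  # Distances among surviving clusters, re-indexed in their new order.
  rebuilt = (1...keep.length).map do |i|
    (0...i).map { |j| read(matrix, keep[i], keep[j]) }
  end
  # Single linkage: D(cx, (ci U cj)) = min(D(cx, ci), D(cx, cj)).
  merged_row = keep.map { |cx| [read(matrix, cx, ci), read(matrix, cx, cj)].min }
  merged_row.empty? ? rebuilt : rebuilt + [merged_row]
end

matrix  = [[25], [100, 25], [4, 49, 81]]
updated = merge_update(matrix, 3, 0)
```

In this example clusters 1 and 2 survive as new indexes 0 and 1, and the last row holds their distances to the merged cluster.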