Class Ai4r::Clusterers::SingleLinkage
In: lib/ai4r/clusterers/single_linkage.rb
Parent: Clusterer

Implementation of a hierarchical clusterer with single linkage (Everitt et al., 2001; Johnson, 1967; Jain and Dubes, 1988; Sneath, 1957). Hierarchical clusterers create one cluster per element and then progressively merge clusters until the required number of clusters is reached. With single linkage, the distance between two clusters is the distance between their two closest elements, one taken from each cluster.

  D(cx, (ci U cj)) = min(D(cx, ci), D(cx, cj))
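The procedure described above can be sketched in a few lines of plain Ruby. This is a naive illustration only, not the library's implementation: the names single_linkage_sketch and euclidean are hypothetical, the Euclidean distance is an assumption, and all pairwise distances are recomputed on every pass instead of being cached in a distance matrix.

```ruby
# Naive single-linkage agglomerative clustering (illustrative sketch).
# `euclidean` and `single_linkage_sketch` are hypothetical names.

def euclidean(a, b)
  Math.sqrt(a.zip(b).sum { |x, y| (x - y)**2 })
end

def single_linkage_sketch(items, number_of_clusters)
  # Start with one cluster per element, stored as arrays of item indexes.
  clusters = (0...items.length).map { |i| [i] }

  while clusters.length > number_of_clusters
    # Find the pair of clusters whose closest members are nearest.
    best = nil
    clusters.each_index do |a|
      ((a + 1)...clusters.length).each do |b|
        d = clusters[a].product(clusters[b])
                       .map { |i, j| euclidean(items[i], items[j]) }
                       .min
        best = [d, a, b] if best.nil? || d < best[0]
      end
    end

    _, a, b = best
    merged = clusters[a] + clusters[b]
    clusters.delete_at(b) # b > a, so remove b first to keep a valid
    clusters.delete_at(a)
    clusters << merged
  end

  clusters
end

points = [[0, 0], [0, 1], [10, 10], [10, 11], [50, 50]]
result = single_linkage_sketch(points, 3)
```

Here result groups the item indexes 0 and 1, 2 and 3, and leaves 4 on its own.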

Methods

Attributes

clusters  [R] 
data_set  [R] 
number_of_clusters  [R] 

Public Class methods

Public Instance methods

Builds a new clusterer, using the data examples found in data_set. Items will be grouped into "number_of_clusters" different clusters.

Classifies the given data item, returning the cluster index it belongs to (0-based).
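One plausible way to classify a new item under single linkage is to pick the cluster whose nearest member is closest to the item. The sketch below assumes that rule. The helper closest_cluster_index and the squared_euclidean lambda are hypothetical, and clusters are plain arrays of items instead of library objects.

```ruby
# Hypothetical helper: returns the 0-based index of the cluster whose
# nearest member is closest to `item`.
def closest_cluster_index(item, clusters, distance)
  nearest = clusters.map do |cluster|
    cluster.map { |member| distance.call(item, member) }.min
  end
  nearest.index(nearest.min)
end

# Assumed distance function for the example.
squared_euclidean = ->(a, b) { a.zip(b).sum { |x, y| (x - y)**2 } }

clusters = [[[0, 0], [0, 1]], [[10, 10], [10, 11]]]
index = closest_cluster_index([9, 9], clusters, squared_euclidean)
```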

Protected Instance methods

Given an array of clusters of data item indexes, returns an array of clusters of data items.
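As a minimal sketch of this conversion, assuming data items are held in a plain array (the library may wrap each cluster in its own data set object), the hypothetical helper clusters_from_indexes looks each index up in turn:

```ruby
# Hypothetical helper: replaces each index with the data item it refers to.
def clusters_from_indexes(index_clusters, data_items)
  index_clusters.map { |indexes| data_items.values_at(*indexes) }
end

data_items = [[0, 0], [0, 1], [10, 10]]
item_clusters = clusters_from_indexes([[0, 1], [2]], data_items)
```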

Create a partial distance matrix:

  [
    [d(1,0)],
    [d(2,0), d(2,1)],
    [d(3,0), d(3,1), d(3,2)],
    ...
    [d(n-1,0), d(n-1,1), d(n-1,2), ... , d(n-1,n-2)]
  ]

where n is the number of data items in the data set
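A sketch of how such a lower-triangular matrix could be built, assuming that row i-1 is one array holding d(i,0) through d(i,i-1). The name partial_distance_matrix and the one-dimensional distance lambda are hypothetical.

```ruby
# Hypothetical helper: row (i - 1) holds the distances from item i to
# every item with a smaller index.
def partial_distance_matrix(items, distance)
  (1...items.length).map do |i|
    (0...i).map { |j| distance.call(items[i], items[j]) }
  end
end

items = [[0], [3], [7]]
matrix = partial_distance_matrix(items, ->(a, b) { (a[0] - b[0]).abs })
```

Storing only the entries below the diagonal works when the distance function is symmetric and d(i,i) is zero, and it roughly halves the memory a full n by n matrix would need.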

Returns [ [0], [1], [2], … , [n-1] ], where n is the number of data items in the data set.
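For illustration, the initial singleton clusters can be produced with a one-liner. The name initial_index_clusters is hypothetical.

```ruby
# Hypothetical helper: one single-element cluster per data item index.
def initial_index_clusters(n)
  Array.new(n) { |i| [i] }
end

index_clusters = initial_index_clusters(4)
```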

Returns an array with the indexes of the two closest clusters => [index_cluster_a, index_cluster_b]
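A sketch of this search, assuming the distance matrix is stored as lower-triangular rows where row r, column c holds the distance between clusters r + 1 and c. The name closest_pair is hypothetical.

```ruby
# Hypothetical helper: scans a lower-triangular matrix for the smallest
# distance and returns the two cluster indexes, larger index first.
def closest_pair(matrix)
  best = nil
  matrix.each_with_index do |row, r|
    row.each_with_index do |d, c|
      best = [d, r + 1, c] if best.nil? || d < best[0]
    end
  end
  best[1..2]
end

pair = closest_pair([[5], [2, 9]])
```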

Returns the distance between cluster cx and the new cluster (ci U cj), using single linkage.
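The single linkage rule reduces to taking the smaller of two known distances. The sketch below assumes a lookup callable that returns the current distance between two clusters. The names single_linkage_distance, lookup, and distances are hypothetical.

```ruby
# Hypothetical helper: D(cx, (ci U cj)) = min(D(cx, ci), D(cx, cj)).
def single_linkage_distance(cx, ci, cj, lookup)
  [lookup.call(cx, ci), lookup.call(cx, cj)].min
end

# Assumed example distances, keyed by sorted index pair.
distances = { [0, 1] => 3, [0, 2] => 7 }
lookup = ->(a, b) { distances.fetch([a, b].sort) }

d = single_linkage_distance(0, 1, 2, lookup)
```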

cluster_a and cluster_b are removed from the index clusters array, and a new cluster with all members of cluster_a and cluster_b is added. The index clusters array is modified in place.
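A sketch of the merge step on plain arrays. The name merge_index_clusters is hypothetical, and appending the merged cluster at the end of the array is an assumption.

```ruby
# Hypothetical helper: merges two clusters in place and appends the result.
def merge_index_clusters(index_a, index_b, index_clusters)
  first, second = [index_a, index_b].sort
  merged = index_clusters[first] + index_clusters[second]
  index_clusters.delete_at(second) # remove the larger index first so
  index_clusters.delete_at(first)  # the smaller one stays valid
  index_clusters << merged
end

index_clusters = [[0], [1], [2], [3]]
merge_index_clusters(2, 0, index_clusters)
```

Deleting the larger index first matters: removing the smaller one first would shift every later element down by one position.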

Returns the distance between elements data_item[index_a] and data_item[index_b], using the distance matrix.
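Because only the entries below the diagonal are stored, a lookup has to order the two indexes first. The sketch below assumes lower-triangular rows and treats the distance from an element to itself as 0. The name read_partial_matrix is hypothetical.

```ruby
# Hypothetical helper: symmetric lookup into a lower-triangular matrix
# where row (i - 1) holds d(i, 0) .. d(i, i - 1).
def read_partial_matrix(matrix, index_a, index_b)
  return 0 if index_a == index_b

  low, high = [index_a, index_b].minmax
  matrix[high - 1][low]
end

matrix = [[3], [7, 4]]
d = read_partial_matrix(matrix, 0, 2)
```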

ci and cj are the indexes of the clusters that are going to be merged. We need to remove the distances from/to ci and cj, and add the distances from/to the new cluster (ci U cj).
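The update can be sketched by building a fresh lower-triangular matrix. This assumes the merged cluster is placed last and the surviving clusters keep their relative order. The name updated_matrix is hypothetical, and the library may well update its matrix in place instead.

```ruby
# Hypothetical helper: returns the matrix after merging clusters ci and cj.
def updated_matrix(matrix, ci, cj)
  read = lambda do |a, b|
    low, high = [a, b].minmax
    matrix[high - 1][low]
  end

  n = matrix.length + 1
  kept = (0...n).to_a - [ci, cj] # old indexes of the surviving clusters

  # Distances among surviving clusters are copied unchanged.
  rows = (1...kept.length).map do |r|
    (0...r).map { |c| read.call(kept[r], kept[c]) }
  end

  # Last row: single linkage distance from the merged cluster to each survivor.
  rows << kept.map { |cx| [read.call(cx, ci), read.call(cx, cj)].min }
end

# Four points on a line at 0, 1, 5 and 6.
matrix = [[1], [5, 4], [6, 5, 1]]
new_matrix = updated_matrix(matrix, 1, 0)
```

After merging clusters 0 and 1, the remaining clusters are the old 2, the old 3, and the merged cluster, in that order.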
