java.lang.Object
  org.apache.commons.math3.stat.clustering.KMeansPlusPlusClusterer<T>
Type Parameters:
T - type of the points to cluster
public class KMeansPlusPlusClusterer<T extends Clusterable<T>>
Clustering algorithm based on the k-means++ algorithm of David Arthur and Sergei Vassilvitskii.
Nested Class Summary | |
---|---|
static class | KMeansPlusPlusClusterer.EmptyClusterStrategy - Strategies to use for replacing an empty cluster. |
Field Summary | |
---|---|
private KMeansPlusPlusClusterer.EmptyClusterStrategy | emptyStrategy - Selected strategy for empty clusters. |
private Random | random - Random generator for choosing initial centers. |
Constructor Summary | |
---|---|
KMeansPlusPlusClusterer(Random random) | Build a clusterer. |
KMeansPlusPlusClusterer(Random random, KMeansPlusPlusClusterer.EmptyClusterStrategy emptyStrategy) | Build a clusterer. |
Method Summary | ||
---|---|---|
private static <T extends Clusterable<T>> int | assignPointsToClusters(List<Cluster<T>> clusters, Collection<T> points, int[] assignments) | Adds the given points to the closest Cluster. |
private static <T extends Clusterable<T>> List<Cluster<T>> | chooseInitialCenters(Collection<T> points, int k, Random random) | Use K-means++ to choose the initial centers. |
List<Cluster<T>> | cluster(Collection<T> points, int k, int maxIterations) | Runs the K-means++ clustering algorithm. |
List<Cluster<T>> | cluster(Collection<T> points, int k, int numTrials, int maxIterationsPerTrial) | Runs the K-means++ clustering algorithm. |
private T | getFarthestPoint(Collection<Cluster<T>> clusters) | Get the point farthest from its cluster center. |
private static <T extends Clusterable<T>> int | getNearestCluster(Collection<Cluster<T>> clusters, T point) | Returns the nearest Cluster to the given point. |
private T | getPointFromLargestNumberCluster(Collection<Cluster<T>> clusters) | Get a random point from the Cluster with the largest number of points. |
private T | getPointFromLargestVarianceCluster(Collection<Cluster<T>> clusters) | Get a random point from the Cluster with the largest distance variance. |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Field Detail |
---|
private final Random random
private final KMeansPlusPlusClusterer.EmptyClusterStrategy emptyStrategy
Constructor Detail |
---|
public KMeansPlusPlusClusterer(Random random)
Build a clusterer. The default strategy for handling empty clusters that may appear during algorithm iterations is to split the cluster with the largest distance variance.
Parameters:
random - random generator to use for choosing initial centers
public KMeansPlusPlusClusterer(Random random, KMeansPlusPlusClusterer.EmptyClusterStrategy emptyStrategy)
Build a clusterer.
Parameters:
random - random generator to use for choosing initial centers
emptyStrategy - strategy to use for handling empty clusters that may appear during algorithm iterations
Method Detail |
---|
public List<Cluster<T>> cluster(Collection<T> points, int k, int numTrials, int maxIterationsPerTrial) throws MathIllegalArgumentException, ConvergenceException
Runs the K-means++ clustering algorithm.
Parameters:
points - the points to cluster
k - the number of clusters to split the data into
numTrials - number of trial runs
maxIterationsPerTrial - the maximum number of iterations to run the algorithm for at each trial run. If negative, no maximum will be used
Throws:
MathIllegalArgumentException - if the data points are null or the number of clusters is larger than the number of data points
ConvergenceException - if an empty cluster is encountered and the emptyStrategy is set to ERROR
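One way to picture the role of numTrials is a loop that runs several independent clusterings and keeps the best one. The page does not say how trials are compared, so the sketch below assumes the winner is the trial with the lowest total squared distance from each point to its nearest center. The class name, the `double[]` point representation, and the `runTrial` callback are all hypothetical and are not part of the Commons Math API.

```java
import java.util.function.IntFunction;

/** Illustrative sketch only; not the Commons Math implementation. */
final class BestOfTrialsSketch {

    /** Sum over all points of the squared distance to the nearest center. */
    static double cost(double[][] points, double[][] centers) {
        double total = 0.0;
        for (double[] p : points) {
            double best = Double.POSITIVE_INFINITY;
            for (double[] c : centers) {
                double sum = 0.0;
                for (int i = 0; i < p.length; i++) {
                    double d = p[i] - c[i];
                    sum += d * d;
                }
                best = Math.min(best, sum);
            }
            total += best;
        }
        return total;
    }

    /**
     * Runs numTrials trials and keeps the centers with the lowest cost.
     * runTrial is a hypothetical callback returning the centers found by trial t.
     */
    static double[][] bestOfTrials(double[][] points, int numTrials,
                                   IntFunction<double[][]> runTrial) {
        if (numTrials < 1) {
            throw new IllegalArgumentException("numTrials must be at least 1");
        }
        double[][] best = null;
        double bestCost = Double.POSITIVE_INFINITY;
        for (int t = 0; t < numTrials; t++) {
            double[][] centers = runTrial.apply(t);
            double c = cost(points, centers);
            // Assumed criterion: a lower total squared distance is better.
            if (best == null || c < bestCost) {
                bestCost = c;
                best = centers;
            }
        }
        return best;
    }
}
```

Because the initial centers are drawn at random, separate trials can end in different local optima, which is the usual motivation for running more than one.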
public List<Cluster<T>> cluster(Collection<T> points, int k, int maxIterations) throws MathIllegalArgumentException, ConvergenceException
Runs the K-means++ clustering algorithm.
Parameters:
points - the points to cluster
k - the number of clusters to split the data into
maxIterations - the maximum number of iterations to run the algorithm for. If negative, no maximum will be used
Throws:
MathIllegalArgumentException - if the data points are null or the number of clusters is larger than the number of data points
ConvergenceException - if an empty cluster is encountered and the emptyStrategy is set to ERROR
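The iteration cap and the empty-cluster error can be illustrated with a small, self-contained k-means loop. This is a sketch under stated assumptions, not the library's code: points are plain `double[]` arrays, the initial centers are passed in rather than chosen by k-means++, and an `IllegalStateException` stands in for ConvergenceException under the ERROR strategy. The class and method names are hypothetical.

```java
import java.util.Arrays;

/** Illustrative sketch only; not the Commons Math implementation. */
final class LloydLoopSketch {

    static double[][] iterate(double[][] points, double[][] initialCenters, int maxIterations) {
        int k = initialCenters.length;
        int dim = points[0].length;
        double[][] centers = new double[k][];
        for (int c = 0; c < k; c++) {
            centers[c] = initialCenters[c].clone();
        }
        int[] assignments = new int[points.length];
        Arrays.fill(assignments, -1);

        // A negative cap means "no maximum", as in the parameter description.
        int max = maxIterations < 0 ? Integer.MAX_VALUE : maxIterations;
        for (int iter = 0; iter < max; iter++) {
            // Assignment step: attach every point to its nearest center.
            int changed = 0;
            for (int i = 0; i < points.length; i++) {
                int best = 0;
                double bestDist = Double.POSITIVE_INFINITY;
                for (int c = 0; c < k; c++) {
                    double sum = 0.0;
                    for (int d = 0; d < dim; d++) {
                        double diff = points[i][d] - centers[c][d];
                        sum += diff * diff;
                    }
                    if (sum < bestDist) {
                        bestDist = sum;
                        best = c;
                    }
                }
                if (best != assignments[i]) {
                    assignments[i] = best;
                    changed++;
                }
            }
            if (changed == 0) {
                break; // no assignment changed: converged
            }
            // Update step: move each center to the mean of its points.
            double[][] sums = new double[k][dim];
            int[] counts = new int[k];
            for (int i = 0; i < points.length; i++) {
                counts[assignments[i]]++;
                for (int d = 0; d < dim; d++) {
                    sums[assignments[i]][d] += points[i][d];
                }
            }
            for (int c = 0; c < k; c++) {
                if (counts[c] == 0) {
                    // Stand-in for the ERROR strategy; other strategies would re-seed the cluster.
                    throw new IllegalStateException("empty cluster " + c);
                }
                for (int d = 0; d < dim; d++) {
                    centers[c][d] = sums[c][d] / counts[c];
                }
            }
        }
        return centers;
    }
}
```

For the one-dimensional points 0, 2, 10, 12 with initial centers 0 and 2, an uncapped run settles on centers 1 and 11, while a cap of 1 stops after the first update at 0 and 8.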
private static <T extends Clusterable<T>> int assignPointsToClusters(List<Cluster<T>> clusters, Collection<T> points, int[] assignments)
Adds the given points to the closest Cluster.
Type Parameters:
T - type of the points to cluster
Parameters:
clusters - the Clusters to add the points to
points - the points to add to the given Clusters
assignments - points assignments to clusters
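The assignment step can be sketched as follows. The page does not describe what the `int` return value means; this sketch assumes it counts the points whose cluster index changed since the previous pass, which is a common way to detect convergence. Points and centers are plain `double[]` arrays here, and the class and method names are hypothetical.

```java
/** Illustrative sketch only; not the Commons Math implementation. */
final class AssignmentSketch {

    /** Index of the center closest to the point (squared Euclidean distance). */
    static int nearestCenter(double[][] centers, double[] point) {
        int best = -1;
        double bestDist = Double.POSITIVE_INFINITY;
        for (int c = 0; c < centers.length; c++) {
            double sum = 0.0;
            for (int d = 0; d < point.length; d++) {
                double diff = point[d] - centers[c][d];
                sum += diff * diff;
            }
            if (sum < bestDist) {
                bestDist = sum;
                best = c;
            }
        }
        return best;
    }

    /**
     * Records the nearest center of each point in assignments and returns
     * how many entries changed (assumed meaning of the return value).
     */
    static int assignPoints(double[][] centers, double[][] points, int[] assignments) {
        int changed = 0;
        for (int i = 0; i < points.length; i++) {
            int nearest = nearestCenter(centers, points[i]);
            if (nearest != assignments[i]) {
                changed++;
            }
            assignments[i] = nearest;
        }
        return changed;
    }
}
```

Calling `assignPoints` twice with unchanged centers returns 0 the second time, which a caller could use as a stopping condition.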
private static <T extends Clusterable<T>> List<Cluster<T>> chooseInitialCenters(Collection<T> points, int k, Random random)
Use K-means++ to choose the initial centers.
Type Parameters:
T - type of the points to cluster
Parameters:
points - the points to choose the initial centers from
k - the number of centers to choose
random - random generator to use
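For readers unfamiliar with k-means++ seeding, the idea is to pick the first center uniformly at random and each later center with probability proportional to its squared distance from the nearest center already chosen. The following is a minimal sketch of that published technique on plain `double[]` points; it is not the library's code, and the class name and helper methods are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

/** Illustrative sketch only; not the Commons Math implementation. */
final class SeedingSketch {

    /** Squared Euclidean distance between two points of equal dimension. */
    static double squaredDistance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return sum;
    }

    /** Picks k centers: the first uniformly, the rest weighted by squared distance. */
    static List<double[]> chooseInitialCenters(List<double[]> points, int k, Random random) {
        if (k < 1 || k > points.size()) {
            throw new IllegalArgumentException("k must be between 1 and the number of points");
        }
        List<double[]> centers = new ArrayList<>();
        centers.add(points.get(random.nextInt(points.size())));

        // minDistSq[i] is the squared distance from point i to its nearest chosen center.
        double[] minDistSq = new double[points.size()];
        for (int i = 0; i < minDistSq.length; i++) {
            minDistSq[i] = squaredDistance(points.get(i), centers.get(0));
        }

        while (centers.size() < k) {
            double total = 0.0;
            for (double d : minDistSq) {
                total += d;
            }
            int chosen = -1;
            if (total > 0.0) {
                // Sample an index with probability minDistSq[i] / total.
                double r = random.nextDouble() * total;
                double cumulative = 0.0;
                for (int i = 0; i < minDistSq.length; i++) {
                    cumulative += minDistSq[i];
                    if (minDistSq[i] > 0.0 && cumulative >= r) {
                        chosen = i;
                        break;
                    }
                }
            }
            if (chosen < 0) {
                // Every point coincides with a chosen center: fall back to a uniform pick.
                chosen = random.nextInt(points.size());
            }
            double[] center = points.get(chosen);
            centers.add(center);
            for (int i = 0; i < minDistSq.length; i++) {
                minDistSq[i] = Math.min(minDistSq[i], squaredDistance(points.get(i), center));
            }
        }
        return centers;
    }
}
```

Points that already serve as centers have weight zero, so as long as the data contains enough distinct points, no point is selected twice.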
private T getPointFromLargestVarianceCluster(Collection<Cluster<T>> clusters) throws ConvergenceException
Get a random point from the Cluster with the largest distance variance.
Parameters:
clusters - the Clusters to search
Throws:
ConvergenceException - if clusters are all empty
private T getPointFromLargestNumberCluster(Collection<Cluster<T>> clusters) throws ConvergenceException
Get a random point from the Cluster with the largest number of points.
Parameters:
clusters - the Clusters to search
Throws:
ConvergenceException - if clusters are all empty
private T getFarthestPoint(Collection<Cluster<T>> clusters) throws ConvergenceException
Get the point farthest from its cluster center.
Parameters:
clusters - the Clusters to search
Throws:
ConvergenceException - if clusters are all empty
private static <T extends Clusterable<T>> int getNearestCluster(Collection<Cluster<T>> clusters, T point)
Returns the nearest Cluster to the given point.
Type Parameters:
T - type of the points to cluster
Parameters:
clusters - the Clusters to search
point - the point to find the nearest Cluster for
Returns:
the index of the nearest Cluster to the given point
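As an illustration of one empty-cluster strategy, the farthest-point lookup can be sketched like this. The sketch assumes that the selected point is removed from its current cluster so it can seed the empty one, which this page does not state, and it throws `IllegalStateException` where the documented method throws ConvergenceException. Clusters are modeled as lists of `double[]` with a parallel array of centers; all names are hypothetical.

```java
import java.util.List;

/** Illustrative sketch only; not the Commons Math implementation. */
final class FarthestPointSketch {

    /**
     * Finds the point with the largest squared distance to its own cluster
     * center, removes it from that cluster (an assumption), and returns it.
     */
    static double[] removeFarthestPoint(List<List<double[]>> clusters, double[][] centers) {
        double maxDist = Double.NEGATIVE_INFINITY;
        List<double[]> selectedCluster = null;
        int selectedIndex = -1;
        for (int c = 0; c < clusters.size(); c++) {
            List<double[]> members = clusters.get(c);
            for (int i = 0; i < members.size(); i++) {
                double[] p = members.get(i);
                double sum = 0.0;
                for (int d = 0; d < p.length; d++) {
                    double diff = p[d] - centers[c][d];
                    sum += diff * diff;
                }
                if (sum > maxDist) {
                    maxDist = sum;
                    selectedCluster = members;
                    selectedIndex = i;
                }
            }
        }
        if (selectedCluster == null) {
            // Mirrors the documented failure mode: nothing to pick from.
            throw new IllegalStateException("all clusters are empty");
        }
        return selectedCluster.remove(selectedIndex);
    }
}
```

The inner lists must be mutable (for example `ArrayList`) because the selected point is removed in place.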