k-means (Cluster Analysis)

FCS Express can perform cluster analysis using the k-means method.

 

Cluster analysis aims to group a set of objects/events in such a way that objects/events in the same group (i.e. a cluster) are more similar (in some sense or another) to each other than to those in other groups (i.e. the other clusters).

 

k-means is a partitioning-based clustering algorithm. It is an iterative process in which an initial partition of the data into k clusters is progressively improved. In simplified terms, given a pre-defined number of clusters (k), the algorithm:

 

-begins with an initial set of k cluster centers (i.e. the centroids).

-(re)assigns each object to its closest centroid.

-recalculates the centroids according to the new cluster memberships of the data points.

-repeats the last two steps until the clustering no longer changes (i.e. it has converged) or until the maximum number of iterations is reached.
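
The steps above can be sketched in a few lines of Python. This is only an illustrative sketch, not the FCS Express implementation: the function name kmeans, the max_iter parameter, and the use of Euclidean distance to decide which centroid is "closest" are all assumptions made for the example.

```python
import math


def kmeans(points, initial_centroids, max_iter=100):
    """Illustrative k-means on a list of equal-length numeric tuples.

    Euclidean distance is an assumption of this sketch.
    """
    # Step 1: begin with an initial set of k centroids.
    centroids = list(initial_centroids)
    k = len(centroids)
    labels = None
    for _ in range(max_iter):
        # Step 2: (re)assign each point to its closest centroid.
        new_labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        # Step 4: stop once the assignments no longer change.
        if new_labels == labels:
            break
        labels = new_labels
        # Step 3: recalculate each centroid as the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return centroids, labels


# Two well-separated groups of three events each.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
centroids, labels = kmeans(points, initial_centroids=[(0.0, 0.0), (0.0, 1.0)])
```

Note that both starting centroids in this example lie in the same group, yet the iterations still move one of them to the other group.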

 

The whole procedure above can be repeated several times, starting from a different set of centroids each time. When this is done, the best clustering is used as the final result.

 

Because k-means is very sensitive to the initial centroids, FCS Express provides the option to run k-means multiple times; it then automatically selects the best result as the final clustering.
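
The idea of running k-means several times and keeping the best run can be sketched as follows. This is a hedged illustration only: the text does not state how FCS Express scores a clustering, so the example assumes the commonly used within-cluster sum of squared distances (lower is better). The names lloyd, wcss, best_of_n, n_runs and seed are hypothetical.

```python
import math
import random


def lloyd(points, centroids, max_iter=100):
    """One k-means run from the given starting centroids."""
    centroids = list(centroids)
    k = len(centroids)
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:  # assignments are stable
            break
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return centroids, labels


def wcss(points, centroids, labels):
    """Within-cluster sum of squared distances (assumed quality score)."""
    return sum(math.dist(p, centroids[lab]) ** 2 for p, lab in zip(points, labels))


def best_of_n(points, k, n_runs=10, seed=0):
    """Repeat k-means from different random starts and keep the lowest score."""
    rng = random.Random(seed)
    best = None
    for _ in range(n_runs):
        start = rng.sample(points, k)  # k distinct data points as centroids
        centroids, labels = lloyd(points, start)
        score = wcss(points, centroids, labels)
        if best is None or score < best[0]:
            best = (score, centroids, labels)
    return best


points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
score, centroids, labels = best_of_n(points, k=2)
```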

 

Note: basic k-means clustering is a non-deterministic algorithm, because the initial centroids are typically chosen at random. This means that running the algorithm several times on the same data could give different results. To ensure consistent results, FCS Express performs k-means clustering using a deterministic method.
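
The text does not specify which deterministic method FCS Express uses. Purely as an illustration of where the randomness in basic k-means comes from, and of one common way to remove it, the sketch below picks the starting centroids with a seeded random number generator. The function name pick_initial_centroids and the seed parameter are hypothetical.

```python
import random


def pick_initial_centroids(points, k, seed=None):
    """Choose k distinct data points as starting centroids.

    With seed=None the choice varies between runs; with a fixed seed
    it is reproducible. This is an assumption for illustration, not
    the FCS Express method.
    """
    rng = random.Random(seed)
    return rng.sample(points, k)


points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]

# Same seed, same starting centroids, hence the same clustering every time.
first = pick_initial_centroids(points, k=2, seed=42)
second = pick_initial_centroids(points, k=2, seed=42)
```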

 

To begin working with cluster analysis in FCS Express, please see the following topics:

Defining a k-means Clustering Analysis

Applying a k-means Clustering Analysis

Working with k-means parameters