Subjects: Mathematics >> Computational Mathematics. Submitted: 2019-11-26
Abstract: This paper proposes a novel clustering algorithm named Sliding Means, intended as a replacement for the k-means algorithm, which is widely used in internet applications. Sliding Means can handle very large datasets and automatically determines the number of clusters. Because the samples are shuffled, bad initial centroids are unlikely to be selected. Sliding Means can also drop bad centroids on the fly. On the iris and optdigits datasets, Sliding Means outperforms k-means in Adjusted Rand Index (ARI) by 9.93% and 5.17%, respectively.
Peer Review Status: Awaiting Review