kpeaks: An R Package for Quick Selection of K for Cluster Analysis


Cebeci Z., Cebeci C.

International Conference on Artificial Intelligence and Data Processing (IDAP), Malatya, Türkiye, 28-30 September 2018

  • Volume number:
  • City of publication: Malatya
  • Country of publication: Türkiye

Abstract

The number of clusters, k, is a mandatory user-specified input argument required to start any partitioning clustering algorithm. In unsupervised learning applications, an optimal value of this argument is generally determined by using one of the internal validity indexes. However, determining k with the aid of these indexes is computationally very expensive, because they compute a k value from the results of several runs of a clustering algorithm. In contrast, the package 'kpeaks' makes it possible to estimate k before starting a clustering session. It is based on a simple novel technique that uses descriptive statistics of the peak counts of the features in a dataset. In this paper, we introduce and illustrate the details of the R package 'kpeaks' as an implementation for quick selection of the number of clusters for starting clustering algorithms.
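
The idea of estimating k from per-feature peak counts can be illustrated with a minimal sketch. This is not the 'kpeaks' API or the authors' exact procedure: the function names (`histogram_counts`, `count_peaks`, `summarize_peak_counts`), the equal-width binning, the local-maximum rule, and the toy data are all assumptions made here for illustration, and Python is used only to keep the example self-contained.

```python
from statistics import mean, median


def histogram_counts(values, nbins):
    """Bin values into nbins equal-width bins and return the bin counts."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / nbins or 1.0  # fall back to 1.0 for constant features
    counts = [0] * nbins
    for v in values:
        idx = min(int((v - lo) / width), nbins - 1)  # clip the maximum into the last bin
        counts[idx] += 1
    return counts


def count_peaks(counts, min_height=1):
    """Count local maxima in the bin counts; a flat plateau counts as one peak."""
    padded = [0] + list(counts) + [0]  # zero padding lets edge bins qualify as peaks
    peaks = 0
    i = 1
    while i < len(padded) - 1:
        j = i
        while j + 1 < len(padded) - 1 and padded[j + 1] == padded[i]:
            j += 1  # extend over a plateau of equal counts
        higher_than_left = padded[i] > padded[i - 1]
        higher_than_right = padded[i] > padded[j + 1]
        if higher_than_left and higher_than_right and padded[i] >= min_height:
            peaks += 1
        i = j + 1
    return peaks


def summarize_peak_counts(columns, nbins=5, min_height=1):
    """Count peaks per feature, then summarize the counts as candidate k values."""
    peak_counts = [
        count_peaks(histogram_counts(col, nbins), min_height) for col in columns
    ]
    return {
        "peak_counts": peak_counts,
        "mean": round(mean(peak_counts)),
        "median": round(median(peak_counts)),
        "max": max(peak_counts),
    }


# Toy dataset (hypothetical): three features given as columns.
features = [
    [1, 2, 2, 3, 3, 3, 4, 4, 5],                   # one mode
    [1, 1, 1, 2, 2, 3, 4, 4, 5, 5, 5],             # two modes
    [10, 10, 20, 30, 30, 30, 40, 50, 50, 50, 50],  # three modes
]
summary = summarize_peak_counts(features)
print(summary)
```

Because each feature is binned and scanned once, no clustering run is needed to obtain the candidate values, which is the cost advantage the abstract describes over internal validity indexes. Which summary statistic to use as k, and how to bin and filter peaks, are design choices left open in this sketch.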