Clustering the mixed panel dataset using Gower's distance and k-prototypes algorithms

AKAY, ÖZLEM; YÜKSEL, GÜZİN

doi:10.1080/03610918.2017.1367806

AKAY Ö., YÜKSEL G.

COMMUNICATIONS IN STATISTICS-SIMULATION AND COMPUTATION, vol.47, no.10, pp.3031-3041, 2018 (SCI-Expanded, Scopus)

Publication Type: Article / Full Article
Volume: 47 Issue: 10
Publication Date: 2018
DOI Number: 10.1080/03610918.2017.1367806
Journal Name: COMMUNICATIONS IN STATISTICS-SIMULATION AND COMPUTATION
Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus
Pages: pp.3031-3041
Keywords: Cluster analysis, Gower's distance, k-prototypes, Panel data
Çukurova University Affiliated: Yes

Abstract

Panel datasets are increasingly used in economics to analyze complex economic phenomena. Panel data form a two-dimensional array that combines cross-sectional and time-series data. Once a panel data matrix has been constructed, clustering methods can be applied to panel data analysis. This approach resolves the question of heterogeneity in the dependent variable of the panel data before the analysis. Clustering is a widely used statistical tool for identifying subsets in a given dataset. In this article, we cluster a mixed panel dataset using agglomerative hierarchical algorithms based on Gower's distance and using the k-prototypes algorithm. The performance of these algorithms is studied on panel data with mixed numerical and categorical features, and their effectiveness is compared using cluster accuracy. An experimental analysis on a real dataset is carried out using Stata and R.
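
As a rough illustration of the first approach named in the abstract, the sketch below computes Gower's distance on mixed numerical and categorical features and then runs a simple average-linkage agglomerative clustering. It is not the authors' implementation (the article reports using Stata and R). The toy rows, the wide layout (one row per unit, with its yearly values as separate columns), the choice of average linkage, and all function names are assumptions made only for this example.

```python
def feature_ranges(rows, categorical):
    """Range of each numeric column; None marks a categorical column."""
    ranges = []
    for k in range(len(rows[0])):
        if k in categorical:
            ranges.append(None)
        else:
            col = [row[k] for row in rows]
            ranges.append(max(col) - min(col))
    return ranges


def gower_distance(a, b, ranges):
    """Gower's distance: mean of per-feature dissimilarities in [0, 1]."""
    total = 0.0
    for x, y, r in zip(a, b, ranges):
        if r is None:
            # Categorical feature: 0 on a match, 1 on a mismatch.
            total += 0.0 if x == y else 1.0
        elif r > 0:
            # Numeric feature: absolute difference scaled by the column range.
            total += abs(x - y) / r
        # A constant numeric column (r == 0) contributes nothing.
    return total / len(ranges)


def average_linkage(rows, categorical, n_clusters):
    """Agglomerative clustering on Gower's distance with average linkage."""
    ranges = feature_ranges(rows, categorical)
    n = len(rows)
    dist = [[gower_distance(rows[i], rows[j], ranges) for j in range(n)]
            for i in range(n)]
    clusters = [[i] for i in range(n)]  # start with one cluster per unit
    while len(clusters) > n_clusters:
        best = None
        for p in range(len(clusters)):
            for q in range(p + 1, len(clusters)):
                # Average pairwise distance between the two clusters.
                d = sum(dist[i][j] for i in clusters[p] for j in clusters[q])
                d /= len(clusters[p]) * len(clusters[q])
                if best is None or d < best[0]:
                    best = (d, p, q)
        _, p, q = best
        clusters[p] = clusters[p] + clusters[q]  # merge the closest pair
        del clusters[q]
    labels = [0] * n
    for label, members in enumerate(clusters):
        for i in members:
            labels[i] = label
    return labels


# Hypothetical panel in wide form: (value in year 1, value in year 2, category).
rows = [
    (1.0, 1.2, "A"),
    (1.1, 1.3, "A"),
    (5.0, 5.5, "B"),
    (5.2, 5.4, "B"),
]
categorical = {2}  # index of the categorical column
labels = average_linkage(rows, categorical, n_clusters=2)
print(labels)  # [0, 0, 1, 1]
```

The nested search over cluster pairs is kept deliberately simple and is only practical for small inputs; k-prototypes, the second method in the abstract, is not covered by this sketch.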