Efficient Feature Selection for Product Labeling over Unstructured Data

YETGİN, ZEKİ; ELEWI, ABDULLAH; Gozukara, Furkan

YETGİN Z., ELEWI A., Gozukara F.

INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS, vol.8, no.7, pp.376-381, 2017 (ESCI)

Publication Type: Article / Full Article
Volume: 8 Issue: 7
Publication Date: 2017
Journal Name: INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS
Journal Indexes: Emerging Sources Citation Index (ESCI), Scopus, Index Islamicus, INSPEC
Pages: pp.376-381
Affiliated with Çukurova University: Yes

Abstract

The paper introduces a novel feature selection algorithm for labeling identical products collected from online web resources. Product labeling is important for clustering similar or identical products. Products crawled blindly from web sources, such as online sellers, yield unstructured data because their features are expressed in different representations and formats. Such data result in feature vectors whose representation is unknown and whose length is non-uniform. Product labeling is therefore a challenging problem that requires efficient selection of the features that best describe the products. In this paper, an efficient feature selection algorithm is proposed for the product labeling problem. Hierarchical clustering is used with state-of-the-art similarity metrics to assess the performance of the proposed algorithm. The results show that the proposed algorithm significantly increases the performance of product labeling. Furthermore, the method can be applied to any clustering algorithm that works on unstructured data.
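
The evaluation setting described in the abstract, hierarchical clustering driven by a similarity metric over variable-length product descriptions, can be illustrated with a minimal sketch. This is not the paper's feature selection algorithm, which the abstract does not specify. The sketch assumes that whitespace tokens stand in for product features, that Jaccard similarity is the metric, that clusters are merged by single linkage, and that merging stops at a hypothetical threshold of 0.5. The function names and the sample titles are likewise illustrative only.

```python
def tokenize(text):
    """Turn a raw product string into a set of lowercase tokens.

    Assumption: whitespace tokens act as the product's features.
    """
    return set(text.lower().split())


def jaccard(a, b):
    """Jaccard similarity of two token sets; defined as 1.0 if both are empty."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def cluster_products(titles, threshold=0.5):
    """Single-linkage agglomerative clustering over token sets.

    Repeatedly merges the two most similar clusters and stops once the
    best available similarity falls below `threshold` (a hypothetical
    cut-off chosen for illustration). Returns one integer label per title.
    """
    features = [tokenize(t) for t in titles]
    clusters = [[i] for i in range(len(titles))]

    while len(clusters) > 1:
        best_sim, best_pair = -1.0, None
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                # Single linkage: similarity of the closest pair of members.
                sim = max(
                    jaccard(features[i], features[j])
                    for i in clusters[x]
                    for j in clusters[y]
                )
                if sim > best_sim:
                    best_sim, best_pair = sim, (x, y)
        if best_sim < threshold:
            break
        x, y = best_pair
        clusters[x].extend(clusters[y])
        del clusters[y]

    labels = [0] * len(titles)
    for label, members in enumerate(clusters):
        for i in members:
            labels[i] = label
    return labels


# Made-up listings with different lengths and word orders.
titles = [
    "Apple iPhone 7 32GB Black",
    "iPhone 7 Apple 32GB black smartphone",
    "Samsung Galaxy S8 64GB",
    "Samsung Galaxy S8 64GB Midnight Black",
]
labels = cluster_products(titles)
print(labels)
```

Because the token sets differ in size, the sketch never needs fixed-length vectors, which mirrors the non-uniform feature vectors mentioned in the abstract. A feature selection step such as the one the paper proposes would presumably act before the similarity computation, by deciding which tokens are kept in each set.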