PageRank based Semantic Similarity Measure on a Graph based Turkish WordNet


Tulu C., ORHAN U.

2017 International Conference on Computer Science and Engineering (UBMK), Antalya, Türkiye, 5-8 October 2017, pp. 468-473

  • Publication Type: Conference Paper / Full-Text Paper
  • Volume Number:
  • DOI: 10.1109/ubmk.2017.8093438
  • City of Publication: Antalya
  • Country of Publication: Türkiye
  • Pages: pp. 468-473
  • Affiliated with Çukurova University: Yes

Abstract

Semantic similarity of texts is one of the important areas of Natural Language Processing, and there are several approaches to measuring it: statistical, WordNet-based, and hybrid. All of these approaches rely on a lexical knowledge source such as a corpus or a semantic network. WordNet is one of the most widely preferred and mature lexical knowledge bases. In this study, we focus on measuring the semantic similarity of Turkish words with a graph-based Turkish WordNet. To measure semantic similarity, a PageRank-based approach was chosen. To test the success of the proposed system, the standard RG65 similarity dataset was translated into Turkish and used as benchmark data. Similarity scores for the translated RG65 word pairs were computed using Turkish WordNet. The computed scores show a correlation of rho = 0.543 with human judgement. Considering that Turkish WordNet is very limited in terms of the number of words it contains, and that no prior study exists in this area for the Turkish language, this relatively low score is considered acceptable.
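
The abstract does not spell out how PageRank is turned into a similarity score. As a rough illustration only, the sketch below shows one common way to do this: run a personalized PageRank from each word and compare the two resulting rank vectors with cosine similarity. This is an assumption, not the authors' confirmed method. The toy graph, the function names (`build_graph`, `personalized_pagerank`, `similarity`), the damping value 0.85, and the iteration count are all hypothetical and are not taken from the paper or from Turkish WordNet.

```python
import math

def build_graph(edges):
    """Build an undirected adjacency map from (node, node) pairs."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph

def personalized_pagerank(graph, seed, damping=0.85, iterations=50):
    """Power iteration where all restart probability goes to `seed`."""
    nodes = sorted(graph)
    rank = {n: (1.0 if n == seed else 0.0) for n in nodes}
    for _ in range(iterations):
        nxt = {n: 0.0 for n in nodes}
        for n in nodes:
            neighbours = graph[n]
            if not neighbours:
                # Dangling node: send its mass back to the seed.
                nxt[seed] += damping * rank[n]
                continue
            share = damping * rank[n] / len(neighbours)
            for m in neighbours:
                nxt[m] += share
        # Restart step: the remaining probability jumps to the seed.
        nxt[seed] += 1.0 - damping
        rank = nxt
    return rank

def cosine(u, v):
    """Cosine similarity of two rank vectors stored as dicts with equal keys."""
    dot = sum(u[k] * v[k] for k in u)
    norm = math.sqrt(sum(x * x for x in u.values())) * math.sqrt(
        sum(x * x for x in v.values())
    )
    return dot / norm if norm else 0.0

def similarity(graph, word_a, word_b):
    """Compare the personalized PageRank vectors of two words."""
    return cosine(
        personalized_pagerank(graph, word_a),
        personalized_pagerank(graph, word_b),
    )

# Hypothetical toy graph (NOT real Turkish WordNet data):
# araba = car, otomobil = automobile, taşıt = vehicle,
# elma = apple, meyve = fruit, nesne = object.
TOY_EDGES = [
    ("araba", "taşıt"),
    ("otomobil", "taşıt"),
    ("taşıt", "nesne"),
    ("meyve", "nesne"),
    ("elma", "meyve"),
]
toy_graph = build_graph(TOY_EDGES)

if __name__ == "__main__":
    print("araba vs otomobil:", similarity(toy_graph, "araba", "otomobil"))
    print("araba vs elma:", similarity(toy_graph, "araba", "elma"))
```

In this toy graph, words that share a close neighbour (araba and otomobil, both linked to taşıt) score higher than words connected only through a distant node (araba and elma). To evaluate such scores against a benchmark like RG65, they would then be correlated with the human ratings.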