Multi-Criteria Review Scoring in E-Commerce with LLM-Based Validation


Sari Z. S., Durmaz O., Dagdas G., Uslu S., Karakoca O. B., AKAY M. F.

8th International Congress on Human-Computer Interaction, Optimization and Robotic Applications, ICHORA 2026, Ankara, Türkiye, 21-23 May 2026 (Full Text Paper)

  • Publication Type: Conference Paper / Full Text Paper
  • DOI Number: 10.1109/ichora69329.2026.11537099
  • City of Publication: Ankara
  • Country of Publication: Türkiye
  • Keywords: ECommerce, Large Language Models, Review Scoring
  • Affiliated with Çukurova University: Yes

Abstract

With the rapid growth of e-commerce platforms, the number of customer reviews has increased significantly, making it time-consuming and difficult for businesses to analyze these reviews and present them to potential customers. This study proposes a Large Language Model (LLM)-based approach that addresses this problem by systematically scoring customer reviews on criteria such as Product, Shipping, Color, Profanity, Overall, and Helpful. The score generation performance of the gpt-5, gpt-5-nano, gpt-5-mini, gpt-5-chat-latest, gpt-4o, and gpt-4.1 models was evaluated by three independent LLM auditors: Claude Sonnet 4.6, Gemini 3 Pro, and GLM-5. Mean Absolute Deviation (MAD) was used as the validation metric. Based on the results for reviews provided by the e-commerce platform provider Inveon, the gpt-5-nano model was identified as the most suitable option in terms of the balance between performance and cost, with a MAD value of 0.33.
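
To make the validation metric concrete, the following is a minimal Python sketch of one plausible way to compute a MAD between a generator model's scores and the scores assigned by several auditors. The abstract does not give the exact formula, score scale, or aggregation order, so everything here is an assumption for illustration: the function name `mean_absolute_deviation`, the dictionary layout, and the sample scores are hypothetical and are not taken from the study.

```python
from statistics import mean

# Criteria named in the abstract.
CRITERIA = ["Product", "Shipping", "Color", "Profanity", "Overall", "Helpful"]


def mean_absolute_deviation(generated, audits):
    """Average |generated score - auditor score| over all criteria and auditors.

    Assumption: MAD is taken as the mean of absolute differences; the
    study may aggregate per criterion or per review instead.
    """
    diffs = [
        abs(generated[criterion] - audit[criterion])
        for audit in audits
        for criterion in CRITERIA
    ]
    return mean(diffs)


# Hypothetical scores for a single review (made-up values, not study data).
generated = {"Product": 4, "Shipping": 3, "Color": 5,
             "Profanity": 0, "Overall": 4, "Helpful": 3}
audits = [
    {"Product": 4, "Shipping": 3, "Color": 5,
     "Profanity": 0, "Overall": 4, "Helpful": 4},
    {"Product": 5, "Shipping": 3, "Color": 5,
     "Profanity": 0, "Overall": 4, "Helpful": 3},
    {"Product": 4, "Shipping": 2, "Color": 5,
     "Profanity": 0, "Overall": 4, "Helpful": 3},
]

mad = mean_absolute_deviation(generated, audits)
print(f"MAD = {mad:.2f}")
```

Under this reading, a lower MAD means the generator's scores sit closer to the auditors' scores, which is consistent with how the abstract uses the reported value of 0.33 when comparing models.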