Assessment of clustering algorithms for unsupervised transcription factor binding site discovery

Karabulut, Mustafa; İBRİKÇİ, TURGAY

doi:10.1016/j.eswa.2011.02.161


Karabulut M., İBRİKÇİ T.

EXPERT SYSTEMS WITH APPLICATIONS, vol.38, no.9, pp.11160-11166, 2011 (SCI-Expanded, Scopus)

Publication Type: Article / Full Article
Volume: 38 Issue: 9
Publication Date: 2011
DOI: 10.1016/j.eswa.2011.02.161
Journal Name: EXPERT SYSTEMS WITH APPLICATIONS
Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus
Pages: pp.11160-11166
Affiliated with Çukurova University: Yes

Abstract

Identifying transcription factor binding sites is a key step in understanding gene regulation mechanisms and in discovering gene networks and functions. Clustering has proved useful for finding such patterns in the promoter regions of co-regulated genes. This paper studies four clustering algorithms, Self-Organizing Map, K-Means, Fuzzy C-Means and Expectation-Maximization, for discovering motifs in datasets extracted from Saccharomyces cerevisiae, Escherichia coli, Drosophila melanogaster and Homo sapiens DNA sequences. The modifications required to adapt the clustering algorithms to the motif-finding task are presented throughout the paper. Their motif-finding performance is then discussed and evaluated against a popular motif-finding method, MEME. (C) 2011 Elsevier Ltd. All rights reserved.
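
To give a rough sense of how a clustering algorithm can be pointed at motif finding, the sketch below one-hot encodes every overlapping window of width w from a set of sequences, groups the windows with plain K-Means, and reads each centroid as a position frequency matrix. This is an illustration under assumptions, not the paper's method: the abstract does not describe the modifications the authors made, and the helper names (`kmers`, `one_hot`, `kmeans`, `consensus`), the caller-supplied initial centroids and the fixed iteration count are all hypothetical choices.

```python
ALPHABET = "ACGT"

def kmers(seqs, w):
    """Return every overlapping window of width w from every sequence."""
    return [s[i:i + w] for s in seqs for i in range(len(s) - w + 1)]

def one_hot(kmer):
    """Encode a k-mer as a flat vector of length 4*w (one block of 4 per position)."""
    return [1.0 if base == a else 0.0 for base in kmer for a in ALPHABET]

def sq_dist(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))

def kmeans(points, init_centroids, iters=20):
    """Plain Lloyd's K-Means; initial centroids are supplied by the caller (assumption)."""
    centroids = [list(c) for c in init_centroids]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point goes to its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda j: sq_dist(p, centroids[j]))
            for p in points
        ]
        # Update step: each centroid becomes the mean of its members.
        for j in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # leave an empty cluster's centroid unchanged
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids

def consensus(centroid):
    """Treat a centroid as a position frequency matrix; pick the top base per position."""
    w = len(centroid) // len(ALPHABET)
    return "".join(
        ALPHABET[max(range(4), key=lambda a: centroid[4 * i + a])] for i in range(w)
    )
```

Because the mean of one-hot vectors is a per-position base frequency, each centroid can be inspected directly as a candidate motif profile. A call such as `kmeans([one_hot(k) for k in kmers(seqs, 6)], init)` would cluster all 6-mers of `seqs`. Real motif finders also need to handle background composition, reverse complements and the choice of the number of clusters, none of which this sketch attempts.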