Fuzzy C-Means based DNA motif discovery

KARABULUT M., İBRİKÇİ T.

4th International Conference on Intelligent Computing, Shanghai, China, 15 - 18 September 2008, vol.5226, pp.189-190, (Full Text Paper)

Publication Type: Conference Paper / Full Text Paper
Volume: 5226
DOI Number: 10.1007/978-3-540-87442-3_24
City of Publication: Shanghai
Country of Publication: China
Page Numbers: pp.189-190
Affiliated with Çukurova University: Yes

Abstract

In this paper, we examine the problem of identifying motifs in DNA sequences. Transcription factor binding sites, which are functionally significant subsequences, are treated as motifs. To reveal such DNA motifs, our method applies fuzzy clustering to Position Weight Matrices. The Fuzzy C-Means (FCM) algorithm clearly predicted known motifs in the intergenic regions of the GAL4, CBF1 and GCN4 DNA sequences. The paper also compares FCM with other clustering methods such as Self-Organizing Map and K-Means. The results of the FCM algorithm are also compared with those of the popular motif discovery tool Multiple Expectation Maximization for Motif Elicitation (MEME). We conclude that soft-clustering-based machine learning methods such as FCM are useful for finding patterns in biological sequences.
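
The abstract does not give the encoding or parameters used in the paper, so the following is only a minimal sketch of the general idea: standard Fuzzy C-Means applied to fixed-length DNA subsequences. The one-hot encoding, the toy k-mers, the fuzzifier m = 2 and the helper names one_hot and fuzzy_c_means are all assumptions made for illustration, not the authors' implementation. Because every input vector is one-hot per position, each cluster center can be reshaped into a position-by-base matrix whose rows sum to 1, which reads like a position weight matrix.

```python
import numpy as np

BASES = "ACGT"

def one_hot(kmer):
    """Encode a DNA k-mer as a flat vector of length 4*k (one-hot per position)."""
    vec = np.zeros((len(kmer), 4))
    for i, base in enumerate(kmer):
        vec[i, BASES.index(base)] = 1.0
    return vec.ravel()

def fuzzy_c_means(X, c, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Standard FCM on the rows of X; returns (centers, memberships)."""
    rng = np.random.default_rng(seed)
    # Random initial memberships, each row normalized to sum to 1.
    U = rng.random((X.shape[0], c))
    U /= U.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        # Centers are membership-weighted means of the data points.
        W = U ** m
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
        # Euclidean distance from every point to every center.
        dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        dist = np.maximum(dist, 1e-12)  # avoid division by zero
        # Membership update: u_ij is proportional to d_ij^(-2/(m-1)).
        inv = dist ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return centers, U

# Toy input (hypothetical): two groups of similar 6-mers.
kmers = ["ACGTAC", "ACGTAA", "ACGTAC", "TGCATG", "TGCATT", "TGCATG"]
X = np.array([one_hot(k) for k in kmers])

centers, U = fuzzy_c_means(X, c=2)
labels = U.argmax(axis=1)

# One (position x base) matrix per cluster, columns ordered A, C, G, T.
pwm = centers.reshape(2, len(kmers[0]), 4)
print(np.round(U, 3))
print(np.round(pwm, 2))
```

Unlike K-Means, each subsequence receives a graded membership in every cluster rather than a single hard label, which is the soft-clustering property the abstract refers to.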