Using Deep Learning for Mammography Classification


Hepsag P. U., ÖZEL S. A., YAZICI A.

2017 International Conference on Computer Science and Engineering (UBMK), Antalya, Türkiye, 5-8 October 2017, pp. 418-423

  • DOI: 10.1109/ubmk.2017.8093429
  • City of Publication: Antalya
  • Country of Publication: Türkiye
  • Pages: pp. 418-423

Abstract

Breast biopsies performed on the basis of mammography and ultrasound findings turn out to be benign in approximately 40 to 60 percent of cases. Such negative biopsies have adverse consequences, including unnecessary operations, fear, pain, and cost. A more reliable technique is therefore needed to reduce the number of unnecessary biopsies in breast cancer diagnosis, and computer-aided diagnostic methods are important in helping doctors make more accurate decisions. For this purpose, we apply deep learning with Convolutional Neural Networks (CNNs) to classify abnormalities in mammogram images as benign or malignant, using two different databases: mini-MIAS and BCDR. The mini-MIAS database provides valuable information such as the location of the center of each abnormality and the radius of the circle that surrounds it, whereas the BCDR database does not. When we use both datasets as they are, we observe accuracy, precision, recall, and F-score values between roughly 60% and 72%. To improve these results, we apply preprocessing methods consisting of cropping, augmentation, and balancing of the image data. To crop the images sourced from BCDR, we create a mask to find the region of interest. After applying our preprocessing methods to the BCDR dataset, classification accuracy improves from 65% to around 85%. Comparing the accuracy, precision, recall, and F-score obtained on the mini-MIAS database with those obtained on the BCDR database, we find that after preprocessing the BCDR dataset, classification performance on the two datasets becomes very close.
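
The three preprocessing steps named in the abstract (cropping, augmentation, balancing) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the choice of rotations and mirror images as augmentations, and random oversampling as the balancing method are all assumptions made for illustration. The crop uses a center and radius of the kind mini-MIAS provides, and assumes the y coordinate is measured from the bottom edge of the image; set `origin_bottom_left=False` if coordinates are already in array (top-left) order.

```python
import numpy as np


def crop_roi(image, cx, cy, radius, origin_bottom_left=True):
    """Crop the square that bounds the circle (cx, cy, radius), clipped to the image."""
    height, width = image.shape[:2]
    # Assumption: y is counted from the bottom edge, so convert it to a row index.
    row = (height - 1 - cy) if origin_bottom_left else cy
    top = max(row - radius, 0)
    bottom = min(row + radius + 1, height)
    left = max(cx - radius, 0)
    right = min(cx + radius + 1, width)
    return image[top:bottom, left:right]


def augment(patch):
    """Return the four 90-degree rotations of a patch plus their mirror images."""
    rotations = [np.rot90(patch, k) for k in range(4)]
    return rotations + [np.fliplr(r) for r in rotations]


def balance_by_oversampling(images, labels, seed=0):
    """Resample minority classes with replacement until all classes are equal in size."""
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    classes, counts = np.unique(labels, return_counts=True)
    target = counts.max()
    keep = []
    for cls, count in zip(classes, counts):
        idx = np.flatnonzero(labels == cls)
        extra = rng.choice(idx, size=target - count, replace=True)
        keep.extend(idx.tolist() + extra.tolist())
    return [images[i] for i in keep], labels[keep]


if __name__ == "__main__":
    mammogram = np.zeros((1024, 1024), dtype=np.uint8)  # placeholder image
    patch = crop_roi(mammogram, cx=500, cy=600, radius=50)
    print(patch.shape, len(augment(patch)))
```

For a dataset without center and radius annotations, such as BCDR in the abstract, the bounding box would instead have to be derived from a region-of-interest mask before cropping.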