Computer-Aided Detection System for Breast Cancer Based on Mammogram Images

 

 

Suha Abdo (1), Ghada Saad (2)
(1) Biomedical Engineer, (2) Assistant Professor
Faculty of Medical Engineering
Al-Andalus University, Al Qadmous - Tartus, Syria
ghdsaad@yahoo.com, Suhaabdo3577@gmail.com
 
Abstract:
Objective: Breast cancer is the most common cancer diagnosed in women worldwide, particularly between the ages of 35 and 55. An estimated 1.38 million women across the world were diagnosed with breast cancer in 2008, accounting for nearly a quarter (23%) of all cancers diagnosed in women. Radiologists usually search mammograms for specific abnormalities; the most important signs of breast cancer they look for are clusters of microcalcifications, masses, and architectural distortions. Computer-aided diagnosis (CAD) systems have shown that they can raise the sensitivity of radiologists by 10%, and they are now integrated with many imaging techniques and systems. The main objective of this research is to develop a computerized algorithm for classifying masses in digital mammogram images as benign or malignant, based on features extracted from interactively selected regions of interest (ROI, 700x700 pixels) in 60 images taken from the mini-MIAS (Mammographic Image Analysis Society) database.
Methodology: The methodology consists of five consecutive steps. The first is image enhancement with the median and adaptive median filters. The second is segmentation, in which the breast region is separated from the background using a threshold computed by Otsu's method together with the Connected Component Labeling (CCL) algorithm; morphological preprocessing is also used to remove the pectoral muscle. The third step is feature extraction using the wavelet transform, and the fourth uses principal component analysis (PCA) to select the most suitable features. Finally, classification is performed with a Gaussian mixture model (GMM) classifier.
Conclusion: The suggested algorithm was tested on the mini-MIAS database and showed a high overall accuracy (93.33%). The experiments were performed on 60 images (20 malignant, 20 benign, 20 normal).
   
Keywords: Breast, CAD system, GMM, Mammogram, Mini-MIAS, extracted features
   

Introduction

Breast cancer is the most common cancer diagnosed in women worldwide, particularly between the ages of 35 and 55. An estimated 1.38 million women across the world were diagnosed with breast cancer in 2008 [1]. Image processing techniques have been developed over the last two decades to assist physicians in diagnosing breast cancer. Among the currently available diagnostic methods, mammography is considered the most reliable for detecting both benign and malignant mammary neoplasia at a very early stage [2,3]. Mammography can often detect breast cancer at an early stage, when treatment is more effective and a cure is more likely. Numerous studies have shown that early detection with mammography saves lives and increases treatment options [4]. In recent years, much research has focused on developing CAD systems to help radiologists examine mammogram images. The main purposes of CAD systems are to help evaluate the risk of cancer and to discriminate between types of abnormality based on the analysis of detected lesions.

The steps covered in our work are as follows. First, a pre-processing step enhances the image quality and prepares the image for further processing by applying the median and adaptive median filters and removing the unrelated and surplus parts in the background of the mammogram. Second, the pectoral muscle is removed using morphological processing. Third, the ROI is segmented from the breast region using Otsu thresholding and the Connected Component Labeling (CCL) algorithm. Fourth, a feature set is extracted from the segmented ROI using the wavelet transform, and the most suitable features are selected using PCA. Finally, the ROI is classified as normal, benign, or malignant using a Gaussian mixture model (GMM) classifier.

Many studies have focused on improving mass detection and segmentation in order to accurately detect abnormal regions in mammogram images.

Saad, G., et al. [5] developed an automated system to minimize manual intervention and diagnose breast cancer with good precision, proposing a twofold detection algorithm. In the first stage, a Wiener filter removes background noise from the image, and Otsu's algorithm is then used for segmentation. Laws' masks are next applied for texture analysis. The ROI is classified as normal or abnormal by comparing the performance of two classifiers, Artificial Neural Networks and AdaBoost. The algorithm was tested on the DDSM, MIAS, and a local database and showed a high overall accuracy (98.68%) and a sensitivity of 80.15%.

 
Elmoufidi, A., et al. [6] presented a method for detecting regions of interest (ROI) in mammograms using a dynamic K-means clustering algorithm. The method consists of three phases: first, two-dimensional median filtering enhances the contrast of the mammogram; second, a range for the number of clusters is generated using Local Binary Patterns (LBP), and k-means is applied with these features to determine the optimal number of clusters automatically; third, the mammogram is partitioned into k clusters by the dynamic k-means algorithm, from which the ROIs are detected. The algorithm was tested on mammograms from the MIAS dataset and achieved 2.84 false positives per image with a sensitivity of 85%.

Elfarra, B. K., et al. [7] aimed to introduce a new method for feature extraction and selection in order to build a CADx model that discriminates between cancerous, benign, and healthy parenchyma. For feature extraction, they used both human features and computational features, the latter obtained with two existing feature extraction methods, the Run Difference Method (RDM) and the Spatial Gray Level Dependence Method (SGLDM). They then applied a new feature selection method that runs both forward sequential selection and a genetic algorithm. The results were evaluated on a data set of 410 images of different types taken from DDSM. Their method selected 14 of the 65 extracted features. Performance was measured with both Receiver Operating Characteristic (ROC) analysis and the confusion matrix (CM). In the training stage, the proposed method achieved an overall classification accuracy of 94.6%.

Omara, A. I., et al. [8] introduced a computer-aided diagnosis system that could help diagnose abnormalities faster than a traditional screening program, without the drawbacks attributable to human factors. They used the wavelet decomposition of a locally processed image (region of interest) for feature extraction, and two techniques for classification: the minimum distance classifier and the voting K-Nearest Neighbor classifier. The region of interest was 64x64 pixels. Seventy-two regions of interest extracted from 40 X-ray mammograms were used to evaluate the method; 36 of these regions were known to be malignant and the remaining 36 normal. Of the 72 regions, 36 (18 normal and 18 malignant) were held out for testing the system.
 
   
Methodology   
The steps covered in our work are as follows:
1. First, we use the median filter and the adaptive median filter to remove background noise from the image; we then remove the pectoral muscle region using morphological preprocessing.
2. For segmentation, Otsu's algorithm is applied. It computes the threshold that best separates the pixels into two classes, so that the between-class variance is maximal. The Connected Component Labeling (CCL) algorithm is used in addition.
3. The next step is feature extraction using the wavelet transform.
4. We then use principal component analysis (PCA) to select the features.
5. For classification into malignant, benign, and normal tissue, we use the Gaussian mixture model (GMM) classifier.
The details of each step are given below:
1- Pre-processing: this step enhances the image quality and prepares the image for further processing by removing the unrelated and surplus parts in the background of the mammogram images.
   
1-1 Median filter  
The median filter is one of the most commonly used noise-removal filters. It is a nonlinear filtering method. A kernel of size n x n slides over the corrupted m x m image; at each position, the median of the pixel values under the kernel is computed, and the value of the center pixel is replaced with that median [9,10].
   
Fig -1: Sorting in median filter  
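The sliding-window operation described above can be sketched in a few lines of NumPy. This is a minimal illustration rather than the authors' implementation: the function name `median_filter`, the 3x3 default kernel, and the edge-replication border handling are all assumptions, since the text does not specify them.

```python
import numpy as np

def median_filter(image, size=3):
    """Replace each pixel with the median of its size x size neighbourhood."""
    pad = size // 2
    # Border handling by edge replication is an assumption, not taken from the paper.
    padded = np.pad(image, pad, mode="edge")
    out = np.empty_like(image)
    rows, cols = image.shape
    for r in range(rows):
        for c in range(cols):
            window = padded[r:r + size, c:c + size]
            out[r, c] = np.median(window)
    return out

# Toy example: a single bright impulse on a flat background.
noisy = np.array([[10, 10, 10],
                  [10, 255, 10],
                  [10, 10, 10]], dtype=np.uint8)
cleaned = median_filter(noisy, size=3)
```

In this toy case the impulse at the center is replaced by the median of its neighbourhood, so the isolated outlier disappears while the flat background is unchanged.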
1-2 Adaptive median filter  
The adaptive median filter is designed to overcome the drawbacks of the standard median filter. Its main advantage is that the size of the kernel surrounding each pixel is variable, which yields a better result. Its other main advantage is that, unlike the median filter, it does not replace every pixel value with the median: pixels that are not judged to be impulse noise keep their original value [9, 10].
Implementation of the adaptive median filter:
Sxy = window (kernel) centered at coordinates (x, y)
Zmin = minimum gray level value in Sxy
Zmax = maximum gray level value in Sxy
Zmed = median of gray levels in Sxy
Zxy = gray level at coordinates (x, y)
Smax = maximum allowed size of Sxy
The adaptive median filter works in two levels, denoted Level A and Level B, as follows:
Level A: A1 = Zmed - Zmin
A2 = Zmed - Zmax
If A1 > 0 AND A2 < 0, go to Level B
Else increase the window size
If window size <= Smax, repeat Level A
Else output Zxy
Level B: B1 = Zxy - Zmin
B2 = Zxy - Zmax
If B1 > 0 AND B2 < 0, output Zxy
Else output Zmed
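The Level A / Level B procedure can be written out as a short NumPy sketch. The starting window of 3x3, the step of 2 between window sizes, the default `s_max=7` (which must be odd), and edge-replication padding are assumptions made for illustration; the two levels themselves follow the description above.

```python
import numpy as np

def adaptive_median_filter(image, s_max=7):
    """Adaptive median filter; the window grows 3, 5, ... up to s_max (odd)."""
    pad = s_max // 2
    padded = np.pad(image, pad, mode="edge")  # border handling is an assumption
    out = image.copy()
    rows, cols = image.shape
    for r in range(rows):
        for c in range(cols):
            z_xy = int(image[r, c])
            pr, pc = r + pad, c + pad
            size = 3
            while True:
                half = size // 2
                window = padded[pr - half:pr + half + 1, pc - half:pc + half + 1]
                z_min, z_max = int(window.min()), int(window.max())
                z_med = int(np.median(window))
                # Level A: is the median itself free of impulse noise?
                if z_min < z_med < z_max:
                    # Level B: keep the pixel unless it looks like an impulse.
                    out[r, c] = z_xy if z_min < z_xy < z_max else z_med
                    break
                size += 2
                if size > s_max:
                    out[r, c] = z_xy  # window limit reached: output Zxy
                    break
    return out

# Toy example: a smooth ramp with one impulse at the center.
image = (np.arange(25).reshape(5, 5) * 4 + 20).astype(np.uint8)
image[2, 2] = 255
filtered = adaptive_median_filter(image, s_max=7)
```

Here the impulse at the center is replaced by the local median, while an interior pixel such as `image[1, 1]` passes the Level B test and keeps its original value.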
1-3 Pectoral muscle removal
Pectoral muscle segmentation begins with determining a region of interest (ROI) in the digital mammogram. Several methods have been proposed in the literature to identify the pectoral muscle in mammograms. Nagi, J., et al. [11] used morphological preprocessing and seeded region growing to detect the pectoral muscle. In this paper we use mathematical morphological operators such as area morphology, openings, and closings.
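The text does not specify how the morphological operators are combined to isolate the pectoral muscle, so the sketch below only illustrates what binary opening and closing do to a mask. The 3x3 square structuring element, the helper names, and the treatment of pixels outside the image as background are assumptions.

```python
import numpy as np

def dilate(mask):
    """Binary dilation with a 3x3 square structuring element."""
    p = np.pad(mask, 1, mode="constant", constant_values=False)
    out = np.zeros_like(mask)
    for dr in range(3):
        for dc in range(3):
            out |= p[dr:dr + mask.shape[0], dc:dc + mask.shape[1]]
    return out

def erode(mask):
    """Binary erosion with a 3x3 square; pixels outside the image count as background."""
    p = np.pad(mask, 1, mode="constant", constant_values=False)
    out = np.ones_like(mask)
    for dr in range(3):
        for dc in range(3):
            out &= p[dr:dr + mask.shape[0], dc:dc + mask.shape[1]]
    return out

def opening(mask):
    return dilate(erode(mask))   # removes objects smaller than the element

def closing(mask):
    return erode(dilate(mask))   # fills holes smaller than the element

# Toy mask: a 5x5 block with a one-pixel hole, plus an isolated speck.
mask = np.zeros((12, 12), dtype=bool)
mask[2:7, 2:7] = True
mask[4, 4] = False   # hole
mask[9, 9] = True    # speck
cleaned = opening(closing(mask))
```

Closing first fills the hole, and the subsequent opening removes the speck, leaving the solid block. An area opening, which discards connected components below a pixel-count threshold, would be built on connected component labeling instead.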
2- Segmentation  
Segmentation is a critical step because all subsequent steps (feature extraction and classification) depend on its result. The approach we adopted for segmenting the mammogram image is Otsu's method.
2-1 Otsu’s method  
   
Otsu's method is one of the most widely used global thresholding methods. It searches for the threshold that minimizes the within-class variance, defined as the weighted sum of the variances of the two classes [12]; this is equivalent to maximizing the between-class variance.
The optimal threshold value t* satisfies:

$$\sigma_B^2(t^*) = \max_{0 \le t \le L-1} \sigma_B^2(t)$$

where $\sigma_B^2$ is the between-class variance and L is the number of intensity levels [0, …, L−1] [13].
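As an illustration of the criterion, the between-class variance can be evaluated for every candidate threshold directly from the image histogram. The sketch below assumes an 8-bit image (L = 256) and uses the standard closed form in terms of cumulative class probability and cumulative mean; the function name `otsu_threshold` is hypothetical.

```python
import numpy as np

def otsu_threshold(image, levels=256):
    """Return the threshold t that maximizes the between-class variance."""
    hist = np.bincount(image.ravel(), minlength=levels).astype(float)
    prob = hist / hist.sum()
    omega0 = np.cumsum(prob)                     # probability of class 0 (values <= t)
    mu = np.cumsum(prob * np.arange(levels))     # cumulative mean up to t
    mu_total = mu[-1]
    omega1 = 1.0 - omega0
    sigma_b2 = np.zeros(levels)
    valid = (omega0 > 0) & (omega1 > 0)          # both classes must be non-empty
    sigma_b2[valid] = (mu_total * omega0[valid] - mu[valid]) ** 2 / (
        omega0[valid] * omega1[valid])
    return int(np.argmax(sigma_b2))

# Toy bimodal image: dark half at 50, bright half at 200.
image = np.array([[50] * 4] * 2 + [[200] * 4] * 2, dtype=np.uint8)
t = otsu_threshold(image)
foreground = image > t
```

Any threshold between the two gray levels separates them equally well here; `np.argmax` returns the first such value.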
   
2-2 Connected component labeling (CCL)
A connected component in a binary image is a maximal subset of the white pixels such that any pair of them is joined by a path that crosses only white pixels. Recursively, any white pixel belongs to a connected component, and any white pixel adjacent to another white pixel is part of the latter's connected component. Connected component labeling scans an image and groups its pixels into components based on pixel connectivity [13,14].
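A minimal sketch of connected component labeling by breadth-first flood fill is given below, followed by extraction of the largest component, which is how the breast profile is typically separated from labels and artifacts in the background. The choice of 4-connectivity and the function names are assumptions; the paper does not state which connectivity was used.

```python
from collections import deque
import numpy as np

def label_components(mask):
    """Label 4-connected components of a binary mask; returns (labels, count)."""
    rows, cols = mask.shape
    labels = np.zeros((rows, cols), dtype=int)
    count = 0
    for r in range(rows):
        for c in range(cols):
            if mask[r, c] and labels[r, c] == 0:
                count += 1
                labels[r, c] = count
                queue = deque([(r, c)])
                while queue:
                    y, x = queue.popleft()
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                        if (0 <= ny < rows and 0 <= nx < cols
                                and mask[ny, nx] and labels[ny, nx] == 0):
                            labels[ny, nx] = count
                            queue.append((ny, nx))
    return labels, count

def largest_component(mask):
    """Keep only the component with the most pixels."""
    labels, count = label_components(mask)
    if count == 0:
        return np.zeros(mask.shape, dtype=bool)
    sizes = np.bincount(labels.ravel())[1:]      # skip background label 0
    return labels == (int(np.argmax(sizes)) + 1)

mask = np.array([[1, 1, 0, 0],
                 [1, 1, 0, 1],
                 [0, 0, 0, 1],
                 [0, 0, 0, 0]], dtype=bool)
labels, count = label_components(mask)
largest = largest_component(mask)
```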
 
   
3- Feature extraction
   
Feature extraction is one of the most important factors affecting CAD performance. Features describe the character of an object: the extracted features are a mathematical description of characteristics that help isolate lesions or distinguish normal from abnormal tissue [15].
   
3-1- Wavelet transform:
The wavelet transform is a well-known image processing technique for extracting features from an image. It is used here to augment the available data by providing wavelet representations of the original and segmented mammographic scans. The main advantage of wavelets is that they localize a signal simultaneously in frequency and in time (for images, in space), and the fast wavelet transform is computationally efficient compared with methods such as the FFT [16,17,18].
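To make the decomposition concrete, the sketch below computes one level of a 2-D Haar wavelet transform and the energy of each sub-band. The Haar wavelet, the single decomposition level, and the use of sub-band energy as a feature are assumptions chosen for brevity; the paper does not state which wavelet family or level was used.

```python
import numpy as np

def haar_dwt2(image):
    """One level of the orthonormal 2-D Haar transform (image sides must be even)."""
    x = image.astype(float)
    s = np.sqrt(2.0)
    # Filter along rows: sums (low-pass) and differences (high-pass) of column pairs.
    lo = (x[:, 0::2] + x[:, 1::2]) / s
    hi = (x[:, 0::2] - x[:, 1::2]) / s
    # Filter along columns.
    approx = (lo[0::2, :] + lo[1::2, :]) / s    # low-pass in both directions
    detail_a = (lo[0::2, :] - lo[1::2, :]) / s  # differences between row pairs
    detail_b = (hi[0::2, :] + hi[1::2, :]) / s  # differences between column pairs
    detail_c = (hi[0::2, :] - hi[1::2, :]) / s  # diagonal differences
    return approx, detail_a, detail_b, detail_c

def subband_energies(image):
    """Energy (sum of squared coefficients) of each of the four sub-bands."""
    return [float(np.sum(band ** 2)) for band in haar_dwt2(image)]

roi = np.arange(16).reshape(4, 4)          # toy 4x4 "ROI"
energies = subband_energies(roi)
flat_energies = subband_energies(np.full((4, 4), 8))
```

Because the transform is orthonormal, the four energies sum to the energy of the input, and a constant patch puts all of its energy in the approximation band.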
4- Feature Selection   
Feature selection is an important part of any classification scheme. The success of a classification scheme largely depends on the features selected and on their role in the model. In this study we use PCA [19].
4-1- Principal component analysis (PCA)
PCA is a dimensionality-reduction method often used on large data sets: it transforms a large set of variables into a smaller one that still contains most of the information in the original set. Using PCA also suppresses noise in the image data, so that abnormalities are easier to detect. This helps physicians analyze the abnormal signs in the image and act quickly [20]. The features used were contrast, energy, homogeneity, and correlation.
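The four feature names coincide with the standard texture properties of a gray-level co-occurrence matrix (GLCM), so the sketch below assumes that definition; the paper does not confirm how the features were computed. With p(i, j) the normalized co-occurrence probability, contrast is the sum of (i - j)^2 p(i, j), energy the sum of p(i, j)^2, homogeneity the sum of p(i, j) / (1 + |i - j|), and correlation the normalized covariance of i and j. The horizontal one-pixel offset, the function names, and the PCA-by-SVD helper are likewise illustrative assumptions.

```python
import numpy as np

def glcm_features(image, levels):
    """Contrast, energy, homogeneity, correlation from a horizontal-neighbour GLCM."""
    glcm = np.zeros((levels, levels))
    # Count co-occurrences of each pixel with its right-hand neighbour.
    np.add.at(glcm, (image[:, :-1].ravel(), image[:, 1:].ravel()), 1)
    p = glcm / glcm.sum()
    i, j = np.indices(p.shape)
    contrast = np.sum(p * (i - j) ** 2)
    energy = np.sum(p ** 2)
    homogeneity = np.sum(p / (1 + np.abs(i - j)))
    mu_i, mu_j = np.sum(i * p), np.sum(j * p)
    sd_i = np.sqrt(np.sum(p * (i - mu_i) ** 2))
    sd_j = np.sqrt(np.sum(p * (j - mu_j) ** 2))
    # Correlation is undefined (division by zero) for a constant patch.
    correlation = np.sum(p * (i - mu_i) * (j - mu_j)) / (sd_i * sd_j)
    return np.array([contrast, energy, homogeneity, correlation])

def pca_reduce(features, n_components):
    """Project feature rows onto the leading principal components."""
    centered = features - features.mean(axis=0)
    # Rows of vt are the principal directions, ordered by singular value.
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    explained = s ** 2 / np.sum(s ** 2)
    return centered @ vt[:n_components].T, explained[:n_components]

# Toy inputs, not data from the paper.
patch = np.array([[0, 1, 0],
                  [1, 0, 1]])
feats = glcm_features(patch, levels=2)
table = np.array([[0.0, 0.0], [1.0, 1.0], [2.0, 2.0], [3.0, 3.0]])
reduced, explained = pca_reduce(table, n_components=1)
```

In practice each ROI would contribute one row of features to `table`, and the number of retained components would be chosen from the explained-variance ratios.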
 
   
5- Classification
The classification of the ROI as normal, benign, or malignant is achieved using machine learning techniques. The classification is based on the features extracted from the segmented ROI in the previous steps. In this paper we use the Gaussian mixture model (GMM) classifier.
   
5-1- Gaussian mixture model (GMM) classifier:  
The Gaussian mixture model (GMM) classifier has gained increasing attention in the pattern recognition community. GMM can be classified as a semi-parametric density estimation method, since it defines a very general class of functional forms for the density model. In this mixture model, a probability density function is expressed as a linear combination of basis functions [21].
 
   
Fig-2 Gaussian mixture model applied  
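One common way to turn this density model into a classifier is to fit a separate mixture to the training features of each class and assign a new sample to the class whose mixture gives it the highest likelihood. The sketch below follows that scheme with a small EM loop in NumPy. The diagonal covariances, the number of components, equal class priors, and all function names are assumptions; the paper does not report these settings.

```python
import numpy as np

def log_gauss_diag(x, mean, var):
    """Log density of each sample under each diagonal Gaussian: shape (n, k)."""
    diff = x[:, None, :] - mean[None, :, :]
    return -0.5 * (np.sum(diff ** 2 / var, axis=2)
                   + np.sum(np.log(2 * np.pi * var), axis=1))

def logsumexp(a, axis):
    m = a.max(axis=axis, keepdims=True)
    return (m + np.log(np.sum(np.exp(a - m), axis=axis, keepdims=True))).squeeze(axis)

def fit_gmm(x, n_components, n_iter=50, seed=0):
    """Fit mixture weights, means, and diagonal variances with EM."""
    rng = np.random.default_rng(seed)
    n = x.shape[0]
    mean = x[rng.choice(n, size=n_components, replace=False)].copy()
    var = np.tile(x.var(axis=0) + 1e-6, (n_components, 1))
    weight = np.full(n_components, 1.0 / n_components)
    for _ in range(n_iter):
        # E-step: responsibility of each component for each sample.
        log_r = np.log(weight) + log_gauss_diag(x, mean, var)
        resp = np.exp(log_r - logsumexp(log_r, axis=1)[:, None])
        # M-step: re-estimate parameters from the responsibilities.
        nk = resp.sum(axis=0) + 1e-12
        weight = nk / n
        mean = (resp.T @ x) / nk[:, None]
        var = np.maximum((resp.T @ x ** 2) / nk[:, None] - mean ** 2, 1e-6)
    return weight, mean, var

def gmm_log_likelihood(x, params):
    weight, mean, var = params
    return logsumexp(np.log(weight) + log_gauss_diag(x, mean, var), axis=1)

def fit_classifier(x, y, n_components=2):
    """One mixture per class label."""
    return {label: fit_gmm(x[y == label], n_components) for label in np.unique(y)}

def predict(models, x):
    """Pick the class with the highest likelihood (equal priors assumed)."""
    labels = list(models)
    scores = np.column_stack([gmm_log_likelihood(x, models[l]) for l in labels])
    return np.array(labels)[np.argmax(scores, axis=1)]

# Synthetic, well-separated toy clusters (not data from the paper).
rng = np.random.default_rng(1)
centers = np.array([[0.0, 0.0], [5.0, 5.0], [10.0, 0.0]])
x = np.vstack([c + 0.5 * rng.standard_normal((20, 2)) for c in centers])
y = np.repeat([0, 1, 2], 20)
models = fit_classifier(x, y, n_components=2)
predicted = predict(models, centers)
```

With 20 training samples per class, as in this study, a small number of components and diagonal covariances keeps the number of parameters manageable.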
   
Results and discussion:  
Pre-processing first enhances the quality of the image: noise is removed with the median filter and the adaptive median filter, and the pectoral muscle region is segmented using mathematical morphological operators such as area morphology, openings, and closings. Next, Otsu's method and connected component labeling (CCL) are applied for segmentation. Otsu's method performs clustering-based image thresholding. The algorithm assumes that the image contains two classes of pixels, that is, a bimodal histogram, and calculates the optimum threshold that minimizes the intra-class variance and, equivalently, maximizes the inter-class variance [22]. Following this, the wavelet transform is applied to the resulting filtered image to extract features from it, and PCA is used to select features. As a final step, the Gaussian mixture model (GMM) classifier is used to classify tissue as malignant, benign, or normal. The suggested algorithm was tested on the mini-MIAS database and showed a high overall accuracy (93.33%). The experiments were performed on 60 images (20 malignant, 20 benign, 20 normal).
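As a quick arithmetic check of the reported figure, overall accuracy is the number of correctly classified images divided by the total. The short calculation below assumes the accuracy was measured over all 60 images; the per-class breakdown is not given in the text, so only the implied total is recovered.

```python
n_images = 60
reported_accuracy = 93.33  # percent, as stated in the text

# Number of correct classifications implied by the reported accuracy.
implied_correct = round(reported_accuracy / 100 * n_images)
recomputed = 100 * implied_correct / n_images
```

Under that assumption, the reported value corresponds to 56 of the 60 images being classified correctly.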
   
 
Fig-3 Flowchart representing the main methodology applied for the developed algorithm.  

 
 
Fig-4 Original and cropped mammogram from MIAS database.

Fig-5 Comparison of the results of this paper with those of similar research

Fig-6 Adaptive median filter applied on input image

Fig-7 Separation of breast profile region from background: (a) original mammogram image, (b) the connected components after thresholding, (c) the largest component extracted, and (d) breast profile separated
Conclusions:  
In this paper, we developed a computerized algorithm for classifying masses in digital mammogram images as benign or malignant, based on features extracted from interactively selected regions of interest (ROI, 700x700 pixels) in 60 images taken from the mini-MIAS (Mammographic Image Analysis Society) database. In the proposed system, breast tissue segmentation is implemented in only two steps, using Otsu's method and the CCL algorithm. The Gaussian mixture model (GMM) classifier is used to classify the ROI into three categories: normal, benign, and malignant tissue. The proposed system achieved an overall accuracy of approximately 93.33%, although only on a 60-image subset. For future work, the authors intend to run and test the proposed system on the whole mini-MIAS dataset.
   
References:  
[1] Ferlay J, Shin HR, Bray F, Forman D, Mathers C, Parkin DM. GLOBOCAN 2008 v2.0, Cancer Incidence and Mortality Worldwide: IARC CancerBase. 2010. Available: http://globocan.iarc.fr. [Accessed April 2013].   
[2] De Santo M, Molinara M, Tortorella F, Vento M. Automatic classification of clustered microcalcifications by a multiple expert system. Pattern Recogn 2003;36(7):1467–77.
[3] Bocchi L, Coppini G, Nori J, Valli G. Detection of single and clustered microcalcifications in mammograms using fractals models and neural networks. Med Eng Phys 2004;26(4):303–12.
[4] American Cancer Society. Cancer Facts & Figures 2013. Atlanta: American Cancer Society; 2013.   
[5] G. Saad, A. Khadour, and Q. Kanafani, "ANN and Adaboost application for automatic detection of microcalcifications in breast cancer," The Egyptian Journal of Radiology and Nuclear Medicine, 2016.08.020.
[6] A. Elmoufidi, K. El Fahssi, S. Jai-Andaloussi, and A. Sekkaki, "Detection of regions of interest in mammograms by using local binary pattern and dynamic K-means algorithm," International Journal of Image and Video Processing: Theory and Application, vol. 1, pp. 2336-0992, 2014.
[7] B.K. Elfarra and IS.S. Abuhaiba, "Mammogram Computer Aided Diagnosis," International Journal of Signal Processing, Image Processing and Pattern Recognition, vol. 5, no. 4, December 2012.
[8] A.I. Omara, A.S. Mohamed, A.M. Youssef, and Y.M. Kadah, "Computer Aided Diagnosis in Digital Mammography," Proc. Cairo International Biomedical Engineering Conference, 2006.
[9] S. Shrestha, "Image denoising using new adaptive based median filter," Signal & Image Processing: An International Journal (SIPIJ), vol. 5, no. 4, August 2014.
[10] H. Soni and D. Sankhe, "Image Restoration using Adaptive Median Filtering," International Research Journal of Engineering and Technology (IRJET), vol. 06, no. 10, Oct 2019, e-ISSN: 2395-0056, p-ISSN: 2395-0072, www.irjet.net.
[11] J. Nagi, S.A. Kareem, F. Nagi, and S.K. Ahmed, "Automated breast profile segmentation for ROI detection using digital mammograms," in 2010 IEEE EMBS Conference on Biomedical Engineering & Sciences, 2010, pp. 87–92.
[13] A. Omer and M. Elfadil, "Preprocessing of Digital Mammogram Image Based on Otsu's Threshold," American Scientific Research Journal for Engineering, Technology, and Sciences (ASRJETS), vol. 37, no. 1, 2017, pp. 220-229.
[14] I.K. Maitra, S. Naj, and S.K. Bandyopadhyay, "Technique for preprocessing of digital mammogram," Computer Methods and Programs in Biomedicine (Elsevier), vol. 107, pp. 175-188.
[15] M. Elmanna, "Computer Aided Diagnosis System for Digital Mammography," Faculty of Engineering, Cairo University, 2013.
[16] Singh, Garima, DikendraVerma Pushpa Koranga, and G. E. H.U. Tech Scholar. "Application of Wavelet Transform on Images: A Review." 3
[17] Strickland, Robin N., and Hee Il Hahn. "Wavelet transforms for detecting microcalcifications in mammograms." IEEE Transactions on Medical Imaging 15.2 (1996): 218-229.
[18] Aarthy, S. L., and S. Prabu. "An approach for detecting breast cancer using wavelet transforms." Indian Journal of Science and Technology 8.26 (2015): 1-7.
[19] Khan, S.; Islam, N.; Jan, Z.; Din, I.U.; Rodrigues, J.J.P.C. A novel deep learning based framework for the detection and classification of breast cancer using transfer learning. Pattern Recognit. Lett. 2019, 125, 1-6.
[20] M. Saraswat, A.K. Wadhwani, and M. Dubey, "Compression of Breast Cancer Images by Principal Component Analysis," International Journal of Advanced Biological and Biomedical Research, vol. 1, no. 7, 2013, pp. 767-776.
[21] M. Shi and A. Bermak, "An Efficient Digital VLSI Implementation of Gaussian Mixture Models-Based Classifier," IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 14, no. 9, September 2006.
[22] Sezgin M. Survey over image thresholding techniques and quantitative performance evaluation. J Electron Imag 2004;13(1):146–68.
[25] B. Zheng, et al., "Interactive Computer-Aided Diagnosis of Breast Masses: Computerized Selection of Visually Similar Image Sets From a Reference Library", Acad Radiol, vol. 14, (2007), pp. 917–927.
[26] L. Jiang, E. Song, X. Xu, G. Ma and B. Zheng, "Automated Detection of Breast Mass Spiculation Levels and Evaluation of Scheme Performance", Acad Radiol, vol. 15, (2008), pp. 1534–1544.  

[27] B. Surendiran and A. Vadivel, "Feature Selection using Stepwise ANOVA Discriminant Analysis for Mammogram Mass Classification", International J. of Recent Trends in Engineering and Technology, vol. 3, no. 2, (2010), pp. 55-57