Diagnosing Polycystic Ovary Syndrome (PCOS) Using an Enhanced Transfer Learning Model

 

   
 
 
Seyed Youssef Heydari (2)    Mohammed Hayyan Alsibai (1)    
Department of Computer and Informatics Engineering, International University of Science and Technology (IUST), Daraa, Syria (1)(2)
mhdhayyan@gmail.com(1), yousef@alhaidary.net(2)
 
 
 
Abstract
Polycystic ovary syndrome (PCOS) is one of the most pressing health concerns for women today. Deep learning has proven its potential and the vital role it can play in diagnosing ultrasonography images. In this study, a pre-trained DenseNet201 model was fine-tuned to detect infected ovaries. The proposed method achieved an accuracy of roughly 70% on the test set. Although this accuracy is not high, it is still a promising result. The study highlights the potential of transfer learning to improve the performance of convolutional neural networks (CNNs) in medical imaging tasks. The model used was pre-trained on the ImageNet dataset; although that dataset appears unrelated to medical images, the results show that transfer learning is a viable option for medical datasets. The main benefit of this methodology is that it reduces the need for a large dataset as well as the training time and memory cost.
 
 
Keywords: Convolutional Neural Networks (CNNs), Transfer learning, Deep Learning, Polycystic ovary syndrome (PCOS), Medical Ultrasound Image Processing, Classification  
   
   
1. Introduction  
Polycystic ovary syndrome (PCOS) is one of the most prevalent conditions affecting women of reproductive age, affecting 6%–20% of premenopausal women globally (1). Ovarian dysfunction and an excess of androgen are its two main features. Menstrual abnormalities, hirsutism, obesity, insulin resistance, and cardiovascular disease, in addition to emotional symptoms such as depression (2), are all common among PCOS patients (3). Accurate diagnosis and treatment of PCOS are therefore crucial. In 1985, Adams et al. (4) discovered that polycystic ovaries have an abnormally high number of follicles, also termed multifollicularity. At an expert meeting in Rotterdam in 2003, it was suggested that PCOS be diagnosed when at least two of the following three features are present: (i) oligo- or anovulation, (ii) clinical and/or biochemical hyperandrogenism, or (iii) polycystic ovaries (5). The latter can be detected using ultrasonography, which offers the highest contribution to the diagnosis of PCOS (6). Early detection of PCOS is important because it can help manage the symptoms and reduce the risk of long-term health issues.  
   
2. Literature Review  
Although ultrasound images suffer from strong artifacts, noise, and a high dependence on the experience of doctors, ultrasonography is still one of the most widely used modalities in medical diagnosis. Many artificial intelligence systems have been developed to help doctors. Convolutional Neural Networks (CNNs) and deep learning have achieved great success in computer vision thanks to their unique advantages (7), and many diseases are diagnosed using different deep learning models (7). Examples include the detection of COVID-19 in lung ultrasound imagery with 89.1% accuracy using the InceptionV3 network (8), the use of deep learning architectures for the segmentation of the left ventricle of the heart (9), and the classification of breast lesions in ultrasound images with an accuracy of 90%, sensitivity of 86%, and specificity of 96% using the GoogLeNet CNN (10). It is clear that deep learning has proven its potential and the vital role it can play in assisting practitioners who use ultrasonography as a diagnostic tool. This study discusses the potential of deep learning in diagnosing PCOS. Since AI and deep learning algorithms can quickly and reliably assess vast volumes of data, they can be used to diagnose PCOS in ultrasound scans by identifying patterns and traits that are indicative of the condition. This can increase the speed and accuracy of diagnosis compared with manual analysis. Overall, the use of AI and deep learning for the detection of PCOS in ultrasound images has the potential to improve the accuracy, efficiency, and accessibility of healthcare. This was the motive to tackle such an important health issue, which affects millions of women worldwide, and to apply the potential of deep learning to help address this problem.  
3. Methodology  
3.1 Dataset Description  
Obtaining a viable and correctly labelled ultrasound dataset for this task is difficult and time-consuming: annotating medical images requires significant professional medical knowledge, which makes annotation expensive and rare, and the ethical issues and sensitivity of such data pose further problems. Resorting to a publicly available dataset therefore greatly accelerated the work on this project. Figure 1 shows a screenshot of the publicly available dataset, which is published by Telkom University Dataverse (11) and is reportedly annotated by specialist doctors. It consists of 54 ultrasound images, 14 of which are classified as PCOS while the rest are normal.
Figure 2 shows a sample from this dataset.
 
   
 
Figure 1: Screenshot of the publicly available PCOS dataset by Telkom University Dataverse  
   
 
Figure 2: Samples of infected and healthy ovaries in the dataset  
Since the dataset contains a relatively small number of ultrasound images, data augmentation techniques such as random horizontal flipping, random vertical flipping, random brightness alteration, and random rotation are applied using the Python library `imgaug`. Data augmentation increases the number of ultrasound images, which considerably improves the training process. After augmentation, the total number of images in the dataset grew from 54 to 1795. These ultrasound images vary in size and dimensions, so they need to be preprocessed to make them uniform in this respect.  
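As an illustration, the augmentation pipeline can be sketched with `imgaug` roughly as follows. The flip probabilities, rotation range, brightness factors, and file name are assumptions, since only the augmentation types are named above.

```python
import imageio.v2 as imageio
import imgaug.augmenters as iaa

# Augmentation types named above; the exact parameters are assumed for illustration.
seq = iaa.Sequential([
    iaa.Fliplr(0.5),               # random horizontal flip
    iaa.Flipud(0.5),               # random vertical flip
    iaa.Multiply((0.8, 1.2)),      # random brightness alteration
    iaa.Affine(rotate=(-20, 20)),  # random rotation
])

image = imageio.imread("ovary_ultrasound.png")  # hypothetical file name
# Generating several dozen augmented copies per original image grows
# the 54 images to roughly the 1795 images reported above.
augmented = [seq(image=image) for _ in range(32)]
```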
3.2 Data Pre-Processing  
The preprocessing pipeline includes resizing all images to the same size of (224, 224); rescaling pixel values from the range 0 to 255 to the range 0 to 1, since this makes computation more efficient in the PyTorch framework, which is the framework of choice; and converting all images to grayscale, which removes unnecessary colour channels where they exist so that the resulting image has only one colour channel. Next, these images and their corresponding labels are grouped into batches of 32 images each.  
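A minimal sketch of this preprocessing in PyTorch/torchvision is given below. The directory layout (class-named subfolders under hypothetical `data/train` and `data/test` folders) and the use of `ImageFolder` are assumptions, as the file organisation is not described above.

```python
import torch
from torchvision import datasets, transforms

preprocess = transforms.Compose([
    transforms.Grayscale(num_output_channels=1),  # keep a single colour channel
    transforms.Resize((224, 224)),                # normalise all images to the same size
    transforms.ToTensor(),                        # rescales pixel values from [0, 255] to [0, 1]
])

# Hypothetical layout: one subfolder per class (e.g. infected/, not_infected/).
train_set = datasets.ImageFolder("data/train", transform=preprocess)
test_set = datasets.ImageFolder("data/test", transform=preprocess)

train_loader = torch.utils.data.DataLoader(train_set, batch_size=32, shuffle=True)
test_loader = torch.utils.data.DataLoader(test_set, batch_size=32, shuffle=False)
```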
4. Model Initial Fine-Tuning  
To train on this dataset, the DenseNet architecture (12), specifically DenseNet201, was chosen. DenseNet is one of the leading architectures used in ultrasonography analysis when conducting transfer learning, as Table 3 in (13) shows. It is a type of convolutional neural network (CNN) that was trained on the ImageNet dataset. DenseNet is a variation of the traditional CNN architecture that uses dense blocks, where each layer in a block is connected to every other layer in the block. This allows the network to pass information from earlier layers to later layers more efficiently, which can improve performance on tasks such as image classification. DenseNet201 is a variant of DenseNet with 201 layers. The structure is shown in Figure 3.
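For reference, the pre-trained model can be loaded from torchvision and its input and output layers inspected. This is a sketch under the assumption that torchvision's `densenet201` is used (newer versions take a `weights` argument, older ones `pretrained=True`); these two layers are exactly the ones adjusted in the two fine-tuning stages below.

```python
from torchvision import models

# Load DenseNet201 with its ImageNet pre-trained weights.
model = models.densenet201(weights=models.DenseNet201_Weights.IMAGENET1K_V1)

print(model.features.conv0)  # Conv2d(3, 64, kernel_size=(7, 7), ...): expects 3-channel input
print(model.classifier)      # Linear(in_features=1920, out_features=1000): 1000 ImageNet classes
```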
Transfer learning is used to import the DenseNet201 model and fine-tune it for the dataset at hand. This fine-tuning involves two stages:
1. Adjusting the very first layer so that it accepts grayscale images composed of a single colour channel, as opposed to the three colour channels of the ImageNet images on which DenseNet201 was trained.
 
   
 
Figure 3: An ultrasound image of size (224, 224, 1) given as input to the DenseNet model, which uses its weights and architecture to make a prediction  
2. The ImageNet dataset consists of 1000 classes, so the DenseNet201 model has 1000 corresponding outputs, one output probability per class. The dataset used in this paper consists of only 2 classes. Thus, the very last layer must be adjusted to output a single probability, which is rounded to 0 if the value is below 0.5, indicating that the image is `infected`, or to 1 if the value is above 0.5, signifying that the image is `not infected`.  
After carrying out the above steps, the learning rate is set to 0.01, the optimizer is set to SGD, and the model is trained for 15 epochs. Moreover, the model had its pre-trained weights frozen except for its final Linear layer, an approach recommended in (13) for ImageNet pre-trained models. A sketch of this initial setup is given below.  
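The sketch combines the two fine-tuning stages with the training configuration described above (SGD, learning rate 0.01, 15 epochs, all pre-trained weights frozen except the final Linear layer). The loss function and the structure of the training loop are assumptions, since they are not specified above; `train_loader` is the loader from the preprocessing sketch.

```python
import torch
from torch import nn, optim
from torchvision import models

model = models.densenet201(weights=models.DenseNet201_Weights.IMAGENET1K_V1)

# Freeze all pre-trained weights.
for param in model.parameters():
    param.requires_grad = False

# Stage 1: replace the first convolution so the network accepts 1-channel (grayscale) input.
# In this initial experiment the whole layer is replaced, i.e. its pre-trained weights are discarded.
model.features.conv0 = nn.Conv2d(1, 64, kernel_size=7, stride=2, padding=3, bias=False)

# Stage 2: replace the 1000-class head with a single output for the binary decision.
model.classifier = nn.Linear(model.classifier.in_features, 1)

# Only the final Linear layer is optimised, per the freezing scheme described above.
optimizer = optim.SGD(model.classifier.parameters(), lr=0.01)
criterion = nn.BCEWithLogitsLoss()  # assumed loss; applies the sigmoid internally

for epoch in range(15):
    model.train()
    for images, labels in train_loader:
        optimizer.zero_grad()
        logits = model(images).squeeze(1)
        loss = criterion(logits, labels.float())
        loss.backward()
        optimizer.step()

# At evaluation time the output probability is rounded at 0.5, as described in stage 2:
# predictions = (torch.sigmoid(model(images)).squeeze(1) > 0.5).long()
```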
4.1. Preliminary Result Analysis  
After training the model for 15 epochs on the dataset, the following results and confusion matrix were obtained:  
   
 
 
Figure 4: Results of accuracy and loss during training epochs.  
   
 
Figure 5: Confusion Matrix of the test set.  
Figures 4 and 5 show the accuracy and loss during training, in addition to the confusion matrix on the test set. As Figure 4 shows, the training accuracy increases steadily until it reaches 83.33%, while the test accuracy remains relatively unchanged at 62.92%. The same can be observed for the training and test loss: the training loss decreases steadily, but the test loss does not. This is a clear indication of overfitting and of the model's inability to generalize to unseen data. The poor performance is also confirmed by the confusion matrix, where the number of true positives and true negatives is unsatisfactory. Table 1 shows the precision, recall, and F1-score for the infected data points in the dataset:  
Table 1: Precision, recall, and F1-score for the infected class.  
 
It is clear that the model's performance is unsatisfactory and needs to be improved.  
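For reference, the confusion matrix and the per-class metrics reported here can be derived from the test-set predictions with scikit-learn. This sketch assumes label 0 for `infected`, as in the rounding scheme above; the `y_true`/`y_pred` arrays are placeholder values, not the study's actual outputs.

```python
from sklearn.metrics import confusion_matrix, precision_recall_fscore_support

# Placeholder labels; in practice these come from the test set and the model's rounded outputs.
y_true = [0, 0, 1, 1, 1, 0]
y_pred = [0, 1, 1, 1, 0, 0]

print(confusion_matrix(y_true, y_pred))

# Metrics for the `infected` class (label 0 under the rounding scheme above).
precision, recall, f1, _ = precision_recall_fscore_support(
    y_true, y_pred, average="binary", pos_label=0
)
print(precision, recall, f1)
```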
   
5. Performance Enhancement  
Achieving poor results on the dataset means that further tuning and enhancement are needed. Experimentation is crucial in deep learning because it allows different model architectures, hyperparameters, and data pre-processing techniques to be tested and fine-tuned. This process helps identify the most effective combination of these elements for a given task and can lead to significant improvements in model performance. Therefore, three modifications were carried out. First, the data augmentation applied to the images was slightly adjusted: vertical flipping was dropped, the range of random rotation angles was reduced, the brightness alteration was slightly reduced, and cropping was removed. These changes are expected to produce artificial images that better represent the target classes.
Secondly, the number of data points in both the training and test sets was reduced to almost half of the dataset used previously. Finally, the way the first layer is fine-tuned was modified. Previously, the entire first layer was replaced with a new layer that accepts single-channel input. The problem with this approach is that replacing the whole layer also discards the features the model learned in that layer: the new layer has the same number of parameters, but they are initialized randomly instead of inheriting the pre-trained values. Following Hughes' notes in “Towards Data Science” (14), the first layer can instead be adjusted without discarding it entirely, by adapting only its number of input channels; a sketch of this adaptation is given below. The same goal of accepting single-channel images is thus achieved more efficiently. After these changes were carried out, the model was trained with the remaining hyperparameters unchanged, and much better results were obtained, with an accuracy of roughly 70% on the test set. The resulting confusion matrix is shown in Figure 6.
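A sketch of this channel adaptation, following the idea in (14), is shown below: instead of replacing `conv0`, its pre-trained RGB filters are collapsed over the input-channel dimension so the layer accepts single-channel images while keeping its learned spatial patterns. The exact aggregation (summing rather than averaging the channels) is an assumption.

```python
import torch
from torch import nn
from torchvision import models

model = models.densenet201(weights=models.DenseNet201_Weights.IMAGENET1K_V1)

pretrained_conv0 = model.features.conv0  # Conv2d(3, 64, ...), weight shape (64, 3, 7, 7)
new_conv0 = nn.Conv2d(1, 64, kernel_size=7, stride=2, padding=3, bias=False)

with torch.no_grad():
    # Collapse the three input-channel filters into one, keeping the learned spatial patterns.
    new_conv0.weight.copy_(pretrained_conv0.weight.sum(dim=1, keepdim=True))

model.features.conv0 = new_conv0  # the network now accepts (224, 224, 1) grayscale input
```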
 
   
 
Figure 6: Confusion matrix of the test set in the new experiment  
Table 2 shows the precision, recall, and F1-score for the infected class of the test set.  
Table 2: Precision, recall, and F1-score for the infected class on the test set.  
 
As the confusion matrix shows, a considerable improvement has been achieved and the results are more satisfactory. Although they are not perfect, they are still promising. We can therefore conclude that transfer learning has the potential to improve the performance of CNNs in medical imaging tasks.  
   
6. Conclusion  
A comprehensive examination of the use of CNNs and transfer learning for classifying ovarian ultrasound images was presented. The main objective of this study was to investigate the potential of using a pre-trained DenseNet201 model to classify ovarian ultrasound images into healthy and infected cases. The dataset used in this study consisted of healthy ovaries and ovaries with PCOS, and the fine-tuned model was trained and evaluated on it. The results indicate that the proposed method achieved an accuracy of roughly 70% on the test set. This level of accuracy suggests that transfer learning can be a valuable tool for improving the performance of CNNs in medical imaging tasks. The proposed method is a promising approach for classifying ovarian ultrasound images and has the potential to be applied to other medical imaging tasks. Furthermore, this study highlights the importance of the pre-processing choices in achieving better results.  
   
References  
1. Escobar-Morreale HF. Polycystic ovary syndrome: definition, aetiology, diagnosis and treatment. Nat Rev Endocrinol. 2018; 14(5): 270-284.
2. Farkas J, Rigó A, Demetrovics Z. Psychological aspects of the polycystic ovary syndrome. Gynecol Endocrinol. 2013; 30(2).
3. Louwers YV, Laven JSE. Characteristics of polycystic ovary syndrome throughout life. Therapeutic Advances in Reproductive Health. 2020; 14.
4. Adams J, et al. Prevalence of polycystic ovaries in women with anovulation and idiopathic hirsutism. British Medical Journal. 1986; 293(6543): 355-359.
5. Azziz R. PCOS: a diagnostic challenge. Reproductive BioMedicine Online. 2004; 8(6): 644-648.
6. Battaglia C, et al. Ultrasound evaluation of PCO, PCOS and OHSS. Reproductive BioMedicine Online. 2004; 9(6): 614-619.
7. Wang Y, Ge X, Ma H, Qi S, Zhang G, Yao Y. Deep Learning in Medical Ultrasound Image Analysis: A Review. IEEE Access. 2021; 9: 54310-54324.
8. Diaz-Escobar J, et al. Deep-learning based detection of COVID-19 using lung ultrasound imagery. PLoS ONE. 2021; 16(8): e0255886.
9. Carneiro G, Nascimento JC, Freitas A. The Segmentation of the Left Ventricle of the Heart From Ultrasound Data Using Deep Learning Architectures and Derivative-Based Search Methods. IEEE Transactions on Image Processing. 2012; 21(3): 968-982.
10. Han S, et al. A deep learning framework for supporting the classification of breast lesions in ultrasound images. Physics in Medicine and Biology. 2017; 62(19): 7714-7728.
11. Adiwijaya, Novia Wisesty U, Astuti W. Polycystic Ovary Ultrasound Images Dataset. [Online]. Telkom University Dataverse; 2021. Available from: https://doi.org/10.34820/FK2/QVCP6V.
12. Huang G, Liu Z, van der Maaten L, Weinberger KQ. Densely Connected Convolutional Networks. 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). 2017: 2261-2269.
13. Morid MA, Borjali A, Del Fiol G. A scoping review of transfer learning research on medical image analysis using ImageNet. Computers in Biology and Medicine. 2020; 128: 104115.
14. Hughes C. Transfer Learning on Greyscale Images: How to Fine-Tune Pretrained Models on Black-and-White Datasets. Towards Data Science. [Online]. Available from: https://towardsdatascience.com/transfer-learning-on-greyscale-images-how-to-fine-tune-pretrained-models-on-black-and-white-9a5150755c7a.