COVID-19 Chest X-ray Classification Using fastai
Introduction
Have you ever struggled to build a complicated deep learning model? Nowadays, powerful tools such as PyCaret and AutoGluon let you construct a machine learning model with only a few lines of code. Some data scientists say that these tools offer limited flexibility for adjusting functions and algorithms. The author would therefore like to present fastai, an efficient library for building models that is based on the PyTorch framework.
In this article, the author will walk you through building an image classification model that helps diagnose COVID-19 from chest X-rays.
This article uses the COVID-19 Radiography Database by Tawsifur Rahman and 2 collaborators, winner of the COVID-19 Dataset Award by the Kaggle community.
Importing Essential Libraries
First, install the fastai library on your local machine.
Then import the essential libraries into your notebook.
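As a minimal sketch, the setup might look like the following. It assumes a standard pip-based environment; the availability check and the FASTAI_INSTALLED name are additions for illustration and are not part of fastai.

```python
# Install once from a shell or notebook cell:
#   pip install fastai
import importlib.util

# True when the fastai package can be found in the current environment.
FASTAI_INSTALLED = importlib.util.find_spec("fastai") is not None

if FASTAI_INSTALLED:
    import fastai
    print("fastai version:", fastai.__version__)
    # In a notebook, the vision tools are usually pulled in with:
    #   from fastai.vision.all import *
else:
    print("fastai not found; install it with `pip install fastai`")
```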
Dataloader
Next, create the dataloader object for the image dataset. fastai provides a flexible and convenient API for constructing dataloaders, called ImageDataLoaders.
We can check the images in the dataloader using the show_batch() function.
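The two steps above could be sketched as follows. This assumes the dataset has been extracted into one sub-folder per class, so that the folder name serves as the label. The helper name build_dataloaders, the example path, the 20% validation split, the seed, the 224-pixel resize and the batch size are all illustrative assumptions, not settings taken from the original notebook.

```python
from pathlib import Path


def build_dataloaders(data_dir, valid_pct=0.2, seed=42, img_size=224, batch_size=32):
    """Build train/validation dataloaders from a folder-per-class image tree."""
    # Imported lazily so this module can be loaded without fastai installed.
    from fastai.vision.all import ImageDataLoaders, Resize

    return ImageDataLoaders.from_folder(
        Path(data_dir),
        valid_pct=valid_pct,         # hold out a random share for validation
        seed=seed,                   # make the random split reproducible
        item_tfms=Resize(img_size),  # resize every image to a common size
        bs=batch_size,
    )


# Usage (hypothetical path):
#   dls = build_dataloaders("data/covid19-radiography")
#   dls.show_batch(max_n=9)  # preview a grid of labelled images
```

Fixing the seed keeps the validation split stable between runs, which makes accuracy figures comparable.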
Model Training
The ResNet architecture is widely used for image classification because it is accurate and fast on a variety of datasets, so I decided to use ResNet50, a common choice. fastai provides an API called Learner for building a model with a CNN architecture.
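A sketch of how such a learner could be created is given here. It assumes fastai 2.7 or later, where the factory function is named vision_learner (older releases call it cnn_learner). The helper name build_learner and the choice of accuracy as the metric are assumptions for illustration.

```python
def build_learner(dls):
    """Create a ResNet50-based learner with pretrained weights."""
    # Imported lazily so this module can be loaded without fastai installed.
    from fastai.vision.all import vision_learner, resnet50, accuracy

    # vision_learner replaces the pretrained network's final layers with a
    # new head sized for the classes found in `dls`.
    return vision_learner(dls, resnet50, metrics=accuracy)


# Usage, given dataloaders `dls` built beforehand:
#   learn = build_learner(dls)
```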
The learning rate is a hyperparameter that controls how large a step the optimiser takes when updating the weights to reduce the loss. Data scientists and ML engineers sometimes spend a great deal of time tuning it to get the best accuracy or loss. fastai provides a learning rate finder, described in Sylvain Gugger’s article, that suggests a good initial learning rate for the CNN model, which greatly reduces the need for manual experimentation. To run it, we call the .lr_find() API to plot the graph of loss against learning rate.
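To make the idea concrete, here is a simplified, pure-Python illustration of a learning-rate range test. It is not fastai's implementation: the function names are hypothetical, the loss values in the usage note are made up, and the "steepest slope" rule is only one of several possible heuristics. The default sweep range mirrors fastai's defaults for lr_find (start 1e-7, end 10, 100 iterations).

```python
import math


def lr_sweep(start_lr=1e-7, end_lr=10.0, num_it=100):
    """Exponentially spaced learning rates, one per training mini-batch."""
    ratio = end_lr / start_lr
    return [start_lr * ratio ** (i / (num_it - 1)) for i in range(num_it)]


def steepest_lr(lrs, losses):
    """Return the learning rate at which the loss falls fastest per log(lr)."""
    best_i, best_slope = 0, 0.0
    for i in range(1, len(lrs)):
        slope = (losses[i] - losses[i - 1]) / (math.log(lrs[i]) - math.log(lrs[i - 1]))
        if slope < best_slope:  # more negative means a faster drop in loss
            best_i, best_slope = i, slope
    return lrs[best_i]
```

For example, with rates [1e-4, 1e-3, 1e-2, 1e-1, 1.0] and recorded losses [1.0, 1.0, 0.5, 0.4, 2.0], steepest_lr picks 1e-2, the point just before the loss flattens and then blows up. In fastai itself the whole procedure is a single call, learn.lr_find().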
To train the model, we call fine_tune(epochs, lr), which is analogous to the training call in other ML libraries.
Here, the arguments are as follows:
- epochs: the number of epochs for training the model
- lr: the initial learning rate of this model
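The training call could be wrapped as in the sketch below. The wrapper name train_classifier is hypothetical. In fastai the learning-rate argument of fine_tune is named base_lr, and the 2e-3 default used here is fastai's own default rather than the value used in this article; in practice you would pass the rate suggested by the finder.

```python
def train_classifier(learn, epochs=5, base_lr=2e-3):
    """Fine-tune a pretrained learner and return it."""
    # fine_tune first trains only the new head while the pretrained body is
    # frozen (one epoch by default), then unfreezes the whole network and
    # trains it for `epochs` further epochs.
    learn.fine_tune(epochs, base_lr=base_lr)
    return learn


# Usage, given a fastai learner `learn`:
#   learn = train_classifier(learn, epochs=5, base_lr=1e-3)
```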
In this case, 5 epochs are enough to reach an accuracy of almost 98%.
We can use show_results() to visualise the predictions for each image in one batch.
The predictions are shown below.
Interpretation Result
However, accuracy alone does not reflect every aspect of the model, and the model might overfit some classes if the dataset is imbalanced.
We can also use a confusion matrix to visualise and compare the actual and predicted labels for each class, using plot_confusion_matrix().
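The point about imbalance can be illustrated with a small, pure-Python sketch. The labels below are made-up toy data, not results from this article, and the helper names are hypothetical. The last function shows how the same matrix could be plotted through fastai's ClassificationInterpretation.

```python
from collections import Counter


def confusion_matrix(actual, predicted, classes):
    """Rows are actual classes, columns are predicted classes."""
    counts = Counter(zip(actual, predicted))
    return [[counts[(a, p)] for p in classes] for a in classes]


def per_class_recall(matrix):
    """Fraction of each actual class that was predicted correctly."""
    return [row[i] / sum(row) if sum(row) else 0.0 for i, row in enumerate(matrix)]


def plot_fastai_confusion_matrix(learn):
    """Plot the validation-set confusion matrix of a trained fastai learner."""
    # Imported lazily so the helpers above work without fastai installed.
    from fastai.vision.all import ClassificationInterpretation

    interp = ClassificationInterpretation.from_learner(learn)
    interp.plot_confusion_matrix()
    return interp


# Toy imbalanced example: a model that always answers "Normal".
classes = ["COVID", "Normal"]
actual = ["COVID"] * 2 + ["Normal"] * 8
predicted = ["Normal"] * 10
matrix = confusion_matrix(actual, predicted, classes)
accuracy = sum(matrix[i][i] for i in range(len(classes))) / len(actual)
recall = per_class_recall(matrix)
```

In this toy case the accuracy is 0.8 even though the recall for the COVID class is 0.0, which is exactly the kind of failure a confusion matrix makes visible.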
Class Activation Map (CAM)
Medical experts are often curious about how a deep learning model can be explained and why it predicted a particular class. We can use a Class Activation Map (CAM) to calculate the gradients at the last layer, or at any layer we are interested in, and plot them as a heatmap (the gradient-based variant is usually called Grad-CAM).
Class activation maps are a simple technique to get the discriminative image regions used by a CNN to identify a specific class in the image. In other words, a class activation map (CAM) lets us see which regions in the image were relevant to this class.
In fastai, we can use fastai.callback.hook to hook the last layer of the CNN model and calculate the gradients.
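Once the hooked layer's activations and gradients are available, the heatmap itself is a small computation. The sketch below shows a Grad-CAM-style combination step in pure Python on nested lists; the function name and the list-based representation are assumptions for illustration, and a real notebook would work on PyTorch tensors captured by the hooks.

```python
def grad_cam(activations, gradients):
    """Combine feature maps into a heatmap, Grad-CAM style.

    Both arguments are nested lists indexed as [channel][row][col] and must
    have the same shape.
    """
    n_rows, n_cols = len(activations[0]), len(activations[0][0])
    # Each channel's weight is the spatial mean of its gradient.
    weights = [sum(sum(row) for row in g) / (n_rows * n_cols) for g in gradients]
    heatmap = [[0.0] * n_cols for _ in range(n_rows)]
    for w, act in zip(weights, activations):
        for i in range(n_rows):
            for j in range(n_cols):
                heatmap[i][j] += w * act[i][j]
    # ReLU: keep only regions that push the score of the chosen class up.
    return [[max(v, 0.0) for v in row] for row in heatmap]
```

With fastai, the two inputs could plausibly be collected using hook_output(layer) for the activations and hook_output(layer, grad=True) for the gradients, after which the heatmap is upsampled and overlaid on the X-ray.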
To sum up, the heatmap shows which regions of the image led the model to predict that class.
Conclusion
Finally, fastai is a useful tool for building deep learning models in fields such as NLP, computer vision and time-series classification.
Compared to the classical approach, fastai takes less time and fewer lines of code.
fastai also provides various essential APIs for creating a DataLoader, a Learner (model), a confusion matrix, etc.