Melanoma Skin Cancer Prediction with Pre-trained InceptionV3 Model
Melanoma is a deadly form of skin cancer that can be challenging to diagnose accurately. Early detection of melanoma is crucial for improving patient outcomes. In this blog post, we will explore how to build a melanoma skin cancer prediction model using a pre-trained InceptionV3 model and TensorFlow/Keras. This model can classify skin lesions into two classes: "benign" and "malignant."
Understanding Melanoma
Melanoma is a type of skin cancer that arises from melanocytes, the pigment-producing cells in our skin. It is known for its aggressive nature and high potential for metastasis if not detected early. Early detection and accurate diagnosis are essential to save lives.
Building the Model
Pre-trained InceptionV3
To build our melanoma prediction model, we will leverage transfer learning. Transfer learning allows us to use a pre-trained neural network as a starting point and fine-tune it for our specific task. In this case, we will use the InceptionV3 architecture, pre-trained on the large ImageNet dataset, as the backbone of our model.
InceptionV3 is known for its ability to capture complex features in images, making it suitable for image classification tasks.
Customizing for Binary Classification
The InceptionV3 model was originally designed for multi-class classification tasks. However, we want to perform binary classification: "benign" or "malignant." To adapt the model for this task, we will:
- Remove the top classification layer, which was designed for multi-class output.
- Add a custom classification head. A single neuron with a sigmoid activation is the classic choice for binary output; the code below uses the equivalent two-unit softmax formulation.
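The two steps above can be sketched as follows. This is a minimal sketch, not the full training script: the 224×224 input size matches the rest of the post, the single-sigmoid head shown here is the classic binary formulation (the full code later uses an equivalent two-unit softmax), and `build_binary_model` is a helper name introduced just for this illustration.

```python
from tensorflow.keras.applications import InceptionV3
from tensorflow.keras.layers import Dense, GlobalAveragePooling2D
from tensorflow.keras.models import Model

def build_binary_model(weights='imagenet'):
    # Backbone: InceptionV3 without its 1000-class ImageNet head
    base = InceptionV3(weights=weights, include_top=False,
                       input_shape=(224, 224, 3))
    # New head: pool the convolutional feature maps, then a single
    # sigmoid neuron giving the probability of the positive class
    x = GlobalAveragePooling2D()(base.output)
    out = Dense(1, activation='sigmoid')(x)  # P(malignant)
    return Model(inputs=base.input, outputs=out)
```

With this head you would train using `class_mode='binary'` and a `binary_crossentropy` loss; with the two-unit softmax head used later, `class_mode='categorical'` and `categorical_crossentropy` are the matching choices.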
Data Preparation
To train our model, we need a dataset of skin lesion images labeled as "benign" or "malignant." You can collect such a dataset or use publicly available dermatology datasets. Ensure that the data is well-labeled and properly split into training and validation sets.
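Keras's `flow_from_directory` infers labels from sub-folder names, so before training it is worth confirming the dataset follows the one-folder-per-class layout. A small helper (hypothetical, written for this post) performs that check:

```python
from pathlib import Path

def check_dataset_layout(data_dir, classes=("benign", "malignant")):
    """Verify the one-folder-per-class layout that Keras's
    flow_from_directory expects, e.g.:
        data_dir/
            benign/      *.jpg
            malignant/   *.jpg
    Returns a dict mapping each class name to its image count."""
    counts = {}
    for cls in classes:
        cls_dir = Path(data_dir) / cls
        if not cls_dir.is_dir():
            raise FileNotFoundError(f"missing class folder: {cls_dir}")
        counts[cls] = sum(
            1 for p in cls_dir.iterdir()
            if p.suffix.lower() in {".jpg", ".jpeg", ".png"}
        )
    return counts
```

Running it on the training directory also gives a quick look at class balance, which matters when choosing evaluation metrics later.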
Model Training
With the InceptionV3 model customized for binary classification and our dataset prepared, we can proceed with training. During training, we will:
- Augment the data to increase its diversity and improve model generalization.
- Freeze the pre-trained layers to retain the knowledge they have learned.
- Compile the model with an appropriate optimizer and loss function.
- Train the model for a suitable number of epochs while monitoring validation performance.
Code
```python
# Download and unpack the Kaggle dataset (Colab shell commands)
!mkdir -p ~/.kaggle
!cp kaggle.json ~/.kaggle/
!kaggle datasets download -d hasnainjaved/melanoma-skin-cancer-dataset-of-10000-images

import zipfile

zip_ref = zipfile.ZipFile('/content/melanoma-skin-cancer-dataset-of-10000-images.zip', 'r')
zip_ref.extractall('/content')
zip_ref.close()

import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import tensorflow as tf
from sklearn.metrics import confusion_matrix
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.applications import InceptionV3
from tensorflow.keras.layers import Dense, GlobalAveragePooling2D
from tensorflow.keras.models import Model
from tensorflow.keras.optimizers import Adam

# Dataset directories
data_dir = '/content/melanoma_cancer_dataset/train'
test_dir = '/content/melanoma_cancer_dataset/test'

# Define the image size, batch size, and number of classes
img_size = (224, 224)
batch_size = 32
num_classes = 2

# Data augmentation and preprocessing
train_datagen = ImageDataGenerator(
    rescale=1.0 / 255,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True,
    validation_split=0.2
)

train_generator = train_datagen.flow_from_directory(
    data_dir,
    target_size=img_size,
    batch_size=batch_size,
    class_mode='categorical',
    subset='training'
)

validation_generator = train_datagen.flow_from_directory(
    data_dir,
    target_size=img_size,
    batch_size=batch_size,
    class_mode='categorical',
    subset='validation'
)

# Test data: rescaling only, no augmentation, and shuffle=False so
# predictions stay aligned with test_generator.classes
test_datagen = ImageDataGenerator(rescale=1.0 / 255)
test_generator = test_datagen.flow_from_directory(
    test_dir,
    target_size=img_size,
    batch_size=batch_size,
    class_mode='categorical',
    shuffle=False
)

# Load the InceptionV3 model without its top classification layers
base_model = InceptionV3(weights='imagenet', include_top=False)

# Add custom classification layers
x = base_model.output
x = GlobalAveragePooling2D()(x)
x = Dense(1024, activation='relu')(x)
predictions = Dense(num_classes, activation='softmax')(x)

# Create the model
model = Model(inputs=base_model.input, outputs=predictions)

# Freeze the base model layers
for layer in base_model.layers:
    layer.trainable = False

# Compile the model
model.compile(optimizer=Adam(learning_rate=0.001),
              loss='categorical_crossentropy',
              metrics=['accuracy'])

# Train the model
epochs = 15
history = model.fit(
    train_generator,
    validation_data=validation_generator,
    epochs=epochs
)

# Plot training and validation accuracy
plt.figure(figsize=(12, 6))
plt.subplot(1, 2, 1)
plt.plot(history.history['accuracy'], label='Training Accuracy')
plt.plot(history.history['val_accuracy'], label='Validation Accuracy')
plt.xlabel('Epochs')
plt.ylabel('Accuracy')
plt.legend(loc='lower right')
plt.title('Training and Validation Accuracy')

# Plot training and validation loss
plt.subplot(1, 2, 2)
plt.plot(history.history['loss'], label='Training Loss')
plt.plot(history.history['val_loss'], label='Validation Loss')
plt.xlabel('Epochs')
plt.ylabel('Loss')
plt.legend(loc='upper right')
plt.title('Training and Validation Loss')
plt.tight_layout()
plt.show()

# Evaluate the model on the test set
test_loss, test_acc = model.evaluate(test_generator, verbose=2)
print(f"Test accuracy: {test_acc * 100:.2f}%")

# Get true labels and predicted labels
true_labels = test_generator.classes
predicted_probs = model.predict(test_generator)
predicted_labels = np.argmax(predicted_probs, axis=1)
print(test_generator.class_indices)

# Compute and display the confusion matrix
conf_matrix = confusion_matrix(true_labels, predicted_labels)
plt.figure(figsize=(8, 6))
sns.heatmap(conf_matrix, annot=True, fmt='d', cmap='Blues',
            xticklabels=test_generator.class_indices.keys(),
            yticklabels=test_generator.class_indices.keys())
plt.xlabel('Predicted labels')
plt.ylabel('True labels')
plt.title('Confusion Matrix')
plt.show()

# Predict a single image
from tensorflow.keras.preprocessing import image

def predictImage(filename):
    img1 = image.load_img(filename, target_size=(224, 224))
    plt.imshow(img1)
    Y = image.img_to_array(img1)
    X = np.expand_dims(Y, axis=0) / 255.0  # same rescaling as training
    val = model.predict(X)
    print(val)
    # Index of the class with the highest probability
    predicted_class = np.argmax(val, axis=1)[0]
    print(predicted_class)
    if predicted_class == 1:
        plt.xlabel("malignant", fontsize=30)
    elif predicted_class == 0:
        plt.xlabel("benign", fontsize=30)

predictImage('/content/melanoma_cancer_dataset/test/malignant/melanoma_10111.jpg')

# Save the trained model
model.save('inceptionv3_melanoma_model.h5')
```
Model Evaluation
After training, we need to evaluate our model's performance. Key evaluation metrics for binary classification tasks include accuracy, precision, recall, F1-score, and ROC-AUC. These metrics will help us assess how well our model can distinguish between benign and malignant skin lesions.
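These metrics can be computed with scikit-learn from the true labels and the model's predicted probability of the malignant class. This is a sketch: `binary_metrics` is a helper written for this post, the inputs are assumed to come from the test generator and `model.predict` as in the code above, and the 0.5 threshold is a default, not a tuned choice.

```python
import numpy as np
from sklearn.metrics import (accuracy_score, precision_score, recall_score,
                             f1_score, roc_auc_score)

def binary_metrics(true_labels, malignant_probs, threshold=0.5):
    """Standard binary-classification metrics, treating 'malignant'
    (label 1) as the positive class."""
    preds = (np.asarray(malignant_probs) >= threshold).astype(int)
    return {
        "accuracy": accuracy_score(true_labels, preds),
        "precision": precision_score(true_labels, preds),
        "recall": recall_score(true_labels, preds),
        "f1": f1_score(true_labels, preds),
        # ROC-AUC is threshold-free, so it uses the raw probabilities
        "roc_auc": roc_auc_score(true_labels, malignant_probs),
    }
```

In a screening setting, recall on the malignant class usually matters more than raw accuracy, since a missed melanoma is far more costly than a false alarm.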
Conclusion
In this blog post, we explored the process of building a melanoma skin cancer prediction model using a pre-trained InceptionV3 model. Early detection of melanoma can significantly impact patient outcomes, and machine learning models like the one we built can assist dermatologists in making accurate diagnoses.
Remember that this is just a starting point, and there are many ways to improve and fine-tune the model further. Additionally, always consult with medical professionals for clinical validation before using such models in real-world healthcare applications.
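One common improvement is a second fine-tuning phase: once the new head has converged, unfreeze the top of the backbone and keep training at a much lower learning rate. A minimal sketch under stated assumptions: the model is rebuilt exactly as in the training code (with `weights=None` here only to keep the sketch light; in practice you would reuse the trained model), and the layer-249 cut-off follows the Keras InceptionV3 example of unfreezing the top two inception blocks.

```python
from tensorflow.keras.applications import InceptionV3
from tensorflow.keras.layers import Dense, GlobalAveragePooling2D
from tensorflow.keras.models import Model
from tensorflow.keras.optimizers import Adam

# Same architecture as the training code above
base_model = InceptionV3(weights=None, include_top=False,
                         input_shape=(224, 224, 3))
x = GlobalAveragePooling2D()(base_model.output)
x = Dense(1024, activation='relu')(x)
predictions = Dense(2, activation='softmax')(x)
model = Model(inputs=base_model.input, outputs=predictions)

# Fine-tuning: unfreeze only the top two inception blocks
# (from layer 249 onward); keep the earlier layers frozen.
for layer in base_model.layers[:249]:
    layer.trainable = False
for layer in base_model.layers[249:]:
    layer.trainable = True

# Recompile with a much smaller learning rate so the pre-trained
# features are adjusted gently rather than overwritten.
model.compile(optimizer=Adam(learning_rate=1e-5),
              loss='categorical_crossentropy',
              metrics=['accuracy'])

# Then continue training, e.g.:
# model.fit(train_generator, validation_data=validation_generator, epochs=5)
```

Recompiling is required after changing `trainable` flags; without it, the freeze/unfreeze changes take no effect during training.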
