Dataset description
The dataset used in this study is available at https://www.cs.uoi.gr/~marina/sipakmed.html. It contains five different cell types, as detailed in Ref.24. For this work, we transformed the dataset into a two-class problem with the categories normal and abnormal: the normal category comprises the superficial-intermediate and parabasal cells, while the abnormal category covers the koilocytotic, dyskeratotic, and metaplastic cell types (Ref.25). The essential dataset characteristics are summarized in Table 2. The SIPaKMeD dataset comprises a total of 4068 images, with 3254 allocated for training (80% of the total) and 813 set aside for testing (20%). It consists of two classes: 1618 normal images and 2450 abnormal images. Fig. 2 provides visual examples from the two categories.
The existing literature extensively covers screening methods for cervical cancer, such as the Pap smear, colposcopy, and HPV testing, emphasizing the importance of early detection. However, a significant gap exists in automated screening systems based on Pap smear images. Traditional methods rely on expert interpretation, whereas integrating deep learning (DL) and machine learning (ML) offers the potential for intelligent automation. Despite this potential, few studies focus on developing and evaluating such systems specifically for cervical cancer prediction from Pap smear images. This research addresses that gap by proposing a methodology that uses pre-trained deep neural network models for feature extraction and applies several ML algorithms for prediction, with the aim of advancing automated screening and improving early detection and patient outcomes.
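As a concrete illustration of the 80/20 partition described above, the following sketch performs a stratified split with scikit-learn. It is only a sketch: the folder layout (`normal/`, `abnormal/`), the dataset root, and the `.bmp` extension are assumptions about how the downloaded images are organized, not part of the original pipeline.

```python
# Minimal sketch of the 80/20 stratified split described above.
# The folder layout (normal/, abnormal/), root path, and file extension are assumptions.
from pathlib import Path
from sklearn.model_selection import train_test_split

root = Path("sipakmed_binary")            # hypothetical dataset root
paths, labels = [], []
for label, cls in enumerate(["normal", "abnormal"]):
    for p in (root / cls).glob("*.bmp"):  # assumed image extension
        paths.append(p)
        labels.append(label)

# A stratified 80/20 split keeps the normal/abnormal ratio in both subsets.
train_paths, test_paths, y_train, y_test = train_test_split(
    paths, labels, test_size=0.20, stratify=labels, random_state=42
)
print(len(train_paths), "training images,", len(test_paths), "test images")
```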
Proposed model for cervical cancer classification
The schematic representation of our proposed system is shown in Fig. 2. For the cervical cancer classification task we employ the SIPaKMeD dataset of Pap smear images, organized into two groups, abnormal and normal, with 80% of the images used for training and 20% for testing, as described above. We first extract feature sets from well-established pretrained CNN architectures, namely AlexNet, ResNet-101, ResNet-152, and Inception-V3, using the activation values of their final layers. To classify images into the normal and abnormal categories, we apply a variety of machine learning techniques, including Simple Logistic, Decision Tree, Random Forest, Naive Bayes, and Principal Component Analysis. Our approach is therefore a hybrid strategy that merges DL and ML: the DL backbone captures the intricate, complex features inherent in the data, while the ML stage provides the flexibility to handle diverse scenarios. By harnessing the last layer of the pretrained models for feature extraction, we enable different machine learning algorithms to classify the data based on these extracted attributes, which enhances the system's ability to categorize cervical cancer cases effectively.
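To make the hybrid strategy concrete, the sketch below extracts last-layer (pooled) features from a pretrained ResNet-101 and hands them to a classical classifier. It is a minimal illustration under stated assumptions, not the exact implementation: a recent torchvision with the `weights` API, the standard ImageNet preprocessing values, and logistic regression as the downstream classifier are all choices made here for brevity.

```python
# Minimal sketch of the hybrid DL + ML pipeline (assumptions noted above).
import torch
import torchvision.models as models
import torchvision.transforms as T
from PIL import Image

device = "cuda" if torch.cuda.is_available() else "cpu"

# Pretrained backbone; replacing the classification head with Identity
# exposes the 2048-d pooled features of the last layer.
backbone = models.resnet101(weights=models.ResNet101_Weights.IMAGENET1K_V1)
backbone.fc = torch.nn.Identity()
backbone.eval().to(device)

preprocess = T.Compose([
    T.Resize((224, 224)),
    T.ToTensor(),
    T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

@torch.no_grad()
def extract_features(image_paths):
    feats = []
    for p in image_paths:
        x = preprocess(Image.open(p).convert("RGB")).unsqueeze(0).to(device)
        feats.append(backbone(x).squeeze(0).cpu().numpy())
    return feats

# With train/test image paths and labels prepared as in the dataset split:
# from sklearn.linear_model import LogisticRegression
# X_train, X_test = extract_features(train_paths), extract_features(test_paths)
# clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)
# print("test accuracy:", clf.score(X_test, y_test))
```

Swapping the backbone (e.g. Inception-V3 or AlexNet) or the downstream classifier changes only the corresponding lines, which is the flexibility the hybrid design relies on.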
A pre-trained model has already been trained on a much larger dataset, acquiring weights and biases that encapsulate that dataset's characteristics, and it can be employed directly for making predictions on new data. The learned features transfer to other datasets because certain fundamental, abstract properties remain consistent across different types of images. Using pre-trained models therefore saves significant time and effort, since a substantial part of the feature extraction work has already been done. Noteworthy examples include ResNet-152, ResNet-101, Inception-V3, and AlexNet, which are summarized in Table 3 for reference.
The image classification framework based on ResNet-101 consists of two main parts: feature extraction and feature classification. Fig. 3 shows how the feature extractor is built, comprising five main convolution modules with a total of one hundred convolution layers, an average pooling layer, and a fully connected layer (Ref.26). Once the features are extracted, they are used to train a classifier with a Softmax output. Table 4 lists the convolution layers and their configurations in the ResNet-101 backbone. By using shortcut connections to increase data dimensions, ResNet-101 significantly improves performance as convolutional depth increases; these shortcut connections also counteract the degradation caused by network depth by enabling identity mapping. For most binary classification tasks, the loss is computed with the binary (logistic) cross-entropy function shown in Eq. (1).
$$k_{(h_l,\; q_l)}^{b} = - f_l \log \left( q_l \right) - \left( 1 - f_l \right) \log \left( 1 - q_l \right)$$
(1)
where \(f_l\) and \(q_l\) denote, respectively, the ground-truth label and the predicted value for the \(l\)th training sample. The resulting loss value, \(k_{(h_l,\; q_l)}^{b}\), is backpropagated through the CNN, and the model parameters (weights and biases) are gradually optimised at each epoch. This process continues until the loss is minimised and the CNN model converges to a solution.
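Read literally, Eq. (1) is a per-sample function of the ground truth and the prediction; the short sketch below evaluates it numerically. The clipping constant is an implementation detail added here to avoid log(0) and is not part of the equation.

```python
import numpy as np

def binary_cross_entropy(f, q, eps=1e-12):
    """Eq. (1): loss for ground truth f in {0, 1} and prediction q in (0, 1)."""
    q = np.clip(q, eps, 1.0 - eps)  # guard against log(0)
    return -f * np.log(q) - (1.0 - f) * np.log(1.0 - q)

# A confident correct prediction is penalised lightly, a confident wrong one heavily.
print(binary_cross_entropy(1, 0.9))  # ~0.105
print(binary_cross_entropy(1, 0.1))  # ~2.303
```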
The ResNet architecture is efficient, enabling the training of very deep neural networks (DNNs) while enhancing accuracy. It addresses the accuracy degradation associated with increasing network depth: when depth is simply increased, accuracy often drops, whereas deeper residual networks can keep improving rather than saturating the way shallow networks do (Ref.27). The key idea is that information from one layer should flow easily to the next with the help of identity mapping. ResNet tackles the degradation problem, along with the vanishing-gradient issue, using residual blocks, which compute a residual on top of the block's input before producing its output. Fig. 4 illustrates the architecture of ResNet-152, and Table 5 lists its configuration.
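The residual idea, output = F(x) + x, can be written in a few lines; the block below is a generic sketch of a residual unit with an identity shortcut, assuming PyTorch, and is not the exact bottleneck configuration listed in Table 5.

```python
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    """y = F(x) + x: the block learns a residual F(x) on top of the identity shortcut."""
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        identity = x                          # shortcut connection
        out = self.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        return self.relu(out + identity)      # add the identity back before activation

block = ResidualBlock(64)
print(block(torch.randn(1, 64, 56, 56)).shape)  # torch.Size([1, 64, 56, 56])
```

Because the shortcut carries x through unchanged, the gradient has a direct path around the convolutions, which is what mitigates degradation and vanishing gradients in very deep stacks.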
Inception-V3, developed by Google, is an advanced model with more than 20 million parameters. Its architecture blends symmetrical and asymmetrical building blocks, each with its own configuration of convolutional, average-pooling, and max-pooling layers, concatenation operations, and fully connected layers. The design also applies batch normalization in its activation layers, a widely adopted technique that stabilizes and accelerates training, making the model more robust and efficient (Ref.28). For the classification stage, the model employs Softmax, a well-established approach that produces probability distributions over the classes and thereby enables informed, precise predictions. Fig. 5 provides a diagrammatic representation of the Inception-V3 architecture and the components that make it so effective.
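Since the final stage relies on Softmax to turn raw scores into class probabilities, here is a minimal numerical sketch; the two logits are made-up values, and the [abnormal, normal] ordering is only an illustrative assumption.

```python
import numpy as np

def softmax(z):
    z = z - np.max(z)           # subtract the max for numerical stability
    e = np.exp(z)
    return e / e.sum()

logits = np.array([2.1, -0.4])  # hypothetical scores for [abnormal, normal]
print(softmax(logits))          # -> [0.924, 0.076], a distribution summing to 1
```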
The field of machine learning, particularly image processing, has been profoundly shaped by the advent of AlexNet. As described in Ref.29, this influential model is a preconfigured Convolutional Neural Network (CNN) with eight layers. Its performance in the 2012 ImageNet Large Scale Visual Recognition Challenge (ILSVRC-2012) marked a watershed moment, as it won with a substantial lead over its competitors. The architecture of AlexNet bears some resemblance to Yann LeCun's pioneering LeNet, highlighting its historical lineage and the evolutionary progress of convolutional neural networks.
Fig. 6 provides a visual representation of the overall design of the AlexNet system. Within AlexNet, input data traverse a sequence of five convolution layers and three max-pooling layers, as illustrated in Fig. 6. These layers perform the feature extraction and hierarchical representation that are vital to image analysis and understanding. The final layer applies the Softmax activation function to produce probabilistic class predictions, while the Rectified Linear Unit (ReLU) activation is used throughout the convolution layers, providing the nonlinear transformation that lets the network learn and extract features effectively. This combination of architectural elements and activation functions has cemented AlexNet's position as a groundbreaking model in image processing and machine learning.
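The five-convolution, three-max-pool layout can be checked against the reference implementation shipped with torchvision; the short sketch below (assuming a recent torchvision with the `weights` argument) simply counts those layer types without downloading any weights.

```python
import torch.nn as nn
import torchvision.models as models

# Count AlexNet's convolution and max-pooling layers (architecture only, no weights).
alexnet = models.alexnet(weights=None)
convs = [m for m in alexnet.features if isinstance(m, nn.Conv2d)]
pools = [m for m in alexnet.features if isinstance(m, nn.MaxPool2d)]
print(len(convs), "convolution layers,", len(pools), "max-pooling layers")  # 5, 3
```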
Logistic regression is a powerful method for modelling the probability of a discrete outcome from input variables, which makes the choice of input variables a pivotal part of the modelling process. Its most common application is modelling a binary outcome, where the result can take only one of two values, such as true or false or yes or no; when there are more than two discrete outcomes, multinomial logistic regression captures the additional complexity. Logistic regression is used primarily for classification problems (Ref.30). It is particularly valuable when a new sample must be assigned to the category it best fits, especially with large datasets where efficient and accurate classification is paramount. One domain where it is widely applied is cybersecurity, where classification challenges are ubiquitous; in cyberattack detection, for example, logistic regression helps identify and categorize potential threats, strengthening the security of digital systems and networks.
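A minimal sketch of a binary logistic-regression classifier with scikit-learn follows; the synthetic data stands in for the extracted CNN feature vectors, so the reported numbers are purely illustrative.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the extracted CNN features and binary labels.
X, y = make_classification(n_samples=400, n_features=50, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
print("test accuracy:", clf.score(X_te, y_te))
print("P(positive class) for one sample:", clf.predict_proba(X_te[:1])[0, 1])
```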
In the realm of supervised learning, decision trees are a highly versatile and powerful tool for both classification and regression. They construct a tree-like structure in which internal nodes serve as decision points, branches represent the outcomes of attribute tests, and terminal nodes store class labels. Building a decision tree is an iterative process that repeatedly divides the training data into subsets based on attribute values until stopping conditions are met, such as reaching the maximum tree depth or the minimum sample size required for a further split. To guide the splitting, the algorithm relies on metrics such as entropy or Gini impurity, which gauge the impurity or unpredictability within the data subsets (Ref.31); these metrics inform the algorithm's choice of the most suitable attribute at each split, maximizing information gain or minimizing impurity. In essence, the internal nodes represent the features, the branches encapsulate the decision rules, and the leaf nodes hold the algorithm's outcomes, a design that accommodates both classification and regression and makes decision trees a flexible tool in supervised machine learning. Decision trees handle a wide range of problems effectively, and they can be combined in ensembles such as the Random Forest algorithm, where training on multiple subsets of the data improves efficacy and robustness in real-world applications.
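As a minimal illustration of the splitting criteria and stopping conditions just described, the scikit-learn sketch below uses synthetic data in place of the real feature vectors; the specific hyperparameter values are arbitrary.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=400, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

# criterion selects the impurity measure (entropy or gini); max_depth and
# min_samples_split are the stopping conditions mentioned above.
tree = DecisionTreeClassifier(criterion="entropy", max_depth=5,
                              min_samples_split=10, random_state=0)
tree.fit(X_tr, y_tr)
print("test accuracy:", tree.score(X_te, y_te))
```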
A Random Forest is a powerful machine learning tool that handles both regression and classification tasks effectively by combining the predictions of multiple decision trees. The algorithm builds a forest of decision trees using bagging, a technique that improves the precision and reliability of machine learning ensembles (Ref.32), and then aggregates the results from these trees to determine the final outcome. What makes the Random Forest special is its scalability: unlike a single decision tree, it adapts to complex data, and its accuracy improves as more trees are added to the forest. It also helps prevent overfitting, making it valuable for real-world applications with noisy, complex datasets, and it reduces the need for extensive fine-tuning, which makes it an appealing choice for practitioners seeking effective and dependable models.
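A minimal bagging-based Random Forest with scikit-learn is sketched below; as before, the synthetic data replaces the real CNN features, and the number of trees is an arbitrary choice.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=400, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

# Each of the n_estimators trees is trained on a bootstrap sample (bagging);
# the forest combines the trees' votes for the final prediction.
forest = RandomForestClassifier(n_estimators=200, bootstrap=True, random_state=0)
forest.fit(X_tr, y_tr)
print("test accuracy:", forest.score(X_te, y_te))
```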
Bayes' theorem forms the fundamental principle underlying the Naive Bayes algorithm. The method's key assumption is that there is no interdependence among the feature pairs, which yields two pivotal presumptions: feature independence and equal attribute importance. Naive Bayes classifiers come in three primary variants: Gaussian Naive Bayes, Bernoulli Naive Bayes, and Multinomial Naive Bayes (Ref.33). The choice of variant depends on the nature of the data: Bernoulli Naive Bayes is used for binary data, Multinomial Naive Bayes for count data, and Gaussian Naive Bayes for continuous data. Equation (2) states Bayes' theorem, which underpins the mathematical foundations of this approach.
$$Z\left( b \mid a \right) = \frac{Z\left( a \mid b \right) Z\left( b \right)}{Z\left( a \right)}$$
(2)
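Because the CNN features used here are continuous, the Gaussian variant is the natural choice; the scikit-learn sketch below uses synthetic data as a stand-in, so the reported accuracy is illustrative only.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

X, y = make_classification(n_samples=400, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

# Gaussian Naive Bayes: each continuous feature gets a class-conditional normal
# distribution, combined under the feature-independence assumption behind Eq. (2).
nb = GaussianNB().fit(X_tr, y_tr)
print("test accuracy:", nb.score(X_te, y_te))
```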
Principal Component Analysis (PCA) is a powerful technique designed to mitigate the impact of correlations among variables through an orthogonal transformation, and it is widely used in both exploratory data analysis and machine learning for predictive modelling. PCA is an unsupervised learning algorithm that offers a valuable way to examine the intricate relationships between variables; also referred to as general factor analysis, it enables the discovery of the best-fitting line through regression analysis (Ref.34). What sets PCA apart is its ability to reduce the dimensionality of a dataset without prior knowledge of the target variables while preserving the most important patterns and interdependencies among the variables. By doing so, PCA simplifies complex data, making it more amenable to tasks such as regression and classification, and yields a streamlined subset of variables that captures the essential structure of the data.
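A minimal PCA sketch with scikit-learn follows; the random matrix stands in for the high-dimensional CNN feature vectors, and the 95% explained-variance threshold is an illustrative choice.

```python
import numpy as np
from sklearn.decomposition import PCA

# Stand-in for high-dimensional CNN feature vectors (e.g. 2048-d in practice).
rng = np.random.default_rng(0)
X = rng.normal(size=(400, 256))

# Keep enough orthogonal components to explain 95% of the variance.
pca = PCA(n_components=0.95)
X_reduced = pca.fit_transform(X)
print("reduced from", X.shape[1], "to", X_reduced.shape[1], "dimensions")
```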