Authors:
Chris Hyunchul Jo, Jiwoong Yang, Byunghwan Jeon, Hackjoon Shim, Ikbeom Jang
Paper:
https://arxiv.org/abs/2408.09894
Introduction
Background
Rotator cuff tears are a common cause of shoulder pain and disability, often requiring surgical intervention. Traditionally, the diagnosis of rotator cuff tears relies heavily on magnetic resonance imaging (MRI) due to its high sensitivity and specificity for soft tissue injuries. However, MRI is expensive and not always readily available, leading to increased healthcare costs and delayed diagnosis.
Problem Statement
Initial evaluations using plain shoulder radiographs often fail to identify soft tissue injuries such as rotator cuff tears. This necessitates further imaging with more expensive MRI examinations. The study aims to determine whether a deep learning model, specifically a Convolutional Block Attention Module (CBAM)-integrated neural network, can accurately predict rotator cuff tears using only shoulder radiographs, potentially offering a cost-effective alternative to MRI.
Related Work
Existing Diagnostic Methods
Current diagnostic methods for rotator cuff tears primarily involve clinical examination followed by imaging techniques such as ultrasound and MRI. While ultrasound is less expensive, it is operator-dependent and less accurate than MRI. Radiographs are typically used to rule out fractures but are not considered reliable for diagnosing soft tissue injuries.
Deep Learning in Medical Imaging
Deep learning has shown promise in various medical imaging applications, including the detection of fractures, tumors, and other abnormalities. Convolutional Neural Networks (CNNs) have been particularly effective in image classification tasks. The integration of attention mechanisms, such as CBAM, has further enhanced the performance of CNNs by allowing the model to focus on the most relevant features in the image.
Research Methodology
Data Collection
The dataset comprises shoulder radiographs from 99 patients: 50 with full-thickness rotator cuff tears (fRCT) and 49 without tears. Radiographs were acquired in four views per patient: axial, glenoid, outlet, and anteroposterior (AP), giving 396 images in total (99 × 4). Regions of interest (ROIs) essential for fRCT diagnosis were annotated with bounding boxes.
ROI Extraction
The YOLOv5 object detector was used to crop the ROIs automatically from all radiographs, which also streamlines annotation of future data. The cropped ROIs were then processed with Contrast-Limited Adaptive Histogram Equalization (CLAHE) to enhance bone structures and edge visibility.
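To make the contrast-limiting idea concrete, here is a minimal NumPy sketch of the core step of CLAHE applied to a single tile: the histogram is clipped at a limit, the clipped excess is spread evenly over all bins, and the resulting cumulative distribution becomes the intensity mapping. It is only an illustration, not the authors' preprocessing code. The function name `clipped_equalize`, the clip limit of 2.0, and the synthetic tile are assumptions, and the full algorithm additionally splits the image into tiles and interpolates between their mappings.

```python
import numpy as np

def clipped_equalize(tile, clip_limit=2.0, bins=256):
    """Contrast-limited histogram equalization of one uint8 tile (hypothetical helper)."""
    hist = np.bincount(tile.ravel(), minlength=bins).astype(np.float64)
    # The clip limit is expressed as a multiple of the mean bin height.
    limit = clip_limit * tile.size / bins
    excess = np.sum(np.maximum(hist - limit, 0.0))
    # Clip tall bins and redistribute the removed counts uniformly.
    hist = np.minimum(hist, limit) + excess / bins
    cdf = np.cumsum(hist) / tile.size
    lut = np.round(cdf * (bins - 1)).astype(np.uint8)  # intensity lookup table
    return lut[tile], lut

# Synthetic low-contrast tile with values in 100..139 (illustrative only).
tile = (100 + np.arange(64 * 64) % 40).astype(np.uint8).reshape(64, 64)
equalized, lut = clipped_equalize(tile)

in_range = int(tile.max()) - int(tile.min())
out_range = int(equalized.max()) - int(equalized.min())
print(in_range, out_range)
```

In practice a library routine such as OpenCV's `cv2.createCLAHE(clipLimit=..., tileGridSize=...)` handles the tiling and interpolation; the clip limit and tile grid size used in the study are not stated in this summary.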
Network Architecture
The study utilized a ResNet50 model integrated with a Convolutional Block Attention Module (CBAM). CBAM enhances the model’s focus on essential features by sequentially applying channel attention and spatial attention. The pretrained ResNet50 was adapted to classify between the presence and absence of fRCT, with two output classes in the final layer.
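The sequence of channel attention followed by spatial attention can be sketched as a NumPy forward pass over a single feature map. This is a simplified illustration of the general CBAM idea, not the authors' implementation: the weights are random, the reduction ratio `r = 4` and the 7×7 spatial kernel are assumed values, and where the module sits inside ResNet50 is not specified in this summary.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention(feat, w1, w2):
    """feat: (C, H, W); w1: (C//r, C); w2: (C, C//r). Returns weights of shape (C,)."""
    def shared_mlp(v):
        return w2 @ np.maximum(w1 @ v, 0.0)  # two layers with ReLU in between
    avg = feat.mean(axis=(1, 2))  # global average pooling per channel
    mx = feat.max(axis=(1, 2))    # global max pooling per channel
    return sigmoid(shared_mlp(avg) + shared_mlp(mx))

def spatial_attention(feat, kernel):
    """feat: (C, H, W); kernel: (2, k, k). Returns weights of shape (H, W)."""
    pooled = np.stack([feat.mean(axis=0), feat.max(axis=0)])  # (2, H, W)
    k = kernel.shape[-1]
    pad = k // 2
    padded = np.pad(pooled, ((0, 0), (pad, pad), (pad, pad)))  # zero padding
    h, w = feat.shape[1:]
    out = np.empty((h, w))
    for i in range(h):
        for j in range(w):
            # Sliding-window sum, i.e. a k x k convolution without bias.
            out[i, j] = np.sum(padded[:, i:i + k, j:j + k] * kernel)
    return sigmoid(out)

def cbam(feat, w1, w2, kernel):
    """Apply channel attention first, then spatial attention."""
    feat = feat * channel_attention(feat, w1, w2)[:, None, None]
    return feat * spatial_attention(feat, kernel)[None, :, :]

rng = np.random.default_rng(0)
C, H, W, r = 16, 8, 8, 4  # assumed sizes for the demo
feat = rng.standard_normal((C, H, W))
w1 = rng.standard_normal((C // r, C)) * 0.1
w2 = rng.standard_normal((C, C // r)) * 0.1
kernel = rng.standard_normal((2, 7, 7)) * 0.1
out = cbam(feat, w1, w2, kernel)
print(out.shape)
```

Because both attention maps pass through a sigmoid, every weight lies between 0 and 1, so the module can only rescale features, never amplify them beyond their input magnitude.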
Experimental Design
Data Preparation
The dataset was divided by subject ID, so that no subject appears in both the training and test sets. Because the dataset is small, 5-fold cross-validation was applied. With four views per subject, a typical fold has 316 training images from 79 subjects and 80 test images from 20 subjects.
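A subject-level split of this kind can be sketched in plain Python. The subject IDs, the shuffling seed, and the helper name `subject_kfold` below are hypothetical; only the counts (99 subjects, four views, 5 folds) come from the text. This simple version does not stratify by label, which the study may or may not have done.

```python
import random

def subject_kfold(subject_ids, k=5, seed=0):
    """Yield (train_ids, test_ids) pairs with no subject shared between them."""
    ids = sorted(set(subject_ids))
    random.Random(seed).shuffle(ids)
    folds = [ids[i::k] for i in range(k)]  # k nearly equal groups of subjects
    for i in range(k):
        test = set(folds[i])
        train = [s for s in ids if s not in test]
        yield train, sorted(test)

VIEWS = ["axial", "glenoid", "outlet", "AP"]
subjects = [f"S{n:03d}" for n in range(99)]          # hypothetical subject IDs
images = [(s, v) for s in subjects for v in VIEWS]   # one entry per radiograph

splits = []
for train_ids, test_ids in subject_kfold(subjects):
    train_set = set(train_ids)
    # Every image follows its subject, so all four views stay on the same side.
    train_imgs = [im for im in images if im[0] in train_set]
    test_imgs = [im for im in images if im[0] not in train_set]
    splits.append((len(train_ids), len(test_ids), len(train_imgs), len(test_imgs)))

for row in splits:
    print(row)
```

Since 99 is not divisible by 5, four folds hold out 20 subjects (80 images) and one holds out 19 (76 images). Library routines such as scikit-learn's `GroupKFold` provide the same guarantee when the subject ID is passed as the group.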
Training Details
Models were trained on an NVIDIA RTX 3090 GPU using the SGD optimizer with a learning rate of 0.01 and a batch size of 8. To reduce overfitting, data augmentation was applied: rotation, horizontal flipping, random cropping, scaling, translation, brightness adjustment, and intensity inversion. All images were resized to 512×512 pixels before training. CrossEntropyLoss was used as the loss function, and a CosineAnnealingWarmupRestarts scheduler adjusted the learning rate during training.
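The shape of a cosine-annealing schedule with warmup and restarts can be written out directly. The sketch below is a standalone illustration, not the scheduler's actual code: the peak learning rate of 0.01 comes from the text, while the minimum learning rate, cycle length, warmup length, and the function name `warmup_cosine_lr` are assumptions chosen for the demo.

```python
import math

def warmup_cosine_lr(step, max_lr=0.01, min_lr=1e-5, cycle_steps=50, warmup_steps=5):
    """Learning rate at `step`: linear warmup, cosine decay, restart every cycle."""
    pos = step % cycle_steps  # position inside the current cycle
    if pos < warmup_steps:
        # Linear ramp from min_lr up to max_lr.
        return min_lr + (max_lr - min_lr) * pos / warmup_steps
    # Cosine decay from max_lr back down towards min_lr.
    progress = (pos - warmup_steps) / (cycle_steps - warmup_steps)
    return min_lr + 0.5 * (max_lr - min_lr) * (1.0 + math.cos(math.pi * progress))

schedule = [warmup_cosine_lr(s) for s in range(100)]  # two full cycles

for s in (0, 5, 27, 49, 50, 55):
    print(s, round(schedule[s], 6))
```

Each cycle starts at the minimum rate, reaches the peak once warmup ends, and then decays smoothly until the next restart. Restarting at a higher rate is commonly used to help the optimizer leave poor local minima.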
Results and Analysis
Model Performance
Averaged over the 5 cross-validation folds, the model achieved an accuracy of 0.831 and an AUROC of 0.889, indicating good discriminative ability on this dataset. The positive predictive value (PPV) was 0.852 and the negative predictive value (NPV) was 0.812.
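For readers less familiar with these metrics, the sketch below shows how accuracy, PPV, NPV, and AUROC are computed from labels and predicted scores. The labels and scores are made-up toy values for illustration only and are not the study's data. The 0.5 decision threshold and the function names are also assumptions.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn) for binary labels where 1 means a tear is present."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, fp, tn, fn

def summary_metrics(tp, fp, tn, fn):
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "ppv": tp / (tp + fp),  # share of predicted tears that are real
        "npv": tn / (tn + fn),  # share of predicted non-tears that are real
    }

def auroc(y_true, scores):
    """Probability that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy example (not the study's data).
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]

metrics = summary_metrics(*confusion_counts(y_true, y_pred))
auc = auroc(y_true, scores)
print(metrics, auc)
```

Unlike accuracy, PPV, and NPV, the AUROC does not depend on the decision threshold, which is why it is usually reported alongside them.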
Confusion Matrix and ROC Curve
The confusion matrix and ROC curve further illustrate the model’s performance, demonstrating its effectiveness in distinguishing between patients with and without rotator cuff tears.
Heatmap Analysis
Heatmaps generated from the model’s predictions highlight the regions of the radiographs that contributed most to the diagnosis, providing insights into the model’s decision-making process.
Overall Conclusion
Summary
The study suggests that a CBAM-integrated ResNet50 model can detect full-thickness rotator cuff tears from shoulder radiographs alone, reaching an accuracy of 0.831 and an AUROC of 0.889 in cross-validation on 99 patients. Such a model could serve as a cost-effective pre-assessment tool or alternative to MRI, potentially reducing healthcare costs and improving diagnostic efficiency.
Future Work
To increase the generalizability of the model, future research will involve collaboration with multiple centers to incorporate a larger and more diverse set of radiograph data. Additionally, the impact of different radiographic views on the diagnosis of rotator cuff tears will be analyzed to further improve the model’s performance.
Long-term Goals
The ultimate goal is to achieve a level of diagnostic accuracy close to that of MRI-based diagnostics for rotator cuff tears, making this deep learning approach a viable pre-assessment tool or alternative to MRI.
By integrating advanced deep learning techniques with traditional radiographic imaging, this study paves the way for more accessible and cost-effective diagnostic methods in orthopedic care.