Facial Wrinkle Segmentation for Cosmetic Dermatology: Pretraining with Texture Map-Based Weak Supervision

Introduction

Facial wrinkle detection is a critical aspect of cosmetic dermatology, serving as an indicator of aging and skin health. However, manual segmentation of facial wrinkles is challenging and time-consuming, and it often yields inconsistent results because graders judge wrinkles subjectively. To address these issues, the study makes two contributions: a public facial wrinkle dataset and a novel training strategy that lets U-Net-like encoder-decoder models detect facial wrinkles automatically.

Related Work

Deep Learning-Based Facial Wrinkle Segmentation

Deep learning methods have been increasingly applied to facial wrinkle segmentation. Kim et al. introduced a semi-automatic labeling strategy that combines texture maps with roughly labeled wrinkle masks using a U-Net architecture. They further improved segmentation accuracy with a weighted deep supervision technique. Yang et al. developed Striped WriNet, which uses a Striped Attention Module within a U-shaped network to segment both coarse and fine wrinkles effectively.

Weakly Supervised Learning

Weakly supervised learning trains models on data whose labels are incomplete or inaccurate. Xu et al. proposed CAMEL, a framework that uses a label expansion technique based on multiple instance learning (MIL) for histopathology image segmentation. Shen et al. trained a deep learning model using scribbles and global labels to segment brain tumors.

Research Methodology

Dataset Specifications

The study introduces the ‘FFHQ-Wrinkle’ dataset, an extension of the NVIDIA FFHQ dataset. It includes 1,000 images with human labels and 50,000 images with automatically generated weak labels. The dataset features diverse individuals of various ages, races, and skin conditions, making it suitable for training models to handle a wide range of clinical scenarios.

Ground Truth Wrinkle Annotation

Wrinkle annotation was performed by three experienced annotators, focusing on dynamic and static wrinkles. The annotation process involved synchronization sessions to minimize inter-rater variability. The final ground truth wrinkle masks were created using majority voting to reduce subjectivity.
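
The majority-voting step can be illustrated with a minimal sketch. It assumes each annotator produces a binary mask of the same size and that a pixel counts as wrinkle when more than half of the annotators marked it. The function name majority_vote and the toy masks are illustrative, not taken from the paper.

```python
import numpy as np


def majority_vote(masks):
    """Combine binary masks (each H x W, values 0/1) by per-pixel majority."""
    stack = np.stack([np.asarray(m, dtype=bool) for m in masks], axis=0)
    votes = stack.sum(axis=0)
    # A pixel is kept when strictly more than half of the annotators marked it.
    return (votes * 2 > stack.shape[0]).astype(np.uint8)


# Toy masks from three hypothetical annotators.
a = np.array([[1, 1, 0], [0, 0, 1]])
b = np.array([[1, 0, 0], [0, 1, 1]])
c = np.array([[1, 1, 0], [0, 0, 0]])
consensus = majority_vote([a, b, c])
```

With three annotators, a pixel survives only if at least two of them marked it, so isolated single-annotator strokes are dropped.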

Experimental Design

Model Architecture

The study evaluated the proposed method using U-Net and Swin UNETR architectures. U-Net features a standard architecture with four encoder and decoder blocks, while Swin UNETR employs an encoder with a window size of 16 and patches of size 4×4.

Training Strategy

The training strategy involves two stages: weakly supervised pretraining and supervised finetuning. In the pretraining stage, the model is trained on a large dataset with weak labels, using masked texture maps as ground truth. In the finetuning stage, the model is refined using a smaller set of manually labeled wrinkle masks.

Weakly Supervised Pretraining Stage

The pretraining stage uses weakly labeled wrinkle data extracted through computer vision techniques. The texture map is extracted from face images using a Gaussian kernel-based filter, and non-facial regions are masked using a BiSeNet architecture-based facial parsing model.
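
The summary does not give the exact filter, so the following is only a rough sketch of how a masked texture map could be produced. It assumes the texture is the absolute difference between a grayscale face image and its Gaussian-smoothed version, and that a binary face mask is already available (in the paper it comes from a BiSeNet-based face parsing model; here it is drawn by hand). All function names and the sigma value are hypothetical.

```python
import numpy as np


def gaussian_kernel_1d(sigma, radius):
    """Normalized 1-D Gaussian kernel of length 2 * radius + 1."""
    x = np.arange(-radius, radius + 1, dtype=np.float64)
    k = np.exp(-(x ** 2) / (2.0 * sigma ** 2))
    return k / k.sum()


def gaussian_blur(gray, sigma=2.0):
    """Separable Gaussian blur with reflect padding; output has the input shape."""
    radius = int(3 * sigma)
    k = gaussian_kernel_1d(sigma, radius)
    padded = np.pad(gray, radius, mode="reflect")
    rows = np.apply_along_axis(lambda r: np.convolve(r, k, mode="valid"), 1, padded)
    return np.apply_along_axis(lambda c: np.convolve(c, k, mode="valid"), 0, rows)


def masked_texture_map(gray, face_mask, sigma=2.0):
    """Assumed texture definition: |image - blurred image|, scaled to [0, 1],
    then zeroed outside the face mask."""
    gray = gray.astype(np.float64)
    texture = np.abs(gray - gaussian_blur(gray, sigma))
    peak = texture.max()
    if peak > 0:
        texture = texture / peak
    return texture * face_mask.astype(np.float64)


# Synthetic example: a bright patch with one dark horizontal line (a "wrinkle").
gray = np.full((32, 32), 200.0)
gray[16, :] = 100.0
face_mask = np.zeros((32, 32), dtype=np.uint8)
face_mask[8:24, 8:24] = 1  # stand-in for a face-parsing result
tmap = masked_texture_map(gray, face_mask)
```

The response is strongest on the dark line and zero outside the mask, which is the property that makes such a map usable as a weak label.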

Supervised Finetuning Stage

In the finetuning stage, the model is refined using human-labeled wrinkle data. The model takes as input a 3-channel RGB face image and a 1-channel masked texture map, producing a 2-channel output indicating the presence of wrinkles and background.
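
The input and output shapes described above can be sketched as follows. This is an assumption-laden illustration: the channel-first layout, the channel order (RGB first, texture last), and the convention that output channel 1 means wrinkle are choices made here, not details confirmed by the summary.

```python
import numpy as np


def build_finetune_input(rgb, texture):
    """Stack an RGB image (H, W, 3) and a masked texture map (H, W)
    into one channel-first array of shape (4, H, W)."""
    if rgb.shape[:2] != texture.shape:
        raise ValueError("RGB image and texture map must share height and width")
    rgb_chw = np.transpose(rgb, (2, 0, 1)).astype(np.float32)
    return np.concatenate([rgb_chw, texture[None].astype(np.float32)], axis=0)


def logits_to_mask(logits):
    """Turn 2-channel scores (2, H, W) into a binary mask (H, W).
    Channel 0 is assumed to be background and channel 1 wrinkle."""
    return np.argmax(logits, axis=0).astype(np.uint8)


rgb = np.zeros((8, 8, 3))
texture = np.ones((8, 8))
x = build_finetune_input(rgb, texture)

# Fake network output: wrinkle score wins at exactly one pixel.
logits = np.zeros((2, 8, 8))
logits[1, 2, 3] = 5.0
mask = logits_to_mask(logits)
```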

Results and Analysis

Implementation Details

The dataset was partitioned into 80% for training, 10% for validation, and 10% for testing. The AdamW optimizer and SGDR scheduler were used for training. Various augmentations were applied to maintain dataset diversity.
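
As a rough illustration of these two details, the sketch below shows an 80/10/10 index split and the cosine learning-rate curve with warm restarts that SGDR uses. The seed, learning rates, and cycle length are hypothetical placeholders; the summary does not state the values used in the paper.

```python
import math
import random


def split_indices(n, train=0.8, val=0.1, seed=0):
    """Shuffle indices 0..n-1 and cut them into train/val/test lists."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_train = round(n * train)
    n_val = round(n * val)
    return idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]


def sgdr_lr(step, lr_max, lr_min, cycle_len, cycle_mult=1):
    """Cosine annealing with warm restarts: the rate decays from lr_max
    toward lr_min within a cycle, then jumps back to lr_max."""
    t_i, t_cur = cycle_len, step
    while t_cur >= t_i:
        t_cur -= t_i
        t_i *= cycle_mult
    return lr_min + 0.5 * (lr_max - lr_min) * (1 + math.cos(math.pi * t_cur / t_i))


# 1,000 is the number of human-labeled images reported for the dataset.
train_idx, val_idx, test_idx = split_indices(1000)
lr_start = sgdr_lr(0, lr_max=1e-3, lr_min=1e-5, cycle_len=10)
lr_mid = sgdr_lr(5, lr_max=1e-3, lr_min=1e-5, cycle_len=10)
lr_restart = sgdr_lr(10, lr_max=1e-3, lr_min=1e-5, cycle_len=10)
```

In a PyTorch setup the same schedule is available as torch.optim.lr_scheduler.CosineAnnealingWarmRestarts, paired with torch.optim.AdamW.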

Evaluation Metrics

The performance of the final model was evaluated using the Jaccard Similarity Index (JSI), F1-score, and Accuracy (Acc). These metrics measure the overlap between predicted and ground truth wrinkle regions, the harmonic mean of precision and recall, and the proportion of correctly predicted pixels, respectively.
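
The three metrics follow directly from the pixel-level confusion counts. The sketch below uses the standard definitions JSI = TP / (TP + FP + FN), F1 = 2TP / (2TP + FP + FN), and Acc = (TP + TN) / N. Returning 1.0 when both masks are empty is a convention chosen here, not something the summary specifies.

```python
import numpy as np


def segmentation_metrics(pred, target):
    """Compute JSI, F1, and accuracy for two binary masks of equal shape."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    tp = int(np.sum(pred & target))
    fp = int(np.sum(pred & ~target))
    fn = int(np.sum(~pred & target))
    tn = int(np.sum(~pred & ~target))
    union = tp + fp + fn
    # Assumed convention: two empty masks count as a perfect match.
    jsi = tp / union if union else 1.0
    f1 = 2 * tp / (2 * tp + fp + fn) if union else 1.0
    acc = (tp + tn) / pred.size
    return {"jsi": jsi, "f1": f1, "acc": acc}


pred = np.array([[1, 1, 0, 0]])
target = np.array([[1, 0, 1, 0]])
scores = segmentation_metrics(pred, target)
```

Because wrinkles cover only a small fraction of a face image, accuracy is dominated by background pixels, which is why overlap-based scores such as JSI and F1 are reported alongside it.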

Quantitative Comparisons

The proposed method outperformed recent wrinkle segmentation methods and other pretraining techniques. The performance gap was larger in data-limited scenarios, which indicates that the two-stage training strategy is most useful when few manually labeled images are available.

Ablation Study

Including the masked texture map as an additional input during the finetuning stage led to significant improvements in wrinkle segmentation, which indicates that the texture cue remains useful beyond its role as a weak label in pretraining.

Discussion

The study achieved state-of-the-art performance in facial wrinkle segmentation, demonstrating the potential of the two-stage training strategy. The approach shows promise in achieving high performance with limited data, enhancing scalability and flexibility in clinical settings. However, challenges such as false positives and subjectivity in wrinkle annotation remain.

Overall Conclusion

The study proposes a novel two-stage learning strategy for facial wrinkle segmentation using deep learning. By leveraging weakly labeled data for pretraining and manually labeled data for finetuning, the approach significantly reduces the time and cost associated with manual labeling. The release of the ‘FFHQ-Wrinkle’ dataset aims to support ongoing research and enhance reproducibility. Future research will focus on addressing false positives and improving the reliability of ground truth wrinkles through collaboration with dermatologists.

Datasets:

FFHQ
