Enhancing Automated OCT Classification With GAN-generated Synthetic Data for Improved Accuracy and Privacy Protection

Abstract

This study examines the generation and use of synthetic anterior segment optical coherence tomography (AS-OCT) images through the Style and WAvelet based GAN (SWAGAN) architecture, aiming to address data scarcity and privacy issues in ophthalmology. The realism of the synthetic images was assessed in a blinded test with seven experienced refractive surgeons, who were unable to distinguish real from synthetic images, demonstrating high visual fidelity. For an objective evaluation, machine learning models, including an EfficientNet-B0-based classifier and a simpler CNN, were trained on real, synthetic, and combined datasets. Models trained only on synthetic images achieved up to 91% classification accuracy, and performance improved to 96% when real and synthetic data were combined. Perfect classification accuracy was achieved for PRK, Kerarings, and Intacs in 10-mm OCT scans, and for normal, ICL, and IOL cases in 16-mm OCT scans. The simpler CNN models showed similar trends, with slightly lower performance. These results confirm that high-quality GAN-generated synthetic images closely mimic real scans and can significantly enhance model training, surpassing traditional data augmentation techniques. The low Fréchet Inception Distance (FID) scores suggest that the synthetic data closely resemble real data without compromising patient privacy, provided the original data are anonymized and not linked to any patient. This highlights the potential of GAN-based synthetic datasets for building robust machine learning models while overcoming privacy and data access challenges in clinical settings. However, further investigation is needed to ensure that sensitive data is fully protected in accordance with global privacy standards.
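
For readers unfamiliar with the metric, the standard definition of the Fréchet Inception Distance is sketched below. The abstract does not say which feature extractor or implementation was used, so the symbols are notation assumed here for illustration: \mu_r, \Sigma_r are the mean and covariance of Inception-network features computed on real images, and \mu_s, \Sigma_s are the same statistics for synthetic images.

```latex
\mathrm{FID} = \lVert \mu_r - \mu_s \rVert_2^2
  + \operatorname{Tr}\!\left( \Sigma_r + \Sigma_s - 2\,(\Sigma_r \Sigma_s)^{1/2} \right)
```

The score is zero when the two feature distributions have identical means and covariances, and it grows as they diverge, which is why a low FID is read as the synthetic images being statistically close to the real ones.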
