Julian Schneider, Marvin Walter, Karen Otte, Thierry Meurers, Ines Perrar, Ute Nöthlings, Tim Adams, Holger Fröhlich, Fabian Prasser, Juliane Fluck, Lisa Kühnel
INTRODUCTION: A modern approach to ensuring privacy when sharing datasets is the use of synthetic data generation methods, which often claim to outperform classic anonymization techniques in the trade-off between data utility and privacy. Recently, it was demonstrated that various deep learning-based approaches are able to generate useful synthesized datasets, often based on domain-specific analyses. However, evaluating the privacy implications of releasing synthetic data remains a challenging problem, especially when the goal is to conform with data protection guidelines...
August 30, 2024: Studies in Health Technology and Informatics