We present the first systematic benchmark studying how neural network architectures, initialization strategies, and aggregation methods interact in federated visual classification. Our study spans 5 medical imaging datasets, 12 client sites, and 50K+ images.
IEEE International Symposium on Biomedical Imaging (ISBI) 2024
Federated Learning (FL) has emerged as a promising paradigm for training machine learning models on distributed medical imaging data. However, researchers typically evaluate new FL methods by varying one component (e.g., aggregation strategy) while keeping others fixed. This approach overlooks important interactions between components.
ARIA provides the first comprehensive study of how three fundamental FL components (neural network architectures, weight initialization strategies, and aggregation methods) interact with each other. We find that choices that work well in isolation can fail when combined, while other seemingly suboptimal choices can yield strong results together.
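To make the contrast with one-component-at-a-time evaluation concrete, the sketch below enumerates a full factorial grid over the three components and compares it with the set of configurations reached by varying a single component from a fixed baseline. The option names are placeholders chosen for illustration (only random and SSL-pretrained initialization are mentioned on this page); they are not the benchmark's actual lists.

```python
from itertools import product

# Placeholder option names for illustration only; they are NOT the
# benchmark's actual architectures, initializations, or aggregators.
ARCHITECTURES = ["arch_a", "arch_b", "arch_c"]
INITIALIZATIONS = ["random", "supervised_pretrained", "ssl_pretrained"]
AGGREGATORS = ["agg_x", "agg_y", "agg_z"]


def full_factorial(archs, inits, aggs):
    """Return every (architecture, initialization, aggregation) combination."""
    return list(product(archs, inits, aggs))


def one_at_a_time(archs, inits, aggs):
    """Vary a single component from a fixed baseline (first option of each)."""
    configs = set()
    configs.update((a, inits[0], aggs[0]) for a in archs)
    configs.update((archs[0], i, aggs[0]) for i in inits)
    configs.update((archs[0], inits[0], g) for g in aggs)
    return sorted(configs)


grid = full_factorial(ARCHITECTURES, INITIALIZATIONS, AGGREGATORS)
oat = one_at_a_time(ARCHITECTURES, INITIALIZATIONS, AGGREGATORS)

# Configurations that only the full grid evaluates, i.e. where two or more
# components differ from the baseline at once.
unseen = [cfg for cfg in grid if cfg not in set(oat)]
print(len(grid), len(oat), len(unseen))  # 27 7 20
```

With three options per component, the one-at-a-time protocol covers 7 of 27 configurations; any effect that appears only when two components change together falls in the remaining 20.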
Batch Normalization Failure Modes: We identify specific conditions under which batch normalization catastrophically fails in FL, leading to >20% accuracy drops compared to group normalization alternatives.
SSL Pretraining Benefits: Self-supervised pretraining provides consistent gains only under specific non-IID conditions; in other scenarios, random initialization performs comparably.
Architecture-Aggregation Interactions: Some architecture-aggregation combinations exhibit synergistic effects, achieving results better than either component alone would predict.
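One common way to quantify the kind of interaction described in the findings above is the interaction term of a 2x2 factorial comparison: the accuracy of the combined configuration minus what the two individual gains would predict if they simply added up. The sketch below assumes this additive baseline; it is not necessarily the measure used in the paper, and the numbers are made up for illustration.

```python
def interaction_effect(acc_base, acc_a_only, acc_b_only, acc_both):
    """Interaction term of a 2x2 factorial comparison.

    Positive: the pair does better than the individual gains predict (synergy).
    Negative: the pair does worse than the individual gains predict.
    """
    gain_a = acc_a_only - acc_base
    gain_b = acc_b_only - acc_base
    additive_prediction = acc_base + gain_a + gain_b
    return acc_both - additive_prediction


# Made-up accuracies for illustration only; NOT results from the paper.
synergy = interaction_effect(
    acc_base=0.70,    # baseline architecture + baseline aggregation
    acc_a_only=0.72,  # only the architecture changed
    acc_b_only=0.73,  # only the aggregation method changed
    acc_both=0.80,    # both changed together
)
print(round(synergy, 2))  # 0.05
```

A value near zero means the two choices behave independently; a clearly positive or negative value is the signal that a one-component-at-a-time evaluation would miss.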
Our benchmark covers:
5 Medical Imaging Datasets: Spanning dermatology, ophthalmology, radiology, and pathology domains
12 Client Sites: Simulating realistic multi-institutional collaboration scenarios
50,000+ Images: Diverse imaging modalities and clinical conditions
Built with NVIDIA FLARE: Production-grade FL framework for reproducibility
Our codebase is built on NVIDIA FLARE, providing a production-ready implementation that can be directly deployed in real federated environments. The code includes:
All benchmark experiments with reproducible configurations
Pre-configured dataset loaders for the 5 evaluation datasets
Visualization tools for analyzing component interactions
If you find this work useful, please cite our paper:
@inproceedings{siomos2024aria,
title={ARIA: On the Interaction Between Architectures, Initialization
and Aggregation Methods for Federated Visual Classification},
author={Siomos, Vasilis and Naval-Marimont, Sergio and
Passerat-Palmbach, Jonathan and Tarroni, Giacomo},
booktitle={IEEE International Symposium on Biomedical Imaging (ISBI)},
year={2024}
}