ARIA: On the Interaction Between Architectures, Initialization and Aggregation Methods for Federated Visual Classification

Vasilis Siomos, Sergio Naval-Marimont, Jonathan Passerat-Palmbach, Giacomo Tarroni

City St George's, University of London

We present the first systematic benchmark studying how neural network architectures, initialization strategies, and aggregation methods interact in federated visual classification. Our study spans 5 medical imaging datasets, 12 client sites, and 50K+ images.


IEEE International Symposium on Biomedical Imaging (ISBI) 2024

Abstract

Federated Learning (FL) has emerged as a promising paradigm for training machine learning models on distributed medical imaging data. However, researchers typically evaluate new FL methods by varying one component (e.g., aggregation strategy) while keeping others fixed. This approach overlooks important interactions between components.

ARIA provides the first comprehensive study of how three fundamental FL components (neural network architectures, weight initialization strategies, and aggregation methods) interact with each other. We find that choices that work well in isolation can fail when combined, while other seemingly suboptimal choices can yield strong results together.
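To make the contrast between one-at-a-time evaluation and a joint study concrete, the sketch below enumerates both kinds of experiment grid. The option names (`arch_a`, `agg_x`, and so on) are hypothetical placeholders, since this page does not list the specific architectures, initializations, or aggregation methods; the helper functions are likewise illustrative and are not taken from the ARIA codebase.

```python
from itertools import product

# Hypothetical option names, chosen only for illustration; the page does not
# list the specific architectures, initializations, or aggregation methods.
ARCHITECTURES = ["arch_a", "arch_b"]
INITIALIZATIONS = ["random", "pretrained"]
AGGREGATORS = ["agg_x", "agg_y"]


def full_factorial(architectures, initializations, aggregators):
    """Return every (architecture, initialization, aggregator) combination."""
    return [
        {"architecture": a, "initialization": i, "aggregator": g}
        for a, i, g in product(architectures, initializations, aggregators)
    ]


def one_at_a_time(architectures, initializations, aggregators):
    """Vary a single component at a time around the first option of each list."""
    base = (architectures[0], initializations[0], aggregators[0])
    configs = {base}
    configs.update((a, base[1], base[2]) for a in architectures)
    configs.update((base[0], i, base[2]) for i in initializations)
    configs.update((base[0], base[1], g) for g in aggregators)
    return sorted(configs)


grid = full_factorial(ARCHITECTURES, INITIALIZATIONS, AGGREGATORS)
ablation = one_at_a_time(ARCHITECTURES, INITIALIZATIONS, AGGREGATORS)
print(len(grid), len(ablation))
```

With two options per component, the one-at-a-time design covers 4 of the 8 possible configurations, so any effect that only appears when two or more components change together goes unobserved.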

Key Findings

Batch Normalization Failure Modes: We identify specific conditions under which batch normalization catastrophically fails in FL, leading to >20% accuracy drops compared to group normalization alternatives.
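The specific failure conditions are reported in the paper and are not reproduced here. As general background, one commonly cited difference between the two layers is that batch normalization uses statistics computed across the local batch, whereas group normalization uses statistics computed within each sample. The pure-Python sketch below (learnable scale and shift omitted, function names illustrative) shows the same sample being normalized differently by batch normalization on two clients with different data, while group normalization gives an identical result on both.

```python
from statistics import fmean, pstdev

EPS = 1e-5  # small constant for numerical stability


def batch_norm(batch):
    """Normalize each channel using mean/variance computed across the batch."""
    n_channels = len(batch[0])
    out = [[0.0] * n_channels for _ in batch]
    for c in range(n_channels):
        column = [sample[c] for sample in batch]
        mean = fmean(column)
        var = pstdev(column) ** 2
        for s, sample in enumerate(batch):
            out[s][c] = (sample[c] - mean) / (var + EPS) ** 0.5
    return out


def group_norm(batch, num_groups):
    """Normalize each sample on its own, within groups of channels.

    Assumes the number of channels is divisible by num_groups.
    """
    size = len(batch[0]) // num_groups
    out = []
    for sample in batch:
        normed = []
        for g in range(num_groups):
            group = sample[g * size:(g + 1) * size]
            mean = fmean(group)
            var = pstdev(group) ** 2
            normed.extend((x - mean) / (var + EPS) ** 0.5 for x in group)
        out.append(normed)
    return out


# The same sample appears in the batches of two clients with different local data.
sample = [1.0, 2.0, 3.0, 4.0]
client_a = [sample, [0.0, 0.0, 0.0, 0.0]]
client_b = [sample, [10.0, 10.0, 10.0, 10.0]]

bn_a = batch_norm(client_a)[0]
bn_b = batch_norm(client_b)[0]
gn_a = group_norm(client_a, num_groups=2)[0]
gn_b = group_norm(client_b, num_groups=2)[0]
print(bn_a != bn_b, gn_a == gn_b)
```

This only illustrates the dependence on local batch statistics; it does not by itself reproduce the accuracy drop described above.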

SSL Pretraining Benefits: Self-supervised pretraining provides consistent gains only under specific non-IID conditions; in other scenarios, random initialization performs comparably.
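This page does not say how the non-IID conditions are constructed. A common way to simulate label skew in FL experiments is Dirichlet partitioning, sketched below under that assumption; the function name, the `alpha` parameter, and the toy labels are illustrative and are not taken from the ARIA codebase.

```python
import random
from collections import defaultdict


def dirichlet_partition(labels, num_clients, alpha, seed=0):
    """Split sample indices across clients with Dirichlet label skew.

    Smaller alpha (> 0) gives more skewed, more non-IID label distributions.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    clients = [[] for _ in range(num_clients)]
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        # Dirichlet(alpha) proportions obtained by normalizing Gamma samples.
        gammas = [rng.gammavariate(alpha, 1.0) for _ in range(num_clients)]
        total = sum(gammas)
        start = 0
        cumulative = 0.0
        for client_id, g in enumerate(gammas):
            cumulative += g / total
            if client_id == num_clients - 1:
                # The last client takes the remainder so no index is dropped.
                end = len(indices)
            else:
                end = round(cumulative * len(indices))
            clients[client_id].extend(indices[start:end])
            start = end
    return clients


# Toy labels: 1000 samples over 5 classes, split across 12 simulated sites.
labels = [i % 5 for i in range(1000)]
parts = dirichlet_partition(labels, num_clients=12, alpha=0.5)
print([len(p) for p in parts])
```

Sweeping `alpha` from small to large values moves the partition from strongly skewed toward near-uniform, which is one way to probe when a given initialization helps.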

Architecture-Aggregation Interactions: Some architecture-aggregation combinations exhibit synergistic effects, achieving results better than either component alone would predict.
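One way to quantify "better than either component alone would predict" is the standard 2x2 interaction contrast: compare the accuracy of the combined change against the sum of the two individual effects. The sketch below assumes this definition, which is not necessarily the analysis used in the paper, and the accuracies are placeholder numbers invented for the example rather than reported results.

```python
def interaction_effect(acc):
    """2x2 interaction contrast for accuracies keyed by (architecture, aggregator).

    Keys use 0 for the baseline choice and 1 for the alternative.
    A positive value means the combination beats the additive prediction.
    """
    effect_arch = acc[(1, 0)] - acc[(0, 0)]
    effect_agg = acc[(0, 1)] - acc[(0, 0)]
    additive_prediction = acc[(0, 0)] + effect_arch + effect_agg
    return acc[(1, 1)] - additive_prediction


# Placeholder accuracies invented for this example; NOT results from the paper.
toy = {(0, 0): 0.70, (1, 0): 0.72, (0, 1): 0.73, (1, 1): 0.80}
print(round(interaction_effect(toy), 2))
```

In this toy case the individual effects are 0.02 and 0.03, so the additive prediction is 0.75 and the remaining 0.05 is the synergistic part.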

Experimental Setup

Our benchmark covers:

5 Medical Imaging Datasets: Spanning dermatology, ophthalmology, radiology, and pathology domains

12 Client Sites: Simulating realistic multi-institutional collaboration scenarios

50,000+ Images: Diverse imaging modalities and clinical conditions

Built with NVIDIA FLARE: Production-grade FL framework for reproducibility

Implementation

Our codebase is built on NVIDIA FLARE, providing a production-ready implementation that can be directly deployed in real federated environments. The code includes:

All benchmark experiments with reproducible configurations

Pre-configured dataset loaders for the 5 evaluation datasets

Visualization tools for analyzing component interactions

Citation

If you find this work useful, please cite our paper:

@inproceedings{siomos2024aria,
  title={ARIA: On the Interaction Between Architectures, Initialization and Aggregation Methods for Federated Visual Classification},
  author={Siomos, Vasilis and Naval-Marimont, Sergio and Passerat-Palmbach, Jonathan and Tarroni, Giacomo},
  booktitle={IEEE International Symposium on Biomedical Imaging (ISBI)},
  year={2024}
}