2024 | A. Anil Sinaci, Mert Gencturk, Celia Alvarez-Romero, Gokce Banu Laleci Erturkmen, Alicia Martinez-Garcia, Maria Jose Escalona-Cuaresma, Carlos Luis Parra-Calderon
This paper introduces a privacy-preserving federated machine learning (ML) architecture built on FAIR (Findable, Accessible, Interoperable, Reusable) health data. The architecture aims to enable collaborative model-building among health data owners without sharing their datasets. Utilizing an agent-based approach, the system creates a global predictive model from various local models, ensuring privacy protection. Five healthcare organizations transformed their datasets into FAIR formats using a common FAIRification workflow and software, and deployed Federated ML Agents within their secure networks. A Federated ML Manager in the cloud orchestrated the process, facilitating communication and model aggregation. The system was validated through a use case predicting 30-day readmission risk for COPD patients, achieving an accuracy rate of 87%. The study highlights the practical application of privacy-preserving federated ML among distinct healthcare entities, emphasizing the value of FAIR health data in machine learning while maintaining privacy and security.
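The manager's aggregation step can be illustrated with a minimal sketch. The summary does not state which aggregation rule the Federated ML Manager uses, so the sample-size-weighted parameter averaging below (in the style of federated averaging) is an assumption, and the names `LocalUpdate` and `aggregate` are hypothetical rather than taken from the paper's software. The point it shows is that only model parameters and record counts leave each site, never patient-level data.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class LocalUpdate:
    """Hypothetical message from one agent: model parameters and a record count only."""
    weights: List[float]
    n_samples: int


def aggregate(updates: List[LocalUpdate]) -> List[float]:
    """Sample-size-weighted average of local parameters (an assumed rule, not the paper's)."""
    if not updates:
        raise ValueError("at least one local update is required")
    dim = len(updates[0].weights)
    if any(len(u.weights) != dim for u in updates):
        raise ValueError("all local models must have the same number of parameters")
    total = sum(u.n_samples for u in updates)
    if total <= 0:
        raise ValueError("total sample count must be positive")
    # Each coordinate of the global model is the weighted mean of the local values.
    return [
        sum(u.weights[i] * u.n_samples for u in updates) / total
        for i in range(dim)
    ]


# Illustrative values only: two sites with made-up parameters and record counts.
updates = [
    LocalUpdate(weights=[0.2, -1.0], n_samples=100),
    LocalUpdate(weights=[0.4, 0.0], n_samples=300),
]
global_weights = aggregate(updates)
print(global_weights)
```

In this sketch the site with more records pulls the global parameters toward its own, which is one common design choice; an unweighted mean or a different ensemble strategy would be equally consistent with the description above.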