Cafe not only outperforms local and standard federated models but also surpasses centralized models that have full access to all the data, which is both surprising and groundbreaking.
Invention Summary:
Decentralized machine learning (ML) approaches, such as federated learning (FL), are often limited by incomplete data in distributed datasets. The problem is compounded by heterogeneity in missingness: the pattern and cause of missing values differ from one dataset to another. These discrepancies, caused by privacy concerns or systematic errors, undermine data preprocessing and degrade model performance, preventing organizations from fully leveraging their data’s potential.
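To make the idea of heterogeneous missingness concrete, the sketch below simulates two clients drawn from the same population, each losing values through a different value-dependent (MNAR-style) rule. The function names, thresholds, and drop probability are illustrative assumptions, not details of Cafe.

```python
import random

def mnar_mask(values, threshold, drop_high, p_missing, rng):
    """Hide values with a probability that depends on the value itself (MNAR-style)."""
    observed = []
    for v in values:
        at_risk = v > threshold if drop_high else v < threshold
        if at_risk and rng.random() < p_missing:
            observed.append(None)  # value is missing
        else:
            observed.append(v)
    return observed

def observed_mean(xs):
    """Mean of the values that were actually recorded."""
    kept = [x for x in xs if x is not None]
    return sum(kept) / len(kept)

rng = random.Random(0)
population = [rng.gauss(50.0, 10.0) for _ in range(5000)]

# Hypothetical clients: A tends to lose high values, B tends to lose low values.
client_a = mnar_mask(population[:2500], 55.0, True, 0.8, rng)
client_b = mnar_mask(population[2500:], 45.0, False, 0.8, rng)

true_mean = sum(population) / len(population)
mean_a = observed_mean(client_a)
mean_b = observed_mean(client_b)
print(f"true={true_mean:.2f}  client A={mean_a:.2f}  client B={mean_b:.2f}")
```

In this toy setup each client's observed mean is biased in the opposite direction, which is why a single shared assumption about missingness does not fit both clients.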
To overcome this challenge, Dr. Jaideep Vaidya and his team have developed Complementarity Adjusted Federated Learning (Cafe), a novel approach to federated imputation of missing data that is effective even for data that is Missing Not At Random (MNAR), the most difficult form of missing data to address. Cafe learns each client’s missing data mechanism locally, quantifies heterogeneity across clients, and uses pairwise complementarity and sample size scores to create federated averages of local imputation models. This method has consistently outperformed baseline approaches in both imputation and federated prediction tasks, making it a game-changer for decentralized machine learning.
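The aggregation step described above can be pictured with a minimal sketch. The summary does not give Cafe's scoring formulas, so everything here is an assumption made for illustration: the `complementarity` function (a plain Euclidean distance between per-client missingness-mechanism parameters), the `alpha` trade-off parameter, and the way the two scores are multiplied together.

```python
import math

def complementarity(mech_a, mech_b):
    """Hypothetical score: distance between two clients' missingness-mechanism parameters."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(mech_a, mech_b)))

def aggregate_for_client(i, models, mechanisms, sizes, alpha=1.0):
    """Weighted average of all local imputation-model parameters, from client i's view.

    Assumed weighting: sample size times (1 + alpha * complementarity with client i).
    """
    raw = []
    for j in range(len(models)):
        comp = complementarity(mechanisms[i], mechanisms[j])
        raw.append(sizes[j] * (1.0 + alpha * comp))
    total = sum(raw)
    weights = [r / total for r in raw]
    dim = len(models[i])
    averaged = [sum(w * m[k] for w, m in zip(weights, models)) for k in range(dim)]
    return averaged, weights

# Toy inputs: three clients, two model parameters each, one mechanism parameter each.
models = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
mechanisms = [[2.0], [-2.0], [1.9]]
sizes = [100, 100, 100]

averaged, weights = aggregate_for_client(0, models, mechanisms, sizes)
print(weights)   # client 1, whose mechanism differs most from client 0, gets the largest weight
print(averaged)
```

With `alpha=0.0` the weights reduce to ordinary sample-size-weighted averaging, so in this sketch the complementarity term acts purely as an adjustment on top of standard federated averaging.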
Market Applications:
- Clinical research data preparation tools
- Financial fraud detection systems
- Personalized medicine services
Advantages:
- First federated imputation model designed to handle MNAR data
- Leverages heterogeneity in missing data to improve data processing and quality control
- Outperforms baseline models on MNAR data
- Outperforms local, global, and federated imputation models
Intellectual Property & Development Status:
Patent pending. Available for licensing and/or research collaboration. For business development and other collaborative partnerships, contact: marketingbd@research.rutgers.edu