Federated Learning Concepts

Federated learning trains a shared model across many decentralised devices or servers. Each participant keeps its data local and shares only model updates (gradients or weights), so raw data stays on the device, although the updates themselves can still leak information unless they are protected further.

The FedAvg Algorithm

The canonical federated learning algorithm, Federated Averaging (FedAvg), repeats three phases each round: the server distributes the global model to clients, the clients train locally on their private data, and the server aggregates the client updates.

FedAvg Step-by-Step

Server broadcasts the current global model weights \\(w_t\\) to a selected subset of clients
Each client \\(k\\) runs \\(E\\) epochs of SGD on its local dataset, starting from \\(w_t\\) and producing updated weights \\(w_t^k\\)
Server aggregates client weights via a weighted average: \\(w_{t+1} = \\sum_k \\frac{n_k}{n} w_t^k\\), where \\(n_k\\) is client \\(k\\)'s dataset size and \\(n = \\sum_k n_k\\) is the total over the selected clients
Repeat until convergence

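Throughout this loop no raw data leaves the clients; only model weights are exchanged. Convergence is not guaranteed in general and depends on how similar the clients' data distributions are.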

Conceptual Simulation in NumPy

<pre><code class="language-python">import numpy as np


def simulate_fedavg(global_weights, client_updates, client_sizes):
    """
    Aggregate client model updates using weighted FedAvg.

    client_updates: list of weight arrays from each client
    client_sizes: number of samples each client trained on
    """
    total_samples = sum(client_sizes)
    new_weights = np.zeros_like(global_weights)
    for update, n_k in zip(client_updates, client_sizes):
        new_weights += (n_k / total_samples) * update
    return new_weights


# Simulate 3 clients with random local updates
global_w = np.zeros(10)
client_updates = [np.random.randn(10) * 0.1 for _ in range(3)]
client_sizes = [100, 200, 150]

new_global_w = simulate_fedavg(global_w, client_updates, client_sizes)
print("Aggregated weights:", new_global_w[:5])</code></pre>
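The aggregation step can be combined with local training into complete rounds. The following is a minimal sketch, assuming a linear least-squares model and full-batch gradient steps standing in for mini-batch SGD. The names `local_train` and `fedavg_round`, the learning rate, and the synthetic data are illustrative assumptions, not part of any library.

```python
import numpy as np


def local_train(weights, X, y, epochs=5, lr=0.1):
    """Run a few full-batch gradient steps on one client's least-squares loss."""
    w = weights.copy()
    for _ in range(epochs):
        grad = X.T @ (X @ w - y) / len(y)  # gradient of 0.5 * mean squared error
        w -= lr * grad
    return w


def fedavg_round(global_w, client_data):
    """One round: broadcast, local training, sample-weighted averaging."""
    sizes = np.array([len(y) for _, y in client_data], dtype=float)
    local_ws = np.stack([local_train(global_w, X, y) for X, y in client_data])
    return (sizes / sizes.sum()) @ local_ws


rng = np.random.default_rng(0)
true_w = np.array([1.0, -2.0, 0.5])  # hypothetical ground truth for the toy data

# Three clients with different dataset sizes; only weights are passed around.
client_data = []
for n_k in (100, 200, 150):
    X = rng.normal(size=(n_k, 3))
    y = X @ true_w + 0.01 * rng.normal(size=n_k)
    client_data.append((X, y))

global_w = np.zeros(3)
for _ in range(50):
    global_w = fedavg_round(global_w, client_data)

print("Learned weights:", np.round(global_w, 2))
```

Because all three toy clients draw from the same distribution, the averaged model should land close to `true_w`; with strongly differing client distributions the same loop can behave much less smoothly.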

Privacy Enhancements

Sharing raw gradients can leak information about training data. Two key techniques harden federated learning against privacy attacks.

Differential Privacy

Differential Privacy (DP) bounds each client update (typically by clipping its norm) and adds calibrated Gaussian noise before uploading, giving a mathematical bound on how much any individual data point can influence, and therefore be inferred from, the shared updates. The privacy budget \\(\\epsilon\\) controls the trade-off: a smaller \\(\\epsilon\\) gives stronger privacy but lower model utility.
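A minimal sketch of the client-side step, assuming the common clip-then-noise recipe (the Gaussian mechanism): the update is clipped to an L2 norm bound so that its sensitivity is known, then Gaussian noise scaled to that bound is added. The function name `privatize_update` and the default values of `clip_norm` and `noise_multiplier` are illustrative assumptions. Translating them into a concrete \\(\\epsilon\\) requires a privacy accountant, which is not shown here.

```python
import numpy as np


def privatize_update(update, clip_norm=1.0, noise_multiplier=1.1, rng=None):
    """Clip an update to L2 norm <= clip_norm, then add Gaussian noise."""
    rng = np.random.default_rng() if rng is None else rng
    norm = np.linalg.norm(update)
    # Scale down only if the update exceeds the clipping bound.
    clipped = update * min(1.0, clip_norm / max(norm, 1e-12))
    # Noise standard deviation is proportional to the clipping bound.
    noise = rng.normal(0.0, noise_multiplier * clip_norm, size=update.shape)
    return clipped + noise


rng = np.random.default_rng(0)
raw_update = rng.normal(size=10) * 5.0  # deliberately large update
noisy_update = privatize_update(raw_update, rng=rng)

print("Raw norm:  ", np.linalg.norm(raw_update))
print("Noisy norm:", np.linalg.norm(noisy_update))
```

A larger `noise_multiplier` corresponds to a smaller \\(\\epsilon\\) (stronger privacy) at the cost of a noisier aggregate.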

Secure Aggregation

Secure Aggregation uses cryptographic protocols (secret sharing, homomorphic encryption) so the server learns only the sum of client updates — never any individual client's gradient. This protects against a curious server while still enabling accurate FedAvg aggregation.
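The cancellation idea behind masking-based protocols can be illustrated with pairwise additive masks. This is a toy sketch only: it assumes every pair of clients already shares a random mask and that no client drops out, whereas real protocols derive masks through key agreement and operate over finite fields. `mask_updates` is a hypothetical helper.

```python
import numpy as np


def mask_updates(updates, rng):
    """Add pairwise masks that cancel when all masked updates are summed."""
    masked = [u.copy() for u in updates]
    n = len(updates)
    for i in range(n):
        for j in range(i + 1, n):
            # Stand-in for a secret shared between clients i and j.
            mask = rng.normal(size=updates[i].shape)
            masked[i] += mask  # client i adds the mask
            masked[j] -= mask  # client j subtracts the same mask
    return masked


rng = np.random.default_rng(0)
updates = [rng.normal(size=4) for _ in range(3)]
masked = mask_updates(updates, rng)

# The server only sees `masked`; each entry is obscured on its own,
# but the masks cancel in the sum.
print("True sum:  ", np.sum(updates, axis=0))
print("Masked sum:", np.sum(masked, axis=0))
```

For weighted FedAvg, one option is for each client to scale its update by its dataset size before masking, so that the server can divide the recovered sum by the total sample count.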

Challenges in Federated Learning

Federated learning introduces challenges absent in centralised training: non-IID data across clients, unreliable client connectivity, and slower convergence.

Key Challenges and Mitigations

Statistical heterogeneity (non-IID data): FedProx adds a proximal term to each client's local objective, penalising large deviations from the global model
System heterogeneity: Asynchronous aggregation or client sampling handles slow or dropping clients
Communication efficiency: Gradient compression, quantisation, and model distillation reduce upload costs
Convergence: More local epochs per round reduce the number of communication rounds needed, but can increase client drift on non-IID data
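The FedProx proximal term can be sketched as a small change to the local gradient step: the client minimises its local loss plus \\(\\frac{\\mu}{2} \\lVert w - w_t \\rVert^2\\), which adds \\(\\mu (w - w_t)\\) to the gradient. The sketch below assumes a least-squares local loss and full-batch gradient descent; `fedprox_local_train`, the value of `mu`, and the toy data are illustrative assumptions.

```python
import numpy as np


def fedprox_local_train(global_w, X, y, mu=0.1, epochs=20, lr=0.1):
    """Local gradient descent with a proximal pull toward the global weights."""
    w = global_w.copy()
    for _ in range(epochs):
        loss_grad = X.T @ (X @ w - y) / len(y)  # least-squares gradient
        prox_grad = mu * (w - global_w)         # gradient of (mu/2) * ||w - global_w||^2
        w -= lr * (loss_grad + prox_grad)
    return w


rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
y = X @ np.array([3.0, -1.0, 2.0])  # a client whose optimum is far from the global model
global_w = np.zeros(3)

w_plain = fedprox_local_train(global_w, X, y, mu=0.0)  # mu = 0 is plain local training
w_prox = fedprox_local_train(global_w, X, y, mu=1.0)

print("Drift without proximal term:", np.linalg.norm(w_plain - global_w))
print("Drift with proximal term:   ", np.linalg.norm(w_prox - global_w))
```

With the same number of local epochs, the proximal variant stays closer to the global weights, which is the mechanism that limits client drift on non-IID data.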