This paper is available on arxiv under CC BY 4.0 DEED license.

Authors:

(1) Zhipeng Wang, Department of Computing, Imperial College London;

(2) Nanqing Dong Department of Computer Science, University of Oxford;

(3) Jiahao Sun, Data Science Institute, Imperial College London;

(4) William Knottenbelt, Department of Computing, Imperial College London.

Abstract & Introduction

Related Work & Preliminaries

Methodology

Theoretical and Empirical Analysis

Results

Conclusion & References

Zero-Knowledge Proofs. Zero-knowledge proofs (ZKPs) have emerged as a revolutionary cryptographic concept that allows one party (the prover) to demonstrate the truth of a statement to another party (the verifier) without revealing any information beyond the statement’s validity [12]. One of the most notable applications of ZKPs is in privacy-preserving authentication, where a user can prove to a aggregator that they know a secret without revealing the secret itself. ZKPs have also been applied in other areas, including blockchains [28,30], machine learning [22,31,9] and FL [6]. However, existing research, such as RoFL [6,21], has predominantly focused on using ZKPs to address malicious client behaviors or enhance the client privacy, while assuming the existence of a “honest-butcurious” (i.e., semi-honest) aggregator. In this work, we break away from this assumption and leverage ZKPs to ensure the honest aggregation of the centralized aggregator without requiring trust in the aggregator itself. Blockchain-based Federated Learning. Blockchain is a decentralized and immutable distributed ledger technology [15,10,4] that underpins cryptocurrencies such as Bitcoin [25] and Ethereum [33]. It provides a secure and transparent way to record and verify transactions across a network of nodes. Blockchain has been integrated with FL to tackle the security and privacy issues in existing FL [34,16,27,19,32].

Specifically, blockchain is used to store and manage the training model’s updates and the associated metadata securely and transparently. Instead of relying solely on a centralized aggregator to manage the model updates, the blockchain enables a decentralized and distributed consensus mechanism among the participating clients. However, existing blockchain-based FL designs [8] rely heavily on the computation of smart contracts (i.e., self-executing programs that run on a blockchain network and are triggered by blockchain events), which will cause expensive costs for FL networks with a large number of parameters. In our work, we address the scalability issue of blockchain-based FL by using zero-Knowledge proofs.

Preliminaries

In this section, we present the cryptographic building blocks for our zkFL system.

Commitments. A commitment scheme enables an entity to conceal a value while committing to it, with the ability to disclose the value at a later time if desired. The commitment scheme typically comprises two rounds: the committing round and the revealing round. In the committing round, the client commits to specific values while ensuring their confidentiality from others. The client retains the option to reveal the committed value in the subsequent revealing round. A commitment scheme includes two algorithms:

In this paper, we leverage Pedersen commitments [26,1] to compute the clients’ commitments/encryption on their local training model updates. Specifically, given the model update wi , a client will encrypt the update by computing Enc(wi) = g wi · h si , where g and h are public parameters and si is a random number generated by the client.

Zero-Knowledge Proof

Zero-knowledge proof [18,11,29] is a cryptographic primitive that allows a prover to convince the verifier about the correctness of some assertions without providing any meaningful information to the verifier. A zeroknowledge proof of some statement satisfies the following three properties:

A zero-knowledge Succinct Non-interactive ARgument of Knowledge (zkSNARK) is a “succinct” non-interactive zero-knowledge proof (NIZK) for arithmetic circuit satisfiability. The construction of zk-SNARK is based on a field F and an arithmetic circuit C. An arithmetic circuit satisfiability problem of a circuit C : F n × F h → F l is captured by the statement stC : {(x,wit) ∈ F n × F h : C(x,wit) = 0l}, with the language LC = {x ∈ F n | ∃ wit ∈ F h s.t. C(x,wit) = 0 l}. Apart from correctness, soundness, and zero-knowledge properties of zeroknowledge proof, a zk-SNARK requires two additional properties, i.e., succinctness and simulation extractability [12,2].

System and Threat Models

In this section, we outline our system model, threat model, and system goals.

System Model

Threat Model

We consider a malicious aggregator that can choose not to honestly aggregate the local model updates from clients. The malicious aggregator can deviate from the protocol by:

We would like to remark that the scope of this work centers around the malicious aggregator rather than the honest-but-curious one, with a specific emphasis on ensuring aggregation integrity. We also acknowledge the potential for the aggregator to carry out model inversion attacks on the clients, a topic we intend to delve into in a future study.

System Goals