Authors:

(1) Mathias Brossard, Systems Group, Arm Research;

(2) Guilhem Bryant, Systems Group, Arm Research;

(3) Basma El Gaabouri, Systems Group, Arm Research;

(4) Xinxin Fan, IoTeX.io;

(5) Alexandre Ferreira, Systems Group, Arm Research;

(6) Edmund Grimley-Evans, Systems Group, Arm Research;

(7) Christopher Haster, Systems Group, Arm Research;

(8) Evan Johnson, University of California, San Diego;

(9) Derek Miller, Systems Group, Arm Research;

(10) Fan Mo, Imperial College London;

(11) Dominic P. Mulligan, Systems Group, Arm Research;

(12) Nick Spinale, Systems Group, Arm Research;

(13) Eric van Hensbergen, Systems Group, Arm Research;

(14) Hugo J. M. Vincent, Systems Group, Arm Research;

(15) Shale Xiong, Systems Group, Arm Research.

Editor's note: this is part 4 of 6 of a study detailing the development of a framework to help people collaborate securely. Read the rest below.

4 Veracruz

Throughout this section we make reference to the system components presented in the schematic in Fig. 1.

Veracruz is a framework that may be specialized to obtain a particular privacy-preserving, collaborative computation of interest. A Veracruz computation involves an arbitrary number of data owners collaborating with a single program owner. The framework places no limit on the number of data owners, but a particular computation obtained by specializing Veracruz always spells out a precise number of participants. We use π to denote the program of the program owner, and Di for 1 ≤ i ≤ N to denote the data sets of the various data owners in an arbitrary Veracruz computation.

The global policy captures the topology of a computation, specifying where information may flow from and to whom, while varying the program π varies precisely what is being computed. By varying the two, Veracruz captures a general pattern of interaction shared by many delegated computations, and one could, for example, realize a varied palette of computations of interest, including:

Moving heavy computations safely off a computationally weak device to an untrusted edge device or server. The computationally-weak device is both data provider and result receiver, the untrusted edge device or server is delegate, and the computationally-weak device or its owner is the program provider, providing the computation to be performed.

Privacy-preserving machine learning between a pair of mutually distrusting parties with private datasets, but where learnt models are made available to both participants. Both principals are data providers, contributing their datasets provided in some common format, and also act as result receivers for the learnt model. Arbitrarily one acts as program provider, providing the implementation of the machine learning algorithm of interest. A third-party, e.g., a Cloud host, acts as delegate.

A DRM mechanism wherein novel IP (e.g., computer vision algorithms) are licensed out on a “per use” basis, and where the IP is never exposed to customers. The IP owner is program provider, and the licensee is both data provider and result receiver, providing the inputs to, and receiving the output from, the private IP. The IP owner themselves may act as delegate, or this can be contracted out to a third-party. With this, the IP owner never observes the input or output of the computation, and the licensee never observes the IP.

The implementation of privacy-preserving auctions. An auction service acts as program provider, implementing a sealed-bid auction, and also acts as delegate. Bidders are data providers, submitting sealed bids. All principals are also result receivers, receiving notice of the auction winner and the price to be paid, which is public. Neither bidder nor auction service ever learns the details of any bids, other than their own and the winning bid.

In addition, it is easy to see how more complex distributed systems can be built around Veracruz. For example, a volunteer Grid computing framework where confidentiality is not paramount, but computational integrity is; an Ambient computing runtime for mobile computations across a range of devices; a privacy-preserving MapReduce [32] or Function-as-a-Service (FaaS, henceforth) style framework. Here, computational nodes act as independent delegates for some aspect of the wider computation, and different isolation technologies may also be used in a single computation, either due to availability for Grid or Ambient computing, or due to scheduling of sensitive sub-computations onto stronger isolation mechanisms for MapReduce.

In the most general case, each principal in a Veracruz computation is mutually mistrusting, and does not wish to declassify—or intentionally reveal—their data: data providers do not wish to divulge their input datasets and the program provider does not wish to divulge their program. Nevertheless, as the examples enumerated above indicate, for some computations declassification can be useful, for example as inducement to other principals to enroll in the computation, a “nothing up my sleeve” demonstration. Referring back to the privacy-preserving machine learning use-case, above, the program provider may intentionally declassify their program for auditing—before other principals agree to participate— as a demonstration that the program implements the correct algorithm, and will not (un)intentionally leak secrets. Similarly, for a Grid computing project, revealing details of the computation, as an enticement to users to donate their spare computational capacity, may be beneficial.

Declassification can also occur as a side effect of the computation itself, for example when the result of a computation—which can reveal significant amounts of information about its inputs, depending on π—is shared with an untrusted principal. Principals must evaluate the global policy carefully, before enrolling, to understand where results will flow to, and what they may say about any secrets. Though Veracruz can be used to design privacy-preserving distributed computations, not every computation is necessarily privacy-preserving.
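To make the role of the global policy concrete, the sketch below shows the general shape such a policy might take, enumerating principals, their certificates, and their capabilities over the shared filesystem. The field names and layout here are illustrative only, and are not Veracruz's actual policy schema:

```json
{
  "execution_strategy": "Interpretation",
  "proxy_attestation_root_ca": "-----BEGIN CERTIFICATE-----...",
  "principals": [
    {
      "role": "program_provider",
      "certificate": "-----BEGIN CERTIFICATE-----...",
      "can_write": ["/program.wasm"]
    },
    {
      "role": "data_provider",
      "certificate": "-----BEGIN CERTIFICATE-----...",
      "can_write": ["/input-0"]
    },
    {
      "role": "result_receiver",
      "certificate": "-----BEGIN CERTIFICATE-----...",
      "can_read": ["/output-0"]
    }
  ]
}
```

Each principal evaluates a policy of this shape before enrolling: the capability lists spell out exactly where their data may flow, and to whom.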

4.1 Attestation

Given that Veracruz supports multiple isolation technologies, a series of attestation-related problems arises:

Complex client code: Software used by principals delegating a computation to Veracruz must support multiple attestation protocols, complicating it. As Veracruz adds support for more isolation mechanisms—potentially with new attestation protocols—this client code must be updated to interact with the new class of isolate.

Leaky abstraction: Veracruz abstracts over isolation technology, allowing principals to easily delegate computations without worrying about the programming or attestation model associated with any one class of isolate. Forcing clients to switch attestation protocols, depending on the isolation technology, breaks this uniformity.

Potential side-channel: For some attestation protocols, each principal in a Veracruz computation must refer attestation evidence to an external attestation service. That service thereby learns the identities, and the number, of principals participating in a computation, a potential metadata leak.

Attestation policy: Principals may wish to disallow computations on delegates with particular isolation technologies. These policies may stem from security disclosures (vulnerabilities in particular firmware versions, for example), changes in business relationships, or geopolitical trends. Given our support for heterogeneous isolation technologies, being able to declaratively specify who or what can be trusted becomes desirable. Existing attestation services do not take policy into account, pushing the burden onto client code, which is problematic if policy changes, as client code must be updated.

In response, we introduce a proxy attestation service for Veracruz, which must be explicitly trusted by all principals in a computation, and whose server and management software are open source and auditable by anyone. This service is not itself protected by an isolate, though in principle it could be; doing so would, for example, allow principals to check the authenticity of the proxy attestation service before trusting it. Implementing this would be straightforward; for now we assume that the attestation service is implicitly trusted.

The proxy attestation service first uses an onboarding process to enroll an isolate hosting Veracruz, after which the isolate can act as a TLS server for principals participating in a computation. We describe these steps, referring to Fig. 2.

Onboarding an isolate The proxy attestation service maintains a root CA key (a public/private key pair) and a root CA certificate containing the root CA public key, signed by the root CA private key. This root CA certificate is included in the global policy file of any computation using that proxy attestation service. To enroll, the isolate generates a key pair and a Certificate Signing Request (CSR, henceforth), embeds a hash of the CSR in the user-defined field of its native attestation evidence, and submits both to the proxy attestation server. An onboarding protocol is then followed:

  1. The proxy attestation server authenticates the attestation evidence received via the native attestation flow. Depending on the particular protocol, this could be as simple as verifying signatures via a known-trusted certificate, or by authenticating the received evidence using an external attestation service.

  2. The proxy attestation service computes the hash of the received CSR and compares it against the contents of the user-defined field of the attestation evidence. If it matches, it confirms that the CSR is from the same isolate as the evidence.

  3. The proxy attestation server converts the CSR to an X.509 Certificate [28] containing a custom extension capturing details about the isolate derived from the attestation process, including a hash of the Veracruz runtime executing inside the isolate (and optionally other information about the platform on which the isolate is executing). The certificate is signed by the private component of the proxy attestation server’s Root CA key.

  4. The proxy attestation server returns the generated certificate to the Veracruz runtime inside the isolate.
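The CSR-binding check in step 2 can be sketched as follows. This is a minimal illustration, not Veracruz's implementation: the digest is a stand-in (production code would use a cryptographic hash such as SHA-256), and the function names are our own.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Stand-in digest for illustration only; a real implementation
// would use a cryptographic hash such as SHA-256.
fn digest(bytes: &[u8]) -> u64 {
    let mut h = DefaultHasher::new();
    bytes.hash(&mut h);
    h.finish()
}

/// Step 2 of the onboarding flow: confirm that the CSR came from
/// the same isolate that produced the attestation evidence, by
/// comparing the CSR's hash against the user-defined field of
/// that evidence.
fn csr_matches_evidence(csr: &[u8], user_defined_field: u64) -> bool {
    digest(csr) == user_defined_field
}

fn main() {
    let csr = b"-----BEGIN CERTIFICATE REQUEST-----...";
    // The isolate embedded this value in its attestation evidence.
    let evidence_field = digest(csr);
    assert!(csr_matches_evidence(csr, evidence_field));
    assert!(!csr_matches_evidence(b"tampered CSR", evidence_field));
    println!("CSR binding verified");
}
```

Only after this check succeeds does the server proceed to step 3 and sign a certificate for the isolate's key.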

In the typical CA infrastructure, a delegated certificate can be revoked by adding it to a Certificate Revocation List, checked by clients before completing a TLS handshake. While this scheme is possible with our system, we elected to use a different approach, setting the expiry in the isolate’s certificate to a relatively short time in the future, so that the proxy attestation service can limit the amount of time a compromised isolate can be used in computations. The lifetime of isolate certificates is set by a policy of the proxy attestation service, based on its appetite for risk.

Note that the proxy attestation service solves the problems with attestation described above. First, client code is provided with a uniform attestation interface—here, we use Arm’s PSA attestation protocol [86]—independent of the underlying isolation technology in use. Second, none of the principals in the computation need to communicate with any native attestation service. Thus, the native attestation service knows that software was started in a supported isolate, but it has no knowledge of the identities or even the number of principals. Finally, the global policy represents the only source of policy enforcement. The authors of the global policy can declaratively describe who and what they are willing to trust, with a principal’s client software taking this information into account when authenticating or rejecting an attestation token.

Lastly, we note that our attestation process is specifically designed to accommodate client code running on embedded microcontrollers—e.g., Arm Cortex®-M3 devices— with limited computational capacity, constrained memory and storage (often measured in tens of kilobytes), and which tend to be battery-powered with limited network capacity. Communication with an attestation service is therefore cost- and power-prohibitive, and using a certificate-based scheme allows constrained devices to authenticate an isolate running Veracruz efficiently. To validate this, we developed Veracruz client code for microcontrollers, using the Zephyr embedded OS [100]. Our client code is 9KB on top of the mbedtls stack [60], which is generally required for secure communication anyway. Using this, small devices can safely offload large computations to an attested Veracruz instance.

4.2 Programming model

Wasm [41] is a sandboxing mechanism designed for use in security-critical contexts—namely, web browsers. It is designed to be embeddable within a wider host, has a precise semantics [93], is widely supported as a compilation target by high-level programming languages such as Rust and C, and has high-quality interpreters [90] and JIT execution engines [91] available. We have therefore adopted Wasm as our executable format, supporting both interpretation and JIT execution, with the strategy specified in the global policy.

Veracruz uses Wasm to protect the delegate’s machine from the executing program, to provide a uniform programming model, to constrain the behavior of the program, and to act as a portable executable format for programs, abstracting away the underlying instruction set architecture. Via Wasm, the trusted Veracruz runtime implements a “two-way isolate” wherein the runtime is protected from prying and interference from the delegate, and the delegate is protected from malicious program behaviors originating from untrusted code.

To complete a computation, a Wasm program needs some way of reading inputs provided to it by the data providers, and some way of writing outputs to the result receivers. However, we would like to constrain the behavior of the program as far as possible: a program dumping one of its secret inputs to stdout on the host’s machine would break the privacy guarantees that Veracruz aims to provide, for example. Partly for this reason, we have adopted the WebAssembly System Interface [97] (or Wasi, henceforth) as the programming model for Veracruz. Intuitively, this can be thought of as “Posix for Wasm”, providing a system interface for querying Veracruz’s in-memory filesystem, generating random bytes, and executing other similar system tasks. (In this light, the Veracruz runtime can be seen as a simple operating system for Wasm.) By adopting Wasi, one may also use existing libraries and standard programming idioms when targeting Veracruz.

Wasi uses capabilities, in a similar vein to Capsicum [92]: a program may only use functionality which it has been explicitly authorized to use. The capabilities of the program, π, are specified in the global policy, and typically extend to reading inputs, writing outputs, and generating random bytes, constraining the program to act as a pure, randomized function.
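A program targeting Veracruz can therefore be written with ordinary filesystem idioms and compiled to the wasm32-wasi target. The sketch below is illustrative: the paths and the trivial byte-counting “computation” are our own inventions, and the real paths a program may touch are fixed by the capability grants in the global policy.

```rust
use std::fs;

// The computation itself, kept separate from I/O for clarity:
// a trivial byte count standing in for a real algorithm.
fn compute(input: &[u8]) -> String {
    format!("{} bytes received", input.len())
}

fn main() {
    // Inside Veracruz this would read the data provider's input
    // from the in-memory filesystem; here we fall back to an empty
    // input so the sketch also runs natively.
    let input = fs::read("/input-0").unwrap_or_default();

    // Write the result where the result receiver can retrieve it;
    // outside Veracruz this write may fail, which we ignore here.
    let _ = fs::write("/output-0", compute(&input));
}
```

Built with a stock Rust toolchain (cargo build --target wasm32-wasi), the same source runs under any Wasi host, which is precisely the uniformity the programming model is meant to provide.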

4.3 Ad hoc acceleration

Many potential Veracruz applications make use of common, computationally intensive, or security-sensitive routines: cryptography, (de)serialization, and similar. While these routines could be compiled into Wasm, this may incur a performance penalty compared to optimized native code, and for operations such as cryptography, compilation to Wasm may not preserve security properties such as timing side-channel safety. It is therefore better to provide a single, efficient, and correct implementation for common use than to have these routines compiled into Wasm haphazardly.

In response, we introduced “native modules” providing acceleration for specific tasks which are linked into the Veracruz runtime and invoked from Wasm programs. In benchmarking one such module—the acceleration of (de)serialization of Json documents from the pinecone binary format—we observe a 35% speed-up when (de)serializing a vector of 10,000 random elements (238s native vs. 375s Wasm). Additional optimization will likely further boost performance.

Given the ad hoc nature of these accelerators, their lack of uniformity, and the fact that more will be added over time, invoking them from Wasm is problematic. Extending the Veracruz system interface to incorporate accelerator-specific functionality would take us beyond Wasi, and would require support libraries for programming with Veracruz. Instead, we opt for an interface built around special files in the Veracruz filesystem, with modules invoked by Wasm programs writing to and reading from these files, reusing existing programming idioms and filesystem support in Wasi.
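From the Wasm program’s side, the special-file convention might look as follows. This is a hedged sketch, not Veracruz’s actual interface: the directory layout, file names, and invoke_native_module helper are hypothetical, and the main function merely simulates the runtime by pre-populating a response.

```rust
use std::fs;
use std::io::Result;
use std::path::Path;

// Invoke a native module by writing a request into its special
// input file and reading the response back from its output file.
// In a real deployment the runtime intercepts these accesses and
// runs the native code; nothing here is Veracruz's actual layout.
fn invoke_native_module(service_dir: &str, request: &[u8]) -> Result<Vec<u8>> {
    let dir = Path::new(service_dir);
    fs::write(dir.join("input"), request)?;
    fs::read(dir.join("output"))
}

fn main() -> Result<()> {
    let dir = std::env::temp_dir().join("veracruz-demo");
    fs::create_dir_all(&dir)?;
    // Simulate the runtime having produced a response already.
    fs::write(dir.join("output"), b"{\"ok\":true}")?;

    let response = invoke_native_module(dir.to_str().unwrap(), b"request bytes")?;
    println!("response: {} bytes", response.len());
    Ok(())
}
```

Because the interface is just files, a program needs no extra support library to use an accelerator: the same read and write idioms used for inputs and outputs suffice.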

4.4 Threat model

The Veracruz TCB includes the underlying isolate, the Veracruz runtime, and the implementation of the Veracruz proxy attestation service. The host of the Veracruz attestation service must also be trusted by all parties, as must the native attestation services or keys in use. The correctness of the various protocols in use—TLS, platform-specific native attestation, and PSA attestation—must also be trusted.

The Wasm execution engine must also be trusted to correctly execute a binary, so that a computation is faithfully executed according to the published bytecode semantics [80, 93], and that the program is unable to escape its sandbox, damage or spy on a delegate, or have any other side-effect than allowed by the Veracruz sandboxing model. Recent techniques have been developed that use post-compilation verification to establish this trust [48]—we briefly discuss our ongoing experiments in this area in §6. Compiler verification could be used to engender trust in the Wasm execution engine, though we are not aware of any verified, high-performance Wasm interpreters or JITs suitable for use with Veracruz at the time of writing (see [94] for progress toward this, however). Memory issues have been implicated in attacks against isolates in the past [58]—we write Veracruz in Rust in an attempt to avoid this, with the compiler therefore also trusted.

Veracruz does not defend against denial-of-service attacks: the delegate is in charge of scheduling execution, and liveness guarantees are therefore impossible to uphold. A malicious principal can therefore deny others access to a computation’s result, or refuse to provision a data input or program, thereby blocking the computation from even starting.

Different isolation technologies defend against different classes of attacker, and as Veracruz supports multiple technologies we must highlight these differences explicitly.

AWS Nitro Enclaves protect computations from the AWS customer running the EC2 instance associated with the isolate. While AWS assures users that isolates are protected from employees and other insiders, these assurances are difficult to validate (and, as silicon manufacturer, AWS and its employees must always be trusted). Our TCB therefore also contains the Nitro hardware, Linux host used inside the isolate, the attestation infrastructure for Nitro Enclaves, and any AWS insiders with access to that infrastructure.

For Arm CCA Realms only the Realm Management Monitor (RMM, henceforth), a separation kernel isolating Realms from each other, has access to the memory of a Realm other than the software executing in the Realm itself. Realms are protected from the non-secure hypervisor, and any other software running on the system other than the RMM, and will be protected against a class of physical attacks using memory encryption. Our TCB therefore contains the RMM, the system hardware, Linux host inside the Realm, along with the attestation infrastructure for Arm CCA.

For IceCap our TCB includes the seL4 kernel which we rely on to securely isolate processes from one another, bolstered by a body of machine-checked proofs of the kernel’s security and functional correctness (though at present these do not extend to the EL2 configuration for AArch64). For a typical hypervisor deployment of seL4, the SMMU is the only defence against physical attacks.

The TCB of Veracruz includes both local and remote stacks of hardware and software, while purely cryptographic techniques merely rely on a trustworthy implementation of a primitive and the correctness of the primitive itself. As demonstrated in §5, Veracruz provides a degree of efficiency and practicality currently out of reach for purely cryptographic techniques, at the cost of this larger TCB.

Principals face a challenging class of threats stemming from collusion between the other principals, including the delegate. Some algorithms may be particularly vulnerable to an unwanted declassification of secret inputs to any result receiver, and some attacks may be enhanced by collusion between principals—e.g., a side-channel inserted into the program for the benefit of the delegate. As discussed in §2, several powerful side-channel attacks have been demonstrated in the past against software executing within isolates, and other side-channels also exist, including the wall-clock execution time of the program, π, on the input data sets, and data sizes and arrival times leaked by TLS connections. In cases where programs are secret, principals must trust the program provider not to collude with the result receiver, as a secret program could trivially leak data into the result or contain covert channels. If this trust relationship is undesirable, then principals should insist on program declassification before enrolling in a computation.

This paper is available on arxiv under CC BY 4.0 DEED license.