Abstract and I. Introduction

II. Background

III. Paranoid Stateful Lambda

IV. SCL Design

V. Optimizations

VI. PSL with SCL

VII. Implementation

VIII. Evaluation

IX. Related Work

X. Conclusion, Acknowledgment, and References

II. BACKGROUND

A. Secure Enclaves

Our design does not assume specific secure enclave hardware or a particular set of supported instructions; we only require that the trusted hardware provide memory protection and attestation. Here, we introduce Intel Software Guard Extensions (SGX) [15] because of its wide adoption. SGX allows users to create a secure, isolated environment protected from the privileged host OS, the hypervisor, and any hardware devices connected to the host. SGX protects against physical adversaries and uses a hardware Memory Encryption Engine (MEE) to guarantee the confidentiality and integrity of enclave memory. All enclave memory must reside in a dedicated region of memory called the enclave page cache (EPC). When an EPC page is evicted, it is encrypted before being stored outside the EPC (for example, on disk). When an evicted page is loaded back into the EPC, it is integrity-checked and decrypted.

The host OS is still responsible for mapping page tables and allocating memory, but subsequent memory accesses are checked by SGX. During page-table walks, SGX ensures that an enclave page can be accessed only by the enclave to which it is allocated. In SGXv1, the EPC is a limited resource with a static limit of 128 MB shared across all enclaves, whereas SGXv2 allocates EPC memory dynamically and allows it to be oversubscribed by multiple secure enclaves.

SGX can also verify the identity of an application running inside a secure enclave. An enclave can produce an attestation report that includes a measurement of the code and data sections of the application binary. The report is signed by a hardware root of trust and can be verified through the Intel Attestation Service (IAS).
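The measure, sign, and verify flow described above can be sketched as follows. This is a minimal illustration, not the SGX report format: the function names and `HARDWARE_KEY` are hypothetical, and an HMAC stands in for the asymmetric signature that a hardware root of trust would produce, so the verifier here plays the role of an attestation service that shares the key.

```python
import hashlib
import hmac

# Hypothetical stand-in for the hardware root of trust. Real enclave hardware
# signs with an asymmetric key; HMAC is used only to stay within the stdlib.
HARDWARE_KEY = b"illustrative-hardware-key"


def measure(code: bytes, data: bytes) -> bytes:
    """Hash the code and data sections into a single measurement."""
    h = hashlib.sha256()
    h.update(len(code).to_bytes(8, "big"))  # length prefix avoids ambiguity
    h.update(code)
    h.update(data)
    return h.digest()


def make_report(code: bytes, data: bytes) -> dict:
    """Produce a report: the measurement plus a signature over it."""
    measurement = measure(code, data)
    signature = hmac.new(HARDWARE_KEY, measurement, hashlib.sha256).digest()
    return {"measurement": measurement, "signature": signature}


def verify_report(report: dict, expected_measurement: bytes) -> bool:
    """Check the signature, then compare against the expected measurement."""
    expected_sig = hmac.new(
        HARDWARE_KEY, report["measurement"], hashlib.sha256
    ).digest()
    signature_ok = hmac.compare_digest(report["signature"], expected_sig)
    measurement_ok = hmac.compare_digest(
        report["measurement"], expected_measurement
    )
    return signature_ok and measurement_ok
```

A verifier that knows the expected measurement of the binary accepts the report only if both the signature and the measurement match; a report for a different binary, or one with a forged signature, is rejected.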

B. DataCapsules and the Global Data Plane

We briefly discuss the benefits of adopting DataCapsules [29] as the underlying storage objects for PSLs. DataCapsules are bundles of data containing data chunks (“records”), along with cryptographic relationships between these records (i.e., hashes) and proofs of membership and/or provenance for these records (i.e., signatures); see Figure 2. Each DataCapsule has an owner, identified by a public/private key pair. Anyone with the private owner key can add records to a DataCapsule but cannot modify existing records. Consequently, write operations are append-only and must be accompanied by a signature from the DataCapsule owner. Read operations return the data along with a proof of membership, consisting of a signature and a chain of hashes. Thus, it is possible to perform secure operations on remote DataCapsules embedded within the network.
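To make the write and read paths concrete, the following minimal sketch models a DataCapsule as a linear hash chain. It is an illustration under stated assumptions rather than the DataCapsule wire format: all names are hypothetical, the chain is linear rather than a general tree, and an HMAC keyed with `owner_key` stands in for the owner's public-key signature.

```python
import hashlib
import hmac

GENESIS = b"\x00" * 32  # hypothetical "previous hash" of the first record


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def link_hash(prev_hash: bytes, payload: bytes) -> bytes:
    """Hash of a record: binds the payload to the record before it."""
    return _h(prev_hash + _h(payload))


def sign(owner_key: bytes, digest: bytes) -> bytes:
    # Stand-in for the owner's signature (assumption: HMAC, not public-key).
    return hmac.new(owner_key, digest, hashlib.sha256).digest()


def append(log: list, owner_key: bytes, payload: bytes) -> None:
    """Append-only write: link to the previous record and sign the new head."""
    prev = log[-1]["hash"] if log else GENESIS
    digest = link_hash(prev, payload)
    log.append({"prev": prev, "payload": payload, "hash": digest,
                "sig": sign(owner_key, digest)})


def read_with_proof(log: list, index: int):
    """Return the payload plus a hash chain up to the signed head."""
    record = log[index]
    later = [_h(r["payload"]) for r in log[index + 1:]]
    proof = {"prev": record["prev"], "later": later,
             "head_sig": log[-1]["sig"]}
    return record["payload"], proof


def verify_read(owner_key: bytes, payload: bytes, proof: dict) -> bool:
    """Recompute the chain from the payload and check the head signature."""
    digest = link_hash(proof["prev"], payload)
    for payload_hash in proof["later"]:
        digest = _h(digest + payload_hash)
    return hmac.compare_digest(sign(owner_key, digest), proof["head_sig"])
```

Because the proof carries only hashes of later payloads, a reader can check membership of one record without fetching the rest of the log, and any change to the returned payload invalidates the signed head.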

The Merkle-tree structure of hashes and signatures within a DataCapsule provides two major benefits. First, it makes the DataCapsule a Conflict-Free Replicated Data Type (CRDT): any two partial DataCapsule replicas can be synchronized by simply taking the union of their records, and the resulting tree is uniquely defined by the backlinks (hashes). Second, it prevents malicious parties from forging records or corrupting existing records; the worst that a malicious party can do with a DataCapsule is to execute a freshness attack by denying the presence of recent records. Freshness attacks can be prevented or mitigated in a variety of ways, including replication, periodic timestamping of records, and caching of pointers to the most recent records. For our PSL infrastructure, we start by requesting the most recent records from a trusted service provider, then maintain the most recent “wavefront” of signed records within our active enclaves. As a result of their hardened nature, DataCapsules can migrate to the edge and benefit from its storage and networking resources.
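The union-based synchronization can be illustrated with a small sketch. The representation is an assumption made for illustration: a replica is a dictionary from a record's content hash to its backlinks and payload, signatures are omitted, and the names `make_record`, `merge`, and `heads` are hypothetical.

```python
import hashlib


def make_record(prev_ids, payload: bytes):
    """Build a record whose identifier is a hash of its backlinks and payload."""
    prev = tuple(sorted(prev_ids))
    h = hashlib.sha256()
    for p in prev:
        h.update(p)
    h.update(payload)
    return h.digest(), (prev, payload)


def merge(a: dict, b: dict) -> dict:
    """Synchronize two replicas by taking the union of their records.

    Identifiers are content hashes, so a shared identifier implies identical
    content and the union never has to resolve a conflict.
    """
    merged = dict(a)
    merged.update(b)
    return merged


def heads(replica: dict) -> set:
    """Records that no other record points to: the most recent wavefront."""
    referenced = {p for prev, _ in replica.values() for p in prev}
    return set(replica) - referenced
```

For example, if two replicas share a root record and each holds a different child of it, merging in either order yields the same three-record replica whose wavefront consists of the two children.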

Since DataCapsules can be viewed as secure logs, they can encapsulate a wide variety of storage “patterns,” such as key-value stores (in this paper), filesystems, data streams, and databases. All that is required to implement such patterns is a layer of software, called a common access API (CAAPI), that accepts standard user requests (e.g., POSIX filesystem requests) and translates them to operations on the underlying DataCapsule. Such CAAPIs run in secure enclaves, since they need access to cryptographic keys to produce signatures and to encrypt/decrypt information over the DataCapsule API.
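As an illustration of the CAAPI idea, the sketch below layers a key-value interface over an append-only log. It is a simplified sketch under stated assumptions: `AppendOnlyLog` and `KeyValueCAAPI` are hypothetical names, the log is an in-memory list standing in for a DataCapsule, and the signing and encryption that a real CAAPI would perform inside an enclave are omitted.

```python
import json


class AppendOnlyLog:
    """In-memory stand-in for a DataCapsule: records can only be appended."""

    def __init__(self):
        self._records = []

    def append(self, record: bytes) -> int:
        self._records.append(record)
        return len(self._records) - 1

    def read_all(self) -> list:
        return list(self._records)


class KeyValueCAAPI:
    """Translates key-value requests into appends and reads on the log."""

    def __init__(self, log: AppendOnlyLog):
        self._log = log

    def put(self, key: str, value) -> None:
        entry = {"op": "put", "key": key, "value": value}
        self._log.append(json.dumps(entry).encode())

    def delete(self, key: str) -> None:
        # Deletion is also an append; existing records are never modified.
        self._log.append(json.dumps({"op": "delete", "key": key}).encode())

    def get(self, key: str):
        # Replay the log in order; the latest operation on a key wins.
        state = {}
        for raw in self._log.read_all():
            entry = json.loads(raw)
            if entry["op"] == "put":
                state[entry["key"]] = entry["value"]
            else:
                state.pop(entry["key"], None)
        return state.get(key)
```

Replaying the whole log on every `get` keeps the sketch short; a practical implementation would presumably cache the replayed state and apply only new records.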

In this paper, we assume that DataCapsules reside on some network server that is able to satisfy DataCapsule read and write operations. However, the true power of DataCapsules is revealed in the context of a data-centric network such as the Global Data Plane (GDP) [29]. Each DataCapsule has a unique 256-bit identity derived from a hash over the public owner key and other metadata. The GDP can route messages to a DataCapsule using its identity rather than a location (i.e., an IP address). Thus, a data client can send reads and writes to a DataCapsule without knowing its location [4]. With the GDP, PSLs could therefore launch anywhere and access their data simply by possessing (1) the unique identity of the DataCapsule containing the data, (2) the cryptographic ownership and encryption keys for the DataCapsule, and (3) a connection into the GDP.
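The identity derivation and location-independent routing can be sketched as follows. The details are assumptions made for illustration: the text specifies only a 256-bit hash over the public owner key and other metadata, so the use of SHA-256, the JSON encoding of the metadata, the routing-table layout, and the names `capsule_identity` and `route` are all hypothetical.

```python
import hashlib
import json


def capsule_identity(owner_public_key: bytes, metadata: dict) -> bytes:
    """Derive a 256-bit identity from the owner key and metadata."""
    # Canonical encoding so that equal metadata always hashes identically.
    canonical = json.dumps(
        metadata, sort_keys=True, separators=(",", ":")
    ).encode()
    h = hashlib.sha256()
    h.update(len(owner_public_key).to_bytes(4, "big"))
    h.update(owner_public_key)
    h.update(canonical)
    return h.digest()


def route(routing_table: dict, identity: bytes):
    """Pick the closest known replica for an identity, or None if unknown.

    routing_table maps an identity to a list of (distance, location) pairs.
    """
    replicas = routing_table.get(identity, [])
    if not replicas:
        return None
    return min(replicas, key=lambda r: r[0])[1]
```

A client addresses the capsule only by `identity`; which replica serves the request is decided by the routing layer, not by the client.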

Authors:

(1) Kaiyuan Chen, University of California, Berkeley;

(2) Alexander Thomas, University of California, Berkeley;

(3) Hanming Lu, University of California, Berkeley;

(4) William Mullen, University of California, Berkeley;

(5) Jeff Ichnowski, University of California, Berkeley;

(6) Rahul Arya, University of California, Berkeley;

(7) Nivedha Krishnakumar, University of California, Berkeley;

(8) Ryan Teoh, University of California, Berkeley;

(9) Willis Wang, University of California, Berkeley;

(10) Anthony Joseph, University of California, Berkeley;

(11) John Kubiatowicz, University of California, Berkeley.


This paper is available on arXiv under the CC BY 4.0 DEED license.

[4] When multiple DataCapsules exist with the same identity, they are assumed to be equivalent; thus the GDP will try to route queries to the “closest” equivalent DataCapsule. Replication thus provides a mechanism for content distribution, providing a cryptographically hardened form of CDN with well-defined, in-network update semantics, unlike alternatives such as NDN [47].