Abstract and 1. Introduction

  2. Background & Related Work

  3. Method

    3.1 Sampling Small Mutations

    3.2 Policy

    3.3 Value Network & Search

    3.4 Architecture

  4. Experiments

    4.1 Environments

    4.2 Baselines

    4.3 Ablations

  5. Conclusion, Acknowledgments and Disclosure of Funding, and References

Appendix

A. Mutation Algorithm

B. Context-Free Grammars

C. Sketch Simulation

D. Complexity Filtering

E. Tree Path Algorithm

F. Implementation Details

3.4 Architecture

We add two additional types of tokens: an <EDIT> token, which serves as a start-of-sentence token for the model, and <POS x> tokens, which allow the model to reference positions within its context. Given a current image, a target image, and the current tokenized program, we train this transformer model to predict the edit position and the replacement text autoregressively. During decoding, predictions are constrained by the grammar: we mask the prediction logits so that the model can only select edit positions that correspond to nodes in the syntax tree and can only produce replacements that are syntactically valid for the selected edit position.
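A minimal sketch of this logit masking is shown below. The toy vocabulary layout, the choice of which <POS x> tokens correspond to syntax-tree nodes, and the allowed replacement ids are all illustrative assumptions, not the paper's actual tokenizer or grammar tables.

```python
import torch

def constrained_step(logits: torch.Tensor, allowed_ids: torch.Tensor) -> int:
    """Greedily pick the next token, but only among the allowed ids.
    All other logits are pushed to -inf so they can never be selected."""
    mask = torch.full_like(logits, float("-inf"))
    mask[allowed_ids] = 0.0
    return int((logits + mask).argmax())

# Toy vocabulary: ids 0-9 are ordinary program tokens, ids 10-14 are <POS 0>..<POS 4>.
vocab_size = 15
logits = torch.randn(vocab_size)

# Step 1: right after <EDIT>, only <POS x> tokens that point at actual
# syntax-tree nodes may be produced (here we pretend nodes sit at <POS 1> and <POS 3>).
edit_position = constrained_step(logits, torch.tensor([11, 13]))

# Step 2 onwards: each replacement token is restricted to whatever the CFG
# allows to follow the partial replacement at the selected edit position
# (the allowed set {2, 5, 7} is again purely illustrative).
replacement_token = constrained_step(torch.randn(vocab_size), torch.tensor([2, 5, 7]))
print(edit_position, replacement_token)
```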

We set σ_small = 2, which means the network is only allowed to produce edits with fewer than two primitives. For training data, we sample an infinite stream of random expressions from the CFG. We choose a random number of noise steps, s ∈ [1, 5], to produce a mutated expression. For some percentage of the examples, ρ, we instead sample a completely random new expression as our mutated expression. For the environments we tested, we trained for 3 days on a single NVIDIA A6000 GPU.
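The training-data sampling loop described above might look roughly like the sketch below. Here `sample_from_cfg()` and `mutate_small()` are hypothetical stand-ins for the paper's grammar sampler and σ_small-bounded mutation routine (Appendices A and B), and the value used for ρ is purely illustrative.

```python
import random

def make_training_example(sample_from_cfg, mutate_small, rho: float = 0.2):
    """Produce one (mutated, target) training pair.

    With probability rho the mutated expression is a completely fresh sample;
    otherwise it is the target corrupted by s ~ Uniform{1..5} small mutations.
    """
    target = sample_from_cfg()

    if random.random() < rho:
        # Pair the target with an unrelated, freshly sampled expression.
        mutated = sample_from_cfg()
    else:
        # Apply a random number of small noise steps, s in [1, 5].
        mutated = target
        for _ in range(random.randint(1, 5)):
            mutated = mutate_small(mutated)

    return mutated, target
```

The model is then trained to predict an edit that moves the mutated expression back toward the target, conditioned on the two rendered images and the mutated program's tokens.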

Authors:

(1) Shreyas Kapur, University of California, Berkeley ([email protected]);

(2) Erik Jenner, University of California, Berkeley ([email protected]);

(3) Stuart Russell, University of California, Berkeley ([email protected]).


This paper is available on arXiv under the CC BY-SA 4.0 DEED license.