Authors:

(1) Vivian Liu, Columbia University;

(2) Rubaiat Habib Kazi, Adobe Research;

(3) Li-Yi Wei, Adobe Research;

(4) Matthew Fisher, Adobe Research;

(5) Timothy Langlois, Adobe Research;

(6) Seth Walker, Adobe Research;

(7) Lydia Chilton, Columbia University.

Abstract and 1 Introduction

2 Related Work

2.1 Program Synthesis

2.2 Creativity Support Tools for Animation

2.3 Generative Tools for Design

3 Formative Steps

4 LogoMotion System and 4.1 Input

4.2 Preprocess Visual Information

4.3 Visually-Grounded Code Synthesis

5 Evaluations

5.1 Evaluation: Program Repair

5.2 Methodology

5.3 Findings

6 Evaluation with Novices

7 Discussion and 7.1 Breaking Away from Templates

7.2 Generating Code Around Visuals

7.3 Limitations

8 Conclusion and References

ABSTRACT

Animated logos are a compelling and ubiquitous way individuals and brands represent themselves online. Manually authoring these logos can require significant artistic skill and effort. To help novice designers animate logos, design tools currently offer templates and animation presets. However, these solutions can be limited in their expressive range. Large language models have the potential to help novice designers create animated logos by generating animation code that is tailored to their content. In this paper, we introduce LogoMotion, an LLM-based system that takes in a layered document and generates animated logos through visually-grounded program synthesis. We introduce techniques to create an HTML representation of a canvas, identify primary and secondary elements, synthesize animation code, and visually debug animation errors. When compared with an industry standard tool, we find that LogoMotion produces animations that are more content-aware and are on par in terms of quality. We conclude with a discussion of the implications of LLM-generated animation for motion design.

1 INTRODUCTION

Motion suggests life, and as such, motion is a dimension we add to our designs to make them more dynamic and engaging. Animation is a design form created to take static designs into more media-rich and interactive contexts. A specific type of animated content that we frequently create is the animated logo. Animation allows logos, which have been defined as the “visual figureheads” of brands [25], to better integrate within videos, livestreams, websites, and social media. A well-executed animation can quickly engage an audience, introduce the brand or individual online, and give content more visual interest.

Authoring an animated logo is challenging. Logos are often more than just a pairing of icon with text. Because they can have different layouts, layers, color, and typography, they can take on great variety and be complex artifacts to animate. For a novice designer, it can be difficult to understand which design elements should be animated, in what sequence, and how to build up compelling and believable motion. There are many facets of motion to consider such as speed, timing, positioning, duration, easing, and motion personality (e.g. a playful bounce vs. a strong entrance). Additionally, when logos have more design elements, designers also have to understand how groups of elements can synchronize to coordinate motion and orchestrate a visual flow.

While there is great demand for animated content, it is difficult for people outside of motion design to develop this kind of expertise. Design tools such as Adobe Express, Canva, and Figma often provide solutions in the form of animated templates and automatic animation techniques [10, 12, 13]. Templates pre-populate logo layouts with animations that users can customize. They illustrate how users can apply motion presets (e.g. slide, flicker, or fade) onto logo elements to create professional-looking animations. However, templates do not adapt to every use case. When users make edits (e.g. add/remove/replace elements) to customize logo templates, they can easily break the seamless and professional look the templates were originally packaged with. Automatic animation techniques are an alternative to templates: they globally apply rules and heuristics to animate canvases [12]. For example, all elements on a page can be directed to slide in from one side or sequentially fade into place. While templates and automatic techniques can get users to a starting point fast, neither solution recognizes the user’s content, which is something that emerging technologies can enable.

Large language models (LLMs) present the potential for content-aware animation. They can generate animation code that is specific to the design elements and their layout on the canvas. Code is a text representation that is often used to drive animation [18, 33, 53], because it can concisely specify how elements interact over time and space on a canvas. Because LLMs encode a vast amount of world knowledge, they can draw upon actions and activities related to the content being animated and generate a near-infinite number of animations. This open-ended generative capacity can go beyond the scope of what templates, presets, and rule-based techniques usually cover.

Recent advancements have made LLMs more multimodal, such that they can take in both text and images as inputs and provide visually-grounded responses. This makes LLMs more applicable in domains like animation, where a visual understanding of the canvas matters. It opens up the potential for users to provide images of their layout to an LLM and receive animations tailored to their layout and design elements. For example, if a novice designer wanted to animate a taxi, they could use an LLM to generate code to drive a taxi onto the canvas. This code could translate the taxi object along the x-axis before easing it into the center of the canvas to imply a stop-and-go motion befitting a taxi.
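To make the taxi example concrete, the following is a minimal sketch of the kind of motion such generated code might describe: a translation along the x-axis with a decelerating easing curve. It is illustrative only and is not output from LogoMotion; the function names, the cubic easing curve, the canvas coordinates, and the duration are all assumptions chosen for the example.

```python
def ease_out_cubic(p: float) -> float:
    """Decelerating easing curve: fast at the start, slow near the end."""
    p = min(max(p, 0.0), 1.0)  # clamp progress to [0, 1]
    return 1.0 - (1.0 - p) ** 3


def taxi_x(t: float, duration: float = 1.2,
           start_x: float = -200.0, center_x: float = 400.0) -> float:
    """x-position of the taxi at time t (seconds).

    start_x is a hypothetical off-canvas position to the left and
    center_x a hypothetical canvas center; both are assumed values.
    """
    progress = t / duration
    return start_x + (center_x - start_x) * ease_out_cubic(progress)


# Sample the motion every 0.3 s as (time, x) keyframes.
keyframes = [(round(i * 0.3, 1), taxi_x(i * 0.3)) for i in range(5)]
```

The taxi covers most of the distance early and slows as it approaches the center, which is one simple way to suggest a vehicle braking to a stop.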

In this paper, we present LogoMotion, an LLM-based method that automatically animates static layouts in a content-aware way. LogoMotion generates code in a two-stage approach involving visually-grounded program synthesis and program repair. The first stage introduces multimodal LLM operators that take in visual context and handle the 1) construction of a text representation of the canvas, 2) conceptual grouping of elements, and 3) implementation of animation code. The second stage of our approach introduces a technique for visually-grounded program repair, which helps LLMs check what they have generated against the original layout and debug differences in a targeted layer-wise fashion.
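The layer-wise check in the second stage can be pictured with a small sketch. This introduction only states that generated output is compared against the original layout layer by layer; the bounding-box comparison, the pixel tolerance, and every name below are assumptions made for illustration, and the LLM-driven repair step itself is left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned bounding box of one layer, in canvas pixels (assumed model)."""
    x: float
    y: float
    width: float
    height: float


def find_mismatched_layers(target: dict[str, Box],
                           rendered: dict[str, Box],
                           tolerance: float = 2.0) -> list[str]:
    """Return ids of layers whose final rendered box differs from the target layout."""
    mismatched = []
    for layer_id, want in target.items():
        got = rendered.get(layer_id)
        if got is None:
            mismatched.append(layer_id)  # layer missing from the last frame
            continue
        deltas = (got.x - want.x, got.y - want.y,
                  got.width - want.width, got.height - want.height)
        if any(abs(d) > tolerance for d in deltas):
            mismatched.append(layer_id)
    return mismatched


# Hypothetical layout: the title ends 40 px lower than in the original design.
target = {"icon": Box(100, 50, 200, 200), "title": Box(80, 270, 240, 40)}
rendered = {"icon": Box(100, 50, 200, 200), "title": Box(80, 310, 240, 40)}
layers_to_repair = find_mismatched_layers(target, rendered)
```

Only the flagged layers would then need to be revisited, which is the sense in which a repair pass can be targeted rather than regenerating the whole animation.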

Our contributions are as follows:

• LogoMotion, an LLM system that uses visually-grounded code generation to automatically generate logo animations from a PDF. The system identifies the visual content in each layer, infers the primary and secondary elements, and creates groups of elements. Based on this, the system suggests a design concept (in text) and uses the LLM to generate animation code. Users can optionally improve the animation by editing or adding their own design concept.

• Visually-grounded program repair, a mechanism that lets the LLM automatically detect and debug visual errors within its generated animation code, creating a feedback loop between LLM-generated code and its visual outputs.

• A technical evaluation of 276 animations showing that compared to Canva Magic Animate and an ablated version of the system (without stages for hierarchy analysis and design concept suggestions), the full pipeline of LogoMotion produces animations that are more content-aware.

• A qualitative evaluation with novice users showing that LogoMotion is able to quickly achieve their desired animation with minimal reprompting.

This paper is available on arXiv under the CC BY-NC-ND 4.0 DEED license.

[1] Video: https://youtu.be/Jo9opkMH7iY

[2] Project Page: https://vivian-liu.com/#/logomotion