In 1998, disaster struck at Pixar. A single mistyped command, a recursive rm -rf run in the wrong place, began erasing Toy Story 2 from existence. Character by character, scene by scene, the movie that had taken a year to build vanished in seconds. The team watched in disbelief as Woody’s hat, Buzz’s wings, and entire sets disappeared before their eyes. When engineers rushed to restore from backups, they discovered something worse: the backup system had quietly failed weeks earlier. As IT professionals, we have all been close to a moment like this. So what can we learn from it, and how do we still get Buzz to his ship on time?
Andy’s Party
This “core memory” took place in 1998, and Pixar co-founder Ed Catmull recounts it in his book “Creativity, Inc.” The story begins with an unfortunate, unnamed Pixar employee who was doing some routine file cleanup on internal servers when they accidentally ran a deletion command on Toy Story 2's root folder… Not exactly good news. This “update your resume” event made character models and assets disappear, and the file servers were quickly shut down.
Unfortunately, by that point, around 90% of the work done on Toy Story 2 was gone, and the backup system had not been working properly for around a month. Toy Story 2 would either have to start again from scratch or be scrapped altogether.
A Mother Saves the Day
The rescue came from a mother, much like when Buzz and Woody team up to get home. Galyn Susman, the film’s technical direction supervisor, who would later be affected by Disney’s layoffs in 2023, had a copy of the Toy Story 2 project at home. Galyn was on maternity leave and had decided to keep working from home, something that is seen as normal today but was taboo at the time. As a mother who always planned ahead, she made it a point to bring her work home once a week. This was a huge benefit because it kept her up to date and, as it turned out, maintained a reliable backup of Toy Story 2.
The Pixar team transported the laptop back to the office as carefully as a newborn baby, cradled and wrapped in blankets during the car ride. I imagine they even played lullaby music for the laptop… or maybe that is something I would do. With the backup from Susman’s laptop, the team was able to copy the files and recover nearly everything that had been lost.
It was a joyous occasion with many high fives, and it may even have put a smile on the face of the person responsible for the deletion. Susman’s copy did not contain the entire movie, but the team retrieved enough to complete and deliver Toy Story 2 on time. Cue the inspirational music and dance like nobody is watching. What a story, right?
What about the employee who deleted the files? I am glad you are paying attention. So far, there are no reports of them being fired or facing consequences. It is easy to imagine the tension at the time, and maybe their next project was fixing the backup process.
Lessons Learned
The experience serves as a valuable lesson, not just for the Pixar folks but for IT professionals worldwide. It is a strong argument for keeping multiple backups and adding extra safeguards to prevent such incidents from happening again.
In this story, the backup system had failed about a month earlier, and nobody noticed. That meant there were no usable backups to restore from, and business was at a standstill. Does that sound familiar? It should, because it still happens a lot today. What can businesses do to stay safe from this kind of disaster?
Backups
- The 3-2-1 rule - This data backup strategy recommends keeping three copies of your data, on two different types of storage media, with one copy stored offsite. It provides redundancy and protects data from a single point of failure, such as hardware failure, theft, or a local disaster.
- Offsite backups - An offsite, air-gapped backup stores a copy of your data in a separate physical or cloud location (offsite) and keeps it disconnected from your primary network (air-gapped). This combination protects your data from localized disasters and from cyber threats like ransomware, which cannot remotely reach or corrupt the air-gapped copy.
- RPO & RTO - Recovery Point Objective (how much recent data you can afford to lose) and Recovery Time Objective (how long you can afford to be down). These are not just important but vital to business continuity and survival in the event of a disaster. Many businesses say their backups are tested and pass audits, but when a real disaster forces a restore, it takes far longer than planned, and the business loses money because of it.
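To make the checklist above a little more concrete, here is a minimal Python sketch of an automated check for the 3-2-1 rule plus a simple RPO freshness test. Everything in it is an assumption for illustration: the BackupCopy record, the check_backup_policy function, and the sample copies are hypothetical and not tied to any real backup product or to Pixar's actual setup. The point is only that "nobody noticed the backups had failed" is something a small scheduled check can catch.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class BackupCopy:
    """One copy of the data (hypothetical record for illustration)."""
    name: str
    media: str               # e.g. "disk", "tape", "cloud"
    offsite: bool            # stored away from the primary site?
    last_verified: datetime  # last time a test restore of this copy succeeded
    is_primary: bool = False # the live production copy counts toward the "3"


def check_backup_policy(copies, rpo, now):
    """Return a list of problems; an empty list means 3-2-1 and the RPO hold."""
    problems = []

    # 3: at least three copies of the data, production copy included.
    if len(copies) < 3:
        problems.append(f"only {len(copies)} copies of the data; 3-2-1 asks for 3")

    # 2: at least two different kinds of storage media.
    if len({c.media for c in copies}) < 2:
        problems.append("all copies are on the same type of media")

    # 1: at least one copy kept offsite.
    if not any(c.offsite for c in copies):
        problems.append("no offsite copy")

    # RPO: the newest *verified* backup must be recent enough.
    backups = [c for c in copies if not c.is_primary]
    if not any(now - c.last_verified <= rpo for c in backups):
        problems.append(f"no backup verified within the RPO of {rpo}")

    return problems


# Sample data loosely shaped like the story: one live copy, one stale local backup.
now = datetime(2024, 1, 31)
copies = [
    BackupCopy("primary file server", "disk", offsite=False,
               last_verified=now, is_primary=True),
    BackupCopy("nightly tape", "tape", offsite=False,
               last_verified=now - timedelta(days=30)),
]

for problem in check_backup_policy(copies, rpo=timedelta(days=1), now=now):
    print("WARNING:", problem)
```

Note that the check looks at the last verified restore rather than the last time a backup job ran. A job that reports success but produces an unusable copy is exactly the failure described in this story, so the timestamp worth alerting on is the one from a test restore.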
Technical Controls and Permissions
- Restricted folder deletion privileges - The simplest prevention would have been to set permissions on the server so that not every employee could delete the top-level directory for the movie. Granting "full control" access to a large group of users is common in collaborative environments, but it is a major security risk. Only a small number of administrators should have permission to run "delete" commands on critical, high-level folders.
- Command-level restrictions - The employee used the Unix-style rm -r command, which deletes a directory and all its contents recursively. A more advanced setup could have prevented this command from running at the highest project directory level, either with a wrapper script or by requiring a second authentication step.
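As a rough illustration of the "special script" idea above, here is a minimal Python sketch of a guarded delete. It is a sketch under stated assumptions, not a description of anything Pixar built: the names guarded_rmtree and DeletionRefused are hypothetical, the list of protected directories would come from your own configuration, and the typed confirmation is only a stand-in for a real second authentication step.

```python
import shutil
from pathlib import Path


class DeletionRefused(Exception):
    """Raised when the guarded delete declines to remove a directory."""


def guarded_rmtree(target, protected, confirm=input):
    """Recursively delete `target` unless doing so would wipe a protected directory.

    `protected` is a list of critical directories (hypothetical configuration).
    `confirm` is any callable that takes a prompt and returns the user's answer.
    """
    path = Path(target).resolve()

    for item in protected:
        guarded = Path(item).resolve()
        # Refuse if the target IS a protected directory, or is a parent of one
        # (deleting the parent would take the protected directory with it).
        if path == guarded or path in guarded.parents:
            raise DeletionRefused(f"{path} is protected; ask an administrator")

    # Stand-in for a second authentication step: retype the directory name.
    answer = confirm(f"Type '{path.name}' to confirm recursive delete: ")
    if answer.strip() != path.name:
        raise DeletionRefused("confirmation did not match; nothing was deleted")

    shutil.rmtree(path)
```

A call such as guarded_rmtree("/projects/film/scratch", protected=["/projects/film"]) would prompt for confirmation and then delete only the scratch folder, while pointing it at /projects/film or at /projects would be refused outright (these paths are made up for the example). A wrapper like this only helps if people actually use it instead of the raw command, which is why it complements the file permissions described above rather than replacing them.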