May 24, 2023

Mixamo animations + Stable Diffusion = Rapid Animation Prototyping

A rapid prototyping workflow for animations using Mixamo and Stable Diffusion v2.

Alright, so we're back with another Stable Diffusion v2 tutorial.

If you're new here, we're Generative Nation, a website and community fully dedicated to generative AI. We release new workflows and tutorials just like this every single week.

A few days ago we released a tutorial on pose control for generative art.

It only covered static images, so we thought we'd explore how to use the same techniques for animations.

The main goal was to create a rapid prototyping framework for animations. Prototyping 3D animations and movie scenes is still very expensive and requires a lot of 3D modelling knowledge. In this tutorial we share a workflow you can use to prototype new animation ideas with text-to-image AI tools.

Now before we start, we want to emphasize that generative video/animation is still challenging and we don't claim to fully solve it here. But the workflow described in this tutorial might spark some ideas in other people's heads.

The main idea

Okay so let's get started.

Our idea was to turn a generic animation into a stylized format using depth2img. This way we can animate the characters of our scene with depth2img, while generating the background and other assets with static text-to-image.

First we needed a few animations. Fortunately, Adobe owns a website called Mixamo that hosts over 3,000 motion-capture animations. We can use these as a base and stylize them with Stable Diffusion v2's depth2img. Finally, we can generate the remaining assets and assemble the scene.

Exporting Mixamo animations

First of all, we created a free account on Mixamo.

Let's say we want to prototype a scene where Spiderman is fighting with a stormtrooper in a pub.

First we need some kind of kicking animation for one character and a defensive animation for the other.

We found a kicking animation (this will be Spiderman's) and a jumping animation (this will be the stormtrooper's).

We recorded these using screen capture software and trimmed the videos such that we had continuous loops.

Once we had these videos we converted them to image sequences using ezgif.

We changed the length of the animations so that each one had exactly 54 frames. If you don't do this, you're going to end up with animations that aren't in sync.
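
If you'd rather script this step than use ezgif, here's a minimal Python sketch with OpenCV (not what we used, and the file names are hypothetical) that extracts the frames and resamples both clips to the same fixed length:

```python
import os

import cv2
import numpy as np

def video_to_frames(path, out_dir, target_count=54):
    """Extract every frame from a clip, resample to a fixed frame count
    by picking evenly spaced indices, and save the result as PNGs."""
    os.makedirs(out_dir, exist_ok=True)

    cap = cv2.VideoCapture(path)
    frames = []
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        frames.append(frame)
    cap.release()

    # Evenly spaced indices map the original clip onto exactly
    # target_count frames, so both animations end up the same length.
    indices = np.linspace(0, len(frames) - 1, target_count).round().astype(int)
    for i, j in enumerate(indices):
        cv2.imwrite(os.path.join(out_dir, f"frame_{i:03d}.png"), frames[j])

# Hypothetical file names -- use whatever your screen recordings are called.
video_to_frames("kick.mp4", "frames_spiderman")
video_to_frames("jump.mp4", "frames_stormtrooper")
```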

So far so good.

Stylizing with depth2img

In our previous tutorial we shared a Colab notebook for depth2img.

We opened it up, ran the cells, and stylized every single frame one by one with the same prompt. For the Spiderman animation we used "spiderman, realistic, 4k", and for the stormtrooper one we used "stormtrooper, realistic, 4k". Simple.

This is a super boring process; we're sure it could be automated, but we were lazy and did it by hand.
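
For the curious, automating it might look roughly like this: a minimal sketch using Hugging Face's diffusers library and the Stable Diffusion 2 depth model. This is our assumption of how to do it in code, not necessarily what the Colab notebook does:

```python
import glob
import os

import torch
from diffusers import StableDiffusionDepth2ImgPipeline
from PIL import Image

pipe = StableDiffusionDepth2ImgPipeline.from_pretrained(
    "stabilityai/stable-diffusion-2-depth",
    torch_dtype=torch.float16,
).to("cuda")

prompt = "spiderman, realistic, 4k"
os.makedirs("styled_spiderman", exist_ok=True)

for i, path in enumerate(sorted(glob.glob("frames_spiderman/frame_*.png"))):
    frame = Image.open(path).convert("RGB")
    # A fixed seed per frame keeps the style (somewhat) consistent
    # across the animation and reduces flickering.
    generator = torch.Generator("cuda").manual_seed(42)
    styled = pipe(
        prompt=prompt,
        image=frame,
        strength=0.7,  # how far the output may stray from the source frame
        generator=generator,
    ).images[0]
    styled.save(f"styled_spiderman/frame_{i:03d}.png")
```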

Once we had these frames saved on our computer we used QuickTime to generate a video from the image sequences. If you don't know how to do this, check out this tutorial.
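
If you'd rather skip QuickTime, this step can also be scripted. A tiny sketch using the imageio library (an alternative we didn't use ourselves; file names are hypothetical):

```python
import glob

import imageio.v2 as imageio  # pip install imageio imageio-ffmpeg

# Read the stylized frames in order and write them out at ~12 fps.
frames = [imageio.imread(p) for p in sorted(glob.glob("styled_spiderman/frame_*.png"))]
imageio.mimsave("spiderman_kick.mp4", frames, fps=12)
```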

Here are the output videos:

There's quite a lot of flickering but it's not too bad.

Finally, we removed the gray background from both videos using Unscreen.

Using Unscreen's free version we can only save our video in GIF format but this is perfect for us (more on this later).
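
If you want to keep everything scriptable instead, the open-source rembg library can do a similar background removal frame by frame. This is just a sketch of an alternative we didn't test, not what Unscreen does:

```python
import glob
import os

from PIL import Image
from rembg import remove  # pip install rembg

os.makedirs("cutouts_spiderman", exist_ok=True)

for path in sorted(glob.glob("styled_spiderman/frame_*.png")):
    frame = Image.open(path)
    # remove() returns an RGBA image with the background made transparent
    cutout = remove(frame)
    cutout.save(os.path.join("cutouts_spiderman", os.path.basename(path)))
```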

Creating the background and the assets

Now this part is very simple.

We wanted to add a pub background so we generated one with Midjourney.

Background generated using Midjourney, prompt: cartoon pub inside --v 4

Spiderman's kick is quite high so we decided to put the stormtrooper on a table to make the animation look more realistic.

Table generated using Midjourney, prompt: cartoon pub table with beers on it, viewing from the side, 2d camera view --v 4

Assembling the scene

Finally we assembled the scene using Online Video Cutter. The cool thing is that it handles GIFs quite well so it was relatively easy.
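
If you'd rather composite the scene in code, a rough Pillow sketch could paste each character's GIF frames onto the background. File names and positions below are made up for illustration:

```python
from PIL import Image, ImageSequence

background = Image.open("pub.png").convert("RGBA")
spidey = Image.open("spiderman.gif")
trooper = Image.open("stormtrooper.gif")

frames = []
# Both GIFs have 54 frames, so we can iterate them in lockstep.
for s, t in zip(ImageSequence.Iterator(spidey), ImageSequence.Iterator(trooper)):
    canvas = background.copy()
    s, t = s.convert("RGBA"), t.convert("RGBA")
    # Paste each character using its own alpha channel as the mask.
    canvas.paste(s, (120, 260), s)
    canvas.paste(t, (460, 140), t)
    frames.append(canvas)

frames[0].save(
    "scene.gif",
    save_all=True,
    append_images=frames[1:],
    duration=80,  # ms per frame, roughly 12 fps
    loop=0,
)
```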

The scene looks pretty damn good given the fact that all these frames were generated by AI. Of course, there's flickering and the scene is not very cohesive but we're quite happy with the end result. Not to mention that the entire workflow could be automated. Perhaps Adobe should create a generative product from Mixamo?

If this workflow delivered value to you, subscribe to our newsletter to receive other tutorials like this.
