The progress of the Animatediff community over the past 10 months has been miraculous - see attached!
Now, closed startups like Krea are taking the fruits of all this effort - so I'd like to tell the story of how we got here & what people who believe in open source can do. https://t.co/rZhaKGFm0X
First, @CeyuanY & team released Animatediff in July of last year.
Though it produced just 16 frames w/ no native ability to control the motion, the output looked good, it took fairly little RAM and, because it was built on top of SD1.5, it could be extended! https://t.co/Pm9BpZQcCI
I'm going to do a rough fly-through of progress in the area of image interpolation - it's one of many capabilities but I'm using it as an example because I did a bunch of work in it + documented progress well!
For those who don't know - interpolation = traveling between images: https://t.co/ReSvcUHRgA
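For the technically curious, a minimal sketch of the core idea: rather than crossfading pixels, you interpolate between the images' latent representations and decode each step into a frame. This assumes the latents come from some encoder (e.g. an SD1.5 VAE); the random tensors below are just stand-ins, and slerp is one common choice rather than the one true method:

```python
import numpy as np

def slerp(a: np.ndarray, b: np.ndarray, t: float) -> np.ndarray:
    """Spherical interpolation - tends to behave better than a straight
    linear blend in high-dimensional latent spaces."""
    a_n, b_n = a / np.linalg.norm(a), b / np.linalg.norm(b)
    omega = np.arccos(np.clip(np.dot(a_n.ravel(), b_n.ravel()), -1.0, 1.0))
    if omega < 1e-6:           # nearly parallel: fall back to a linear blend
        return (1 - t) * a + t * b
    return (np.sin((1 - t) * omega) * a + np.sin(t * omega) * b) / np.sin(omega)

# Stand-ins for two encoded key images (shape of an SD1.5 latent).
latents_a = np.random.randn(4, 64, 64)
latents_b = np.random.randn(4, 64, 64)

# Stepping t from 0 to 1 yields the in-between frames to decode.
frames = [slerp(latents_a, latents_b, t) for t in np.linspace(0, 1, 16)]
```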
Progress here is due to a lot of effort by a huge number of people - the original Animatediff team, many in the AD/Banodoco community, lots of people from the SD1.5 ecosystem, ComfyAnon, and many artists who inspired progress!
I name some below but it's an ecosystem of effort: https://t.co/hLVXD7FUYQ

To start this story, @TDS_95514874 was one of the first people publicly experimenting w/ Animatediff - he shared a bunch of early examples of image interpolation w/ ControlNet Tile hacked together in A1111:
https://t.co/JrAQK5h1DI
However, in A1111, it was difficult for people to build on this. Thankfully, in September, @Kosinkadink built infrastructure for Comfy that allowed people to combine it w/ other models - making it easy to make/share progress.
He also made the first interpolation Comfy workflow. https://t.co/sh3CnIM5S9
Kosinkadink also expanded the capabilities of the base model, creating sliding context windows - basically, allowing the videos to be longer than 16 frames.
This meant that I was able to extend his workflow to 3 images - though it still inherited all the issues of prior versions https://t.co/Zy8dVx5gm3
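Conceptually, sliding context windows are simple: chop the frame range into overlapping chunks that the 16-frame model can handle, and blend the overlaps so motion stays coherent across boundaries. A rough sketch of the windowing (my own naming, not the actual AnimateDiff-Evolved code):

```python
def context_windows(num_frames: int, window: int = 16, overlap: int = 4):
    """Yield overlapping frame-index windows so a 16-frame motion model
    can cover an arbitrarily long video."""
    stride = window - overlap
    start = 0
    while start < num_frames:
        end = min(start + window, num_frames)
        yield list(range(start, end))
        if end == num_frames:
            break
        start += stride

# e.g. 40 frames -> windows [0..15], [12..27], [24..39]; the model runs on
# each window and the overlapping frames are blended together.
for w in context_windows(40):
    print(w[0], "...", w[-1])
```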
In October, @cubiq and laksjdjf brought IP-Adapter to Comfy!
Implementing IP-Adapter in this workflow helped hugely w/ both the colours & adherence to the images!
Fannovel16 also implemented FiLM interpolation into Comfy which made the output smoother! https://t.co/FLKcbESATL
From there, in November, I started to build the Steerable Motion node, in order to provide simple controls for batches of images - the first version butchered Kosinkadink's workflow + looked pretty weird, but it worked!
The release of AD v.2 also provided smoother motion: https://t.co/BNgJ7OGTeY
Next, @cubiq introduced more features on top of IP-Adapter that made it easier to use in batches - this enabled more style-adherent transformations w/ significantly better motion. Kosinkadink also introduced motion scale for Animatediff, which allowed for balanced movement. https://t.co/gYTjo6hDd8
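To give a feel for what motion scale does, here's a hypothetical sketch (not Kosinkadink's actual implementation): treat the temporal layers' output as a residual on top of the per-frame features, and scale that residual to dial motion strength up or down:

```python
import torch

def apply_motion(x: torch.Tensor, motion_module, motion_scale: float = 1.0) -> torch.Tensor:
    # Assumed mechanism: the motion layers add a residual on top of the
    # per-frame features; scaling the residual changes motion strength.
    residual = motion_module(x) - x
    return x + motion_scale * residual  # 0 = static, 1 = default, >1 = wilder

# Toy stand-in for a real motion module, just to show the call shape.
toy_motion = torch.nn.Conv3d(4, 4, kernel_size=3, padding=1)
frames = torch.randn(1, 4, 16, 64, 64)          # (batch, channels, frames, h, w)
calmer = apply_motion(frames, toy_motion, 0.5)  # half-strength motion
```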
In January, the AD team released SparseCtrl, a ControlNet for Animatediff that forced image adherence.
While it could do interpolation on its own, this was very limited - but implementing it in combination w/ IP-Adapter allowed greater image adherence!
With a lot of tweaking, it looked pretty good: https://t.co/RtyDQqwtHh
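Roughly, SparseCtrl works by conditioning only a few keyframes and using a mask channel to tell the ControlNet which frames actually carry an image. A sketch of that conditioning layout (function and names are mine, not the paper's code):

```python
import torch

def sparse_condition(key_images, key_indices, num_frames):
    """Build SparseCtrl-style conditioning: only keyframes carry a real
    image; the rest are zeros, with a mask channel marking which frames
    are actually conditioned."""
    c, h, w = key_images[0].shape
    cond = torch.zeros(num_frames, c, h, w)
    mask = torch.zeros(num_frames, 1, h, w)
    for img, idx in zip(key_images, key_indices):
        cond[idx] = img
        mask[idx] = 1.0
    return torch.cat([cond, mask], dim=1)  # (frames, c+1, h, w)

# Condition only the first & last of 16 frames - i.e. interpolation.
imgs = [torch.rand(3, 64, 64), torch.rand(3, 64, 64)]
conditioning = sparse_condition(imgs, key_indices=[0, 15], num_frames=16)
```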
Next, in March, the legendary @cubiq came through once again, implementing a tiling capability for IP-Adapter - this allowed IP-Adapter to work at a higher resolution!
Balancing this w/ SparseCtrl allowed for far greater image adherence with far fewer artifacts! https://t.co/F9oAzk8wKR
ExponentialML also implemented MotionDirector by the ShowLab team - motion LoRAs which effectively allow you to guide the videos w/ reference videos - providing a whole other vector of control.
Kosinkadink implemented this into Comfy: https://t.co/F6r4vt9iBd
Next, @cubiq implemented a bunch more improvements which hugely reduced VRAM consumption.
@idgallagher implemented these into Steerable Motion - allowing for close-to-unlimited frames at <16GB of VRAM!
Example by @idgallagher & @ChangAnTing: https://t.co/LnRwPBacG4
And the evolution will continue - it will get better and better - for example, next week Steerable Motion 1.5 will be able to handle complex motion a lot better: https://t.co/BY4Vjk1qoc
And Steerable Motion is also just one of MANY ways to travel between images - for example, @PurzBeats' Context Smashing, which uses guidance videos to drive the motion with a stunning effect - example by siraev.vis: https://t.co/KBmmbNkCBm
All of this progress has been due to extreme effort by a ragtag group of talented people - who each decided to share their progress openly for others to build on top of and create beautiful things with!
It's honestly utopian
And interpolation is also just one vector of progress - there are so many more, driven by brilliant people in the community building on top of Animatediff. As just one example of progress elsewhere, check out this weird shit by @jboogx_creative:
https://t.co/kob3oOJQp2
When Animatediff came out, I shared posts explaining its potential & built a community of nerds who were equally excited: https://t.co/oHK8ycW2MO
Most people who build/create art with it hang out there - it's hard to say but I think bringing people together made a difference.
However, now that progress has reached this point, there's money to be made from it.
For example, here is Krea taking @PurzBeats' Context Smashing workflow from above - using the output of months of work/infrastructure by many people w/ no credit/compensation:
https://t.co/6VWlvIUuI6
This is inevitable but damaging for the ecosystem: not only does it demoralise contributors, but companies that commercialise accrue resources to invest - and if those companies are closed, they'll invest in closed models, despite most of what they do being built on open models & ingenuity.
So what can people who believe in open source do?
One thing open source has on its side is the chaotic energy of many brilliant people - people who believe in it and want to help support it.
With Banodoco, my goal is to build a company that people who believe in open source want to get behind - sharing 75% of profit & the returns of 100% of our equity outside investment (https://t.co/yAeViMxq1I) w/ the community whose work made it possible - & being 100% transparent.
I believe that we can make openness our strength - that because we've been contributing towards this effort from the beginning & the community shares in our success, people might support our commercialisation efforts - meaning our tools evolve faster & reach more people.
Concretely, one thing I hope people do is to create art with our tools - not just straight up generations but labours of love that show the power of what they can do.
You can turn your images into beautiful videos - what could you do with that? https://t.co/DINiiOVfMG
To access this capability, we have a simple but powerful Discord bot - we make money from this, which we share w/ the community: https://t.co/oc35yK40RH
However, we also have a ComfyUI node: https://t.co/k2eA8poZHj
And a powerful local art app: https://t.co/9kHGNV0hEx
I believe that it's possible to create amazing art with these - for example, look at what @hannahsubmarine did with Dough for Grimes' Coachella show:
https://t.co/8TZApywEu7
And check this insanity from @FlippingSigmas:
https://t.co/HkdSjvdKvi
And here's a significantly calmer video I made for a poem I love: https://t.co/6mwMyBxS15
I believe that if more and more people create art with these tools and share links back to @banodoco or tag #madewithDough, this can act like the best marketing - especially if the art people see is high-effort and beautiful. I'm biased but I think these are great tools!
Right now, these tools use the base Steerable Motion workflow, but I hope @PurzBeats lets me use his workflow & others will too - this will allow us to chaotically evolve faster than closed-source companies w/ creators' support + we will share profit from each workflow use.