Meta brings us a step closer to AI-generated movies
Like “Avengers” director Joe Russo, I’m becoming increasingly convinced that fully AI-generated movies and TV shows will be possible within our lifetimes.
A host of AI unveilings over the past few months, in particular OpenAI’s ultra-realistic-sounding text-to-speech engine, have given glimpses into this brave new frontier. But Meta’s announcement today put our AI-generated content future into especially sharp relief — for me at least.
Meta this morning debuted Emu Video, an evolution of the tech giant’s image generation tool, Emu. Given a caption (e.g. “A dog running across a grassy knoll”), an image or a photo paired with a description, Emu Video can generate a four-second animated clip.
Emu Video’s clips can be edited with a complementary AI model called Emu Edit, which was also announced today. Users can describe to Emu Edit, in natural language, the modifications they want to make (e.g. “the same clip, but in slow motion”) and see the changes reflected in a newly generated video.
Now, video generation tech isn’t new. Meta’s experimented with it before, as has Google. Meanwhile, startups like Runway are already building businesses on it.
But Emu Video’s 512×512, 16-frames-per-second clips are easily among the best I’ve seen in terms of their fidelity — to the point where my untrained eye has a tough time distinguishing them from the real thing.
Well — at least some of them. It seems Emu Video is most successful animating simple, mostly static scenes (e.g. waterfalls and timelapses of city skylines) that stray from photorealism — that is to say in styles like cubism, anime, “paper cut craft” and steampunk. One clip of the Eiffel Tower at dawn “as a painting,” with the tower reflected in the River Seine beneath it, reminded me of an e-card you might see on American Greetings.