
We made a cat drink a beer with Runway’s AI video generator, and it sprouted arms


A screen capture of an AI-generated video of a cat drinking a can of beer, created by Runway Gen-3 Alpha.

In June, Runway debuted a new text-to-video synthesis model called Gen-3 Alpha. It converts written descriptions called “prompts” into HD video clips without sound. We have since had a chance to use it and wanted to share our results. Our tests show that careful prompting is not as important as matching concepts likely found in the training data, and that achieving amusing results likely requires many generations and selective cherry-picking.

An enduring theme of all generative AI models we have seen since 2022 is that they can be excellent at mixing concepts found in training data but are typically very poor at generalizing (applying learned “knowledge” to new situations the model has not explicitly been trained on). That means they can excel at stylistic and thematic novelty but struggle at fundamental structural novelty that goes beyond the training data.

What does all that mean? In the case of Runway Gen-3, lack of generalization means you might ask for a sailing ship in a swirling cup of coffee, and provided that Gen-3’s training data includes video examples of sailing ships and swirling coffee, that is an “easy” novel combination for the model to make fairly convincingly. But if you ask for a cat drinking a can of beer (in a beer commercial), it will generally fail because there likely aren’t many videos of photorealistic cats drinking human beverages in the training data. Instead, the model will pull from what it has learned about videos of cats and videos of beer commercials and combine them. The result is a cat with human arms pounding back a brewsky.

A few basic prompts

During the Gen-3 Alpha testing phase, we signed up for Runway’s Standard plan, which provides 625 credits for $15 a month, plus some bonus free trial credits. Each generation costs 10 credits per second of video, and we created 10-second videos for 100 credits apiece. So the number of generations we could make was limited.
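
As a rough back-of-the-envelope sketch of that credit math, here is the calculation in Python. The constant and function names are ours and purely illustrative, and the bonus trial credits are left out because their amount is not stated.

```python
# Figures quoted above for Runway's Standard plan during Gen-3 Alpha testing.
MONTHLY_CREDITS = 625      # credits included per month
CREDITS_PER_SECOND = 10    # cost of one second of generated video


def clip_cost(seconds: int) -> int:
    """Credits needed to generate a single clip of the given length."""
    return seconds * CREDITS_PER_SECOND


def clips_per_month(seconds: int, credits: int = MONTHLY_CREDITS) -> tuple[int, int]:
    """Return (number of full clips, leftover credits) for a credit budget."""
    return divmod(credits, clip_cost(seconds))


full_clips, leftover = clips_per_month(10)
print(full_clips, leftover)  # 6 25
```

At those rates, the monthly allotment covers six full 10-second clips with 25 credits to spare, which is why rerunning prompts was not an option.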

We first tried a few standards from our image synthesis tests in the past, like cats drinking beer, barbarians with CRT TV sets, and queens of the universe. We also dipped into Ars Technica lore with the “moonshark,” our mascot. You will see all those results and more below.

We had so few credits that we could not afford to rerun them and cherry-pick, so what you see for each prompt is exactly the one generation we received from Runway.

“A highly-intelligent person reading “Ars Technica” on their computer when the screen explodes”

“commercial for a new flaming cheeseburger from McDonald’s”

“The moonshark leaping out of a computer screen and attacking a person”

“A cat in a car drinking a can of beer, beer commercial”

“Will Smith eating spaghetti” triggered a filter, so we tried “a black man eating spaghetti.” (Watch until the end.)

“Robotic humanoid animals with vaudeville costumes roam the streets collecting protection money in tokens”

“A basketball player in a haunted passenger train car with a basketball court, and he is playing against a team of ghosts”

“A herd of one million cats running on a hillside, aerial view”

“video game footage of a dynamic 1990s third-person 3D platform game starring an anthropomorphic shark boy”


