Google is giving text-to-video generation another shot with Lumiere, a new artificial intelligence model capable of creating surprisingly high-quality content.
The tech giant has certainly come a long way since the days of Imagen Video. The subjects of Lumiere's videos are no longer nightmarish creatures with melted faces. Now things look much more realistic. The sea turtles look like sea turtles, the animals' fur has the right texture, and the people in the AI clips have genuine smiles (for the most part). What's more, there's very little of the strange jerky motion seen in other text-to-video generative AI. The movement is largely buttery smooth. Inbar Mosseri, research team lead at Google Research, posted a video on YouTube demonstrating Lumiere's capabilities.
Google worked hard to make Lumiere content look as realistic as possible. The development team achieved this by implementing an architecture called Space-Time U-Net (STUNet). The technology behind STUNet is quite complex, but as Ars Technica explains, it allows Lumiere to understand where objects are in a video and how they move and change, and to render these actions at the same time, resulting in a fluid clip.
This differs from other generative platforms, which first establish keyframes in a clip and then fill in the gaps between them. That approach produces the jerky motion the technology is known for.
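To see why filling in between keyframes can look jerky, here is a toy Python sketch. It is not Lumiere's code or STUNet, just an illustration under simple assumptions: a single object follows a smooth path (the hypothetical `true_position`), one function places keyframes and fills the gaps with straight-line interpolation, and the other simply samples every frame from the same motion as a stand-in for producing the whole clip in one pass. All function names and parameters are invented for the example.

```python
import math


def true_position(t: float) -> float:
    """Hypothetical smooth motion: an object swinging back and forth."""
    return math.sin(t)


def keyframes_then_fill(num_frames: int, step: int, dt: float) -> list[float]:
    """Place a keyframe every `step` frames, then linearly fill the gaps."""
    frames = []
    for i in range(num_frames):
        left = (i // step) * step                 # keyframe at or before frame i
        right = min(left + step, num_frames - 1)  # next keyframe (or last frame)
        if right == left:
            frames.append(true_position(left * dt))
            continue
        w = (i - left) / (right - left)           # 0 at left keyframe, 1 at right
        frames.append((1 - w) * true_position(left * dt)
                      + w * true_position(right * dt))
    return frames


def all_frames_at_once(num_frames: int, dt: float) -> list[float]:
    """Idealized stand-in for generating every frame of the clip together."""
    return [true_position(i * dt) for i in range(num_frames)]


def max_velocity_jump(frames: list[float]) -> float:
    """Largest change in frame-to-frame velocity: a rough proxy for jerkiness."""
    velocities = [b - a for a, b in zip(frames, frames[1:])]
    return max(abs(v2 - v1) for v1, v2 in zip(velocities, velocities[1:]))


if __name__ == "__main__":
    keyed = keyframes_then_fill(num_frames=33, step=8, dt=0.2)
    full = all_frames_at_once(num_frames=33, dt=0.2)
    print("keyframes + fill:", round(max_velocity_jump(keyed), 3))
    print("all frames at once:", round(max_velocity_jump(full), 3))
```

In this toy setup the interpolated clip matches the smooth one exactly at the keyframes, but its velocity changes abruptly each time it passes one, which is the kind of visible hitch the article describes. A real video model works on pixels rather than a single number, so treat this only as intuition.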
Well-appointed
In addition to text-to-video generation, Lumiere has numerous features in its toolset, including support for multimodality.
Users will be able to upload original images or videos to the AI so it can edit them to their specifications. For example, you can upload an image of Girl with a Pearl Earring by Johannes Vermeer and turn it into a short clip where she smiles instead of staring. Lumiere also has an ability called Cinemagraph that can animate highlighted parts of images.
Google demonstrates this by selecting a butterfly perched on a flower. Thanks to AI, the output video shows the butterfly flapping its wings while the flowers around it remain stationary.
Things get particularly impressive when it comes to video. Video Inpainting, another feature, works similarly to Cinemagraph in that the AI can edit parts of clips. A woman's green patterned dress can be turned into shiny gold or black. Lumiere goes a step further by offering video stylization to modify video themes. A normal car driving on the road can be converted into a vehicle made entirely of wood or Lego bricks.
Still in progress
It is unknown whether there are plans to release Lumiere to the public or whether Google intends to implement it as a new service.
Perhaps we could see the AI appear in a future Pixel phone as the evolution of Magic Editor. If you're not familiar with it, Magic Editor uses AI processing to intelligently change spaces or objects in photos on the Pixel 8. Video Inpainting, to us, seems like a natural progression for the technology.
For now, it looks like the team will keep it behind closed doors. As impressive as this AI may be, it still has its problems. Some animations are choppy, and in other cases subjects have deformed limbs. If you want to know more, Google's research paper on Lumiere can be found on Cornell University's arXiv website. Be warned: it is a dense read.