Listen to the 'Mona Lisa' recite a famous Shakespeare monologue: Chinese engineers make a still image sing and speak using an artificial intelligence application called Emote Portrait Alive

Chinese engineers at the Alibaba Group's Institute for Intelligent Computing have developed an artificial intelligence application called Emote Portrait Alive (EMO) that can animate a still photo of a face and synchronize it with an audio track.

The technology is based on the generative capabilities of diffusion models (machine learning models that learn to produce images or video by gradually removing noise from a random starting signal). These models can directly synthesize talking-head videos from a single supplied image and any audio clip. The process avoids the need for complex preprocessing or intermediate renderings, which simplifies the creation of talking-head videos.
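For readers curious about what "diffusion" means in practice, the sketch below illustrates only the generic noising step that diffusion models are trained to reverse. It is not the Alibaba team's code: the function names, the linear noise schedule, and the toy eight-value "image" are all assumptions chosen for illustration.

```python
import math
import random


def make_alpha_bars(num_steps, beta_start=1e-4, beta_end=0.02):
    """Return how much of the clean signal survives at each step.

    Uses a linear noise schedule (an assumption for this sketch).
    """
    alpha_bars = []
    running = 1.0
    for i in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * i / max(num_steps - 1, 1)
        running *= 1.0 - beta  # cumulative product of (1 - beta)
        alpha_bars.append(running)
    return alpha_bars


def add_noise(pixels, alpha_bar, rng):
    """Blend clean pixel values with Gaussian noise for one noise level."""
    keep = math.sqrt(alpha_bar)               # weight on the clean signal
    noise_scale = math.sqrt(1.0 - alpha_bar)  # weight on the random noise
    return [keep * p + noise_scale * rng.gauss(0.0, 1.0) for p in pixels]


rng = random.Random(0)            # fixed seed so runs are repeatable
alpha_bars = make_alpha_bars(1000)
clean = [0.5] * 8                 # toy stand-in for image pixels

slightly_noisy = add_noise(clean, alpha_bars[0], rng)   # early step
mostly_noise = add_noise(clean, alpha_bars[-1], rng)    # final step
```

A generative model is trained to undo this corruption one step at a time. In a system like the one described here, that denoising would additionally be guided by the reference photo and the audio track.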
