Research finds companies are training AI models on YouTube content without permission


Artificial intelligence models need as much useful data as possible to function, and some of the largest AI developers rely in part on YouTube videos transcribed without the creators' permission, in violation of YouTube's own rules, according to an investigation by Test News and Cabling.

The two outlets revealed that Apple, Nvidia, Anthropic and other major AI companies have trained their models on a dataset called YouTube Captions that incorporates transcripts of nearly 175,000 videos across 48,000 channels — all without the videos’ creators knowing.
