Even if the content were 100% AI-generated (which is the furthest thing from reality today), human engagement with it is a powerful signal that AI can learn from. It would be like RLHF with free human annotation at scale.
Back in the day when everyone used to watch broadcast TV, and stations synchronised their ad breaks, water consumption would spike with every ad break.
The UK has a unique problem with demand spikes for electricity during commercial breaks, due to the British penchant for using high-power electric kettles to make tea. In the worst case, demand could rise and fall by gigawatts within a matter of minutes.
Google already invests a tremendous amount of resources into identifying and preventing fraudulent ad impressions -- I don't see that changing much until AI is so cheap that it makes sense to run a full agent for pennies per hour. Sadly.
Not talking about fraud per se - in the sense of trying to drive revenue for a particular video channel - just that if you wanted to train AI on YouTube videos, you are in effect getting the advertisers to pay for the serving of them.
Perhaps the difference here is the behaviour would be much more human and thus harder to detect using current fraud detection?
I'm pretty sure YouTube saves the metadata from all the video files uploaded to it. It seems pretty trivial to exclude videos uploaded without camera model or device setting information. I seriously doubt even a tiny fraction of people uploading AI content to YouTube are taking the time to futz about with the XMP data before they upload it. Sure, they'll miss out on a lot of edited videos doing that, but that's probably for the best if you're trying to create a data set that maintains fidelity to the real world. There are lots of ways to create false images without AI.
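A minimal sketch of the heuristic described above - not YouTube's actual pipeline, and the key names are purely hypothetical. It assumes each upload's metadata has already been extracted (e.g. by an XMP/EXIF parser) into a plain dict, and keeps only uploads that carry some camera or device field:

```python
# Hypothetical key names an extractor might produce; real XMP/EXIF
# field names differ (e.g. Exif.Image.Model, xmp:CreatorTool).
CAMERA_KEYS = {"camera_model", "device_make", "exposure_time"}

def has_camera_metadata(meta: dict) -> bool:
    """True if the upload's metadata carries any camera/device field."""
    return any(meta.get(k) for k in CAMERA_KEYS)

def filter_uploads(uploads: list[dict]) -> list[dict]:
    """Keep only uploads whose metadata suggests a physical camera."""
    return [u for u in uploads if has_camera_metadata(u.get("metadata", {}))]

uploads = [
    {"id": "a", "metadata": {"camera_model": "Canon EOS R5"}},
    {"id": "b", "metadata": {}},  # stripped or AI-generated: no device info
]
print([u["id"] for u in filter_uploads(uploads)])  # -> ['a']
```

As the comment notes, this is a coarse filter: it also drops heavily edited real footage that had its metadata stripped, which may be acceptable when the goal is a training set faithful to the real world.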
"Since launching in 2023, SynthID has watermarked over 10 billion images, videos, audio files and texts, helping identify them as AI-generated and reduce the chances of misinformation and misattribution. Outputs generated by Veo 3, Imagen 4 and Lyria 2 will continue to have SynthID watermarks.
Today, we’re launching SynthID Detector, a verification portal to help people identify AI-generated content. Upload a piece of content and the SynthID Detector will identify if either the entire file or just a part of it has SynthID in it.
With all our generative AI models, we aim to unleash human creativity and enable artists and creators to bring their ideas to life faster and more easily than ever before."
I somewhat doubt that YT cares much about AI content being uploaded, as long as it’s clearly marked as such.
What they do care about is their training set getting tainted, so I imagine they will push quite hard to have some mechanism to detect AI; it’s useful to them even if users don’t act on it.
I agree, especially because in practice the vast majority of AI-generated videos uploaded to YouTube are going to be from one of about 3 or 4 generators (Sora, Veo, etc.). May change in the future, but at the moment the detection problem is pretty well constrained.
> Excluding videos from training datasets doesn't mean excluding them from Youtube.
Ah then sure. It was this part that was problematic.
If users are still allowed to upload flagged content, then false positives almost don't matter, so YouTube could just roll out some imperfect solution and it would be fine.
In the future, a new intelligent species will roam the Earth, and they will ask, "why did their civilization fall?" The answer? These Homo sapiens strip-mined the Earth and exacerbated climate change to generate enough power to make amusing cat videos...
Of course it's my opinion, it's my comment after all.
Nonetheless, survival can't be the life goal: after all, the Moon will drift away from Earth in the future, the Sun will explode, and even if we survive that as a species, all bonds between elements will dissolve.
It also can't be about passing your DNA on, because your DNA has very little to no impact after just a handful of generations.
And no, the goal of our society has to be to have as much energy available to us as possible. So much energy that energy doesn't matter. There are enough ways of generating energy without any real issue: fusion, and renewable energy directly from the sun.
There is also no inherent issue right now preventing us all from having clean, stable energy besides capitalism. We have the technology, we have the resources, we have the manufacturing capacity.
To finish my comment: it's not about energy, it's about entropy. You need energy to create entropy. We don't even consume the energy of the sun; we use it for entropy and dissipate it back to space afterwards.
There is an ever-growing percentage of new AI-generated videos among each day's uploads.
How long until more than half of uploads in a day are AI-generated?