Read an amazing stat over the weekend about the amount of stuff on the internet that is already AI-generated. I was expecting it to be somewhere around 10%. I thought that sounded reasonable given that generative language bots have only really been in wide public use for a year, and the web has been filling up with organic, hand-tooled content for decades.
Nope.
According to a study by researchers at Amazon Web Services, over half the web could already be written by droids.
“…specifically, 57.1 percent of all of the sentences on the internet have been translated into two or more other languages. The poor quality and staggering scale of these translations suggest that large language model (LLM)-powered AI models were used to both create and translate the material. The phenomenon is especially prominent in "lower-resource languages," or languages with less readily available data with which to more effectively train AI models.
In other words, in what the researchers believe to be a ploy to garner clickbait-driven ad revenue, AI is being used to first generate poor-quality English-language content at a remarkable scale, and then AI-powered machine translation (MT) tools transcribe said content into several other languages. The translated material gets worse each time, and as a result, entire regions of the web are filling to the brim with degrading AI-scrambled copies of copies.”
On that sort of growth trajectory, 99% could be AI filler within a year or so. It’s making me think having a small, hand-tooled blog might become cool again.
On the upside, since the machines can't tell the generated from the real, and they train on everything, they will be rapidly heading towards some sort of gibberish strange attractor, and will make less and less sense over time.
Should make hand-tooled blogs even more popular. At least among people who read.
This is so on point in our household. We were discussing how proper journalism will become a niche part of our lives again. You'll get the 1%, the cream of the crop, who attract millions of followers, but people will congregate around sources they trust. That becomes a problem because humans are biased and will just go to sources they agree with, but I wonder how much of that actually happens, and whether most reasonable people (the majority) will instead follow a Leigh Sales or Laura Tingle. That will then form a basis for uni teaching in that sphere, focusing on research and how to filter out the bot shit. Journalists will have to stake their reputation on how well they can sort the 1% from the 99% bot drivel and become information gateway gurus. I've already had to talk to my kids, who have said "I can't believe I found out about this before they were talking about it on the news", and had to sit them down and discuss the benefits of news outlets verifying information, biased reporting, and the pressures on news outlets in this day and age, and that positive reinforcement from one event blocks out the memory of all those times they didn't get it right. Brave New World.