The Trash-Text Tsunami
In 2024, The Internet Will Become An Ocean of (Null)formation
A few months ago the Center For Mark Twain Studies hosted Sheila Williams, the longtime editor of Asimov’s Science Fiction. Williams’s address was part of our 2023 Quarry Farm Symposium, which focused on speculative, technocratic, and science fictions, and she discussed how these genres have developed over the course of her decades as an editor, often responding to seismic changes in publishing.
The contemporary epoch, as Williams describes it, began just over a year ago, with the release of OpenAI’s ChatGPT. It is not that ChatGPT has increased the quantity of publishable submissions to her magazine, but rather that it threatens to exponentially increase the quantity of unpublishable ones. Citing the recent screenwriters’ strike and the associated “fear that their labor would be replaced by AI,” Williams says, “At the moment, that’s not my concern. My concern is that I will be overwhelmed by AI spam.”
Within three months of the public release of ChatGPT, Asimov’s submissions spiked by 55%. Williams says, “I have never read an original human-authored submission that was as poorly-written or as uninteresting as these pieces.” But even as she grew more adept at identifying ChatGPT-generated work, each submission still required some time and attention. The scale of the increase made her fear long-delayed responses to real authors, or undesirable changes to her magazine’s long-standing open submission policy. Such changes would make it harder for the magazine to discover and publish fresh new voices, something Asimov’s has a long, proud record of doing.
As a member of my college’s ad hoc committee tasked with advising faculty and administration on the implications of Generative AI, what struck me about Williams’s testimony was how it resembled what I was hearing from colleagues.
Educators’ initial concern - that students would be able to use ChatGPT to automate assignments and therefore receive satisfactory grades without developing any of the skills those assignments were intended to assess - had been almost immediately superseded by concerns about the increase of bureaucratic labor for which ChatGPT was directly responsible.
You see, in college courses, as at Asimov’s Science Fiction, we are being inundated with trash-text submissions. It isn’t so much that ChatGPT-generated submissions are difficult to identify, but rather that the prevalence of them puts instructors in a bind. Upon recognizing a ChatGPT-generated submission, they have a few options. They can:
follow existing academic honesty protocols, which in many cases remain vague about what constitutes academic dishonesty in relation to AI tools.
have careful conversations with each individual student about work they suspect has been generated by ChatGPT, conversations which may be met, at least at first, with denial, anger, tears, etc. Such conversations usually lead to asking students to repeat assignments.
treat the work as if it were composed by the student. This approach - which, for the record, has been my default position - leads to some pretty comical assessments (“this citation does not exist,” “this is not a text we read,” “not the topic for this assignment,” etc.), but hopefully it signals to the student that ChatGPT will not yield passable work.
give the student a low but passing grade with minimal comment (which is likely what they expect) and move on to submissions from students who value learning and demonstrate actual investment in their coursework.
To be clear, I am not advocating for any of the above strategies. I can see the merits (and risks) associated with each of them, and a wide range of factors (discipline, assignment type, grade weight, course level, class size, etc.) could determine the approach an instructor might prefer.
What’s most important, I think, is that each of the first three strategies requires significantly more time and labor for each ChatGPT-generated submission than for a “normal” one. If even a relatively small proportion of students (say, 3-5%) are relying on ChatGPT, that’s likely hours of labor added to each class over the course of a term.
So, either ChatGPT creates more work for instructors already stretched by the well-documented labor intensification since 2020, or, following a pretty clear incentive structure, they simply ignore it, at least until they are given clearer directives (including support and compensation) from their institutions.
The latter strategy, unfortunately, will signal to many students that ChatGPT-generated submissions are largely indistinguishable from average submissions, which will undoubtedly lead more students to use ChatGPT, at least selectively, placing further pressure on instructors, more of whom will choose the labor-saving path, and so on, in a vicious cycle.
It’s imperative that both publishers and educators develop processes for dealing with the deluge of trash-text submissions, and I believe we are, though the work is incomplete and ongoing, and the establishment of some kind of viable long-term procedural solution is probably years away.
But my purpose today is simply to say: this trash-text tsunami is coming for us all. The avalanche of garbage prose which publishers and educators have been sifting through is going to descend upon everybody who uses the internet.
In his year-end prediction dump earlier this week, Casey Newton wrote,
The quality of search results degrades as Google proves unable to reliably detect AI-generated content. In 2024, AI-produced dreck will find its way into nearly every corner of the internet. While most of it will be inoffensive and generally correct, it will get enough wrong — and cause enough frustration among those it misleads — that trust in search will decline.
Of course, SEO grifts, webcrawler-generated mirror sites, vacuous clickbait, and other largely automated content have littered the web for decades. But, though there is a raft of exceptions, Google (and other high-usage tools, and even social platforms) has been relatively efficient at identifying such content and suppressing its circulation.
I’m not talking about misinformation and disinformation. That’s another category of problem. Propaganda is, at least, created with a rhetorical purpose, with an imagined audience, and with human input. I’m talking about (null)formation. (Null)formation is generated with little or no human oversight, and no aspiration to be coherent to actual readers, merely to algorithms.
Much like Sheila Williams reading ChatGPT-generated submissions, you will be able to recognize (null)formation in a matter of seconds, and unlike educators, you will have no responsibility for deciding what to do with it, except to click the back button. The consequences will only be dire if/when it becomes difficult to find anything but (null)formation. When internet browsing, our dominant medium for knowledge acquisition, becomes like trying to find the pages of Hamlet scattered amongst the output of 100 billion type-writing monkeys, that’s a recipe for chaos.
Like Newton, I expect Google will, in time, regain its control over (null)formation, and that the incentives for creating it will subside, and whatever utility AI tools have for building knowledge through text will be marshaled by, well, writers. But, in the interim, much of the internet will come to resemble Babel.
What worries me most is that this (probably inevitable) phase coincides with an escalating feud among Big Tech, legacy news organizations, and governments, a feud which has already led to the throttling of news content. Might it be convenient for corporations mired in anti-monopoly litigation, and fearing further investigations from both journalists and regulators, to bury us for a while in (null)formation, if only to demonstrate that they can?