John Mount describes a problem:
I would like to write a bit about text. That is: technical writing, legal briefs, or even an opinion piece such as this note. Such writings make up much of our society and form a “marketplace of ideas.”
Texts are now very cheap to produce using large language models (LLMs). Some simulated texts remain correct and useful, and some contain numerous subtle flaws and fabrications. In my opinion it remains expensive to reliably determine which text is which, as LLMs are not as good at detection as they are at fabrication.
Read on for some of the challenges that have come with the proliferation of language models and auto-generated text. John mentions scientific conferences being overwhelmed with AI-generated abstracts, peer reviews, and the like. We're seeing the same inundation in the technical world: we've developed a few key tells for submissions to speak at our user group, and we automatically reject abstracts that hit those tells (a rough sketch of that kind of filter follows). I'm sure there's a false positive rate, but that kind of protection matters, because an accepted abstract that was artificially generated usually means a speaker who never shows up.
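We haven't shared our exact tells, and any real list would be tuned to your own submission pool, but as a rough illustration, here's a minimal sketch of that kind of filter in Python. The tell phrases below are entirely hypothetical placeholders, not our actual criteria.

```python
# Minimal sketch of a tell-based abstract filter.
# The TELLS list is entirely hypothetical; a real list would be
# tuned against the submissions you actually receive.
import re

TELLS = [
    r"\bdelve into\b",              # hypothetical example tell
    r"\bin today's fast-paced\b",   # hypothetical example tell
    r"\bunlock the power of\b",     # hypothetical example tell
]

def suspicious_tells(abstract: str) -> list[str]:
    """Return the tell patterns found in an abstract (case-insensitive)."""
    text = abstract.lower()
    return [t for t in TELLS if re.search(t, text)]

def auto_reject(abstract: str, threshold: int = 2) -> bool:
    """Reject only when several tells co-occur; a single hit alone
    would push the false positive rate too high."""
    return len(suspicious_tells(abstract)) >= threshold

if __name__ == "__main__":
    sample = "In today's fast-paced world, we delve into the power of data."
    print(suspicious_tells(sample))  # which tells fired
    print(auto_reject(sample))       # True: two tells co-occur
```

Requiring two or more tells to co-occur before rejecting is one way to keep the false positive rate down, at the cost of letting some generated abstracts slip through.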