Productively Addressing Artificial Intelligence in the Classroom

Identifying AI-generated work

Using Turnitin

Turnitin's AI-detection model breaks a paper into small chunks and determines whether each sentence is AI-generated or human-generated. It then averages those scores across the paper to give a probability that the paper as a whole was AI- or human-generated. Turnitin describes how its model was trained:
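Turnitin's actual model is proprietary, but the chunk-and-average scoring it describes can be sketched in a few lines. In this illustration, `score_sentence` is a stand-in for a real trained classifier, not Turnitin's method:

```python
def score_sentence(sentence: str) -> float:
    """Placeholder classifier: return a probability (0-1) that this
    sentence is AI-generated. A real detector would use a trained
    model here; this toy heuristic just scales with sentence length."""
    words = sentence.split()
    return min(1.0, len(words) / 40)

def score_paper(sentences: list[str]) -> float:
    """Average the per-sentence scores to get a document-level
    probability, as Turnitin describes doing."""
    if not sentences:
        return 0.0
    return sum(score_sentence(s) for s in sentences) / len(sentences)

paper = [
    "The mitochondria is the powerhouse of the cell.",
    "This essay examines the causes of the French Revolution.",
]
print(f"Document-level AI probability: {score_paper(paper):.2f}")
```

Note that averaging is why a few flagged sentences in an otherwise human-written paper can still move the overall score, which matters for the mixed-text false positives discussed below.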

GPT-3 and ChatGPT are trained on the text of the entire internet, and they are essentially taking that large amount of text and generating sequences of words based on picking the next highly probable words. This means that GPT-3 and ChatGPT tend to generate the next word in a sequence of words in a consistent and highly probable fashion. Human writing, on the other hand, tends to be inconsistent and idiosyncratic, resulting in a low probability of picking the next word the human will use in the sequence.

Our classifiers are trained to detect these differences in word probability and are adept to the particular word probability sequences of human writers.
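The "word probability" idea in the quote above can be made concrete with a toy model. The sketch below trains a tiny bigram model on a two-sentence corpus and scores text by how predictable each next word is; real detectors use large language models, so this is only an illustration of the principle, not of any vendor's implementation:

```python
from collections import Counter, defaultdict

def train_bigram(corpus: list[str]) -> defaultdict:
    """Count word-pair frequencies to estimate P(next word | word)."""
    counts = defaultdict(Counter)
    for text in corpus:
        words = text.lower().split()
        for w1, w2 in zip(words, words[1:]):
            counts[w1][w2] += 1
    return counts

def avg_next_word_prob(counts: defaultdict, text: str) -> float:
    """Average probability the model assigns to each observed next word.
    Predictable (machine-like) text scores high; idiosyncratic text low."""
    words = text.lower().split()
    probs = []
    for w1, w2 in zip(words, words[1:]):
        total = sum(counts[w1].values())
        probs.append(counts[w1][w2] / total if total else 0.0)
    return sum(probs) / len(probs) if probs else 0.0

corpus = ["the cat sat on the mat", "the dog sat on the rug"]
model = train_bigram(corpus)
print(avg_next_word_prob(model, "the cat sat on the mat"))    # predictable
print(avg_next_word_prob(model, "the mat dreams of purple"))  # surprising
```

The predictable sentence scores much higher than the surprising one, which is the difference the classifiers claim to exploit.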

Turnitin states that it has a 98% accuracy rate, but there are concerns about its rapid rollout and accuracy. Inside Higher Ed's recent article, "Turnitin's AI Detector: Higher-Than-Expected False Positives," states that:

When Turnitin’s AI-detection tool reports that a piece of writing has a less than 20 percent chance of having been written by a machine, it has a higher incidence of false positives, according to the statement. Now, the company will add an asterisk with a message casting some doubt on such results.... [and] the tool appears to have particular trouble with text that mixes AI-generated and human-written prose. For example, more than half (54 percent) of false positive (human-written) sentences are located right next to AI-written sentences, according to the statement. More than one-quarter (26 percent) of false positive sentences are located two sentences away from an AI-written sentence.

Remember: Turnitin is making a statistical prediction about the likelihood that a paper was generated by artificial intelligence. This is not proof.
  1. Papers showing a low percentage of AI use in Turnitin are more prone to false positives
  2. Short writing assignments are more prone to false positives
  3. If a paper shows a higher percentage of AI use:
    • talk with the student about it
    • ask to see their notes, annotated readings, and other materials that may have been used for the assignment
    • ask the student about their topic in more depth.

Circumventing AI-detection tools

As AI-detection tools come onto the market, methods for circumventing them are also making their way onto social media and online forums. People have found that editing an AI-generated document to put one's own voice into the work is often enough to pass through an AI-detection tool.

Making up facts (AI hallucinations)

While AI text generators can produce impressive responses to general prompts, and can even provide citations when prompted, some sources indicate they simply invent some quotations and facts. Tom Zeller from Undark Magazine notes that when ChatGPT was asked to produce a journalistic story citing "quotes from experts," the bot both made up some of those quotes and even created "fictional composites" of various real people in the field. (And, interestingly, we plugged a portion of the story ChatGPT created into the GPT-2 Output Detector and received a "99.92% real" report, casting more doubt on that detection tool's accuracy.) Newer versions of these AI tools, including GPT-4, seem to be better at avoiding these "hallucinations," to use the industry's terminology.

Blustering

ChatGPT's developers at OpenAI admit that this is a current shortcoming of the tool, noting that "ChatGPT sometimes writes plausible-sounding but incorrect or nonsensical answers." This isn't always a clear giveaway, since some student writing has a "BS" component of its own, particularly when students focus on an imagined academic voice over substance.

Using AI text-generators yourself

One of the best ways to learn about these tools is to try using them yourself. Try plugging in one of your assignment prompts to see what it generates. Try manipulating the prompt to see how it changes the output. This will help you understand what ChatGPT can do well along with its limitations.