Turnitin AI detection helps detect AI writing

Turnitin, a leading provider of academic integrity solutions, recently activated new features that detect the use of artificial intelligence (AI) writing tools like ChatGPT. Source: Shutterstock

Turnitin AI detection tackling the issue of academic integrity

Article by Nathan Hew

Technology has radically changed the education sphere. In 2020, most universities were forced to shift to remote learning — almost overnight — due to COVID-19. Over time, educators adopted technologies to encourage interactivity and establish hybrid models of online and in-person activities. 

As a leading provider of academic integrity solutions, Turnitin has kept its eyes on the latest technological developments in education. Recently, it activated new features that detect the use of artificial intelligence (AI) writing tools. They are found in Turnitin Feedback Studio (TFS), TFS with Originality, Turnitin Originality, Turnitin Similarity, Simcheck, Originality Check, and Originality Check+.

We sat down with James Thorley, the Regional Vice President of Asia Pacific at Turnitin, to learn more about how the provider tackles the issue of academic integrity with the rise of generative AI technology. 

James Thorley, the Regional Vice President of Asia Pacific at Turnitin

Despite not having a computer science or coding background, Thorley understands education technology and how universities want to implement it. (Source: Turnitin)

Turnitin AI detection: Modern day problems, modern solutions 

The groundwork for developing Turnitin’s AI writing detection features started early — nearly two years before the release of ChatGPT. 

“It’s something we’ve been thinking about and deciding what to do with it for a number of years,” says Thorley. “As soon as ChatGPT launched, it became a huge thing across the world in nearly every field. It’s pretty rare for you to get sort of front-page news around a technology. In education, there was a lot of excitement and concerns.”

What sets ChatGPT apart is how OpenAI made its powerful GPT-3 language-generating system available to the general public with a sleek interface, sparking competition among users to come up with the most creative instructions.

Students have used the technology for exams at various levels. A Twitter user reportedly used ChatGPT to take an SAT exam. In another incident, a Midwestern high school senior told the Washington Post that he used ChatGPT for two homework assignments: a computer science quiz and a coding assignment.

Teachers have hopped on this trend too. One high school English teacher in Oregon used the AI chatbot in her classes to help students create outlines for their essays on two 19th-century short stories (“The Story of an Hour” by Kate Chopin and “The Yellow Wallpaper” by Charlotte Perkins Gilman). After they got their outlines from ChatGPT, the students set their laptops aside and wrote their essays.

To detect AI writing, Turnitin breaks a submission into segments of text that are roughly a few hundred words long (about five to ten sentences). Those segments overlap so that each sentence is captured in context. Because large language models like ChatGPT are trained on huge amounts of Internet text, they tend to generate sequences of words by picking the next highly probable word.
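The overlapping-segment idea can be sketched in a few lines of Python. This is only an illustration, not Turnitin's code: the function names, the naive sentence splitter, and the `size` and `stride` values are all assumptions chosen for the example.

```python
import re

def split_sentences(text):
    """Naive splitter (assumption): break after '.', '!' or '?' followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]

def overlapping_segments(sentences, size=5, stride=2):
    """Group sentences into windows of `size`, starting a new window every `stride` sentences."""
    if not sentences:
        return []
    if len(sentences) <= size:
        return [list(sentences)]
    last_start = len(sentences) - size
    starts = list(range(0, last_start + 1, stride))
    # Add a final window so the last sentences are always covered.
    if starts[-1] != last_start:
        starts.append(last_start)
    return [sentences[s:s + size] for s in starts]

text = "One. Two. Three. Four. Five. Six. Seven. Eight."
for segment in overlapping_segments(split_sentences(text)):
    print(" ".join(segment))
```

Because the windows overlap, most sentences appear in more than one segment, so each one can be judged alongside its neighbours rather than in isolation.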

Human writing tends to be less consistent, so the next word a human chooses is harder to predict and often has a low probability. Turnitin’s classifiers are trained on these differences in word probability and are attuned to the particular word-probability patterns of human writers.
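To make the word-probability idea concrete, here is a toy sketch that scores how predictable each next word is under a tiny bigram model. It is a stand-in for illustration only: Turnitin has not published its classifier, and the corpus, the function names, and the add-alpha smoothing are all assumptions.

```python
import math
from collections import Counter, defaultdict

def train_bigram(tokens):
    """Count how often each word follows each other word."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        counts[prev][nxt] += 1
    return counts

def avg_log_prob(tokens, counts, vocab_size, alpha=1.0):
    """Mean log-probability of each next word, with add-alpha smoothing."""
    pairs = list(zip(tokens, tokens[1:]))
    if not pairs:
        raise ValueError("need at least two tokens")
    total = 0.0
    for prev, nxt in pairs:
        seen = counts.get(prev, Counter())
        p = (seen[nxt] + alpha) / (sum(seen.values()) + alpha * vocab_size)
        total += math.log(p)
    return total / len(pairs)

# Toy reference corpus (hypothetical) standing in for a language model's training data.
corpus = "the cat sat on the mat the cat sat on the rug".split()
model = train_bigram(corpus)
vocab_size = len(set(corpus))

predictable = "the cat sat on the mat".split()
surprising = "mat the on rug cat sat".split()
print(avg_log_prob(predictable, model, vocab_size))
print(avg_log_prob(surprising, model, vocab_size))
```

The first sequence follows the most likely word order and scores a higher average log-probability than the second. A real detector would use a far larger model and a trained classifier on top of such signals.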

The first iteration of Turnitin’s AI writing detection tool can only detect AI writing in long-form documents written in English. Users can check this FAQ to learn more about the tool.

An example of how Turnitin detects AI writing. On the left, the platform will display how much of the submission has been generated by AI. Source: Turnitin

Reducing “false positives”, maintaining accuracy

It’s impossible to get a tool that can detect AI-generated text with 100% certainty. Muhammad Abdul-Mageed, a professor who oversees research in natural-language processing and machine learning at the University of British Columbia, explains why that’s the case.

“The whole point of AI language models is to generate fluent and human-seeming text, and the model is mimicking text created by humans,” Abdul-Mageed shares. “New AI language models are more powerful and better at generating even more fluent language, which quickly makes our existing detection tool kit outdated.”

Turnitin, however, does its best to reduce the number of “false positives” — maintaining a less than 1% false positive rate — to ensure the most reliable results. “In a lab setting and the work we did there, we had a 98% recall rate,” Thorley shares. “I think one of the key things that we’ve been focusing on in terms of balance is really minimizing the false positives, as it were, rather than maximizing or highlighting the definite [possibility of AI writing].”

He adds: “We’re still very confident that we have absolutely the best detector out there when it comes to academic settings.”

False positive rates tend to vary from industry to industry, but one theme remains the same — most organizations strive to achieve a low false positive rate in their detection tools. When IBM launched its high-tech AI to help stop real-time payment fraud, it saw a reduction of 20% in false positive results compared to its previous detection systems. Fewer false positives increase operational efficiency and save cost.

Still, there is a small risk of false positives. According to David Adamson, an AI scientist at Turnitin and a former high school teacher, the company has decided to prioritize precision in its detector. “Preferring precision might mean we miss some AI writing that’s really there; we might have a lower recall,” he explains. “We’re fine with that. Let’s miss some stuff and be more right about what we find.”
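The trade-off between precision, recall and the false positive rate is easier to see with numbers. The sketch below computes all three from a confusion matrix. The counts are hypothetical, picked only so that they echo the 98% recall and roughly 1% false positive rate quoted above; they are not Turnitin's data.

```python
def detection_metrics(tp, fp, tn, fn):
    """Compute precision, recall and false positive rate from confusion-matrix counts.

    tp: AI-written documents correctly flagged
    fp: human-written documents wrongly flagged
    tn: human-written documents correctly passed
    fn: AI-written documents missed
    """
    def ratio(num, den):
        return num / den if den else 0.0

    return {
        "precision": ratio(tp, tp + fp),
        "recall": ratio(tp, tp + fn),
        "false_positive_rate": ratio(fp, fp + tn),
    }

# Hypothetical lab set: 100 AI-written and 100 human-written documents.
metrics = detection_metrics(tp=98, fp=1, tn=99, fn=2)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Raising a detector's decision threshold typically lowers `fp` (better precision, fewer false accusations) at the cost of a higher `fn` (lower recall), which is the balance Adamson describes.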

Turnitin AI detection: Higher-than-expected false positives

After the tool’s release, Annie Chechitelli, the company’s chief product officer, revealed that Turnitin’s AI writing detection tool had a higher false positive rate than originally asserted. As of May 14, 38.5 million submissions had gone through the tool, with 9.6% of those documents reported as containing over 20% AI writing and 3.5% as containing over 80% AI writing, Chechitelli wrote.
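For a sense of scale, the percentages above can be converted into document counts. This is plain arithmetic on the figures Chechitelli reported; the helper name is made up for the example, and integer per-mille values are used to avoid floating-point rounding.

```python
TOTAL_SUBMISSIONS = 38_500_000  # submissions processed as of May 14, per Chechitelli

def share_of_total(per_mille, total=TOTAL_SUBMISSIONS):
    """Return the number of documents for a share given in tenths of a percent."""
    return total * per_mille // 1000

over_20_percent_ai = share_of_total(96)  # 9.6% of submissions
over_80_percent_ai = share_of_total(35)  # 3.5% of submissions

print(f"Over 20% AI writing: {over_20_percent_ai:,}")
print(f"Over 80% AI writing: {over_80_percent_ai:,}")
```

That works out to roughly 3.7 million documents flagged as containing more than 20% AI writing, of which about 1.35 million were flagged as more than 80%.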

When Turnitin’s AI-detection tool reports that less than 20% of a document is AI writing, the result has a higher chance of being a false positive, according to the statement. The company will now add an asterisk with a message casting some doubt on such results.

The tool produces two statistics — one at the document level and one at the sentence level. The sentence-level false positive rate is approximately 4%. Turnitin has not disclosed the new document-level false positive rate.
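A sentence-level rate of about 4% can add up over a whole essay. The sketch below estimates how many sentences in a fully human-written document might be wrongly flagged. It rests on a strong simplifying assumption that each sentence is misjudged independently, which is unlikely to hold exactly in practice; the function names and the 25-sentence essay are hypothetical.

```python
SENTENCE_FPR = 0.04  # approximate sentence-level false positive rate from the article

def expected_false_flags(num_sentences, fpr=SENTENCE_FPR):
    """Expected number of human-written sentences wrongly flagged."""
    return num_sentences * fpr

def prob_at_least_one_flag(num_sentences, fpr=SENTENCE_FPR):
    """Chance that at least one sentence is wrongly flagged, assuming independence."""
    return 1 - (1 - fpr) ** num_sentences

essay_length = 25  # hypothetical essay of 25 sentences
print(f"Expected false flags: {expected_false_flags(essay_length):.2f}")
print(f"Chance of at least one: {prob_at_least_one_flag(essay_length):.0%}")
```

Under these assumptions, a single flagged sentence in a longer essay is weak evidence on its own, which is consistent with the company's decision to mark low-percentage results with a caution.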

Turnitin AI detection can help deal with plagiarism

With the new tool, educators and universities will have an easier time combating plagiarism.

Positive feedback from educators

Educators and university administrators in Southeast Asia have been keenly aware of the potential impact of AI tools like ChatGPT within the region. “Educators in Southeast Asia recognize the impact AI tools may have on the quality of student’s work and the learning experience,” Thorley explained in a press release.

As soon as the features were ready to be released, the company was eager to push out new solutions to the community. “The reason was to give academics the best understanding of what was happening in their classrooms,” says Thorley. “From the seven universities I spoke to, it’s much more around starting a conversation with students and understanding what’s happening.”

With this, over 10,700 institutions and more than 2.1 million educators could quickly and easily evaluate a submission for the presence of AI-generated text and provide feedback to students in their current Turnitin workflows.

Because ChatGPT is still a novel technology, universities are split on adopting the new tool. In Australia, the University of Melbourne, the University of New South Wales, and Western Sydney University have adopted it. Several others have considered integrating it into their detection programs, while some have raised concerns over its efficacy, saying that the Turnitin tool was rushed, the Guardian reported.

Thorley, though, has heard nothing but positive feedback “in terms of the people who have used it [and] how they felt about it.”

How much AI writing is trusted?

A screenshot of a Tweet showcasing sentiments towards AI.

Change is the only constant 

Turnitin’s AI detection technology is just the start. As AI continues to influence many facets of education, the company will adapt its solutions. “One of the interesting things with ChatGPT is the fact that it makes up resources or references,” Thorley shares. “One of the things we’re exploring is reference checking because that is something we have knowledge and experience on. We have a lot of data.” 

The company is also looking at other ways to use generative AI for more personalized learning, according to Thorley. “As a company, we have this philosophy around AI that it always has to be human-centered AI,” he says. “We look at how we can give efficiency to the teacher so they can spend more time doing more meaningful interactions with the students.”