OpenAI shutters AI detector due to low accuracy


Artificial intelligence powerhouse OpenAI has quietly shut down its AI detection software, citing a low accuracy rate.

The AI classifier developed by OpenAI was first launched on January 31 and aimed to help users, such as teachers and professors, distinguish human-written text from AI-generated text.

According to the original blog post that announced the tool, however, the AI classifier was discontinued on July 20:

“As of July 20, 2023, the AI classifier is no longer available due to its low accuracy rate.”

The link to the tool no longer works, and the note gave only a brief explanation of why the tool was discontinued. However, the company said it was looking for new, more effective ways to identify AI-generated content.

“We are working to integrate feedback and are currently researching more effective provenance techniques for text, and are committed to developing and deploying mechanisms that allow users to understand whether audio or visual content is AI-generated,” the note reads.

OpenAI’s legacy AI classifier in action. Source: Originality AI

From the outset, OpenAI made it clear that the detection tool was error-prone and could not be considered “fully reliable”.

The company said the tool's limitations included being “very inaccurate” on text under 1,000 characters and a tendency to “confidently” label human-written text as AI-generated.

The classifier is the latest of OpenAI’s products to come under scrutiny.

On July 18, researchers from Stanford and UC Berkeley published a study finding that the performance of OpenAI’s flagship ChatGPT had declined significantly over time.

The researchers found that over the past few months, GPT-4’s ability to accurately identify prime numbers had dropped from 97.6% to just 2.4%. Additionally, both GPT-3.5 and GPT-4 showed a significant decline in their ability to generate code.
