OpenAI Warns of Bans for Testing AI Model's Thought Process

By Clementine Crooks

September 16, 2024

OpenAI has been making headlines since its inception, but the latest buzz around the company isn't about a groundbreaking new technology or innovation. Rather, it's about what OpenAI doesn't want you to know—specifically, what its AI model o1 is "thinking" under the hood. 
 
Since launching its latest AI model family, codenamed "Strawberry," last week, which includes the reasoning-focused models o1-preview and o1-mini, OpenAI has gone on high alert to protect these models' inner workings. Users who have tried to probe how the models function have reportedly received warning emails and threats of bans from the company. 
 
The introduction of this model family marks a shift from OpenAI's approach with previous models such as GPT-4o. o1 was trained with a specific focus on working through a step-by-step problem-solving process before generating an answer. In the ChatGPT interface, users who ask an o1 model a question can choose whether to see this chain-of-thought process written out, but what they see is a filtered interpretation produced by a second AI model, not the raw reasoning. 
 
This shrouding of information sparked curiosity among hackers and red-teamers, who began attempting jailbreaks and prompt injection techniques aimed at tricking the model into revealing its secrets. Despite some claims of success in uncovering raw chains of thought from o1, no substantial proof has emerged yet. 
 
In response to these attempts to probe the system's reasoning through the ChatGPT interface, OpenAI began sending warning emails when certain trigger phrases were used in conversations with o1. The warnings cautioned against violating policies on circumventing safeguards or safety measures, and threatened loss of access upon further violations. 
 
These actions raised concerns among researchers conducting good-faith red-teaming safety research on the model, including Marco Figueroa of Mozilla's GenAI bug bounty programs. 
 
OpenAI justified its decision to keep the raw chains of thought hidden from users in a blog post titled "Learning to Reason with LLMs." The company argues that the hidden processes offer a unique monitoring opportunity, letting it observe the model's so-called thought process unfiltered. However, it also acknowledged that the decision has disadvantages and is partly driven by commercial interests. 
 
The company cited several factors for withholding the raw chains of thought: retaining an unfiltered feed for its own monitoring, user experience, and competitive advantage. The decision frustrated independent AI researcher Simon Willison, who interpreted it as an attempt to keep other models from training on the reasoning work OpenAI has put into o1. 
 
In fact, it is well known in the AI industry that researchers regularly use outputs from OpenAI's GPT-4 (and previously GPT-3) as training data for new models that later become competitors, despite this violating OpenAI's terms of service. Exposing o1's raw reasoning would provide invaluable training data for anyone looking to build similar "reasoning" models. 
 
So while OpenAI continues to refine its models through the Strawberry family's o1-preview and o1-mini, the veil over what happens beneath remains tightly drawn. Whatever the merits of the user-experience and competitive-edge arguments, researchers who want to understand how these advanced systems evaluate complex prompts are left with persistent concerns about transparency.

