In a bid to increase the usefulness of generative AI tools for developers, OpenAI has introduced CriticGPT, a new model it says can help identify errors in ChatGPT code output.
OpenAI claims that CriticGPT, which is based on GPT-4, helped reviewers outperform unassisted efforts 60% of the time, demonstrating its ability to augment human performance on code review tasks rather than replace human workers.
The OpenAI initiative aims to refine the Reinforcement Learning from Human Feedback (RLHF) process to ensure higher quality and greater reliability in AI systems.
OpenAI launches new code verification model
OpenAI's latest GPT-4 series, which powers the publicly available versions of ChatGPT, relies heavily on RLHF to keep its responses reliable and conversational. Until now, this process has been manual, depending on human AI trainers who score ChatGPT responses to improve model performance.
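For context, the core of RLHF is human trainers comparing candidate responses; a reward model is then fit to those preferences and used to steer the chat model. The sketch below is a minimal, hypothetical illustration of that comparison step; the function names and data are illustrative, not OpenAI's actual pipeline:

```python
# Minimal sketch of the human-comparison step at the heart of RLHF.
# All names and data here are illustrative, not OpenAI's tooling.

from dataclasses import dataclass

@dataclass
class Comparison:
    prompt: str
    preferred: str      # response the trainer ranked higher
    rejected: str       # response the trainer ranked lower

def collect_comparison(prompt: str, response_a: str, response_b: str,
                       trainer_prefers_a: bool) -> Comparison:
    """Record a single human preference judgment.

    A reward model is later trained so that
    reward(prompt, preferred) > reward(prompt, rejected),
    and that reward signal steers the chat model via RL.
    """
    if trainer_prefers_a:
        return Comparison(prompt, response_a, response_b)
    return Comparison(prompt, response_b, response_a)

# Example: a trainer scores two candidate answers to the same prompt.
c = collect_comparison(
    prompt="Write a function that reverses a string.",
    response_a="def rev(s): return s[::-1]",
    response_b="def rev(s): return s",   # buggy: does not reverse
    trainer_prefers_a=True,
)
print(c.preferred)
```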
With the launch of CriticGPT, OpenAI can now generate critiques of ChatGPT responses automatically, addressing concerns that the chatbot's output is becoming too sophisticated for many human trainers to evaluate unaided.
CriticGPT was trained by AI trainers who inserted intentional bugs into code generated by ChatGPT and then wrote feedback flagging those bugs. The results were promising: trainers preferred CriticGPT's critiques 63% of the time, thanks in part to the tool's tendency to produce fewer nitpicks and fewer hallucinated problems.
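To make that training procedure concrete, the sketch below mimics the data-collection loop it describes: a known bug is deliberately planted in otherwise working code, and the trainer's written critique identifying that bug becomes a training example. Everything here, from the names to the data structure, is a hypothetical illustration rather than OpenAI's actual training code:

```python
# Hypothetical illustration of the bug-insertion step described above;
# not OpenAI's actual training code.

from dataclasses import dataclass

@dataclass
class TrainingExample:
    question: str        # the original coding prompt
    tampered_code: str   # ChatGPT output with a bug deliberately inserted
    critique: str        # trainer-written feedback identifying the bug

original = "def mean(xs):\n    return sum(xs) / len(xs)"

# The trainer plants a subtle bug: floor division truncates the result.
tampered = original.replace("/", "//")

example = TrainingExample(
    question="Write a function that returns the mean of a list of numbers.",
    tampered_code=tampered,
    critique="Line 2 uses // (floor division), so mean([1, 2]) returns 1 "
             "instead of 1.5; use / for true division.",
)

# CriticGPT is trained to produce critiques like `example.critique`
# when shown code like `example.tampered_code`.
print(example.tampered_code)
```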
However, the project is not without limitations, and collaboration between humans and AI continues to prove more effective than AI working alone.
In its announcement, OpenAI summarized: “CriticGPT's suggestions aren't always correct, but we found that they can help trainers catch many more problems with model-written responses than they would without AI help.”
The company also acknowledged that “errors can be spread across many parts of a response,” making them harder for an AI tool to pinpoint.
Looking ahead, OpenAI has confirmed plans to expand its work on CriticGPT and put it into practice.