OpenAI Develops CriticGPT Model Capable of Spotting GPT-4 Code Generation Errors


On Thursday, OpenAI published a study describing a new artificial intelligence (AI) model that can catch GPT-4’s mistakes in code generation. The AI firm stated that the new model was trained using the reinforcement learning from human feedback (RLHF) framework and is powered by one of the GPT-4 models. The under-development model is designed to improve the quality of the AI-generated code that users get from large language models. At present, it is not available to users or testers. OpenAI also highlighted several limitations of the model.

OpenAI Shares Details about CriticGPT

The AI firm shared details of the new CriticGPT model in a blog post, stating that it was based on GPT-4 and designed to identify errors in code generated by ChatGPT. “We found that when people get help from CriticGPT to review ChatGPT code they outperform those without help 60 percent of the time,” the company claims. The model was developed using the RLHF framework and the findings have been published in a paper.

RLHF is a machine learning technique that uses human feedback on a model’s output to train AI systems. Human evaluators provide feedback on the AI’s performance, and that feedback is used to adjust and improve the model’s behaviour. The people who provide this feedback are called AI trainers.

CriticGPT was trained on a large volume of code that contained errors. The AI model was tasked with finding these mistakes and critiquing the code. For this, AI trainers were asked to insert mistakes into the code on top of the naturally occurring ones, and then write example feedback as if they had caught those errors.

Once CriticGPT had produced multiple variations of its critique, the trainers were asked to check whether the AI had caught the errors they inserted alongside the naturally occurring errors. OpenAI, in its research, found that trainers preferred CriticGPT’s critiques over ChatGPT’s in 63 percent of cases involving naturally occurring errors.
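The two measurements described above, whether a planted error was caught and how often one model's critique was preferred, can be illustrated with a small Python sketch. This is not OpenAI's evaluation code: the names `Sample`, `caught`, `catch_rate` and `preference_rate` are hypothetical, the data is made up, and the keyword match only stands in for a judgement that human trainers make in the process the article describes.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    """One buggy code sample and the critiques written for it (hypothetical)."""
    inserted_bug: str      # short label of the bug a trainer planted
    critiques: list[str]   # candidate critiques produced by the critic model


def caught(sample: Sample, critique: str) -> bool:
    # Toy stand-in: a critique "catches" the bug if it mentions the bug label.
    # In the process the article describes, human trainers make this call.
    return sample.inserted_bug.lower() in critique.lower()


def catch_rate(samples: list[Sample]) -> float:
    """Fraction of samples where at least one critique mentions the planted bug."""
    if not samples:
        return 0.0
    hits = sum(any(caught(s, c) for c in s.critiques) for s in samples)
    return hits / len(samples)


def preference_rate(votes: list[str], model: str) -> float:
    """Share of pairwise comparisons in which trainers preferred `model`."""
    if not votes:
        return 0.0
    return votes.count(model) / len(votes)


# Made-up example data, for illustration only.
samples = [
    Sample("off-by-one", ["Loop bound has an off-by-one error.", "Looks fine."]),
    Sample("sql injection", ["Variable naming is unclear."]),
]
votes = ["critic", "baseline", "critic", "critic"]

print(catch_rate(samples))               # 0.5
print(preference_rate(votes, "critic"))  # 0.75
```

A preference rate of this kind says how often one critique was chosen over another in a head-to-head comparison. It does not say that one model found a given percentage more bugs.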

However, the model still has certain limitations. CriticGPT was trained on short snippets of code generated by OpenAI’s models, and it is yet to be trained on long and complex tasks. The AI firm also found that the new model continues to hallucinate (generate factually incorrect responses). Further, the model has not been tested in scenarios where multiple errors are dispersed across the code.

This model is unlikely to be made public, as it is designed to help OpenAI better understand training techniques that can generate higher-quality outputs. If CriticGPT is released publicly, it is expected to be integrated into ChatGPT.
