Sam Altman, CEO of OpenAI, leaves a luncheon during the Allen & Company Sun Valley Conference on July 6, 2022 in Sun Valley, Idaho.
Kevin Dietsch | Getty Images News | Getty Images
OpenAI on Tuesday announced GPT-4, the latest version of its core large language model, which it says shows “human-level performance” on many professional tests.
GPT-4 is “bigger” than previous versions, meaning it was trained on more data and has more weights in its model file, which also makes it more expensive to run.
Many researchers in the field believe that many of the recent advances in AI have come from running ever-larger models on thousands of supercomputers in training processes that can cost tens of millions of dollars. GPT-4 is an example of an approach centered on “scaling up” to achieve better results.
OpenAI said it uses Microsoft Azure to train the model; Microsoft has invested billions in the startup. Citing the “competitive landscape,” OpenAI did not release details about the model's specific size or the hardware used to train it, information that could be used to recreate the model.
OpenAI’s GPT large language models power many of the AI demos that have wowed people in the tech industry over the past six months, including Bing’s AI chat and ChatGPT, and the latest version is a preview of improvements that may filter down to consumer products like chatbots in the coming weeks. Bing’s AI chatbot uses GPT-4, Microsoft said on Tuesday.
OpenAI says the new model will produce fewer factually incorrect answers, will go off the rails and discuss forbidden topics less often, and will even outperform humans on many standardized tests.
GPT-4 performed in the 90th percentile on a simulated bar exam, in the 93rd percentile on the SAT reading exam, and in the 89th percentile on the SAT math exam, OpenAI claims.
However, OpenAI cautions that the new software is not yet perfect and that it is less capable than humans in many scenarios. It still has a big problem with “hallucinating,” or making things up, and is not factually reliable, the company said. It also still tends to insist that it is right when it is wrong.
“GPT-4 still has many known limitations that we are working to address, such as social bias, hallucinations, and adversarial prompting,” the company said in a blog post.
“In casual conversation, the difference between GPT-3.5 and GPT-4 can be subtle. The difference becomes apparent when the complexity of the task reaches a sufficient threshold—GPT-4 is more reliable, creative, and capable of handling many more nuanced instructions than GPT-3.5,” OpenAI wrote in a blog post.
The new model will be available to paid ChatGPT subscribers and will also be available as part of an API that allows developers to integrate AI into their applications. OpenAI will charge about 3 cents for about 750 words of prompts and 6 cents for about 750 words in response.
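As a rough illustration of what those rates imply, the arithmetic can be sketched in a few lines of Python. This is only a back-of-the-envelope estimate built on the approximate per-750-word figures quoted above; the function and constant names are hypothetical, and it does not reflect how OpenAI actually meters or bills usage.

```python
# Approximate rates as reported in the article (assumption: billed
# proportionally per ~750 words; actual metering may differ).
WORDS_PER_UNIT = 750
PROMPT_CENTS_PER_UNIT = 3     # about 3 cents per ~750 words of prompt
RESPONSE_CENTS_PER_UNIT = 6   # about 6 cents per ~750 words of response


def estimate_cost_cents(prompt_words: int, response_words: int) -> float:
    """Estimate the cost in cents of one request (hypothetical helper)."""
    prompt_cost = prompt_words / WORDS_PER_UNIT * PROMPT_CENTS_PER_UNIT
    response_cost = response_words / WORDS_PER_UNIT * RESPONSE_CENTS_PER_UNIT
    return prompt_cost + response_cost


if __name__ == "__main__":
    # A 1,500-word prompt with a 750-word reply: 6 cents + 6 cents.
    print(estimate_cost_cents(1500, 750))  # 12.0
```

At these rates, a developer sending a 1,500-word prompt and receiving a 750-word answer would pay roughly 12 cents for the exchange.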