Four months ago, a small San Francisco company became the talk of the technology industry when it introduced a new online chatbot that could answer complex questions, write poetry and even mimic human emotions.
Now the company is back with a new version of the technology that powers its chatbots.
The system will up the ante in Silicon Valley’s race to embrace artificial intelligence and decide who will be the next generation of leaders in the technology industry.
OpenAI, which has around 375 employees but has been backed with billions of dollars of investment from Microsoft and industry celebrities, said on Tuesday that it had released a technology that it calls GPT-4.
It was designed to be the underlying engine that powers chatbots and all sorts of other systems, from search engines to personal online tutors.
Most people will use this technology through a new version of the company’s ChatGPT chatbot, while businesses will incorporate it into a wide variety of systems, including business software and e-commerce websites.
The technology already drives the chatbot available to a limited number of people using Microsoft’s Bing search engine.
OpenAI’s progress has, within just a few months, landed the technology industry in one of its most unpredictable moments in decades.
Many industry leaders believe developments in AI represent a fundamental technological shift, as important as the creation of web browsers in the early 1990s.
The rapid improvement has stunned computer scientists. GPT-4, which learns its skills by analysing huge amounts of data culled from the Internet, improves on what powered the original ChatGPT in several ways. It is more precise.
It can, for example, ace the Uniform Bar Exam, instantly calculate someone’s tax liability and provide detailed descriptions of images. But OpenAI’s new technology still has some of the strangely human-like shortcomings that have vexed industry insiders and unnerved people who have worked with the newest chatbots.
It is an expert on some subjects and a dilettante on others. It can do better on standardised tests than most people and offer precise medical advice to doctors, but it can also mess up basic arithmetic.
Companies that bet their futures on the technology may — at least for now — have to put up with imprecision, which was long taboo in an industry built from the ground up on the notion that computers are more exacting than their human creators.
“I don’t want to make it sound like we have solved reasoning or intelligence, which we certainly have not,” Sam Altman, OpenAI’s chief executive, said in an interview.
“But this is a big step forward from what is already out there.”
Other tech companies are likely to include GPT-4’s features in an array of products and services, including Microsoft’s software for performing business tasks and e-commerce sites that want to give customers new ways of virtually trying out their products.
A number of industry giants like Google and Facebook’s parent company, Meta, are also working on their own chatbots and AI technology.
ChatGPT and similar technologies are already shifting the behaviour of students and educators who are trying to understand whether the tools should be embraced or banned. Because the systems can write computer programs and perform other business tasks, they are also on the cusp of changing the nature of work.
Even the most impressive systems tend to complement skilled workers rather than replace them. The systems cannot be used in lieu of doctors, lawyers or accountants.
Experts are still needed to spot their mistakes. But the systems could soon replace some paralegals (whose work is reviewed and edited by trained lawyers), and many AI experts believe they will replace workers who moderate content on the Internet.
“There is definitely disruption, which means some jobs go away and some new jobs get created,” said Greg Brockman, OpenAI’s president. “But I think the net effect is that barriers to entry go down, and the productivity of the experts goes up.”
On Tuesday, OpenAI started selling access to GPT-4 so that businesses and other software developers could build their own applications on top of it.
The company has also used the technology to build a new version of its popular chatbot, which is available to anyone who purchases access to ChatGPT Plus, a subscription service priced at $20 a month. A handful of companies are already working with GPT-4.
Morgan Stanley Wealth Management is building a system that will instantly retrieve information from company documents and other records, and serve it up to financial advisers in conversational prose.
Khan Academy, an online education company, is using the technology to build an automated tutor. “This new technology can act more like a tutor,” said Khan Academy’s chief executive and founder, Sal Khan. “We want it to teach the student new techniques while the student does most of the work.”
Like similar technologies, the new system sometimes “hallucinates”.
It generates completely false information without warning. If asked for websites that lay out the latest in cancer research, it might give several Internet addresses that do not exist.
GPT-4 is a neural network, a type of mathematical system that learns skills by analysing data.
It is the same technology that digital assistants like Siri use to recognise spoken commands and self-driving cars use to identify pedestrians.
Around 2018, companies like Google and OpenAI began building neural networks that learned from enormous amounts of digital text, including books, Wikipedia articles, chat logs and other information posted to the Internet. They are called large language models, or LLMs.
By pinpointing billions of patterns in all that text, the LLMs learn to generate text on their own, including tweets, poems and computer programs.
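As a rough illustration of that idea, and not a description of how GPT-4 itself works, the toy Python sketch below counts which word tends to follow which in a tiny sample sentence and then generates new text from those counts. The sample sentence, the function names `learn_patterns` and `generate`, and the rule of always picking the most common next word are all assumptions made for illustration; real LLMs are neural networks trained on vastly more text.

```python
from collections import Counter, defaultdict

# Hypothetical sample text; a real LLM learns from vastly more data.
SAMPLE_TEXT = "the cat sat on the mat and the cat ate the fish"


def learn_patterns(text):
    """Count how often each word is followed by each other word."""
    words = text.split()
    follows = defaultdict(Counter)
    for current, nxt in zip(words, words[1:]):
        follows[current][nxt] += 1
    return follows


def generate(follows, start, length):
    """Generate text by repeatedly picking the most common next word."""
    output = [start]
    for _ in range(length - 1):
        candidates = follows.get(output[-1])
        if not candidates:
            break  # no known continuation for this word
        output.append(candidates.most_common(1)[0][0])
    return " ".join(output)


patterns = learn_patterns(SAMPLE_TEXT)
print(generate(patterns, "the", 4))
```

Even this crude counter produces plausible-looking word sequences from the patterns it has seen, which hints at why far larger systems can produce fluent text without any guarantee that the text is true.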
OpenAI threw more and more data into its LLM. More data, the company hoped, would mean better answers. OpenAI also refined this technology using feedback from human testers.
As people tested ChatGPT, they rated the chatbot’s responses, separating those that were useful and truthful from those that were not. Then, using a technique called reinforcement learning, the system spent months analysing those ratings and gaining a better understanding of what it should and should not do.
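The flavour of that feedback loop can be sketched in a few lines of Python. This is a simplified toy, not OpenAI's actual method: the three candidate replies, their ratings of +1 or -1, and the names `softmax` and `train` are all hypothetical. The sketch keeps a score for each reply and repeatedly raises the scores of replies rated above the current average while lowering the rest, a basic form of reinforcement learning known as a policy-gradient update.

```python
import math

# Hypothetical human ratings: +1 for a useful, truthful reply, -1 otherwise.
RATINGS = {
    "accurate answer": 1.0,
    "made-up answer": -1.0,
    "rude answer": -1.0,
}


def softmax(scores):
    """Turn raw preference scores into probabilities that sum to 1."""
    exps = {reply: math.exp(s) for reply, s in scores.items()}
    total = sum(exps.values())
    return {reply: e / total for reply, e in exps.items()}


def train(ratings, steps=200, learning_rate=0.5):
    """Nudge the scores so highly rated replies become more likely."""
    scores = {reply: 0.0 for reply in ratings}
    for _ in range(steps):
        probs = softmax(scores)
        # Average rating the system would currently earn.
        expected = sum(probs[r] * ratings[r] for r in ratings)
        for reply in ratings:
            # Raise replies rated above average, lower those rated below.
            scores[reply] += learning_rate * probs[reply] * (ratings[reply] - expected)
    return softmax(scores)


final_probs = train(RATINGS)
print(max(final_probs, key=final_probs.get))
```

In this toy, the reply that the hypothetical raters approved of ends up far more likely than the other two, which mirrors the article's description of a system learning what it should and should not do.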
“Humans rate which stuff they like to see and which stuff they don’t like to see,” said Luke Metz, an OpenAI researcher. The original ChatGPT was based on a large language model called GPT-3.5. OpenAI’s GPT-4 learned from significantly larger amounts of data.
OpenAI executives declined to disclose just how much data the new chatbot had learned from, but Brockman said the data set was “internet scale,” meaning it spanned enough websites to provide a representative sample of all English speakers on the Internet.
GPT-4’s new capabilities may not be obvious to the average person first using the technology. But they are likely to quickly come into focus as lay people and experts continue to use the service.
Given a lengthy article from The New York Times and asked to summarise it, the bot will give a precise summary nearly every time.
Add a few random sentences to that summary and ask the chatbot if the revised summary is accurate, and it will point to the added sentences as the only inaccuracies.
Altman described the behaviour as “reasoning”. But the technology cannot duplicate human reasoning. It is good at analysing, summarising and answering complex questions about a book or news article.
It is far less adept if asked about events that have not yet happened. It can write a joke, but it does not show that it understands what will actually make someone laugh.
“It doesn’t grasp the nuance of what is funny,” said Oren Etzioni, the founding chief executive of the Allen Institute for AI, a prominent lab in Seattle. As with similar technologies, users may find ways of coaxing the system into strange and creepy behaviour.
Asked to imitate another person or play-act, this kind of bot sometimes veers into areas it was designed to stay away from.
New York Times News Service