OpenAI’s new artificial intelligence tool, Operator, is designed to make your mundane tasks easier, including making dinner reservations, ordering groceries and filling out forms.
In a demo video, OpenAI highlights how the AI agent can interact with web pages by typing, clicking and scrolling when using a special browser. You just describe the task you want done and the agent can handle multiple requests at the same time, such as shopping on Etsy while making a dinner reservation.
It can “see” via screenshots and “interact” in the same way a mouse and keyboard would allow within a browser, according to OpenAI. Operator, which OpenAI describes as “one of our first agents,” is available in a limited preview.
With competitors like Google and Anthropic already offering similar AI agents, OpenAI is working to narrow the gap. It’s also part of OpenAI’s larger effort to make its generative AI even more useful by automating more aspects of daily life, potentially getting closer to delivering on the promise that it’ll forever change the way we interact with technology.
“The ability to use the same interfaces and tools that humans interact with on a daily basis broadens the utility of AI, helping people save time on everyday tasks while opening up new engagement opportunities for businesses,” the company said in a blog post.
The tool is powered by a new model called the Computer-Using Agent, which combines GPT-4o’s vision capabilities with advanced reasoning through reinforcement learning. It’s trained to interact with graphical user interfaces, including the buttons, menus and text fields people see on a screen.
If problems arise, the company said, Operator can use its reasoning capabilities to self-correct or return control to the user. It’s also trained to ask the user to take over for tasks that require certain inputs, such as login credentials or payment details.
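For readers curious how such a loop might look in practice, here is a minimal sketch of the screenshot-then-act cycle described above, including the handoff for sensitive inputs. It is an illustration only, not OpenAI's implementation: the `Action` type, the `run_agent` function, the `SENSITIVE_FIELDS` set and the step limit are all hypothetical names and assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical list of fields the agent should never fill in on its own.
SENSITIVE_FIELDS = {"password", "card_number"}


@dataclass
class Action:
    kind: str         # "click", "type", "scroll" or "done" (assumed action set)
    target: str = ""  # label of the on-screen element the action applies to
    text: str = ""    # text to enter when kind == "type"


def run_agent(
    propose: Callable[[str], Action],  # stand-in for the model: screen -> next action
    screenshot: Callable[[], str],     # stand-in for capturing the browser view
    execute: Callable[[Action], None], # stand-in for mouse and keyboard input
    max_steps: int = 20,
) -> str:
    """Observe the screen, choose an action, act, and repeat."""
    for _ in range(max_steps):
        action = propose(screenshot())
        if action.kind == "done":
            return "finished"
        # Credentials and payment details are left to the user.
        if action.kind == "type" and action.target in SENSITIVE_FIELDS:
            return "handed_off"
        execute(action)
    # Step budget exhausted without finishing: return control to the user.
    return "handed_off"
```

In this sketch, the model never types into a field marked sensitive; the loop stops and returns control instead, mirroring the behavior the company describes.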
The tool is now available to paying Pro users in the US at operator.chatgpt.com.