Tuesday, February 11, 2025

OpenAI’s ‘Operator’ Agent Automates Online Tasks

OpenAI launches an AI agent called ‘Operator’ that can automatically fill out forms, make restaurant reservations, book holidays, order shopping and more

AI startup OpenAI has just launched an AI agent that it calls ‘Operator’, which is designed to automate everyday web tasks for users.

At the launch event, CEO Sam Altman said that ‘Operator’ uses its own web browser to accomplish tasks that a user gives it. This could remove the need for a user to do their own online shopping, book a holiday or restaurant reservation, or fill out forms, for example.

Operator went live for Pro users in the United States on Thursday at operator.chatgpt.com, and will be available in other countries “soon”.

However, its arrival in Europe “will take a while.”

OpenAI chief executive Sam Altman. Image credit: OpenAI

Operator agent

Altman also stressed that it is early days for ‘Operator’: it is an ‘early research preview’, meaning it still makes mistakes, and it will be improved. He added that OpenAI will launch more agents in the coming months.

“Operator is one of our first agents, which are AIs capable of doing work for you independently – you give it a task and it will execute it,” said OpenAI.

“Operator can be asked to handle a wide variety of repetitive browser tasks such as filling out forms, ordering groceries, and even creating memes,” the firm stated. “The ability to use the same interfaces and tools that humans interact with on a daily basis broadens the utility of AI, helping people save time on everyday tasks while opening up new engagement opportunities for businesses.”

Operator is powered by a new model called Computer-Using Agent (CUA), which combines GPT-4o’s vision capabilities with advanced reasoning through reinforcement learning. CUA is trained to interact with graphical user interfaces (GUIs) – the buttons, menus, and text fields people see on a screen.

Operator can “see” (via screenshots) and “interact” (using all the actions a mouse and keyboard allow) with a browser, enabling it to take action on the web without requiring custom API integrations.

If it encounters challenges or makes mistakes, Operator can utilise its reasoning capabilities to self-correct. When it gets stuck and needs assistance, it simply hands control back to the user.

Users can choose to take over control of the remote browser at any point, and Operator is trained to proactively ask the user to take over for tasks that require login, payment details, or when solving CAPTCHAs.
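The loop described above (look at the screen, pick an action, hand over to the user for sensitive steps, give up when stuck) can be illustrated with a small Python sketch. This is not OpenAI's code or API: `FakeBrowser`, `choose_action`, `run_task`, the page labels and the step limit are all hypothetical names invented for illustration, and a text label stands in for a real screenshot.

```python
from dataclasses import dataclass, field
from typing import Callable, List

# Hypothetical illustration only; none of these names come from OpenAI.
# Steps the article says are handed to the user rather than automated.
SENSITIVE_STEPS = {"login", "payment", "captcha"}


@dataclass
class Action:
    kind: str         # "click", "user" or "done"
    target: str = ""  # what the action is aimed at


@dataclass
class FakeBrowser:
    """Stand-in for the remote browser: a fixed sequence of page labels."""
    pages: List[str]
    position: int = 0
    log: List[str] = field(default_factory=list)

    def screenshot(self) -> str:
        # A real agent would receive pixels; a page label is used here.
        return self.pages[self.position]

    def apply(self, action: Action) -> None:
        # Record the action and move to the next page if there is one.
        self.log.append(f"{action.kind}:{action.target}")
        if self.position < len(self.pages) - 1:
            self.position += 1


def choose_action(screen: str) -> Action:
    """Toy policy standing in for the model's decision."""
    if screen == "confirmation":
        return Action("done")
    return Action("click", target=screen)


def run_task(browser: FakeBrowser,
             ask_user: Callable[[str], None],
             max_steps: int = 10) -> str:
    for _ in range(max_steps):
        screen = browser.screenshot()
        if screen in SENSITIVE_STEPS:
            # Hand control to the user for this step, then continue.
            ask_user(screen)
            browser.apply(Action("user", target=screen))
            continue
        action = choose_action(screen)
        if action.kind == "done":
            return "completed"
        browser.apply(action)
    # Step budget exhausted: treat the agent as stuck.
    return "handed_back"


handoffs: List[str] = []
browser = FakeBrowser(["search", "results", "payment", "confirmation"])
result = run_task(browser, handoffs.append)
print(result, handoffs, browser.log)
```

In this toy run the agent clicks through the first two pages on its own, pauses for the user at the payment page, and stops once the confirmation page appears. A real system would replace the page labels with screenshots and the toy policy with a model.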

Users can personalise their workflows in Operator by adding custom instructions.

Operator safety

OpenAI said it is collaborating with companies like DoorDash, Instacart, OpenTable, Priceline, StubHub, Thumbtack, Uber, and others to ensure Operator addresses real-world needs while respecting established norms.

OpenAI stressed that ensuring Operator is safe to use is a top priority, with three layers of safeguards to prevent abuse and ensure users are firmly in control.

First, Operator is trained to ensure that the person using it is always in control and asks for input at critical points.

Secondly, OpenAI has made it easy to manage data privacy in Operator.

Thirdly, OpenAI said it has “built defences against adversarial websites that may try to mislead Operator through hidden prompts, malicious code, or phishing attempts.”


