Physical Address
304 North Cardinal St.
Dorchester Center, MA 02124
Organizations that want to deploy AI agents must first fine-tune them, especially for workflows that often feel rote. While some organizations want agents that perform only one kind of task in a single workflow, agents sometimes need to be brought into new environments in the hope that they will adapt.
Researchers from the Beijing University of Posts and Telecommunications have unveiled a new method, AgentRefine. It trains agents to self-correct, leading to more generalized and adaptive AI agents.
The researchers said that current tuning methods limit agents to the same tasks as their training data, or “held-in” tasks, and that these agents do not perform as well on “held-out” tasks, meaning new environments. Because they follow only the rules laid out in the training data, agents trained this way may have trouble “learning” from their mistakes and cannot easily be turned into general agents and moved into new workflows.
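The distinction between tasks seen during training (held-in) and unseen tasks (held-out) can be illustrated with a minimal sketch. The task names and pass/fail values below are made up purely for illustration and are not results from the paper; the function name `generalization_gap` is likewise hypothetical.

```python
def success_rate(results):
    """Fraction of episodes that succeeded (results is a list of booleans)."""
    return sum(results) / len(results)


def generalization_gap(results_by_task, held_in_tasks):
    """Return (held-in rate, held-out rate, gap) for per-task episode results.

    results_by_task maps a task name to a list of pass/fail booleans.
    held_in_tasks is the set of task names that appeared in the training data.
    """
    held_in, held_out = [], []
    for task, results in results_by_task.items():
        # Tasks present in training count as held-in; everything else is held-out.
        (held_in if task in held_in_tasks else held_out).extend(results)
    in_rate = success_rate(held_in)
    out_rate = success_rate(held_out)
    return in_rate, out_rate, in_rate - out_rate


# Hypothetical toy data, not taken from the paper.
toy_results = {
    "task_a": [True, True, True, False],    # seen in training
    "task_b": [True, False, False, False],  # new environment
}
in_rate, out_rate, gap = generalization_gap(toy_results, held_in_tasks={"task_a"})
print(f"held-in={in_rate:.2f} held-out={out_rate:.2f} gap={gap:.2f}")
```

A large positive gap is the symptom the researchers describe: an agent that does well on familiar tasks but struggles once it leaves them.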
To overcome that limitation, AgentRefine aims to create more generalized training datasets that let the model learn from its mistakes and fit into new workflows. In a new paper, the researchers said that the goal of AgentRefine is to develop generalized agent-tuning data and to establish the relationship between agent generalization and self-refinement. If agents correct themselves, they will not perpetuate the errors they have learned or carry those errors into other environments where they are deployed.
The researchers wrote that tuning agents on self-refinement data helps them explore more viable actions when they run into bad situations, which leads to better generalization to new environments.
Taking their cue from the tabletop role-playing game Dungeons & Dragons, the researchers created personas, scripts for the agent to follow, and challenges. And yes, there is a Dungeon Master (DM).
They divided data construction for AgentRefine into three stages: script generation, trajectory generation and verification.
In script generation, the model creates a script, or guide, containing information about the environment, the tasks and the actions that personas can take. (The researchers tested AgentRefine using Llama-3-8B-Instruct, Llama-3-70B-Instruct, Mistral-7B-Instruct-v0.3, GPT-4o-mini and GPT-4o.)
In the trajectory stage, the model generates agent data that contains errors and acts as both the DM and the player. It assesses the actions it can take and then checks whether they contain errors. The final stage, verification, checks the script and the trajectory, which makes it possible for agents trained on the data to learn to correct themselves.
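The three stages can be sketched roughly as follows. This is a toy illustration under stated assumptions, not the authors' implementation: the language model calls are replaced by hard-coded stubs, and every name here (`Script`, `Turn`, `generate_script`, `generate_trajectory`, `verify`, the `error_rate` parameter) is hypothetical.

```python
import random
from dataclasses import dataclass


@dataclass
class Script:
    """Stage 1 output: environment, task and the actions a persona may take."""
    environment: str
    task: str
    valid_actions: list


@dataclass
class Turn:
    role: str              # "player" or "dm"
    content: str
    is_error: bool = False


def generate_script(persona):
    # Stub standing in for a model call that writes a script for a persona.
    return Script(
        environment=f"A workshop used by {persona}",
        task="open the locked chest",
        valid_actions=["look", "take key", "unlock chest", "open chest"],
    )


def generate_trajectory(script, rng, error_rate=0.3):
    """Stage 2: play both DM and player, deliberately injecting mistakes."""
    turns = []
    for action in script.valid_actions:
        if rng.random() < error_rate:
            bad = f"{action} blindly"  # intentionally not a valid action
            turns.append(Turn("player", bad, is_error=True))
            turns.append(Turn("dm", f"Error: '{bad}' is not possible here."))
        # The player (re)tries with a valid action and the DM accepts it.
        turns.append(Turn("player", action))
        turns.append(Turn("dm", f"OK: {action}"))
    return turns


def verify(script, turns):
    """Stage 3: keep a trajectory only if every mistake is flagged and refined."""
    for i, turn in enumerate(turns):
        if turn.role != "player":
            continue
        if turn.is_error:
            if i + 2 >= len(turns):
                return False
            feedback, retry = turns[i + 1], turns[i + 2]
            if feedback.role != "dm" or not feedback.content.startswith("Error"):
                return False
            if retry.role != "player" or retry.content not in script.valid_actions:
                return False
        elif turn.content not in script.valid_actions:
            return False  # an unflagged invalid action fails verification
    return True


script = generate_script("a locksmith")
trajectory = generate_trajectory(script, random.Random(0))
print(verify(script, trajectory))
```

The point of the sketch is the shape of the data: each injected mistake is followed by DM feedback and a corrected action, so a model tuned on such trajectories sees examples of recovering from errors rather than only error-free demonstrations.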
The researchers found that agents trained with the AgentRefine method and dataset performed better on diverse tasks and adapted to new scenarios. These agents self-correct more often, redirecting their actions and decisions to avoid errors, and become more robust in the process.
In particular, AgentRefine improved the performance of all the models on held-out tasks, meaning tasks that were not part of their training.
Enterprises need to make agents more adaptable to new tasks, so that they do not simply repeat what they have learned and can become better decision-makers. Orchestrator agents not only “direct traffic” for multiple agents but also determine whether agents have completed tasks based on user requests.
OpenAI’s o3 offers “program synthesis,” which could improve task adaptability. Other orchestration and training frameworks, such as Magentic-One from Microsoft, set actions for supervisor agents so they learn when to move tasks to different agents.