This repo describes the schema required for agents to interoperate with each other in Azure AI Agent Service and how agents are defined in Azure AI Agent Service so ISVs can templatize agents that can be deployed in Azure AI Agent Service within a customer's Azure tenant.
In this repo, we define an agent as a system that processes input messages and generates output messages in response. This definition is intentionally broad to encompass a wide range of agent types. For a complete breakdown of what constitutes an agent, see What is an Agent.
Research–from projects like AutoGen–has shown that complex tasks may be better solved by having multiple agents work together. Because agents are not defined in a standard way, however, it is difficult to take agents from multiple platforms, frameworks, or systems and orchestrate them together.
The goal of this repo is to define a common input and output protocol for agents so that a central orchestrator can securely manage the flow of information between agents in a way that is agnostic to the underlying agent implementation. Without this common protocol, the central orchestrator would need custom adapters for each agent.
For simple agents, it should also be feasible to declaratively define an agent and its tools and deploy them to a platform that supports the common protocol. This would allow ISVs to create agents that can be deployed in Azure AI Agent Service without needing to write custom code that provides the common protocol.
In multi-agent workflows (such as those pioneered by AutoGen), a common pattern is to have one agent generate content and another approve it. After the content has been approved, it can then be published by a third agent.
To demonstrate this pattern, we'll define the following agents:
-
Content writer: This agent generates content. It will be originally written in Azure AI Agent Service by the customer.
-
Approver: This agent checks the content for grammar, style, and accuracy before approving or rejecting the post. It will be written by an ISV using the agent and tool declarative file formats so a customer can use to deploy the agent in Azure AI Agent Service. The declarative file format will include a reference to the ISV's fine-tuned model, a templatized system instruction, and tools to perform fact-checking.
The agent's declaration file, for example, might look like this:
name: "Approver" type: "declarative" description: "Approves or rejects content based on predefined rules." instructions: "Check the content for grammar, style, and accuracy. Use the Bing to fact-check the content." tools: - name: "microsoft.bing.grounding" output: type: object properties: approved: type: boolean reason: type: string required: - approved
-
Publisher: This agent publishes the content. It will be hosted outside of Azure AI Agent Service within the customer's blog platform. This agent will act as a "black box" to Azure AI Agent Service–meaning that Azure AI Agent Service will not have visibility into the agent's internal workings–but it will conform to the common protocol so that it can be orchestrated by Azure AI Agent Service.
Since Azure AI Agent Service will be connecting to the publisher agent using the common protocol, the author of the publisher agent will merely need to provide metadata information about the agent's metadata. The endpoint for the agent will be provided by the customer when the agent is connected to Azure AI Agent Service.
name: "Publisher" type: "external" description: "Publishes content to a blog platform."
Note
We need to still define how authentication is handled between Azure AI Agent Service and the publisher agent. This information will need to be included in the agent's metadata.
Once these agents have been defined, a customer would perform the following steps to create the workflow:
- The user would create the content writer agent in Azure AI Agent Service.
- The user would discover the approver agent in Foundry Catalog and deploy it to Azure AI Agent Service. If the Azure AI Foundry project does not yet have the necessary tools, the system would also deploy and create a connection to the tools in Azure AI Agent Service.
- The user would discover the publisher agent in Foundry Catalog and connect it to Azure AI Agent Service. During the connection process, the user would provide the endpoint for the publisher agent and setup any necessary authentication.
- The user would create a workflow in Azure AI Agent Service that connects the content writer to the approver and the approver back to the content writer (if the content is rejected) or to the publisher (if the content is approved).
Once the workflow is invoked, the following steps would occur. In this example, we'll assume that all three agents are participating in the same thread.
- The user provides an initial prompt to the workflow in Azure AI Agent Service.
<user>Write a blog post about the lifespan of cats.</user>
- The workflow triggers the content writer with the initial thread.
<user>Write a blog post about the lifespan of cats.</user> <agent name="content_writer">Here's a draft of the blog post...</agent>
- The workflow passes the thread to the approver agent to check the content.
<user>Write a blog post about the lifespan of cats.</user> <agent name="content_writer">Here's a draft of the blog post...</agent> <agent name="approver"> <toolCall name="microsoft.bing" id="1"><input name="query" value="Life expectancy of cats"></toolCall> <toolCall name="microsoft.bing" id="2"><input name="query" value="Oldest cat ever"></toolCall> <toolCall name="microsoft.bing" id="3"><input name="query" value="Current oldest cat"></toolCall> </agent> <tool for="1">[{"site": "wikipedia.org", "title": "Cat", "snippet": "The life expectancy of a cat is..."}]</tool> <tool for="2">[{"site": "wikipedia.org", "title": "Oldest cat", "snippet": "The oldest cat ever recorded..."}]</tool> <tool for="3">[{"site": "cnn.com", "title": "Oldest cat", "snippet": "Marco won the title of oldest cat..."}]</tool> <agent name="approver">{"approved": false, "reason": "The blog claims..."}</agent>
- Since the approver rejected the content, the workflow passes the thread back to the content writer. The thread contains the approver's research and feedback so the content writer can revise the content correctly.
<user>Write a blog post about cats.</user> <agent name="content_writer">Here's a draft of the blog post...</agent> <agent name="approver"> <toolCall name="microsoft.bing" id="1"><input name="query" value="Life expectancy of cats"></toolCall> <toolCall name="microsoft.bing" id="2"><input name="query" value="Oldest cat ever"></toolCall> <toolCall name="microsoft.bing" id="3"><input name="query" value="Current oldest cat"></toolCall> </agent> <tool for="1">[{"site": "wikipedia.org", "title": "Cat", "snippet": "The life expectancy of a cat is..."}]</tool> <tool for="2">[{"site": "wikipedia.org", "title": "Oldest cat", "snippet": "The oldest cat ever recorded..."}]</tool> <tool for="3">[{"site": "cnn.com", "title": "Oldest cat", "snippet": "Marco won the title of oldest cat..."}]</tool> <agent name="approver">{"approved": false, "reason": "The blog claims..."}</agent> <agent name="content_writer">Here's a revised draft of the blog post...</agent>
- The second time the approver is given the thread, it can approve the content. It does not need to re-research the content because it's previous research is still available in the thread.
<user>Write a blog post about cats.</user> <agent name="content_writer">Here's a draft of the blog post...</agent> <agent name="approver"> <toolCall name="microsoft.bing" id="1"><input name="query" value="Life expectancy of cats"></toolCall> <toolCall name="microsoft.bing" id="2"><input name="query" value="Oldest cat ever"></toolCall> <toolCall name="microsoft.bing" id="3"><input name="query" value="Current oldest cat"></toolCall> </agent> <tool for="1">[{"site": "wikipedia.org", "title": "Cat", "snippet": "The life expectancy of a cat is..."}]</tool> <tool for="2">[{"site": "wikipedia.org", "title": "Oldest cat", "snippet": "The oldest cat ever recorded..."}]</tool> <tool for="3">[{"site": "cnn.com", "title": "Oldest cat", "snippet": "Marco won the title of oldest cat..."}]</tool> <agent name="approver">{"approved": false, "reason": "The blog claims..."}</agent> <agent name="content_writer">Here's a revised draft of the blog post...</agent> <agent name="approver">{"approved": true, "approved_content": "..."}</agent>
- The workflow can then pass the thread to the publisher agent to publish the content. Since the publisher agent is an external agent, the entire thread is passed using the common protocol. As the publisher agent is performing it's "work", it will stream back any new messages to Azure AI Agent Service so the updates can be added to the thread. This includes tool calls and tool results the agent wants to make visible and any other chat messages the agent generates.
<user>Write a blog post about cats.</user> <agent name="content_writer">Here's a draft of the blog post...</agent> <agent name="approver"> <toolCall name="microsoft.bing" id="1"><input name="query" value="Life expectancy of cats"></toolCall> <toolCall name="microsoft.bing" id="2"><input name="query" value="Oldest cat ever"></toolCall> <toolCall name="microsoft.bing" id="3"><input name="query" value="Current oldest cat"></toolCall> </agent> <tool for="1">[{"site": "wikipedia.org", "title": "Cat", "snippet": "The life expectancy of a cat is..."}]</tool> <tool for="2">[{"site": "wikipedia.org", "title": "Oldest cat", "snippet": "The oldest cat ever recorded..."}]</tool> <tool for="3">[{"site": "cnn.com", "title": "Oldest cat", "snippet": "Marco won the title of oldest cat..."}]</tool> <agent name="approver">{"approved": false, "reason": "The blog claims..."}</agent> <agent name="content_writer">Here's a revised draft of the blog post...</agent> <agent name="approver">{"approved": true, "approved_content": "..."}</agent> <agent name="publisher"><toolCall name="wordpress" id="4"><input name="content" value="..."></toolCall></agent> <tool for="4">{"status": "published"}</tool> <agent name="publisher">{"status": "published"}</agent>