LLM Agents (38) | AI Agents (7): Multi-Agent Architectures

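1. Introduction

Agent systems tend to grow more complex as they are developed, which makes them harder to manage and scale. Typical problems include:

The agent has too many tools to call, so its tool-selection error rate rises sharply;
The context grows too complex for a single agent to handle;
The system needs several areas of specialization (for example, a planner, a researcher, and a math expert).

To address these problems, you can split a single agent into several smaller, independent agents and compose them into a multi-agent system. Each independent agent can be as simple as a prompt plus one LLM call, or as complex as a ReAct agent (or more).

As agent frameworks matured, many companies began building their own multi-agent systems and looking for a one-size-fits-all solution to every agent task. Two years ago, researchers designed a multi-agent collaboration system called ChatDev. ChatDev resembles a virtual software company run by agents that play different roles, such as chief executive officer, chief product officer, art designer, programmer, reviewer, and tester, just as in an ordinary software engineering company.

These agents worked together, communicated with one another, and succeeded in developing a video game. That result convinced many people that any software engineering task could be solved with this kind of multi-agent architecture, with each AI playing a distinct role. Real-world experiments, however, show that not every problem can be solved this way. In some cases a simpler architecture gives a more efficient and more economical solution.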

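1.1 Single-Agent vs. Multi-Agent Architectures

A single-agent design can look perfectly reasonable at first (for example, one agent that handles everything from browser navigation to file operations). As tasks grow more complex and the number of tools increases, however, the single-agent design starts to fall short.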

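2. Multi-Agent Architectures

Single-agent and multi-agent architectures each have strengths and weaknesses. A single-agent architecture is a good fit when the task is simple, well defined, and not subject to specific resource constraints. A multi-agent architecture is more effective when the use case is complex and dynamic, needs more specialized knowledge and collaboration, or has scalability and adaptability requirements.

2.1 Patterns in Multi-Agent Systems

There are several ways to connect the agents in a multi-agent system: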

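2.1.1 Parallel

Multiple agents work on different parts of a task at the same time.

Example: use three agents to summarize, translate, and analyze the sentiment of a given text at the same time.

For the complete example code, see: https://t.zsxq.com/q4cc0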
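The three-agent example above can be sketched without any framework. This is a minimal illustration, not the code behind the link: the three agent functions are hypothetical stand-ins for LLM calls, and `run_parallel` is a name introduced here. It only shows the shape of the pattern, in which the same input fans out to independent workers and the results are collected by name.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-ins for three LLM-backed agents.
def summarize_agent(text: str) -> str:
    # "Summary" = first sentence of the text.
    return text.split(".")[0].strip() + "."

def translate_agent(text: str) -> str:
    # A real agent would call a model; here we only tag the text.
    return f"[fr] {text}"

def sentiment_agent(text: str) -> str:
    positive = {"good", "great", "love"}
    words = {w.strip(".,!").lower() for w in text.split()}
    return "positive" if words & positive else "neutral"

AGENTS = {
    "summary": summarize_agent,
    "translation": translate_agent,
    "sentiment": sentiment_agent,
}

def run_parallel(text: str) -> dict[str, str]:
    """Send the same text to every agent at once and collect results by name."""
    with ThreadPoolExecutor(max_workers=len(AGENTS)) as pool:
        futures = {name: pool.submit(agent, text) for name, agent in AGENTS.items()}
        return {name: fut.result() for name, fut in futures.items()}

result = run_parallel("I love this phone. The battery lasts two days.")
print(result)
```

Threads suit this sketch because real agents spend most of their time waiting on network calls to a model.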

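2.1.2 Sequential

Tasks are processed in order, with the output of one agent becoming the input of the next.

Example: a multi-step approval process.

For the complete example code, see: https://t.zsxq.com/sP9eD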
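As a rough illustration of the multi-step approval example, the sketch below chains three hypothetical agents so that each one receives the previous agent's output. The approval rules, field names, and thresholds are assumptions made up for this sketch; the linked example may be structured differently.

```python
from typing import Callable

Request = dict  # a plain dict stands in for the payload passed along the chain

def manager_agent(req: Request) -> Request:
    # Hypothetical rule: managers may approve up to 10,000.
    return {**req, "manager_ok": req["amount"] <= 10_000}

def finance_agent(req: Request) -> Request:
    # Finance only approves what the manager approved and the budget covers.
    ok = req["manager_ok"] and req["amount"] <= req["budget"]
    return {**req, "finance_ok": ok}

def final_agent(req: Request) -> Request:
    return {**req, "status": "approved" if req["finance_ok"] else "rejected"}

PIPELINE: list[Callable[[Request], Request]] = [manager_agent, finance_agent, final_agent]

def run_sequential(req: Request) -> Request:
    """Feed each agent's output into the next agent, in order."""
    for agent in PIPELINE:
        req = agent(req)
    return req

approved = run_sequential({"amount": 4_000, "budget": 5_000})
rejected = run_sequential({"amount": 8_000, "budget": 5_000})
print(approved["status"], rejected["status"])
```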

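2.1.3 Loop

Agents run in an iterative loop, each refining its output based on feedback from the other agents.

Example: evaluation use cases, such as writing code and testing it.

For the complete example code, see: https://t.zsxq.com/NwAtr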

2.1.4 Router

A central router decides which agent to call based on the task or input.

Example: routing customer support tickets.

For the complete example code, see: https://t.zsxq.com/bRBi9
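A ticket router along these lines might look like the sketch below. The keyword matching is a deliberately simple assumption standing in for an LLM classifier, and the agent names and keyword lists are invented for illustration.

```python
# Hypothetical specialist agents; each would normally wrap an LLM call.
def billing_agent(ticket: str) -> str:
    return "billing: " + ticket

def technical_agent(ticket: str) -> str:
    return "technical: " + ticket

def general_agent(ticket: str) -> str:
    return "general: " + ticket

# Route table: agent plus the keywords that select it (illustrative only).
ROUTES = {
    "billing": (billing_agent, ("invoice", "refund", "charge")),
    "technical": (technical_agent, ("error", "crash", "bug")),
}

def route(ticket: str):
    """Central router: pick exactly one agent for the ticket."""
    lowered = ticket.lower()
    for agent, keywords in ROUTES.values():
        if any(keyword in lowered for keyword in keywords):
            return agent
    return general_agent  # fallback when nothing matches

def handle(ticket: str) -> str:
    return route(ticket)(ticket)

print(handle("I need a refund"))
```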

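2.1.5 Aggregator (or Synthesizer)

Collects the outputs contributed by the individual agents and synthesizes them into a final result.

Example: a social media sentiment analysis aggregator.

For the complete example code, see: https://t.zsxq.com/hcJcj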
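The sentiment aggregator example can be approximated as follows. This is a sketch under stated assumptions: one hypothetical collector agent per platform labels posts with a toy word-list scorer, and a separate aggregator merges the per-platform counts into one report. The platform names, word lists, and report fields are all invented here.

```python
from collections import Counter

POSITIVE = {"love", "great"}
NEGATIVE = {"hate", "bad"}

def score(post: str) -> str:
    """Toy sentiment scorer standing in for a model call."""
    words = set(post.lower().split())
    if words & POSITIVE:
        return "positive"
    if words & NEGATIVE:
        return "negative"
    return "neutral"

def platform_agent(posts: list[str]) -> Counter:
    """One agent per platform: label each post and count the labels."""
    return Counter(score(post) for post in posts)

def aggregator(per_platform: dict[str, Counter]) -> dict:
    """Combine every agent's contribution into a single final result."""
    total: Counter = Counter()
    for counts in per_platform.values():
        total.update(counts)
    overall = total.most_common(1)[0][0] if total else "neutral"
    return {"total": dict(total), "overall": overall}

feeds = {
    "platform_a": ["love it", "great value", "meh"],
    "platform_b": ["hate the update", "love the camera"],
}
report = aggregator({name: platform_agent(posts) for name, posts in feeds.items()})
print(report)
```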

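2.1.6 Network (or Horizontal)

Agents communicate directly with one another in a many-to-many fashion, forming a decentralized network.

This architecture suits problems that have no clear agent hierarchy and no fixed order in which agents must be called.

Pros: distributed collaboration and group-driven decision making. The system keeps working even if some agents fail.

Cons: managing communication between agents can become challenging. More communication can reduce efficiency, and agents may duplicate one another's work.

For the complete example code, see: https://t.zsxq.com/lZPGu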
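One way to picture a decentralized network is a set of peers in which every agent chooses the next agent itself, with no coordinator in between. The sketch below assumes three hypothetical peers and a hop limit; the limit is one simple guard against the redundant back-and-forth mentioned in the cons above.

```python
# Each peer returns (name of next peer or None to stop, updated state).
def researcher(state: dict):
    state["notes"].append("researcher: gathered facts")
    return "writer", state

def writer(state: dict):
    state["notes"].append("writer: drafted text")
    # The writer may contact any peer; here it asks for exactly one review.
    return ("reviewer" if not state.get("reviewed") else None), state

def reviewer(state: dict):
    state["notes"].append("reviewer: requested one revision")
    state["reviewed"] = True
    return "writer", state

PEERS = {"researcher": researcher, "writer": writer, "reviewer": reviewer}

def run_network(start: str, max_hops: int = 10) -> dict:
    """Follow peer-to-peer handovers until a peer stops or the hop limit is hit."""
    state: dict = {"notes": []}
    current = start
    hops = 0
    while current is not None and hops < max_hops:
        current, state = PEERS[current](state)
        hops += 1
    state["hops"] = hops
    return state

final_state = run_network("researcher")
print(final_state["notes"])
```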

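2.1.7 Handoffs

In a multi-agent architecture, agents can be represented as graph nodes. Each agent node executes its step and then decides whether to finish or to route to another agent, possibly itself (for example, to run in a loop). A common pattern in multi-agent interaction is the handoff, in which one agent passes control to another. A handoff lets you specify:

Destination: the target agent to navigate to (for example, the name of the node to go to);
Payload: the information to pass to that agent (for example, a state update).

To implement handoffs in LangGraph, an agent node can return a Command object, which combines control flow with a state update:

from typing import Literal
from langgraph.types import Command

def agent(state) -> Command[Literal["agent", "another_agent"]]:
    # the condition for routing/halting can be anything, e.g. LLM tool call / structured output, etc.
    # get_next_agent is a placeholder for your own routing logic
    goto = get_next_agent(...)  # 'agent' / 'another_agent'
    return Command(
        # Specify which agent to call next
        goto=goto,
        # Update the graph state
        update={"my_state_key": "my_state_value"}
    )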
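To see what "destination plus payload" amounts to independently of any framework, here is a small plain-Python sketch. The `Handoff` dataclass, the agent names, and the `run` driver are all hypothetical and are not LangGraph APIs; they only mimic the idea that a node returns where to go next together with a state update.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class Handoff:
    """Framework-free stand-in for a handoff: a destination plus a payload."""
    goto: Optional[str]                         # next agent, or None to stop
    update: dict = field(default_factory=dict)  # state changes to apply

def triage(state: dict) -> Handoff:
    # Toy routing rule: questions containing digits go to the math expert.
    has_digit = any(ch.isdigit() for ch in state["question"])
    target = "math_expert" if has_digit else "generalist"
    return Handoff(goto=target, update={"routed_by": "triage"})

def math_expert(state: dict) -> Handoff:
    return Handoff(goto=None, update={"answer": "handled by math_expert"})

def generalist(state: dict) -> Handoff:
    return Handoff(goto=None, update={"answer": "handled by generalist"})

NODES = {"triage": triage, "math_expert": math_expert, "generalist": generalist}

def run(state: dict, start: str = "triage") -> dict:
    """Driver loop: apply each payload, then transfer control to the destination."""
    current: Optional[str] = start
    while current is not None:
        handoff = NODES[current](state)
        state = {**state, **handoff.update}
        current = handoff.goto
    return state

out = run({"question": "What is 12 * 7?"})
print(out)
```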

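In more complex scenarios, each agent node is itself a graph (that is, a subgraph), and a node inside one agent's subgraph may need to navigate to a different agent. For example, if you have two agents, alice and bob (subgraph nodes in a parent graph), and alice needs to navigate to bob, set graph=Command.PARENT in the Command object:

def some_node_inside_alice(state):
    return Command(
        goto="bob",
        update={"my_state_key": "my_state_value"},
        # specify which graph to navigate to (defaults to the current graph)
        graph=Command.PARENT,
    )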

If you need to support visualization for subgraphs that communicate via Command(graph=Command.PARENT), wrap them in a node function with a Command annotation. That is, instead of this:

builder.add_node(alice)

do this:

def call_alice(state) -> Command[Literal["bob"]]:
    return alice.invoke(state)

builder.add_node("alice", call_alice)

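Handoffs as tools

One of the most common agent types is the ReAct-style tool-calling agent. For this kind of agent, a common pattern is to wrap the handoff in a tool call, for example:

from langchain_core.tools import tool

@tool
def transfer_to_bob():
    """Transfer to bob."""
    return Command(
        goto="bob",
        update={"my_state_key": "my_state_value"},
        graph=Command.PARENT,
    )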

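This is a special case of updating the graph state from a tool: in addition to the state update, it also carries control flow.

If you want to use tools that return Command, you can either use the prebuilt create_react_agent / ToolNode components or implement your own tool-executing node that collects the Command objects returned by the tools and returns them as a list, for example:

def call_tools(state):
    ...
    # tool_calls and tools_by_name are expected to come from the elided code above
    commands = [tools_by_name[tool_call["name"]].invoke(tool_call) for tool_call in tool_calls]
    return commands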

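Now let's take a closer look at the different multi-agent architectures.

2.1.8 Supervisor

In this architecture, we define the agents as nodes and add a supervisor node (an LLM) that decides which agent node to call next. We use Command to route execution to the appropriate agent node based on the supervisor's decision. This architecture also lends itself to running multiple agents in parallel or to a map-reduce pattern.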

  
from typing import Literal  
from langchain_openai import ChatOpenAI  
from langgraph.graph import StateGraph, MessagesState, START, END
from langgraph.types import Command
  
model = ChatOpenAI()  
  
def supervisor(state: MessagesState) -> Command[Literal["agent_1", "agent_2", END]]:  
    # you can pass relevant parts of the state to the LLM (e.g., state["messages"])  
    # to determine which agent to call next. a common pattern is to call the model  
    # with a structured output (e.g. force it to return an output with a "next_agent" field)  
    response = model.invoke(...)  
    # route to one of the agents or exit based on the supervisor's decision  
    # if the supervisor returns "__end__", the graph will finish execution  
    return Command(goto=response["next_agent"])  
  
def agent_1(state: MessagesState) -> Command[Literal["supervisor"]]:  
    # you can pass relevant parts of the state to the LLM (e.g., state["messages"])  
    # and add any additional logic (different models, custom prompts, structured output, etc.)  
    response = model.invoke(...)  
    return Command(  
        goto="supervisor",  
        update={"messages": [response]},  
    )  
  
def agent_2(state: MessagesState) -> Command[Literal["supervisor"]]:  
    response = model.invoke(...)  
    return Command(  
        goto="supervisor",  
        update={"messages": [response]},  
    )  
  
builder = StateGraph(MessagesState)  
builder.add_node(supervisor)  
builder.add_node(agent_1)  
builder.add_node(agent_2)  
  
builder.add_edge(START, "supervisor")  
  
supervisor = builder.compile()

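2.1.9 Supervisor (Tool-Calling)

In this variant of the supervisor architecture, we define the individual agents as tools and use a tool-calling LLM in the supervisor node. It can be implemented as a ReAct-style agent with two nodes: an LLM node (the supervisor) and a tool-calling node that executes the tools (in this case, the agents).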

  
from typing import Annotated  
from langchain_openai import ChatOpenAI  
from langgraph.prebuilt import InjectedState, create_react_agent  
  
model = ChatOpenAI()  
  
# this is the agent function that will be called as tool  
# notice that you can pass the state to the tool via InjectedState annotation  
def agent_1(state: Annotated[dict, InjectedState]):  
    # you can pass relevant parts of the state to the LLM (e.g., state["messages"])  
    # and add any additional logic (different models, custom prompts, structured output, etc.)  
    response = model.invoke(...)  
    # return the LLM response as a string (expected tool response format)  
    # this will be automatically turned to ToolMessage  
    # by the prebuilt create_react_agent (supervisor)  
    return response.content  
  
def agent_2(state: Annotated[dict, InjectedState]):  
    response = model.invoke(...)  
    return response.content  
  
tools = [agent_1, agent_2]  
# the simplest way to build a supervisor w/ tool-calling is to use prebuilt ReAct agent graph  
# that consists of a tool-calling LLM node (i.e. supervisor) and a tool-executing node  
supervisor = create_react_agent(model, tools)

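2.1.10 Hierarchical (or Vertical)

Agents are organized in a tree structure, in which higher-level agents (supervisors) manage lower-level agents.

As the number of agents in the system grows, a single supervisor may struggle to manage all of them. The supervisor may start making poor decisions about which agent to call next, and the context may become too complex for one supervisor to track. In other words, you end up with the same problems that motivated the multi-agent architecture in the first place.

To address this, you can design the system hierarchically. For example, create separate, specialized teams of agents, each managed by its own supervisor, plus a top-level supervisor that manages the teams.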

  
from typing import Literal  
from langchain_openai import ChatOpenAI  
from langgraph.graph import StateGraph, MessagesState, START, END  
from langgraph.types import Command  
model = ChatOpenAI()  
  
# define team 1 (same as the single supervisor example above)  
  
def team_1_supervisor(state: MessagesState) -> Command[Literal["team_1_agent_1", "team_1_agent_2", END]]:  
    response = model.invoke(...)  
    return Command(goto=response["next_agent"])  
  
def team_1_agent_1(state: MessagesState) -> Command[Literal["team_1_supervisor"]]:  
    response = model.invoke(...)  
    return Command(goto="team_1_supervisor", update={"messages": [response]})  
  
def team_1_agent_2(state: MessagesState) -> Command[Literal["team_1_supervisor"]]:  
    response = model.invoke(...)  
    return Command(goto="team_1_supervisor", update={"messages": [response]})  
  
team_1_builder = StateGraph(MessagesState)
team_1_builder.add_node(team_1_supervisor)  
team_1_builder.add_node(team_1_agent_1)  
team_1_builder.add_node(team_1_agent_2)  
team_1_builder.add_edge(START, "team_1_supervisor")  
team_1_graph = team_1_builder.compile()  
  
# define team 2 (same as the single supervisor example above)  
class Team2State(MessagesState):  
    next: Literal["team_2_agent_1", "team_2_agent_2", "__end__"]  
  
def team_2_supervisor(state: Team2State):  
    ...  
  
def team_2_agent_1(state: Team2State):  
    ...  
  
def team_2_agent_2(state: Team2State):  
    ...  
  
team_2_builder = StateGraph(Team2State)  
...  
team_2_graph = team_2_builder.compile()  
  
  
# define top-level supervisor  
  
def top_level_supervisor(state: MessagesState) -> Command[Literal["team_1_graph", "team_2_graph", END]]:  
    # you can pass relevant parts of the state to the LLM (e.g., state["messages"])  
    # to determine which team to call next. a common pattern is to call the model  
    # with a structured output (e.g. force it to return an output with a "next_team" field)  
    response = model.invoke(...)  
    # route to one of the teams or exit based on the supervisor's decision  
    # if the supervisor returns "__end__", the graph will finish execution  
    return Command(goto=response["next_team"])  
  
builder = StateGraph(MessagesState)  
builder.add_node(top_level_supervisor)  
builder.add_node("team_1_graph", team_1_graph)  
builder.add_node("team_2_graph", team_2_graph)  
builder.add_edge(START, "top_level_supervisor")  
builder.add_edge("team_1_graph", "top_level_supervisor")  
builder.add_edge("team_2_graph", "top_level_supervisor")  
graph = builder.compile()

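Pros: roles and responsibilities are clearly divided across the levels of agents. Communication paths are streamlined. Well suited to large systems with structured decision flows.

Cons: a failure at a higher level can bring down the whole system. Lower-level agents have limited independence.

2.1.11 Custom Multi-Agent Workflow

Each agent communicates with only a subset of the other agents. Parts of the flow are deterministic, and only some agents can decide which other agents to call next.

In this architecture, we add the individual agents as graph nodes and define ahead of time the order in which agents are called in a custom workflow. In LangGraph, a workflow can be defined in two ways:

Explicit control flow (normal edges): LangGraph lets you define the application's control flow (that is, the order in which agents communicate) explicitly through normal graph edges. This is the most deterministic variant of the architectures above: we always know in advance which agent will be called next.

Dynamic control flow (Command): in LangGraph, you can let the LLM decide parts of the application's control flow by using Command. A special case is the supervisor tool-calling architecture, in which the tool-calling LLM that powers the supervisor agent decides the order in which the tools (agents) are called.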

  
from langchain_openai import ChatOpenAI  
from langgraph.graph import StateGraph, MessagesState, START  
  
model = ChatOpenAI()  
  
def agent_1(state: MessagesState):  
    response = model.invoke(...)  
    return {"messages": [response]}  
  
def agent_2(state: MessagesState):  
    response = model.invoke(...)  
    return {"messages": [response]}  
  
builder = StateGraph(MessagesState)  
builder.add_node(agent_1)  
builder.add_node(agent_2)  
# define the flow explicitly  
builder.add_edge(START, "agent_1")  
builder.add_edge("agent_1", "agent_2")

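3. Communication Between Agents

The most important thing when building a multi-agent system is working out how the agents communicate. There are a few considerations:

Do agents communicate through the graph state or through tool calls?
What if two agents have different state schemas?
How do agents communicate over a shared message list?

3.1 Graph State vs. Tool Calls

What is the "payload" passed between agents? In most of the architectures discussed above, agents communicate through the graph state. For a supervisor with tool calling, the payload is the tool-call arguments.

Graph state

To communicate through the graph state, the individual agents must be defined as graph nodes. They can be added as functions or as entire subgraphs. At each step of graph execution, an agent node receives the current graph state, runs the agent code, and passes the updated state on to the next nodes.

For example, the code for the social media aggregator defines a graph state of its own (the definition appeared as an image in the original post).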
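Since the original state definition is not available here, the following is only a guess at the general shape such a state could take, not the author's definition. It assumes a typed dictionary in which one key carries a reducer (here `operator.add`) so that several agents can append to it; the field names and the `apply_update` helper are hypothetical, and the helper merely imitates how a partial update from a node could be merged into shared state.

```python
import operator
from typing import Annotated, TypedDict, get_type_hints

class AggregatorState(TypedDict):
    topic: str
    findings: Annotated[list, operator.add]  # each agent appends to this list
    report: str                              # written once by the aggregator

def apply_update(state: dict, update: dict) -> dict:
    """Merge a node's partial update, using the annotated reducer where one exists."""
    hints = get_type_hints(AggregatorState, include_extras=True)
    merged = dict(state)
    for key, value in update.items():
        reducers = getattr(hints[key], "__metadata__", ())
        if reducers and key in state:
            merged[key] = reducers[0](state[key], value)  # combine old and new
        else:
            merged[key] = value                           # plain overwrite
    return merged

state = {"topic": "new phone", "findings": [], "report": ""}
state = apply_update(state, {"findings": ["platform_a: mostly positive"]})
state = apply_update(state, {"findings": ["platform_b: mixed"]})
state = apply_update(state, {"report": "overall positive"})
print(state)
```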

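Agent nodes typically share a single state schema. However, you may want to design agent nodes with different state schemas.

3.2 Different State Schemas

An agent may need a state schema that differs from that of the other agents. For example, a search agent may only need to keep track of queries and retrieved documents. There are two ways to achieve this in LangGraph:

Define a separate state schema for the subgraph agent. If the subgraph and the parent graph share no state keys (channels), you must add input/output transformations so that the parent graph knows how to communicate with the subgraph;
Define a private input state schema for the agent node function that differs from the overall graph state schema. This lets you pass in only the information that the agent needs to run.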
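The first option, input/output transformation between a parent and a sub-agent that share no keys, can be illustrated in plain Python. The schemas, the tiny corpus, and the adapter function `call_search_agent` are assumptions made for this sketch; in a real graph the adapter would be the node function that invokes the subgraph.

```python
from typing import TypedDict

class ParentState(TypedDict):
    messages: list[str]

class SearchState(TypedDict):
    query: str
    documents: list[str]

# Hypothetical document store for the search agent.
CORPUS = ["LangGraph handoffs", "Supervisor pattern", "Graph state reducers"]

def search_agent(state: SearchState) -> SearchState:
    """Sub-agent with its own schema; it knows nothing about parent messages."""
    hits = [doc for doc in CORPUS if state["query"].lower() in doc.lower()]
    return {"query": state["query"], "documents": hits}

def call_search_agent(parent: ParentState) -> ParentState:
    """Adapter node: parent state -> search state -> parent state."""
    # Input transformation: the latest message becomes the query.
    sub_in: SearchState = {"query": parent["messages"][-1], "documents": []}
    sub_out = search_agent(sub_in)
    # Output transformation: fold the documents back into a message.
    summary = f"found {len(sub_out['documents'])} document(s): " + ", ".join(sub_out["documents"])
    return {"messages": parent["messages"] + [summary]}

result = call_search_agent({"messages": ["supervisor"]})
print(result["messages"])
```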

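3.3 Shared Message List

The most common way for agents to communicate is through a shared state channel, typically a list of messages. This assumes that the state always contains at least one channel (key) shared by the agents. When communicating through a shared message list, there is one more question to consider: should an agent share its full thought process or only its final result?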
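The trade-off between sharing the full thought process and sharing only the final result can be made concrete with a small sketch. The message strings and the `share` helper are hypothetical; the point is only that the same agent run can contribute either every intermediate step or just its last message to the shared list.

```python
def research_agent(question: str) -> list[str]:
    """Stand-in agent: returns its intermediate steps followed by a final answer."""
    return [
        f"thought: need data for '{question}'",
        "tool_call: search",
        "tool_result: 3 documents",
        "final: answer based on 3 documents",
    ]

def share(messages: list[str], steps: list[str], full_history: bool) -> list[str]:
    """Append either the whole scratchpad or only the final message to the shared list."""
    shared = steps if full_history else steps[-1:]
    return messages + shared

steps = research_agent("battery life")
verbose = share([], steps, full_history=True)
concise = share([], steps, full_history=False)
print(len(verbose), len(concise))
```

Sharing everything gives other agents more to reason over but makes the list grow quickly; sharing only final results keeps the context small at the cost of hiding how the answer was reached.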

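4. Conclusion

As this post has explored, multi-agent LLM systems offer a powerful paradigm for tackling complex tasks, drawing on architectural patterns such as parallel, sequential, router, and aggregator workflows.

By examining communication mechanisms such as shared state, message lists, and tool calls in detail, we have seen how agents collaborate to stay coordinated.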