ReAct 框架的代码实现工作原理 - 文章 - 开发者社区

👆 👆👆 欢迎关注 👆👆👆

React Agent 是如何工作的？

“Reason Only”（向内求索） ：COT 类型的 Prompt，希望模型一步一步思考去解决问题。这类型 Prompt 非常擅长逻辑推理，能将问题拆解成一个一个小问题，逐一解决。但是没有灵活运用外部工具和外部知识的能力。向内求索型人格 ：像一个思维严谨，思考问题很深入，但是不够开放，不喜欢接触新知识的人。
“Act Only”（向外探索） ：工具使用 Prompt，非常擅长借助外部工具和外部知识去解决问题，但是逻辑推理能力不强。向外探索人格 ：像一个思维开放，非常喜欢探索新知识，尝试新工具，但是思想深度不够的人。
“ReAct”(内外兼修) ：就是将“Reason Only”（向内求索）和“Act Only”（向外探索）这两者相结合，

将严谨思维和探索的心态结合起来，优势互补，成为一个内外兼修的人 。

直观的理解就是，ReAct 就是结合了的 推理能力 + 工具使用能力/外部知识能力，利用大模型去解决难问题。

在 ReAct Prompting 中，使用 LLMs 以交错的方式生成推理轨迹和特定于任务的操作，从而实现两者之间更大的协同作用：

推理跟踪帮助模型诱导、跟踪和更新行动计划以及处理异常。
执行操作与外部源（例如知识库或环境）进行交互，以获取相关信息。

接下来深入研究 ReAct 框架的代码实现的工作原理，总共分三步。

第一步：

定义 Prompt 模版

  
TOOL_DESC = """{name\_for\_model}: Call this tool to interact with the {name\_for\_human} API. What is the {name\_for\_human} API useful for? {description\_for\_model} Parameters: {parameters} Format the arguments as a JSON object."""  
  
REACT_PROMPT = """Answer the following questions as best you can. You have access to the following tools:  
  
{tool\_descs}  
  
Use the following format:  
  
Question: the input question you must answer  
Thought: you should always think about what to do  
Action: the action to take, should be one of [{tool\_names}]  
Action Input: the input to the action  
Observation: the result of the action  
... (this Thought/Action/Action Input/Observation can be repeated zero or more times)  
Thought: I now know the final answer  
Final Answer: the final answer to the original input question  
  
Begin!  
  
Question: {query}"""

工具构建函数

  
def build\_planning\_prompt(TOOLS, query):  
    tool_descs = []  
    tool_names = []  
    for info in TOOLS:  
        tool_descs.append(  
            TOOL_DESC.format(  
                name_for_model=info['name\_for\_model'],  
                name_for_human=info['name\_for\_human'],  
                description_for_model=info['description\_for\_model'],  
                parameters=json.dumps(  
                    info['parameters'], ensure_ascii=False),  
            )  
        )  
        tool_names.append(info['name\_for\_model'])  
    tool_descs = '\n\n'.join(tool_descs)  
    tool_names = ','.join(tool_names)  
  
    prompt = REACT_PROMPT.format(tool_descs=tool_descs, tool_names=tool_names, query=query)  
    return prompt  
  
prompt_1 = build_planning_prompt(TOOLS[0:1], query="加拿大2023年人口统计数字是多少？")  
print(prompt_1)

调用函数查看 response 结果

  
Answer the following questions as best you can. You have access to the following tools:  
  
Search: Call this tool to interact with the google search API. What is the google search API useful for? useful for when you need to answer questions about current events. Parameters: [{"name": "query", "type": "string", "description": "search query of google", "required": true}] Format the arguments as a JSON object.  
  
Use the following format:  
  
Question: the input question you must answer  
Thought: you should always think about what to do  
Action: the action to take, should be one of [Search]  
Action Input: the input to the action  
Observation: the result of the action  
... (this Thought/Action/Action Input/Observation can be repeated zero or more times)  
Thought: I now know the final answer  
Final Answer: the final answer to the original input question  
  
Begin!  
  
Question: 加拿大2023年人口统计数字是多少？

初始化模型

  
# 国内连 hugginface 网络不好，这段代码可能需要多重试  
checkpoint = "Qwen/Qwen-7B-Chat"  
TOKENIZER = AutoTokenizer.from_pretrained(checkpoint, trust_remote_code=True)  
MODEL = AutoModelForCausalLM.from_pretrained(checkpoint, device_map="auto", trust_remote_code=True).eval()  
MODEL.generation_config = GenerationConfig.from_pretrained(checkpoint, trust_remote_code=True)  
MODEL.generation_config.do_sample = False  # greedy

设置 "Observation" 为 stop word

  
stop = ["Observation:", "Observation:\n"]  
react_stop_words_tokens = [TOKENIZER.encode(stop_) for stop_ in stop]  
response_1, _ = MODEL.chat(TOKENIZER, prompt_1, history=None, stop_words_ids=react_stop_words_tokens)  
print(response_1)

模型在预测到要生成的下一个词是 "Observation" 时马上停止生成，此时这个 prompt 后会生成如下的结果

  
Thought: 我应该使用搜索工具帮助我完成任务。search api能完成搜索的任务。  
Action: Search  
Action Input: {"query": "加拿大 2023年人口统计数字"}  
Observation:

第二步：

从输出中解析需要使用的工具和入参

  
def parse\_latest\_plugin\_call(text: str) -> Tuple[str, str]:  
    i = text.rfind('\nAction:')  
    j = text.rfind('\nAction Input:')  
    k = text.rfind('\nObservation:')  
    if 0 <= i < j:  # If the text has `Action` and `Action input`,  
        if k < j:  # but does not contain `Observation`,  
            # then it is likely that `Observation` is ommited by the LLM,  
            # because the output text may have discarded the stop word.  
            text = text.rstrip() + '\nObservation:'  # Add it back.  
            k = text.rfind('\nObservation:')  
    if 0 <= i < j < k:  
        plugin_name = text[i + len('\nAction:'):j].strip()  
        plugin_args = text[j + len('\nAction Input:'):k].strip()  
        return plugin_name, plugin_args  
    return '', ''

调用对应工具

  
def use\_api(tools, response):  
    use_toolname, action_input = parse_latest_plugin_call(response)  
    if use_toolname == "":  
        return "no tool founds"  
  
    used_tool_meta = list(filter(lambda x: x["name\_for\_model"] == use_toolname, tools))  
    if len(used_tool_meta) == 0:  
        return "no tool founds"  
      
    api_output = used_tool_meta[0]["tool\_api"](action_input)  
    return api_output  
  
api_output = use_api(TOOLS, response_1)  
print(api_output)

工具返回的结果

  
根据加拿大统计局预测，加拿大人口今天（2023年6月16日）预计将超过4000万。 联邦统计局使用模型来实时估计加拿大的人口，该计数模型预计加拿大人口将在北美东部时间今天下午3点前达到4000万。 加拿大的人口增长率目前为2.7％。

第三步：

根据工具返回结果继续调用模型

拼接上述返回答案，形成新的prompt，并获得生成最终结果

  
prompt_2 = prompt_1 + response_1 + ' ' + api_output  
stop = ["Observation:", "Observation:\n"]  
react_stop_words_tokens = [TOKENIZER.encode(stop_) for stop_ in stop]  
response_2, _ = MODEL.chat(TOKENIZER, prompt_2, history=None, stop_words_ids=react_stop_words_tokens)  
print(prompt_2, response_2)

续写，生成最终答案

按照 ReAct 格式继续生成 Thought 和 Final Answer，得到最终答案。

  
Answer the following questions as best you can. You have access to the following tools:  
  
Search: Call this tool to interact with the google search API. What is the google search API useful for? useful for when you need to answer questions about current events. Parameters: [{"name": "query", "type": "string", "description": "search query of google", "required": true}] Format the arguments as a JSON object.  
  
Use the following format:  
  
Question: the input question you must answer  
Thought: you should always think about what to do  
Action: the action to take, should be one of [Search]  
Action Input: the input to the action  
Observation: the result of the action  
... (this Thought/Action/Action Input/Observation can be repeated zero or more times)  
Thought: I now know the final answer  
Final Answer: the final answer to the original input question  
  
Begin!  
  
Question: 加拿大2023年人口统计数字是多少？Thought: 我应该使用搜索工具帮助我完成任务。search api能完成搜索的任务。  
Action: Search  
Action Input: {"query": "加拿大 2023年人口统计数字"}  
Observation: 根据加拿大统计局预测，加拿大人口今天（2023年6月16日）预计将超过4000万。 联邦统计局使用模型来实时估计加拿大的人口，该计数模型预计加拿大人口将在北美东部时间今天下午3点前达到4000万。 加拿大的人口增长率目前为2.7％。   
Thought: I now know the final answer.  
Final Answer: 加拿大2023年人口统计数字预计为4000万。

总结

  
def main(query, choose\_tools):  
 # 组织 prompt  
    prompt = build_planning_prompt(choose_tools, query)   
    # 设置 stop word，模型停止生成   
    stop = ["Observation:", "Observation:\n"]  
    react_stop_words_tokens = [TOKENIZER.encode(stop_) for stop_ in stop]  
    response, _ = MODEL.chat(TOKENIZER, prompt, history=None, stop_words_ids=react_stop_words_tokens)  
  
    while "Final Answer:" not in response: # 出现final Answer时结束  
        api_output = use_api(choose_tools, response) # 抽取入参并执行api  
        api_output = str(api_output) # 部分api工具返回结果非字符串格式需进行转化后输出  
        if "no tool founds" == api_output:  
            break  
        print("\033[32m" + response + "\033[0m" + "\033[34m" + ' ' + api_output + "\033[0m")  
        prompt = prompt + response + ' ' + api_output # 合并api输出  
        response, _ = MODEL.chat(TOKENIZER, prompt, history=None, stop_words_ids=react_stop_words_tokens) # 继续生成  
  
    print("\033[32m" + response + "\033[0m")

核心思想是利用了大模型的续写能力，按照 ReAct 的步骤进行续写。
设置停止词，当生成与停止词一样的输出时，模型停止生成。
进行工具和参数解析，并调用工具。
将工具调用结果进行 ReAct 格式进行拼装，调用模型继续进行续写生成。
直到遇见 Final Answer: ，获得最终答案。
使用一次工具，至少要调用 2 次大模型。