
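Part 2: Application Challenges

2. Basic Workflow and Related Technologies

4) Prompt

As mentioned earlier, there are two ways to use a model for downstream tasks. One is to collect labeled samples and instruction-tune the model for each task. The other, which is specific to large models, is to give the model instructions in conversational form and expect it to return the desired result. Compared with the former, the latter is more flexible and cheaper to use, so it has become a key feature that distinguishes today's large language models from traditional NLP models. In this chapter you will learn:

1) Concepts such as Prompt, In-Context Learning, and Prompt engineering

2) How to write a good Prompt, and related Prompt debugging tools

3) New programming paradigms that have emerged from Prompts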

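Agent Frameworks

Now that we have covered the basic concepts and principles of Agents, this section introduces some well-known Agent frameworks and projects that can be used to build your own Agent application quickly.

Orchestration Frameworks (LangChain / LlamaIndex / Semantic Kernel)

As described earlier, an Agent is essentially an application pattern in which an LLM, rather than a human, defines the process for handling and executing a target task. As application complexity has grown, the orchestration frameworks that emerged from RAG applications, such as LangChain, LlamaIndex, and Semantic Kernel, have therefore all added support for agent and multi-agent applications. The following uses LangChain as an example to describe the details of agent applications built on orchestration frameworks; for more on the other frameworks, see the next chapter, Orchestration and Integration.
In LangChain, building an agent application involves several core concepts:

1) Agent

Its core role is to interact with the large model: it passes the model the tools currently available, the user's input, and the actions taken so far together with the corresponding tool outputs. Based on these inputs, the model produces either the next action or the final response to send to the user (AgentAction or AgentFinish). An action specifies a tool and the input to that tool. The excerpt below asks the agent for its next step and handles output-parsing errors: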

        try:
            # Call the LLM to see what to do.
            output = self.agent.plan(
                intermediate_steps,
                callbacks=run_manager.get_child() if run_manager else None,
                **inputs,
            )
        except Exception as e:
            if not self.handle_parsing_errors:
                raise e
            text = str(e).split("`")[1]
            observation = "Invalid or incomplete response"
            output = AgentAction("_Exception", observation, text)
            tool_run_kwargs = self.agent.tool_run_logging_kwargs()
            observation = ExceptionTool().run(
                output.tool,
                verbose=self.verbose,
                color=None,
                callbacks=run_manager.get_child() if run_manager else None,
                **tool_run_kwargs,
            )
            return [(output, observation)]

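Communicating with the large model naturally requires a Prompt. LangChain ships with several built-in agent types, such as Zero-shot ReAct, Structured input ReAct, OpenAI Functions, and Self-ask with search; you can also define a custom agent.

2) Tools
Tools extend the capabilities of the LLM and must be callable by the Agent (a tool can be thought of as a function call). Two conditions must be met: the right tools for the task must be provided, and each tool needs an effective description so that the model knows what the tool does and when to use it. LangChain provides a large number of built-in tools for search, database queries, calculation, and more; see https://python.langchain.com/docs/integrations/tools/.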

return Tool(
    name="Calculator",
    description="Useful for when you need to answer questions about math.",
    func=LLMMathChain.from_llm(llm=llm).run,
    coroutine=LLMMathChain.from_llm(llm=llm).arun,
)
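To make the role of the name and description concrete, here is a minimal sketch of how a framework might render tool metadata into the text the model sees and then dispatch a call by name. `SimpleTool`, `render_tool_descriptions`, and `call_tool` are hypothetical names chosen for illustration; they are not LangChain APIs.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class SimpleTool:
    """Hypothetical stand-in for a tool: a named, described function."""
    name: str
    description: str
    func: Callable[[str], str]


def render_tool_descriptions(tools: List[SimpleTool]) -> str:
    """Build the tool listing that would be placed in the prompt."""
    return "\n".join(f"{t.name}: {t.description}" for t in tools)


def call_tool(tools: List[SimpleTool], name: str, tool_input: str) -> str:
    """Look up a tool by the name the model chose and run it."""
    registry: Dict[str, SimpleTool] = {t.name: t for t in tools}
    if name not in registry:
        return f"Unknown tool: {name}"
    return registry[name].func(tool_input)


tools = [
    SimpleTool("Upper", "Useful for converting text to upper case.", str.upper),
    SimpleTool("Length", "Useful for counting characters.", lambda s: str(len(s))),
]
tool_prompt = render_tool_descriptions(tools)
result = call_tool(tools, "Length", "agent")
```

The description is the only thing the model knows about a tool, which is why a vague description tends to produce wrong or missing tool calls.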

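3) Toolkits

LangChain groups related tools into a toolkit for easier management, for example the create, read, update, and delete operations on a website. LangChain also provides many prebuilt toolkits. Usage is much the same as for individual tools: a toolkit is flattened into a list of Tools that is submitted to the LLM.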

tools = []
unwanted_tools = ["Get Issue", "Delete File", "Create File", "Create Pull Request"]

for tool in toolkit.get_tools():
    if tool.name not in unwanted_tools:
        tools.append(tool)
tools += [
    Tool(
        name="Search",
        func=DuckDuckGoSearchRun().run,
        description="useful for when you need to search the web",
    )
]

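4) AgentExecutor

The runtime environment in which the Agent executes. It ties the agent's workflow together and can be thought of as an open-ended loop that runs until the agent finishes, while handling errors that may occur during execution and applying fallback strategies. The following pseudocode shows LangChain's default, most general runtime: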

next_action = agent.get_action(...)
while next_action != AgentFinish:
    observation = run(next_action)
    next_action = agent.get_action(..., next_action, observation)
return next_action
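The pseudocode can be turned into a small runnable sketch. Everything here is an assumption made for illustration: `AgentAction` and `AgentFinish` are local stand-ins that only borrow LangChain's names, `scripted_agent` replaces the LLM call with fixed logic, and the iteration cap is one possible fallback strategy.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple, Union


@dataclass
class AgentAction:
    tool: str
    tool_input: str


@dataclass
class AgentFinish:
    output: str


Step = Tuple[AgentAction, str]


def scripted_agent(question: str, steps: List[Step]) -> Union[AgentAction, AgentFinish]:
    """Stand-in for the LLM call: decide the next step from the history."""
    if not steps:
        return AgentAction(tool="add", tool_input="2,3")
    return AgentFinish(output=f"The answer is {steps[-1][1]}")


def run_agent(question: str, tools: Dict[str, Callable[[str], str]],
              max_iterations: int = 5) -> str:
    steps: List[Step] = []
    for _ in range(max_iterations):  # bounded loop as a safety net
        decision = scripted_agent(question, steps)
        if isinstance(decision, AgentFinish):
            return decision.output
        observation = tools[decision.tool](decision.tool_input)
        steps.append((decision, observation))
    return "Stopped: iteration limit reached"


def add(tool_input: str) -> str:
    a, b = tool_input.split(",")
    return str(int(a) + int(b))


answer = run_agent("What is 2 + 3?", {"add": add})
```

Each pass feeds the accumulated (action, observation) pairs back to the agent, which is the role `intermediate_steps` plays in the excerpt shown earlier.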

In addition, LangChain provides execution environments for specific agent patterns, such as Plan-and-execute Agent and AutoGPT. Plan-and-execute is used as follows:

from langchain.agents.tools import Tool
from langchain.chains import LLMMathChain
from langchain.chat_models import ChatOpenAI
from langchain.llms import OpenAI
from langchain.utilities import DuckDuckGoSearchAPIWrapper
from langchain_experimental.plan_and_execute import (
    PlanAndExecute,
    load_agent_executor,
    load_chat_planner,
)

search = DuckDuckGoSearchAPIWrapper()
llm = OpenAI(temperature=0)
llm_math_chain = LLMMathChain.from_llm(llm=llm, verbose=True)
tools = [
    Tool(
        name="Search",
        func=search.run,
        description="useful for when you need to answer questions about current events",
    ),
    Tool(
        name="Calculator",
        func=llm_math_chain.run,
        description="useful for when you need to answer questions about math",
    ),
]

model = ChatOpenAI(temperature=0)
planner = load_chat_planner(model)
executor = load_agent_executor(model, tools, verbose=True)
agent = PlanAndExecute(planner=planner, executor=executor)

agent.run(
    "Who is the current prime minister of the UK? What is their current age raised to the 0.43 power?"
)

The key planner prompt is:

SYSTEM_PROMPT = (
    "Let's first understand the problem and devise a plan to solve the problem."
    " Please output the plan starting with the header 'Plan:' "
    "and then followed by a numbered list of steps. "
    "Please make the plan the minimum number of steps required "
    "to accurately complete the task. If the task is a question, "
    "the final step should almost always be 'Given the above steps taken, "
    "please respond to the users original question'. "
    "At the end of your plan, say '<END_OF_PLAN>'"
)
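Because the prompt fixes the output shape (a 'Plan:' header, numbered steps, and an '<END_OF_PLAN>' marker), the planner's reply can be split into steps with a few lines of parsing. The following is a sketch of one way to do that, not LangChain's own parser; `parse_plan` and the sample text are hypothetical.

```python
import re
from typing import List

END_MARKER = "<END_OF_PLAN>"


def parse_plan(text: str) -> List[str]:
    """Extract the numbered steps from the planner's output."""
    body = text.split(END_MARKER)[0]
    if "Plan:" in body:
        body = body.split("Plan:", 1)[1]
    steps = []
    for line in body.splitlines():
        # Accept "1. step" or "1) step", ignoring surrounding whitespace.
        match = re.match(r"\s*\d+[.)]\s+(.*\S)", line)
        if match:
            steps.append(match.group(1))
    return steps


sample = (
    "Plan:\n"
    "1. Search for the current prime minister of the UK.\n"
    "2. Find their age.\n"
    "3. Given the above steps taken, please respond to the users original question.\n"
    "<END_OF_PLAN>"
)
steps = parse_plan(sample)
```

Each extracted step would then be handed to the executor in order, with earlier results available as context.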

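Based on the analysis above, defining your own Agent application comes down to customizing these key components.

OpenAI Native Assistant

In the early days, OpenAI exposed only completion and chat interfaces. Orchestration frameworks on the outside had to parse the output and, depending on the application pattern, perform retrieval augmentation or call tools. As capabilities such as function calling and code interpreter matured, OpenAI moved these core Agent functions, previously implemented externally, inside its platform and exposed them through the Assistant interface. This interface greatly simplifies LLM application development and makes RAG and Agent applications easier to build. As a result, a large number of Agent applications based on OpenAI's native GPTs can be expected in the near future.

The new Assistant interface introduces the following domain concepts.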

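Domain object	Description
Assistant	A purpose-built assistant that uses OpenAI models and calls tools. It has several attributes, including tools and file_ids, which correspond to Tool objects and File objects respectively.
Thread	Represents a chat session. It is stateful, like an entry in the conversation history on the ChatGPT web page, which can be reopened and continued. A Thread contains multiple Message objects.
Message	Represents a single chat message. Messages have different roles, including user, assistant, and tool, and are stored as a list in a Thread.
Run	Represents one execution of instructions. It requires an Assistant to execute and a Thread to execute on; multiple Runs can be created on one Thread.
Run Step	Represents a step of execution. A Run contains multiple Run Steps; inspecting them shows how the assistant arrived at its final result.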

The development process is as follows:

1. Create an Assistant. Because this API is in beta, calls made with curl must include the header OpenAI-Beta: assistants=v1.

assistant = client.beta.assistants.create(
  name="Data visualizer",
  description="You are great at creating beautiful data visualizations. You analyze data present in .csv files, understand trends, and come up with data visualizations relevant to those trends. You also share a brief text summary of the trends observed.",
  model="gpt-4-1106-preview",
  tools=[{"type": "code_interpreter"}],
  file_ids=[file.id]
)

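The tools parameter accepts at most 128 entries, meaning up to 128 tools are supported. OpenAI hosts two tools, code_interpreter and retrieval; third-party custom tools use function calling. The file_ids parameter specifies the files attached to the assistant: up to 20 files, each no larger than 512 MB, with an overall cap of 100 GB. A file.id is obtained from the file creation endpoint and then passed to the assistant creation call. A file can also be associated with a specific Assistant as an AssistantFile. Note that deleting an AssistantFile does not delete the original File object; it only removes the association between the File and the Assistant. To delete the file itself, use the file endpoint as well.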

curl https://api.openai.com/v1/assistants/asst_abc123/files \
    -H "Authorization: Bearer $OPENAI_API_KEY" \
    -H 'Content-Type: application/json' \
    -H 'OpenAI-Beta: assistants=v1' \
    -d '{
      "file_id": "file-abc123"
    }'
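The limits quoted above are easy to check on the client side before calling the API. The following is a hedged sketch only: `check_assistant_limits` is a hypothetical helper, the constants simply restate the numbers in the text, and the file sizes are assumed to be known locally.

```python
from typing import Dict, List

MAX_TOOLS = 128                     # tool limit quoted in the text
MAX_FILES = 20                      # attached-file limit quoted in the text
MAX_FILE_BYTES = 512 * 1024 * 1024  # 512 MB per file, as quoted in the text


def check_assistant_limits(tools: List[dict], file_sizes: Dict[str, int]) -> List[str]:
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    if len(tools) > MAX_TOOLS:
        problems.append(f"too many tools: {len(tools)} > {MAX_TOOLS}")
    if len(file_sizes) > MAX_FILES:
        problems.append(f"too many files: {len(file_sizes)} > {MAX_FILES}")
    for file_id, size in file_sizes.items():
        if size > MAX_FILE_BYTES:
            problems.append(f"file {file_id} exceeds 512 MB")
    return problems


ok = check_assistant_limits([{"type": "code_interpreter"}], {"file-abc123": 1024})
too_big = check_assistant_limits([], {"file-big": MAX_FILE_BYTES + 1})
```

Failing fast locally avoids a round trip that the service would reject anyway.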

2. Create a Thread and Messages; they can be created separately or together.

thread = client.beta.threads.create(
  messages=[
    {
      "role": "user",
      "content": "Create 3 data visualizations based on the trends in this file.",
      "file_ids": [file.id]
    }
  ]
)

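Although a thread has no limit on the number of Messages, the overall content is still bounded by the context window. Files can also be attached within the current thread.

3. Create a Run from the Assistant and the Thread. Once created, the Run automatically executes the instructions in the Thread.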

run = client.beta.threads.runs.create(
  thread_id=thread.id,
  assistant_id=assistant.id,
  model="gpt-4-1106-preview",
  instructions="additional instructions",
  tools=[{"type": "code_interpreter"}, {"type": "retrieval"}]
)

4. Poll the Run status and check whether it is completed. The statuses and their transitions are as follows:

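Status	Definition
queued	When a Run is first created, or when its required_action is completed, it moves to queued and then almost immediately to in_progress.
in_progress	While in_progress, the assistant uses the model and tools to carry out the instructions. Progress can be checked by inspecting the Run Steps.
completed	The Run finished successfully. You can view all Messages the Assistant added to the Thread and all steps of the Run. You can also continue the conversation by adding more user Messages to the Thread and creating another Run.
requires_action	When Function calling is used, once the model has decided on the names and arguments of the functions to call, the Run moves to requires_action. The caller must then run those functions and submit their outputs before the Run can continue. If the outputs are not provided before the expires_at timestamp (roughly 10 minutes after creation), the Run moves to expired.
expired	The Run expires if the caller does not submit the Function calling outputs before expires_at. In addition, if the Run takes too long and exceeds the time in expires_at, OpenAI's system expires it.
cancelling	You can try to cancel a Run that is in progress with the cancel-run endpoint. Once the attempt succeeds, the Run's status becomes cancelled. Cancellation is attempted but not guaranteed.
cancelled	The Run was cancelled successfully.
failed	The cause of the failure can be found in the Run's last_error object. The failure timestamp is recorded in failed_at.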

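This is an asynchronous process. Using the assistant_id, thread_id, and run_id created earlier, you can poll to check execution progress and the Run Steps. If custom tools (Function calling) are invoked, the tool outputs must be submitted so that the Run does not remain in requires_action or end up expired.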
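The status table can be read as a small state machine, and the polling logic can be written independently of the SDK. In this sketch, `wait_for_run`, `get_status`, and `on_requires_action` are hypothetical names: the first callback stands in for a call that retrieves the Run's status, the second for code that submits tool outputs. The status sequence is simulated rather than fetched from the API.

```python
import time
from typing import Callable

# Status names are taken from the table above.
WAITING = {"queued", "in_progress", "cancelling"}
TERMINAL = {"completed", "expired", "cancelled", "failed"}


def wait_for_run(get_status: Callable[[], str],
                 on_requires_action: Callable[[], None],
                 interval: float = 1.0, max_polls: int = 100) -> str:
    """Poll until the Run reaches a terminal status or the poll budget runs out."""
    for _ in range(max_polls):
        status = get_status()
        if status in TERMINAL:
            return status
        if status == "requires_action":
            on_requires_action()  # the caller submits tool outputs here
        elif status not in WAITING:
            raise ValueError(f"unexpected status: {status}")
        time.sleep(interval)
    return "timed_out"


# Simulated status sequence standing in for real API responses.
fake_statuses = iter(["queued", "in_progress", "requires_action", "in_progress", "completed"])
submitted = []
final_status = wait_for_run(lambda: next(fake_statuses),
                            lambda: submitted.append("outputs"), interval=0)
```

Treating every terminal status as an exit condition, rather than waiting only for completed, keeps the loop from spinning forever on a failed or expired Run.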

Retrieve a Run:

curl https://api.openai.com/v1/threads/thread_abc123/runs/run_abc123 \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -H "OpenAI-Beta: assistants=v1"

The response is a thread.run object; its status field holds one of the values in the table above.

Retrieve a Run Step:

curl https://api.openai.com/v1/threads/thread_abc123/runs/run_abc123/steps/step_abc123 \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -H "OpenAI-Beta: assistants=v1"

{
  "id": "step_abc123",
  "object": "thread.run.step",
  "created_at": 1699063291,
  "run_id": "run_abc123",
  "assistant_id": "asst_abc123",
  "thread_id": "thread_abc123",
  "type": "message_creation",
  "status": "completed",
  "cancelled_at": null,
  "completed_at": 1699063291,
  "expired_at": null,
  "failed_at": null,
  "last_error": null,
  "step_details": {
    "type": "message_creation",
    "message_creation": {
      "message_id": "msg_abc123"
    }
  }
}

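5. If the status is completed, fetch the final result

import time

run = client.beta.threads.runs.retrieve(
    thread_id=thread.id,
    run_id=run.id,
)

# Keep polling only while the Run is still pending, so that a failed,
# cancelled, or expired Run does not loop forever.
while run.status in ("queued", "in_progress", "cancelling"):
    print(run.status)
    time.sleep(60)  # wait 60 seconds
    run = client.beta.threads.runs.retrieve(
        thread_id=thread.id,
        run_id=run.id,
    )

if run.status == "completed":
    messages = client.beta.threads.messages.list(thread_id=thread.id)
    print(messages.data)
else:
    print(f"Run ended with status: {run.status}")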

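The whole process above can be developed and debugged in the OpenAI Playground.

LangChain supported this interface right away and provided a tutorial on the new Agent implementation.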

!pip install e2b duckduckgo-search

from langchain.agents import AgentExecutor
from langchain.agents.openai_assistant import OpenAIAssistantRunnable
from langchain.tools import DuckDuckGoSearchRun, E2BDataAnalysisTool

tools = [E2BDataAnalysisTool(api_key="..."), DuckDuckGoSearchRun()]
agent = OpenAIAssistantRunnable.create_assistant(
    name="langchain assistant e2b tool",
    instructions="You are a personal math tutor. Write and run code to answer math questions. You can also search the internet.",
    tools=tools,
    model="gpt-4-1106-preview",
    as_agent=True,
)

agent_executor = AgentExecutor(agent=agent, tools=tools)
agent_executor.invoke({"content": "What's the weather in SF today divided by 2.7"})

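The interface is currently in beta, and OpenAI plans to add the following features:

Support for streaming output (including Messages and Run Steps).
Support for notifications that share object status updates without polling.
Support for DALL-E as an image generation tool.
Support for user messages that contain images.

Beyond the API, OpenAI also offers ordinary users a no-code, UI-based way to build Agents: GPTs.

The following video shows how to create one (a resume-guide GPT by @agishaun):

Compared with building on the current beta API, the GPTs interface is simpler and more intuitive. It offers more built-in plugin capabilities, such as web browsing and image generation, as well as calls to external Actions, and GPTs can easily be shared with other users and even monetized. Because the barrier to creating one is so low, thousands of GPTs were listed on the GPT store within days of release. The following directory site can be used to explore and learn from the newest and most popular GPTs: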

https://gptsdex.com/

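Agent-Specific Frameworks (Single-Agent / Multi-Agent)

Beyond the basic agent-building tools above, there are also projects designed specifically to make building agents easier. They generally fall into two categories: single-Agent projects and multi-Agent collaboration projects.
Single Agent

babyAGI

Developed by Yohei Nakajima after ChatGPT burst onto the scene, babyAGI was one of the first proof-of-concept projects to use large-model capabilities to build a task-driven autonomous Agent. It attracted enormous attention at the time and inspired many later Agent projects. It uses the OpenAI GPT-4 API and a vector database (such as Chroma or Weaviate) to create, prioritize, and execute tasks. The core implementation lives in a single script, babyagi.py, and its essence lies in the prompts. The prompts of its three key Agents are as follows: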

1. task_creation_agent: creates tasks based on the objective

prompt = f""" You are a task creation AI that uses the result of an execution agent to create new tasks with the following objective: {objective}, The last completed task has the result: {result}. This result was based on this task description: {task_description}. These are incomplete tasks: {', '.join(task_list)}. Based on the result, create new tasks to be completed by the AI system that do not overlap with incomplete tasks. Return the tasks as an array."""

2. execution_agent: produces the result of the current task, given the execution history

prompt = f""" You are an AI who performs one task based on the following objective: {objective}\n. Take into account these previously completed tasks: {context}\n. Your task: {task}\nResponse:"""

3. prioritization_agent: orders tasks based on the objective

prompt = f""" You are a task prioritization AI tasked with cleaning the formatting of and reprioritizing the following tasks: {task_names}. Consider the ultimate objective of your team:{OBJECTIVE}. Do not remove any tasks. Return the result as a numbered list, like: #. First task #. Second task Start the task list with number {next_task_id}."""

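It works as follows:

1. Set the objective, have the task creation Agent generate a task list, and take the first task from the list.
2. Send the task to the execution Agent, which builds a prompt from the context and calls the OpenAI API to complete the task.
3. Store the task and its result in the memory module, a vector database such as Chroma or Weaviate.
4. The task creation Agent creates new tasks based on the objective and the context.
5. The task prioritization Agent reorders the task list based on the objective and the results of previous tasks.
6. Repeat steps 2-5 until the task list is empty, at which point the objective is considered complete.

The project went through many iterations; for the original implementation, see https://yoheinakajima.com/task-driven-autonomous-agent-utilizing-gpt-4-pinecone-and-langchain-for-diverse-applications/. Unfortunately, as a proof of concept, its core ideas were absorbed and superseded by frameworks such as LangChain, and the project itself is no longer updated.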
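The six-step loop described for babyAGI can be sketched in plain Python. This is an illustration under stated assumptions, not the code in babyagi.py: the three agent functions are stubs that replace LLM calls with fixed logic, a dictionary stands in for the vector database, and `run` and `max_steps` are hypothetical names.

```python
from collections import deque
from typing import Deque, Dict, List


def execution_agent(objective: str, task: str, context: List[str]) -> str:
    """Stub for the LLM call that performs one task."""
    return f"done: {task}"


def task_creation_agent(objective: str, result: str, task: str,
                        pending: List[str]) -> List[str]:
    """Stub: propose one follow-up task after the first task only."""
    return ["Summarize findings"] if task == "Research topic" else []


def prioritization_agent(objective: str, pending: List[str]) -> List[str]:
    """Stub: keep all tasks, ordered alphabetically instead of by an LLM."""
    return sorted(pending)


def run(objective: str, first_task: str, max_steps: int = 10) -> Dict[str, str]:
    tasks: Deque[str] = deque([first_task])
    memory: Dict[str, str] = {}  # stands in for the vector database
    for _ in range(max_steps):
        if not tasks:
            break  # empty task list: the objective is considered complete
        task = tasks.popleft()
        result = execution_agent(objective, task, list(memory.values()))
        memory[task] = result  # store the task and its result
        new_tasks = task_creation_agent(objective, result, task, list(tasks))
        tasks.extend(t for t in new_tasks if t not in tasks and t not in memory)
        tasks = deque(prioritization_agent(objective, list(tasks)))
    return memory


history = run("Write a report", "Research topic")
```

The `max_steps` cap is an addition for safety; without some bound, a task creation agent that keeps proposing work would never let the list drain.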

AutoGPT

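AutoGPT is an open-source autonomous AI agent developed by Toran Bruce Richards and released in March 2023. It is as well known as babyAGI but more full-featured, with support for scraping websites, searching for information, generating images, and creating and running code.

Compared with babyAGI, AutoGPT goes further in prompt engineering: it makes full use of the model's capabilities and lets the model, rather than hand-written code, control the flow. Its core therefore lies in how the prompt is constructed. Execution proceeds in six steps:

1. Create a plan, including the agent's name, role, and goals. The corresponding prompt fragment is: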

You are



AI Name <-(Variable)

AI Role <-(Variable)



Your decisions must always be made independently 

without seeking user assistance. 



Play to your strengths as a LLM and 

pursue simple strategies with no legal complications.



Goals <-(Variable)

2. Provide the list of available tools; the ones below cover search, website browsing, and image generation.

COMMANDS:



1. Google Search: "google", args: "input": "<search>"

5. Browse Website: "browse_website", args: "url": "<url>", "question": "<what_you_want_to_find_on_website>"

20. Generate Image: "generate_image", args: "prompt": "<prompt>"

3. Provide the available commands; like the tools, these are declared under COMMANDS.

8. List GPT Agents: "list_agents", args: ""

9. Delete GPT Agent: "delete_agent", args: "key": "<key>"

10. Write to file: "write_to_file", args: "file": "<file>", "text": "<text>"

11. Read file: "read_file", args: "file": "<file>"

12. Append to file: "append_to_file", args: "file": "<file>", "text": "<text>"

13. Delete file: "delete_file", args: "file": "<file>"

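4. Enter the plan-execution loop. The framework calls the appropriate tool based on what the model returns. The basic logic for invoking a tool is: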

if command_name == "google":
    # Check if the Google API key is set and use the official search method
    # If the API key is not set or has only whitespaces, use the unofficial search method
    if cfg.google_api_key and (cfg.google_api_key.strip() if cfg.google_api_key else None):
        return google_official_search(arguments["input"])
    else:
        return google_search(arguments["input"])

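5. Prepare the context. This includes the constraints on the model during execution, the resources it can use, how its performance is evaluated, the execution history, and the response format. Part of the prompt is: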

CONSTRAINTS:



1. ~4000 word limit for short term memory. Your short term memory is short, so immediately save important information to files.

2. If you are unsure how you previously did something or want to recall past events, thinking about similar events will help you remember.

3. No user assistance

4. Exclusively use the commands listed in double quotes e.g. "command name"



RESOURCES:



1. Internet access for searches and information gathering.

2. Long Term memory management.

3. GPT-3.5 powered Agents for delegation of simple tasks.

4. File output.



PERFORMANCE EVALUATION:



1. Continuously review and analyze your actions to ensure you are performing to the best of your abilities.

2. Constructively self-criticize your big-picture behavior constantly.

3. Reflect on past decisions and strategies to refine your approach.

4. Every command has a cost, so be smart and efficient. Aim to complete tasks in the least number of steps.



You should only respond in JSON format as described below



RESPONSE FORMAT:

{

    "thoughts":

    {

        "text": "thought",

        "reasoning": "reasoning",

        "plan": "- short bulleted\n- list that conveys\n- long-term plan",

        "criticism": "constructive self-criticism",

        "speak": "thoughts summary to say to user"

    },

    "command": {

        "name": "command name",

        "args":{

            "arg name": "value"

        }

    }

}



Ensure the response can be parsed by Python json.loads

The conversation history is added to the Prompt:

memory_to_add = f"Assistant Reply: {assistant_reply} " \

                    f"\nResult: {result} " \

                    f"\nHuman Feedback: {user_input} "

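6. Combine the plan, tools, context, and other content into the final Prompt, submit it to the model, and wait for it to return the updated plan for the next step.

Finally, repeat steps 4-6, continually updating the plan until it is complete.
AutoGPT also provides its own front-end UI, which runs on the Web, Android, iOS, Windows, and Mac. As one of the leading Agent development frameworks today, AutoGPT can serve as a base that developers adapt and extend for their own vertical scenarios.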
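Since the prompt in step 5 requires a reply that json.loads can parse, the framework side of step 4 reduces to parsing that reply and dispatching on the command name. The sketch below illustrates the idea and is not AutoGPT's actual code: `parse_reply`, `execute`, and the stubbed `google` command are hypothetical.

```python
import json
from typing import Any, Callable, Dict, Tuple


def parse_reply(reply: str) -> Tuple[str, Dict[str, Any]]:
    """Extract the command name and arguments from a reply in the RESPONSE FORMAT."""
    try:
        data = json.loads(reply)
        command = data["command"]
        return command["name"], command.get("args", {})
    except (json.JSONDecodeError, KeyError, TypeError) as exc:
        # Surface malformed replies as a pseudo-command instead of crashing.
        return "error", {"message": f"invalid reply: {exc}"}


def execute(name: str, args: Dict[str, Any],
            commands: Dict[str, Callable[..., str]]) -> str:
    """Run the named command with its arguments, if it is registered."""
    if name not in commands:
        return f"Unknown command '{name}'"
    return commands[name](**args)


# Hypothetical command table with one stubbed command.
commands = {"google": lambda input: f"results for {input}"}

reply = json.dumps({
    "thoughts": {"text": "thought", "reasoning": "reasoning", "plan": "- step",
                 "criticism": "none", "speak": "summary"},
    "command": {"name": "google", "args": {"input": "AutoGPT"}},
})
name, args = parse_reply(reply)
result = execute(name, args, commands)
```

The command result would then be appended to the history (the "Result:" field in the memory snippet) before the next prompt is built.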

AgentGPT

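Compared with BabyAGI and AutoGPT, the main advantage of AgentGPT is how easy it is to try: there is an official trial site (https://agentgpt.reworkd.ai/zh/) where you set your own OpenAI key and start using it right away. Technically it differs little from the other frameworks; the core is still prompts and tool use. Here is one of its prompts: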

start_goal_prompt = PromptTemplate(

    template="""You are a task creation AI called AgentGPT. 

You answer in the "{language}" language. You have the following objective "{goal}". 

Return a list of search queries that would be required to answer the entirety of the objective. 

Limit the list to a maximum of 5 queries. Ensure the queries are as succinct as possible. 

For simple questions use a single query.



Return the response as a JSON array of strings. Examples:



query: "Who is considered the best NBA player in the current season?", answer: ["current NBA MVP candidates"]

query: "How does the Olympicpayroll brand currently stand in the market, and what are its prospects and strategies for expansion in NJ, NY, and PA?", answer: ["Olympicpayroll brand comprehensive analysis 2023", "customer reviews of Olympicpayroll.com", "Olympicpayroll market position analysis", "payroll industry trends forecast 2023-2025", "payroll services expansion strategies in NJ, NY, PA"]

query: "How can I create a function to add weight to edges in a digraph using {language}?", answer: ["algorithm to add weight to digraph edge in {language}"]

query: "What is the current weather in New York?", answer: ["current weather in New York"]

query: "5 + 5?", answer: ["Sum of 5 and 5"]

query: "What is a good homemade recipe for KFC-style chicken?", answer: ["KFC style chicken recipe at home"]

query: "What are the nutritional values of almond milk and soy milk?", answer: ["nutritional information of almond milk", "nutritional information of soy milk"]""",

    input_variables=["goal", "language"],

)

More of its prompts are at: https://github.com/reworkd/AgentGPT/blob/c2084e4faa46ecd91621be17574ef9532668cbfc/platform/reworkd_platform/web/api/agent/prompts.py
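The start_goal_prompt asks for a JSON array of at most five query strings, so the caller still has to validate what comes back. A minimal sketch of that validation follows; `parse_queries` is a hypothetical helper, not part of AgentGPT.

```python
import json
from typing import List

MAX_QUERIES = 5  # the prompt asks for at most 5 queries


def parse_queries(model_output: str) -> List[str]:
    """Parse a JSON array of strings, dropping blanks and enforcing the limit."""
    data = json.loads(model_output)
    if not isinstance(data, list) or not all(isinstance(q, str) for q in data):
        raise ValueError("expected a JSON array of strings")
    cleaned = [q.strip() for q in data if q.strip()]
    return cleaned[:MAX_QUERIES]


queries = parse_queries(
    '["nutritional information of almond milk", "nutritional information of soy milk"]'
)
```

Enforcing the limit in code as well as in the prompt guards against a model that ignores the instruction.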

Multi-Agent

https://arxiv.org/pdf/2304.03442.pdf

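In a complex system, the question of how Agents can cooperate to complete more complex, system-level work leads naturally to the concept of multiple Agents. Compared with single Agents, multi-Agent systems are at an earlier stage, are still mostly proofs of concept, and have a long way to go. The project that has attracted the most attention in the field is the Stanford "town" project (Generative Agents: Interactive Simulacra of Human Behavior), which inspired a large number of multi-Agent projects.

In this virtual town, each character is a separate agent that goes about its day according to a plan and its assigned role. When characters meet and talk, their conversations are stored in a memory database and are recalled and referenced when the next day's plans are made. Many interesting social phenomena emerge from this process.

Below we introduce several well-known multi-Agent projects.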

MetaGPT & ChatDev

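MetaGPT is an open-source multi-agent framework that simulates a software company and has Agents collaborate on development work. From a single-line software requirement it can generate APIs, user stories, data structures, competitive analysis, and more.

MetaGPT follows the standard process of human software development. Agents take on roles such as product manager, software engineer, and architect and collaborate to complete the development process.

Try it online: https://huggingface.co/spaces/deepwisdom/MetaGPT

Similarly, ChatDev is a multi-agent project initiated by Tsinghua University and other Chinese institutions. It simulates a software company operated collaboratively by multiple agents. After a human "user" specifies a concrete task, Agents in different roles interact and collaborate to produce a complete piece of software (including source code, an environment dependency manual, and a user manual).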

AutoGen

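AutoGen is a general-purpose multi-agent framework developed by Microsoft. It provides customizable, conversable agents that integrate LLMs, tools, and humans. By automating chat among multiple capable agents, it makes it easy to have them carry out tasks together, either autonomously or with human feedback, including tasks that require using tools through code. Unlike MetaGPT, AutoGen is general-purpose and can be used to build many different kinds of multi-Agent applications.

The framework has the following features:

Multi-agent conversation: AutoGen agents can talk to one another to complete tasks together.
Customization: AutoGen agents can be customized to the specific needs of an application, including which LLM to use, which kinds of human input are allowed, and which tools to use.
Human participation: AutoGen integrates seamlessly with humans, who can give agents input and feedback as needed.

AutoGen lets us coordinate and orchestrate multi-agent workflows. Compared with a traditional process-driven task flow, this message-driven flow is more flexible, which helps with handling complex processes and decoupling domain logic. It provides some general-purpose agent classes, plus a "Group Chat" context to facilitate cross-Agent collaboration. Some commonly used Agent classes are listed below.

User Proxy Agent: the user proxy can execute functions that we define in a Python script. This amounts to declaring functions to the framework so that it can call them; for example, the following registers functions for querying a wiki.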

user_proxy.register_function(
    function_map={
        "search_and_index_wikipedia": search_and_index_wikipedia,
        "query_wiki_index": query_wiki_index,
    }
)

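Assistant Agent: backed by the capabilities of the large model, it can complete specific tasks and play different AI roles. For example, the following sets up an analyst role.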

analyst = autogen.AssistantAgent(

    name="analyst",

    system_message='''

    As the Information Gatherer, you must start by using the search_and_index_wikipedia 
    function to gather relevant data about the user's query. Follow these steps:
    1. Upon receiving a query, immediately invoke the search_and_index_wikipedia 
    function to find and index Wikipedia pages related to the query. Do not proceed without completing this step.
    2. After successfully indexing, utilize the query_wiki_index to extract detailed 
    information from the indexed content.
    3. Present the indexed information and detailed findings to the Reporter, 
    ensuring they have a comprehensive dataset to draft a response.
    4. Conclude your part with "INFORMATION GATHERING COMPLETE" to signal that you have 
    finished collecting data and it is now ready for the Reporter to use in formulating the answer.
    Remember, you are responsible for information collection and indexing only. 
    The Reporter will rely on the accuracy and completeness of your findings to generate the final answer.
    ''',
    llm_config=llm_config,
)

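Group Chat Manager: supplies the initial query to the group chat and manages the interaction among all agents. Its coordination and task-dispatching abilities are backed by the large model. For example: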

# Define the group chat manager.

manager = autogen.GroupChatManager(

    groupchat=groupchat, 

    llm_config=llm_config, 

    system_message='''You should start the workflow by consulting the analyst, 

    then the reporter and finally the moderator. 

    If the analyst does not use both the search_and_index_wikipedia 
    and the query_wiki_index, you must request that it does.'''
    )
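The message-driven flow behind a group chat can be illustrated without the framework. The toy below is not AutoGen's API: agents are plain functions over a shared message history, `run_group_chat` is a hypothetical helper, and a fixed speaking order replaces the LLM-based speaker selection that the manager performs.

```python
from typing import Callable, Dict, List

Message = Dict[str, str]  # {"sender": ..., "content": ...}
Agent = Callable[[List[Message]], str]


def analyst(history: List[Message]) -> str:
    return "Facts gathered. INFORMATION GATHERING COMPLETE"


def reporter(history: List[Message]) -> str:
    return f"Report based on {len(history)} earlier messages. TERMINATE"


def run_group_chat(agents: Dict[str, Agent], order: List[str], query: str,
                   max_round: int = 12) -> List[Message]:
    """Pass the shared history to each agent in turn until one says TERMINATE."""
    history: List[Message] = [{"sender": "user", "content": query}]
    for round_index in range(max_round):
        # A fixed order replaces the manager's LLM-driven speaker selection.
        speaker = order[round_index % len(order)]
        content = agents[speaker](history)
        history.append({"sender": speaker, "content": content})
        if "TERMINATE" in content:
            break
    return history


chat = run_group_chat({"analyst": analyst, "reporter": reporter},
                      ["analyst", "reporter"], "Who wrote Hamlet?")
```

Because agents only read and append messages, adding a new role means registering one more function rather than rewriting a fixed pipeline, which is the decoupling benefit mentioned above.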

Here is a small example that uses AutoGen to implement a software development scenario similar to MetaGPT or ChatDev:

# %pip install pyautogen~=0.2.0b4

import autogen

config_list_gpt4 = autogen.config_list_from_json(
    "OAI_CONFIG_LIST",
    filter_dict={
        "model": ["gpt-4", "gpt-4-0314", "gpt4", "gpt-4-32k", "gpt-4-32k-0314", "gpt-4-32k-v0314"],
    },
)

llm_config = {"config_list": config_list_gpt4, "cache_seed": 42}
user_proxy = autogen.UserProxyAgent(
    name="User_proxy",
    system_message="A human admin.",
    code_execution_config={"last_n_messages": 2, "work_dir": "groupchat"},
    human_input_mode="TERMINATE",
)
coder = autogen.AssistantAgent(
    name="Coder",
    llm_config=llm_config,
)
pm = autogen.AssistantAgent(
    name="Product_manager",
    system_message="Creative in software product ideas.",
    llm_config=llm_config,
)
groupchat = autogen.GroupChat(agents=[user_proxy, coder, pm], messages=[], max_round=12)
manager = autogen.GroupChatManager(groupchat=groupchat, llm_config=llm_config)

user_proxy.initiate_chat(manager, message="Find a latest paper about gpt-4 on arxiv and find its potential applications in software.")
# type exit to terminate the chat

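More examples: https://microsoft.github.io/autogen/docs/Examples/AgentChat

Summary

Prompts are the language through which humans interact with large models. Having covered them in this chapter from concepts and theory through to applications, we should now have a solid understanding of their value. Agents are the pinnacle of prompt applications. In a sense, they push declarative programming a step further: traditionally this was possible only in certain narrow domains through a dedicated DSL such as SQL, whereas now we only need to describe the goal we want to achieve in natural language and the Agent system accomplishes it for us. At the same time, the transition from RAG/Copilot applications to Agent applications reflects a shift from "human-led, AI-assisted" to "AI-led, human-assisted".

In practice, for any LLM application, whether RAG or Agent, having the individual components (a large model, prompts, a vector database, and so on) is not enough. The ultimate goal is to integrate them effectively into a working application system, which is the key reason so many LLM application frameworks started out in orchestration and integration.

This article is reposted from the WeChat public account @AI工程化.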