
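In this example, we defined the LLM's behavior explicitly, and the LLM handled the task well. However, this approach will not work if we build something more high-level and vague, such as an LLM-powered analyst.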

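If you have ever worked as an analyst, or with one, for at least a day, you know that analysts get a wide range of questions, from basic ones (such as "How many customers were on our site yesterday?" or "Could you make a graph for tomorrow's board meeting?") to very high-level ones (such as "What are the main customer pain points?" or "What market should we launch next?"). Needless to say, describing every possible scenario is not feasible.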

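However, there is an approach that can help us: agents. The core idea of an agent is to use the LLM as a reasoning engine that chooses what to do next and when to return the final answer to the customer. This is close to how we work ourselves: we get a task, identify the tools it requires, use them, and give the final answer once we are ready.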

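The essential concept related to agents is tools. Tools are functions that an LLM can invoke to obtain missing information (for example, executing SQL, using a calculator, or calling a search engine). Tools are crucial because they take LLMs to the next level and let them interact with the world. In this article, we will focus primarily on tools implemented through OpenAI functions.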

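OpenAI offers a number of fine-tuned models that can work with functions:

You can pass the model a list of functions along with their descriptions;
If a function is relevant to the query, the model returns a function call: the function name and the input arguments to call it with.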

PS: For the latest models and functions supported by OpenAI, see https://platform.openai.com/docs/guides/function-calling.

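Below, two use cases illustrate how to use functions with LLMs:

Tagging and extraction: in these tasks, functions are used to enforce the output format of the model, so we get a structured function call instead of the usual output with content;
Tools and routing: this is the more exciting use case, which lets us build a custom agent.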

3. Use Case 1: Tagging and Extraction

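The only difference between tagging and extraction is whether the model extracts information that is present in the text or labels the text to provide new information (for example, identifying the language or sentiment).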

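Since we have decided to focus on descriptive analytics and reporting tasks, let's use this approach to structure incoming data requests and extract the following components: metrics, dimensions, filters, period, and desired output.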

This is an example of extraction, since we only need information that is present in the text.

A basic example with the OpenAI Chat Completion API

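First, we need to define the function. OpenAI expects the function description as JSON. This JSON is passed to the LLM, so it needs to carry all the context: what the function does and how to use it.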

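Here is an example of a function JSON. It specifies:

the name and description of the function;
the type and description of each argument;
the list of input parameters the function requires.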

extraction_functions = [

    {

        "name": "extract_information",

        "description": "extracts information",

        "parameters": {

            "type": "object",

            "properties": {

                "metric": {

                    "type": "string",

                    "description": "main metric we need to calculate, for example, 'number of users' or 'number of sessions'",

                },

                "filters": {

                    "type": "string",

                    "description": "filters to apply to the calculation (do not include filters on dates here)",

                },

                "dimensions": {

                    "type": "string",

                    "description": "parameters to split your metric by",

                },

                "period_start": {

                    "type": "string",

                    "description": "the start day of the period for a report",

                },

                "period_end": {

                    "type": "string",

                    "description": "the end day of the period for a report",

                },

                "output_type": {

                    "type": "string",

                    "description": "the desired output",

                    "enum": ["number", "visualisation"]

                }

            },

            "required": ["metric"],

        },

    }

]

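In this use case, there is no need to implement the function itself, because we will never execute it. We only use the function call as a way to get the LLM response in a structured form.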

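Now we can call the function through the standard OpenAI Chat Completion API. We pass the following to the API call:

the model: the latest ChatGPT 3.5 Turbo, which supports function calling;
the list of messages: one system message to set the context and one user request;
the list of functions we defined earlier.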

import openai



messages = [

    {

        "role": "system",

        "content": "Extract the relevant information from the provided request."

    },

    {

        "role": "user",

        "content": "How did number of iOS users change over time?"

    }

]



response = openai.ChatCompletion.create(

    model = "gpt-3.5-turbo-1106", 

    messages = messages,

    functions = extraction_functions

)



print(response)

As a result, we got the following JSON.

{

  "id": "chatcmpl-8TqGWvGAXZ7L43gYjPyxsWdOTD2n2",

  "object": "chat.completion",

  "created": 1702123112,

  "model": "gpt-3.5-turbo-1106",

  "choices": [

    {

      "index": 0,

      "message": {

        "role": "assistant",

        "content": null,

        "function_call": {

          "name": "extract_information",

          "arguments": "{\"metric\":\"number of users\",\"filters\":\"platform='iOS'\",\"dimensions\":\"date\",\"period_start\":\"2021-01-01\",\"period_end\":\"2021-12-31\",\"output_type\":\"visualisation\"}"

        }

      },

      "finish_reason": "function_call"

    }

  ],

  "usage": {

    "prompt_tokens": 159,

    "completion_tokens": 53,

    "total_tokens": 212

  },

  "system_fingerprint": "fp_eeff13170a"

}

The model returned a function call instead of a regular response: we can see that content is empty and finish_reason equals function_call. The response also contains the input arguments for the function call:

metric = "number of users",
filters = "platform='iOS'",
dimensions = "date",
period_start = "2021-01-01",
period_end = "2021-12-31",
output_type = "visualisation".

The model did a pretty good job. The only problem is that it made up the period out of nowhere. We can fix this by adding more explicit guidance to the system message, for example: "Extract the relevant information from the provided request. Extract ONLY the information presented in the initial request; don't add anything else. Return partial information if something is missing."
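Note that the arguments field arrives as a JSON-encoded string rather than a nested object, so it has to be decoded before use. Here is a minimal sketch that uses only the standard library and a trimmed copy of the assistant message from the response above; the helper name parse_function_call is hypothetical and not part of the OpenAI SDK.

```python
import json

# Trimmed copy of the assistant message from the response shown above.
message = {
    "role": "assistant",
    "content": None,
    "function_call": {
        "name": "extract_information",
        "arguments": "{\"metric\":\"number of users\",\"filters\":\"platform='iOS'\",\"dimensions\":\"date\",\"period_start\":\"2021-01-01\",\"period_end\":\"2021-12-31\",\"output_type\":\"visualisation\"}",
    },
}


def parse_function_call(message):
    """Return (name, arguments dict), or None if the model answered in plain text.

    Hypothetical helper for illustration only.
    """
    call = message.get("function_call")
    if call is None:
        return None
    # The model can emit malformed JSON, in which case json.loads raises.
    return call["name"], json.loads(call["arguments"])


name, args = parse_function_call(message)
print(name)
print(args["metric"], args["period_start"], args["period_end"])
```

In a real application it is probably worth wrapping the json.loads call in error handling, since nothing guarantees the model produces valid JSON.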

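By default, the model decides on its own whether to use functions (function_call='auto'). We can also require it to return a specific function call every time, or to not use functions at all.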

# always calling extract_information function

response = openai.ChatCompletion.create(

    model = "gpt-3.5-turbo-1106",

    messages = messages,

    functions = extraction_functions,

    function_call = {"name": "extract_information"}

)



# no function calls

response = openai.ChatCompletion.create(

    model = "gpt-3.5-turbo-1106",

    messages = messages,

    functions = extraction_functions,

    function_call = "none"

)

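We now have our first working application that uses LLM functions. However, describing functions in JSON is not very convenient. Let's discuss how to improve on that.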

Defining functions with Pydantic

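To define functions more conveniently, we can use Pydantic. Pydantic is the most popular Python library for data validation.

We have already used Pydantic to define a LangChain output parser.

First, we need to create a class that inherits from BaseModel and define all the fields (the arguments of our function).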

from pydantic import BaseModel, Field

from typing import Optional



class RequestStructure(BaseModel):

  """extracts information"""

  metric: str = Field(description = "main metric we need to calculate, for example, 'number of users' or 'number of sessions'")

  filters: Optional[str] = Field(description = "filters to apply to the calculation (do not include filters on dates here)")

  dimensions: Optional[str] = Field(description = "parameters to split your metric by")

  period_start: Optional[str] = Field(description = "the start day of the period for a report")

  period_end: Optional[str] = Field(description = "the end day of the period for a report")

  output_type: Optional[str] = Field(description = "the desired output", enum = ["number", "visualisation"])

Then, we can use LangChain to convert the Pydantic class into an OpenAI function.

from langchain.utils.openai_functions import convert_pydantic_to_openai_function

extract_info_function = convert_pydantic_to_openai_function(RequestStructure, 

    name = 'extract_information')

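LangChain validates the class we provide. For example, it makes sure that a function description is specified, since the LLM needs it to be able to use the tool.

As a result, we get the same JSON to pass to the LLM, but now we express it as a Pydantic class.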

{'name': 'extract_information',

 'description': 'extracts information',

 'parameters': {'title': 'RequestStructure',

  'description': 'extracts information',

  'type': 'object',

  'properties': {'metric': {'title': 'Metric',

    'description': "main metric we need to calculate, for example, 'number of users' or 'number of sessions'",

    'type': 'string'},

   'filters': {'title': 'Filters',

    'description': 'filters to apply to the calculation (do not include filters on dates here)',

    'type': 'string'},

   'dimensions': {'title': 'Dimensions',

    'description': 'parameters to split your metric by',

    'type': 'string'},

   'period_start': {'title': 'Period Start',

    'description': 'the start day of the period for a report',

    'type': 'string'},

   'period_end': {'title': 'Period End',

    'description': 'the end day of the period for a report',

    'type': 'string'},

   'output_type': {'title': 'Output Type',

    'description': 'the desired output',

    'enum': ['number', 'visualisation'],

    'type': 'string'}},

  'required': ['metric']}}

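Now we can use it when calling OpenAI. Let's switch from the OpenAI API to LangChain to make our API calls more modular.

Defining a LangChain chain

Let's define a chain that extracts the required information from a request. Our chain is simple: it consists of an OpenAI model and a prompt with a single variable, request (the user message).

We also use the bind function to pass the functions argument to the model. The bind function lets us specify constant arguments for the model that are not part of the input (for example, functions or temperature).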

from langchain.prompts import ChatPromptTemplate

from langchain.chat_models import ChatOpenAI



model = ChatOpenAI(temperature=0.1, model = 'gpt-3.5-turbo-1106')\

  .bind(functions = [extract_info_function])



prompt = ChatPromptTemplate.from_messages([

    ("system", "Extract the relevant information from the provided request. \

            Extract ONLY the information presented in the initial request. \

            Don't add anything else. \

            Return partial information if something is missing."),

    ("human", "{request}")

])



extraction_chain = prompt | model

Now it's time to try our function. We need to use the invoke method and pass a request.

extraction_chain.invoke({'request': "How many customers visited our site on iOS in April 2023 from different countries?"})

In the output, we get an AIMessage with no content but with a function call.

AIMessage(

  content='', 

  additional_kwargs={

    'function_call': {

       'name': 'extract_information', 

       'arguments': '''{

         "metric":"number of customers", "filters":"device = 'iOS'",

         "dimensions":"country", "period_start":"2023-04-01",

         "period_end":"2023-04-30", "output_type":"number"}

        '''}

  }

)
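Because the arguments are generated by the model, it can be worth checking them against the function spec before relying on them. The following is a minimal sketch of such a check using only the standard library: PARAMETERS is a simplified copy of the spec defined earlier (descriptions omitted), and validate_arguments is a hypothetical helper, not a LangChain or OpenAI function. A full JSON Schema validator would cover more cases.

```python
import json

# Simplified copy of the parameter spec defined earlier (descriptions omitted).
PARAMETERS = {
    "properties": {
        "metric": {"type": "string"},
        "filters": {"type": "string"},
        "dimensions": {"type": "string"},
        "period_start": {"type": "string"},
        "period_end": {"type": "string"},
        "output_type": {"type": "string", "enum": ["number", "visualisation"]},
    },
    "required": ["metric"],
}


def validate_arguments(raw_arguments, parameters):
    """Return a list of problems found in the model's JSON arguments (empty if none)."""
    try:
        args = json.loads(raw_arguments)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc}"]
    if not isinstance(args, dict):
        return ["arguments must be a JSON object"]
    problems = []
    for name in parameters["required"]:
        if name not in args:
            problems.append(f"missing required field: {name}")
    for name, value in args.items():
        spec = parameters["properties"].get(name)
        if spec is None:
            problems.append(f"unexpected field: {name}")
        elif "enum" in spec and value not in spec["enum"]:
            problems.append(f"{name} must be one of {spec['enum']}")
    return problems


good = '{"metric": "number of customers", "dimensions": "country", "output_type": "number"}'
bad = '{"dimensions": "country", "output_type": "chart"}'
print(validate_arguments(good, PARAMETERS))
print(validate_arguments(bad, PARAMETERS))
```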

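So, we have learned how to use OpenAI functions in LangChain to get structured output. Now let's move on to the more interesting use case: tools and routing.

4. Use Case 2: Tools and Routing

It's time to use tools and give our model external capabilities. In this approach, models act as reasoning engines that decide which tools to use and when (this is called routing).

LangChain has a concept of tools: interfaces that agents can use to interact with the world. Tools can be functions, LangChain chains, or even other agents.

We can easily convert tools to OpenAI functions with format_tool_to_openai_function and keep passing the functions argument to the LLM.

Defining a custom tool

Let's teach our LLM-powered analyst to calculate the difference between two metrics. We know that LLMs can make mistakes in math, so we would like the model to use a calculator rather than do the computation itself.

To define a tool, we need to create a function and apply the @tool decorator.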

from langchain.agents import tool



@tool

def percentage_difference(metric1: float, metric2: float) -> float:

    """Calculates the percentage difference between metrics"""

    return (metric2 - metric1)/metric1*100

Now this function has name and description attributes that will be passed to the LLM.

print(percentage_difference.name)

# percentage_difference



print(percentage_difference.args)

# {'metric1': {'title': 'Metric1', 'type': 'number'},

# 'metric2': {'title': 'Metric2', 'type': 'number'}}



print(percentage_difference.description)

# 'percentage_difference(metric1: float, metric2: float) -> float - Calculates the percentage difference between metrics'

These attributes will be used to create the OpenAI function specification. Let's convert our tool to an OpenAI function.

from langchain.tools.render import format_tool_to_openai_function

print(format_tool_to_openai_function(percentage_difference))

As a result, we got the following JSON. It outlines the structure, but the field descriptions are missing.

{'name': 'percentage_difference',

 'description': 'percentage_difference(metric1: float, metric2: float) -> float - Calculates the percentage difference between metrics',

 'parameters': {'title': 'percentage_differenceSchemaSchema',

  'type': 'object',

  'properties': {'metric1': {'title': 'Metric1', 'type': 'number'},

   'metric2': {'title': 'Metric2', 'type': 'number'}},

  'required': ['metric1', 'metric2']}

}
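One plausible way to see why the descriptions are missing: a bare function signature carries only parameter names and type hints, so there is nothing to derive a per-field description from. The sketch below illustrates the idea with the standard inspect module. It is a simplified illustration, not LangChain's actual implementation, and function_to_spec and PY_TO_JSON are hypothetical names.

```python
import inspect

# Hypothetical mapping from Python annotations to JSON Schema type names.
PY_TO_JSON = {float: "number", int: "integer", str: "string", bool: "boolean"}


def function_to_spec(func):
    """Build an OpenAI-style function spec from a plain Python function."""
    signature = inspect.signature(func)
    properties = {}
    required = []
    for name, param in signature.parameters.items():
        # Only the type is known here; a description would have to come from elsewhere.
        properties[name] = {"type": PY_TO_JSON.get(param.annotation, "string")}
        # Parameters without a default value are treated as required.
        if param.default is inspect.Parameter.empty:
            required.append(name)
    return {
        "name": func.__name__,
        "description": inspect.getdoc(func) or "",
        "parameters": {
            "type": "object",
            "properties": properties,
            "required": required,
        },
    }


def percentage_difference(metric1: float, metric2: float) -> float:
    """Calculates the percentage difference between metrics"""
    # Plain, undecorated copy of the tool's function, used only for this sketch.
    return (metric2 - metric1) / metric1 * 100


spec = function_to_spec(percentage_difference)
print(spec)
```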

We can use Pydantic to specify a schema for the arguments.

class Metrics(BaseModel):

    metric1: float = Field(description="Base metric value to calculate the difference")

    metric2: float = Field(description="New metric value that we compare with the baseline")



@tool(args_schema=Metrics)

def percentage_difference(metric1: float, metric2: float) -> float:

    """Calculates the percentage difference between metrics"""

    return (metric2 - metric1)/metric1*100

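Now, if we convert the new version to an OpenAI function specification, it will include the argument descriptions. This is much better, since we can share all the needed context with the model.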

{'name': 'percentage_difference',

 'description': 'percentage_difference(metric1: float, metric2: float) -> float - Calculates the percentage difference between metrics',

 'parameters': {'title': 'Metrics',

  'type': 'object',

  'properties': {'metric1': {'title': 'Metric1',

    'description': 'Base metric value to calculate the difference',

    'type': 'number'},

   'metric2': {'title': 'Metric2',

    'description': 'New metric value that we compare with the baseline',

    'type': 'number'}},

  'required': ['metric1', 'metric2']}}

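So, we have defined a tool that the LLM will be able to use. Let's try it in practice.

Using a tool in practice

Let's define a chain and pass our tool to the model as a function. Then we can test it on a user request.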

model = ChatOpenAI(temperature=0.1, model = 'gpt-3.5-turbo-1106')\

  .bind(functions = [format_tool_to_openai_function(percentage_difference)])



prompt = ChatPromptTemplate.from_messages([

    ("system", "You are a product analyst willing to help your product team. You are very strict to the point and accurate. You use only facts, not inventing information."),

    ("user", "{request}")

])



analyst_chain = prompt | model

analyst_chain.invoke({'request': "In April we had 100 users and in May only 95. What is difference in percent?"})

We got a function call with the correct arguments, so it works.

AIMessage(content='', additional_kwargs={

    'function_call': {

      'name': 'percentage_difference', 

      'arguments': '{"metric1":100,"metric2":95}'}

  }

)

To handle the output more conveniently, we can use OpenAIFunctionsAgentOutputParser. Let's add it to our chain.

from langchain.agents.output_parsers import OpenAIFunctionsAgentOutputParser

analyst_chain = prompt | model | OpenAIFunctionsAgentOutputParser()

result = analyst_chain.invoke({'request': "There were 100 users in April and 110 users in May. How did the number of users changed?"})

Now we get the output in a more structured form, and we can easily retrieve the tool arguments as result.tool_input.

AgentActionMessageLog(

   tool='percentage_difference', 

   tool_input={'metric1': 100, 'metric2': 110}, 

   log="\nInvoking: percentage_difference with {'metric1': 100, 'metric2': 110}\n\n\n", 
   message_log=[AIMessage(content='', additional_kwargs={'function_call': {'name': 'percentage_difference', 'arguments': '{"metric1":100,"metric2":110}'}})]
)

So, we can execute the function as the LLM requested, like this.

observation = percentage_difference(result.tool_input)

print(observation)

# 10.0

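If we want to get the final answer from the model, we need to pass the function execution result back. To do that, we need to define a message list that passes the observations to the model.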

from langchain.prompts import MessagesPlaceholder



model = ChatOpenAI(temperature=0.1, model = 'gpt-3.5-turbo-1106')\

  .bind(functions = [format_tool_to_openai_function(percentage_difference)])



prompt = ChatPromptTemplate.from_messages([

    ("system", "You are a product analyst willing to help your product team. You are very strict to the point and accurate. You use only facts, not inventing information."),

    ("user", "{request}"),

    MessagesPlaceholder(variable_name="observations")

])



analyst_chain = prompt | model | OpenAIFunctionsAgentOutputParser()

result1 = analyst_chain.invoke({

    'request': "There were 100 users in April and 110 users in May. How did the number of users changed?",

    "observations": []

})



observation = percentage_difference(result1.tool_input)

print(observation)

# 10.0

Then, we need to add the observation to our observations variable. We can use the format_to_openai_functions function to format the results in the way the model expects.

from langchain.agents.format_scratchpad import format_to_openai_functions

format_to_openai_functions([(result1, observation), ])

As a result, we get messages that the LLM can understand.

[AIMessage(content='', additional_kwargs={'function_call': {'name': 'percentage_difference', 

                                           'arguments': '{"metric1":100,"metric2":110}'}}),

 FunctionMessage(content='10.0', name='percentage_difference')]

Let's invoke our chain once more, passing the function execution result as an observation.

result2 = analyst_chain.invoke({

    'request': "There were 100 users in April and 110 users in May. How did the number of users changed?",

    "observations": format_to_openai_functions([(result1, observation)])

})

Now we got the final result from the model, and it sounds reasonable.

AgentFinish(

  return_values={'output': 'The number of users increased by 10%.'}, 

  log='The number of users increased by 10%.'

)

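PS: If we were using the plain OpenAI Chat Completion API, we could add another message with role=tool (or role=function when using the legacy functions parameter, as this article does). A detailed example is available at https://platform.openai.com/docs/guides/function-calling.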
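As a rough sketch of what that message list could look like with the plain API and the legacy functions parameter: the assistant's function call is echoed back, followed by a message with role function that carries the result as a string. With the newer tools parameter, the result message would instead use role tool together with a tool_call_id. No API call is made here, and build_followup_messages is a hypothetical helper.

```python
import json


def build_followup_messages(messages, function_call, result):
    """Return a new message list with the function call and its result appended."""
    return messages + [
        # Echo back the assistant message that requested the call.
        {"role": "assistant", "content": None, "function_call": function_call},
        # Report the result; the content has to be a string.
        {"role": "function", "name": function_call["name"], "content": json.dumps(result)},
    ]


messages = [
    {"role": "system", "content": "You are a product analyst willing to help your product team."},
    {"role": "user", "content": "There were 100 users in April and 110 users in May. How did the number of users changed?"},
]
function_call = {"name": "percentage_difference", "arguments": '{"metric1":100,"metric2":110}'}

followup = build_followup_messages(messages, function_call, 10.0)
for m in followup:
    print(m["role"], m.get("content"))
```

The resulting list would then be sent as the messages argument of the next API call, so that the model can phrase the final answer.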

If we switch on debug mode, we can see the exact prompt that was passed to the OpenAI API.

System: You are a product analyst willing to help your product team. You are very strict to the point and accurate. You use only facts, not inventing information.

Human: There were 100 users in April and 110 users in May. How did the number of users changed?

AI: {'name': 'percentage_difference', 'arguments': '{"metric1":100,"metric2":110}'}

Function: 10.0

To switch on LangChain debugging, execute the following code:

import langchain

langchain.debug = True

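We have tried working with a single tool. Now let's extend this to a toolkit and see how the LLM handles it.

Routing: using multiple tools

Let's add a couple more tools to our analyst's toolkit:

get monthly active users;
use Wikipedia.

First, let's define a dummy function that computes the audience filtered by month and city. We will again use Pydantic to specify the input arguments of the function.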

import datetime

import random



class Filters(BaseModel):

    month: str = Field(description="Month of customer's activity in the format %Y-%m-%d")

    city: Optional[str] = Field(description="City of residence for customers (by default no filter)", 

                    enum = ["London", "Berlin", "Amsterdam", "Paris"])



@tool(args_schema=Filters)

def get_monthly_active_users(month: str, city: str = None) -> int:

    """Returns number of active customers for the specified month"""

    dt = datetime.datetime.strptime(month, '%Y-%m-%d')

    total = dt.year + 10*dt.month

    if city is None:

        return total

    else:

        return int(total*random.random())

Then, let's use the wikipedia Python package to query Wikipedia.

import wikipedia



class Wikipedia(BaseModel):

    term: str = Field(description="Term to search for")



@tool(args_schema=Wikipedia)

def get_summary(term: str) -> str:

    """Returns basic knowledge about the given term provided by Wikipedia"""

    return wikipedia.summary(term)

Let's define a dictionary with all the functions our model now knows about. This dictionary will be used for routing later.

toolkit = {

    'percentage_difference': percentage_difference,

    'get_monthly_active_users': get_monthly_active_users,

    'get_summary': get_summary

}



analyst_functions = [format_tool_to_openai_function(f) 

  for f in toolkit.values()]
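With the tools collected in a dictionary, routing a model's request comes down to a lookup by name. The sketch below illustrates the idea with plain Python functions instead of LangChain tools, so it runs without any dependencies. The names dispatch and plain_toolkit, and the stubbed get_summary, are hypothetical and used only for illustration.

```python
import json


def percentage_difference(metric1: float, metric2: float) -> float:
    """Plain-function copy of the tool defined earlier."""
    return (metric2 - metric1) / metric1 * 100


def get_summary(term: str) -> str:
    """Stub standing in for the Wikipedia lookup (no network access)."""
    return f"summary for {term}"


plain_toolkit = {
    "percentage_difference": percentage_difference,
    "get_summary": get_summary,
}


def dispatch(function_call, toolkit):
    """Look up the requested function by name and call it with the decoded arguments."""
    name = function_call["name"]
    if name not in toolkit:
        # Models occasionally request a tool that does not exist.
        raise KeyError(f"Unknown tool requested: {name}")
    kwargs = json.loads(function_call["arguments"])
    return toolkit[name](**kwargs)


call = {"name": "percentage_difference", "arguments": '{"metric1": 100, "metric2": 110}'}
print(dispatch(call, plain_toolkit))
```

Failing loudly on an unknown tool name is one design choice; another would be to return the error text to the model as an observation so that it can retry.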

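We made a couple of changes to the previous setup:

We tweaked the system prompt a bit to push the LLM to consult Wikipedia when it needs some basic knowledge.
We changed the model to GPT-4 because it is better suited to tasks that require reasoning.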

from langchain.prompts import MessagesPlaceholder



model = ChatOpenAI(temperature=0.1, model = 'gpt-4-1106-preview')\

  .bind(functions = analyst_functions)



prompt = ChatPromptTemplate.from_messages([

    ("system", "You are a product analyst willing to help your product team. You are very strict to the point and accurate. \

        You use only information provided in the initial request. \

        If you need to determine some information i.e. what is the name of the capital, you can use Wikipedia."),

    ("user", "{request}"),

    MessagesPlaceholder(variable_name="observations")

])



analyst_chain = prompt | model | OpenAIFunctionsAgentOutputParser()

We can invoke our chain with all the functions. Let's start with a fairly straightforward query.

result1 = analyst_chain.invoke({

    'request': "How many users were in April 2023 from Berlin?",

    "observations": []

})

print(result1)

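In the result, we got a function call for get_monthly_active_users with the input arguments {'month': '2023-04-01', 'city': 'Berlin'}, which looks correct. The model was able to find the right tool and solve the task.

Let's try to make the task a bit more complex.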

result1 = analyst_chain.invoke({

    'request': "How did the number of users from the capital of Germany\

        change between April and May 2023?",

    "observations": []

})

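Let's pause for a minute and think about how we would like the model to reason. Clearly, there is not enough information for the model to answer straight away, so it needs to make a series of function calls:

call Wikipedia to get the capital of Germany;
call the get_monthly_active_users function twice to get the MAU for April and May;
call percentage_difference to calculate the difference between the metrics.

That looks fairly complex. Let's see whether ChatGPT can handle this question.

For the first call, the LLM returned a function call to Wikipedia with the following arguments: {'term': 'capital of Germany'}.

Let's see what steps come next.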

observation1 = toolkit[result1.tool](result1.tool_input)

print(observation1)



# The capital of Germany is the  city state of Berlin. It is the seat of 

# the President of Germany, whose official residence is Schloss Bellevue. 

# The Bundesrat ("federal council") is the representation of the Federal States 

# (Bundesländer) of Germany and has its seat at the former Prussian Herrenhaus 

# (House of Lords). Though most of the ministries are seated in Berlin, 

# some of them, as well as some minor departments, are seated in Bonn, 

# the former capital of West Germany.

# Although Berlin is officially the capital of the Federal Republic of Germany,

# 8,000 out of the 18,000 total officials employed at the federal bureaucracy 

# still work in Bonn, about 600 km (370 mi) away from Berlin.



# source: https://en.wikipedia.org/wiki/Capital_of_Germany 



result2 = analyst_chain.invoke({

    'request': "How did the number of users from the capital of Germany change between April and May 2023?",

    "observations": format_to_openai_functions([(result1, observation1)])

})

The model wants to execute get_monthly_active_users with the arguments {'month': '2023-04-01', 'city': 'Berlin'}. Let's do that and return the information to the model once again.

observation2 = toolkit[result2.tool](result2.tool_input)

print(observation2)

# 168



result3 = analyst_chain.invoke({

    'request': "How did the number of users from the capital of Germany change between April and May 2023?",

    "observations": format_to_openai_functions([(result1, observation1), (result2, observation2)])

})

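Then, the model requests another call to get_monthly_active_users, this time with the arguments {'month': '2023-05-01', 'city': 'Berlin'}. So far, it is doing an excellent job. Let's follow its logic.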

observation3 = toolkit[result3.tool](result3.tool_input)

print(observation3)

# 1046



result4 = analyst_chain.invoke({

    'request': "How did the number of users from the capital of Germany change between April and May 2023?",

    "observations": format_to_openai_functions(

      [(result1, observation1), (result2, observation2), 

      (result3, observation3)])

})

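The next result is a call to percentage_difference with the arguments {'metric1': 168, 'metric2': 1046}. Let's calculate the observation and invoke our chain one more time. Hopefully, this will be the last step.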

observation4 = toolkit[result4.tool](result4.tool_input)

print(observation4)



# 523.27



result5 = analyst_chain.invoke({

    'request': "How did the number of users from the capital of Germany change between April and May 2023?",

    "observations": format_to_openai_functions(

      [(result1, observation1), (result2, observation2), 

      (result3, observation3), (result4, observation4)])

})

Finally, we got the following response from the model: The number of users from Berlin, the capital of Germany, increased by approximately 523.27% between April and May 2023.
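The manual steps above all follow the same pattern: ask the model what to do, run the requested tool, append the observation, and repeat until a final answer arrives. That pattern can be sketched as a small loop. To keep the example self-contained, the LLM is replaced by a scripted stand-in (scripted_decide) and the tools by stubs with made-up fixed data, so the numbers differ from the run above. All names here are hypothetical, and a real agent would call the chain instead of the script.

```python
def run_agent(decide, toolkit, request, max_steps=10):
    """Call tools until `decide` returns a final answer or the step budget runs out."""
    observations = []  # (action, result) pairs, like the observations list above
    for _ in range(max_steps):
        action = decide(request, observations)
        if action["type"] == "finish":
            return action["output"], observations
        result = toolkit[action["tool"]](**action["tool_input"])
        observations.append((action, result))
    raise RuntimeError("No final answer within max_steps")


# Hypothetical fixed data so the example is deterministic.
FAKE_MAU = {("2023-04-01", "Berlin"): 100, ("2023-05-01", "Berlin"): 110}

stub_toolkit = {
    "get_summary": lambda term: "The capital of Germany is Berlin.",
    "get_monthly_active_users": lambda month, city=None: FAKE_MAU[(month, city)],
    "percentage_difference": lambda metric1, metric2: (metric2 - metric1) / metric1 * 100,
}


def scripted_decide(request, observations):
    """Stand-in for the LLM: replays the sequence of calls from the walkthrough."""
    calls = [
        ("get_summary", {"term": "capital of Germany"}),
        ("get_monthly_active_users", {"month": "2023-04-01", "city": "Berlin"}),
        ("get_monthly_active_users", {"month": "2023-05-01", "city": "Berlin"}),
    ]
    step = len(observations)
    if step < len(calls):
        tool, tool_input = calls[step]
    elif step == len(calls):
        # Feed the two MAU observations into the calculator tool.
        tool = "percentage_difference"
        tool_input = {"metric1": observations[1][1], "metric2": observations[2][1]}
    else:
        return {"type": "finish", "output": f"Users changed by {observations[3][1]:.2f}%."}
    return {"type": "call", "tool": tool, "tool_input": tool_input}


answer, trace = run_agent(scripted_decide, stub_toolkit, "How did the number of users change?")
print(answer)
print([action["tool"] for action, _ in trace])
```

The max_steps cap is there because a real model is not guaranteed to converge on a final answer.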

Here is the complete scheme of the LLM calls for this question.