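Preface:

How we talk to a large language model matters a great deal. The model's persona and its task are almost always defined through the prompt. Roughly 90% of the models we use day to day are chat models, so this post covers only the commonly used ChatPromptTemplate; for non-chat models, a passing familiarity is enough.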

1. ChatPromptTemplate.from_template

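Purpose: turns a piece of plain text into a template.

Characteristics: the simplest and quickest option. It does not distinguish roles and treats the text as a user (Human) message by default.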

Here the output is a list. Its single element is a LangChain object, specifically an instance of the HumanMessage class.

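Detailed breakdown:

1. Library: it comes from langchain_core.messages.
2. Purpose: it represents a message the user sends to the AI. Within a conversation flow, LangChain wraps every turn in a specific message object (HumanMessage for the user, AIMessage for the AI, SystemMessage for system instructions).
3. Data structure:
   - content: the text of the message (here, "讲一个关于小狗的笑话", i.e. "Tell a joke about a puppy").
   - additional_kwargs: a dictionary for extra parameters (a reserved slot).
   - response_metadata: metadata associated with the message.

You may wonder whether these odd-looking fields are also sent to the AI. They are not: in this example additional_kwargs and response_metadata stay local, and only the role and content are sent.

This comes up again later.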

2. ChatPromptTemplate.from_messages

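Purpose: accepts a list of messages and supports distinct roles (system / human / ai).

Characteristics: essential for chatbots; lets you add a persona and conversation history.

2.1 Most common form (list of tuples)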

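from langchain_core.prompts import ChatPromptTemplate

prompt = ChatPromptTemplate.from_messages([
    ("system", "你是专业翻译官，翻译成{language}"),
    ("human", "翻译内容：{text}")
])
# Format with format_messages, passing keyword arguments as in a function call
print(prompt.format_messages(
    language='英语',
    text='香樟'
))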

==output:==

[
    SystemMessage(
        content='你是专业翻译官...英语',  # the core content
        additional_kwargs={},  # local reserved field, not sent
        response_metadata={}   # local reserved field, not sent
    ),
    HumanMessage(
        content='翻译内容...',  # the core content
        additional_kwargs={},  # local reserved field, not sent
        response_metadata={}   # local reserved field, not sent
    )
]

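3. The raw request body actually sent to the AI (using OpenAI GPT as an example)

When you call the model with prompt | model, LangChain automatically converts these message objects into the standard JSON format the OpenAI API requires, extracting only the content and the role. What actually goes out looks like this: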

{
  "model": "gpt-3.5-turbo",
  "messages": [
    {
      "role": "system",
      "content": "你是..."
    },
    {
      "role": "user",
      "content": "翻译..."
    }
  ]
}

Notice:

There are no additional_kwargs or response_metadata fields at all.
There is only role and content, which the AI can understand.
This is the input the AI server actually receives.

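Output parsers

This is the message LangChain returns to us.

Only content is what the AI actually said.
Everything else is wrapper, debugging information, and billing information added by LangChain; the AI never sees it, and you can usually ignore it.
The AI's actual answer is just this one sentence:

content=“香樟”翻译成英语是 camphor tree。 (In English: "香樟" translates as camphor tree.)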

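Let's start with one parser. StrOutputParser is the most basic and most commonly used output parser in LangChain, and its job is simple:
it converts the LLM's output object (AIMessage) into a plain string.

Normally we have to read .content to get the string; with StrOutputParser added, we get the string directly. It is that simple.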

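Before studying this I asked Kimi, which told me that this is a legacy parser that is no longer really used.
Output parsers are a key LangChain component for converting an LLM's text output into structured data. With LangChain 1.x, the way output parsers are used has changed significantly.

Core change: from parsers to native structured output

Important: LangChain 1.x recommends with_structured_output, or the response_format parameter of create_agent, over traditional output parsers.

Modern LLMs (such as GPT-4, Claude, and Gemini) support structured output natively. Traditional parsers are used only in the following cases:

An older model does not support native structured output
Extra post-processing or validation logic is needed

1. Modern recommended approach: `with_structured_output`

This is currently the approach LangChain recommends most. It binds a Pydantic model directly: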

from pydantic import BaseModel, Field

# 1. Define the data structure you expect
class WeatherInfo(BaseModel):
    city: str = Field(description="城市名称")
    temperature: int = Field(description="当前气温（摄氏度）")
    condition: str = Field(description="天气状况，如：晴、多云、雨")

# 2. `llm` must be a chat model initialized beforehand (its setup is not shown here)

# 3. Bind the structured-output schema
# NVIDIA models generally support the tool-calling protocol well, so the model is bound directly
structured_llm = llm.with_structured_output(WeatherInfo)

# 4. Call the model
result = structured_llm.invoke("上海现在天气怎么样，气温是25度，天气晴朗")

# 5. Inspect the result
print(f"解析后的对象类型: {type(result)}")
print(f"城市: {result.city}")
print(f"气温: {result.temperature}")
print(f"状况: {result.condition}")

This is the output:

城市: 上海
气温: 25
状况: 晴朗
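To see what kind of object comes back without calling a model, the sketch below validates a hand-written JSON string against the same schema using Pydantic directly. The JSON string is an assumed stand-in for a model response, and Pydantic v2 is assumed. This only illustrates the end result; it is not how with_structured_output works internally.

```python
from pydantic import BaseModel, Field, ValidationError


class WeatherInfo(BaseModel):
    city: str = Field(description="城市名称")
    temperature: int = Field(description="当前气温（摄氏度）")
    condition: str = Field(description="天气状况，如：晴、多云、雨")


# Assumed stand-in for structured model output.
raw = '{"city": "上海", "temperature": 25, "condition": "晴朗"}'
result = WeatherInfo.model_validate_json(raw)
print(type(result).__name__, result.city, result.temperature, result.condition)

# A payload that breaks the schema is rejected instead of passed through.
try:
    WeatherInfo.model_validate_json('{"city": "上海", "temperature": "hot"}')
    rejected = False
except ValidationError:
    rejected = True
print("invalid payload rejected:", rejected)
```

The point of a typed result is that fields arrive with the declared types (temperature is an int here), and malformed data fails loudly.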

~~That said, to handle legacy cases it is still worth learning the traditional parsers.~~

2. Traditional output parsers

2.1 First, define a chat prompt

from langchain_core.prompts import ChatPromptTemplate

prompt = ChatPromptTemplate.from_messages([
    ("system", """你是专业翻译官，翻译成{language}。
    {format_instructions}"""),  # placeholder for the format instructions
    ("human", "翻译内容：{text}")
])

2.2 Define the output parser

from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import PydanticOutputParser
from pydantic import BaseModel, Field

# Define the structure you want the model to output
class Output(BaseModel):
    translation: str = Field(description="翻译后的文本")
    confidence: float = Field(description="翻译置信度")
    notes: str = Field(description="文化背景说明")

# Create the parser
parser = PydanticOutputParser(pydantic_object=Output)

print(parser.get_format_instructions())

Let's look at the format instructions this parser generates:

The output should be formatted as a JSON instance that conforms to the JSON schema below.
As an example, for the schema {"properties": {"foo": {"title": "Foo", "description": "a list of strings", "type": "array", "items": {"type": "string"}}}, "required": ["foo"]}
the object {"foo": ["bar", "baz"]} is a well-formatted instance of the schema. The object {"properties": {"foo": ["bar", "baz"]}} is not well-formatted.
Here is the output schema:
{"properties": {"translation": {"description": "翻译后的文本", "title": "Translation", "type": "string"}, "confidence": {"description": "翻译置信度", "title": "Confidence", "type": "number"}, "notes": {"description": "文化背景说明", "title": "Notes", "type": "string"}}, "required": ["translation", "confidence", "notes"]}

Roughly speaking, it tells the model to respond in JSON and embeds the schema of the data structure we defined.

Test run


temp = prompt.format_messages(
    language='英语',
    text='香樟',
    format_instructions=parser.get_format_instructions()  # pass it in directly
)
print(llm.invoke(temp).content)

This is the output; as you can see, it is valid JSON:

{"translation":"camphor tree","confidence":0.95,"notes":"在中国语境中，“香樟”常指具有芳香气味、广泛用于行道与园林的樟树（Cinnamomum camphora），而非单纯的‘camphor’提取物，故译为‘camphor tree’以兼顾植物学与文化意象。"}

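One more thing: how do you use `partial_variables` elegantly with `from_messages`?

Combining partial_variables with ChatPromptTemplate is actually very simple. You do not need to change the body of from_messages; just supply the fixed values on the prompt instance once it is built: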

# 1. Define the template
prompt = ChatPromptTemplate.from_messages([
    ("system", "你是专业翻译官，目标语言：{language}。\n{format_instructions}"),
    ("human", "翻译内容：{text}")
])

# 2. At use time, inject the fixed parameters directly via partial
prompt = prompt.partial(language="英语")

3. Prompts and output parsing

Preface: