Chapter 6: Multimodal Image Prompting - Claude API Fundamentals Tutorial

Vision Capabilities #

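The Claude 3 family of models has vision capabilities, which let Claude understand and analyze images. We can now provide both text and image inputs to enrich conversations and enable powerful new use cases. Opus, Sonnet, and Haiku can all understand and process images. Because Claude 3.5 Sonnet has the strongest vision capabilities, we will use it in this lesson.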

To provide an image to Claude, we use the same "messages" format as in text-only conversations. A typical text-only user message follows this pattern:

messages = [
    {
        "role": "user",
        "content": "tell me a joke"
    }
]

from anthropic import Anthropic
from dotenv import load_dotenv

load_dotenv()

client = Anthropic()

messages = [
    {"role": "user", "content": "tell me a joke"}
]

response = client.messages.create(
    messages=messages,
    model="claude-3-haiku-20240307",
    max_tokens=200
)
print(response.content[0].text)
Here's a silly joke for you:

Why don't scientists trust atoms? Because they make up everything!

How was that? I tried to keep it lighthearted and family-friendly. Let me know if you'd like to hear another joke.

What we have not seen yet is that the "content" of a message can also be set to a list of content blocks. Instead of:

messages = [
    {"role": "user", "content": "tell me a joke"}
]

we can restructure it like this:

messages = [
    {
        "role": "user",
        "content": [
            {"type": "text", "text": "tell me a joke"},
        ]
    }
]

The following two messages are equivalent:

{"role": "user", "content": "Tell me a story"}

{"role": "user", "content": [{"type": "text", "text": "Tell me a story"}]}
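The equivalence between the two forms can be made concrete with a small helper. This is only an illustrative sketch: `normalize_content` is a hypothetical name, not part of the Anthropic SDK, and it handles only the plain-string case described above.

```python
def normalize_content(message):
    """Return a copy of `message` whose content is always a list of blocks.

    Hypothetical helper for illustration; not part of the Anthropic SDK.
    """
    content = message["content"]
    if isinstance(content, str):
        # Wrap a plain string in a single text content block.
        content = [{"type": "text", "text": content}]
    return {"role": message["role"], "content": content}


short_form = {"role": "user", "content": "Tell me a story"}
long_form = {
    "role": "user",
    "content": [{"type": "text", "text": "Tell me a story"}],
}

# Both forms describe the same message once the string is wrapped.
print(normalize_content(short_form) == long_form)  # True
```

Normalizing to the list form first makes it easy to append further blocks, such as an image, later on.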

Let's try it:

messages = [
    {
        "role": "user",
        "content": [
            {"type": "text", "text": "tell me a joke"},
        ]
    }
]

response = client.messages.create(
    messages=messages,
    model="claude-3-haiku-20240307",
    max_tokens=200
)
print(response.content[0].text)
Here's a silly joke for you:

Why can't a bicycle stand up on its own? It's two-tired!

(I hope you groan at that one - that's the sign of a good joke!)

As you can see, it works! We can add as many content blocks to the list as we like, as in the following example:

messages = [
    {
        "role": "user",
        "content": [
            {"type": "text", "text": "who"},
            {"type": "text", "text": "made"},
            {"type": "text", "text": "you?"},
        ]
    }
]

response = client.messages.create(
    messages=messages,
    model="claude-3-haiku-20240307",
    max_tokens=200
)
print(response.content[0].text)
I was created by Anthropic, an artificial intelligence research company.

Why would we do this? We probably would not for text-only prompts, but we need this format when working with multimodal prompts!

To provide an image to Claude, we have to write an image content block. Here is an example:

messages = [
    {
        "role": "user",
        "content": [
            {
                "type": "image",
                "source": {
                    "type": "base64",
                    "media_type": "image/jpeg",
                    "data": "/9j/4AAQSkZJRg..."
                }
            }
        ]
    }
]
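As a rough sketch of where the `data` string comes from, the snippet below base64-encodes raw image bytes and wraps them in an image content block of the shape shown above. The helper name `image_block` is hypothetical, and the bytes used are only the first ten bytes of a JPEG file (its header), chosen so the example runs without a real image.

```python
import base64


def image_block(image_bytes, media_type):
    """Build an image content block from raw bytes (hypothetical helper)."""
    # The API expects the base64 data as a text string, not as bytes.
    encoded = base64.b64encode(image_bytes).decode("utf-8")
    return {
        "type": "image",
        "source": {
            "type": "base64",
            "media_type": media_type,
            "data": encoded,
        },
    }


# Only a JPEG header, not a complete picture; a real call needs full image bytes.
jpeg_header = b"\xff\xd8\xff\xe0\x00\x10JFIF"
block = image_block(jpeg_header, "image/jpeg")
print(block["source"]["data"])  # /9j/4AAQSkZJRg==
```

JPEG files start with these bytes, which is why base64-encoded JPEG data typically begins with `/9j/`, as in the truncated string in the example above.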

This diagram explains the important pieces of information required when providing an image to Claude:

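The source of the image content block in our message is set to a dictionary containing the following properties:

type – the image encoding format. At the time of writing, this must be base64.
media_type – the image media type. The image/jpeg, image/png, image/gif, and image/webp media types are currently supported.
data – the actual image data itself, as a base64-encoded string.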
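One way to pick the `media_type` value is to derive it from the file extension and reject anything outside the four supported types. The sketch below is an assumption-laden illustration: `media_type_for` and `SUPPORTED_MEDIA_TYPES` are hypothetical names, and the approach trusts that the extension matches the file's actual format.

```python
from pathlib import Path

# Extensions mapped to the four media types listed above.
SUPPORTED_MEDIA_TYPES = {
    ".jpg": "image/jpeg",
    ".jpeg": "image/jpeg",
    ".png": "image/png",
    ".gif": "image/gif",
    ".webp": "image/webp",
}


def media_type_for(filename):
    """Return the media type for a file name, or raise ValueError.

    Hypothetical helper: it looks only at the extension, not the file contents.
    """
    suffix = Path(filename).suffix.lower()
    if suffix not in SUPPORTED_MEDIA_TYPES:
        raise ValueError(f"unsupported image extension: {suffix!r}")
    return SUPPORTED_MEDIA_TYPES[suffix]


print(media_type_for("photo.JPG"))
print(media_type_for("diagram.webp"))
```

Failing early on an unsupported extension gives a clearer error than sending a request with a media type the API does not accept.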

Image-Only Prompts #

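Usually we want to provide some text along with an image in a prompt, but providing only an image is also perfectly acceptable. Let's try it! We have included a few images for this lesson in the "prompting_images" folder. Let's start by using Python to look at one of them: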
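As a minimal sketch of what an image-only prompt could look like, the function below reads an image file and builds a messages list that contains a single image block and no text. The function name `build_image_only_messages` and the file name `example.jpg` are hypothetical; the demo writes a throwaway file (just a JPEG header) so that it runs without the lesson's images.

```python
import base64
import tempfile
from pathlib import Path


def build_image_only_messages(image_path, media_type):
    """Return a one-message list whose content is a single image block."""
    raw = Path(image_path).read_bytes()
    encoded = base64.b64encode(raw).decode("utf-8")
    return [
        {
            "role": "user",
            "content": [
                {
                    "type": "image",
                    "source": {
                        "type": "base64",
                        "media_type": media_type,
                        "data": encoded,
                    },
                }
            ],
        }
    ]


# Demo with a temporary stand-in file; in the lesson this would be the path
# to one of the images in the "prompting_images" folder.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / "example.jpg"  # hypothetical file name
    demo_path.write_bytes(b"\xff\xd8\xff\xe0\x00\x10JFIF")  # header only
    messages = build_image_only_messages(demo_path, "image/jpeg")

print(len(messages[0]["content"]))  # 1
```

The resulting `messages` list can be passed to `client.messages.create(...)` in the same way as the text-only examples earlier, using a vision-capable model and a real image file.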
