FAQ

A collection of API issues encountered across MaaS providers.

tip

Tool-call caching actually caches the tool schema, descriptions, and similar definition data.

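English	Chinese
Guardrails	护栏 (safety and compliance boundary filtering)
Turn	轮次 (one question-and-answer exchange in a conversation)
Step	步骤 (a reasoning action or operation an Agent performs while executing a task)
Thread	线索 / 话题流 (a conversation branch that maintains its own context)
Conversation	对话
Session	会话 (the state and lifecycle of an ongoing interaction)
Prompt	提示词 / 提示语
Token	词元 / Token
Context Window	上下文窗口
Hallucination	幻觉 (the model generates plausible-looking but factually wrong content)
Grounding	事实锚定 / 溯源 / 接地 (limiting hallucination by bringing in authoritative external data)
Agent	智能体 / 代理
Alignment	价值对齐 (making model behavior conform to human intent and values)
Persona	角色设定 / 人设
Chain of Thought (CoT)	思维链 / 链式思考 (`Think step by step`)
Function Calling / Tool Use	函数调用 / 工具调用
RAG	检索增强生成 (Retrieval-Augmented Generation)
Embedding	嵌入 / 向量表示
Inference	推理 (the serving process in which the server runs the model and returns responses)
Reasoning	推理 (the model's capacity for internal logical deduction and thinking)
Orchestration	编排 (scheduling the flow across multiple Agents or tools in an application)
Few-shot / Zero-shot	少样本 / 零样本 (prompt-engineering techniques)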

Alibaba Cloud ASR model error

InternalError.Algo.InvalidParameter: The dedicated task `asr` corresponding to the current service does not support this input

error type
usage_limit_reached
stream_incomplete
invalid_json_schema
stream_read_error

finish_reason

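finish_reason	Description
`stop`	The model finished generating naturally (hit a stop token)
`length`	Reached the maximum token limit or the context window limit
`tool_calls` / `function_call`	The model decided to call an external tool or function
`content_filter`	Content was filtered for triggering a safety or compliance policy
`end_turn`	The model signals that the current turn has ended (e.g. Anthropic)
`error`	An error occurred during generation
`unknown`	Unknown reason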
refuse
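
When several providers sit behind one gateway, it can help to fold the values in the table above into a single set. The following is a minimal sketch, not any provider's official mapping: the function name `normalize_finish_reason` and the choice to treat `end_turn` as `stop` are assumptions made for illustration, and any value not listed in the table falls back to `unknown`.

```python
# Hypothetical canonical mapping built from the finish_reason table above.
# Folding `end_turn` into `stop` is an assumption, not a provider rule.
CANONICAL_FINISH_REASONS = {
    "stop": "stop",
    "end_turn": "stop",
    "length": "length",
    "tool_calls": "tool_calls",
    "function_call": "tool_calls",
    "content_filter": "content_filter",
    "error": "error",
}


def normalize_finish_reason(raw):
    """Map a provider-specific finish reason to one canonical value."""
    if raw is None:
        return "unknown"
    return CANONICAL_FINISH_REASONS.get(raw.strip().lower(), "unknown")
```

Unlisted values are reported as `unknown` rather than guessed, so new provider values show up in logs instead of being silently misclassified.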

non stream timeout

The idle timeout usually defaults to 10 minutes
- anthropic
- openai

499

⚠️ Some providers keep processing the request after the client has disconnected
- zhipu

Anthropic Bedrock need thinking block for thinking

Expected `thinking` or `redacted_thinking`, but found `tool_use`.
When `thinking` is enabled, a final `assistant` message must start with a thinking block

GCP Vertex AI is less strict about this requirement.
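
The error above can be caught before the request is sent. The following is a minimal sketch, assuming Anthropic-style messages whose `content` is a list of typed blocks; the function name `last_assistant_starts_with_thinking` is hypothetical and the check covers only the rule quoted in the error message.

```python
THINKING_TYPES = {"thinking", "redacted_thinking"}


def last_assistant_starts_with_thinking(messages):
    """Return True if the final assistant message begins with a thinking block.

    Returns True when the history contains no assistant message at all,
    because the rule has nothing to apply to in that case.
    """
    for message in reversed(messages):
        if message.get("role") != "assistant":
            continue
        content = message.get("content")
        # A plain string or an empty list cannot start with a thinking block.
        if not isinstance(content, list) or not content:
            return False
        return content[0].get("type") in THINKING_TYPES
    return True
```

A gateway could run this only when `thinking` is enabled and the target is Anthropic on Bedrock, since the note above says Vertex AI is more lenient.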

Thinking encryption

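Closed-source models encrypt their thinking content to prevent distillation
They may return a summary of the thinking content instead
Encrypting the thinking content yields a signature
With interleaved thinking, tool calls also carry thinking information to preserve the reasoning state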

Vertex AI

thought_signature on non-function parts is not mandatory, but including it is recommended
- it helps keep the model's reasoning quality high

{
  "content": {
    "role": "model",
    "parts": [
      {
        "functionCall": {
          "name": "check_flight",
          "args": {
            "flight": "AA100"
          }
        },
        "thoughtSignature": "<SIGNATURE_A>"
      }
    ]
  }
}

Anthropic

{
  "content": [
    {
      "type": "thinking",
      "thinking": "Let me analyze this step by step...",
      "signature": "WaUjzkypQ2mUEVM36O2TxuC06KN8xyfbJwyem2dw3URve/op91XWHOEBLLqIOMfFG/UvLEczmEsUjavL...."
    },
    {
      "type": "redacted_thinking",
      "data": "EmwKAhgBEgy3va3pzix/LafPsn4aDFIT2Xlxh0L5L8rLVyIwxtE3rAFBa8cr3qpP..."
    },
    {
      "type": "text",
      "text": "Based on my analysis..."
    }
  ]
}

type signature_delta
redacted_thinking
- Sonnet 3.7
signature
- Claude 4+
- returns summarized thinking content

Bedrock special test prompt

ANTHROPIC_MAGIC_STRING_TRIGGER_REDACTED_THINKING_46C9A13E193C177646C7398A98432ECCCE4C1253D5E2D82641AC0E52CC2876CB

https://docs.aws.amazon.com/bedrock/latest/userguide/claude-messages-thinking-encryption.html

role developer vs system

Introduced by OpenAI starting with o1-2024-12-17
developer carries higher priority than system
developer
- emphasizes rules
system
- emphasizes the role

AI_APICallError: Error while downloading [URL REDACTED].

OpenAI endpoints appear not to accept images sourced from Wikimedia

Output Speed

Reference	TPS
Reading aloud / audiobook	3-4
Normal silent reading	5-10
Fast skimming	15-25

Model	TPS
Claude Sonnet 4.5	40
gemini-3-flash-preview	80-100

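Tier	TPS	Typical use cases
Instant	800-1200	Real-time voice assistants, search suggestions
Fast	150-250	Simple translation, summarization, simple chat
Standard	70-100	Complex instructions, code generation, subtitles
Heavy	20-50	Long-form writing, complex logical reasoning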

Prefill Speed
- typically > 2000 t/s
- Context Caching speeds up prefill
TPS = tokens per second
Thinking affects speed
- the thinking budget determines thinking depth
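
The prefill and output figures above combine into a rough end-to-end estimate: prefill time plus generation time. The following is a back-of-the-envelope sketch, assuming thinking tokens are generated at the same rate as visible output and ignoring network latency and queueing; the function name `estimate_latency_seconds` is hypothetical.

```python
def estimate_latency_seconds(input_tokens, output_tokens, prefill_tps, output_tps):
    """Rough latency estimate: time to prefill the prompt plus time to generate.

    `output_tokens` should include thinking tokens, which is why a larger
    thinking budget slows the response down.
    """
    if prefill_tps <= 0 or output_tps <= 0:
        raise ValueError("throughput values must be positive")
    return input_tokens / prefill_tps + output_tokens / output_tps


# 10,000 prompt tokens at 2000 t/s prefill, 800 output tokens at 40 TPS.
print(estimate_latency_seconds(10_000, 800, prefill_tps=2000, output_tps=40))  # 25.0
```

In this example generation dominates (20 s of the 25 s), which matches the note that thinking depth, not prompt size, is usually what the user feels.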

Gemini

Missing thought_signature in function call

Please ensure that the number of function response parts is equal to the number of function call parts of the function call turn.
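
The count mismatch in this error can be detected locally before sending. The following is a minimal sketch, assuming Gemini-style `contents` where each entry has a `role` and a list of `parts`, with `functionCall` and `functionResponse` keys as in the Vertex AI example earlier in this page; the helper names are hypothetical.

```python
def count_parts(content, key):
    """Count the parts in one content entry that carry the given key."""
    return sum(1 for part in content.get("parts", []) if key in part)


def find_unbalanced_function_turns(contents):
    """Return indexes of model turns whose function calls are not matched.

    A model turn with N functionCall parts is expected to be followed by a
    content entry holding exactly N functionResponse parts. A trailing model
    turn with calls and no following entry is reported as well.
    """
    unbalanced = []
    for index, content in enumerate(contents):
        calls = count_parts(content, "functionCall")
        if content.get("role") != "model" or calls == 0:
            continue
        following = contents[index + 1] if index + 1 < len(contents) else {}
        if count_parts(following, "functionResponse") != calls:
            unbalanced.append(index)
    return unbalanced
```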

Unable to submit request because thinking_budget and thinking_level are not supported together

A Gemini restriction

Anthropic

<FIELD>: Extra inputs are not permitted

tool_use ids were found without tool_result blocks immediately after

tool_use ids were found without tool_result blocks immediately after: <*>. Each `tool_use` block must have a corresponding `tool_result` block in the next message.

Convert

assistant: [tool_use a, tool_use b]
user: [tool_result a]
user: [tool_result b]

into

assistant: [tool_use a, tool_use b]
user: [tool_result a, tool_result b]

https://github.com/anthropics/claude-code/issues/1894
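
The conversion above can be sketched as a small pre-processing step. This is a minimal sketch, assuming Anthropic-style messages whose `content` is a list of typed blocks; the function names are hypothetical, and only consecutive user messages made up entirely of `tool_result` blocks are merged.

```python
def _is_tool_result_only(message):
    """True for a user message whose content is a non-empty list of tool_result blocks."""
    content = message.get("content")
    return (
        message.get("role") == "user"
        and isinstance(content, list)
        and len(content) > 0
        and all(block.get("type") == "tool_result" for block in content)
    )


def merge_tool_results(messages):
    """Merge consecutive tool_result-only user messages into a single message."""
    merged = []
    for message in messages:
        if merged and _is_tool_result_only(merged[-1]) and _is_tool_result_only(message):
            merged[-1]["content"].extend(message["content"])
            continue
        # Copy the message and its content list so the input is not mutated.
        copy = dict(message)
        if isinstance(copy.get("content"), list):
            copy["content"] = list(copy["content"])
        merged.append(copy)
    return merged
```

User messages that mix text with tool results are left untouched, since merging them could change the order in which the model sees the content.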

Claude: temperature and top_p cannot be passed together

Claude Sonnet 4.5 and Claude Haiku 4.5 only support specification of one of temperature or top_p parameters, but cannot handle both.
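Thinking is incompatible with modifications to temperature, top_p, or top_k, and with forced tool use.
When thinking is enabled, you cannot prefill the response.
Changing the thinking budget invalidates cached prompt prefixes that include messages. However, cached system prompts and tool definitions keep working when thinking parameters change.
References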
- https://docs.aws.amazon.com/bedrock/latest/userguide/model-parameters-anthropic-claude-messages-request-response.html
- https://docs.aws.amazon.com/zh_cn/bedrock/latest/userguide/claude-messages-extended-thinking.html
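
The sampling-parameter rules above can be enforced before the request leaves the gateway. The following is a minimal sketch under two assumptions that are not from the provider docs: when both `temperature` and `top_p` are present, `temperature` wins, and when thinking is enabled the sampling overrides are dropped so the provider defaults apply. The function name `sanitize_sampling_params` is hypothetical.

```python
SAMPLING_KEYS = ("temperature", "top_p", "top_k")


def sanitize_sampling_params(params, thinking_enabled):
    """Return a copy of params that respects the Claude sampling restrictions."""
    cleaned = dict(params)
    if thinking_enabled:
        # Thinking is incompatible with changing these, so fall back to defaults.
        for key in SAMPLING_KEYS:
            cleaned.pop(key, None)
    elif "temperature" in cleaned and "top_p" in cleaned:
        # Assumption: keep temperature and drop top_p when both are given.
        cleaned.pop("top_p")
    return cleaned
```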

max_tokens must be greater than thinking.budget_tokens

https://docs.claude.com/en/docs/build-with-claude/extended-thinking

Input should be greater than or equal to 1024

The minimum budget_tokens is 1024
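
Both budget constraints noted above (a minimum of 1024, and `max_tokens` strictly greater than `budget_tokens`) are easy to check locally. The following is a minimal sketch; the function name `check_thinking_budget` and the wording of the returned messages are hypothetical.

```python
MIN_BUDGET_TOKENS = 1024


def check_thinking_budget(max_tokens, budget_tokens):
    """Return a list of human-readable problems; an empty list means the pair is valid."""
    problems = []
    if budget_tokens < MIN_BUDGET_TOKENS:
        problems.append(f"budget_tokens must be >= {MIN_BUDGET_TOKENS}")
    if max_tokens <= budget_tokens:
        problems.append("max_tokens must be greater than budget_tokens")
    return problems
```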

`thinking` or `redacted_thinking` blocks in the latest assistant message cannot be modified These blocks must remain as they were in the original response

Caused by lost context

https://github.com/anthropics/claude-code/issues/12311

Invalid `signature` in `thinking` block

The signature in the message is invalid; check whether the request is built correctly

Moonshot

The protocol is strict; Kimi follows restrictions similar to Anthropic's

tool_call_id is not found

The tool_calls entry is missing, yet a tool-role message with a tool_call_id is present

thinking is enabled but reasoning_content is missing in assistant tool call message at index

The tool-call message is missing reasoning_content
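
Both Moonshot errors above come from the shape of the message history, so they can be detected before sending. The following is a minimal sketch, assuming OpenAI-style chat messages (`tool_calls` on assistant messages, `tool_call_id` on tool messages) and the `reasoning_content` field named in the error text; the function name `find_tool_message_problems` is hypothetical.

```python
def find_tool_message_problems(messages, thinking_enabled=False):
    """Return (index, problem) pairs for messages that would trigger the errors above."""
    problems = []
    known_ids = set()
    for index, message in enumerate(messages):
        role = message.get("role")
        if role == "assistant":
            tool_calls = message.get("tool_calls") or []
            known_ids.update(call.get("id") for call in tool_calls)
            if thinking_enabled and tool_calls and not message.get("reasoning_content"):
                problems.append((index, "reasoning_content is missing"))
        elif role == "tool":
            # A tool message must answer a tool call seen earlier in the history.
            if message.get("tool_call_id") not in known_ids:
                problems.append((index, "tool_call_id is not found"))
    return problems
```

A common trigger is history truncation that drops the assistant message but keeps its tool replies, which this check reports at the index of the orphaned tool message.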

Bedrock

reasoning: Extra inputs are not permitted

The protocol is very strict and rejects extra fields
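
One way to avoid this class of error is to drop unknown top-level fields before forwarding a request. The following is a minimal sketch; the allowlist below is an illustrative assumption, not the complete set of fields Bedrock accepts, and the function name `strip_extra_fields` is hypothetical.

```python
# Illustrative allowlist only; check the provider's request schema for the real set.
ASSUMED_ALLOWED_FIELDS = {
    "anthropic_version",
    "max_tokens",
    "messages",
    "system",
    "tools",
    "thinking",
}


def strip_extra_fields(payload, allowed_fields):
    """Split a request body into the fields to send and the names that were dropped."""
    kept = {key: value for key, value in payload.items() if key in allowed_fields}
    dropped = sorted(key for key in payload if key not in allowed_fields)
    return kept, dropped
```

Returning the dropped names makes it possible to log them, so a field such as `reasoning` that was silently removed can still be traced when behavior differs between providers.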

Access to Bedrock models is not allowed for this account.

Access to Bedrock models is not allowed for this account.
Request a quota increase from: https://support.console.aws.amazon.com/support/home?region=us-east-1#/case/create?issueType=service-limit-increase

This request has been blocked by our content filters. Our filters automatically flagged this prompt because it may conflict our AUP or AWS Responsible AI Policy. Please adjust your input image to submit a new request.

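Cause:

A content-moderation filter error thrown by AWS Bedrock generation services (mostly seen with Amazon Titan Image Generator or certain highly sensitive models). AWS has strict built-in content moderation (Guardrails plus moderation in the underlying model); a prompt or input that hits the AWS AUP (Acceptable Use Policy) or Responsible AI Policy is blocked.

Suggested handling:

Revise the prompt to remove wording that may trigger filters for violence and terrorism, pornography, bias, or sensitive entities (sometimes even trademarked names such as Disney character names).
Check whether the application has custom Bedrock Guardrails enabled; if so, adjust the Guardrails sensitivity thresholds.
Text-only models occasionally trigger this too. Most safety limits built into the underlying models cannot be changed by users, so adjusting the prompt is the only workaround.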

OpenAI

Function tools with reasoning_effort are not supported for gpt-5.5 in /v1/chat/completions. Please use /v1/responses instead.

Our servers are currently overloaded. Please try again later.

The encrypted content for item <*> could not be verified. Reason: Encrypted content could not be decrypted or parsed.

An error occurred while processing your request. You can retry your request, or contact us through our help center at help.openai.com if the error persists. Please include the request ID <request_id> in your message.

This content was flagged for possible cybersecurity risk. If this seems wrong, try rephrasing your request. To get authorized for security work, join the Trusted Access for Cyber program: https://chatgpt.com/cyber

{"type":"error","sequence_number":0,"error":{"type":"upstream_error","message":"stream_read_error","code":"stream_read_error"}}

Codex

Your input exceeds the context window of this model. Please adjust your input and try again.

Previous response with id <*> not found

Codex SSE response headers timed out after 10000ms

Our servers are currently overloaded. Please try again later.

Misc

the upstream load is saturated, please try again later
Upstream service temporarily unavailable

The `content[].thinking` in the thinking mode must be passed back to the API.

API_ERROR

thinking is enabled but reasoning_content is missing in assistant tool call message at index

API_ERROR
