工具调用（又称函数调用）

了解如何使用 Firebase AI Logic SDK 实现工具调用、管理智能体循环，并融入人机回圈交互。

诚然，LLM 的训练数据本质上覆盖整个互联网，但它们并非无所不知。

它们知道训练当日公开互联网上的内容，但不知道更晚的信息；不知道你或组织的私有信息；即使已知内容也易与其他知识纠缠。

在这些场景及许多其他情况下，我们常向 LLM 提供一个或多个工具。

工具由名称、描述以及 LLM「调用」工具时输入数据格式的 JSON schema 组成。例如，若我们提示 LLM「减少 Grandma's All America Breakfast 食谱中的碳水」，它不知道奶奶的菜谱是什么，除非我们提供接受查询字符串的 lookupRecipe 工具来查找食谱。

概念上，工具是我们在 LLM 需要该数据或服务时供其调用的东西。 LLM 调用工具的方式是：以表示「工具调用」的特殊格式消息响应应用请求；工具调用消息包含工具名称与 JSON 参数。应用处理工具调用，将结果打包进另一条 LLM 请求，LLM 再据此响应。

这可能持续多轮。

应用可为模型实例配置任意数量的工具（尽管 LLM 在较小、目标明确且功能不重叠的工具集上表现更好）。 LLM 可在响应中打包任意数量的工具调用，也可在请求中接收任意数量的工具结果。 LLM 通过构成请求/响应对历史的消息栈，整合提示词与工具调用结果的多轮往返。

工具调用结束后，LLM 返回最终响应，例如「这是 Grandma's All American Breakfast 食谱的高蛋白低碳水版本……」。

在 Firebase AI Logic SDK 中，工具称为「function（函数）」，但含义相同。在示例中，线索求解模型配置了查找单词详情的函数。若 LLM 需要单词详情辅助求解，调用该函数可从 Free Dictionary API 获取数据：

json

[
  {
    "word": "tool",
    "phonetic": "/tuːl/",
    "phonetics": [
      {
        "text": "/tuːl/",
        "audio": "https://api.dictionaryapi.dev/media/pronunciations/en/tool-uk.mp3",
        "sourceUrl": "https://commons.wikimedia.org/w/index.php?curid=94709459",
        "license": {
          "name": "BY-SA 4.0",
          "url": "https://creativecommons.org/licenses/by-sa/4.0"
        }
      }
    ],
    "meanings": [
      {
        "partOfSpeech": "noun",
        "definitions": [
          {
            "definition": "A mechanical device intended to make a task easier.",
            "synonyms": [],
            "antonyms": [],
            "example": "Hand me that tool, would you?   I don't have the right tools to start fiddling around with the engine."
          },
...

应用中有执行查找的 Dart 函数：

dart

// Look up the metadata for a word in the dictionary API.
Future<Map<String, dynamic>> _getWordMetadataFromApi(String word) async {
  final url = Uri.parse(
    'https://api.dictionaryapi.dev/api/v2/entries/en/${Uri.encodeComponent(word)}',
  );

  final response = await http.get(url);
  return response.statusCode == 200
      ? {'result': jsonDecode(response.body)}
      : {'error': 'Could not find a definition for "$word".'};
}

模型在初始化时配置了该查找函数：

dart

// The model for solving clues.
_clueSolverModel = FirebaseAI.googleAI().generativeModel(
  model: 'gemini-2.5-flash',
  systemInstruction: Content.text(clueSolverSystemInstruction),
  tools: [
    Tool.functionDeclarations([
      FunctionDeclaration(
        'getWordMetadata',
        'Gets grammatical metadata for a word, like its part of speech. '
        'Best used to verify a candidate answer against a clue that implies a '
        'grammatical constraint.',
        parameters: {
           'word': Schema(SchemaType.string, description: 'The word to look up.'),
         },
       ),
    ]),
  ],
);

为可靠起见，在系统指令中列出工具也是好主意：

dart

static String get clueSolverSystemInstruction =>
    '''
You are an expert crossword puzzle solver.

...

### Tool: `getWordMetadata`

You have a tool to get grammatical information about a word.

**When to use:**
- This tool is most helpful as a verification step after you have a likely answer.
- Consider using this tool when a clue contains a grammatical hint that could be ambiguous.
- **Good candidates for verification:**
  - Clues that seem to be verbs (e.g., "To run," "Waving").
  - Clues that are adverbs (e.g., "Happily," "Quickly").
  - Clues that specify a plural form.
- **Try to avoid using the tool for:**
  - Simple definitions (e.g., "A small dog").
  - Fill-in-the-blank clues (e.g., "___ and flow").
  - Proper nouns (e.g., "Capital of France").

**Function signature:**
```json
${jsonEncode(_getWordMetadataFunction.toJson())}
```
''';

应用发起请求时，模型在认为有帮助时可使用该工具。要支持工具调用，需实现智能体循环。

LLM 在功能上是无状态的，意味着每次请求都必须提供其所需的全部数据。若请求仅含提示词及附带文件，Firebase AI Logic SDK 在模型实例上提供 generateContent 方法。

然而，工具调用需要由初始提示词以及构成工具调用与工具结果的响应/请求对组成的消息历史。为此 Firebase AI Logic 提供用于收集历史的「chat」对象。我们用它构建智能体循环：

启动 chat，在多个请求/响应对之间保存消息历史
收集其提供的任意工具调用的工具结果
将工具结果打包进新请求
循环直到模型返回不含工具调用的响应
返回所有响应累积的文本

以下算法以 GenerativeModel 类的扩展方法表达，可像调用 generateContent 一样调用：

dart

extension on GenerativeModel {
  Future<String> generateContentWithFunctions({
    required String prompt,
    required Future<Map<String, dynamic>> Function(FunctionCall) onFunctionCall,
  }) async {
    // Use a chat session to support multiple request/response pairs, which is
    // needed to support function calls.
    final chat = startChat();
    final buffer = StringBuffer();
    var response = await chat.sendMessage(Content.text(prompt));

    while (true) {
      // Append the response text to the buffer.
      buffer.write(response.text ?? '');

      // If no function calls were collected, we're done
      if (response.functionCalls.isEmpty) break;

      // Append a newline to separate responses.
      buffer.write('\n');

      // Execute all function calls
      final functionResponses = <FunctionResponse>[];
      for (final functionCall in response.functionCalls) {
        try {
          functionResponses.add(
            FunctionResponse(
              functionCall.name,
              await onFunctionCall(functionCall),
            ),
          );
        } catch (ex) {
          functionResponses.add(
            FunctionResponse(functionCall.name, {'error': ex.toString()}),
          );
        }
      }

      // Get the next response stream with function results
      response = await chat.sendMessage(
        Content.functionResponses(functionResponses),
      );
    }

    return buffer.toString();
  }
}

该方法接收提示词与处理具体工具调用的回调；示例用它处理单词查找函数：

dart

await _clueSolverModel.generateContentWithFunctions(
  prompt: getSolverPrompt(clue, length, pattern),
  onFunctionCall: (functionCall) async => switch (functionCall.name) {
    'getWordMetadata' => await _getWordMetadataFromApi(
      functionCall.args['word'] as String,
    ),
    _ => throw Exception('Unknown function call: ${functionCall.name}'),
  },
);

结构化输出使 LLM 便于编程对接，而工具将 LLM 变为「智能体 (agent)」（详见交互模式一节）。

结合结构化输出与工具调用是强大组合。在示例中，线索求解器有查找单词详情的工具，还被要求返回捆绑解法与置信度的 JSON，二者显示在应用任务列表中：

App task list showing crossword clues followed by bold answers and
confidence scores in parentheses

遗憾的是，截至本文撰写时，在 Firebase AI Logic SDK 中同时启用结构化输出与函数会抛出异常：

Function calling with a response mime type: 'application/json' is unsupported

作为（希望临时的）变通，示例移除结构化输出配置，改用名为 returnResult 的工具模拟结构化输出：

dart

 // The model for solving clues.
_clueSolverModel = FirebaseAI.googleAI().generativeModel(
  model: 'gemini-2.5-flash',
  systemInstruction: Content.text(clueSolverSystemInstruction),
  tools: [
    Tool.functionDeclarations([
      ...,
      FunctionDeclaration(
        'returnResult',
        'Returns the final result of the clue solving process.',
        parameters: {
        'answer': Schema(
          SchemaType.string,
          description: 'The answer to the clue.',
        ),
        'confidence': Schema(
          SchemaType.number,
          description: 'The confidence score in the answer from 0.0 to 1.0.',
          ),
        },
      ),
    ]),
  ],
);

returnResult 也在系统指令中说明：

dart

static String get clueSolverSystemInstruction =>
    '''
You are an expert crossword puzzle solver.

...

### Tool: `returnResult`

You have a tool to return the final result of the clue solving process.

**When to use:**
- Use this tool when you have a final answer and confidence score to return. You
must use this tool exactly once, and only once, to return the final result.

**Function signature:**
```json
${jsonEncode(_returnResultFunction.toJson())}
```
''';

模型调用 returnResult 时，示例缓存结果， solveClue 在调用 generateContentWithFunctions 后读取：

dart

// Buffer for the result of the clue solving process.
final _returnResult = <String, dynamic>{};

// Cache the return result of the clue solving process via a function call.
// This is how we get JSON responses from the model with functions, since the
// model cannot return JSON directly when tools are used.
Map<String, dynamic> _cacheReturnResult(Map<String, dynamic> returnResult) {
  assert(_returnResult.isEmpty);
  _returnResult.addAll(returnResult);
  return {'status': 'success'};
}

Future<ClueAnswer?> solveClue(Clue clue, int length, String pattern) async {
  // Clear the return result cache; this is where the result will be stored.
  _returnResult.clear();

  // Generate JSON response with functions and schema.
  await _clueSolverModel.generateContentWithFunctions(
    prompt: getSolverPrompt(clue, length, pattern),
    onFunctionCall: (functionCall) async => switch (functionCall.name) {
      'getWordMetadata' => ...,
      'returnResult' => _cacheReturnResult(functionCall.args),
      _ => throw Exception('Unknown function call: ${functionCall.name}'),
    },
  );

  // Use the structured output that the LLM has called function with
  assert(_returnResult.isNotEmpty);
  return ClueAnswer(
    answer: _returnResult['answer'] as String,
    confidence: (_returnResult['confidence'] as num).toDouble(),
  );
}

用 Firebase AI Logic 同时获得结构化输出与工具调用需要多费些功夫，但结果值得。

到目前为止，工具用于收集数据与格式化输出。也可用它让人类参与。

例如，示例有时传入解法应匹配的模式（如 "_R_Y"），模型却想建议不匹配的模式（如 "RENT"）。此类冲突适合向用户求助：

Crossword Companion app displaying a Conflict Detected dialog asking for
user input to resolve a clue pattern

这称为「人机回圈（human in the loop）」，是人类与 LLM 协作的又一方式。 Flutter 与 Firebase AI Logic SDK 使其实现简单。首先，示例定义函数并配置模型：

dart


// The new function to let the LLM resolve solution conflicts
static final _resolveConflictFunction = FunctionDeclaration(
  'resolveConflict',
  'Asks the user to resolve a conflict between the letter pattern and the '
  'proposed answer. Use this BEFORE calling returnResult if the answer you '
  'want to propose does not match the letter pattern.',
  parameters: {
    'proposedAnswer': Schema(
      SchemaType.string,
      description: 'The answer the LLM wants to suggest.',
    ),
    'pattern': Schema(
      SchemaType.string,
      description: 'The current letter pattern from the grid.',
    ),
    'clue': Schema(SchemaType.string, description: 'The clue text.'),
  },
);

// Pass the new tool to the model for solving clues.
final _clueSolverModel = FirebaseAI.googleAI().generativeModel(
  model: 'gemini-2.5-flash',
  systemInstruction: Content.text(clueSolverSystemInstruction),
  tools: [
    Tool.functionDeclarations([
      ...
      _resolveConflictFunction,
    ]),
  ],
);
// Let the LLM know that it has a new tool.
static String get clueSolverSystemInstruction =>
    '''
You are an expert crossword puzzle solver.

...

### Tool: `resolveConflict`

You have a tool to ask the user to resolve a conflict.

**When to use:**
- Use this tool **BEFORE** `returnResult` if your proposed answer conflicts with the provided letter pattern.
- For example, if the pattern is `_ R _ Y` and you want to suggest `RENT` (which fits the clue), there is a conflict at the second letter (`R` vs `E`). You should call `resolveConflict(proposedAnswer: "RENT", pattern: "_ R _ Y", clue: "...")`.
- The tool will return the user's decision (either your proposed answer or a new one). You should then use that result to call `returnResult`.

**Function signature:**
```json
${jsonEncode(_resolveConflictFunction.toJson())}
```
''';

模型发现冲突时会调用该工具：

dart

// handle the LLM's request to resolve the conflict
await _clueSolverModel.generateContentWithFunctions(
  prompt: getSolverPrompt(clue, length, pattern),
  onFunctionCall: (functionCall) async => switch (functionCall.name) {
    ...
    'resolveConflict' => await _handleResolveConflict(
      functionCall.args,
      onConflict,
    ),
  },
);

// Show the dialog to gather the user's input
Future<Map<String, dynamic>> _handleResolveConflict(
  Map<String, dynamic> args,
  Future<String> Function(String clue, String proposedAnswer, String pattern)?
  onConflict,
) async {
  final proposedAnswer = args['proposedAnswer'] as String;
  final pattern = args['pattern'] as String;
  final clue = args['clue'] as String;

  if (onConflict != null) {
    final result = await onConflict(clue, proposedAnswer, pattern);
    return {'result': result};
  }

  return {'result': proposedAnswer};
}

示例通过 onConflict 实现处理该工具，调用 showDialog 收集用户数据。这一切发生在智能体循环中间，但没问题——模型并未等待；它已向应用的初始请求返回响应。用户可慢慢操作 UI，示例等待 showDialog 返回的 Future。完成后，模型借助消息历史与最近请求（此处为与用户交互收集的数据）从断点继续。

模态对话框是人机回圈 (human in the loop) 的简单方式，但不是 Flutter 的唯一方式。若你愿意，Completer 实例可让应用进入「向用户收集数据」模式；有数据后对 Completer 调用 complete 并恢复智能体循环。

或者，既然你拥有智能体循环，可检查对「特殊」函数的调用——表示需向用户收集数据。此类特殊函数有时称为「interrupt（中断）」，获得用户数据后你「resume（恢复）」与模型的对话。

请记住 LLM 是无状态的。它不会等你，因此可按最适合应用的方式处理智能体循环。你可随时带着更新的消息历史与新提示词回到 LLM，无论间隔一分钟还是一个月。

本页内容对你有帮助吗？

除非另有说明，本文档之所提及适用于 Flutter 3.44.0 版本。本页面最后更新时间：2026-06-12。查看文档源码或者为本页面内容提出建议。

工具调用（又称函数调用）

工具的定义

Gemini 函数

智能体循环

结构化输出与工具调用

人机回圈 (Human in the loop)