Conversation

@dnandakumar-nv (Contributor) commented Jul 7, 2025

Description

📋 Summary

This PR adds comprehensive support for OpenAI's Responses API to the NeMo Agent toolkit, enabling users to leverage built-in tools (like Code Interpreter) and remote tools via Model Context Protocol (MCP) alongside existing NAT tools.

✨ What's New

Responses API Agent Implementation

  • Native Responses API clients: a new api_type config element for LLMs, with instantiation of Responses API clients from supported packages and validation errors for packages that do not support them
  • New agent type: responses_api_agent that integrates with OpenAI's Responses API
  • Multi-tool support: Combines NAT tools, OpenAI built-in tools, and MCP remote tools in a single workflow
  • Streaming support: Full compatibility with Responses API streaming capabilities
  • Error handling: Configurable tool error handling with graceful fallbacks

Key Features

  • Built-in tool integration: Direct support for Code Interpreter, file search, and image generation tools
  • MCP protocol support: Connect to remote MCP servers with flexible configuration options
  • Tool binding: Automatic tool binding to LLM with strict=True and optional parallel tool calls
  • Validation: Runtime validation ensuring LLM supports Responses API
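
The runtime validation amounts to a capability check before client construction. The sketch below is illustrative only — the provider/API-type table and the function name are assumptions, not the toolkit's actual implementation:

```python
# Illustrative capability table: which API types each provider can serve.
# The real toolkit derives this from its plugin registrations.
SUPPORTED_API_TYPES = {
    "openai": {"chat_completions", "responses"},
    "nim": {"chat_completions"},  # assumed: most providers are chat-only
}


def validate_api_type(provider: str, api_type: str) -> None:
    """Raise ValueError if the provider cannot serve the requested API type."""
    allowed = SUPPORTED_API_TYPES.get(provider, set())
    if api_type not in allowed:
        raise ValueError(
            f"LLM provider {provider!r} does not support api_type={api_type!r}; "
            f"supported types: {sorted(allowed)}"
        )
```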

🔧 Configuration

Example Configuration

llms:
  openai_llm:
    _type: openai
    model_name: gpt-5-mini-2025-08-07
    api_type: responses  # Required for Responses API. Default is `chat_completions`

workflow:
  _type: responses_api_agent
  llm_name: openai_llm
  nat_tools: [current_datetime]
  builtin_tools:
    - type: code_interpreter
      container:
        type: "auto"
  mcp_tools:
    - type: mcp
      server_label: deepwiki
      server_url: https://mcp.deepwiki.com/mcp
      allowed_tools: [read_wiki_structure, read_wiki_contents]
      require_approval: never

Tool Types Supported

  • nat_tools: NeMo Agent toolkit tools (executed by agent graph)
  • builtin_tools: OpenAI built-in tools (Code Interpreter, file search, image generation)
  • mcp_tools: Remote tools via Model Context Protocol
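
The split between graph-executed and LLM-executed tools amounts to partitioning the tool configurations. A minimal sketch — the `kind` key and the function name are hypothetical, not NAT's real schema:

```python
def partition_tools(tool_configs: list[dict]) -> tuple[list[dict], list[dict]]:
    """Split tool configs into (graph_executed, llm_executed) lists.

    NAT tools run inside the agent graph; built-in and MCP tools are bound
    to the LLM and executed server-side by the Responses API.
    """
    graph_tools = [t for t in tool_configs if t.get("kind") == "nat"]
    api_tools = [t for t in tool_configs if t.get("kind") in ("builtin", "mcp")]
    return graph_tools, api_tools
```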

📚 Documentation

Complete README Section

Added comprehensive documentation covering:

  • What is Responses API: Clear explanation of capabilities vs Chat Completions
  • Prerequisites: API key setup and model requirements
  • Usage examples: CLI commands, server setup, and curl requests
  • Configuration guide: Detailed explanation of all configuration options
  • Built-in tools: Code Interpreter setup and other available tools
  • MCP integration: Complete schema reference and examples
  • Troubleshooting: Common issues and solutions

Code Examples

# Run the agent
nat run --config_file=examples/agents/tool_calling/configs/config-responses-api.yml --input "How many 0s are in the current time?"

🏗️ Implementation Details

Files Added/Modified

  • src/nat/agent/responses_api_agent/register.py: Core agent implementation
  • src/nat/data_models/openai_mcp.py: MCP tool schema definitions
  • examples/agents/tool_calling/configs/config-responses-api.yml: Example configuration
  • examples/agents/tool_calling/README.md: Comprehensive documentation
  • … plus additional core library changes and tests

Technical Architecture

  • New support for api_type in LLM configurations: one of `chat_completions` or `responses`
  • New clients in several plugin packages that support the Responses API, for use in any workflow
  • The new agent leverages the existing ToolCallAgentGraph for consistent agent behavior
  • Binds built-in and MCP tools directly to LLM for Responses API execution
  • Maintains separation between NAT tools (graph-executed) and API tools (LLM-executed)
  • Supports configurable recursion limits and error handling

✅ Testing

Manual Testing

  1. Set OpenAI API key: export OPENAI_API_KEY=<key>
  2. Run example: nat run --config_file=examples/agents/tool_calling/configs/config-responses-api.yml --input "How many 0s are in the current time?"
  3. Verify Code Interpreter works: Ask for mathematical calculations or data analysis
  4. Test MCP integration: Configure a public MCP server and verify tool calls

Validation

  • ✅ Linting passes (Vale, etc.)
  • ✅ Configuration validation works
  • ✅ Error handling for unsupported models
  • ✅ Documentation formatting correct

🚀 Usage

After merging, users can:

  1. Configure any supported model with api_type: responses
  2. Enable Code Interpreter for data analysis and computation
  3. Connect to MCP servers for extended functionality
  4. Combine multiple tool types in a single workflow
  5. Use streaming responses for real-time interaction

🔄 Backward Compatibility

  • ✅ No breaking changes to existing tool calling agent
  • ✅ New agent type is opt-in via configuration
  • ✅ Existing examples and workflows unchanged
  • ✅ Follows established NAT toolkit patterns

By submitting this PR I confirm:

  • I am familiar with the Contributing Guidelines.
  • We require that all contributors "sign-off" on their commits. This certifies that the contribution is your original work, or you have rights to submit it under the same license, or a compatible license.
    • Any contribution which contains commits that are not Signed-Off will not be accepted.
  • When the PR is ready for review, new or existing tests cover these changes.
  • When the PR is ready for review, the documentation is up to date with these changes.

Upgraded multiple dependencies, including `llama-index`, `llama-index-core`, and `langchain-openai` to newer versions for compatibility and stability improvements. Adjusted `uv.lock` to reflect updated dependency revisions and metadata changes.

Signed-off-by: dnandakumar-nv <[email protected]>
Introduce `APITypeEnum` to define supported API types and add `api_type` field in `LLMBaseConfig` with default and schema extras. Update LangChain plugin to validate API type for NVIDIA and AWS Bedrock, and enable `responses` API type for OpenAI integration.

Signed-off-by: dnandakumar-nv <[email protected]>
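
Based on the commit above, the enum and config field might look roughly like this. This is a Pydantic sketch under assumptions — the exact field names, defaults, and schema extras in `LLMBaseConfig` are inferred, not copied from the source:

```python
from enum import Enum

from pydantic import BaseModel, Field


class APITypeEnum(str, Enum):
    # The two API types named in this PR's configuration docs.
    CHAT_COMPLETIONS = "chat_completions"
    RESPONSES = "responses"


class LLMBaseConfig(BaseModel):
    model_name: str
    # Defaults to chat_completions, matching the example config's comment.
    api_type: APITypeEnum = Field(
        default=APITypeEnum.CHAT_COMPLETIONS,
        description="Which API surface the LLM client should target.",
    )
```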
This update introduces explicit checks for the `api_type` field in NVIDIA, OpenAI, and AWS Bedrock configurations. Unsupported API types now raise a `ValueError` with clear error messages, ensuring compatibility and preventing misconfigurations. Additionally, support for `OpenAIResponses` is added for OpenAI when the API type matches.

Signed-off-by: dnandakumar-nv <[email protected]>
Introduce an OpenAIMCPSchemaTool class and MCPApprovalRequiredEnum to define the OpenAI MCP schema. This includes fields for tool configuration, server information, and approval requirements, leveraging Pydantic models for validation.

Signed-off-by: dnandakumar-nv <[email protected]>
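
A rough Pydantic sketch of such a schema, with fields inferred from the example configuration earlier in this PR; the exact names, types, and defaults are assumptions:

```python
from enum import Enum

from pydantic import BaseModel, Field


class MCPApprovalRequiredEnum(str, Enum):
    NEVER = "never"
    ALWAYS = "always"


class OpenAIMCPSchemaTool(BaseModel):
    # Fields mirror the mcp_tools entry in the example YAML config.
    type: str = "mcp"
    server_label: str
    server_url: str
    allowed_tools: list[str] = Field(default_factory=list)
    require_approval: MCPApprovalRequiredEnum = MCPApprovalRequiredEnum.NEVER
    # A later commit adds a default for headers; an empty dict is assumed.
    headers: dict[str, str] = Field(default_factory=dict)
```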
Introduce a new `ResponsesAPIAgentWorkflowConfig` and `responses_api_agent_workflow` function for managing an LLM-based ReAct agent that integrates tool calls, including AIQ, MCP, and built-in tools. The agent supports flexible configurations, detailed logging, and error handling, enhancing the extensibility of tool interaction workflows.

Signed-off-by: dnandakumar-nv <[email protected]>
Updated config handling to exclude "api_type" in model dumps across multiple plugins for improved consistency. Adjusted agent graph building to use `await` for proper asynchronous operation. These changes ensure better API compatibility and alignment with async programming standards.

Signed-off-by: dnandakumar-nv <[email protected]>
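
Excluding `api_type` from model dumps is a one-liner with Pydantic's `model_dump`. A minimal illustration — the `LLMConfig` model here is invented for the example:

```python
from pydantic import BaseModel


class LLMConfig(BaseModel):
    model_name: str
    temperature: float = 0.0
    api_type: str = "chat_completions"


config = LLMConfig(model_name="gpt-4.1")
# api_type is routing metadata for the toolkit, not a constructor kwarg
# understood by the underlying client, so strip it before passing through.
client_kwargs = config.model_dump(exclude={"api_type"})
```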
Updated the logic to handle cases where the response content is a list of dictionaries, ensuring compatibility and preventing errors. This change improves robustness and avoids potential crashes when extracting text from output messages.

Signed-off-by: dnandakumar-nv <[email protected]>
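
The list-of-dictionaries handling can be pictured as a small helper; this is an illustrative sketch, and the actual extraction logic in the callback handler may differ:

```python
def extract_text(content) -> str:
    """Return the text of a message whose content may be a plain string or a
    list of content-part dictionaries, as the Responses API can return."""
    if isinstance(content, str):
        return content
    if isinstance(content, list):
        # Concatenate the "text" field of each dict part, skipping anything else.
        return "".join(
            part.get("text", "")
            for part in content
            if isinstance(part, dict)
        )
    return str(content)
```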
Added a try-except block to safely handle exceptions during tool schema extraction and log relevant errors. Also, adjusted logging for tool information to display objects directly instead of accessing their names.

Signed-off-by: dnandakumar-nv <[email protected]>
Updated MCP tools iteration for clarity by renaming variables in `register.py`. Added a default value for `headers` in `openai_mcp.py` to ensure consistent behavior when undefined.

Signed-off-by: dnandakumar-nv <[email protected]>
Improved logging messages with additional context for errors and adjusted code style for consistency across multiple files. Also fixed import order and formatting for better readability and maintainability.

Signed-off-by: dnandakumar-nv <[email protected]>
This commit introduces `config_responses_api.yml`, a configuration file defining tools, LLM settings, and the workflow for a Responses API integration in the simple calculator example. It includes OpenAI's GPT-4.1 setup and tools such as calculator operations and a code interpreter, enabling expanded functionality and better modularity.

Signed-off-by: dnandakumar-nv <[email protected]>
This change adds the required Apache 2.0 license header to the `__init__.py` file in the `responses_api_agent` module. It ensures compliance with licensing requirements and explicitly states the terms under which the code may be used.

Signed-off-by: dnandakumar-nv <[email protected]>
Updated llm_config usage to exclude the "api_type" field in model_dump across multiple LLM-related components. This ensures consistent handling of configuration objects and prevents unintended data inclusion. Also, made a minor docstring adjustment in ResponsesAPIAgentWorkflowConfig for clarity.

Signed-off-by: dnandakumar-nv <[email protected]>
Excluded "api_type" from metadata processing to improve clarity and relevance in documentation. Adjusted model and id key formatting in CrewAI and Agno LLM configurations for better readability.

Signed-off-by: dnandakumar-nv <[email protected]>
Remove skipping of the `api_type` field in metadata utilities to include it in documentation. Update tests to validate the new behavior, ensuring `api_type` is correctly handled for LLM configurations.

Signed-off-by: dnandakumar-nv <[email protected]>
copy-pr-bot bot commented Jul 7, 2025

This pull request requires additional validation before any workflows can run on NVIDIA's runners.


@dnandakumar-nv added the `non-breaking` and `improvement` labels and removed the `enhancement` label on Jul 7, 2025
Introduced `ServerToolUseSchema` to handle server-side tool output parsing. Updated callback handler to extract and store tool outputs from `message.additional_kwargs`. Minor exception handling adjustments were made for robustness in tool schema extraction and parsing processes.

Signed-off-by: dnandakumar-nv <[email protected]>
Introduce a logger to the LLM module for better debugging and observability. Added a warning log and enforced `stream=False` when using the OpenAI Responses API, as streaming is not supported in this mode.

Signed-off-by: dnandakumar-nv <[email protected]>
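
The enforcement described above amounts to downgrading the flag with a warning. An illustrative sketch, with an assumed function name:

```python
import logging

logger = logging.getLogger(__name__)


def resolve_stream_flag(api_type: str, stream: bool) -> bool:
    """Force streaming off when the Responses API path cannot stream."""
    if api_type == "responses" and stream:
        logger.warning(
            "Streaming is not supported with the OpenAI Responses API in this "
            "module; forcing stream=False."
        )
        return False
    return stream
```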
Introduced a logger to the Llama Index LLM module to improve traceability. Added a warning for OpenAIResponses API users about the lack of support for AIQ callback handlers and the absence of intermediate step logging.

Signed-off-by: dnandakumar-nv <[email protected]>
Introduced a `ServerToolUseSchema` model for handling tool usage data from server-side tool calls. Updated the `LlamaIndexProfilerHandler` to process and log `built_in_tool_calls` from response metadata. Simplified logging and removed unnecessary warnings in the LLM plugin code.

Signed-off-by: dnandakumar-nv <[email protected]>
Moved the `ServerToolUseSchema` class to `intermediate_step.py` for reusability across modules. Adjusted imports and removed redundant definitions in `langchain_callback_handler.py` and `llama_index_callback_handler.py`.

Signed-off-by: dnandakumar-nv <[email protected]>
Introduced `_validate_no_responses_api` to ensure LLM configurations do not use the unsupported Responses API for Semantic Kernel, CrewAI, and Agno connectors. Updated specific workflow registrations to integrate this validation logic, improving robustness and preventing misconfigurations.

Signed-off-by: dnandakumar-nv <[email protected]>
This update introduces comprehensive unit tests for multiple frameworks, including LLaMA-Index, CrewAI, LangChain, Semantic Kernel, and Agno. Key additions include validation for API types, parameter passthroughs, and decorator registrations, ensuring correctness and reliability of wrapper implementations.

Signed-off-by: dnandakumar-nv <[email protected]>
@dnandakumar-nv marked this pull request as ready for review July 8, 2025 14:04
dnandakumar-nv and others added 4 commits August 8, 2025 17:49
`google-search-results` has been removed from the dependencies and lock file as it is no longer required. This update simplifies dependency management and reduces unnecessary package overhead.
Upgraded several dependencies including `openai`, `llama-index`, `langchain-openai`, and others to their latest compatible versions. Added new dependencies such as `banks` and `griffe` where needed. Ensures compatibility, security, and access to the latest features.

Signed-off-by: dnandakumar-nv <[email protected]>
@dnandakumar-nv (Contributor Author)

/ok to test a46457f

This commit introduces various `pylint: disable` statements to suppress specific warnings in multiple files. Additionally, it updates the dependency versions in `pyproject.toml` and simplifies metadata in `uv.lock` by removing redundant upload-time information from package entries. These changes aim to enhance code clarity and ensure compatibility with updated libraries.

Signed-off-by: dnandakumar-nv <[email protected]>
@dnandakumar-nv (Contributor Author)

/ok to test e6da879

dnandakumar-nv and others added 2 commits August 13, 2025 10:01
# Conflicts:
#	examples/frameworks/multi_frameworks/pyproject.toml
#	packages/nvidia_nat_agno/src/nat/plugins/agno/llm.py
#	packages/nvidia_nat_agno/tests/test_llm_agno.py
#	packages/nvidia_nat_crewai/src/nat/plugins/crewai/llm.py
#	packages/nvidia_nat_langchain/src/nat/plugins/langchain/llm.py
#	packages/nvidia_nat_llama_index/src/nat/plugins/llama_index/llm.py
#	packages/nvidia_nat_semantic_kernel/src/nat/plugins/semantic_kernel/llm.py
#	src/nat/data_models/openai_mcp.py
#	src/nat/profiler/callbacks/langchain_callback_handler.py
#	src/nat/profiler/callbacks/llama_index_callback_handler.py
#	src/nat/utils/responses_api.py
#	uv.lock
The revision number in the `uv.lock` file was reset to 1 for tracking consistency. Additionally, upload-time metadata was removed from all dependencies for cleaner file management and improved reproducibility.

Signed-off-by: dnandakumar-nv <[email protected]>
@dnandakumar-nv (Contributor Author)

/ok to test d20b108

dnandakumar-nv and others added 4 commits August 13, 2025 12:47
Introduce a new YAML configuration file for a tool-calling agent that integrates functionalities like Wikipedia search, current datetime retrieval, and Python code generation. This setup also includes evaluation workflows for accuracy, relevance, and groundedness using the specified metrics.

Signed-off-by: dnandakumar-nv <[email protected]>
Added support for OpenAI's Responses API, enabling use of built-in tools like Code Interpreter, and restructured tool configurations for cleaner workflows. Updated example YAML and README to illustrate Responses API usage, including NAT, built-in, and MCP tools. Enhanced LLM configuration to include gpt-5 models for broader compatibility.
Added "raising-format-tuple" and "comparison-with-callable" to pylint disable directives to suppress unnecessary warnings. This ensures the code adheres to the desired linting standards without being cluttered by irrelevant alerts.
@dnandakumar-nv (Contributor Author)

/ok to test 06370c7

Replaces old `aiq` commands with updated `nat` commands and adjusts an example input. Improves clarity by formatting tool references and field details with consistent code styling. Refines descriptions of built-in and remote tools for better readability.
@dnandakumar-nv (Contributor Author)

/ok to test 9149a2d

Replaces old `aiq` commands with updated `nat` commands and adjusts an example input. Improves clarity by formatting tool references and field details with consistent code styling. Refines descriptions of built-in and remote tools for better readability.
@dnandakumar-nv (Contributor Author)

/ok to test 605de7b

This commit introduces a new documentation file explaining the integration of OpenAI's Responses API with the NeMo Agent toolkit. It details LLM and agent configuration, tool usage (built-in, NAT, and MCP), and configurable options, providing examples for setup and execution.
Introduce tests to validate Responses API agent functionality, including tool binding, LLM capability checks, and workflow execution. These ensure proper integration of NAT tools, MCP tools, and built-in tools with the agent.
@dnandakumar-nv (Contributor Author)

/ok to test e786c12

Updated the workflows documentation to include the Responses API and Agent. Adjusted test script to disable specific pylint rules related to unused arguments and non-async context managers.
@dnandakumar-nv (Contributor Author)

/ok to test 66ff758

dnandakumar-nv and others added 6 commits August 13, 2025 19:44
Updated the workflows documentation to include the Responses API and Agent. Adjusted test script to disable specific pylint rules related to unused arguments and non-async context managers.
…pport

# Conflicts:
#	examples/agents/tool_calling/README.md
#	src/nat/profiler/callbacks/llama_index_callback_handler.py
…pport

# Conflicts:
#	packages/nvidia_nat_langchain/src/nat/plugins/langchain/llm.py
#	packages/nvidia_nat_llama_index/pyproject.toml
#	packages/nvidia_nat_llama_index/src/nat/plugins/llama_index/llm.py
#	packages/nvidia_nat_semantic_kernel/src/nat/plugins/semantic_kernel/llm.py
#	uv.lock
Revised error messages to reflect specific LLM support for API types. Updated multiple dependencies including `llama-index` and `langchain-openai` to their latest versions for improved compatibility and features.

Signed-off-by: dnandakumar-nv <[email protected]>