
Conversation

ltwlf
Contributor

@ltwlf ltwlf commented Aug 7, 2025

Summary

Adds reasoning capabilities to the Python OpenAI ResponsesAgent, bringing it to parity with the C# implementation. This enables fine-grained control over reasoning effort for O-series models (o1, o3-mini, o4-mini) and GPT-5, which support the reasoning parameter.

Fixes #12843

Changes

Core Implementation

  • openai_responses_agent.py: Added reasoning configuration support
    • Constructor-level reasoning effort setting
    • Per-invocation reasoning effort override capability
    • Proper parameter validation and model compatibility checks
  • responses_agent_thread_actions.py: Extended thread actions to support reasoning parameters
    • Reasoning effort propagation through thread operations
    • Metadata preservation for reasoning tokens and summaries

Sample and Tests

  • responses_agent_reasoning.py: Comprehensive demonstration sample
    • Constructor vs per-invocation reasoning configuration
    • Function calling integration with reasoning
    • Reasoning comparison scenarios (low/medium/high effort)
    • Error handling and troubleshooting guidance
  • test_openai_responses_agent_reasoning.py: Full unit test coverage
    • Parameter validation tests
    • Integration scenarios with function calling
    • Edge cases and error conditions

Features

Constructor-Level Reasoning: Set default reasoning effort when creating agents
Per-Invocation Override: Override reasoning effort per request
Priority Hierarchy: per-invocation > constructor > model default
Function Calling Compatible: Works seamlessly with existing plugin system
Azure OpenAI & OpenAI Support: Compatible with both service providers
Model Validation: Automatic compatibility checks for O-series models and GPT-5
Metadata Access: Reasoning tokens and summaries available in response metadata

Usage Example

# Constructor-level reasoning configuration
agent = AzureResponsesAgent(
    ai_model_id="gpt-5",
    client=client,
    reasoning={"effort": "low"}  # Default reasoning for all requests
)

# Per-invocation override
response = await agent.invoke(
    "Solve this complex problem step by step",
    reasoning={"effort": "high"}  
)


# Invoke with reasoning callback to capture intermediate thoughts
response = await agent.invoke(
    "Analyze this data step by step",
    reasoning={"effort": "high", "summary": "detailed"},
    on_intermediate_message=handle_reasoning_message
)

# Streaming with reasoning
async for response in agent.invoke_stream(
    "Explain quantum computing in detail",
    reasoning={"effort": "high", "summary": "detailed"},
    on_intermediate_message=handle_reasoning_message
):
    print(response.content, end="", flush=True)
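
The calls above pass a handle_reasoning_message callback that is never defined in this description. A minimal sketch of one, assuming intermediate messages arrive as ChatMessageContent instances whose items may include the ReasoningContent type added by this PR (the exact item layout is an assumption):

from semantic_kernel.contents import ChatMessageContent, ReasoningContent

async def handle_reasoning_message(message: ChatMessageContent) -> None:
    # Surface any reasoning summaries returned as intermediate output.
    for item in message.items or []:
        if isinstance(item, ReasoningContent):
            print(f"[reasoning] {item.text}")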

@moonbox3 moonbox3 added the python Pull requests for the Python Semantic Kernel label Aug 7, 2025
@ltwlf ltwlf force-pushed the feature/response-reasoning branch 2 times, most recently from 6a7b8a7 to 492975f on August 7, 2025 08:32
@ltwlf ltwlf marked this pull request as ready for review August 7, 2025 08:33
@Copilot Copilot AI review requested due to automatic review settings August 7, 2025 08:33
@ltwlf ltwlf requested a review from a team as a code owner August 7, 2025 08:33
@ltwlf
Contributor Author

ltwlf commented Aug 7, 2025

@eavanvalkenburg @markwallace-microsoft this is the PR for reasoning support for the ResponsesAgent that I mentioned yesterday in Office Hours. Looking forward to your feedback.
Best, Christian

@moonbox3
Collaborator

moonbox3 commented Aug 7, 2025

Python Test Coverage

Python Test Coverage Report

File | Stmts | Miss | Cover | Missing
__init__.py | 4 | 0 | 100%
const.py | 7 | 0 | 100%
agents/open_ai
   openai_responses_agent.py | 417 | 104 | 75% | 67, 96, 115–118, 126, 130–131, 163–167, 171, 176, 181, 186, 192, 197–198, 201, 204, 209–214, 219–220, 222–223, 229–232, 238, 240–241, 245–249, 350, 354, 358, 360, 366, 379, 381, 383, 387, 484–485, 492–493, 495–496, 498–499, 501–506, 508, 556–557, 560, 605, 621, 631, 665, 673, 676, 680, 682, 685, 708, 711–715, 725, 765, 790, 801, 890, 1013, 1057, 1059, 1133, 1177–1178, 1180–1184, 1203
   responses_agent_thread_actions.py | 437 | 126 | 71% | 177, 194, 207, 218–219, 227, 234, 247, 407–408, 410, 424, 429, 436, 451–452, 462, 469, 471–472, 481, 488, 490–491, 502, 509, 511–512, 522, 529, 534–535, 539, 551, 585–587, 590, 608–609, 620–622, 626, 630–631, 635–637, 648–650, 672, 674–675, 682–683, 685–686, 688, 690–691, 693–694, 792, 797, 805–806, 808, 814, 819, 824–826, 832–834, 839–840, 842–847, 849, 853–854, 856, 860, 864, 869–870, 875, 878–879, 939–942, 965, 1011–1014, 1055–1056, 1060–1062, 1064, 1089–1090, 1092–1096, 1098, 1108–1109, 1117, 1121, 1128, 1134, 1210
contents
   __init__.py | 22 | 0 | 100%
   chat_message_content.py | 137 | 0 | 100%
   const.py | 29 | 0 | 100%
   reasoning_content.py | 24 | 6 | 75% | 46–48, 53, 55, 59
   streaming_reasoning_content.py | 12 | 0 | 100%
TOTAL | 27050 | 4679 | 82%

Python Unit Test Overview

Tests: 3715 | Skipped: 22 | Failures: 0 | Errors: 0 | Time: 1m 37s

@ltwlf
Contributor Author

ltwlf commented Aug 7, 2025

GPT-5 reasoning implementation tested and working flawlessly!

Key findings:

  • GPT-5 supports reasoning with reasoning_effort="low" for structured responses
  • Works great without reasoning too
  • Function calling performs excellently with calculator demos (see the sketch at the end of this comment)

Model behavior:

  • O-series (o1, o3-mini, o4-mini): Auto-reasoning when unspecified
  • GPT-5: Reasoning support with new minimal effort option
  • GPT-4.1: Fails with reasoning params (expected)
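
The automatic model check mentioned in the Features list isn't spelled out in this thread; as a rough sketch, not the PR's actual implementation, the behavior above could be approximated with a prefix test over the model families named here:

REASONING_MODEL_PREFIXES = ("o1", "o3", "o4", "gpt-5")

def supports_reasoning(ai_model_id: str) -> bool:
    # Only these families accept the reasoning parameter; gpt-4.1 and other
    # non-reasoning models fall through and return False.
    return ai_model_id.lower().startswith(REASONING_MODEL_PREFIXES)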

Implementation notes:

  • Priority hierarchy handles GPT-5 seamlessly
  • OpenAIResponsesAgent(ai_model_id="gpt-5") works out of the box
  • Explicit reasoning leverages GPT-5's enhanced capabilities

Result: Reasoning feature is ready for OpenAI's latest models and gracefully handles both current O-series and future reasoning-capable models.
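
The calculator demo referenced in the key findings isn't reproduced here; below is a minimal sketch of pairing a plugin with reasoning, assuming the constructor accepts a plugins list alongside the reasoning argument from the PR description (the plugin and prompt are illustrative):

from typing import Annotated

from semantic_kernel.functions import kernel_function

class CalculatorPlugin:
    @kernel_function(description="Add two integers.")
    def add(
        self,
        a: Annotated[int, "First addend"],
        b: Annotated[int, "Second addend"],
    ) -> int:
        return a + b

agent = OpenAIResponsesAgent(
    ai_model_id="gpt-5",
    client=client,
    plugins=[CalculatorPlugin()],
    reasoning={"effort": "low"},
)

response = await agent.get_response("Use the calculator to add 19 and 23.")
print(response.message.content)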

@ltwlf ltwlf changed the title Python: Add reasoning support for OpenAI Responses Agents (o3-mini, o1) Python: Add reasoning support for OpenAI Responses Agents (o3-mini, o1, GPT-5) Aug 8, 2025
@ltwlf ltwlf changed the title Python: Add reasoning support for OpenAI Responses Agents (o3-mini, o1, GPT-5) Python: Add reasoning support for OpenAI Responses Agents (GPT-5, o3-mini, o3) Aug 8, 2025
@ltwlf ltwlf changed the title Python: Add reasoning support for OpenAI Responses Agents (GPT-5, o3-mini, o3) Python: Add reasoning support for OpenAI Responses Agents (GPT-5, o4-mini, o3) Aug 8, 2025
@ltwlf ltwlf force-pushed the feature/response-reasoning branch from aa82b4f to 8ded72a on August 9, 2025 08:41
@ltwlf ltwlf requested a review from a team as a code owner August 9, 2025 08:41
@moonbox3 moonbox3 added .NET Issue or Pull requests regarding .NET code kernel Issues or pull requests impacting the core kernel documentation labels Aug 9, 2025
@github-actions github-actions bot changed the title Python: Add reasoning support for OpenAI Responses Agents (GPT-5, o4-mini, o3) .Net: Python: Add reasoning support for OpenAI Responses Agents (GPT-5, o4-mini, o3) Aug 9, 2025
@ltwlf ltwlf changed the title .Net: Python: Add reasoning support for OpenAI Responses Agents (GPT-5, o4-mini, o3) Python: Add reasoning support for OpenAI Responses Agents (GPT-5, o4-mini, o3) Aug 9, 2025
@ltwlf ltwlf force-pushed the feature/response-reasoning branch from 8ded72a to ec28c06 on August 9, 2025 08:47
@moonbox3 moonbox3 removed .NET Issue or Pull requests regarding .NET code kernel Issues or pull requests impacting the core kernel documentation labels Aug 9, 2025
@ltwlf ltwlf force-pushed the feature/response-reasoning branch from ec28c06 to 2dda739 on August 9, 2025 08:51
@ltwlf ltwlf requested a review from Copilot August 9, 2025 08:52
Contributor

@Copilot Copilot AI left a comment

Pull Request Overview

This PR adds reasoning support to Python OpenAI ResponsesAgent to achieve parity with the C# implementation. It enables fine-grained control over reasoning effort for O-series models that support the reasoning parameter.

  • Adds constructor-level and per-invocation reasoning effort configuration with priority hierarchy
  • Implements comprehensive reasoning content handling and metadata extraction
  • Provides extensive test coverage and practical usage examples

Reviewed Changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 4 comments.

Summary per file:

File | Description
test_openai_responses_agent_reasoning.py | Comprehensive unit tests for reasoning functionality validation
reasoning_content.py | New content type for handling reasoning output from O-series models
const.py | Added reasoning content type constant
chat_message_content.py | Updated to support reasoning content in messages
__init__.py | Exposed ReasoningContent in public API
responses_agent_thread_actions.py | Core reasoning logic implementation and API integration
openai_responses_agent.py | Agent-level reasoning configuration and validation
responses_agent_reasoning.py | Demonstration sample showing reasoning capabilities

@angangwa

angangwa commented Aug 13, 2025

Hello, thanks for working on this!

With this PR, would the reasoning summary parameter also work? If this is already supported, it would be super helpful to get an example!

response = await client.responses.create(   
  model="o4-mini",
  input="What is the capital of France.",
  reasoning={
    "effort": "high",
    # SUPPORTS "auto", "concise", or "detailed" 
    # GATED: will raise an ERROR if the org/subscription is not allowlisted (on both OpenAI and Azure)
    "summary": "auto"
  }
)

@ltwlf
Contributor Author

ltwlf commented Aug 14, 2025

@angangwa yes, I'm just working on this!

This commit adds complete reasoning functionality to the OpenAI ResponsesAgent:

Core Features:
- Add ReasoningContent and StreamingReasoningContent classes with proper SK conventions
- Implement reasoning callback mechanism with on_intermediate_message parameter
- Support streaming reasoning events (delta and done) in invoke_stream
- Add reasoning item extraction and yield pattern (False for intermediate, True for final)
- Export reasoning content types in contents package

Implementation Details:
- Fix metadata merging bug in StreamingReasoningContent addition
- Follow SK patterns with StreamingContentMixin + BaseContent inheritance
- Maintain vendor neutrality without OpenAI-specific dependencies
- Add reasoning configuration with priority hierarchy (per-invocation > constructor)
- Support reasoning-capable models (gpt-5, o3, o1-mini) with proper error handling

Testing & Examples:
- Add comprehensive test coverage (31 tests) for all reasoning functionality
- Create clean sample demonstrating reasoning with dual OpenAI/Azure support
- Test content creation, streaming, callbacks, error conditions, and integration flows
- Validate reasoning configuration priority, multi-agent isolation, and edge cases

API Enhancements:
- Extend invoke() and invoke_stream() methods with reasoning parameters
- Add reasoning item processing in ResponsesAgentThreadActions
- Support reasoning effort configuration and summary options
- Implement proper reasoning content extraction from OpenAI responses
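
Not part of the commit itself, but the metadata-merging fix above can be illustrated with a small sketch; the constructor arguments are assumptions, and + is assumed to behave like other SK streaming content types:

from semantic_kernel.contents import StreamingReasoningContent

first = StreamingReasoningContent(choice_index=0, text="Consider the ", metadata={"step": 1})
second = StreamingReasoningContent(choice_index=0, text="base case first.", metadata={"tokens": 7})

merged = first + second
print(merged.text)      # "Consider the base case first."
print(merged.metadata)  # both metadata dicts preserved rather than overwritten, per the fix
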
@ltwlf ltwlf force-pushed the feature/response-reasoning branch from 9be069c to 36027a6 on August 14, 2025 13:07
@ltwlf
Contributor Author

ltwlf commented Aug 14, 2025

Hi @moonbox3 @eavanvalkenburg,

I've refactored the code and implemented the suggested changes with one exception: I've kept the constructor-level reasoning parameter. I believe this is useful for multi-agent collaboration scenarios where an orchestrator automatically invokes agents.
If you think otherwise, I can change it.
It would be interesting to have a smarter orchestrator that can decide what reasoning effort is needed based on context, but that's beyond the scope of this PR.

I've also removed the "minimal" effort option since I'm now using the OpenAI types directly. The "minimal" option will automatically become available again once we update to the latest OpenAI SDK.

I've tested this implementation with O-series models and GPT-5. After this PR is merged, I plan to bump the OpenAI version to the latest and then submit another PR adding GPT-5 verbosity support.

Could you please review the changes when you have a chance?

@dmytrostruk
Member

@ltwlf It looks like there are some code quality check failures in CI:
[screenshot]

ltwlf added a commit to ltwlf/semantic-kernel that referenced this pull request Aug 21, 2025
…tion

- Improve ReasoningContent docstring for better user understanding
- Make text property optional (str | None = None) for better API consistency
- Restore @override decorator to _create method as required by base class
- Refactor complex sample into focused, smaller examples:
  - responses_agent_reasoning.py: Basic non-streaming reasoning examples
  - responses_agent_reasoning_streaming.py: Streaming-specific examples
- Remove unnecessary complexity from samples while maintaining functionality
- Maintain backward compatibility and OpenAI API compliance

Addresses feedback from moonbox3, dmytrostruk, and eavanvalkenburg in PR microsoft#12881
@ltwlf
Contributor Author

ltwlf commented Aug 21, 2025

@ltwlf It looks like there are some code quality check failures in CI.

@dmytrostruk not sure what was wrong. Pre-commit checks were successful on my box:
[screenshot]

I've resolved the review comments. Hope it will be fine now 😅

@ltwlf ltwlf force-pushed the feature/response-reasoning branch from 634963b to 91bd65a on August 27, 2025 15:52
@moonbox3
Collaborator

@ltwlf we want to help get this PR across the finish line; however, we need the CI/CD code quality checks to pass. CI/CD is installing mypy==1.17.1; can you please verify that you have that version as well? Also, please try running the VS Code mypy task directly, outside of pre-commit.

[screenshot]

@moonbox3
Collaborator

@ltwlf It looks like there are some code quality check failures in CI.

@dmytrostruk not sure what was wrong. Pre-commit checks were successful on my box: [screenshot]

I've resolved the review comments. Hope it will be fine now 😅

Mypy isn’t run as part of pre-commit. It’s run separately in CI/CD. That’s why one should manually run the mypy task locally.

@ltwlf ltwlf force-pushed the feature/response-reasoning branch from b347c67 to 4738772 on August 28, 2025 07:55
@ltwlf
Contributor Author

ltwlf commented Aug 28, 2025

@ltwlf It looks like there are some code quality check failures in CI.

@dmytrostruk not sure what was wrong. Pre-commit checks were successful on my box: [screenshot]
I've resolved the review comments. Hope it will be fine now 😅

Mypy isn’t run as part of pre-commit. It’s run separately in CI/CD. That’s why one should manually run the mypy task locally.

[screenshot]

I hope mypy is finally happy. Thanks for your support!

@moonbox3
Collaborator

@ltwlf I checked out your branch and am able to see the mypy errors:

[screenshot]

@moonbox3
Collaborator

@ltwlf do you want me to fix these mypy issues?

@ltwlf
Contributor Author

ltwlf commented Aug 28, 2025

@ltwlf do you want me to fix these mypy issues?

@moonbox3 Yes, please—if you don’t mind.
Not sure why mypy works locally for me but not for others. Maybe I should reset my project setup.

@TaoChenOSU TaoChenOSU added this pull request to the merge queue Aug 28, 2025
Merged via the queue into microsoft:main with commit 5e50e19 Aug 28, 2025
28 checks passed
@moonbox3
Collaborator

Thanks for your support on this, @ltwlf, and seeing it through.

@ltwlf ltwlf deleted the feature/response-reasoning branch September 2, 2025 07:02
Labels: python (Pull requests for the Python Semantic Kernel)
Projects: none yet

Successfully merging this pull request may close these issues:
Python: Bug: AzureResponsesAgent with reasoning model doesn't work

6 participants