Labels
agents, bug (Something isn't working), python (Pull requests for the Python Semantic Kernel)
Description
Describe the bug
When calling the invoke_stream method in chat_completion_agent.py, the usage information isn't being sent back in the final chunk.
It appears the issue is with lines 453-460 in chat_completion_agent.py:
if (
    role == AuthorRole.ASSISTANT
    and response.items
    and not any(
        isinstance(item, (FunctionCallContent, FunctionResultContent)) for item in response.items
    )
):
    yield AgentResponseItem(message=response, thread=thread)
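If response.items is empty, which appears to be the case for the final chunk that carries the usage information, the condition fails and the usage information is never yielded.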
To Reproduce
Call the invoke_stream method.
Expected behavior
The streaming usage information should be returned back in the final chunk.
Screenshots
Current behavior (showing zero token usage):
Desired behavior (achieved by commenting out line 455 in chat_completion_agent.py):
if (
    role == AuthorRole.ASSISTANT
    # and response.items
    and not any(
        isinstance(item, (FunctionCallContent, FunctionResultContent)) for item in response.items
    )
):
    yield AgentResponseItem(message=response, thread=thread)
Platform
- Language: Python
- Source: Semantic Kernel v1.29.0
- AI model: gpt-4o-mini-2024-07-18
- IDE: VS Code
- OS: Mac
Additional context
Commenting out line 455 in chat_completion_agent.py did yield the usage information, but it also let additional "empty" partial responses through.
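One possible direction, offered as a rough sketch rather than a tested patch: keep the items check, but also let a chunk through when it has no items yet carries usage data. The snippet below uses stand-in classes, not the real Semantic Kernel types. Chunk, FunctionCall, should_yield, and the assumption that usage lives under metadata["usage"] are all hypothetical and would need to be checked against the actual response object.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class Chunk:
    """Hypothetical stand-in for the streamed response object."""

    role: str
    items: list[Any] = field(default_factory=list)
    # Assumption: usage, when present, is stored under metadata["usage"].
    metadata: dict[str, Any] = field(default_factory=dict)


class FunctionCall:
    """Hypothetical stand-in for FunctionCallContent / FunctionResultContent."""


def should_yield(chunk: Chunk) -> bool:
    """Decide whether a streamed chunk should be passed on to the caller."""
    if chunk.role != "assistant":
        return False
    # Function call and function result chunks are still filtered out.
    if any(isinstance(item, FunctionCall) for item in chunk.items):
        return False
    has_usage = chunk.metadata.get("usage") is not None
    # Yield chunks with content, plus item-less chunks that carry usage.
    return bool(chunk.items) or has_usage


text_chunk = Chunk(role="assistant", items=["Hello"])
empty_chunk = Chunk(role="assistant")
usage_chunk = Chunk(role="assistant", metadata={"usage": {"total_tokens": 42}})
tool_chunk = Chunk(role="assistant", items=[FunctionCall()])
```

With a check along these lines, usage_chunk would be yielded while empty_chunk would still be skipped, which would avoid the extra "empty" partial responses seen when line 455 is simply commented out.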
uluvtu