Python: Bug: Responses Agent image count issue #12848

@kenakamu

Description

Describe the bug
When using AzureResponsesAgent (or the OpenAI one), attaching more than 3 images fails with an error saying the request has more than 10 images.

To Reproduce

from semantic_kernel.agents import AzureResponsesAgent
from semantic_kernel.connectors.ai.open_ai import AzureOpenAISettings
from dotenv import load_dotenv
from semantic_kernel.contents import ChatMessageContent
from semantic_kernel.contents.image_content import ImageContent
from semantic_kernel.contents.text_content import TextContent
from semantic_kernel.contents.utils.author_role import AuthorRole

load_dotenv()



async def main():
    # Set up the client and model using Azure OpenAI Resources
    client = AzureResponsesAgent.create_client()

    # Create the AzureResponsesAgent instance using the client and the model
    agent = AzureResponsesAgent(
        ai_model_id=AzureOpenAISettings().responses_deployment_name,
        client=client,
        instructions="You are a helpful assistant",
        name="image_agent",
    )

    thread = None

    user_message = ChatMessageContent(
        role=AuthorRole.USER,
        items=[
            TextContent(text="How many pictures do you get?"),
            ImageContent(
                uri="https://upload.wikimedia.org/wikipedia/commons/thumb/4/47/New_york_times_square-terabass.jpg/1200px-New_york_times_square-terabass.jpg"
            ),
            ImageContent(
                uri="https://upload.wikimedia.org/wikipedia/commons/b/b2/Skyscrapers_of_Shinjuku_2009_January.jpg"
            ),
            ImageContent(
                uri="https://upload.wikimedia.org/wikipedia/commons/b/b2/Skyscrapers_of_Shinjuku_2009_January.jpg"
            ),
            ImageContent(
                uri="https://upload.wikimedia.org/wikipedia/commons/b/b2/Skyscrapers_of_Shinjuku_2009_January.jpg"
            )
        ],
    )


    response = await agent.get_response(messages=user_message, thread=thread)
    print(f"# {response.name}: {response.content}")
    # Update the thread so the previous response id is used
    thread = response.thread

if __name__ == "__main__":
    import asyncio
    asyncio.run(main())

Expected behavior
It should accept up to 10 images.

Platform

  • Language: python
  • Source: main branch of repository
  • AI model: o4-mini

Additional context
The cause is that the following line runs on every iteration of the loop, so the same contents are added repeatedly:

response_inputs.append({"role": original_role, "content": contents})

It should run once, outside of the loop.
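The duplication described above can be illustrated with a simplified sketch. This is not the actual Semantic Kernel code: the function names `build_inputs_buggy`, `build_inputs_fixed`, and `count_images`, the dict shapes, and the example.com URLs are all hypothetical and only mirror the `response_inputs.append(...)` pattern quoted in this issue.

```python
# Hypothetical, simplified reproduction of the pattern described in this issue.
# None of these helpers exist in Semantic Kernel; they only illustrate the bug.

def build_inputs_buggy(role, items):
    response_inputs = []
    contents = []
    for item in items:
        contents.append(item)
        # Bug: runs on every iteration, so the same list is appended repeatedly.
        response_inputs.append({"role": role, "content": contents})
    return response_inputs


def build_inputs_fixed(role, items):
    contents = []
    for item in items:
        contents.append(item)
    # Fix: append once, after the loop has collected all items.
    return [{"role": role, "content": contents}]


def count_images(response_inputs):
    # Count image entries across every message in the payload.
    return sum(
        1
        for entry in response_inputs
        for part in entry["content"]
        if part["type"] == "input_image"
    )


# One text item plus four images (placeholder URLs).
items = [{"type": "input_text", "text": "How many pictures do you get?"}] + [
    {"type": "input_image", "image_url": f"https://example.com/{i}.jpg"}
    for i in range(4)
]

print(count_images(build_inputs_buggy("user", items)))  # 20
print(count_images(build_inputs_fixed("user", items)))  # 4
```

In this sketch the buggy version emits one message per item, each pointing at the same full contents list, so four attached images are counted many times over. The exact multiplier in the real agent code may differ from this simplified model.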

Labels

agents, bug (Something isn't working), python (Pull requests for the Python Semantic Kernel)
