
Python: Bug: Argument Escaping in Semantic Kernel Prompt Templates #11821

@LeonardHd

Description

Describe the bug

When using Semantic Kernel (SK) prompt templates, only arguments of type str are escaped. However, SK accepts arguments of any type (e.g., int, float, list, dict), and non-string arguments are inserted into the rendered prompt without escaping. If the string representation of such an argument contains XML special characters, the rendered prompt can no longer be parsed as structured chat history, and SK silently falls back to treating the entire prompt as a single user message.

Impact:

  • Non-string arguments containing special characters can break prompt rendering or the chat-history structure.
  • The issue is subtle: it causes no runtime failure and surfaces only as a log message, making it easy to overlook and hard to trace, yet it can significantly change model behavior.
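
The asymmetry is easy to demonstrate without Semantic Kernel. A str argument passes through an XML escaping step before rendering, while a non-string argument is inserted via its string representation, keeping the raw special characters. A minimal standard-library sketch (the `escape` call here stands in for the template engine's escaping, not SK's actual code path):

```python
from xml.sax.saxutils import escape

text = "This <& is a test document."

# What happens to a str argument: it is escaped before rendering.
escaped = escape(text)
# -> "This &lt;&amp; is a test document."

# What happens to a dict argument: its string representation is
# inserted verbatim, so '<' and '&' survive into the rendered prompt.
raw = str({"text": text})
# -> "{'text': 'This <& is a test document.'}"
```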

To Reproduce

The issue is demonstrated in two code snippets:

  1. An end-to-end example from our codebase where the problem was discovered. To replicate the issue, run the code and observe the logging output:

    INFO:semantic_kernel.functions.kernel_function:Function example_plugin-example_function invoking.
    INFO:semantic_kernel.contents.chat_history:Could not parse prompt You are a helpful assistant. Answer the user's questions.
    
    {'text': 'This <& is a test document. It contains some information that might be useful for the assistant.'}
    
    <chat_history /> as xml, treating as text, error was: not well-formed (invalid token): line 3, column 16
    
    import asyncio
    import logging
    import os
    from textwrap import dedent
    
    from azure.identity import DefaultAzureCredential, get_bearer_token_provider
    from dotenv import load_dotenv
    from semantic_kernel import Kernel
    from semantic_kernel.connectors.ai.open_ai import AzureChatCompletion
    from semantic_kernel.contents import ChatHistory
    from semantic_kernel.functions import KernelArguments, KernelFunction
    from semantic_kernel.prompt_template import InputVariable, PromptTemplateConfig
    
    logging.basicConfig(level=logging.INFO)
    
    PROMPT = dedent(
        """\
        You are a helpful assistant. Answer the user's questions.
    
        {{$source_document}}
    
        {{$chat_history}}
        """
    )
    
    kernel_function = KernelFunction.from_prompt(
        function_name="example_function",
        plugin_name="example_plugin",
        prompt_template_config=PromptTemplateConfig(
            template=PROMPT,
            input_variables=[
                InputVariable(
                    name="source_document",
                    description="Source document to use for the answer.",
                    is_required=True,
                ),
                InputVariable(
                    name="chat_history",
                    description="The chat history so far.",
                    is_required=True,
                ),
            ],
        ),
    )
    
    
    load_dotenv()
    
    DEPLOYMENT_NAME = os.getenv("AZURE_CHAT_COMPLETION_DEPLOYMENT_NAME")
    AZURE_CHAT_COMPLETION_BASE_URL = os.getenv("AZURE_CHAT_COMPLETION_BASE_URL")
    
    
    kernel = Kernel()
    kernel.add_service(
        AzureChatCompletion(
            service_id="example_service",
            deployment_name=DEPLOYMENT_NAME,
            ad_token_provider=get_bearer_token_provider(
                DefaultAzureCredential(),
                "https://cognitiveservices.azure.com/.default",
            ),
            base_url=AZURE_CHAT_COMPLETION_BASE_URL,
        )
    )
    
    chat_history = ChatHistory()
    
    source_document = {
        "text": "This <& is a test document. It contains some information that might be useful for the assistant."
    }
    
    
    async def main() -> None:
        result = await kernel.invoke(
            function=kernel_function,
            arguments=KernelArguments(
                source_document=source_document, chat_history=chat_history
            ),
        )
        print(result)
    
    
    asyncio.run(main())
  2. A set of pytest cases demonstrating the issue, mimicking the kernel.invoke implementation, which first renders the prompt and then builds a chat history from the rendered prompt. One test case fails because the rendered prompt cannot be parsed as chat history.

    import os
    from textwrap import dedent
    import pytest
    from semantic_kernel.connectors.ai.open_ai import AzureChatCompletion
    from azure.identity import DefaultAzureCredential, get_bearer_token_provider
    from semantic_kernel.contents import ChatHistory, AuthorRole
    from dotenv import load_dotenv
    from semantic_kernel.functions import KernelArguments, KernelFunction
    from semantic_kernel.prompt_template import InputVariable, PromptTemplateConfig
    
    load_dotenv()
    DEPLOYMENT_NAME = os.getenv("AZURE_CHAT_COMPLETION_DEPLOYMENT_NAME")
    AZURE_CHAT_COMPLETION_BASE_URL = os.getenv("AZURE_CHAT_COMPLETION_BASE_URL")
    
    PROMPT = dedent(
        """\
        You are a helpful assistant. Answer the user's questions.
    
        {{$source_document}}
    
        {{$chat_history}}
        """
    )
    
    kernel_function = KernelFunction.from_prompt(
        function_name="example_function",
        plugin_name="example_plugin",
        prompt_template_config=PromptTemplateConfig(
            template=PROMPT,
            input_variables=[
                InputVariable(
                    name="source_document",
                    description="Source document to use for the answer.",
                    is_required=True,
                ),
                InputVariable(
                    name="chat_history",
                    description="The chat history so far.",
                    is_required=True,
                ),
            ],
        ),
    )
    
    @pytest.mark.asyncio
    @pytest.mark.parametrize(
        "source_document",
        [
            pytest.param(
                {"text": "This is a test document."},
                id="simple_document_dict"
            ),
            pytest.param(
                "This is a test document.",
                id="simple_document_str"
            ),
            pytest.param(
                {"text": "This is a test document with special characters: !@#$%^&*()<>"},
                id="xml_special_characters_dict"
            ),
            pytest.param(
                {"text": "This is a test document"},
                id="no_special_characters_dict"
            ),
            pytest.param(
                "This is a test document with special characters: !@#$%^&*()<>",
                id="xml_special_characters_str"
            ),
            pytest.param(
                "{\"text\": \"This is a test document with special characters: !@#$%^&*()<>\"}",
                id="json_special_characters_str"
            ),
        ],
    )
    async def test_render_prompt(source_document):
        # Arrange
        from semantic_kernel.functions import KernelArguments, KernelFunction, FunctionResult
        from semantic_kernel.filters.functions.function_invocation_context import FunctionInvocationContext
        from semantic_kernel.kernel import Kernel
        FunctionInvocationContext.model_rebuild()
    
        chat_history = ChatHistory()
        chat_history.add_user_message("What type of document is this?")
        
        kernel = Kernel()
        kernel.add_service(
            AzureChatCompletion(
                service_id="example_service",
                deployment_name=DEPLOYMENT_NAME,
                ad_token_provider=get_bearer_token_provider(
                    DefaultAzureCredential(),
                    "https://cognitiveservices.azure.com/.default",
                ),
                base_url=AZURE_CHAT_COMPLETION_BASE_URL,
            )
        )
    
        rendered_prompt_result = await kernel_function._render_prompt(
            context=FunctionInvocationContext(
                function=kernel_function,
                kernel=kernel,
                arguments=KernelArguments(
                    source_document=source_document,
                    chat_history=chat_history,
                ),
            )
        )
    
        # Act
        chat_history = ChatHistory.from_rendered_prompt(rendered_prompt_result.rendered_prompt)
    
        # Assert
        assert chat_history is not None
        assert chat_history.messages[0].role == AuthorRole.SYSTEM
        assert chat_history.messages[1].role == AuthorRole.USER

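The ParseError in the log above can be reproduced with nothing but the standard library, which shows that any rendered prompt containing an unescaped `<` or `&` will fail XML parsing. This is an illustration of the failure mode, not SK's actual parser:

```python
import xml.etree.ElementTree as ET

# Simulate a rendered prompt into which a dict argument was inserted verbatim.
rendered = "<chat_history>{}</chat_history>".format(
    str({"text": "This <& is a test document."})
)

try:
    ET.fromstring(rendered)
    parsed = True
except ET.ParseError as exc:
    # Same failure mode as the SK log: "not well-formed (invalid token)".
    parsed = False
    print(f"not well-formed: {exc}")
```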
Expected behavior

Framework users expect all arguments to be safely escaped, regardless of their type.
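
Until the framework escapes all argument types, one workaround is to pre-escape the string form of non-string arguments before handing them to KernelArguments. A sketch; `safe_arg` is a hypothetical helper, not part of Semantic Kernel:

```python
from xml.sax.saxutils import escape


def safe_arg(value):
    """Hypothetical helper: make any argument safe for XML-based prompt parsing.

    str values are left alone (the template engine already escapes them);
    everything else is stringified and escaped explicitly.
    """
    if isinstance(value, str):
        return value
    return escape(str(value))


# Usage sketch:
#   KernelArguments(source_document=safe_arg(source_document), ...)
```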

Screenshots
N/A

Platform

  • Language: Python
  • Source: 1.29.0
  • AI model: n/a


Labels

  • bug: Something isn't working
  • python: Pull requests for the Python Semantic Kernel
