Refactor structured output API to make it more flexible, support native structured output for OpenAI and Google. #443

EugeneTheDev · 2025-07-14T23:13:51Z

Rewrite JSON schema generation from serializable classes to make it generally more extensible and flexible. This allows supporting different JSON schema formats for different LLM providers while still reusing common generation logic.
Update OpenAI and Google clients to support native structured output.
Refactor and improve the structured output API - PromptExecutor.executeStructured, AIAgentLLMSession.requestLLMStructured, nodeLLMRequestStructured - to make it easier to use with different modes (manual and native).

Fixes #366

nathanfallet · 2025-07-16T09:08:55Z

Once this is merged, it should unblock #377 and its PR #378 (just linking so I can track it and find it later)

github-actions · 2025-07-17T12:44:45Z

Qodana for JVM

419 new problems were found

Inspection name	Severity	Problems
`Check Kotlin and Java source code coverage`	🔶 Warning	414
`Vulnerable imported dependency`	🔶 Warning	4
`String concatenation that can be converted to string template`	◽️ Notice	1

@@ Code coverage @@
+ 67% total lines covered
9121 lines analyzed, 6127 lines covered
# Calculated according to the filters of your coverage tool


EugeneTheDev · 2025-07-18T23:42:57Z

I've added simple method overloads to the structured output API, allowing users to avoid always configuring all parameters manually and to rely on the framework to figure out the best approach instead. It is possible now to write very little code to get the structure in common cases where no fine-grained control is required:

// Prompt executor
promptExecutor.executeStructured<MyStruct>(prompt, model)

// Agent LLM context
llm.writeSession {
  requestLLMStructured<MyStruct>()
}

// Node
val getMyStruct by nodeLLMRequestStructured<MyStruct>()

Full example is in the StructuredOutputExample.kt.

@devcrocod can you please check, is it better now?

OfekTeken · 2025-07-19T10:13:00Z

This is perfect, I just have a small suggestion: in my experience, most of the time I'd also want to give a specific system message when generating structured output (instead of relying solely on the history).
Do you think these methods could accept an optional systemMessage parameter? Or there could be an alternative method, etc.
I think it would be pretty useful, thanks! 👍

EugeneTheDev · 2025-07-19T17:22:14Z

@OfekTeken sorry, I updated the original example: executeStructured has a similar signature to the other execute methods, with prompt and model. The other two methods also follow the common pattern in their respective locations. All requestLLM methods in an LLM session operate on the current prompt, and there are separate methods to update it. The node version accepts a String, so you can pass an input message to update the prompt.

devcrocod

I have some questions, mostly related to the design. I’d appreciate it if you could clarify them for me

devcrocod · 2025-07-23T13:59:39Z

.../prompt-structure/src/commonMain/kotlin/ai/koog/prompt/structure/PromptExecutorExtensions.kt

+ */
+public data class StructuredOutputConfig<T>(
+    public val default: StructuredOutput<T>? = null,
+    public val byProvider: Map<LLMProvider, StructuredOutput<T>> = emptyMap(),


Why is it LLMProvider here instead of LLModel?
As I see, in PromptExecutor, where the config will be used, all providers are already present

Hm, by using the model as a key we can specify the behavior more granularly, e.g. that Gemini Flash 2.5 Lite should use Native output, but Gemini Flash 2.5 Pro should use Manual. I'm not sure this is a realistic case, though. The main idea is that since providers might differ in JSON schema formats and in Native/no-Native support, specifying the behavior per provider only would be enough. I would say we can leave it like this for now and introduce a per-model mapping if there is a real need for it, wdyt?
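
To make the per-provider idea concrete, here is a minimal sketch of how a provider-keyed configuration with a default fallback could be resolved. All names in it (`Provider`, `OutputMode`, `OutputConfig`, `resolve`) are hypothetical stand-ins invented for illustration; they are not Koog's actual types, and the real resolution logic may differ.

```kotlin
// Hypothetical stand-ins for illustration only; these are not Koog's real types.
enum class Provider { OpenAI, Google, Anthropic }

sealed interface OutputMode {
    // The provider enforces the schema itself.
    data class Native(val schemaFormat: String) : OutputMode
    // The schema is described in the prompt and the response is parsed manually.
    data class Manual(val schemaFormat: String) : OutputMode
}

data class OutputConfig(
    val default: OutputMode? = null,
    val byProvider: Map<Provider, OutputMode> = emptyMap(),
) {
    // A provider-specific entry wins; otherwise fall back to the default.
    fun resolve(provider: Provider): OutputMode =
        byProvider[provider]
            ?: default
            ?: error("No structured output configured for $provider")
}

fun main() {
    val config = OutputConfig(
        default = OutputMode.Manual("standard"),
        byProvider = mapOf(
            Provider.OpenAI to OutputMode.Native("openai"),
            Provider.Google to OutputMode.Native("google"),
        ),
    )
    println(config.resolve(Provider.OpenAI))    // provider-specific entry
    println(config.resolve(Provider.Anthropic)) // falls back to the default
}
```

Switching the key from provider to model later would only change the map's key type in a sketch like this, which is one reason deferring the per-model mapping looks cheap.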


devcrocod · 2025-07-24T09:55:18Z

agents/agents-core/src/commonMain/kotlin/ai/koog/agents/core/agent/entity/AIAgentSubgraph.kt

+                config = StructuredOutputConfig(
+                    default = StructuredOutput.Manual(
+                        JsonStructuredData.createJsonStructure<SelectedTools>(
+                            schemaGenerator = FullJsonSchemaGenerator,


As I see it, there’s only one generator, so why are we choosing it here?
We could just use FullJsonSchemaGenerator by default, and if users want to use a custom one, they can specify it themselves

There are multiple types of generators, and (as I explained above) I again chose to be a bit more explicit here for the sake of clarity, to make it easier to understand exactly what is going on, since it's part of the core implementation. But now that I've made the Standard (ex-Full) JSON schema generator the default value, this can be removed if you want.


devcrocod · 2025-07-24T11:01:04Z

.../prompt-structure/src/commonMain/kotlin/ai/koog/prompt/structure/PromptExecutorExtensions.kt

+ * Used to attempt to get a proper generator in the simple version of [executeStructured] (that does not accept [StructuredOutput] explicitly)
+ * to attempt to generate an appropriate schema for the passed [KType].
+ */
+public val DefaultSimpleSchemaGenerators: Map<LLMProvider, JsonSchemaGenerator> = mapOf(


In my opinion, it’s not a good idea to use such global variables.

Can we put jsonSchemaGenerator into the companion object for OpenAI and Google respectively?

After some consideration I decided to use a "register pattern" (similar to ServiceLoader) so that LLM-provider-specific JSON schema generators can move to their clients' respective modules. So now these LLM clients register their generators (formats) in global maps. This is not an ideal solution, but it seems to strike a nice balance between stability and convenience. I marked these global maps with internal API annotations.
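
A rough sketch of what such a registration approach can look like is shown below. The names (`SchemaGeneratorRegistry`, `OpenAIClientModule`, `SchemaGenerator`) are hypothetical and only illustrate the general pattern; this is not the code from the PR, and it ignores thread safety for brevity.

```kotlin
// Hypothetical names for illustration only; not Koog's real API.
enum class Provider { OpenAI, Google, Anthropic }

fun interface SchemaGenerator {
    fun generate(typeName: String): String
}

// Global registry that provider modules populate when they are initialized.
// Not thread-safe; a real implementation would need synchronization.
object SchemaGeneratorRegistry {
    private val generators = mutableMapOf<Provider, SchemaGenerator>()

    fun register(provider: Provider, generator: SchemaGenerator) {
        generators[provider] = generator
    }

    fun find(provider: Provider): SchemaGenerator? = generators[provider]
}

// Each client module registers its own generator, so the core module
// does not need to depend on provider-specific code.
object OpenAIClientModule {
    fun init() {
        SchemaGeneratorRegistry.register(Provider.OpenAI) { name -> "openai-schema:$name" }
    }
}

fun main() {
    OpenAIClientModule.init()
    println(SchemaGeneratorRegistry.find(Provider.OpenAI)?.generate("MyStruct"))
    println(SchemaGeneratorRegistry.find(Provider.Anthropic)) // nothing registered
}
```

The trade-off is the one described above: lookups depend on registration having already happened, which is the price of keeping provider-specific code out of the shared module.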


devcrocod · 2025-07-24T11:08:58Z

...commonMain/kotlin/ai/koog/prompt/structure/json/generator/core/GenericJsonSchemaGenerator.kt

+ * Note: it does not handle nullability because these might be different in different schema specs.
+ * Implementations must handle these themselves.
+ */
+public abstract class GenericJsonSchemaGenerator : JsonSchemaGenerator() {


Will this work if I have a custom serializer and the descriptor is overridden?

The generate method accepts an instance of Json that is used to (de)serialize objects, so yes. Everything this Json instance knows about (registered subtypes, custom serializers, etc.) should be used properly by the generator.


devcrocod · 2025-07-24T11:15:13Z

.../commonMain/kotlin/ai/koog/prompt/structure/json/generator/core/SimpleJsonSchemaGenerator.kt

+ * It is recommended to use [SimpleJsonSchemaGenerator.Default] companion object, which provides a default instance,
+ * instead of creating new instances manually.
+ */
+public open class SimpleJsonSchemaGenerator : GenericJsonSchemaGenerator() {


The use of Simple/Full in the naming raises questions, since it’s hard for the user to understand what Simple means and why it’s considered simple without additional context

After some consideration, I decided to rename these as follows:

Simple -> Basic. "Basic" implementation of JSON schema, without additional capabilities such as special functions, nested objects only.

Full -> Standard. "Standard" implementation of JSON schema, with support for object definitions, refs, and certain special functions like oneOf/anyOf.
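
To illustrate the difference between the two styles, the sketch below builds both kinds of schema for a toy type model. It is a simplified illustration under its own assumptions: `Type`, `Str`, `Obj`, `basicSchema`, and `standardSchema` are made-up names, and the real generators operate on serialization descriptors rather than on a model like this.

```kotlin
// Toy type model for illustration only; it is not how Koog's generators
// represent types.
sealed interface Type
object Str : Type
data class Obj(val name: String, val props: Map<String, Type>) : Type

// "Basic" style: nested objects are inlined, with no definitions or refs.
fun basicSchema(type: Type): Map<String, Any> = when (type) {
    Str -> mapOf("type" to "string")
    is Obj -> mapOf(
        "type" to "object",
        "properties" to type.props.mapValues { (_, t) -> basicSchema(t) },
    )
}

// "Standard" style: each object is described once under $defs
// and referenced everywhere else through $ref.
fun standardSchema(root: Obj): Map<String, Any> {
    val defs = linkedMapOf<String, Any>()

    fun visit(type: Type): Map<String, Any> = when (type) {
        Str -> mapOf("type" to "string")
        is Obj -> {
            if (type.name !in defs) {
                defs[type.name] = mapOf(
                    "type" to "object",
                    "properties" to type.props.mapValues { (_, t) -> visit(t) },
                )
            }
            mapOf("\$ref" to "#/\$defs/${type.name}")
        }
    }

    val rootRef = visit(root)
    return rootRef + mapOf("\$defs" to defs)
}

fun main() {
    val address = Obj("Address", mapOf("city" to Str))
    val person = Obj("Person", mapOf("name" to Str, "home" to address, "work" to address))
    println(basicSchema(person))    // Address is inlined twice
    println(standardSchema(person)) // Address is defined once and referenced twice
}
```

In the inlined form a type used in two places is repeated in full, while the definition-based form describes it once; that is the kind of capability gap the Basic/Standard naming is meant to signal.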

Is it better now?

Yes, but that’s just my personal preference :)

I’m curious how we can fill the entire JSON schema for a class? I mean all fields of json schema

…ve structured output for OpenAI and Google
* Add helper Gradle tasks to verify compilation on all targets
* Adjust envs loading in integration tests, fix tests
* Load envs in integration tests only for `jvmIntegrationTest` tasks, not for all Test tasks (we don't need credentials here)
* Simple structured output API that tries to find the best approach to get the structured response on its own, without requiring the users to specify structures and modes manually.

nathanfallet · 2025-08-11T22:05:15Z

Thank you! I'll have a look and get back to #378 soon (if it's still applicable; maybe it's no longer needed, I'll have to inspect the new approach)
EDIT: This is now available at #638

OfekTeken · 2025-08-12T20:57:04Z

Thanks! Can't wait to use this :)

chris-hatton · 2025-08-13T13:50:36Z

Can't wait to use this as well - without being dramatic it makes the difference between koog.ai being usable/unusable for my use cases! Thanks @EugeneTheDev let's goo..! 🚀 (KSlack post for context)

Fixes #377

Changes are the same as #378, it's just the updated version after the refactor of #443

Proposed changes:

Add an `excludedProperties` parameter like this:

```kotlin
simpleSchemaGenerator.generate(
    "TestClass",
    serializer<TestClass>(),
    descriptionOverrides = emptyMap(),
    excludedProperties = setOf("TestClass.someProperty")
)
```

---

#### Type of the change

- [x] New feature
- [ ] Bug fix
- [ ] Documentation fix

#### Checklist for all pull requests

- [x] The pull request has a description of the proposed change
- [x] I read the [Contributing Guidelines](https://github.com/JetBrains/koog/blob/main/CONTRIBUTING.md) before opening the pull request
- [x] The pull request uses **`develop`** as the base branch
- [x] Tests for the changes have been added
- [x] All new and existing tests passed

##### Additional steps for pull requests adding a new feature

- [x] An issue describing the proposed change exists
- [x] The pull request includes a link to the issue
- [ ] The change was discussed and approved in the issue
- [ ] Docs have been added / updated


EugeneTheDev requested a review from devcrocod July 16, 2025 14:59

EugeneTheDev force-pushed the eugenethedev/improve-structured-response branch 3 times, most recently from 99b8819 to 703baa3 Compare July 16, 2025 23:47

EugeneTheDev force-pushed the eugenethedev/improve-structured-response branch from 12b52ee to 3d2962f Compare July 18, 2025 23:30

EugeneTheDev requested a review from sdubov July 22, 2025 11:39

devcrocod reviewed Jul 24, 2025


EugeneTheDev force-pushed the eugenethedev/improve-structured-response branch 2 times, most recently from 3d531c9 to a749563 Compare July 27, 2025 14:09

EugeneTheDev force-pushed the eugenethedev/improve-structured-response branch from a749563 to abacafc Compare August 10, 2025 18:12

EugeneTheDev force-pushed the eugenethedev/improve-structured-response branch from abacafc to 5e7ea7e Compare August 10, 2025 21:02

EugeneTheDev requested a review from Rizzen August 11, 2025 14:41

Rizzen approved these changes Aug 11, 2025


EugeneTheDev merged commit c0ca25d into develop Aug 11, 2025
7 checks passed

EugeneTheDev deleted the eugenethedev/improve-structured-response branch August 11, 2025 18:38

nathanfallet mentioned this pull request Aug 21, 2025

Add support for excluding properties in JSON schema generation #638

Merged

12 tasks
