EugeneTheDev
Contributor

  • Rewrite JSON schema generation from serializable classes to make it generally more extensible and flexible. This allows supporting different JSON schema formats for different LLM providers while still reusing common generation logic.
  • Update OpenAI and Google clients to support native structured output.
  • Refactor and improve the structured output API (PromptExecutor.executeStructured, AIAgentLLMSession.requestLLMStructured, nodeLLMRequestStructured) to make it easier to use with different modes (manual and native).

Fixes #366

@nathanfallet
Contributor

When this is merged, it should unlock #377 and its PR #378 (just linking so I can track it and find it later).

@EugeneTheDev EugeneTheDev requested a review from devcrocod July 16, 2025 14:59
@EugeneTheDev EugeneTheDev force-pushed the eugenethedev/improve-structured-response branch 3 times, most recently from 99b8819 to 703baa3 Compare July 16, 2025 23:47

github-actions bot commented Jul 17, 2025

Qodana for JVM

419 new problems were found

| Inspection name | Severity | Problems |
|---|---|---|
| Check Kotlin and Java source code coverage | 🔶 Warning | 414 |
| Vulnerable imported dependency | 🔶 Warning | 4 |
| String concatenation that can be converted to string template | ◽️ Notice | 1 |
@@ Code coverage @@
+ 67% total lines covered
9121 lines analyzed, 6127 lines covered
# Calculated according to the filters of your coverage tool


@EugeneTheDev EugeneTheDev force-pushed the eugenethedev/improve-structured-response branch from 12b52ee to 3d2962f Compare July 18, 2025 23:30
@EugeneTheDev
Contributor Author

EugeneTheDev commented Jul 18, 2025

I've added simple method overloads to the structured output API, so users no longer have to configure every parameter manually and can rely on the framework to figure out the best approach instead. It is now possible to write very little code to get the structure in common cases where no fine-grained control is required:

// Prompt executor
promptExecutor.executeStructured<MyStruct>(prompt, model)

// Agent LLM context
llm.writeSession {
  requestLLMStructured<MyStruct>()
}

// Node
val getMyStruct by nodeLLMRequestStructured<MyStruct>()

The full example is in StructuredOutputExample.kt.
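One way to picture the "simple overload" idea is a convenience function that infers what it can and delegates to the fully configurable variant. The sketch below uses stand-in types only: `Mode`, `Model`, `StructuredRequest`, `buildRequest` and the `retries = 3` default are hypothetical names chosen for illustration and are not part of the Koog API.

```kotlin
// Hypothetical stand-ins; none of these names come from Koog.
enum class Mode { Native, Manual }

data class Model(val id: String, val supportsNative: Boolean)

data class StructuredRequest(val typeName: String, val mode: Mode, val retries: Int)

// Fully configurable variant: the caller decides everything.
fun buildRequest(typeName: String, mode: Mode, retries: Int): StructuredRequest =
    StructuredRequest(typeName, mode, retries)

// Simple overload: infers the type name and picks a mode from the model's capabilities.
inline fun <reified T> buildRequest(model: Model): StructuredRequest =
    buildRequest(
        typeName = T::class.simpleName ?: "Anonymous",
        mode = if (model.supportsNative) Mode.Native else Mode.Manual,
        retries = 3, // assumed default, chosen only for illustration
    )

data class MyStruct(val name: String)

fun main() {
    println(buildRequest<MyStruct>(Model("model-a", supportsNative = true)))
    println(buildRequest<MyStruct>(Model("model-b", supportsNative = false)))
}
```

Callers who need fine-grained control would still use the three-argument variant directly.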

@devcrocod could you please check whether it is better now?

@OfekTeken

This is perfect, I just have a small suggestion: in my experience, most of the time I'd also want to give a specific system message when generating structured output (instead of relying solely on the history).
Do you think these methods could accept an optional systemMessage parameter? Or there could be an alternative method, etc.
I think it would be pretty useful, thanks! 👍

@EugeneTheDev
Contributor Author

EugeneTheDev commented Jul 19, 2025

@OfekTeken sorry, I've updated the original example: executeStructured has a signature similar to the other execute methods, taking a prompt and a model. The other two methods also follow the common pattern in their respective locations. All requestLLM methods in the LLM session operate on the current prompt, and there are separate methods to update it. The node version accepts a String, so you can pass an input message to update the prompt.

@EugeneTheDev EugeneTheDev requested a review from sdubov July 22, 2025 11:39
Contributor

@devcrocod devcrocod left a comment

I have some questions, mostly related to the design. I’d appreciate it if you could clarify them for me.

*/
public data class StructuredOutputConfig<T>(
public val default: StructuredOutput<T>? = null,
public val byProvider: Map<LLMProvider, StructuredOutput<T>> = emptyMap(),
Contributor

Why is it LLMProvider here instead of LLModel?
As I see, in PromptExecutor, where the config will be used, all providers are already present

Contributor Author

Hm, by using the model as a key we can specify the behavior more granularly, e.g. that Gemini Flash 2.5 Lite should use Native output, but Gemini Flash 2.5 Pro should use Manual. But I'm not sure this is a realistic case. The main idea is that, since providers might differ in JSON schema formats and in Native/No Native support, specifying the behavior per provider only would be enough. I would say we can leave it like this for now and introduce a per-model mapping if there is a real need for it, wdyt?
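To make the per-provider lookup concrete, here is a minimal sketch of how such a config could be resolved: a provider-specific entry wins, otherwise the default is used. `Provider`, `Output`, `OutputConfig` and `resolve` are hypothetical stand-ins that only mirror the shape of the config under review; the actual resolution logic in the PR may differ.

```kotlin
// Hypothetical stand-ins that mirror the shape of the config under review.
enum class Provider { OpenAI, Google, Anthropic }

sealed interface Output {
    data class Native(val schemaFormat: String) : Output
    data class Manual(val schemaFormat: String) : Output
}

data class OutputConfig(
    val default: Output? = null,
    val byProvider: Map<Provider, Output> = emptyMap(),
) {
    // A provider-specific entry wins; otherwise fall back to the default.
    fun resolve(provider: Provider): Output =
        byProvider[provider]
            ?: default
            ?: error("No structured output configured for $provider")
}

fun main() {
    val config = OutputConfig(
        default = Output.Manual("standard"),
        byProvider = mapOf(
            Provider.OpenAI to Output.Native("openai-standard"),
            Provider.Google to Output.Native("google-basic"),
        ),
    )
    println(config.resolve(Provider.OpenAI))    // provider-specific entry
    println(config.resolve(Provider.Anthropic)) // falls back to the default
}
```

Switching the key from provider to model would only change the map's key type, which is why deferring that decision is cheap.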

config = StructuredOutputConfig(
default = StructuredOutput.Manual(
JsonStructuredData.createJsonStructure<SelectedTools>(
schemaGenerator = FullJsonSchemaGenerator,
Contributor

As I see it, there’s only one generator, so why are we choosing it here?
We could just use FullJsonSchemaGenerator by default, and if users want to use a custom one, they can specify it themselves

Contributor Author

@EugeneTheDev EugeneTheDev Jul 27, 2025

There are multiple types of generators, and I again (as I explained above) chose to be a bit more explicit here for the sake of clarity, to make it easier to understand exactly what is going on, since it's part of the core implementation. But now that I have made the Standard (ex-Full) JSON schema generator the default value, this can be removed if you want.

* Used to attempt to get a proper generator in the simple version of [executeStructured] (that does not accept [StructuredOutput] explicitly)
* to attempt to generate an appropriate schema for the passed [KType].
*/
public val DefaultSimpleSchemaGenerators: Map<LLMProvider, JsonSchemaGenerator> = mapOf(
Contributor

In my opinion, it’s not a good idea to use such global variables.

Can we put jsonSchemaGenerator into the companion object for OpenAI and Google respectively?

Contributor Author

After some consideration I decided to use the "register pattern" (similar to ServiceLoader) to be able to move LLM-provider-specific JSON schema generators to their clients' respective modules. So now these LLM clients register their generators (formats) in global maps. This is not an ideal solution, but it seems to achieve a nice balance between stability and convenience. I marked these global maps with internal API annotations.
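The registration approach described above can be sketched roughly as follows. All names here (`Provider`, `SchemaGenerator`, `SchemaGeneratorRegistry`, `OpenAIClient`) and the generated string are hypothetical and only illustrate the pattern; they are not the internal API added in this PR.

```kotlin
// Hypothetical names throughout; this is not Koog's internal API.
enum class Provider { OpenAI, Google }

fun interface SchemaGenerator {
    fun generate(typeName: String): String
}

// Global registry, similar in spirit to a ServiceLoader-style lookup.
// Not thread-safe: a real implementation would need synchronization.
object SchemaGeneratorRegistry {
    private val generators = mutableMapOf<Provider, SchemaGenerator>()

    fun register(provider: Provider, generator: SchemaGenerator) {
        generators[provider] = generator
    }

    fun find(provider: Provider): SchemaGenerator? = generators[provider]
}

// A client living in its own module registers its format when it is created.
class OpenAIClient {
    init {
        SchemaGeneratorRegistry.register(Provider.OpenAI) { typeName ->
            "{\"title\":\"$typeName\",\"format\":\"openai\"}"
        }
    }
}

fun main() {
    println(SchemaGeneratorRegistry.find(Provider.OpenAI)) // null: nothing registered yet
    OpenAIClient()
    println(SchemaGeneratorRegistry.find(Provider.OpenAI)?.generate("MyStruct"))
}
```

The sketch also shows the trade-off mentioned above: a lookup only succeeds after the corresponding client has been initialised, so callers need a fallback for providers that never registered.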

* Note: it does not handle nullability because these might be different in different schema specs.
* Implementations must handle these themselves.
*/
public abstract class GenericJsonSchemaGenerator : JsonSchemaGenerator() {
Contributor

Will this work if I have a custom serializer and the descriptor is overridden?

Contributor Author

@EugeneTheDev EugeneTheDev Jul 27, 2025

The generate method accepts an instance of Json that is used to (de)serialize objects, so yes. Everything this Json instance knows about (registered subtypes, custom serializers, etc.) should be used properly by the generator.

* It is recommended to use [SimpleJsonSchemaGenerator.Default] companion object, which provides a default instance,
* instead of creating new instances manually.
*/
public open class SimpleJsonSchemaGenerator : GenericJsonSchemaGenerator() {
Contributor

The use of Simple/Full in the naming raises questions, since it’s hard for the user to understand what Simple means and why it’s considered simple without additional context

Contributor Author

After some consideration, I decided to rename these as follows:

  1. Simple -> Basic. "Basic" implementation of JSON schema: nested objects only, without additional capabilities such as special functions.
  2. Full -> Standard. "Standard" implementation of JSON schema, with support for object definitions, refs, and certain special functions like oneOf/anyOf.

Is it better now?
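To illustrate the difference between the two flavours, here is a rough sketch that renders the same type either with nested objects inlined ("Basic" style) or with each object emitted once under `$defs` and referenced via `$ref` ("Standard" style). The mini type model (`Prim`, `Obj`) and the exact output layout are assumptions made for this example; the real generators work from serialization descriptors and may lay out the schema differently.

```kotlin
// Hypothetical mini type model; real generators work from serialization descriptors.
sealed interface Type
data class Prim(val name: String) : Type // e.g. "string", "integer"
data class Obj(val name: String, val props: Map<String, Type>) : Type

// Renders an object schema, delegating property rendering to the given function.
fun objectSchema(obj: Obj, render: (Type) -> String): String =
    obj.props.entries.joinToString(
        separator = ",",
        prefix = "{\"type\":\"object\",\"properties\":{",
        postfix = "}}",
    ) { (key, value) -> "\"$key\":${render(value)}" }

// "Basic" style: nested objects are inlined wherever they are used.
fun basic(type: Type): String = when (type) {
    is Prim -> "{\"type\":\"${type.name}\"}"
    is Obj -> objectSchema(type, ::basic)
}

// "Standard" style: each object is emitted once under $defs and referenced via $ref.
fun standard(root: Obj): String {
    val defs = linkedMapOf<String, String>()

    fun visit(type: Type): String = when (type) {
        is Prim -> "{\"type\":\"${type.name}\"}"
        is Obj -> {
            if (type.name !in defs) defs[type.name] = objectSchema(type) { visit(it) }
            "{\"\$ref\":\"#/\$defs/${type.name}\"}"
        }
    }

    visit(root)
    val renderedDefs = defs.entries.joinToString(",") { (name, schema) -> "\"$name\":$schema" }
    return "{\"\$ref\":\"#/\$defs/${root.name}\",\"\$defs\":{$renderedDefs}}"
}

fun main() {
    val address = Obj("Address", mapOf("city" to Prim("string")))
    val person = Obj("Person", mapOf("home" to address, "work" to address))
    println(basic(person))    // Address is repeated inline for each property
    println(standard(person)) // Address is defined once and referenced twice
}
```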

Contributor

Yes, but that’s just my personal preference :)

I’m curious how we can fill in the entire JSON schema for a class, I mean all the fields of the JSON schema?

@EugeneTheDev EugeneTheDev force-pushed the eugenethedev/improve-structured-response branch 2 times, most recently from 3d531c9 to a749563 Compare July 27, 2025 14:09
@EugeneTheDev EugeneTheDev force-pushed the eugenethedev/improve-structured-response branch from a749563 to abacafc Compare August 10, 2025 18:12
…ve structured output for OpenAI and Google

* Add helper Gradle tasks to verify compilation on all targets

* Adjust envs loading in integration tests, fix tests

* Load envs in integration tests only for `jvmIntegrationTest` tasks, not for all Test tasks (we don't need credentials here)

* Simple structured output API that tries to find the best approach to get the structured response on its own, without requiring the users to specify structures and modes manually.
@EugeneTheDev EugeneTheDev force-pushed the eugenethedev/improve-structured-response branch from abacafc to 5e7ea7e Compare August 10, 2025 21:02
@EugeneTheDev EugeneTheDev requested a review from Rizzen August 11, 2025 14:41
@EugeneTheDev EugeneTheDev merged commit c0ca25d into develop Aug 11, 2025
7 checks passed
@EugeneTheDev EugeneTheDev deleted the eugenethedev/improve-structured-response branch August 11, 2025 18:38
@nathanfallet
Contributor

nathanfallet commented Aug 11, 2025

Thank you! I'll have a look and get back to work on #378 soon (if it's still applicable; maybe it's no longer needed, I'll have to inspect the new approach).
EDIT: This is now available at #638

@OfekTeken

Thanks! Can't wait to use this :)

@chris-hatton

chris-hatton commented Aug 13, 2025

Can't wait to use this as well - without being dramatic it makes the difference between koog.ai being usable/unusable for my use cases! Thanks @EugeneTheDev let's goo..! 🚀 (KSlack post for context)

kpavlov pushed a commit that referenced this pull request Sep 2, 2025
Fixes #377 
Changes are the same as #378, it's just the updated version after the
refactor of #443

Proposed changes:
Add an `excludedProperties` parameter like this:
```kotlin
simpleSchemaGenerator.generate(
    "TestClass",
    serializer<TestClass>(),
    descriptionOverrides = emptyMap(),
    excludedProperties = setOf("TestClass.someProperty")
)
```
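As a rough illustration of what such a parameter implies, the sketch below filters a class's properties by a `"ClassName.property"` key, matching the key format used in the call above. `visibleProperties` is a hypothetical helper written for this example and is not the generator's actual implementation.

```kotlin
// Hypothetical helper; the real generator works on serialization descriptors.
fun visibleProperties(
    className: String,
    properties: List<String>,
    excludedProperties: Set<String> = emptySet(),
): List<String> =
    // Keys follow the "ClassName.property" format from the proposal above.
    properties.filterNot { "$className.$it" in excludedProperties }

fun main() {
    val kept = visibleProperties(
        className = "TestClass",
        properties = listOf("id", "someProperty"),
        excludedProperties = setOf("TestClass.someProperty"),
    )
    println(kept) // [id]
}
```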

---

#### Type of the change
- [x] New feature
- [ ] Bug fix
- [ ] Documentation fix

#### Checklist for all pull requests
- [x] The pull request has a description of the proposed change
- [x] I read the [Contributing
Guidelines](https://github.com/JetBrains/koog/blob/main/CONTRIBUTING.md)
before opening the pull request
- [x] The pull request uses **`develop`** as the base branch
- [x] Tests for the changes have been added
- [x] All new and existing tests passed

##### Additional steps for pull requests adding a new feature
- [x] An issue describing the proposed change exists
- [x] The pull request includes a link to the issue
- [ ] The change was discussed and approved in the issue
- [ ] Docs have been added / updated
karloti pushed a commit to karloti/koog that referenced this pull request Sep 2, 2025
…ains#638)

Development

Successfully merging this pull request may close these issues.

Json Schema / Response Schema Support
6 participants