Commit 5601cb4

Introduce Retry logic for LLM clients (#592)
1 parent cb5da25 commit 5601cb4

8 files changed: +1367 -32 lines changed

docs/docs/prompt-api.md

Lines changed: 307 additions & 21 deletions
Large diffs are not rendered by default.

prompt/prompt-executor/Module.md

Lines changed: 1 addition & 1 deletion
@@ -8,7 +8,7 @@ The prompt-executor module provides a unified interface for executing prompts ag
 
 - **prompt-executor-model**: Core interfaces and models for executing prompts against language models
 - **prompt-executor-cached**: Caching implementation for prompt execution
-- **prompt-executor-clients**: Client implementations for various LLM providers (OpenAI, Anthropic, OpenRouter)
+- **prompt-executor-clients**: Client implementations for various LLM providers and a retry logic decorator
 - **prompt-executor-llms**: Implementations of PromptExecutor for executing prompts with LLMs
 - **prompt-executor-llms-all**: Unified access to multiple LLM providers for prompt execution
 - **prompt-executor-ollama**: Client implementation for executing prompts using Ollama, a local LLM service

prompt/prompt-executor/prompt-executor-clients/Module.md

Lines changed: 41 additions & 10 deletions
@@ -1,21 +1,28 @@
 # Module prompt:prompt-executor:prompt-executor-clients
 
-A collection of client implementations for executing prompts using various LLM providers.
+A collection of client implementations for executing prompts using various LLM providers and retry logic features.
 
 ### Overview
 
 This module provides client implementations for different LLM providers, allowing you to execute prompts using various
-models with support for multimodal content including images, audio, video, and documents. The module includes the
-following sub-modules:
+models with support for multimodal content including images, audio, video, and documents. The module includes
+**production-ready retry logic** through the `RetryingLLMClient` decorator, which adds automatic error handling and
+resilience to any client implementation.
 
-1. **prompt-executor-anthropic-client**: Client implementation for Anthropic's Claude models with image and document
-support
+The module consists of:
+
+**Core Functionality:**
+- **LLMClient interface**: Base interface for all LLM client implementations
+- **RetryingLLMClient**: Decorator that adds retry logic with configurable policies
+- **RetryConfig**: Flexible retry configuration with predefined settings for different use cases
+
+**Provider-Specific Sub-modules:**
+1. **prompt-executor-anthropic-client**: Client implementation for Anthropic's Claude models with image and document support
 2. **prompt-executor-openai-client**: Client implementation for OpenAI's GPT models with image and audio capabilities
-3. **prompt-executor-google-client**: Client implementation for Google Gemini models with comprehensive multimodal
-support (audio, image, video, documents)
-4. **prompt-executor-openrouter-client**: Client implementation for OpenRouter's API with image, audio, and document
-support
-5. **prompt-executor-ollama-client**: Client implementation for local Ollama models
+3. **prompt-executor-google-client**: Client implementation for Google Gemini models with comprehensive multimodal support
+4. **prompt-executor-openrouter-client**: Client implementation for OpenRouter's API with image, audio, and document support
+5. **prompt-executor-bedrock-client**: Client implementation for AWS Bedrock with support for multiple model providers (JVM only)
+6. **prompt-executor-ollama-client**: Client implementation for local Ollama models
 
 Each client handles authentication, request formatting, response parsing, and media content encoding specific to its
 respective API requirements.
@@ -94,6 +101,30 @@ val response = client.execute(
 println(response)
 ```
 
+### Retry Logic
+
+Wrap any client with `RetryingLLMClient` to add automatic retry capabilities:
+
+```kotlin
+val baseClient = OpenAILLMClient(apiKey = System.getenv("OPENAI_API_KEY"))
+val resilientClient = RetryingLLMClient(
+    delegate = baseClient,
+    config = RetryConfig.PRODUCTION  // Or CONSERVATIVE, AGGRESSIVE, DISABLED
+)
+
+val response = resilientClient.execute(prompt, model)
+
+resilientClient.executeStreaming(prompt, model).collect { chunk ->
+    print(chunk)
+}
+```
+
+**Retry Configurations:**
+- `RetryConfig.PRODUCTION` - Recommended for production (3 attempts, balanced delays)
+- `RetryConfig.CONSERVATIVE` - Fewer retries, longer delays (3 attempts, 2s initial delay)
+- `RetryConfig.AGGRESSIVE` - More retries, shorter delays (5 attempts, 500ms initial delay)
+- `RetryConfig.DISABLED` - No retries (1 attempt)
+
 ### Multimodal Content Support
 
 All clients now support multimodal content through the unified MediaContent API:
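
Reviewer note: the Retry Logic section added above describes `RetryingLLMClient` as a decorator, but its implementation is not rendered in this part of the commit. The sketch below only illustrates the general decorator idea under simplifying assumptions. `SimpleClient` and `RetryingSimpleClient` are hypothetical stand-ins, not Koog types; the real client is presumably suspending and applies delays between attempts, both of which are omitted here.

```kotlin
// Hypothetical stand-in for an LLM client interface (NOT the Koog LLMClient).
fun interface SimpleClient {
    fun execute(prompt: String): String
}

// Hypothetical decorator: retries the delegate when the error message contains
// one of the retryable keywords, up to maxAttempts total attempts.
class RetryingSimpleClient(
    private val delegate: SimpleClient,
    private val maxAttempts: Int = 3,
    private val retryableKeywords: List<String> = listOf("rate limit", "overloaded")
) : SimpleClient {
    init {
        require(maxAttempts >= 1) { "maxAttempts must be at least 1" }
    }

    override fun execute(prompt: String): String {
        var lastError: Exception? = null
        repeat(maxAttempts) {
            try {
                return delegate.execute(prompt)
            } catch (e: Exception) {
                val message = e.message.orEmpty().lowercase()
                // Non-retryable errors are rethrown immediately.
                if (retryableKeywords.none { it in message }) throw e
                lastError = e
            }
        }
        // All attempts failed with retryable errors: surface the last one.
        throw lastError ?: IllegalStateException("unreachable: maxAttempts >= 1")
    }
}

fun main() {
    var calls = 0
    // Fails twice with a rate-limit style message, then succeeds.
    val flaky = SimpleClient { prompt ->
        calls++
        if (calls < 3) throw RuntimeException("429: rate limit exceeded")
        "echo: $prompt"
    }
    val client = RetryingSimpleClient(flaky, maxAttempts = 3)
    println(client.execute("hello"))
    println("attempts used: $calls")
}
```

Because the wrapper implements the same interface as its delegate, callers do not need to know whether retries are enabled, which matches the way the documentation above swaps `baseClient` for `resilientClient`.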

prompt/prompt-executor/prompt-executor-clients/build.gradle.kts

Lines changed: 12 additions & 0 deletions
@@ -22,6 +22,18 @@ kotlin {
                 api(kotlin("reflect"))
             }
         }
+        commonTest {
+            dependencies {
+                implementation(kotlin("test"))
+                implementation(libs.kotlinx.coroutines.test)
+            }
+        }
+        jvmTest {
+            dependencies {
+                implementation(kotlin("test-junit5"))
+                implementation(libs.slf4j.simple)
+            }
+        }
     }
 
     explicitApi()
Lines changed: 169 additions & 0 deletions
@@ -0,0 +1,169 @@
+package ai.koog.prompt.executor.clients.retry
+
+import kotlin.time.Duration
+import kotlin.time.Duration.Companion.milliseconds
+import kotlin.time.Duration.Companion.seconds
+
+/**
+ * Configuration for retry behavior in LLM client operations.
+ *
+ * @property maxAttempts Maximum number of attempts (including initial)
+ * @property initialDelay Initial delay before first retry
+ * @property maxDelay Maximum delay between retries
+ * @property backoffMultiplier Multiplier for exponential backoff
+ * @property jitterFactor Random jitter factor (0.0 to 1.0)
+ * @property retryablePatterns Patterns to identify retryable errors
+ * @property retryAfterExtractor Optional extractor for retry-after hints
+ */
+public data class RetryConfig(
+    val maxAttempts: Int = 3,
+    val initialDelay: Duration = 1.seconds,
+    val maxDelay: Duration = 30.seconds,
+    val backoffMultiplier: Double = 2.0,
+    val jitterFactor: Double = 0.1,
+    val retryablePatterns: List<RetryablePattern> = DEFAULT_PATTERNS,
+    val retryAfterExtractor: RetryAfterExtractor? = DefaultRetryAfterExtractor
+) {
+    init {
+        require(maxAttempts >= 1) { "maxAttempts must be at least 1" }
+        require(backoffMultiplier >= 1.0) { "backoffMultiplier must be at least 1.0" }
+        require(jitterFactor in 0.0..1.0) { "jitterFactor must be between 0.0 and 1.0" }
+        require(initialDelay <= maxDelay) { "initialDelay ($initialDelay) must not be greater than maxDelay ($maxDelay)" }
+    }
+
+    public companion object {
+        /**
+         * Default retry patterns that work across all providers.
+         */
+        public val DEFAULT_PATTERNS: List<RetryablePattern> = listOf(
+            // HTTP status codes
+            RetryablePattern.Status(429), // Rate limit
+            RetryablePattern.Status(500), // Internal server error
+            RetryablePattern.Status(502), // Bad gateway
+            RetryablePattern.Status(503), // Service unavailable
+            RetryablePattern.Status(504), // Gateway timeout
+            RetryablePattern.Status(529), // Anthropic overloaded
+
+            // Error keywords
+            RetryablePattern.Keyword("rate limit"),
+            RetryablePattern.Keyword("too many requests"),
+            RetryablePattern.Keyword("overloaded"),
+            RetryablePattern.Keyword("request timeout"),
+            RetryablePattern.Keyword("connection timeout"),
+            RetryablePattern.Keyword("read timeout"),
+            RetryablePattern.Keyword("write timeout"),
+            RetryablePattern.Keyword("connection reset by peer"),
+            RetryablePattern.Keyword("connection refused"),
+            RetryablePattern.Keyword("temporarily unavailable"),
+            RetryablePattern.Keyword("service unavailable")
+        )
+
+        /**
+         * Conservative configuration - fewer retries, longer delays.
+         */
+        public val CONSERVATIVE: RetryConfig = RetryConfig(
+            maxAttempts = 3,
+            initialDelay = 2.seconds,
+            maxDelay = 30.seconds
+        )
+
+        /**
+         * Aggressive configuration - more retries, shorter delays.
+         */
+        public val AGGRESSIVE: RetryConfig = RetryConfig(
+            maxAttempts = 5,
+            initialDelay = 500.milliseconds,
+            maxDelay = 20.seconds,
+            backoffMultiplier = 1.5
+        )
+
+        /**
+         * Production configuration - balanced for production use.
+         */
+        public val PRODUCTION: RetryConfig = RetryConfig(
+            maxAttempts = 3,
+            initialDelay = 1.seconds,
+            maxDelay = 20.seconds,
+            backoffMultiplier = 2.0,
+            jitterFactor = 0.2
+        )
+
+        /**
+         * No retry - effectively disables retry logic.
+         */
+        public val DISABLED: RetryConfig = RetryConfig(maxAttempts = 1)
+    }
+}
+
+/**
+ * Pattern for identifying retryable errors.
+ */
+public sealed class RetryablePattern {
+    public abstract fun matches(message: String): Boolean
+
+    /**
+     * Matches HTTP status codes in error messages.
+     */
+    public data class Status(val code: Int) : RetryablePattern() {
+        private val patterns = listOf(
+            Regex("\\b$code\\b"),
+            Regex("status:?\\s*$code"),
+            Regex("error:?\\s*$code", RegexOption.IGNORE_CASE)
+        )
+
+        override fun matches(message: String): Boolean =
+            patterns.any { it.containsMatchIn(message) }
+    }
+
+    /**
+     * Matches keywords in error messages.
+     */
+    public data class Keyword(val keyword: String) : RetryablePattern() {
+        override fun matches(message: String): Boolean =
+            keyword.lowercase() in message.lowercase()
+    }
+
+    /**
+     * Matches using a custom regex.
+     */
+    public data class Regex(val pattern: kotlin.text.Regex) : RetryablePattern() {
+        override fun matches(message: String): Boolean =
+            pattern.containsMatchIn(message)
+    }
+
+    /**
+     * Custom matching logic.
+     */
+    public class Custom(private val matcher: (String) -> Boolean) : RetryablePattern() {
+        override fun matches(message: String): Boolean = matcher(message)
+    }
+}
+
+/**
+ * Extracts retry-after hints from error messages.
+ */
+public fun interface RetryAfterExtractor {
+    public fun extract(message: String): Duration?
+}
+
+/**
+ * Default implementation that extracts common retry-after patterns.
+ */
+public object DefaultRetryAfterExtractor : RetryAfterExtractor {
+    private val patterns = listOf(
+        Regex("retry\\s+after\\s+(\\d+)\\s+second", RegexOption.IGNORE_CASE),
+        Regex("retry-after:\\s*(\\d+)", RegexOption.IGNORE_CASE),
+        Regex("wait\\s+(\\d+)\\s+second", RegexOption.IGNORE_CASE)
+    )
+
+    override fun extract(message: String): Duration? {
+        for (pattern in patterns) {
+            pattern.find(message)?.let { match ->
+                match.groupValues.getOrNull(1)?.toLongOrNull()?.let { seconds ->
+                    return seconds.seconds
+                }
+            }
+        }
+        return null
+    }
+}
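
Reviewer note: the file above defines `initialDelay`, `maxDelay`, `backoffMultiplier`, and `jitterFactor`, but the code that turns them into actual wait times is not part of this file. The sketch below shows one plausible reading (exponential growth, capped at `maxDelay`, plus additive random jitter). The `backoffDelay` helper and its formula are assumptions for illustration, not the commit's implementation.

```kotlin
import kotlin.math.pow
import kotlin.random.Random
import kotlin.time.Duration
import kotlin.time.Duration.Companion.milliseconds
import kotlin.time.Duration.Companion.seconds

// Hypothetical helper (not in the commit): delay before the retry with the
// given zero-based index, assuming capped exponential backoff with jitter.
fun backoffDelay(
    retryIndex: Int,            // 0 for the first retry, 1 for the second, ...
    initialDelay: Duration,
    maxDelay: Duration,
    backoffMultiplier: Double,
    jitterFactor: Double,
    random: Random = Random.Default
): Duration {
    // initialDelay * multiplier^retryIndex, capped at maxDelay.
    val exponential = initialDelay * backoffMultiplier.pow(retryIndex)
    val capped = minOf(exponential, maxDelay)
    // Additive jitter in [0, jitterFactor * capped); assumed, may exceed maxDelay slightly.
    val jitter = capped * (jitterFactor * random.nextDouble())
    return capped + jitter
}

fun main() {
    // AGGRESSIVE values from the diff: 5 attempts means at most 4 retries.
    // Jitter is set to 0.0 here so the base schedule is deterministic.
    for (retry in 0 until 4) {
        val delay = backoffDelay(retry, 500.milliseconds, 20.seconds, 1.5, jitterFactor = 0.0)
        println("retry ${retry + 1}: $delay")
    }
}
```

Under this assumed formula and with jitter disabled, `PRODUCTION` (3 attempts, 1 s initial delay, multiplier 2.0) would wait 1 s and then 2 s, while `AGGRESSIVE` would wait 500 ms, 750 ms, 1.125 s, and 1.6875 s. If `retryAfterExtractor` returns a hint, a real implementation would presumably prefer that value over the computed delay.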
