Closed as not planned
Description
Checklist
- 1. I have searched related issues but cannot get the expected help.
- 2. The bug has not been fixed in the latest version.
- 3. Please note that if the bug-related issue you submitted lacks corresponding environment info and a minimal reproducible demo, it will be challenging for us to reproduce and resolve the issue, reducing the likelihood of receiving feedback.
Describe the bug
When Qwen3 is deployed with lmdeploy 0.9.0, reasoning_content is returned incorrectly.
Reproduction
Start the server: lmdeploy serve api_server /models/Qwen3-8B --reasoning-parser deepseek-r1
The bug occurs in the following two cases:
Case 1, streaming output: the content of the first chunk contains <think>
$ curl ... -d '{..., "stream": true}'
data: {"id":"1","object":"chat.completion.chunk","created":1750741881,"model":"/models/Qwen3-8B","choices":[{"index":0,"delta":{"role":"assistant","content":"<think>","reasoning_content":null,"tool_calls":[]},"logprobs":null,"finish_reason":null}],"usage":null}
data: {"id":"1","object":"chat.completion.chunk","created":1750741881,"model":"/models/Qwen3-8B","choices":[{"index":0,"delta":{"role":"assistant","content":null,"reasoning_content":"\n","tool_calls":[]},"logprobs":null,"finish_reason":null}],"usage":null}
data: {"id":"1","object":"chat.completion.chunk","created":1750741881,"model":"/models/Qwen3-8B","choices":[{"index":0,"delta":{"role":"assistant","content":null,"reasoning_content":"好的","tool_calls":[]},"logprobs":null,"finish_reason":null}],"usage":null}
...
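To make Case 1 easier to check mechanically, here is a minimal Python sketch that scans captured `data:` lines and reports which chunks carry `<think>` in `delta.content`. The function name `find_leaked_think_chunks` is hypothetical, and the handling of a closing `data: [DONE]` line is an assumption based on the usual OpenAI-style stream format. It is not part of lmdeploy.

```python
import json

THINK_OPEN = "<think>"


def find_leaked_think_chunks(sse_lines):
    """Return the indices of streamed chunks whose delta.content contains <think>.

    Hypothetical helper for inspecting a captured stream; it only reads the
    fields that appear in the chunks shown above.
    """
    leaked = []
    index = 0
    for line in sse_lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank lines and anything that is not a data line
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":  # assumed end-of-stream sentinel
            break
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            content = choice.get("delta", {}).get("content")
            if content and THINK_OPEN in content:
                leaked.append(index)
        index += 1
    return leaked
```

Run against the three chunks above, this should flag only the first one, since the later chunks carry their text in `reasoning_content`.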
Case 2, non-streaming output with max_tokens limited: the output contains <think>
$ curl ... -d '{..., "stream": false, "max_tokens": 8}'
{"id":"2","object":"chat.completion","created":1750742086,"model":"/models/Qwen3-8B","choices":[{"index":0,"message":{"role":"assistant","content":null,"reasoning_content":"<think>\n好的,用户发来的是“你好!”,我","tool_calls":null},"logprobs":null,"finish_reason":"length"}],"usage":{"prompt_tokens":16,"total_tokens":29,"completion_tokens":13}}
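Until this is handled server-side, a client could strip the stray tag itself. The sketch below is one possible workaround for the non-streaming response shown above, not a fix in lmdeploy. The helper name `strip_leading_think` is hypothetical, and it assumes the tag only ever appears at the very start of `reasoning_content`.

```python
THINK_OPEN = "<think>"


def strip_leading_think(message):
    """Return a copy of a chat message with a leading <think> tag removed
    from reasoning_content. Messages without the tag are copied unchanged.
    """
    cleaned = dict(message)
    reasoning = message.get("reasoning_content")
    if not isinstance(reasoning, str):
        return cleaned  # nothing to clean (field is null or missing)
    stripped = reasoning.lstrip()
    if not stripped.startswith(THINK_OPEN):
        return cleaned  # tag absent: leave the text exactly as received
    # Drop the tag and the newline(s) that follow it.
    cleaned["reasoning_content"] = stripped[len(THINK_OPEN):].lstrip("\n")
    return cleaned
```

Usage: pass `response["choices"][0]["message"]` from the parsed JSON body and read `reasoning_content` from the returned dict. The input message is not modified.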
Environment
TorchVision: 0.21.0+cu124
LMDeploy: 0.9.0+unknown
transformers: 4.52.4
gradio: Not Found
fastapi: 0.115.13
pydantic: 2.11.7
triton: 3.2.0