Document async endpoint functionality #786
Conversation
Signed-off-by: David Gardner <[email protected]>
Walkthrough

Adds documentation for an async `/generate` endpoint, an OpenAI v1-compatible endpoint, server startup and installation notes (the `async_endpoints` extra), and streaming examples. Adjusts the FastAPI front end to size `LocalCluster` workers from config, clarifies the `max_running_async_jobs` description, updates the `JobStore` `db_url` docstring, and adds "SQLAlchemy" to the Vale vocabulary.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    autonumber
    actor Client
    participant API as API Server (FastAPI)
    participant Dask as Dask (LocalCluster or External)
    participant DB as JobStore (SQLAlchemy)
    note over API: /generate/async (requires async_endpoints extra)
    Client->>API: POST /generate/async {input, job_id?, sync_timeout?, expiry_seconds?}
    alt scheduler_address provided
        API->>Dask: Submit job to external scheduler
    else local cluster
        API->>Dask: Submit job to LocalCluster(n_workers=max_running_async_jobs, threads_per_worker=1)
    end
    API->>DB: Persist job metadata/status
    alt completes within sync_timeout
        Dask-->>API: Result
        API->>DB: Update job status/result
        API-->>Client: 200 {status: completed, output}
    else async/pending
        API-->>Client: 202 {job_id, status: pending}
        Client->>API: GET /jobs/{job_id}
        API->>DB: Fetch status/result
        DB-->>API: state/result
        API-->>Client: {status, output?}
    end
```

Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~25 minutes

Pre-merge checks: ✅ 3 passed
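The submit-then-poll flow in the diagram can be sketched from the client side. The snippet below is a minimal illustration only: `submit_job` and `fetch_status` are stubbed stand-ins for `POST /generate/async` and `GET /jobs/{job_id}` (a real client would make HTTP calls, for example with `httpx`), and the job ID and output values are invented for the demo.

```python
import time

# Stubbed transport: in a real client these two functions would be HTTP
# calls to the FastAPI server (POST /generate/async, GET /jobs/{job_id}).
_FAKE_DB: dict[str, object] = {}


def submit_job(payload: dict) -> dict:
    """Pretend to submit a job; it 'finishes' on the second status poll."""
    job_id = "job_0001"  # hypothetical ID for the demo
    _FAKE_DB[job_id] = iter(["running", "success"])
    return {"job_id": job_id, "status": "submitted"}


def fetch_status(job_id: str) -> dict:
    """Return the next status for the job, with output once it succeeds."""
    status = next(_FAKE_DB[job_id], "success")
    result = {"job_id": job_id, "status": status}
    if status == "success":
        result["output"] = "42"
    return result


def run_async_job(payload: dict, poll_interval: float = 0.01) -> dict:
    """Submit a job, then poll until it reaches a terminal state."""
    job = submit_job(payload)
    while True:
        job = fetch_status(job["job_id"])
        if job["status"] in ("success", "failure"):
            return job
        time.sleep(poll_interval)


result = run_async_job({"input_message": "Is 2 + 2 greater than 3?"})
print(result["status"], result.get("output"))
```

In a real client the polling loop would also honor the `sync_timeout` behavior described below: a request with `sync_timeout > 0` may return the completed result directly and skip polling entirely.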
Pull Request Overview
This PR documents the async endpoint functionality and fixes configuration handling for the max_running_async_jobs parameter. The changes improve documentation for asynchronous job processing capabilities and ensure proper configuration of the Dask cluster.
- Documents the `/generate/async` endpoint with examples and configuration details
- Fixes the handling of `max_running_async_jobs` to properly configure the Dask worker count
- Updates documentation links and adds SQLAlchemy to the accepted vocabulary
Reviewed Changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated no comments.
Show a summary per file
File | Description |
---|---|
src/nat/front_ends/fastapi/job_store.py | Updates SQLAlchemy documentation URL reference |
src/nat/front_ends/fastapi/fastapi_front_end_plugin.py | Applies max_running_async_jobs config to Dask LocalCluster |
src/nat/front_ends/fastapi/fastapi_front_end_config.py | Improves documentation for max_running_async_jobs parameter |
docs/source/reference/evaluate-api.md | Documents async_endpoints dependency requirement |
docs/source/reference/api-server-endpoints.md | Adds comprehensive documentation for async endpoint |
ci/vale/styles/config/vocabularies/nat/accept.txt | Adds SQLAlchemy to accepted vocabulary |
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (6)
ci/vale/styles/config/vocabularies/nat/accept.txt (1)

**1-4: Missing SPDX header — will fail CI header checks.** Add the standard SPDX Apache-2.0 header at the top of this .txt file.

Apply this diff:

```diff
+# SPDX-FileCopyrightText: Copyright (c) 2025, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
+# SPDX-License-Identifier: Apache-2.0
+
 # List of case-sensitive regular expressions matching words that should be accepted by Vale. For product names like
 # "cuDF" or "cuML", we want to ensure that they are capitalized the same way they're written by the product owners.
 # Regular expressions are parsed according to the Go syntax: https://golang.org/pkg/regexp/syntax/
```

src/nat/front_ends/fastapi/job_store.py (4)
**515-516: Fix invalid SQLAlchemy filter: `.not_in` doesn't exist.** Use `.notin_(...)` (or `~col.in_(...)`) to generate a NOT IN clause.

Apply this diff:

```diff
-        stmt = select(JobInfo).where(
-            and_(JobInfo.is_expired == sa_expr.false(),
-                 JobInfo.status.not_in(self.ACTIVE_STATUS))).order_by(JobInfo.updated_at.desc())
+        stmt = select(JobInfo).where(
+            and_(JobInfo.is_expired == sa_expr.false(),
+                 JobInfo.status.notin_(self.ACTIVE_STATUS))
+        ).order_by(JobInfo.updated_at.desc())
```
**159-160: Status type mismatch causes logic errors; store and compare consistently as strings.** `ACTIVE_STATUS` currently holds Enum members but the DB stores strings; membership checks and SQL filters will misbehave.

Apply this diff:

```diff
-    ACTIVE_STATUS = {JobStatus.RUNNING, JobStatus.SUBMITTED}
+    ACTIVE_STATUS = {JobStatus.RUNNING.value, JobStatus.SUBMITTED.value}
```
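The mismatch this comment describes is easy to reproduce with a plain `enum.Enum`: members never compare equal to their string values, so a status string read back from a `String` column silently misses a set of Enum members. This is an illustrative sketch, not the toolkit's actual `JobStatus` class (which, if it mixed in `str`, would behave differently):

```python
from enum import Enum


class JobStatus(Enum):
    SUBMITTED = "submitted"
    RUNNING = "running"


# A plain Enum member is not equal to its string value, so a string loaded
# from a String column never matches a set of members.
active_members = {JobStatus.RUNNING, JobStatus.SUBMITTED}
db_value = "running"  # what the String column actually stores

print(db_value in active_members)  # → False: the check silently misses

# Comparing against .value (or storing .value) restores consistent behavior.
active_values = {JobStatus.RUNNING.value, JobStatus.SUBMITTED.value}
print(db_value in active_values)   # → True
```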
**470-475: Filter by the string value, not the Enum object.** `JobInfo.status` is a string column; comparing it to a `JobStatus` Enum will not match.

Apply this diff:

```diff
-        if not isinstance(status, JobStatus):
-            status = JobStatus(status)
-
-        stmt = select(JobInfo).where(JobInfo.status == status)
+        if not isinstance(status, JobStatus):
+            status = JobStatus(status)
+
+        stmt = select(JobInfo).where(JobInfo.status == status.value)
```
**255-263: Insert uses an Enum object in a String column — may break on some DB drivers.** Persist the Enum's `.value` for consistency with queries and updates.

Apply this diff:

```diff
-        job = JobInfo(job_id=job_id,
-                      status=JobStatus.SUBMITTED,
+        job = JobInfo(job_id=job_id,
+                      status=JobStatus.SUBMITTED.value,
                       config_file=config_file,
                       created_at=datetime.now(UTC),
                       updated_at=datetime.now(UTC),
                       error=None,
                       output_path=None,
                       expiry_seconds=clamped_expiry)
```

src/nat/front_ends/fastapi/fastapi_front_end_config.py (1)
**60-75: Strengthen `job_id` validation to enforce allowed characters.** The current check blocks path traversal but doesn't enforce "alphanumeric or underscore" as documented.

Apply this diff:

```diff
@@
-import typing
+import typing
+import re
@@
     def validate_job_id(cls, job_id: str):
         job_id = job_id.strip()
-        job_id_path = Path(job_id)
-        if len(job_id_path.parts) > 1 or job_id_path.resolve().name != job_id:
+        # Enforce allowed characters
+        if not re.fullmatch(r"[A-Za-z0-9_]+", job_id):
             raise ValueError(
-                f"Job ID '{job_id}' contains invalid characters. Only alphanumeric characters and underscores are"
-                " allowed.")
+                f"Job ID '{job_id}' contains invalid characters. Only alphanumeric characters and underscores are allowed."
+            )
+        # Block any path-like input defensively
+        job_id_path = Path(job_id)
+        if len(job_id_path.parts) > 1:
+            raise ValueError(f"Job ID '{job_id}' must not contain path separators.")
```
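As a standalone sketch of the stricter check (the helper name and layout here are illustrative, not the toolkit's actual validator):

```python
import re

_JOB_ID_RE = re.compile(r"[A-Za-z0-9_]+")


def validate_job_id(job_id: str) -> str:
    """Return the stripped job ID, rejecting anything outside [A-Za-z0-9_]."""
    job_id = job_id.strip()
    if not _JOB_ID_RE.fullmatch(job_id):
        raise ValueError(
            f"Job ID {job_id!r} contains invalid characters. "
            "Only alphanumeric characters and underscores are allowed.")
    return job_id


print(validate_job_id("  job_42 "))  # stripped and accepted

# Path-like, hyphenated, and empty inputs are all rejected by the regex.
for bad in ("../etc/passwd", "job-42", ""):
    try:
        validate_job_id(bad)
    except ValueError:
        print(f"rejected {bad!r}")
```

Note that `re.fullmatch` anchors the pattern to the whole string, so no separate path-separator check is strictly required; the defensive `Path` check in the diff above is belt-and-suspenders.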
🧹 Nitpick comments (9)

src/nat/front_ends/fastapi/job_store.py (3)

**121-123: Optional: Align the Python type with the stored DB type, or switch to a native Enum.** Either annotate as `str` to match the String column, or use `SAEnum(JobStatus)` for safer typing.

Two options:

```diff
-    status: Mapped[JobStatus] = mapped_column(String(11))
+    status: Mapped[str] = mapped_column(String(11), index=True)
```

Or:

```diff
+from sqlalchemy import Enum as SAEnum
 ...
-    status: Mapped[JobStatus] = mapped_column(String(11))
+    status: Mapped[JobStatus] = mapped_column(
+        SAEnum(JobStatus, values_callable=lambda e: [m.value for m in e]), index=True)
```
**441-447: Optional: Add a DB index to speed "last job" queries.** Ordering by `created_at`/`updated_at` is frequent; an index helps on large tables.

Apply one of:

```diff
-    updated_at: Mapped[datetime] = mapped_column(DateTime(timezone=True),
-                                                 default=datetime.now(UTC),
-                                                 onupdate=datetime.now(UTC))
+    updated_at: Mapped[datetime] = mapped_column(
+        DateTime(timezone=True),
+        default=datetime.now(UTC),
+        onupdate=datetime.now(UTC),
+        index=True,
+    )
```
**306-325: Optional: Make `sync_timeout` robust to long-running jobs.** Consider catching/canceling stale local waits and logging at DEBUG when timing out.

```diff
-        if sync_timeout > 0:
+        if sync_timeout > 0:
             try:
                 _ = await future.result(timeout=sync_timeout)
                 job = await self.get_job(job_id)
                 assert job is not None, "Job should exist after future result"
                 return (job_id, job)
             except TimeoutError:
-                pass
+                logger.debug("submit_job timed out after %s s for job_id=%s", sync_timeout, job_id)
```

src/nat/front_ends/fastapi/fastapi_front_end_plugin.py (1)
**109-110: Good fix: wire `max_running_async_jobs` into the Dask `LocalCluster`.** This aligns runtime concurrency with configuration. Consider capping workers to the CPU count to avoid excessive process fan-out on small hosts.

Example:

```diff
-        self._cluster = LocalCluster(n_workers=self.front_end_config.max_running_async_jobs,
-                                     threads_per_worker=1)
+        self._cluster = LocalCluster(
+            n_workers=min(self.front_end_config.max_running_async_jobs, max(1, os.cpu_count() or 1)),
+            threads_per_worker=1,
+        )
```

docs/source/reference/evaluate-api.md (1)
**23: Looks good; consider switching the inline commands to fenced blocks for copy/paste.** Turning the two install commands into fenced bash blocks improves readability and reduces copy errors.

````diff
-... installed. For users installing from source, this can be done by running `uv pip install -e .[async_endpoints]` from the root directory of the NeMo Agent toolkit library. Similarly, for users installing from PyPI, this can be done by running `pip install nvidia-nat[async_endpoints]`.
+... installed. For users installing from source:
+
+```bash
+uv pip install -e .[async_endpoints]
+```
+
+For users installing from PyPI:
+
+```bash
+pip install nvidia-nat[async_endpoints]
+```
````

docs/source/reference/api-server-endpoints.md (4)
**21-25: Fix markdown list indentation (MD007).** Unindent the list to satisfy markdownlint and render consistently.

```diff
-  - **Generate Interface:** Uses the transaction schema defined by your workflow. The interface documentation is accessible
-    using Swagger while the server is running [`http://localhost:8000/docs`](http://localhost:8000/docs).
-  - **Chat Interface:** [OpenAI API Documentation](https://platform.openai.com/docs/guides/text?api-mode=chat) provides
-    details on chat formats compatible with the NeMo Agent toolkit server.
+- **Generate Interface:** Uses the transaction schema defined by your workflow. The interface documentation is accessible
+  using Swagger while the server is running [`http://localhost:8000/docs`](http://localhost:8000/docs).
+- **Chat Interface:** [OpenAI API Documentation](https://platform.openai.com/docs/guides/text?api-mode=chat) provides
+  details on chat formats compatible with the NeMo Agent toolkit server.
```
**34-39: Avoid a bare URL and improve sentence flow (MD034).** Add punctuation and wrap the URL.

```diff
-The following examples assume that the simple calculator workflow has been installed and is running on http://localhost:8000 to do so run the following commands:
+The following examples assume that the simple calculator workflow has been installed and is running on <http://localhost:8000>. To do so, run:

 uv pip install -e examples/getting_started/simple_calculator
 nat serve --config_file examples/getting_started/simple_calculator/configs/config.yml
```
**71-74: Grammar/consistency in `expiry_seconds`.** Fix pluralization and wording.

```diff
-  - `expiry_seconds`: The amount of time in seconds after the job completes (either successfully or unsuccessfully) which any output files will be preserved before being deleted. Default is `3600` (1 hours), minimum is `600` (10 minutes) and maximum value for this field is `86400` (24 hours). The text output in the response is not affected by this field.
+  - `expiry_seconds`: Time in seconds after job completion (success or failure) to preserve any output files before deletion. Default is `3600` (1 hour); minimum `600` (10 minutes); maximum `86400` (24 hours). The text output in the response is not affected by this field.
```
**109-118: Use consistent RFC 3339 timestamps.** Add a `Z` suffix to the UTC timestamps for `created_at` and `updated_at` to match `expires_at`.

```diff
-    "created_at": "2025-09-10T20:52:24.768066",
+    "created_at": "2025-09-10T20:52:24.768066Z",
@@
-    "updated_at": "2025-09-10T20:52:30.734659"
+    "updated_at": "2025-09-10T20:52:30.734659Z"
```
🪛 markdownlint-cli2 (0.17.2)

docs/source/reference/api-server-endpoints.md

- 21: MD007 (ul-indent) — Unordered list indentation: expected 0, actual 2
- 23: MD007 (ul-indent) — Unordered list indentation: expected 0, actual 2
- 34: MD034 (no-bare-urls) — Bare URL used
🔇 Additional comments (7)

ci/vale/styles/config/vocabularies/nat/accept.txt (1)

**130-131: LGTM — adding "SQLAlchemy" to the accepted vocabulary avoids false positives in docs.**

src/nat/front_ends/fastapi/job_store.py (4)

**150-151: Doc link improvement is clear and specific.**

**495-503: Active-job expiry check compares strings to Enums; will incorrectly expire running jobs.** With the fix to `ACTIVE_STATUS` (strings), this check works. If you don't change `ACTIVE_STATUS`, wrap with `JobStatus(job.status)`.

Apply either:

- Keep the `ACTIVE_STATUS` string set (preferred); no change needed here.
- Or:

```diff
-        if job.status in self.ACTIVE_STATUS:
+        if JobStatus(job.status) in self.ACTIVE_STATUS:
             return None
```
**211-218: Is `async_scoped_session.remove()` awaitable?** If `remove()` is synchronous (as in `scoped_session`), `await` will fail at runtime.

Please confirm against the SQLAlchemy version in use. If not awaitable:

```diff
-        await self._session.remove()
+        self._session.remove()
```
**552-553: No await required — keep `var.delete()` as-is.** `Variable.delete` on the Dask client is a synchronous method (it sends `variable_delete` to the scheduler) and is not awaitable, so the current `var.delete()` call is correct. (docs.dask.org)

src/nat/front_ends/fastapi/fastapi_front_end_config.py (2)
**208-213: Docstring clarification for `max_running_async_jobs` looks good.** The clearer behavior note (only used with a local cluster) and the impact on worker count are helpful.

**200-207: `LocalCluster` already uses `max_running_async_jobs`; confirm it's guarded by `scheduler_address`.** `LocalCluster` in `src/nat/front_ends/fastapi/fastapi_front_end_plugin.py` (≈lines 106-112) is instantiated with `n_workers=self.front_end_config.max_running_async_jobs` and `threads_per_worker=1` — confirm this code path only runs when `front_end_config.scheduler_address` is `None`.
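The worker-sizing discussed in these comments amounts to a small helper; a sketch under the assumption that capping at the host CPU count is desired (`local_cluster_workers` is a hypothetical name, not toolkit code):

```python
import os


def local_cluster_workers(max_running_async_jobs: int) -> int:
    """Cap the LocalCluster worker count at the host CPU count (minimum 1).

    os.cpu_count() can return None on exotic platforms, hence the fallback.
    """
    return min(max_running_async_jobs, max(1, os.cpu_count() or 1))


# A small configured value is respected; a huge one is capped to the CPUs.
print(local_cluster_workers(2), local_cluster_workers(10_000))
```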
Actionable comments posted: 0
♻️ Duplicate comments (2)
docs/source/reference/api-server-endpoints.md (2)
**20: Count mismatch fixed — looks good.** The numeral now matches the five listed transactions.
**65: Replace the incorrect config key, fix punctuation/spacing, and split into readable sentences.** Use `db_url` (not `database_url`), add the missing space in "SQLAlchemy's Dialects," and avoid run-ons.

```diff
-Asynchronous jobs are managed using [Dask](https://docs.dask.org/en/stable/) by default a local Dask cluster is created at start time, however you can also configure the server to connect to an existing Dask scheduler by setting the `scheduler_address` configuration parameter. The Dask scheduler is used to manage the execution of asynchronous jobs, and can be configured to run on a single machine or across a cluster of machines. Job history and metadata is stored in a SQL database using [SQLAlchemy](https://www.sqlalchemy.org/) by default a temporary SQLite database is created at start time, however you can also configure the server to use a persistent database by setting the `database_url` configuration parameter. Any database supported by [SQLAlchemy's Asynchronous I/O extension](https://docs.sqlalchemy.org/en/20/orm/extensions/asyncio.html) can be used, refer to [SQLAlchemy'sDialects](https://docs.sqlalchemy.org/en/20/dialects/index.html) for a complete list (many but not all of these support Asynchronous I/O).
+Asynchronous jobs are managed using [Dask](https://docs.dask.org/en/stable/). By default, a local Dask cluster is created at start time. You can also connect to an existing Dask scheduler by setting the `scheduler_address` configuration parameter. The Dask scheduler manages the execution of asynchronous jobs and can run on a single machine or across a cluster.
+
+Job history and metadata are stored in a SQL database using [SQLAlchemy](https://www.sqlalchemy.org/). By default, a temporary SQLite database is created at start time. To use a persistent database, set the `db_url` configuration parameter. Any database supported by [SQLAlchemy's Asynchronous I/O extension](https://docs.sqlalchemy.org/en/20/orm/extensions/asyncio.html) can be used; refer to [SQLAlchemy's Dialects](https://docs.sqlalchemy.org/en/20/dialects/index.html) for a complete list (many but not all support asynchronous I/O).
```
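For illustration, typical `db_url` values for async-capable dialects look like the following; driver packages such as `aiosqlite` or `asyncpg` must be installed separately, and the hosts, credentials, and database names here are placeholders:

```text
# SQLite via aiosqlite (the default temporary-database behavior, made explicit)
sqlite+aiosqlite:///jobs.db

# PostgreSQL via asyncpg
postgresql+asyncpg://user:password@db-host:5432/jobs

# MySQL via aiomysql
mysql+aiomysql://user:password@db-host:3306/jobs
```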
🧹 Nitpick comments (7)

docs/source/reference/api-server-endpoints.md (7)

**21-24: Fix top-level list indentation (markdownlint MD007).** Unindent the bullets to column 0.

```diff
-  - **Generate Interface:** Uses the transaction schema defined by your workflow. The interface documentation is accessible
-    using Swagger while the server is running [`http://localhost:8000/docs`](http://localhost:8000/docs).
-  - **Chat Interface:** [OpenAI API Documentation](https://platform.openai.com/docs/guides/text?api-mode=chat) provides
-    details on chat formats compatible with the NeMo Agent toolkit server.
+- **Generate Interface:** Uses the transaction schema defined by your workflow. The interface documentation is accessible
+  using Swagger while the server is running [`http://localhost:8000/docs`](http://localhost:8000/docs).
+- **Chat Interface:** [OpenAI API Documentation](https://platform.openai.com/docs/guides/text?api-mode=chat) provides
+  details on chat formats compatible with the NeMo Agent toolkit server.
```
**27: Branding/style: use "NeMo Agent toolkit" (lowercase "toolkit") and sentence case.** Aligns with the docs naming rule.

```diff
-## Start the NeMo Agent Toolkit Server
+## Start the NeMo Agent toolkit server
```
**34-38: Fix the run-on sentence and bare URL (markdownlint MD034).** Split the sentence and format the URL.

````diff
-The following examples assume that the simple calculator workflow has been installed and is running on http://localhost:8000 to do so run the following commands:
+The following examples assume the simple calculator workflow is installed and running at `http://localhost:8000`. To do so, run:

 ```bash
 uv pip install -e examples/getting_started/simple_calculator
 nat serve --config_file examples/getting_started/simple_calculator/configs/config.yml
 ```
````

**63-64: Quote extras in pip commands to avoid shell globbing.** Prevents bracket expansion in some shells.

```diff
-... by running `uv pip install -e .[async_endpoints]` from the root directory of the NeMo Agent toolkit library. Similarly, for users installing from PyPI, this can be done by running `pip install nvidia-nat[async_endpoints]`.
+... by running `uv pip install -e '.[async_endpoints]'` from the root directory of the NeMo Agent toolkit library. Similarly, for users installing from PyPI, this can be done by running `pip install 'nvidia-nat[async_endpoints]'`.
```
**71-74: Tighten wording and fix minor grammar.** Plural agreement and "1 hour".

```diff
-  - `sync_timeout`: The maximum time in seconds to wait for the job to complete before returning a response. If the job completes in less than `sync_timeout` seconds then the response will include the job result, otherwise the `job_id` and `status` is returned. Default is `0` which causes the request to return immediately, and maximum value for this field is `300`.
+  - `sync_timeout`: The maximum time in seconds to wait for the job to complete before returning a response. If the job completes within `sync_timeout`, the response includes the job result; otherwise, the `job_id` and `status` are returned. The default is `0` (return immediately). The maximum value is `300`.
-  - `expiry_seconds`: The amount of time in seconds after the job completes (either successfully or unsuccessfully) which any output files will be preserved before being deleted. Default is `3600` (1 hours), minimum is `600` (10 minutes) and maximum value for this field is `86400` (24 hours). The text output in the response is not affected by this field.
+  - `expiry_seconds`: The number of seconds after job completion (success or failure) that any output files are preserved before deletion. Default is `3600` (1 hour); minimum is `600` (10 minutes); maximum is `86400` (24 hours). The text output in the response is not affected by this field.
```
`109-118`: **Standardize timestamps (add "Z" or state timezone).**

`created_at` and `updated_at` omit `Z` while `expires_at` includes it. Prefer RFC 3339 UTC everywhere.

```diff
-    "created_at": "2025-09-10T20:52:24.768066",
+    "created_at": "2025-09-10T20:52:24.768066Z",
 ...
-    "updated_at": "2025-09-10T20:52:30.734659"
+    "updated_at": "2025-09-10T20:52:30.734659Z"
```
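As a sketch of the suggested normalization, a stdlib-only helper (not from the toolkit's code) that renders an aware datetime as RFC 3339 UTC with a trailing `Z`:

```python
from datetime import datetime, timezone

def rfc3339_utc(dt: datetime) -> str:
    """Render an aware datetime as an RFC 3339 UTC timestamp ending in 'Z'."""
    return dt.astimezone(timezone.utc).isoformat().replace("+00:00", "Z")

# Matches the created_at value from the example response above.
ts = rfc3339_utc(datetime(2025, 9, 10, 20, 52, 24, 768066, tzinfo=timezone.utc))
print(ts)  # 2025-09-10T20:52:24.768066Z
```

Emitting all three timestamp fields through one such helper keeps them mutually consistent.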
`60-66`: **Add retrieval endpoints and failure states for async jobs.**

Document how to poll job status and fetch results/errors (route, method, response schema, typical HTTP status codes).

I can draft a short "Job Status and Retrieval" subsection once you confirm the route(s), e.g., `GET /jobs/{job_id}` and/or `GET /generate/async/{job_id}`, with optional deletion/cleanup semantics.
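Whichever route is confirmed, the client-side pattern is a poll-with-timeout loop. A minimal sketch with a stubbed status fetcher — the real HTTP call, route, and status strings are unconfirmed assumptions here:

```python
import time

def poll_job(fetch_status, job_id: str, timeout_s: float = 30.0,
             interval_s: float = 1.0) -> str:
    """Poll fetch_status(job_id) until a terminal state or the timeout.

    fetch_status stands in for an HTTP GET against a hypothetical route
    such as GET /jobs/{job_id}; "success"/"failure" are assumed terminal states.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        status = fetch_status(job_id)
        if status in ("success", "failure"):
            return status
        if time.monotonic() >= deadline:
            return "timeout"
        time.sleep(interval_s)

# Stubbed status sequence standing in for successive server responses.
states = iter(["submitted", "running", "success"])
result = poll_job(lambda _id: next(states), "job-123", interval_s=0)
print(result)  # success
```

The same loop handles a failed job (terminal `"failure"`) and a client-side give-up (`"timeout"`), which is why documenting the terminal states matters.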
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
docs/source/reference/api-server-endpoints.md
(2 hunks)
🧰 Additional context used
📓 Path-based instructions (5)
docs/source/**/*.md
📄 CodeRabbit inference engine (.cursor/rules/general.mdc)
docs/source/**/*.md
: Use the official naming: first use “NVIDIA NeMo Agent toolkit”; subsequent uses “NeMo Agent toolkit”; never use deprecated names in documentation
Documentation sources must be Markdown under docs/source; keep docs in sync and fix Sphinx errors/broken links
Documentation must be clear, comprehensive, free of TODO/FIXME/placeholders/offensive/outdated terms; fix spelling; adhere to Vale vocab allow/reject lists
Files:
docs/source/reference/api-server-endpoints.md
**/*.{py,sh,md,yml,yaml,toml,ini,json,ipynb,txt,rst}
📄 CodeRabbit inference engine (.cursor/rules/general.mdc)
**/*.{py,sh,md,yml,yaml,toml,ini,json,ipynb,txt,rst}
: Every file must start with the standard SPDX Apache-2.0 header; keep copyright years up‑to‑date
All source files must include the SPDX Apache‑2.0 header; do not bypass CI header checks
Files:
docs/source/reference/api-server-endpoints.md
**/*.{py,md}
📄 CodeRabbit inference engine (.cursor/rules/general.mdc)
Never hard‑code version numbers in code or docs; versions are derived by setuptools‑scm
Files:
docs/source/reference/api-server-endpoints.md
**/*
⚙️ CodeRabbit configuration file
**/*
: Code Review Instructions

- Ensure the code follows best practices and coding standards.
- For Python code, follow PEP 20 and PEP 8 for style guidelines.
- Check for security vulnerabilities and potential issues.
- Python methods should use type hints for all parameters and return values. Example: `def my_function(param1: int, param2: str) -> bool: pass`
- For Python exception handling, ensure proper stack trace preservation:
  - When re-raising exceptions: use bare `raise` statements to maintain the original stack trace, and use `logger.error()` (not `logger.exception()`) to avoid duplicate stack trace output.
  - When catching and logging exceptions without re-raising: always use `logger.exception()` to capture the full stack trace information.

Documentation Review Instructions

- Verify that documentation and comments are clear and comprehensive.
- Verify that the documentation doesn't contain any TODOs, FIXMEs or placeholder text like "lorem ipsum".
- Verify that the documentation doesn't contain any offensive or outdated terms.
- Verify that documentation and comments are free of spelling mistakes; ensure the documentation doesn't contain any words listed in the `ci/vale/styles/config/vocabularies/nat/reject.txt` file. Words that might appear to be spelling mistakes but are listed in the `ci/vale/styles/config/vocabularies/nat/accept.txt` file are OK.

Misc.

- All code (except .mdc files that contain Cursor rules) should be licensed under the Apache License 2.0, and should contain an Apache License 2.0 header comment at the top of each file.
- Confirm that copyright years are up-to-date whenever a file is changed.
Files:
docs/source/reference/api-server-endpoints.md
docs/source/**/*
⚙️ CodeRabbit configuration file
This directory contains the source code for the documentation. All documentation should be written in Markdown format. Any image files should be placed in the
docs/source/_static
directory.
Files:
docs/source/reference/api-server-endpoints.md
🪛 markdownlint-cli2 (0.17.2)
docs/source/reference/api-server-endpoints.md
21-21: Unordered list indentation
Expected: 0; Actual: 2
(MD007, ul-indent)
23-23: Unordered list indentation
Expected: 0; Actual: 2
(MD007, ul-indent)
34-34: Bare URL used
(MD034, no-bare-urls)
🔇 Additional comments (1)
docs/source/reference/api-server-endpoints.md (1)
27-39
: Validation complete — internal links and config key "db_url" verified.Quick checks: "Obtaining API Keys" anchor exists; evaluate-api.md and websockets.md are present. db_url is used consistently: defined in src/nat/front_ends/fastapi/fastapi_front_end_config.py, consumed by src/nat/front_ends/fastapi/job_store.py (falls back to NAT_JOB_STORE_DB_URL), passed through the FastAPI front-end/plugin and worker code, and covered by tests.
Actionable comments posted: 0
🧹 Nitpick comments (9)
docs/source/reference/api-server-endpoints.md (9)
`18-18`: **Use the official product name casing in the first mention.**

Change "Toolkit" → "toolkit" to match the guideline "NVIDIA NeMo Agent toolkit" on first use.

```diff
-# NVIDIA NeMo Agent Toolkit API Server Endpoints
+# NVIDIA NeMo Agent toolkit API Server Endpoints
```
`21-24`: **Fix list indentation to satisfy markdownlint MD007.**

Top-level list items shouldn't be indented.

```diff
-  - **Generate Interface:** Uses the transaction schema defined by your workflow. The interface documentation is accessible
+- **Generate Interface:** Uses the transaction schema defined by your workflow. The interface documentation is accessible
   using Swagger while the server is running [`http://localhost:8000/docs`](http://localhost:8000/docs).
-  - **Chat Interface:** [OpenAI API Documentation](https://platform.openai.com/docs/guides/text?api-mode=chat) provides
+- **Chat Interface:** [OpenAI API Documentation](https://platform.openai.com/docs/guides/text?api-mode=chat) provides
   details on chat formats compatible with the NeMo Agent toolkit server.
```
`34-38`: **Split run-on sentence and avoid bare URL (MD034).**

Improve readability and lint compliance.

```diff
-The following examples assume that the simple calculator workflow has been installed and is running on http://localhost:8000 to do so run the following commands:
+The following examples assume the simple calculator workflow is installed and running at <http://localhost:8000>. To do so, run:
```

```bash
uv pip install -e examples/getting_started/simple_calculator
nat serve --config_file examples/getting_started/simple_calculator/configs/config.yml
```

---

`60-66`: **Clarify Dask/SQLAlchemy configuration, fix grammar, and mention `max_running_async_jobs`.**

- Break up run-ons.
- "metadata are stored" (plural).
- Document how `max_running_async_jobs` controls local parallelism when no external scheduler is used.

Please confirm `max_running_async_jobs` maps to LocalCluster workers in code.

```diff
-This endpoint is only available when the `async_endpoints` optional dependency extra is installed. For users installing from source, this can be done by running `uv pip install -e .[async_endpoints]` from the root directory of the NeMo Agent toolkit library. Similarly, for users installing from PyPI, this can be done by running `pip install nvidia-nat[async_endpoints]`.
+This endpoint is only available when the `async_endpoints` optional dependency extra is installed. For source installs, run `uv pip install -e .[async_endpoints]` from the repository root. For PyPI installs, run `pip install nvidia-nat[async_endpoints]`.
-Asynchronous jobs are managed using [Dask](https://docs.dask.org/en/stable/) by default a local Dask cluster is created at start time, however you can also configure the server to connect to an existing Dask scheduler by setting the `scheduler_address` configuration parameter. The Dask scheduler is used to manage the execution of asynchronous jobs, and can be configured to run on a single machine or across a cluster of machines. Job history and metadata is stored in a SQL database using [SQLAlchemy](https://www.sqlalchemy.org/) by default a temporary SQLite database is created at start time, however you can also configure the server to use a persistent database by setting the `db_url` configuration parameter. Refer to the [SQLAlchemy documentation](https://docs.sqlalchemy.org/en/20/core/engines.html#database-urls) for the format of the `db_url` parameter. Any database supported by [SQLAlchemy's Asynchronous I/O extension](https://docs.sqlalchemy.org/en/20/orm/extensions/asyncio.html) can be used, refer to [SQLAlchemy's Dialects](https://docs.sqlalchemy.org/en/20/dialects/index.html) for a complete list (many but not all of these support Asynchronous I/O).
+Asynchronous jobs are managed using [Dask](https://docs.dask.org/en/stable/). By default, if `scheduler_address` is not set, the server creates a local Dask cluster at startup. The number of concurrent jobs is controlled by the `max_running_async_jobs` configuration option. To use an external scheduler, set `scheduler_address`.
+
+Job history and metadata are stored in a SQL database using [SQLAlchemy](https://www.sqlalchemy.org/). By default, a temporary SQLite database is created at start time. To use a persistent database, set the `db_url` configuration parameter. Refer to the [SQLAlchemy documentation](https://docs.sqlalchemy.org/en/20/core/engines.html#database-urls) for the `db_url` format. Any database supported by [SQLAlchemy's asynchronous I/O extension](https://docs.sqlalchemy.org/en/20/orm/extensions/asyncio.html) can be used; see [SQLAlchemy's Dialects](https://docs.sqlalchemy.org/en/20/dialects/index.html) for a complete list (many but not all support asynchronous I/O).
```
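Under these suggestions, a front-end configuration sketch might look like the following. The key nesting is an assumption; the option names `max_running_async_jobs`, `scheduler_address`, and `db_url` come from the discussion above:

```yaml
general:
  front_end:
    _type: fastapi
    # Sizes the local Dask cluster when no external scheduler is configured.
    max_running_async_jobs: 4
    # Uncomment to attach to an existing Dask scheduler instead.
    # scheduler_address: "tcp://dask-scheduler:8786"
    # Uncomment for a persistent job store instead of the temporary SQLite DB.
    # db_url: "sqlite+aiosqlite:////tmp/nat_jobs.db"
```

Documenting a snippet like this alongside the prose would make the local-cluster vs. external-scheduler choice concrete.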
`71-74`: **Tighten wording and fix grammar in optional fields.**

Subject–verb agreement, clearer phrasing, and correct "1 hour".

```diff
- - `sync_timeout`: The maximum time in seconds to wait for the job to complete before returning a response. If the job completes in less than `sync_timeout` seconds then the response will include the job result, otherwise the `job_id` and `status` is returned. Default is `0` which causes the request to return immediately, and maximum value for this field is `300`.
+ - `sync_timeout`: The maximum time in seconds to wait for the job to complete before returning a response. If the job completes within `sync_timeout`, the response includes the job result; otherwise, the `job_id` and `status` are returned. Default is `0` (return immediately). The maximum value is `300`.
- - `expiry_seconds`: The amount of time in seconds after the job completes (either successfully or unsuccessfully) which any output files will be preserved before being deleted. Default is `3600` (1 hours), minimum is `600` (10 minutes) and maximum value for this field is `86400` (24 hours). The text output in the response is not affected by this field.
+ - `expiry_seconds`: The number of seconds after the job completes (success or failure) during which any output files are preserved before deletion. Default is `3600` (1 hour), minimum is `600` (10 minutes), and the maximum is `86400` (24 hours). The text output in the response is not affected by this field.
```
`109-118`: **Use consistent ISO 8601 timestamps with timezone.**

Add "Z" (UTC) or an explicit offset for consistency with `expires_at`.

```diff
-    "created_at": "2025-09-10T20:52:24.768066",
+    "created_at": "2025-09-10T20:52:24.768066Z",
 @@
-    "updated_at": "2025-09-10T20:52:30.734659"
+    "updated_at": "2025-09-10T20:52:30.734659Z"
```
`345-357`: **Standardize product name casing.**

Use "NeMo Agent toolkit" (lowercase "toolkit") in narrative text.

```diff
-The NeMo Agent Toolkit provides full OpenAI Chat Completions API compatibility through a dedicated endpoint that enables seamless integration with existing OpenAI-compatible client libraries and workflows.
+The NeMo Agent toolkit provides full OpenAI Chat Completions API compatibility through a dedicated endpoint that enables seamless integration with existing OpenAI-compatible client libraries and workflows.
```
`467-471`: **Standardize product name casing in code comment.**

Align with naming guideline.

```diff
-# Initialize client pointing to your NeMo Agent Toolkit server
+# Initialize client pointing to your NeMo Agent toolkit server
```
`534-536`: **Standardize section heading casing.**

Use "NeMo Agent toolkit".

```diff
-## NeMo Agent Toolkit API Server Interaction Guide
+## NeMo Agent toolkit API Server Interaction Guide
```
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
docs/source/reference/api-server-endpoints.md
(2 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: CI Pipeline / Check
Need quotes around `.[async_endpoints]` because some shells are picky.
…avid-document-async-endpoints Signed-off-by: David Gardner <[email protected]>
Co-authored-by: lvojtku <[email protected]> Signed-off-by: David Gardner <[email protected]>
Co-authored-by: lvojtku <[email protected]> Signed-off-by: David Gardner <[email protected]>
Co-authored-by: lvojtku <[email protected]> Signed-off-by: David Gardner <[email protected]>
Co-authored-by: Will Killian <[email protected]> Signed-off-by: David Gardner <[email protected]>
Co-authored-by: lvojtku <[email protected]> Signed-off-by: David Gardner <[email protected]>
…-nv/AIQtoolkit into david-document-async-endpoints Signed-off-by: David Gardner <[email protected]>
Co-authored-by: lvojtku <[email protected]> Signed-off-by: David Gardner <[email protected]>
Co-authored-by: lvojtku <[email protected]> Signed-off-by: David Gardner <[email protected]>
Co-authored-by: lvojtku <[email protected]> Signed-off-by: David Gardner <[email protected]>
Co-authored-by: Will Killian <[email protected]> Signed-off-by: David Gardner <[email protected]>
Signed-off-by: David Gardner <[email protected]>
…-nv/AIQtoolkit into david-document-async-endpoints Signed-off-by: David Gardner <[email protected]>
Signed-off-by: David Gardner <[email protected]>
/merge
Description
• Documents the `/generate/async` endpoint
• Documents the `max_running_async_jobs` config parameter

By Submitting this PR I confirm:
Summary by CodeRabbit
• New Features
• Added an asynchronous generate API endpoint (/generate/async) with job IDs, optional timeouts, and output expiry; supports external schedulers and SQL-backed job history.
• Added an OpenAI v1-compatible API endpoint option.
• Made async worker count configurable (one thread per worker).
• Documentation
• Expanded startup guide, API key setup, install/run examples, migration guidance, and detailed async/streaming request/response examples (including intermediate streaming payloads).
• Noted installation requirement and install commands for the evaluation endpoint.