
Conversation

oryx1729

Description

Adds a blueprint demonstrating how to build a deep research agent with the Haystack framework inside the NeMo-Agent-Toolkit, combining web search and Retrieval-Augmented Generation (RAG).

By submitting this PR I confirm:

  • I am familiar with the Contributing Guidelines.
  • We require that all contributors "sign-off" on their commits. This certifies that the contribution is your original work, or you have rights to submit it under the same license, or a compatible license.
    • Any contribution which contains commits that are not Signed-Off will not be accepted.
  • When the PR is ready for review, new or existing tests cover these changes.
  • When the PR is ready for review, the documentation is up to date with these changes.


copy-pr-bot bot commented Jul 21, 2025

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@mdemoret-nv mdemoret-nv added enhancement non-breaking Non-breaking change labels Jul 21, 2025
@mdemoret-nv
Collaborator

@oryx1729 Can you update your commits to contain the DCO? More info here: https://github.com/NVIDIA/NeMo-Agent-Toolkit/pull/461/checks?check_run_id=46372766560

@oryx1729
Author

Hi @mdemoret-nv, @mpangrazzi is on vacation and will be back next week. Would it be possible to start the review and have the DCO done when he's back?

@mdemoret-nv
Collaborator

@oryx1729 No problem.

@mdemoret-nv mdemoret-nv added feature request New feature or request and removed enhancement labels Jul 25, 2025
@mpangrazzi

@mdemoret-nv Hi! Commits should be compliant with DCO now.

@mpangrazzi mpangrazzi force-pushed the develop branch 2 times, most recently from 70b2319 to 6ccb74a Compare August 21, 2025 08:55
Collaborator

@mdemoret-nv mdemoret-nv left a comment


It would be better to utilize the `llms` top-level object.

Comment on lines 21 to 23
agent_model: nvidia/llama-3.3-nemotron-super-49b-v1
rag_model: nvidia/llama-3.3-nemotron-super-49b-v1
nvidia_api_url: https://integrate.api.nvidia.com/v1
Collaborator


We would want this to use the LLM interface for specifying these models. Instead of referencing them directly, can we use the `llms:` top-level object to specify this?

Member

@willkill07 willkill07 left a comment


Overall this looks good.

Please ensure that all "public" functions have full documentation and that all functions have type annotations. It really helps with clarity.

I can imagine in the future a complete haystack plugin with various features :)

The workflow demonstrates several key NeMo-Agent-Toolkit patterns:

- **Function Registration**: Each tool is registered as a function with its own configuration
- **Builder Pattern**: The NeMo-Agent-Toolkit Builder is used to create and manage tools and LLMs
Member


This currently does not use the builder at all.


The workflow demonstrates several key NeMo-Agent-Toolkit patterns:

- **Function Registration**: Each tool is registered as a function with its own configuration
Member


No functions are registered with the toolkit directly. You are creating internal tools and building them within your specific workflow.


general:
use_uvloop: true

Member


I can envision a more robust separation of components, but the scope of that would be far too large. Instead, let's consider some enhancements to your existing configuration:

general:
  use_uvloop: true

llms:
  rag_llm:
    _type: nim
    model: nvidia/llama-3.3-nemotron-super-49b-v1
    api_key: ${NVIDIA_API_KEY}
  agent_llm:
    _type: nim
    model: nvidia/llama-3.3-nemotron-super-49b-v1
    api_key: ${NVIDIA_API_KEY}

workflow:
  _type: haystack_deep_research_agent
  max_agent_steps: 20
  search_top_k: 10
  rag_top_k: 15
  opensearch_url: http://localhost:9200
  data_dir: /data
  index_on_startup: true

By specifying the LLMs separately, they can automatically take advantage of other parameters like temperature. You can then get the configuration from `rag_llm` and instantiate the LLM:

config = builder.get_llm_config("rag_llm")
# model_dump() returns a dict that can be unpacked with **;
# model_dump_json() would return a JSON string and fail here
generator = NVIDIAChatGenerator(**config.model_dump(exclude={"type"}))

You can do the same for `agent_llm`.
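As a minimal, self-contained sketch of this exclude-and-unpack pattern: the example below uses a stdlib dataclass as a stand-in for the toolkit's Pydantic config model and a hypothetical `ChatGenerator` in place of `NVIDIAChatGenerator` (all names here are illustrative, not the toolkit's actual API).

```python
from dataclasses import dataclass, asdict


@dataclass
class LLMConfig:
    """Hypothetical stand-in for the toolkit's Pydantic LLM config."""
    type: str      # discriminator field ("_type" in the YAML)
    model: str
    api_key: str


class ChatGenerator:
    """Hypothetical stand-in for NVIDIAChatGenerator."""
    def __init__(self, model: str, api_key: str) -> None:
        self.model = model
        self.api_key = api_key


config = LLMConfig(
    type="nim",
    model="nvidia/llama-3.3-nemotron-super-49b-v1",
    api_key="placeholder-key",
)

# Drop the discriminator before unpacking, mirroring
# config.model_dump(exclude={"type"}) on a Pydantic model.
kwargs = {k: v for k, v in asdict(config).items() if k != "type"}
generator = ChatGenerator(**kwargs)
```

The point of excluding the discriminator is that the generator's constructor only accepts its own keyword arguments; passing the `type` key through would raise a `TypeError`.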

mpangrazzi and others added 5 commits August 22, 2025 15:34
Signed-off-by: oryx1729 <[email protected]>
Signed-off-by: Michele Pangrazzi <[email protected]>
…iaChatGenerator ; Rewrote tests

Signed-off-by: Michele Pangrazzi <[email protected]>
@mpangrazzi

@willkill07 @mdemoret-nv Thank you! I've just done another iteration following your suggestions.

Labels
feature request New feature or request non-breaking Non-breaking change

4 participants