@usulkies usulkies commented Jun 16, 2025

feat: Add S3 artifact repository configuration options for parallel transfers and resource limits

Fixes #12442, #9022, #4014

Motivation

The current S3 artifact repository implementation has several limitations that can cause performance and stability issues:

  1. No control over transfer concurrency, which can lead to OOM errors when handling large files or many files
  2. Excessive memory use, because there are no resource limits and multipart uploads cannot be configured
  3. Sequential file transfers, which make operations on many files slow

Modifications

  1. Added configuration options for S3 artifact repository:

    • parallelTransfers: Control number of concurrent transfers (default: runtime.NumCPU()*2, capped at 32)
    • multipartPartSize: Size of each part in multipart uploads (default: 5MB)
    • multipartConcurrency: Number of concurrent multipart uploads (default: 4)
  2. Implemented worker pool for parallel transfers:

    • Created workflow/artifacts/common/pool/pool.go for managing concurrent operations
    • Added parallel transfer support for both upload and download operations
    • Configurable through environment variables (ARGO_S3_PARALLEL_TRANSFERS, etc.)
  3. Added resource limits configuration:

    • Memory limits for S3 operations
    • Configurable through podSpecPatch for init containers

Verification

  1. Unit tests:

    • Added tests for parallel transfer functionality
    • Verified worker pool implementation
    • Tested with large files and many files
  2. Performance testing:

    • Verified parallel transfer performance improvements
    • Tested memory usage with different configurations
    • Confirmed OOM prevention with appropriate limits

Documentation

  1. Updated S3 artifact repository configuration documentation:

    • Added new configuration options
    • Added environment variable overrides
    • Added examples for different use cases
  2. Added code comments explaining:

    • Worker pool implementation
    • Resource limits configuration
    • Performance considerations

@usulkies usulkies marked this pull request as draft June 16, 2025 15:14
…ransfers and resource limits

Signed-off-by: usulkies <[email protected]>
Signed-off-by: Uziel Sulkies <[email protected]>
@usulkies usulkies force-pushed the uziel/s3_parallel branch from 4c09dff to c52f206 Compare June 16, 2025 15:49
usulkies and others added 4 commits June 16, 2025 17:02
- Implement parallel directory uploads/downloads using worker pools

- Add ParallelTransfers, MultipartPartSize, MultipartConcurrency config options

- Support environment variable overrides (ARGO_S3_PARALLEL_TRANSFERS, etc.)

- Auto-detect optimal parallelism based on CPU count when not configured

- Add comprehensive test coverage for parallelism and configuration options

- Improve memory efficiency for large directory operations using streaming approach

This addresses performance issues with large S3 directory operations by enabling concurrent file transfers while maintaining backward compatibility.

Signed-off-by: Uziel Sulkies <[email protected]>
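
The override and auto-detection behavior described in the commit message above could look something like the sketch below. It assumes the environment variable takes precedence over the configured value, that invalid or non-positive values are ignored, and that the cap of 32 applies only to the CPU-based default; none of these details are confirmed by the PR text. The names `resolveParallelism` and `defaultParallelism` are hypothetical.

```go
package main

import (
	"fmt"
	"os"
	"runtime"
	"strconv"
)

// maxDefaultParallelism mirrors the "capped at 32" default from the PR description.
const maxDefaultParallelism = 32

// defaultParallelism returns numCPU*2, capped at 32 and never below 1.
func defaultParallelism(numCPU int) int {
	n := numCPU * 2
	if n > maxDefaultParallelism {
		n = maxDefaultParallelism
	}
	if n < 1 {
		n = 1
	}
	return n
}

// resolveParallelism is a hypothetical helper. Assumed precedence:
// valid environment value, then configured value, then CPU-based default.
func resolveParallelism(configured int, envValue string, numCPU int) int {
	if v, err := strconv.Atoi(envValue); err == nil && v > 0 {
		return v
	}
	if configured > 0 {
		return configured
	}
	return defaultParallelism(numCPU)
}

func main() {
	n := resolveParallelism(0, os.Getenv("ARGO_S3_PARALLEL_TRANSFERS"), runtime.NumCPU())
	fmt.Println("parallel transfers:", n)
}
```

Passing the CPU count and the raw environment string as arguments, rather than reading them inside the function, keeps the resolution logic easy to unit test.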
@usulkies usulkies marked this pull request as ready for review June 17, 2025 12:56
@usulkies usulkies marked this pull request as draft June 17, 2025 14:30
Member

@Joibel Joibel left a comment

Just some initial thoughts on this PR. Thanks for working on it.

@usulkies usulkies marked this pull request as ready for review June 19, 2025 07:32
@usulkies usulkies marked this pull request as draft June 19, 2025 21:48
Development

Successfully merging this pull request may close these issues.

Implement parallelization to speed up S3 artifacts upload and download
2 participants