Description
Problem
The Katib SDK currently supports only ONE pip index URL, via the pip_index_url parameter. Users cannot install packages from multiple sources simultaneously (e.g., public PyPI + private repos).
Use Case
ML teams need both:
- Public packages: scikit-learn, pandas from PyPI
- Private packages: company ML libraries from private repos
Currently, users are forced to choose one OR the other, not both.
Proposed Solution
Replace pip_index_url (string) with pip_index_urls (list).
# Current (limitation)
client.tune(
    name="experiment",
    pip_index_url="https://my-repo.com/simple/",  # Only one repo
)

# Proposed (multiple repos)
client.tune(
    name="experiment",
    pip_packages=["scikit-learn", "my-private-lib"],
    pip_index_urls=[  # List of repos
        "https://pypi.org/simple/",
        "https://my-repo.com/simple/",
    ],
)

# Default behavior (if None provided)
client.tune(name="experiment")  # Uses PyPI only
Implementation
- Change the pip_index_url parameter to pip_index_urls (list) in the tune() method
- Create a utility function in utils.py to format the list as --index-url (first) and --extra-index-url (rest)
- Modify get_script_for_python_packages() to handle the list
- Default to PyPI when pip_index_urls=None
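The formatting step above could look roughly like the sketch below. The names format_pip_index_urls, build_pip_install_command, and DEFAULT_PIP_INDEX_URL are hypothetical and do not exist in the Katib SDK today; the second function only illustrates how the options might be spliced into a pip command and is not the actual body of get_script_for_python_packages().

```python
from typing import List, Optional

# Assumption: PyPI is the fallback index when the caller passes nothing.
DEFAULT_PIP_INDEX_URL = "https://pypi.org/simple"


def format_pip_index_urls(pip_index_urls: Optional[List[str]] = None) -> str:
    """Turn a list of index URLs into pip command-line options.

    The first URL becomes --index-url; every remaining URL becomes
    --extra-index-url. None or an empty list falls back to PyPI.
    """
    urls = pip_index_urls or [DEFAULT_PIP_INDEX_URL]
    options = [f"--index-url {urls[0]}"]
    options.extend(f"--extra-index-url {url}" for url in urls[1:])
    return " ".join(options)


def build_pip_install_command(
    packages: List[str], pip_index_urls: Optional[List[str]] = None
) -> str:
    """Hypothetical helper: assemble a full pip install command line."""
    index_options = format_pip_index_urls(pip_index_urls)
    return f"pip install {index_options} {' '.join(packages)}"


if __name__ == "__main__":
    print(
        build_pip_install_command(
            ["scikit-learn", "my-private-lib"],
            ["https://pypi.org/simple/", "https://my-repo.com/simple/"],
        )
    )
```

Note that pip does not give --index-url priority over --extra-index-url: it considers candidates from all configured indexes together, so the order of the list only decides which flag each URL receives.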
Files to Modify
sdk/python/v1beta1/kubeflow/katib/api/katib_client.py
sdk/python/v1beta1/kubeflow/katib/utils/utils.py
I am willing to implement this and submit a PR.
/area sdk
/good-first-issue
Why is this needed?
ML teams often need packages from multiple sources in the same experiment:
- Public packages (scikit-learn, pandas) from PyPI
- Private packages (company ML libs) from private repos
Currently Katib forces you to choose only ONE repo, blocking common enterprise ML workflows where you need both public and private dependencies.
Love this feature?
Give it a 👍 We prioritize the features with most 👍