This is the repository for the accelerated discovery orchestrator
(ado
).
ado
is a unified platform for executing computational experiments at
scale and analysing their results. It can be extended with new experiments
or new analysis tools. It allows distributed teams of researchers and engineers
to collaborate on projects, execute experiments, and share data.
You can run the experiments and analysis tools already available in ado
in
a distributed, shared, environment with your team. You can also use ado
to
get features like data-tracking, data-sharing, tool integration and a CLI, for
your analysis method or experiment for free.
🧑💻 Using ado
assumes familiarity with command line tools.
🛠️ Developing ado
requires knowledge of python.
- 💻 CLI: Our human-centric CLI follows best practices
- 🤝 Projects: Allow distributed groups of users to collaborate and share data
- 🔌 Extendable: Easily add new experiments, optimizers or other tools.
- ⚙️ Scalable: We use ray as our execution engine allowing experiments and tools to easily scale
- ♻️ Automatic data-reuse: Avoid repeating work with
transparent reuse of experiment results.
ado
internal protocols ensure this happens only when it makes sense - 🔗 Provenance: As you work, the relationship between the data you create and operations you perform are automatically tracked
- 🔎 Optimization and sampling: Out-of-the-box, leverage powerful optimization
methods
via
raytune
or use our flexible in built sampler
We have developed ado
plugins providing advanced experiments for testing
foundation-models:
- ⏱️ fine-tuning performance benchmarking
- ⏱️ inference performance benchmarking (using the vLLM performance benchmark)
- COMING SOON 🔮 inference and fine-tuning prediction
A basic installation of ado
only requires a recent Python version (3.10+).
This will allow you to run
many of our examples and explore
ado features.
Some advanced features have additional requirements:
- Distributed Projects (Optional): To support projects with multiple users you will need a remote, accessible, MySQL database. See here for more
- Multi-Node Execution (Optional): To support multi-node or scaling execution you may need a multi-node RayCluster. See here for more details
In addition ado
plugins may have additional requirements for executing
realistic experiments. For example,
- Fine-Tuning Benchmarking: Requires a RayCluster with GPUs
- vLLM Performance Benchmarking: Requires an OpenShift cluster with GPUs
To install you can execute the following (we recommend you set up a virtual environment)
git clone https://github.com/IBM/ado.git
cd ado
pip install .
Alternate instructions to install ado
can be found here:
https://ibm.github.io/ado/getting-started/install/
Instructions for developing ado are available in DEVELOPING.
To run unit-tests read tests/README.md.
This video shows listing actuators and getting the details of an experiment.
Check demo for more videos.
step1_trimmed.mp4
For more details on the Discovery Spaces concept underlying ado, please refer to this technical report.
This project is partially funded by the European Union through the Smart Networks and Services Joint Undertaking (SNS JU) under grant agreement No. 101192750 (Project 6G-DALI).