Labels
feature request: Request for a new feature to be added to Accelerate
Description
System Info
accelerate: 0.12.0
OS: Linux 5.4.188+ (Colab)
Python: 3.7.13
numpy: 1.21.6
torch: 1.12.1+cu113
config: 1 CPU
Information
- The official example scripts
- My own modified scripts
Tasks
- One of the scripts in the examples/ folder of Accelerate or an officially supported no_trainer script in the examples folder of the transformers repo (such as run_no_trainer_glue.py)
- My own task or dataset (give details below)
Reproduction
MRE : https://colab.research.google.com/drive/17krCJCF_nWtNFSiMBo3oz12l7eX1bBZ6
First of all, thanks for this library and the great docs and examples that come with it 😄!
I am using a custom torch Dataset that contains a Hugging Face Dataset (pyarrow) instance. Therefore, as indicated in the Datasets docs (https://huggingface.co/docs/datasets/v2.4.0/en/use_with_pytorch#use-a-batchsampler), I tried to use a BatchSampler to reduce the number of queries. However, I have not yet been able to make it work with accelerate.
I tried many different possibilities, one of which works on one CPU or one GPU, but gets stuck when using distributed training.
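For context, here is a minimal sketch of the BatchSampler pattern the Datasets docs describe: the sampler yields a list of indices per step, and passing `batch_size=None` to the DataLoader disables automatic batching, so `__getitem__` receives the whole index list and the pyarrow-backed dataset is queried once per batch instead of once per row. The `ArrowBackedDataset` wrapper below is a hypothetical stand-in for my custom Dataset (a plain Python list replaces the actual Hugging Face Dataset); it is this loader that I then pass to `accelerator.prepare`, which is where distributed training hangs.

```python
from torch.utils.data import BatchSampler, DataLoader, Dataset, SequentialSampler

class ArrowBackedDataset(Dataset):
    """Hypothetical stand-in for a torch Dataset wrapping a HF (pyarrow) dataset."""

    def __init__(self, rows):
        # `rows` stands in for the pyarrow-backed Hugging Face Dataset instance.
        self.rows = rows

    def __len__(self):
        return len(self.rows)

    def __getitem__(self, keys):
        # With a BatchSampler, `keys` is a *list* of indices, so the backing
        # store is hit once per batch rather than once per example.
        return [self.rows[k] for k in keys]

ds = ArrowBackedDataset(list(range(10)))
batch_sampler = BatchSampler(SequentialSampler(ds), batch_size=4, drop_last=False)

# batch_size=None disables automatic batching: the DataLoader forwards each
# index list from the sampler straight to __getitem__.
loader = DataLoader(ds, sampler=batch_sampler, batch_size=None, collate_fn=lambda x: x)
batches = list(loader)
```

On one process this yields the expected index-list batches; the problem only appears once the loader goes through `accelerator.prepare` in a multi-process setup.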
Thanks for your help!