huggingface / trl Public

generated from fastai/nbdev_template

Fork 2.2k
Star 15.4k

Code
Issues 482
Pull requests 76
Discussions
Actions
Projects
Security
Insights

Pull requests: huggingface/trl

Labels 33 Milestones 0

76 Open 1,906 Closed

Pull requests list

🧨 DFT

#4042 opened Sep 9, 2025 by qgallouedec

5 tasks

[GRPO] Fix potential hang in get_high_entropy_mask

#4041 opened Sep 9, 2025 by akakakakakaa

Remove average_tokens_across_devices default replacement

#4039 opened Sep 8, 2025 by qgallouedec

Fix label shifting logic in SFTTrainer for compatibility with CP

#4038 opened Sep 8, 2025 by qgallouedec

5 tasks

Remove setting chat template in sft script

#4037 opened Sep 8, 2025 by qgallouedec

Add autodoc for BestOfNSampler and improve docstrings

#4034 opened Sep 8, 2025 by albertvillanova

[GRPO VLM] Update split sizes to generalize

#4032 opened Sep 8, 2025 by zucchini-nlp

Enable XPU for vllm client

#4031 opened Sep 8, 2025 by jiqing-feng • Draft

vllm sleep mode support

#4028 opened Sep 8, 2025 by ved1beta

2 of 5 tasks

Made ref_model as None in PPO trainer for refined args

#4024 opened Sep 7, 2025 by complete-dope

Fix #3982: Fix DPO Trainer support for Gemma 3 vision models

#4022 opened Sep 6, 2025 by akshay-babbar

Fix passing model kwargs

#4019 opened Sep 5, 2025 by qgallouedec

Fix: undefined current_gradient_accumulation_steps

#4014 opened Sep 5, 2025 by ysjprojects

2 of 5 tasks

Fix: ignore precompute_ref_log_probs when use_liger_loss=True

#4008 opened Sep 4, 2025 by ginkyenglee

5 tasks

Improve typing of SFT trainer

#4007 opened Sep 4, 2025 by cyyever

⚖️ Align SFT and DPO for model creation and deprecate DPOConfig.padding_value in favour of pad_token_id

#4006 opened Sep 4, 2025 by qgallouedec

5 tasks

✨ Improve SFT doc

#4005 opened Sep 4, 2025 by qgallouedec

5 tasks

Remove attention mask when position ids is returned

#3997 opened Sep 2, 2025 by qgallouedec • Draft

Fix: Make sft script work when chat template is None

#3995 opened Sep 2, 2025 by rabinadk1

1 of 5 tasks

[GFPO]: implement GFPO in GRPOTrainer

#3989 opened Sep 1, 2025 by Peter-Chou

3 of 5 tasks

Enable saving and loading precomputed reference log probabilities in …

#3986 opened Sep 1, 2025 by ginkyenglee

3 tasks

Dft

#3960 opened Aug 27, 2025 by 1485840691

5 tasks

fix bug when using dataset streaming by accelerate

#3950 opened Aug 25, 2025 by kaixuanliu

Docker update

#3931 opened Aug 20, 2025 by qgallouedec

5 tasks

[DRAFT] Refactor DPO

#3906 opened Aug 15, 2025 by qgallouedec • Draft

5 tasks
