Conversation

Contributor

@1485840691 commented Aug 27, 2025

What does this PR do?

Support dynamic fine tuning

Fixes #3877 (issue)
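
A sketch of the objective this PR targets, assuming the usual Dynamic Fine-Tuning (DFT) formulation behind #3877: the per-token SFT cross-entropy is reweighted by the model's own probability of the target token, passed through stop-gradient $\operatorname{sg}(\cdot)$ so it acts as a constant scale:

$$\mathcal{L}_{\text{DFT}} = -\,\mathbb{E}\Big[\operatorname{sg}\big(p_\theta(y_t \mid y_{<t}, x)\big)\,\log p_\theta(y_t \mid y_{<t}, x)\Big]$$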

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline,
    Pull Request section?
  • Was this discussed/approved via a GitHub issue? Please add a link
    to it if that's the case.
  • Did you make sure to update the documentation with your changes?
  • Did you write any new necessary tests?

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

Collaborator

@kashif commented Aug 27, 2025

thanks @1485840691 can we have just the sft script example in this PR?

@1485840691 1485840691 marked this pull request as draft August 27, 2025 11:11
Contributor Author

@1485840691 commented Aug 27, 2025

> thanks @1485840691 can we have just the sft script example in this PR?

@kashif Thanks, this is now aligned with the main branch. The code comes from a fork of trl:main, but I created another PR and submitted it to the fork's own main branch. It seems my account can only create a single fork of trl, which is why unrelated commits showed up in this PR's change history.

@1485840691 1485840691 marked this pull request as ready for review August 27, 2025 11:33
Member

@qgallouedec left a comment


Thanks a lot for this contribution!
I'd prefer having this directly in SFTTrainer.compute_loss. Let's say with a new arg loss_type="dft" (defaulting to "cross_entropy"). What do you think?
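
A minimal sketch of what such a loss_type="dft" branch might compute, assuming the DFT weighting above; the helper name dft_cross_entropy and its signature are hypothetical, not existing TRL API:

```python
import torch
import torch.nn.functional as F


def dft_cross_entropy(
    logits: torch.Tensor, labels: torch.Tensor, ignore_index: int = -100
) -> torch.Tensor:
    """Cross-entropy where each token's loss is scaled by its detached probability."""
    # Shift so that tokens < t predict token t, as in standard causal LM training.
    logits = logits[:, :-1, :].contiguous()
    labels = labels[:, 1:].contiguous()
    per_token = F.cross_entropy(
        logits.view(-1, logits.size(-1)),
        labels.view(-1),
        ignore_index=ignore_index,
        reduction="none",
    )
    # p_theta(y_t) = exp(-ce_t); detached so the weight acts as a constant scale
    # and gradients flow only through the log-prob term.
    weights = torch.exp(-per_token).detach()
    mask = labels.view(-1) != ignore_index
    return (weights * per_token)[mask].mean()
```

With reduction="none", positions labeled ignore_index come back with zero loss, so the explicit mask keeps them out of the mean; dropping the weights term recovers the plain cross-entropy that loss_type="cross_entropy" would keep computing.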

Member

@qgallouedec commented

Closing via #4042

@qgallouedec qgallouedec closed this Sep 9, 2025
Successfully merging this pull request may close these issues.

Dynamic Fine Tuning, an improvement of SFT