Replies: 6 comments 22 replies
-
Absolutely - totally happy to put that in; mind opening an issue?
-
Gotcha, no worries - Mistral should fit, maybe that's worth a try?
Also, if you want to make a section in the README with what models work well
for you, that would be awesome (similar to the 72 and 20 GiB sections). I'd
like to give folks with less VRAM some recommended models.
On Sat, Jun 22, 2024, 12:52, Logge wrote:
Do the new features do what you had in mind?
Looks perfect. Will have to test with a proper model, though. Only got 12 GB of
VRAM and have to offload most things, so I'm mainly using llama3:8b.
-
Gotcha, I didn't pick the 20GB models very well... I truthfully just picked
some of the top models on Ollama.
It would be great to see which fine-tunes work best! Although, if needed,
I'm happy to fine-tune some of those 8-14B models, as I can fit those in
VRAM.
Curious to see how the proprietary models compare. If needed, I'm happy to
test one or two of your prompts so you can compare with the proprietary
models.
On Sat, Jun 22, 2024, 18:25, Logge wrote:
Yeah, I'm mainly orienting myself by the LMSYS leaderboard. Llama3 is
insanely good for its size. There are some roleplay fine-tunes which might
work out well; I'll have to test. In my tests it outperforms the small
Mixtral. Ollama also quantizes the models by default. Your 20GB section
should easily fit in cards up to 10GB if you don't load all models at once.
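For anyone with a similar VRAM budget, here's roughly what "not loading all models at once" could look like with the ollama Python client - just a sketch, assuming your Ollama version supports the `keep_alive` option (recent ones do):

```python
# Sketch only: assumes the `ollama` Python package and a server that
# honours `keep_alive` (used here to unload each model right after its call).
import ollama


def run_once(model: str, prompt: str) -> str:
    """Query a model, then ask the server to unload it immediately,
    so several quantized models can share a ~10GB card one at a time."""
    response = ollama.chat(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        keep_alive=0,  # unload now instead of keeping it resident for ~5 minutes
    )
    return response["message"]["content"]


# Example: alternate between two models without holding both in VRAM.
outline = run_once("llama3:8b", "Outline a short mystery story in five bullet points.")
critique = run_once("mistral", "Critique this outline:\n" + outline)
```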
-
Awesome! I just made you a collaborator on the repo.
No worries about code quality - let's do the open-issue-and-new-branch
approach, and we should be good to go.
Regarding cloud models, that sounds good; maybe one of the first things we
do is open an issue on this repo and implement cloud models fully.
On Sun, Jun 23, 2024, 05:55, Logge wrote:
I'm unsure if my code quality is good enough to push directly, but I can
try.
Yeah, LLaMA 400B will be insane but quite costly.
I'm mainly looking at the quality/cost chart:
https://artificialanalysis.ai/models
As you can see, the Pareto front spans llama3 70b, gemini flash, and
llama3 8b, so you already made a pretty good choice of models.
So I'd say:
- small hardware: llama3 8b
- large hardware: llama3 70b
- no hardware: gemini-flash / claude 3.5 sonnet
It might be an idea to implement cloud providers in a proper way in the
future; right now my implementation is really hacky.
I'll keep my eyes open for good fine-tunes.
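A rough sketch of what a less hacky provider abstraction could look like, so local Ollama models and cloud models sit behind one interface - class names are made up, and the Gemini part assumes the `google-generativeai` package with an API key in `GOOGLE_API_KEY`:

```python
# Hypothetical provider interface; not the project's actual code.
from abc import ABC, abstractmethod
import os

import ollama


class ModelProvider(ABC):
    """Common interface so story-generation code never cares where the model runs."""

    @abstractmethod
    def generate(self, prompt: str) -> str: ...


class OllamaProvider(ModelProvider):
    def __init__(self, model: str = "llama3:8b"):
        self.model = model

    def generate(self, prompt: str) -> str:
        response = ollama.chat(
            model=self.model,
            messages=[{"role": "user", "content": prompt}],
        )
        return response["message"]["content"]


class GeminiProvider(ModelProvider):
    def __init__(self, model: str = "gemini-1.5-flash"):
        import google.generativeai as genai  # optional cloud dependency
        genai.configure(api_key=os.environ["GOOGLE_API_KEY"])
        self._model = genai.GenerativeModel(model)

    def generate(self, prompt: str) -> str:
        return self._model.generate_content(prompt).text


def get_provider(name: str) -> ModelProvider:
    """Pick a backend from config so swapping models is a one-line change."""
    return GeminiProvider() if name.startswith("gemini") else OllamaProvider(name)
```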
-
Huh, I'd like to test it with llama70b.
What does your cloud model system look like?
On Sun, Jun 23, 2024, 15:04, Logge wrote:
I did a bit of testing, and it seems like gemini-flash barely needs
corrections (it will succeed in one correction pass). It's more of a
problem with the small models (e.g. llama3:8b).
Will do some more testing, though.
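For context, a correction pass here can be as simple as a bounded retry loop. This is only a sketch, and `passes_checks` is a stand-in for whatever validation the project actually runs:

```python
import ollama

MAX_CORRECTION_PASSES = 3  # small models like llama3:8b tend to need more passes


def passes_checks(text: str) -> tuple[bool, str]:
    # Stand-in validator; a real check would verify plot/format constraints.
    problems = "" if len(text.split()) >= 100 else "The chapter is too short; expand it."
    return problems == "", problems


def generate_with_corrections(model: str, prompt: str) -> str:
    """Generate once, then feed any reported problems back to the model
    for a bounded number of correction passes."""
    def ask(p: str) -> str:
        return ollama.chat(
            model=model,
            messages=[{"role": "user", "content": p}],
        )["message"]["content"]

    text = ask(prompt)
    for _ in range(MAX_CORRECTION_PASSES):
        ok, problems = passes_checks(text)
        if ok:
            break
        text = ask(f"Revise the text below to fix this problem: {problems}\n\n{text}")
    return text
```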
-
In branch 18, I've started creating a proper interface wrapper, but I'm getting too tired tonight to finish integrating it properly. If you're able to look at what I've done and finish it, that would be fantastic - no worries if not. I can try to get to it tomorrow afternoon-ish, but tomorrow is going to be a busy day for me.
I've made an interface class that holds the client and keeps the generation history and such within that class. Not sure if that's a good idea or not, but it might simplify the LangChain chains. The client itself is still an Ollama object, but maybe you can have it be smarter than that.
The rest of the code needs to be updated so it uses the interface or something similar - right now it'll still try to use the old client (which doesn't exist anymore).
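Roughly the shape I mean - a sketch only, not the actual branch-18 code, and the names are made up:

```python
import ollama


class ModelInterface:
    """Holds the client plus the running generation history, so chains
    don't have to thread message lists around themselves."""

    def __init__(self, model: str = "llama3:8b", host: str = "http://localhost:11434"):
        self.model = model
        self.client = ollama.Client(host=host)  # still an Ollama client for now
        self.history: list[dict] = []           # full conversation kept here

    def generate(self, prompt: str) -> str:
        self.history.append({"role": "user", "content": prompt})
        response = self.client.chat(model=self.model, messages=self.history)
        reply = response["message"]["content"]
        self.history.append({"role": "assistant", "content": reply})
        return reply

    def reset(self) -> None:
        """Start a fresh conversation, e.g. between chapters."""
        self.history.clear()
```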
-
It would be quite cool to have a final model translate the story to another language. This could be a smaller model that gets fed each chapter plus some general info about the story and translates it.
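A rough sketch of what that final translation pass could look like with a small local model - the model tag and prompt wording are just placeholders:

```python
import ollama

TRANSLATOR_MODEL = "llama3:8b"  # placeholder; any small multilingual model could go here


def translate_chapter(chapter: str, story_info: str, target_language: str) -> str:
    """Feed one chapter plus general story info to a small model and return the translation."""
    prompt = (
        f"You are translating a novel into {target_language}.\n"
        f"General information about the story (names, tone, setting):\n{story_info}\n\n"
        f"Translate the following chapter, keeping names and style consistent:\n\n{chapter}"
    )
    response = ollama.chat(
        model=TRANSLATOR_MODEL,
        messages=[{"role": "user", "content": prompt}],
    )
    return response["message"]["content"]
```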