
[Feature Request] Support for chain-of-thought models #485

Open
acolombier opened this issue Jan 21, 2025 · 10 comments
Labels
enhancement New feature or request

Comments

@acolombier

Is your feature request related to a problem? Please describe.

With the recent release of DeepSeek-R1, it would be great to add proper support for chain-of-thought models.

Describe the solution you'd like

A valid solution would include

  • A correct title, potentially generated by another model (optional): Alpaca uses the same model to generate a title for the prompt, but chain-of-thought models aren't a great fit for that use case. (See context)
  • A better UI rendering that allows collapsing the thoughts: the chain of thought can be very verbose, and while it is great to be able to see it, most of the time it is an artifact you don't need as part of your answer. This makes the chat hard to interact with. (See context)
  • Better responsiveness: currently, Alpaca does not respond well during generation, short of using a dedicated server so the local machine isn't under load.

Describe alternatives you've considered

N/A

Additional context

[screenshots attached]

@acolombier acolombier added the enhancement New feature or request label Jan 21, 2025
@Jeffser
Owner

Jeffser commented Jan 21, 2025

Hi, thanks for the suggestion. I was actually looking into this to get Deepseek R1 working better for the next Alpaca release.

I hadn't noticed the title issue. For now I'll make Alpaca use the default model for that; in the future I'll give it its own option, something like a title-generation model.

I'm also working on a new widget for handling the <think> blocks.
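Not the actual widget code, but a minimal sketch of the parsing side of such a widget, assuming the reasoning is wrapped in literal <think>…</think> tags as DeepSeek-R1 emits them (the function name `split_thoughts` is hypothetical):

```python
import re

THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)

def split_thoughts(response: str) -> tuple[str, str]:
    """Return (thoughts, answer) from a raw model response.

    The thought text is everything inside <think>...</think> blocks;
    the answer is the response with those blocks removed.
    """
    thoughts = "\n".join(m.strip() for m in THINK_RE.findall(response))
    answer = THINK_RE.sub("", response).strip()
    return thoughts, answer
```

The widget would then render `answer` in the chat bubble and put `thoughts` behind a collapsible control.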

@Jeffser
Owner

Jeffser commented Jan 21, 2025

First attempt:

[screenshot attached]

@Jeffser
Owner

Jeffser commented Jan 21, 2025

I'll set this as finished

2a60dff
a360058

@Jeffser
Owner

Jeffser commented Jan 22, 2025

Reopening so people know this is done. I also changed the appearance to an attachment button; it opens a file previewer with the text in it.

[screenshots attached]

@Jeffser Jeffser marked this as a duplicate of #492 Jan 22, 2025
@olumolu
Contributor

olumolu commented Jan 26, 2025

Please integrate the thought part into the chat; the current presentation isn't good-looking or intuitive. The way DeepSeek does it in their app is actually better.

@acolombier
Author

Looking very promising @Jeffser , thanks for that quick reaction. I'll give it a spin during the next few days.

@Aleksanaa
Contributor

Aleksanaa commented Jan 30, 2025

Two problems:

  1. Thoughts are also in Markdown format, but aren't rendered as such in the popup view.
  2. <think> isn't hidden in the chat title.
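For the second point, a hedged illustration (not Alpaca's actual code) of sanitizing model output before using it as a chat title; `clean_title` and its default length are assumptions:

```python
import re

def clean_title(raw_title: str, max_len: int = 60) -> str:
    """Strip <think>...</think> blocks (including a dangling unclosed
    <think>) and collapse whitespace before using model output as a
    chat title."""
    title = re.sub(r"<think>.*?</think>", "", raw_title, flags=re.DOTALL)
    title = re.sub(r"<think>.*", "", title, flags=re.DOTALL)  # unclosed tag
    title = " ".join(title.split())  # collapse newlines and runs of spaces
    return title[:max_len].strip()
```

Handling the unclosed tag matters because a chain-of-thought model asked for a short title may never emit the closing tag within the token budget.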

@Jeffser
Owner

Jeffser commented Jan 31, 2025

The whole Markdown processing is attached to the message widget (not attachments); I will make it more generic later so that it works in attachments too.

The default model set in preferences is now in charge of generating titles, so for now the workaround is to select one that isn't a chain-of-thought model.

I plan on rewriting all the instance manager code, and when I do, I'll split the option into a default model and a title-generation model.

@CodingKoalaGeneral

CodingKoalaGeneral commented Feb 4, 2025

Please add an option to hide the thoughts like ChatGPT does, since they often don't provide actual value to the user, who is just interested in the final result.

(Just show a thought-bubble icon in the UI; the user can click to expand it and see the generation.)

DeepSeek also sometimes has annoying emulated vocal expressions like "hmmmm, let me see...".

Off topic:
Models in general also seem to lose the recent and original context of conversations rather easily, which can be quite annoying once the previous and recent requests are large in token count.

For example, I provided the model with a large script and asked it to fix one part. When I then asked it to combine the fixed code with the original script, the AI went: what script? And generated random wannabe placeholders around the previously fixed code.

@acolombier
Author

Finally got the update via Flatpak - sorry I didn't have time to build the test Flatpak earlier! It's looking great, thanks for that! :)

One "bug" I found: if you ask the model (R1 in my case) a "forbidden" question (e.g. certain historical facts), it returns an empty <think> block, which displays a thought button with no action attached to it.

It would be great if the thought section could render dynamically as the response streams, rather than waiting for the whole response to finish, but this is probably a separate feature request. Is that something you are already working on?
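Dynamic rendering needs the <think> boundaries detected incrementally as chunks stream in. A minimal sketch under that assumption (not Alpaca's implementation; the class name is hypothetical), which routes streamed text to either a thoughts or an answer buffer so the UI can update both live:

```python
class ThinkStreamParser:
    """Incrementally split a streamed response on <think>...</think>,
    so the thought section can render while generation is ongoing."""

    def __init__(self) -> None:
        self.buffer = ""      # unconsumed tail (may hold a partial tag)
        self.in_think = False
        self.thoughts = ""
        self.answer = ""

    def feed(self, chunk: str) -> None:
        self.buffer += chunk
        while True:
            tag = "</think>" if self.in_think else "<think>"
            idx = self.buffer.find(tag)
            if idx == -1:
                # Emit everything except a tail that could be a partial tag.
                safe = len(self.buffer) - (len(tag) - 1)
                if safe > 0:
                    self._emit(self.buffer[:safe])
                    self.buffer = self.buffer[safe:]
                return
            self._emit(self.buffer[:idx])
            self.buffer = self.buffer[idx + len(tag):]
            self.in_think = not self.in_think

    def close(self) -> None:
        """Flush the remaining tail once the stream ends."""
        self._emit(self.buffer)
        self.buffer = ""

    def _emit(self, text: str) -> None:
        if self.in_think:
            self.thoughts += text
        else:
            self.answer += text
```

This also covers the empty-block case above: if `thoughts` stays empty after `close()`, the UI can simply skip the thought button.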

Anyway, happy to close this issue if you would prefer to track follow up work in other ones
