JetBrains AI Assistant integration? #694

Open
nikAizuddin opened this issue Feb 1, 2025 · 7 comments
Comments

@nikAizuddin

Is RamaLama a drop-in replacement for Ollama? I'm trying to use RamaLama with JetBrains AI Assistant, but it fails to connect.

[screenshot: JetBrains AI Assistant settings with Ollama enabled]

Logs:

request: GET / 192.168.***.*** 200

It seems RamaLama returned "gzip is not supported by this browser" to JetBrains AI Assistant:

[screenshot]
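
Outside the IDE, the same response can be reproduced with a plain HTTP client. This is a minimal sketch, assuming the server started in the steps below is listening on localhost:8080 (adjust host/port to your setup); Python's urllib does not advertise gzip support, which seems to be what trips the server:

import urllib.request

# urllib sends no "Accept-Encoding: gzip" by default, mimicking a client
# that does not advertise gzip support.
with urllib.request.urlopen("http://localhost:8080/") as resp:
    print(resp.status)           # 200, matching the request log above
    print(resp.read().decode())  # the "gzip is not supported by this browser" text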

Steps to reproduce

  1. ramalama --image localhost/ramalama/rocm-gfx9:latest serve qwen2.5-coder:7b
  2. Launch a JetBrains IDE (such as PyCharm Professional). Enable Ollama as shown in the screenshot.
@dougsland
Collaborator

@nikAizuddin is https://www.jetbrains.com/pycharm/ a valid IDE to test with? I've never used JetBrains; I would like to give it a try.

@nikAizuddin
Author

> @nikAizuddin is https://www.jetbrains.com/pycharm/ a valid IDE to test with? I've never used JetBrains; I would like to give it a try.

Yup, PyCharm Community Edition should be okay.

  1. Under Settings > Plugins, install the JetBrains AI Assistant plugin.
  2. After installing the plugin, go to Settings > Tools > AI Assistant and enable Ollama.

@eye942

eye942 commented Feb 6, 2025

It looks like RamaLama uses llama.cpp directly, as opposed to Ollama, which wraps the responses in its own API.


The error message seems to be coming from:
https://github.com/ggerganov/llama.cpp/blob/aa6fb1321333fae8853d0cdc26bcb5d438e650a1/examples/server/server.cpp#L4345
This appears to be llama.cpp's handler for "/" when the webui hasn't been disabled.

Ollama shims llama.cpp: Ollama routes -> Ollama handlers -> Ollama's llama.cpp client. Ollama's API looks almost identical to llama.cpp's.

However, the key difference is that Ollama serves a status page at /:
https://github.com/ollama/ollama/blob/928911bc683a9234343e2542e1a13564dd0f2684/server/routes.go#L1167
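
For comparison, here is a quick way to see Ollama's behaviour at the root path (a rough sketch, assuming a local Ollama on its default port 11434; this is not RamaLama code):

import urllib.request

# Ollama answers GET / with a plain 200 status page, which is what clients
# like JetBrains AI Assistant appear to rely on when testing the connection.
with urllib.request.urlopen("http://localhost:11434/") as resp:
    print(resp.status, resp.read().decode())  # expected: 200 and a short status message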

Currently I see 3 possible fixes:

  1. Disable llama.cpp's native webui by default in RamaLama and serve a simple status page at <llama.cpp api>/ (a rough sketch of this idea follows below the list).
  2. Copy the shimming concept to RamaLama - this might already be covered in part by Provide model info in chat ui & allow multiple models #598 (comment).
  3. Ask JetBrains for llama.cpp API support (they should have 90% of the work done already, since their Ollama integration is working).
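
To make option 1/2 concrete, here is a rough sketch of the status-page idea (not RamaLama code): a tiny shim that answers "/" with a 200 and forwards everything else to llama.cpp. The ports and names are illustrative assumptions only.

from http.server import BaseHTTPRequestHandler, HTTPServer
import urllib.request

LLAMA_URL = "http://localhost:8080"   # where llama-server listens (assumption)
SHIM_PORT = 11434                     # port the IDE is pointed at (assumption)

class ShimHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == "/":
            # Ollama-style status page so connection tests get a 200.
            body = b"RamaLama (llama.cpp) is running"
            self.send_response(200)
            self.send_header("Content-Type", "text/plain")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
            return
        # Forward any other GET to llama.cpp untouched (error handling omitted).
        with urllib.request.urlopen(LLAMA_URL + self.path) as resp:
            data = resp.read()
            self.send_response(resp.status)
            self.send_header("Content-Type", resp.headers.get("Content-Type", "application/json"))
            self.send_header("Content-Length", str(len(data)))
            self.end_headers()
            self.wfile.write(data)

if __name__ == "__main__":
    HTTPServer(("", SHIM_PORT), ShimHandler).serve_forever()

A real shim would of course also have to forward POST bodies and streamed responses (e.g. /v1/chat/completions); this only shows the status-page part.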

@rhatdan
Member

rhatdan commented Feb 6, 2025

I would prefer 3; llama.cpp is a much more open project and the basis of most of the work being done by Ollama. JetBrains should support llama.cpp.

@ericcurtin
Collaborator

ericcurtin commented Feb 6, 2025

You've piqued my interest though, @eye942: we have the llama.cpp webui on by default in RamaLama. Does having it on cause an issue? I think it's quite useful, but I have always assumed it had no impact on the REST API (maybe a false assumption).

@eye942

eye942 commented Feb 7, 2025

@ericcurtin

Yeah, the issue raised here is a llama.cpp server-side error that occurs only when:

  1. the webui is on, and
  2. the Accept-Encoding header in the client's request to llama.cpp doesn't include gzip.

That is the only side effect of having the webui on, from what I can tell.
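
A quick way to confirm point 2, assuming the same localhost:8080 setup as in the original report:

import urllib.request

# Same GET /, but this time advertising gzip support.
req = urllib.request.Request("http://localhost:8080/",
                             headers={"Accept-Encoding": "gzip"})
with urllib.request.urlopen(req) as resp:
    # Should now return the webui page instead of the error text.
    print(resp.status, resp.headers.get("Content-Encoding"))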

If the webui is off, I believe it is still incompatible with Ollama's/OpenAI's API, because llama.cpp then serves a 404 error with a corresponding JSON body:

{"error":{"code":404,"message":"File Not Found","type":"not_found_error"}}

OpenAI's API spec doesn't specify what should be served at "/" (however, they do serve a 200 response):

{
    "message": "Welcome to the OpenAI API! Documentation is available at https://platform.openai.com/docs/api-reference"
}

I haven't tested it with JetBrains' product, but it's highly likely that they are just checking whether / serves a 200 response when testing the connection.
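
If so, their connection test would boil down to something like this (a guess at the logic described here, not JetBrains' actual code):

import urllib.request, urllib.error

def backend_reachable(base_url: str) -> bool:
    # Treat the backend as an Ollama-style server iff GET / returns 200.
    try:
        with urllib.request.urlopen(base_url.rstrip("/") + "/") as resp:
            return resp.status == 200
    except urllib.error.URLError:
        return False

print(backend_reachable("http://localhost:11434"))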

@rhatdan
Member

rhatdan commented Feb 7, 2025

We should probably turn off the webui by default for ramalama serve and then add an option to enable it.
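
As a hypothetical sketch of what that could look like when RamaLama builds the llama-server command line (build_server_args and the webui parameter are made-up names; --no-webui is a llama.cpp server option in recent builds, so verify it against the vendored version with llama-server --help):

def build_server_args(model_path: str, port: int = 8080, webui: bool = False) -> list[str]:
    # Hypothetical helper, not RamaLama's actual code.
    args = ["llama-server", "--port", str(port), "-m", model_path]
    if not webui:
        # Off by default; users opt back in with a (hypothetical) --webui flag.
        args.append("--no-webui")
    return args

print(build_server_args("/path/to/qwen2.5-coder-7b.gguf"))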
