
Probably simple - I cannot rectify A0 from an openai_streaming.py Error #217

Open
SavageHobbies opened this issue Oct 25, 2024 · 0 comments


I do not see anyone else with this error, but I have been running into it for the past week or so. I do not know if it has something to do with embeddings, because I do not really understand what an embedding is, other than that it is something like a basic language model.

I do not have anything connected to OpenAI, unless I am configuring the embedding incorrectly. When I use OpenAI as my LLM there are no issues, but if I try to use any other model I get this error. Could someone help me understand what I am doing incorrectly?

Agent 0: Generating

Traceback (most recent call last):
File "openai_streaming.py", line 147, in aiter
File "openai_streaming.py", line 174, in stream
openai.APIError: Requested generation length 1024 is not possible! The provided prompt is 4142 tokens long, so generating 1024 tokens requires a sequence length of 5166, but the maximum supported sequence length is just 4096!
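
If I am reading the error right, the failure is plain arithmetic: the model's context window has to hold the prompt and the requested completion together, and here it cannot. A minimal sketch of that budget check, with the numbers copied from the traceback (4096 is the maximum sequence length the endpoint reports for this model):

# Numbers taken from the traceback above.
prompt_tokens = 4142          # tokens already used by the prompt
requested_generation = 1024   # tokens Agent 0 asked the model to generate
max_sequence_length = 4096    # context window the endpoint reports

required = prompt_tokens + requested_generation   # 5166
overflow = required - max_sequence_length         # 1070 tokens over budget
print(f"need {required}, limit {max_sequence_length}, over by {overflow}")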

Here are the relevant parts of my initialize.py:

# main chat model used by agents (smarter, more accurate)

# chat_llm = models.get_openai_chat(model_name="gpt-4o-mini", temperature=0)
# chat_llm = models.get_ollama_chat(model_name="llama3.2:3b-instruct-fp16", temperature=0)
# chat_llm = models.get_lmstudio_chat(model_name="lmstudio-community/Llama-3.2-3B-Instruct-Q8_0-GGUF", temperature=0)
# chat_llm = models.get_openrouter_chat(model_name="google/gemma-2-9b-it", temperature=0)
# chat_llm = models.get_azure_openai_chat(deployment_name="gpt-4o-mini", temperature=0)
# chat_llm = models.get_anthropic_chat(model_name="claude-3-5-sonnet-20240620", temperature=0)
# chat_llm = models.get_google_chat(model_name="gemini-1.5-flash", temperature=0)
# chat_llm = models.get_mistral_chat(model_name="mistral-small-latest", temperature=0)
# chat_llm = models.get_codestral_chat(model_name="codestral-mistral", temperature=0)
# chat_llm = models.get_groq_chat(model_name="llama3-groq-70b-8192-tool-use-preview", temperature=0)
chat_llm = models.get_sambanova_chat(model_name="Meta-Llama-3.2-3B-Instruct", temperature=0)

# utility model used for helper functions (cheaper, faster)
utility_llm = chat_llm

# embedding model used for memory
# embedding_llm = models.get_openai_embedding(model_name="text-embedding-3-small")
# embedding_llm = models.get_ollama_embedding(model_name="nomic-embed-text:v1.5")
embedding_llm = models.get_huggingface_embedding(model_name="sentence-transformers/all-MiniLM-L6-v2")
# embedding_llm = models.get_lmstudio_embedding(model_name="nomic-ai/nomic-embed-text-v1.5-GGUF")

# agent configuration
config = AgentConfig(
    chat_model = chat_llm,
    utility_model = utility_llm,
    embeddings_model = embedding_llm,
    # prompts_subdir = "default",
    # memory_subdir = "",
    knowledge_subdirs = ["default","custom"],
    auto_memory_count = 0,
    # auto_memory_skip = 2,
    # rate_limit_seconds = 60,
    # rate_limit_requests = 3,
    # rate_limit_input_tokens = 0,
    # rate_limit_output_tokens = 6900,
    # msgs_keep_max = 25,
    # msgs_keep_start = 5,
    # msgs_keep_end = 10,
    max_tool_response_length = 3000,
    # response_timeout_seconds = 60,
    code_exec_docker_enabled = True,
    # code_exec_docker_name = "agent-zero-exe",
    # code_exec_docker_image = "frdel/agent-zero-exe:latest",
    # code_exec_docker_ports = { "22/tcp": 50022 }
    # code_exec_docker_volumes = { 
        # files.get_abs_path("work_dir"): {"bind": "/root", "mode": "rw"},
        # files.get_abs_path("instruments"): {"bind": "/instruments", "mode": "rw"},
        #                         },
    code_exec_ssh_enabled = True,
    # code_exec_ssh_addr = "localhost",
    # code_exec_ssh_port = 50022,
    # code_exec_ssh_user = "root",
    # code_exec_ssh_pass = "toor",
    # additional = {},
)
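
In case it clarifies what I am hoping for: here is a rough sketch of the kind of clamp I imagine would avoid the crash. safe_max_new_tokens is a hypothetical helper of mine, not part of Agent Zero:

def safe_max_new_tokens(prompt_token_count: int,
                        requested: int = 1024,
                        context_window: int = 4096,
                        floor: int = 64) -> int:
    # Clamp the generation length so prompt plus output fits the window.
    available = context_window - prompt_token_count
    if available < floor:
        raise ValueError(
            f"prompt already uses {prompt_token_count} of {context_window} "
            f"tokens; the conversation history must be trimmed first"
        )
    return min(requested, available)

# With the traceback's numbers, available = 4096 - 4142 = -46, so no
# generation length is safe and the prompt itself has to shrink.

Since the prompt alone (4142 tokens) already exceeds the 4096-token window, I assume shrinking the prompt is unavoidable, e.g. via the commented-out msgs_keep_max above or a smaller max_tool_response_length.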