Introduce a custom system role #130

Open · wants to merge 1 commit into base: main

Conversation

@igorlima commented Dec 30, 2024

This PR addresses two open issues by introducing a custom system role.

At first glance, neither litellm nor Groq is at fault; the underlying problem is the hardcoded system role, which some models do not accept. Making the role configurable addresses both concerns.

  • here's a code snippet for you to try out:
    • snippet.py
      # Groq PDF extraction via zerox
      from pyzerox import zerox
      import pathlib
      import asyncio
      
      kwargs = {}
      custom_system_prompt = None
      custom_role = 'user'  # the new custom_role parameter introduced by this PR
      
      model = "groq/llama-3.2-11b-vision-preview"
      
      # Define main async entrypoint
      async def main():
        file_path = "data/input.pdf" ## local filepath and file URL supported
        ## process only some pages or all
        select_pages = None ## None for all, but could be int or list(int) page numbers (1 indexed)
        output_dir = None ## directory to save the consolidated markdown file (None skips saving)
        result = await zerox(file_path=file_path, model=model, output_dir=output_dir,
                            custom_system_prompt=custom_system_prompt, select_pages=select_pages,
                            custom_role=custom_role, **kwargs)
        return result
      
      # run the main function:
      result = asyncio.run(main())
      md_text = "\n".join([page.content for page in result.pages])
      
      # print markdown md_text
      print(md_text)
      # save markdown to file
      pathlib.Path("data/output-zerox-pdf.md").write_text(md_text)
      print("Markdown saved to output-zerox-pdf.md")
      
      
      """
      HOW TO RUN THIS SCRIPT:
        export GROQ_API_KEY="XXXXXXXXXXXXXXXXXXXX"
        python3 snippet.py
      
      TO INSTALL dependencies, either install from the git repo branch:
        pip install --no-cache --upgrade-strategy eager -I git+https://github.com/igorlima/zerox.git@default-role
      
      or install from a local file system folder:
        pip install ~/workstation/github/zerox
      """

The code originally worked with a hardcoded system role, but some LLMs do not accept it. This PR introduces a new custom_role parameter to address the above issues; passing custom_role='user' should resolve both of them.
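
For illustration, here is a minimal sketch of the request shape this parameter affects. The build_messages helper below is hypothetical (it is not part of py-zerox); it only shows how the configurable role changes the messages payload:

  # hypothetical helper for illustration only; not py-zerox internals
  def build_messages(system_prompt: str, image_url: str, custom_role: str = "system"):
      return [
          # the instruction prompt goes out under the configurable role ...
          {"role": custom_role, "content": system_prompt},
          # ... while the page image is always sent as user content
          {"role": "user", "content": [
              {"type": "image_url", "image_url": {"url": image_url}},
          ]},
      ]

With the default, the prompt travels as a system message; with custom_role='user', the same prompt is delivered as a user message, which sidesteps models that reject system-role content.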

Below are some notes and snippets to help refresh my memory whenever I revisit this code. They also provide newcomers with an overview of the underlying processes:

  • additional notes and snippets

    The py-zerox library interacts with LLMs via API, relying on the litellm library under the hood. Before doing any real work, py-zerox performs a couple of important checks using litellm methods:

    • model validation

      The validate_model(self) method uses litellm.supports_vision(model=self.model) to confirm that the model is a vision model. Under the hood, litellm looks the model up in a comprehensive JSON map of model metadata to determine compatibility.

      • snippets
        ({
        # https://docs.python.org/3/library/logging.html#logging-levels
        # https://github.com/BerriAI/litellm/blob/ea8f0913c2aac4a7b4ec53585cfe9ea04a3282de/litellm/_logging.py#L11
        export LITELLM_LOG=CRITICAL
        
        output=$(python3 <<EOF
        
        import litellm
        # https://github.com/BerriAI/litellm/blob/11932d0576a073d83f38a418cbdf6b2d8d4ff46f/litellm/litellm_core_utils/get_llm_provider_logic.py#L322
        litellm.suppress_debug_info = True
        # https://docs.litellm.ai/docs/debugging/local_debugging#set-verbose
        litellm.set_verbose = False
        
        # pick one model to test:
        # model = "bedrock/amazon.nova-pro-v1:0"
        # model = "bedrock/anthropic.claude-3-haiku-20240307-v1:0"
        # model = "bedrock/amazon.nova-lite-v1:0"
        model = "groq/llama-3.2-11b-vision-preview"
        is_vision_model = litellm.supports_vision(model=model)
        print("Does %s support vision? %s" % (model, is_vision_model))
        
        EOF
        )
        
        echo $output
        })
    • access validation

      The validate_access(self) method uses litellm.check_valid_key(model=self.model, api_key=None) to verify access to the model. This check ensures that environment variables are correctly set with proper values.

      • In short, litellm performs a simple API request to the given LLM. If any issue arises, it returns False; otherwise it returns True.

        • a quick note: exceptions raised during this check are not displayed; to see them, start debugging or set the appropriate debug environment variable (e.g. LITELLM_LOG=DEBUG).
      • snippets
        ({
        # https://docs.python.org/3/library/logging.html#logging-levels
        # https://github.com/BerriAI/litellm/blob/ea8f0913c2aac4a7b4ec53585cfe9ea04a3282de/litellm/_logging.py#L11
        export LITELLM_LOG=DEBUG
        export GROQ_API_KEY="xxxxxxxxxxxxxxxx"
        
        output=$(python3 <<EOF
        
        import litellm
        # https://github.com/BerriAI/litellm/blob/11932d0576a073d83f38a418cbdf6b2d8d4ff46f/litellm/litellm_core_utils/get_llm_provider_logic.py#L322
        litellm.suppress_debug_info = False
        # https://docs.litellm.ai/docs/debugging/local_debugging#set-verbose
        litellm.set_verbose = True
        
        # pick one model to test:
        # model = "bedrock/amazon.nova-pro-v1:0"
        # model = "bedrock/anthropic.claude-3-haiku-20240307-v1:0"
        # model = "bedrock/amazon.nova-lite-v1:0"
        model = "groq/llama-3.2-11b-vision-preview"
        is_all_set = litellm.check_valid_key(model=model, api_key=None)
        print("Does %s have everything set? %s" % (model, is_all_set))
        
        EOF
        )
        
        echo $output
        })

    Once these checks pass, py-zerox uses litellm to convert each PDF page into Markdown; a rough sketch of that per-page call follows below.
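
As a rough, hypothetical sketch of that per-page call (the prompt string and model name below are placeholders, and py-zerox's real internals are more involved), assuming litellm's standard async completion API:

  import asyncio
  import litellm
  
  async def page_to_markdown(image_url: str, role: str = "user") -> str:
      # one completion call per rendered page image
      response = await litellm.acompletion(
          model="groq/llama-3.2-11b-vision-preview",
          messages=[
              # the instruction prompt, sent under the configurable role
              {"role": role, "content": "Convert this page to markdown."},
              # the rendered page image, sent as user content
              {"role": "user", "content": [
                  {"type": "image_url", "image_url": {"url": image_url}},
              ]},
          ],
      )
      return response.choices[0].message.content
  
  # example usage (placeholder URL):
  # markdown = asyncio.run(page_to_markdown("https://example.com/page-1.png"))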

Thanks!

@igorlima marked this pull request as ready for review December 30, 2024 18:05
@igorlima changed the title from "Introduce a customizable and flexible system role" to "Introduce a custom system role" Dec 31, 2024