
Decouple Python API from the io.datasette.llm user directory #754

Open
simonw opened this issue Feb 16, 2025 · 4 comments

Comments


simonw commented Feb 16, 2025

It's a little weird that using the Python API still involves code paths that depend on the `~/.../io.datasette.llm/` directory - it's used for registering models and for registering keys too.

It would be nicer if this was a mechanism that was used for the CLI tool but users of the Python library could avoid it entirely if they wanted to.

@simonw simonw added the design label Feb 16, 2025

simonw commented Feb 16, 2025

The main thing here is the way plugins register their models in files in that directory, like `llm-mlx.json` - and `extra-openai-models.yaml`.

There's also the way keys are looked up - this function is called by the `self.get_key()` method on models that use keys:

llm/llm/__init__.py

Lines 222 to 246 in 8611d92

```python
def get_key(
    explicit_key: Optional[str], key_alias: str, env_var: Optional[str] = None
) -> Optional[str]:
    """
    Return an API key based on a hierarchy of potential sources.

    :param explicit_key: A key provided by the user. This may be the key, or an alias of a key in keys.json.
    :param key_alias: The alias used to retrieve the key from the keys.json file.
    :param env_var: Name of the environment variable to check for the key.
    """
    stored_keys = load_keys()
    # If user specified an alias, use the key stored for that alias
    if explicit_key in stored_keys:
        return stored_keys[explicit_key]
    if explicit_key:
        # User specified a key that's not an alias, use that
        return explicit_key
    # Stored key over-rides environment variables, which over-ride the default key
    if key_alias in stored_keys:
        return stored_keys[key_alias]
    # Finally try environment variable
    if env_var and os.environ.get(env_var):
        return os.environ[env_var]
    # Couldn't find it
    return None
```

Which calls:

llm/llm/__init__.py

Lines 249 to 254 in 8611d92

```python
def load_keys():
    path = user_dir() / "keys.json"
    if path.exists():
        return json.loads(path.read_text())
    else:
        return {}
```
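That lookup hierarchy (explicit alias → explicit key → stored alias for the model → environment variable) could be expressed without any disk access by injecting the stored keys as a dict. A minimal sketch of what a decoupled version might look like - `resolve_key` is a hypothetical name, not part of llm's API:

```python
import os
from typing import Optional


def resolve_key(
    explicit_key: Optional[str],
    key_alias: str,
    stored_keys: dict,
    env_var: Optional[str] = None,
) -> Optional[str]:
    """Same hierarchy as llm's get_key(), but stored_keys is passed in
    by the caller instead of being loaded from keys.json on disk."""
    # If user specified an alias, use the key stored for that alias
    if explicit_key in stored_keys:
        return stored_keys[explicit_key]
    if explicit_key:
        # User specified a key that's not an alias, use that
        return explicit_key
    # Stored key for the model's alias comes next
    if key_alias in stored_keys:
        return stored_keys[key_alias]
    # Finally try the environment variable
    if env_var and os.environ.get(env_var):
        return os.environ[env_var]
    return None
```

CLI callers could still populate `stored_keys` from `keys.json`, while Python API users never touch the user directory.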


simonw commented Feb 16, 2025

One way to do this could be to have an `LLM()` class that Python users can instantiate:

```python
llm = LLM("/path/to/config-dir/")
```

The existing `llm.get_models()` methods etc. would continue to work, but would use a default `LLM()` instance that uses `llm.user_dir()` - this kind of thing:

```python
default_llm = LLM(llm.user_dir())

def get_models():
    return default_llm.get_models()
```


simonw commented Feb 16, 2025

It would be nicer if the config dir wasn't actually needed by most plugins (except the ones that genuinely need to store binary data on disk) - so any JSON configs etc. could optionally be provided as a Python dictionary instead of being looked up on disk.
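A hedged sketch of that idea: a loader that accepts configuration either as an in-memory dict (Python API users) or as a path to a JSON file (CLI / user-directory users). The name `load_model_config` is illustrative, not part of llm:

```python
import json
from pathlib import Path
from typing import Optional, Union


def load_model_config(
    config: Optional[Union[dict, str, Path]] = None,
) -> dict:
    """Accept config directly as a dict, or as a path to a JSON file;
    default to an empty config."""
    if config is None:
        return {}
    if isinstance(config, dict):
        return config  # no filesystem access needed
    return json.loads(Path(config).read_text())
```

Under this pattern, only plugins that actually need on-disk state would ever be handed a path; everything else could be configured entirely in memory.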

@simonw simonw added the pre-1.0 label Feb 16, 2025

simonw commented Feb 16, 2025

Started a pre-1.0 label for things like this that need to be sorted out before a 1.0 release.
