Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ArbitraryFileIdManager should have a configurable option to determine its "content root" #43

Open
kevin-bates opened this issue Oct 29, 2022 · 3 comments
Labels
enhancement New feature or request
Milestone

Comments

@kevin-bates
Copy link
Member

In looking at various ContentsManager implementations (e.g., S3, PostgreSQL) the root_dir is not used. Instead, a different (custom) attribute would be tantamount to determining the content root. For example, the referenced S3 ContentsManager has a bucket attribute indicating the bucket in which the content resides. Similarly, the referenced PostgreSQL ContentsManager uses db_url to identify the database, which would be sufficient as the content root. (One might argue that the user_id attribute may be a better choice, but that's an unrelated discussion.)

It seems like ArbitraryFileIdManager should introduce a configurable trait that either indicates the name of the attribute from which to pull the content root identifier from or a trait the user must set on both its ContentsManager and ArbitraryFileIdManager.

I'd lean toward the former.

Proposal

  • Introduce a trait content_root_attribute: str that refers to the attribute on the configured ContentsManager from which the content_root value (known as root_dir today, and that name can continue) is pulled. The default value for content_root_attribute would be root_dir.

The other option would be to simply expose root_dir on ArbitraryFileIdManager which is configured to be the same value as the appropriate attribute on ContentsManager (e.g., S3ContentsManager.bucket).

The advantage of the first option is that operators/admins would only need to configure the ArbitraryFileIdManager once for the duration that that particular ContentsManager is in use. While the second option would require a different configuration for each user. As a result, the first option is helpful for IT staffs that need to push configurations (like in Berkeley's Data 8 class or in PaaS envs).

@kevin-bates kevin-bates added the bug Something isn't working label Oct 29, 2022
@welcome
Copy link

welcome bot commented Oct 29, 2022

Thank you for opening your first issue in this project! Engagement like this is essential for open source projects! 🤗

If you haven't done so already, check out Jupyter's Code of Conduct. Also, please try to follow the issue template as it helps other other community members to contribute more effectively.
welcome
You can meet the other Jovyans by joining our Discourse forum. There is also an intro thread there where you can stop by and say Hi! 👋

Welcome to the Jupyter community! 🎉

@dlqqq dlqqq added enhancement New feature or request and removed bug Something isn't working labels Oct 31, 2022
@dlqqq
Copy link
Collaborator

dlqqq commented Oct 31, 2022

The advantage of the first option is that operators/admins would only need to configure the ArbitraryFileIdManager once for the duration that that particular ContentsManager is in use. While the second option would require a different configuration for each user. As a result, the first option is helpful for IT staffs that need to push configurations (like in Berkeley's Data 8 class or in PaaS envs).

Ah, I think I know what you mean. So you're saying that with this setup, the user wouldn't need to pass the same contents root twice to two separate configuration keys? For example,

jupyter lab --S3ContentsManager.bucket="s3://some_very_long_bucket_name" --ArbitraryFileIdManager.content_root_attribute="bucket"

v.s.

jupyter lab --S3ContentsManager.bucket="s3://some_very_long_bucket_name" --ArbitraryFileIdManager.root_dir="s3://some_very_long_bucket_name"

Hm, I'm inclined to agree. Changing the contents root should be as smooth as possible. It requires us to assume the contents root will always be available as an attribute, which I think is OK.

@kevin-bates
Copy link
Member Author

Correct - although my use case would use a file to contain the config that operators push out. So in your example above, operators would push a config containing:

c.ArbitraryFileIdManager.content_root_attribute="bucket"

Then, users alice and bob run (respectively)...

jupyter lab --S3ContentsManager.bucket="s3://alice"
jupyter lab --S3ContentsManager.bucket="s3://bob"

vs. users essentially having to list/update both values:

jupyter lab --S3ContentsManager.bucket="s3://alice" --S3ContentsManager.content_root="s3://alice"
jupyter lab --S3ContentsManager.bucket="s3://bob" --S3ContentsManager.content_root="s3://bob"

(I guess I'm also advocating that BaseFileIdManager have its root_dir renamed to content_root so that it's more natural across ContentsManager implementations.)

@dlqqq dlqqq added this to the 0.7.0 milestone Nov 7, 2022
@dlqqq dlqqq modified the milestones: 0.7.0, 0.8.0, 0.9.0 Feb 16, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

2 participants