Improve install size #32

Closed
transitive-bullshit opened this issue Apr 23, 2024 · 3 comments
Labels
enhancement New feature or request

Comments

@transitive-bullshit
Collaborator

Currently sitting at ~17MB with 14MB coming from tiktoken: https://pkg-size.dev/@dexaai%2Fdexter

See also dqbd/tiktoken#68

For comparison, here's langchain at ~36MB: https://pkg-size.dev/langchain but we should be a lot slimmer than this. Langchain's not even loading the full tiktoken WASM lib; they're using the 6.6MB js-tiktoken.

This issue may end up just being resolved by improving tiktoken's WASM bundle size upstream, but I wanted to track it while it's top of mind for gptlint.

transitive-bullshit added the enhancement label Apr 23, 2024
@rileytomasek
Contributor

bundle size is obv important, but do we care about bundle size? bundlephobia is showing 238k minified bundle but that seems like it has to be wrong with tiktoken.

also, have you looked in the js tokenizer libs lately? is there a better one yet?

@transitive-bullshit
Collaborator Author

transitive-bullshit commented Apr 23, 2024

agreed that it's not a priority; just bringing it up because gptlint came in at 25MB and 80% of that was dexter which was surprising to me.

> also, have you looked in the js tokenizer libs lately? is there a better one yet?

Not that I'm aware of; langchain is still using js-tiktoken and I haven't seen any others gain wide adoption.

@rileytomasek
Contributor

ok let's close this then, considering it's mostly tiktoken and we don't have a good alternative.
