perf(embedding): always request embedding creation as base64 #1312
base: master
Conversation
Requesting base64-encoded embeddings returns smaller response bodies, on average ~60% smaller than float32-encoded ones. In other words, a response body containing float32 embeddings is ~2.3x bigger than one containing base64-encoded embeddings. We now always request embedding creation encoded as base64, and then decode the embeddings to float32 based on the user's provided encoding_format parameter. Closes openai#1310
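A minimal sketch of the decoding step described above (the helper name is mine, not the SDK's actual code): the base64 payload is simply the raw little-endian float32 bytes of the embedding vector, so converting it back to the number[] callers expect is a matter of reinterpreting those bytes.

function decodeBase64Embedding(base64: string): number[] {
  // Copy into a fresh Uint8Array so the backing ArrayBuffer starts at offset 0.
  const bytes = Uint8Array.from(Buffer.from(base64, 'base64'));
  // Reinterpret the raw bytes as float32 values.
  return Array.from(new Float32Array(bytes.buffer));
}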
7702d54 to 270861b
Thanks!
// Force base64 encoding for vector embeddings creation
// See https://github.com/openai/openai-node/issues/1310
encoding_format: 'base64',
I don't think we want to always use base64; if the user explicitly asked for a different format we should have the exact same behaviour as we do prior to this PR, which is to just let them.
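A hedged sketch of that suggestion (names and parameter shape are assumptions, not the final implementation): only take over encoding_format when the caller left it unset, and remember whether the response needs to be decoded back to floats.

// Hypothetical, minimal parameter shape for illustration only.
type CreateParams = { model: string; input: string | string[]; encoding_format?: 'float' | 'base64' };

function withDefaultBase64(params: CreateParams): { sendParams: CreateParams; shouldDecode: boolean } {
  // If the caller explicitly picked a format, pass it through untouched (pre-PR behaviour).
  if (params.encoding_format) {
    return { sendParams: params, shouldDecode: false };
  }
  // Otherwise request base64 for the smaller payload and decode back to float32 afterwards.
  return { sendParams: { ...params, encoding_format: 'base64' }, shouldDecode: true };
}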
console.log(embeddingBase64Obj);
const embeddingBase64Str = embeddingBase64Obj.embedding as unknown as string;
embeddingBase64Obj.embedding = Array.from(
  new Float32Array(Buffer.from(embeddingBase64Str, 'base64').buffer),
Buffer is a Node.js-specific API; we need to use something that is available everywhere. It would be great to add a generic helper function in core.ts.
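One way such a helper could look (the name and its placement in core.ts are assumptions): decode the base64 string with atob, which is available in browsers, Deno, workers, and modern Node.js, instead of relying on Buffer.

function base64ToFloat32Array(base64: string): Float32Array {
  const binary = atob(base64); // environment-agnostic, unlike Buffer.from(str, 'base64')
  const bytes = new Uint8Array(binary.length);
  for (let i = 0; i < binary.length; i++) {
    bytes[i] = binary.charCodeAt(i);
  }
  // Reinterpret the raw little-endian bytes as float32 values.
  return new Float32Array(bytes.buffer);
}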
return base64Response._thenUnwrap((response) => {
  if (response && response.data) {
    response.data.forEach((embeddingBase64Obj) => {
      console.log(embeddingBase64Obj);
nit: stray debug logging
console.log(embeddingBase64Obj);
Requesting base64-encoded embeddings returns smaller response bodies, on average ~60% smaller than float32-encoded ones. In other words, a response body containing float32 embeddings is ~2.3x bigger than one containing base64-encoded embeddings.
Closes #1310
Changes being requested
We always request embedding creation encoded as base64, and then decode the embeddings to float32 based on the user's provided encoding_format parameter.
Additional context & links
After running a few benchmarks, we found that requesting base64-encoded embeddings returns smaller response bodies, on average ~60% smaller than float32-encoded ones. In other words, a response body containing float32 embeddings is ~2.3x bigger than one containing base64-encoded embeddings.
This performance improvement could translate to:
This is the result of a request that creates embeddings from a 10 KB chunk, run 10 times (the numbers are the size of the response body in KB):
Read more #1310
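For context, a measurement along those lines could be reproduced with a small script like the one below (the model name and exact setup are assumptions, not necessarily what produced the numbers above). It compares the raw response body size of the same request with float and base64 encoding.

async function bodySizeKb(encoding_format: 'float' | 'base64', input: string): Promise<number> {
  const res = await fetch('https://api.openai.com/v1/embeddings', {
    method: 'POST',
    headers: {
      Authorization: `Bearer ${process.env.OPENAI_API_KEY}`,
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({ model: 'text-embedding-3-small', input, encoding_format }),
  });
  return (await res.text()).length / 1024;
}

const chunk = 'roughly 10 KB of text ...';
console.log('float  :', await bodySizeKb('float', chunk), 'KB');
console.log('base64 :', await bodySizeKb('base64', chunk), 'KB');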