Is your feature request related to a problem? Please describe.
Enable response caching for the two generator classes so that if an exception (e.g. RateLimitError) is raised partway through generation, the already-generated responses do not need to be regenerated.
Describe the solution you'd like
Ideally, this would involve a batch_size or similar parameter for the generate_responses methods. The prompts would be partitioned and generation would occur in batches (e.g. in a loop), so that if an exception is raised in batch k, responses from batches 1 through (k-1) are still available to the user. We are considering the following approach: cache the successfully generated responses from batches 1 through (k-1), and if a failure occurs, resume at batch k on the subsequent run of generate_responses. Ideally, the cache would live in a temporary location on the filesystem rather than in memory (e.g., as an instance attribute).
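A minimal sketch of what this could look like, assuming a hypothetical generate_batch callable that stands in for whatever the generator classes call per batch (the function name, cache format, and cache location are all illustrative, not part of any existing API):

```python
import json
import os
import tempfile


def generate_responses(prompts, generate_batch, batch_size=10, cache_path=None):
    """Generate responses in batches, persisting progress to a temp file
    so a failed run can resume at the first incomplete batch."""
    if cache_path is None:
        cache_path = os.path.join(tempfile.gettempdir(), "responses_cache.json")

    # Load any responses cached by a previous, partially failed run.
    responses = []
    if os.path.exists(cache_path):
        with open(cache_path) as f:
            responses = json.load(f)

    # Resume at the first prompt without a cached response.
    for i in range(len(responses), len(prompts), batch_size):
        batch = prompts[i:i + batch_size]
        responses.extend(generate_batch(batch))
        # Persist after every successful batch so an exception in
        # batch k leaves batches 1 through (k-1) recoverable.
        with open(cache_path, "w") as f:
            json.dump(responses, f)

    os.remove(cache_path)  # clean up once all batches succeed
    return responses
```

Writing the cache after every successful batch (rather than once at the end) is what makes the resume possible; deleting it on full success keeps a clean run from polluting the next one.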
Describe alternatives you've considered
Status quo
Additional context
It may be useful to add a time dimension to help avoid RateLimitError. Specifically, this could involve pausing before starting batch k if batch (k-1) completed in fewer than n seconds. This could be accomplished with a min_time_per_batch parameter.
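The pacing idea above could be sketched roughly as follows (min_time_per_batch is the proposed parameter; everything else here is a hypothetical stand-in):

```python
import time


def paced_batches(batches, generate_batch, min_time_per_batch=1.0):
    """Run batches in order, sleeping before the next batch whenever the
    previous one finished in fewer than min_time_per_batch seconds."""
    responses = []
    for k, batch in enumerate(batches):
        started = time.monotonic()
        responses.extend(generate_batch(batch))
        elapsed = time.monotonic() - started
        # No pause is needed after the final batch.
        if k < len(batches) - 1 and elapsed < min_time_per_batch:
            time.sleep(min_time_per_batch - elapsed)
    return responses
```

This enforces a floor on time-per-batch, which bounds the request rate and should reduce the chance of hitting a RateLimitError in the first place.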