Is your feature request related to a problem? Please describe.
Enable response caching for the two generator classes so that if an exception (e.g. RateLimitError) is raised partway through generation, the already-generated responses do not need to be regenerated.
Describe the solution you'd like
Ideally, this would involve a batch_size (or similar) parameter for the generate_responses methods. The prompts would be partitioned and generation would occur in batches (e.g. in a loop). If an exception is raised in batch k, the responses from batches 1 through (k-1) would still be available to the user. The approach we have in mind: cache the successfully generated responses from batches 1 through (k-1), and on a subsequent run of generate_responses, resume at batch k. Ideally, the cache would live somewhere temporary on the filesystem rather than in memory (e.g. as an instance attribute).
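A minimal sketch of the resume-from-cache idea, assuming a hypothetical generate_batch callable standing in for the real model call (the function name, cache file, and signature here are illustrative, not the actual library API):

```python
import json
import tempfile
from pathlib import Path

def generate_responses(prompts, batch_size=8, cache_path=None, generate_batch=None):
    """Generate responses in batches, caching partial results on disk.

    If an exception is raised in batch k, responses from batches 1..(k-1)
    remain in the cache file, and a subsequent call resumes at batch k.
    """
    cache_path = Path(cache_path or Path(tempfile.gettempdir()) / "responses_cache.json")
    # Resume from any previously cached responses.
    responses = json.loads(cache_path.read_text()) if cache_path.exists() else []
    for start in range(len(responses), len(prompts), batch_size):
        batch = prompts[start:start + batch_size]
        # If this raises (e.g. RateLimitError), earlier batches stay cached.
        responses.extend(generate_batch(batch))
        # Persist after each successful batch.
        cache_path.write_text(json.dumps(responses))
    cache_path.unlink(missing_ok=True)  # clean up once everything succeeded
    return responses
```

Keeping the cache on the filesystem (rather than on the instance) means the partial results survive even if the process itself dies mid-run.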
Describe alternatives you've considered
Status quo
Additional context
It may be useful to add a time dimension to help avoid RateLimitError. Specifically, this could involve pausing before starting batch k if batch (k-1) completed in fewer than n seconds. This could be accomplished with a min_time_per_batch parameter.
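The throttling idea could be sketched as follows; process_batch and run_batches are hypothetical placeholders for the real generation call and loop:

```python
import time

def run_batches(batches, process_batch, min_time_per_batch=1.0):
    """Process batches, pausing so each batch takes at least min_time_per_batch seconds.

    If batch (k-1) completes in fewer than min_time_per_batch seconds, sleep
    for the remainder before starting batch k, reducing the chance of a
    RateLimitError from rapid-fire requests.
    """
    results = []
    for batch in batches:
        start = time.monotonic()
        results.append(process_batch(batch))
        elapsed = time.monotonic() - start
        if elapsed < min_time_per_batch:
            time.sleep(min_time_per_batch - elapsed)
    return results
```

Using time.monotonic() rather than time.time() avoids miscounting if the system clock is adjusted mid-run.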