accurate pause #88
Conversation
richardr1126
commented
Apr 16, 2026
- feat: add support for Replicate TTS provider and models
- feat(replicate): add support for custom Replicate model selection
- feat(TTS): implement pause functionality with state preservation
- refactor(tts): update Replicate model defaults and voice resolution
- feat(TTS): implement pause functionality with state preservation
- refactor(tts): centralize upstream response helpers and improve settings validation
- feat(TTS): implement pause functionality with state preservation
- Updated environment variables documentation to include Replicate as a TTS provider option.
- Added Replicate to the sidebar for TTS provider guides.
- Included Replicate as a dependency in package.json and pnpm-lock.yaml.
- Enhanced audiobook chapter generation to normalize native speed settings based on the TTS provider.
- Improved error handling in TTS API routes to provide retry information for rate-limited responses.
- Updated AudiobookExportModal to reflect native speed support for Replicate models.
- Modified SettingsModal to set default model for Replicate.
- Enhanced SpeedControl component to conditionally render native speed controls based on provider support.
- Updated TTSContext to utilize effective native speed for TTS requests.
- Implemented Replicate request handling in the TTS generation logic.
- Added new documentation for configuring Replicate as a TTS provider.
Align default Replicate model to the versioned Kokoro model across UI, docs, and server logic. Refactor TTS settings merging for audiobooks to ensure consistent normalization. Improve Retry-After header handling for upstream rate limits. Expand Replicate voice resolution to use model schemas when available, with test coverage for custom and built-in models. Update documentation to reflect new Replicate defaults, model selection, and configuration guidance.
…Reader-WebUI into accurate-pause
…ngs validation Move getUpstreamStatus and getUpstreamRetryAfterSeconds to a shared utility module for consistent upstream error handling across TTS endpoints. Strengthen audiobook chapter API by introducing runtime validation for incoming settings payloads, ensuring type safety and error reporting for malformed requests. Replace in-memory Map caches with LRUMap for Replicate voice and schema lookups, improving memory management and eviction logic.
…Reader-WebUI into accurate-pause
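The commit notes above mention improved Retry-After handling for upstream rate limits. As a rough illustration only, a parser for that header might look like the sketch below. The name `parseRetryAfterSeconds` is hypothetical and is not the PR's `getUpstreamRetryAfterSeconds`; the sketch assumes the header carries either a whole number of seconds or an HTTP date.

```typescript
// Hypothetical helper, not taken from the PR: converts a Retry-After header
// value into a number of seconds, or null when it cannot be interpreted.
function parseRetryAfterSeconds(
  headerValue: string | null,
  now: number = Date.now(),
): number | null {
  if (headerValue === null) return null;
  const trimmed = headerValue.trim();

  // Form 1: delay in whole seconds, e.g. "120".
  if (/^\d+$/.test(trimmed)) return Number(trimmed);

  // Form 2: an HTTP date, e.g. "Wed, 21 Oct 2015 07:28:00 GMT".
  const dateMs = Date.parse(trimmed);
  if (Number.isNaN(dateMs)) return null;

  // Dates in the past mean "retry now", so clamp at zero.
  return Math.max(0, Math.ceil((dateMs - now) / 1000));
}
```

Passing `now` explicitly keeps the function deterministic in tests; callers would normally rely on the default.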
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 304d3d51e0
// Output is a URI string pointing to the generated audio file
const audioUrl = typeof output === 'string' ? output : String(output);
const audioResponse = await fetch(audioUrl, { signal });
Handle non-string outputs from replicate.run
This assumes replicate.run() always returns a URL string, but Replicate models that output files can return FileOutput objects or arrays; in those cases String(output) becomes something like "[object Object]" or a comma-joined list, and fetch(audioUrl) fails even though the upstream generation succeeded. This will break audio generation for any Replicate model whose output is not a plain string URL, so the code should explicitly handle FileOutput/array outputs (or configure the client to return raw URLs).
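One way to act on this suggestion is to normalize the output before fetching it. The sketch below is an assumption-laden illustration, not code from the PR: `resolveAudioUrl` and `UrlLike` are hypothetical names, and the file-like case is detected structurally (any object exposing a `url()` method) rather than by importing a type from the Replicate client. For array outputs it takes the first element, which may not suit every model.

```typescript
// Hypothetical helper, not part of the PR. Accepts the shapes named in the
// review comment: a plain URL string, a file-like object exposing url(),
// or an array of either.
type UrlLike = { url: () => URL | string };

function hasUrlMethod(value: unknown): value is UrlLike {
  return (
    typeof value === 'object' &&
    value !== null &&
    typeof (value as { url?: unknown }).url === 'function'
  );
}

function resolveAudioUrl(output: unknown): string {
  // Assumption: when a model returns several files, the first is the audio.
  const first: unknown = Array.isArray(output) ? output[0] : output;

  if (typeof first === 'string' && first.length > 0) return first;
  if (first instanceof URL) return first.toString();
  if (hasUrlMethod(first)) return String(first.url());

  // Fail loudly instead of fetching "[object Object]".
  throw new Error(
    `Unsupported Replicate output shape: ${Object.prototype.toString.call(first)}`,
  );
}
```

With a helper like this, the `fetch` call would receive `resolveAudioUrl(output)` instead of `String(output)`, and an unexpected shape would surface as a clear error rather than a failed request.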
async function runWithReplicateGate<T>(signal: AbortSignal, operation: () => Promise<T>): Promise<T> {
  const waitMs = Math.max(0, replicateBlockedUntilMs - Date.now());
  if (waitMs > 0) {
Scope Replicate cooldown by credential/model
The cooldown gate is process-global, so a single 429 from one request sets replicateBlockedUntilMs for every subsequent Replicate request in this worker, including different users and API keys. In a multi-user deployment where clients can send their own keys, one throttled key can unnecessarily stall unrelated traffic for up to the full Retry-After window; cooldown state should be partitioned (at least by API key, ideally by key+model).
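A minimal sketch of the partitioning described here, assuming the gate can be given the API key and model for each request. All names below (`blockedUntilByScope`, `cooldownScope`, `noteRateLimit`, `remainingCooldownMs`) are hypothetical and stand in for the single `replicateBlockedUntilMs` value; a real implementation might hash the key rather than embed it in a map key, and would want eviction for long-running workers.

```typescript
// Hypothetical per-scope cooldown state, replacing one process-global timestamp.
const blockedUntilByScope = new Map<string, number>();

// Scope by key + model so one throttled credential cannot stall another.
function cooldownScope(apiKey: string, model: string): string {
  return `${apiKey}\u0000${model}`;
}

// Record a 429: block this scope until now + Retry-After, never shortening
// an existing, longer block.
function noteRateLimit(
  apiKey: string,
  model: string,
  retryAfterSeconds: number,
  now: number = Date.now(),
): void {
  const scope = cooldownScope(apiKey, model);
  const until = now + Math.max(0, retryAfterSeconds) * 1000;
  blockedUntilByScope.set(scope, Math.max(until, blockedUntilByScope.get(scope) ?? 0));
}

// How long a request in this scope should wait before running.
function remainingCooldownMs(
  apiKey: string,
  model: string,
  now: number = Date.now(),
): number {
  const scope = cooldownScope(apiKey, model);
  const until = blockedUntilByScope.get(scope);
  if (until === undefined) return 0;
  if (until <= now) {
    blockedUntilByScope.delete(scope); // expired entries are dropped lazily
    return 0;
  }
  return until - now;
}
```

In this shape, `runWithReplicateGate` would compute `waitMs` from `remainingCooldownMs(apiKey, model)` instead of the shared timestamp, so a 429 on one key leaves other keys unaffected.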
async function getReplicateOpenApiSchemaCached(apiKey: string, model: string): Promise<unknown | null> {
  const cachedPromise = replicateOpenApiSchemaPromiseCache.get(model);
  if (cachedPromise) {
Key Replicate schema caches with API key context
Both schema and voice-input-key caches are indexed only by model, so schema data fetched with one token is reused for later requests with different tokens. For private/custom Replicate models, this can leak metadata (voices/input fields) across users and also produce incorrect voice-key resolution when different credentials see different schema versions; cache keys should include credential context (or avoid cross-request sharing for authenticated model metadata).
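The change suggested here could be sketched as follows. This is illustrative only: `schemaPromiseCache`, `schemaCacheKey`, and `getSchemaCached` are hypothetical names, a plain `Map` stands in for the PR's LRU cache, and the fetcher is injected so the example stays self-contained. A production version would likely key on a hash of the credential rather than the raw key.

```typescript
// Hypothetical cache keyed by credential + model instead of model alone.
const schemaPromiseCache = new Map<string, Promise<unknown | null>>();

function schemaCacheKey(apiKey: string, model: string): string {
  return `${apiKey}\u0000${model}`;
}

function getSchemaCached(
  apiKey: string,
  model: string,
  fetchSchema: (apiKey: string, model: string) => Promise<unknown | null>,
): Promise<unknown | null> {
  const key = schemaCacheKey(apiKey, model);
  const cached = schemaPromiseCache.get(key);
  if (cached) return cached;

  // Cache the in-flight promise so concurrent callers with the same
  // credential share one request; evict on failure so errors are not sticky.
  const pending = fetchSchema(apiKey, model).catch((error: unknown) => {
    schemaPromiseCache.delete(key);
    throw error;
  });
  schemaPromiseCache.set(key, pending);
  return pending;
}
```

Two requests for the same model with different keys now trigger separate fetches, which avoids reusing metadata that only one credential is entitled to see. The same key construction would apply to the voice-input-key cache.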