perf(ccusage): add timestamp cache for faster file processing #766
Add a persistent timestamp cache to dramatically improve performance when loading usage data from JSONL files. Previously, every run needed to read all files to extract timestamps for sorting. Now timestamps are cached and only updated when files change.

Key optimizations:
- Cache file timestamps to ~/.config/claude/.ccusage/timestamp-cache.json
- Use file mtime to detect when cache entries are stale
- Early filter files by date range before sorting (using --since/--until)
- Only read first 4KB of files to extract timestamps
- Batch process files with controlled concurrency (50 at a time)

Performance improvement on 8600+ files:
- Full data query: 28.2s → 8.4s (3.4x faster)
- With --since filter: 11.4s → 8.1s (1.4x faster)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
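To make the mtime check and the "first 4KB" read from the commit message concrete, here is a minimal sketch. It is not the PR's code: the `CacheEntry` shape, the function names, and the assumption that each JSONL line carries a string `timestamp` field are all hypothetical. It also uses plain try/catch to stay dependency-free, whereas the repository guidelines prefer `Result.try()`.

```typescript
// Hypothetical entry shape; the PR's actual cache layout is not shown in this thread.
type CacheEntry = {
	mtimeMs: number; // file mtime recorded when the entry was written
	firstTimestamp: string | null; // earliest timestamp found in the file head
};

type TimestampCache = Map<string, CacheEntry>;

// Returns the cached entry only when the recorded mtime matches the current one.
function lookupCached(
	cache: TimestampCache,
	filePath: string,
	currentMtimeMs: number,
): CacheEntry | undefined {
	const entry = cache.get(filePath);
	if (entry == null || entry.mtimeMs !== currentMtimeMs) {
		return undefined; // missing or stale: the caller must re-read the file
	}
	return entry;
}

// Extracts the first string "timestamp" field from the leading chunk of a JSONL file.
// The chunk may end mid-line (for example at a 4KB boundary), so bad lines are skipped.
function extractFirstTimestamp(head: string): string | null {
	for (const line of head.split("\n")) {
		if (line.trim() === "") continue;
		try {
			const parsed: unknown = JSON.parse(line);
			if (typeof parsed === "object" && parsed !== null && "timestamp" in parsed) {
				const ts = (parsed as { timestamp: unknown }).timestamp;
				if (typeof ts === "string") return ts;
			}
		} catch {
			// Truncated or malformed line: ignore and try the next one.
		}
	}
	return null;
}
```

A caller would `stat` the file, call `lookupCached`, and only on a miss read the head of the file and pass it to `extractFirstTimestamp` before storing a fresh entry.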
📝 Walkthrough
A new timestamp-cache module provides a persistent in-memory and disk-backed cache for per-file timestamp metadata; data-loader functions now use this cache for early date-range filtering and cached timestamp-based sorting to reduce file processing.
Sequence Diagram(s)
sequenceDiagram
participant DataLoader as Data Loader
participant TimestampCache as Timestamp Cache
participant FileSystem as File System
DataLoader->>TimestampCache: filterFilesByDateRange(files, since, until)
activate TimestampCache
TimestampCache->>TimestampCache: Load/initialize cache (disk)
loop batched files (concurrency limit)
TimestampCache->>FileSystem: stat(file) for mtime
alt cache entry valid (mtime match)
TimestampCache->>TimestampCache: use cached timestamps
else
TimestampCache->>FileSystem: read head/tail of JSONL
TimestampCache->>TimestampCache: extract first/last timestamps
TimestampCache->>TimestampCache: update cache entry
end
end
TimestampCache->>FileSystem: debounced save cache (async)
TimestampCache-->>DataLoader: return filtered file list
deactivate TimestampCache
DataLoader->>TimestampCache: sortFilesByTimestampCached(files)
activate TimestampCache
TimestampCache->>TimestampCache: retrieve timestamps and sort
TimestampCache-->>DataLoader: return sorted files
deactivate TimestampCache
DataLoader->>DataLoader: proceed with loading/processing
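The "batched files (concurrency limit)" loop in the diagram, together with the "50 at a time" figure from the PR description, can be sketched as follows. The helper names `toBatches` and `mapInBatches` are hypothetical, and this is only one way to bound concurrency; the PR may implement it differently.

```typescript
// Splits items into consecutive batches of at most `size` elements.
function toBatches<T>(items: readonly T[], size: number): T[][] {
	if (!Number.isInteger(size) || size <= 0) {
		throw new RangeError("size must be a positive integer");
	}
	const batches: T[][] = [];
	for (let i = 0; i < items.length; i += size) {
		batches.push(items.slice(i, i + size));
	}
	return batches;
}

// Runs `worker` over all items with at most `size` calls in flight at once.
// Results are returned in the same order as the input.
async function mapInBatches<T, R>(
	items: readonly T[],
	size: number,
	worker: (item: T) => Promise<R>,
): Promise<R[]> {
	const results: R[] = [];
	for (const batch of toBatches(items, size)) {
		// Each batch completes before the next starts, which bounds open file handles.
		const batchResults = await Promise.all(batch.map(async (item) => worker(item)));
		results.push(...batchResults);
	}
	return results;
}
```

With this shape, a call such as `mapInBatches(files, 50, readTimestamps)` (where `readTimestamps` is a placeholder for the per-file work) never has more than 50 reads pending, at the cost of waiting for the slowest file in each batch.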
Estimated code review effort
🎯 4 (Complex) | ⏱️ ~45 minutes
🚥 Pre-merge checks: ✅ 3 passed
Actionable comments posted: 2
🧹 Nitpick comments (2)
apps/ccusage/src/_timestamp-cache.ts (2)
42-43: Cache location doesn't respect the `CLAUDE_CONFIG_DIR` environment variable.
The cache path is hardcoded to `~/.config/claude/.ccusage/` using `DEFAULT_CLAUDE_CONFIG_PATH`. If users set `CLAUDE_CONFIG_DIR` to a custom location, the cache will still be stored in the default location rather than alongside their data.
Consider deriving the cache location dynamically based on the first valid Claude path, or document this as a known limitation.
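One possible shape for the suggested fix is sketched below. `resolveCachePath` and `CACHE_SUBPATH` are hypothetical names, and the sketch treats `CLAUDE_CONFIG_DIR` as a single directory; how the project actually resolves "the first valid Claude path" is not shown in this thread.

```typescript
import * as os from "node:os";
import * as path from "node:path";

// Hypothetical constant: cache location relative to the Claude config directory.
const CACHE_SUBPATH = path.join(".ccusage", "timestamp-cache.json");

// Resolves the cache file next to the user's data when CLAUDE_CONFIG_DIR is set,
// falling back to the default location named in the review comment.
function resolveCachePath(
	env: Readonly<Record<string, string | undefined>>,
	homeDir: string = os.homedir(),
): string {
	const customDir = env["CLAUDE_CONFIG_DIR"]?.trim();
	const configDir =
		customDir != null && customDir !== ""
			? customDir
			: path.join(homeDir, ".config", "claude");
	return path.join(configDir, CACHE_SUBPATH);
}
```

Passing the environment and home directory as parameters (for example `resolveCachePath(process.env)`) keeps the function easy to test without mutating global state.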
245-247: Consider the implications of the `Date.now()` fallback for mtime.
If `stat` fails, using `Date.now()` as mtime means this entry will never be cache-hit on subsequent calls (since `Date.now()` will differ each time). This is likely acceptable as stat failures are rare, but worth noting that such files won't benefit from caching.
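The effect described in this comment can be reproduced with a small simulation. The names here are hypothetical and the clock is injected so the behaviour is deterministic; it is an illustration of the reasoning, not the PR's code.

```typescript
type StoredEntry = { mtimeMs: number };

// An entry is fresh only when its recorded mtime equals the current one.
function isFresh(entry: StoredEntry | undefined, currentMtimeMs: number): boolean {
	return entry != null && entry.mtimeMs === currentMtimeMs;
}

// Simulates two consecutive runs in which stat fails and the clock is used instead.
function simulateClockFallback(clock: () => number): boolean {
	const stored: StoredEntry = { mtimeMs: clock() }; // run 1: "now" stored as mtime
	return isFresh(stored, clock()); // run 2: "now" has advanced, so this is a miss
}

// Simulates two runs in which stat succeeds and the file is unchanged.
function simulateStableMtime(mtimeMs: number): boolean {
	const stored: StoredEntry = { mtimeMs };
	return isFresh(stored, mtimeMs);
}
```

Because any advancing clock yields a different value on the second run, the fallback entry can never validate, which matches the reviewer's observation that such files simply do not benefit from caching.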
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
apps/ccusage/src/_timestamp-cache.ts
apps/ccusage/src/data-loader.ts
🧰 Additional context used
📓 Path-based instructions (7)
apps/ccusage/src/**/*.ts
📄 CodeRabbit inference engine (apps/ccusage/CLAUDE.md)
apps/ccusage/src/**/*.ts: Write tests in-source using `if (import.meta.vitest != null)` blocks instead of separate test files
Use Vitest globals (`describe`, `it`, `expect`) without imports in test blocks
In tests, use current Claude 4 models (sonnet-4, opus-4)
Use `fs-fixture` with `createFixture()` to simulate Claude data in tests
Only export symbols that are actually used by other modules
Do not use console.log; use the logger utilities from `src/logger.ts` instead
Files:
apps/ccusage/src/_timestamp-cache.ts
apps/ccusage/src/data-loader.ts
apps/ccusage/**/*.ts
📄 CodeRabbit inference engine (apps/ccusage/CLAUDE.md)
apps/ccusage/**/*.ts: NEVER use `await import()` dynamic imports anywhere (especially in tests)
Prefer the `@praha/byethrow` Result type for error handling instead of try-catch
Use `.ts` extensions for local imports (e.g., `import { foo } from './utils.ts'`)
Files:
apps/ccusage/src/_timestamp-cache.ts
apps/ccusage/src/data-loader.ts
**/*.{ts,tsx,js,jsx}
📄 CodeRabbit inference engine (CLAUDE.md)
**/*.{ts,tsx,js,jsx}: Use ESLint for linting and formatting with tab indentation and double quotes
No console.log allowed except where explicitly disabled with eslint-disable; use logger.ts instead
Use file paths with Node.js path utilities for cross-platform compatibility
Use variables starting with lowercase (camelCase) for variable names
Can use UPPER_SNAKE_CASE for constants
Files:
apps/ccusage/src/_timestamp-cache.ts
apps/ccusage/src/data-loader.ts
**/*.ts{,x}
📄 CodeRabbit inference engine (CLAUDE.md)
Use TypeScript with strict mode and bundler module resolution
Files:
apps/ccusage/src/_timestamp-cache.ts
apps/ccusage/src/data-loader.ts
**/*.{ts,tsx}
📄 CodeRabbit inference engine (CLAUDE.md)
**/*.{ts,tsx}: Use `.ts` extensions for local file imports (e.g., `import { foo } from './utils.ts'`)
Prefer @praha/byethrow Result type over traditional try-catch for functional error handling
Use `Result.try()` for wrapping operations that may throw (JSON parsing, etc.)
Use `Result.isFailure()` for checking errors (more readable than `!Result.isSuccess()`)
Use early return pattern (`if (Result.isFailure(result)) continue;`) instead of ternary operators when checking Results
Keep traditional try-catch only for file I/O with complex error handling or legacy code that's hard to refactor
Always use `Result.isFailure()` and `Result.isSuccess()` type guards for better code clarity
Use uppercase (PascalCase) for type names
Only export constants, functions, and types that are actually used by other modules - internal constants used only within the same file should NOT be exported
In-source testing pattern: write tests directly in source files using `if (import.meta.vitest != null)` blocks
CRITICAL: DO NOT use `await import()` dynamic imports anywhere in the codebase - this causes tree-shaking issues
CRITICAL: Never use dynamic imports with `await import()` in vitest test blocks - this is particularly problematic for test execution
Vitest globals (`describe`, `it`, `expect`) are enabled and available without imports since globals are configured
Create mock data using `fs-fixture` with `createFixture()` for Claude data directory simulation in tests
All test files must use current Claude 4 models (claude-sonnet-4-20250514, claude-opus-4-20250514), not outdated Claude 3 models
Model names in tests must exactly match LiteLLM's pricing database entries
Files:
apps/ccusage/src/_timestamp-cache.ts
apps/ccusage/src/data-loader.ts
**/*.{ts,tsx,json}
📄 CodeRabbit inference engine (CLAUDE.md)
Claude model naming convention: `claude-{model-type}-{generation}-{date}` (e.g., `claude-sonnet-4-20250514`, NOT `claude-4-sonnet-20250514`)
Files:
apps/ccusage/src/_timestamp-cache.ts
apps/ccusage/src/data-loader.ts
**/data-loader.ts
📄 CodeRabbit inference engine (CLAUDE.md)
Silently skip malformed JSONL lines during parsing in data loading operations
Files:
apps/ccusage/src/data-loader.ts
🧠 Learnings (10)
📓 Common learnings
Learnt from: CR
Repo: ryoppippi/ccusage PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-11-25T14:42:34.734Z
Learning: Applies to **/data-loader.ts : Silently skip malformed JSONL lines during parsing in data loading operations
📚 Learning: 2025-09-18T16:06:37.474Z
Learnt from: CR
Repo: ryoppippi/ccusage PR: 0
File: apps/ccusage/CLAUDE.md:0-0
Timestamp: 2025-09-18T16:06:37.474Z
Learning: Applies to apps/ccusage/src/**/*.ts : Use `fs-fixture` with `createFixture()` to simulate Claude data in tests
Applied to files:
apps/ccusage/src/_timestamp-cache.ts
apps/ccusage/src/data-loader.ts
📚 Learning: 2025-11-25T14:42:34.734Z
Learnt from: CR
Repo: ryoppippi/ccusage PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-11-25T14:42:34.734Z
Learning: Applies to **/*.{ts,tsx} : Create mock data using `fs-fixture` with `createFixture()` for Claude data directory simulation in tests
Applied to files:
apps/ccusage/src/_timestamp-cache.ts
📚 Learning: 2025-09-18T16:06:37.474Z
Learnt from: CR
Repo: ryoppippi/ccusage PR: 0
File: apps/ccusage/CLAUDE.md:0-0
Timestamp: 2025-09-18T16:06:37.474Z
Learning: Applies to apps/ccusage/src/**/*.ts : Write tests in-source using `if (import.meta.vitest != null)` blocks instead of separate test files
Applied to files:
apps/ccusage/src/_timestamp-cache.ts
📚 Learning: 2025-09-18T16:07:16.293Z
Learnt from: CR
Repo: ryoppippi/ccusage PR: 0
File: apps/codex/CLAUDE.md:0-0
Timestamp: 2025-09-18T16:07:16.293Z
Learning: Tests should use fs-fixture with using to ensure cleanup
Applied to files:
apps/ccusage/src/_timestamp-cache.ts
📚 Learning: 2025-11-25T14:42:34.734Z
Learnt from: CR
Repo: ryoppippi/ccusage PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-11-25T14:42:34.734Z
Learning: Applies to **/*.{ts,tsx} : In-source testing pattern: write tests directly in source files using `if (import.meta.vitest != null)` blocks
Applied to files:
apps/ccusage/src/_timestamp-cache.ts
📚 Learning: 2025-09-17T18:29:15.764Z
Learnt from: CR
Repo: ryoppippi/ccusage PR: 0
File: apps/mcp/CLAUDE.md:0-0
Timestamp: 2025-09-17T18:29:15.764Z
Learning: Applies to apps/mcp/**/*.{test,spec}.ts : Use `fs-fixture` for mock data in tests of MCP server functionality
Applied to files:
apps/ccusage/src/_timestamp-cache.ts
📚 Learning: 2025-09-18T16:06:37.474Z
Learnt from: CR
Repo: ryoppippi/ccusage PR: 0
File: apps/ccusage/CLAUDE.md:0-0
Timestamp: 2025-09-18T16:06:37.474Z
Learning: Applies to apps/ccusage/src/**/*.ts : Use Vitest globals (`describe`, `it`, `expect`) without imports in test blocks
Applied to files:
apps/ccusage/src/_timestamp-cache.ts
📚 Learning: 2025-09-18T16:06:37.474Z
Learnt from: CR
Repo: ryoppippi/ccusage PR: 0
File: apps/ccusage/CLAUDE.md:0-0
Timestamp: 2025-09-18T16:06:37.474Z
Learning: Applies to apps/ccusage/**/*.ts : Use `.ts` extensions for local imports (e.g., `import { foo } from './utils.ts'`)
Applied to files:
apps/ccusage/src/data-loader.ts
📚 Learning: 2025-09-18T16:06:37.474Z
Learnt from: CR
Repo: ryoppippi/ccusage PR: 0
File: apps/ccusage/CLAUDE.md:0-0
Timestamp: 2025-09-18T16:06:37.474Z
Learning: Applies to apps/ccusage/src/**/*.ts : Do not use console.log; use the logger utilities from `src/logger.ts` instead
Applied to files:
apps/ccusage/src/data-loader.ts
🧬 Code graph analysis (2)
apps/ccusage/src/_timestamp-cache.ts (1)
apps/ccusage/src/_consts.ts (1)
DEFAULT_CLAUDE_CONFIG_PATH (56-56)
apps/ccusage/src/data-loader.ts (1)
apps/ccusage/src/_timestamp-cache.ts (2)
filterFilesByDateRange (291-337)
sortFilesByTimestampCached (342-364)
🔇 Additional comments (8)
apps/ccusage/src/data-loader.ts (4)
47-50: LGTM! The import follows coding guidelines with the `.ts` extension and imports only the required functions from the new timestamp-cache module.
771-780: LGTM! Good optimization pattern: filtering by date range first reduces the number of files that need to be sorted. The cached timestamp sorting then benefits from not needing to re-read files on subsequent runs.
920-939: LGTM! Good use of Set and Map for O(1) lookups when filtering and rejoining the files with their base directories. The pattern efficiently preserves the file-to-baseDir association through the filtering and sorting operations.
1368-1376: LGTM! Consistent application of the early date-range filtering and cached sorting pattern across all data loading paths.
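The Set and Map pattern praised in the 920-939 comment can be illustrated roughly as below. The `FileWithBase` type and both function names are hypothetical; the sketch only shows how a file-to-baseDir association can survive a filter step and a sort step that operate on bare file paths.

```typescript
type FileWithBase = { file: string; baseDir: string };

// Keeps only entries whose file path survived filtering; Set gives O(1) membership checks.
function keepFiltered(
	all: readonly FileWithBase[],
	filteredFiles: readonly string[],
): FileWithBase[] {
	const kept = new Set(filteredFiles);
	return all.filter((entry) => kept.has(entry.file));
}

// Re-attaches base directories to files in their sorted order; Map gives O(1) lookups.
function rejoinWithBaseDirs(
	all: readonly FileWithBase[],
	sortedFiles: readonly string[],
): FileWithBase[] {
	const baseDirByFile = new Map<string, string>();
	for (const entry of all) {
		baseDirByFile.set(entry.file, entry.baseDir);
	}
	const result: FileWithBase[] = [];
	for (const file of sortedFiles) {
		const baseDir = baseDirByFile.get(file);
		if (baseDir != null) result.push({ file, baseDir });
	}
	return result;
}
```

Both steps are linear in the number of files, whereas repeated `Array.prototype.includes` or `find` calls would be quadratic, which matters at the 8600+ file scale mentioned in the PR.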
apps/ccusage/src/_timestamp-cache.ts (4)
12-17: LGTM! Imports follow the established patterns in the codebase. Uses the `.ts` extension for local imports as per coding guidelines.
54-82: LGTM! Good use of `Result.try` for error handling and version checking for cache compatibility. The lazy loading pattern with null check is appropriate.
291-337: LGTM! The date range filtering logic correctly handles:
- Early return when no filters specified
- File date range overlaps with filter range
- Files without timestamps (included for safety)
The YYYYMMDD parsing relies on caller validation, which is consistent with the existing codebase pattern.
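The three cases listed in this comment can be sketched as a small overlap filter. This is an illustration under stated assumptions, not the PR's `filterFilesByDateRange`: the `FileRange` shape is hypothetical, timestamps are assumed to be ISO 8601 strings beginning with `YYYY-MM-DD`, and `since`/`until` are assumed to be pre-validated `YYYYMMDD` strings as the comment notes.

```typescript
// Hypothetical per-file summary: first and last timestamps, or null when unknown.
type FileRange = { file: string; first: string | null; last: string | null };

// Converts "2025-01-05T10:00:00Z" to "20250105" by taking the date prefix.
function toDateKey(iso: string): string {
	return iso.slice(0, 10).replace(/-/g, "");
}

function filterByDateRange(
	files: readonly FileRange[],
	since?: string,
	until?: string,
): string[] {
	// Early return when no filters are specified.
	if (since == null && until == null) {
		return files.map((f) => f.file);
	}
	return files
		.filter((f) => {
			// Files without timestamps are included for safety.
			if (f.first == null || f.last == null) return true;
			// YYYYMMDD strings compare correctly as plain strings.
			if (since != null && toDateKey(f.last) < since) return false; // ends before window
			if (until != null && toDateKey(f.first) > until) return false; // starts after window
			return true;
		})
		.map((f) => f.file);
}
```

A file is kept whenever its own first-to-last range overlaps the requested window, so a long-running session that merely touches the window is still loaded and can be filtered per entry later.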
374-444: LGTM! Tests follow coding guidelines:
- In-source testing with `if (import.meta.vitest != null)` block
- Uses Vitest globals without imports
- Uses `await using` with `fs-fixture` for proper cleanup
- Clears memory cache in `beforeEach` for test isolation
looks good! i'd love to use it in other clis!!
Thanks for the feedback! I've updated the implementation to use file seeking ( |
Summary
This PR adds a persistent timestamp cache to dramatically improve performance when loading usage data from JSONL files.
Problem: With large numbers of JSONL files (8600+ files, 863MB), ccusage was very slow because it needed to read every file to extract timestamps for sorting on each run.
Solution:
- Cache file timestamps to `~/.config/claude/.ccusage/timestamp-cache.json`
- Early filter files by date range before sorting (using `--since`/`--until`)
Performance Results
Tested on 8642 JSONL files (863MB):
- Full data query: 28.2s → 8.4s (3.4x faster)
- With `--since` filter: 11.4s → 8.1s (1.4x faster)
Changes
- `apps/ccusage/src/_timestamp-cache.ts` - New cache module with tests (445 lines)
- `apps/ccusage/src/data-loader.ts` - Integrate cache into data loading functions
Test plan
🤖 Generated with Claude Code