[ENHANCEMENT] Support on-the-fly 1M context switching #9250

@cjlawson02

Description

Problem (one or two sentences)

API providers have different rate limit quotas for 1M context window models versus standard 200K models, making it more cost-effective and efficient to use the standard model until context size actually requires the 1M window.

Context (who is affected and when)

This affects all users who enable the "1M Context Window (Beta)" setting for Claude Sonnet 4/4.5 models. Currently, when this setting is enabled, ALL API requests use the 1M context window regardless of actual context size, even for small conversations that fit well within the 200K limit. This results in unnecessary quota consumption and potentially higher API costs.

Desired behavior (conceptual, not technical)

Change the "Enable 1M Context Window" checkbox so that, instead of forcing the 1M window on every request, it enables dynamic context window switching. When enabled, the system should automatically:

  • Use the standard 200K context window for requests that fit within 200K tokens
  • Switch to the 1M context window only when context size approaches or exceeds 200K tokens
  • Switch back to 200K if context is condensed and fits within the smaller window again

The user shouldn't need to toggle the setting manually; the system should choose the right model per request based on actual context usage, as in the sketch below.
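
A minimal sketch of what that per-request decision could look like. The pickContextWindow name, the ContextWindow type, and the exact thresholds are illustrative assumptions, not code from the Roo Code repository:

```typescript
// Illustrative thresholds taken from the acceptance criteria below: switch
// up before hitting the 200K hard limit, and only switch back down once
// condensation leaves comfortable headroom.
const UPSWITCH_THRESHOLD = 190_000;   // nearing the 200K limit -> use 1M
const DOWNSWITCH_THRESHOLD = 180_000; // condensed well below 200K -> back to 200K

type ContextWindow = "200k" | "1m";

function pickContextWindow(
  contextTokens: number,
  current: ContextWindow,
  dynamic1MEnabled: boolean,
): ContextWindow {
  if (!dynamic1MEnabled) return "200k"; // setting off: always the standard window
  if (contextTokens >= UPSWITCH_THRESHOLD) return "1m";
  if (current === "1m" && contextTokens <= DOWNSWITCH_THRESHOLD) return "200k";
  return current; // between the two thresholds: keep the current window
}
```

The 10K gap between the two thresholds is deliberate: hysteresis keeps the selection from flapping between windows when the context size hovers near the boundary.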

Constraints / preferences (optional)

  • Must maintain backward compatibility with the existing anthropicBeta1MContext boolean setting
  • Should work transparently without requiring user intervention once enabled
  • Context size calculation already exists via calculateTokenDistribution(); reuse it (see the sketch after this list)
  • Similar pattern should work for both Anthropic direct API and AWS Bedrock providers
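
Building on the pickContextWindow sketch above, the "transparent" constraint could be satisfied by gating the beta flag at request-build time. Everything here is an assumption for illustration: the getRequestBetas helper is hypothetical, and the real calculateTokenDistribution() signature may differ from the declared stub.

```typescript
// Stand-in for the existing helper; the real signature may differ.
declare function calculateTokenDistribution(messages: unknown[]): { totalTokens: number };

// Hypothetical glue: decide per request whether to opt into the 1M beta.
function getRequestBetas(
  messages: unknown[],
  current: ContextWindow,
  dynamic1MEnabled: boolean,
): string[] {
  const { totalTokens } = calculateTokenDistribution(messages);
  const window = pickContextWindow(totalTokens, current, dynamic1MEnabled);
  // Only requests that actually need the large window carry the beta flag,
  // so small conversations never consume 1M-tier quota.
  return window === "1m" ? ["context-1m-2025-08-07"] : [];
}
```

Per the constraints above, the same decision point should work for AWS Bedrock, with whatever mechanism that provider uses to request the larger window.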

Request checklist

  • I've searched existing Issues and Discussions for duplicates
  • This describes a specific problem with clear context and impact

Roo Code Task Links (optional)

No response

Acceptance criteria (optional)

Given a user has enabled "Dynamic 1M Context Window" for Claude Sonnet 4/4.5
When they start a new conversation with less than 200K tokens of context
Then the system uses the standard 200K context window model
And API requests include NO context-1m-2025-08-07 beta flag
And pricing reflects 200K tier ($3 input / $15 output per million tokens)

Given the same user continues the conversation
When context size grows to exceed 190K tokens (a threshold just below the 200K limit)
Then subsequent API requests automatically switch to 1M context window
And API requests include the context-1m-2025-08-07 beta flag
And pricing reflects 1M tier ($6 input / $22.50 output per million tokens)

Given context was using 1M window
When context is condensed and fits within 180K tokens (leaving a buffer below the 200K limit)
Then subsequent requests switch back to 200K context window
And pricing returns to 200K tier rates

But users with the setting DISABLED should always use 200K context window regardless of size
And the model selection should happen transparently per request, without user intervention
And existing users' anthropicBeta1MContext settings should continue working as before
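
Taken together, the criteria above trace a hysteresis loop that a test could walk directly. A sketch, reusing the hypothetical pickContextWindow helper from earlier and the tier prices quoted in the criteria:

```typescript
// Per-million-token prices as quoted in the acceptance criteria.
const PRICING = {
  "200k": { input: 3.0, output: 15.0 },
  "1m": { input: 6.0, output: 22.5 },
} as const;

// Walk the scenarios: small context, growth past the threshold,
// condensation back under the buffer.
let window: ContextWindow = "200k";
window = pickContextWindow(150_000, window, true); // "200k" -> $3 / $15
window = pickContextWindow(195_000, window, true); // "1m"   -> $6 / $22.50
window = pickContextWindow(185_000, window, true); // stays "1m" (hysteresis)
window = pickContextWindow(175_000, window, true); // back to "200k"
console.log(window, PRICING[window]);
```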

Metadata

Assignees: No one assigned
Labels: Issue/PR - Triage, enhancement
Status: Triage
Milestone: No milestone
