| 240 |`complex_architecture`|`google/gemini-3.1-pro`| Complex systems and architecture design | CS or engineering + architecture embedding / markers + `projection:balance_complex` or `projection:balance_reasoning`|
| 235 |`complex_stem`|`google/gemini-3.1-pro`| Complex STEM synthesis outside dedicated math | STEM domain + STEM or research embedding, or high routing band |
| 220 |`medium_code_general`|`qwen/qwen3.5-rocm`| Low-medium cost coding, debugging, and technical Q&A | code domain / markers / embedding / coding preference + `projection:balance_medium` or `projection:balance_complex`, excluding agentic, architecture, and creative cues|
+| 220 |`medium_code_general`|`qwen/qwen3.5-rocm`| Low-medium cost coding, debugging, and technical Q&A | code domain / markers / embedding + `projection:balance_medium` or `projection:balance_complex`, or short urgent code prompts with `projection:balance_simple` + `projection:urgency_elevated`|
| 216 |`verified_business`|`google/gemini-2.5-flash-lite`| Evidence-sensitive business or economics requests | business/economics + `projection:verification_required` or hard evidence synthesis + business embedding or medium/complex routing band |
| 215 |`medium_business`|`qwen/qwen3.5-rocm`| Mid-tier business and economics analysis | business/economics + `embedding:business_analysis` + `projection:balance_medium` or `projection:balance_complex`, excluding verification overlay |
| 214 |`verified_health`|`google/gemini-3.1-pro`| Evidence-sensitive health and medical guidance |`domain:health` + `projection:verification_required` + health embedding or medium/complex/reasoning band |
| 211 |`verified_history`|`google/gemini-2.5-flash-lite`| Source-sensitive history explanation |`domain:history` + `projection:verification_required` or hard evidence synthesis + history embedding or medium/complex routing band |
| 210 |`medium_history`|`qwen/qwen3.5-rocm`| Mid-tier history explanation and comparison |`domain:history` + `embedding:history_explainer` + `projection:balance_medium` or `projection:balance_complex`, excluding verification overlay |
| 205 |`medium_psychology`|`qwen/qwen3.5-rocm`| Psychology and behavior queries with nuanced explanation |`domain:psychology` + `embedding:psychology_support` + `projection:balance_medium` or `projection:balance_complex`|
+| 202 |`engaged_general`|`google/gemini-2.5-flash-lite`| General or psychology-adjacent prompts with visible emotion or urgency |`projection:emotion_positive` or `projection:emotion_negative` or `projection:urgency_elevated` + general/psychology cues, excluding specialist and verification-heavy lanes |
| 200 |`medium_creative`|`google/gemini-2.5-flash-lite`| Creative writing, copywriting, and ideation | creative markers / embedding / collaboration preference + `projection:balance_simple` or `projection:balance_medium`|
| 190 |`reasoning_general`|`openai/gpt5.4`| Non-specialist deep analysis and multi-step reasoning | reasoning / research / multi-step cues + `projection:balance_complex` or `projection:balance_reasoning`, excluding specialist embeddings and broad technical markers |
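As an illustration of how a row's predicate column reads, here is a minimal sketch of the added `medium_code_general` row expressed as a boolean check over fired signals. It is one possible reading, not the router's actual implementation: the names in `CODE_CUES` are hypothetical stand-ins for the "code domain / markers / embedding" cues, and only the `projection:*` identifiers are taken from the table.

```python
from typing import AbstractSet

# Hypothetical names for the "code domain / markers / embedding" cues.
# Only the projection:* identifiers below come from the decision table.
CODE_CUES = frozenset({"domain:code", "keyword:code_markers", "embedding:code_general"})

MID_BANDS = frozenset({"projection:balance_medium", "projection:balance_complex"})
SHORT_URGENT = frozenset({"projection:balance_simple", "projection:urgency_elevated"})


def matches_medium_code_general(signals: AbstractSet[str]) -> bool:
    """Return True when the fired signals satisfy this reading of the row."""
    has_code_cue = bool(CODE_CUES & signals)
    # Either a medium/complex balance band fired ...
    in_mid_band = bool(MID_BANDS & signals)
    # ... or the prompt is both simple-band and urgency-elevated.
    is_short_urgent = SHORT_URGENT <= signals
    return has_code_cue and (in_mid_band or is_short_urgent)
```

Under this reading, a code prompt in the simple band only reaches the lane when the urgency overlay also fires.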
|`structure`| cheap structural overlays for workflow formatting and punctuation emphasis |`ordered_workflow`, `numbered_steps`, `exclamation_emphasis`|
|`fact_check`| evidence-sensitive detection that feeds verification pressure |`needs_fact_check`|
|`user_feedbacks`| explicit correction or clarification overlays |`wrong_answer`, `need_clarification`|
|`preferences`| collaboration style and request framing |`coding_partner`, `creative_collaboration`, `agentic_execution`|
@@ -214,8 +220,9 @@ Notable profile-specific signal details:
- `context` bands are non-overlapping: `short_context` is `0-999`, `medium_context` is `1K-7999`, and `long_context` is `8K-256K`.
- `complexity` signals are reusable across both route predicates and projection scores through sublevels such as `code_task:hard` or `evidence_synthesis:medium`.
+- the emotion and urgency overlays stay heuristic on purpose: lexical markers and repeated `!` / `！` are used as secondary coordination signals instead of replacing the learned primary-intent lanes.
- short lexical verification and correction cues are intentionally literal in this profile, so examples that say `verify this`, `answer with citations`, or Chinese `给出处` are more reliable than looser paraphrases.
-- `jailbreak` and `pii` signals are still defined in the profile for safety surfaces, but they are not the primary routing predicates for the 22 active decisions.
+- `jailbreak` and `pii` signals are still defined in the profile for safety surfaces, but they are not the primary routing predicates for the 23 active decisions.
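The non-overlapping `context` bands listed above can be sketched as a simple threshold lookup. This is a minimal illustration, assuming the ranges are token counts, that `1K` means 1000 (which the `0-999` boundary suggests), and that `256K` means 256,000. The function name and the error on out-of-range input are hypothetical.

```python
def context_band(token_count: int) -> str:
    """Map a token count to one of the three non-overlapping context bands."""
    if token_count < 0:
        raise ValueError("token_count must be non-negative")
    if token_count <= 999:
        return "short_context"   # 0-999
    if token_count <= 7_999:
        return "medium_context"  # 1K-7999
    if token_count <= 256_000:   # assumption: 256K == 256,000
        return "long_context"    # 8K-256K
    raise ValueError("token_count exceeds the largest documented band")
```

Because each count falls in exactly one range, at most one `context` signal can fire per request.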
## Projection Overview
@@ -227,6 +234,10 @@ The profile uses `routing.projections` as the coordination layer between raw sig
|`balance_intent_partition`| partition | resolves one learned-intent winner across the maintained embedding lanes |`agentic_workflows`, `architecture_design`, `code_general`, `creative_tasks`, `fast_qa_en`, `fast_qa_zh`, `general_chat_fallback`, and related specialist embeddings |
|`difficulty_score`| score | blends context, keywords, embeddings, and complexity sublevels into one difficulty signal | source for the difficulty band mapping |
|`emotion_valence`| score | blends positive and negative affect markers into one lightweight emotional-overlay score | source for the emotion band mapping |
@@ -242,7 +253,7 @@ That lets the profile reuse one difficulty story and one verification story acro
Test these in the dashboard playground at `http://<your-server-ip>:8700`:
-The same stable examples are also maintained as machine-readable probes in [`balance.probes.yaml`](./balance.probes.yaml) for live `POST /api/v1/eval` calibration loops. The maintained suite currently covers all 22 decisions with 54 probe variants, so routing changes are checked against a small robustness set instead of one crafted prompt per route.
+The same stable examples are also maintained as machine-readable probes in [`balance.probes.yaml`](./balance.probes.yaml) for live `POST /api/v1/eval` calibration loops. The maintained suite currently covers all 23 decisions with 58 probe variants, so routing changes are checked against a small robustness set instead of one crafted prompt per route.
Each decision below includes every maintained probe variant from the manifest, so the README stays copy-pasteable for playground checks and aligned with the executable eval suite.
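A coverage claim such as "all 23 decisions with 58 probe variants" can be checked mechanically once the manifest is loaded. The sketch below assumes a hypothetical in-memory shape (decision name, then variant name, then prompt text). The real `balance.probes.yaml` schema is not shown here and may differ, and `coverage` and `uncovered` are illustrative helper names.

```python
from typing import Dict, Iterable, List, Tuple

# Hypothetical manifest shape: decision -> {variant name -> prompt text}.
Manifest = Dict[str, Dict[str, str]]

SAMPLE: Manifest = {
    "engaged_general": {
        "celebratory_reply_zh": "太好了!!!我终于拿到 offer 了,帮我写一段兴奋但得体的回复。",
        "roommate_text": "I am overwhelmed right now!! Help me write a calm text "
                         "to my roommate and keep it supportive.",
        "dinner_reschedule": "This is ridiculous!! Help me write a calm message "
                             "to reschedule tonight's dinner.",
    },
}


def coverage(manifest: Manifest) -> Tuple[int, int]:
    """Return (number of decisions, total number of probe variants)."""
    return len(manifest), sum(len(variants) for variants in manifest.values())


def uncovered(manifest: Manifest, expected: Iterable[str]) -> List[str]:
    """List expected decisions that have no probe variant in the manifest."""
    return sorted(name for name in expected if not manifest.get(name))
```

For the full suite, `coverage` would be expected to return `(23, 58)`, and `uncovered` would be expected to return an empty list when given every active decision name.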
@@ -408,6 +419,12 @@ A Java unit test is failing after a refactor; explain the most likely cause and
After a refactor, an integration test started failing in a Java codebase. Explain the most likely cause and the first code change to inspect.
```
+#### `urgent_bug_zh`
+
+```text
+这太离谱了!!!马上告诉我该怎么处理这个 bug。
+```
+
### `verified_business`
Expected alias: `google/gemini-2.5-flash-lite`
@@ -540,6 +557,30 @@ Why do people fall into confirmation bias, and what strategies usually help redu
Why do people procrastinate on important work, and what interventions usually help?
```
+### `engaged_general`
+
+Expected alias: `google/gemini-2.5-flash-lite`
+
+Emotion-aware and urgency-aware general lane for prompts that should avoid brittle specialist or fast-QA misroutes.
+
+#### `celebratory_reply_zh`
+
+```text
+太好了!!!我终于拿到 offer 了,帮我写一段兴奋但得体的回复。
+```
+
+#### `roommate_text`
+
+```text
+I am overwhelmed right now!! Help me write a calm text to my roommate and keep it supportive.
+```
+
+#### `dinner_reschedule`
+
+```text
+This is ridiculous!! Help me write a calm message to reschedule tonight's dinner.
+```