9 changes: 5 additions & 4 deletions README.md
@@ -77,8 +77,9 @@ The building blocks of a priompt prompt are:
3. `<empty>`: for specifying empty space, useful for reserving tokens for generation.
4. `<capture>`: capture the output and parse it right within the prompt.
5. `<isolate>`: isolate a section of the prompt with its own token limit. This is useful for guaranteeing that the start of the prompt will be the same for caching purposes. It would be nice to extend this to allow token limits like `100% - 100`.
6. `<br/>`: force a token break at a particular location, which is useful for ensuring exact tokenization matches between two parts of a prompt (e.g. when implementing something like speculative edits).
7. `<config>`: specify a few common configuration properties, such as `stop` token and `maxResponseTokens`, which can make the priompt dump more self-contained and help with evals.
6. `<max>`: specify a limit on the number of tokens within a scope, but unlike `<isolate>`, include the inner scopes in the global priority calculation. This allows capping a subtree's token usage while preserving global priority-based optimization (see the sketch after this list).
7. `<br/>`: force a token break at a particular location, which is useful for ensuring exact tokenization matches between two parts of a prompt (e.g. when implementing something like speculative edits).
8. `<config>`: specify a few common configuration properties, such as `stop` token and `maxResponseTokens`, which can make the priompt dump more self-contained and help with evals.
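
For example, here is a minimal sketch of how `<max>` composes with scopes. The component name, props, priorities, and limits here are illustrative, not taken from this repo:

```tsx
function FileContextPrompt(props: PromptProps<{ fileContents: string }>): PromptElement {
  return (
    <>
      <SystemMessage>You are a helpful assistant.</SystemMessage>
      {/* Cap the excerpt at 1000 tokens. Unlike <isolate>, the scopes inside
          still compete in the global priority calculation, so under a tight
          overall budget the p={600} scope is dropped before higher-priority
          content elsewhere in the prompt. */}
      <max tokenLimit={1000}>
        <scope p={800}>{props.fileContents}</scope>
        <scope p={600}>Optional surrounding context</scope>
      </max>
      {/* Reserve room for the model's response. */}
      <empty tokens={500} />
    </>
  );
}
```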

You can create components all you want, just like in React. The builtin components are:

@@ -97,8 +98,8 @@ You can create components all you want, just like in React. The builtin componen

A few things that would be cool to add:

1. A `<max>` block: specify a `limit` on the number of tokens within a scope, but unlike `<isolate>`, include the inner scopes in the global priority calculation.
2. Performance-optimized rendering of big trees: minimizing time spent tokenizing is part of it, but part of it is also working around JavaScript object allocation, and it is possible that writing the entire rendering engine in Rust, for example, would make it a lot faster.
1. Performance-optimized rendering of big trees: minimizing time spent tokenizing is part of it, but part of it is also working around JavaScript object allocation, and it is possible that writing the entire rendering engine in Rust, for example, would make it a lot faster.
2. Enhanced `<max>` block integration: the current `<max>` implementation enforces its limit by re-rendering the subtree at progressively higher priority cutoffs while still participating in the global priority calculation. A more sophisticated version could fold the per-scope limit directly into the global optimizer's search over priority cutoffs; the sketch below shows how `<max>` and `<isolate>` differ today.
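
To make the distinction concrete, here is a minimal sketch contrasting the two blocks, mirroring the comparison test in `priompt/src/components.test.tsx` (priorities and limits are illustrative):

```tsx
<>
  {/* <isolate> pre-renders with its own 200-token budget: its scopes never
      compete with scopes outside the block, which keeps the rendered prefix
      stable for caching. */}
  <isolate tokenLimit={200}>
    <UserMessage>
      <scope p={940}>Isolated high-priority content</scope>
      <scope p={740}>Isolated low-priority content</scope>
    </UserMessage>
  </isolate>
  {/* <max> only caps its subtree at 200 tokens: its scopes still enter the
      global priority calculation, so under a tight global token limit the
      p={750} scope below competes directly with scopes outside the block. */}
  <max tokenLimit={200}>
    <UserMessage>
      <scope p={950}>Capped high-priority content</scope>
      <scope p={750}>Capped low-priority content</scope>
    </UserMessage>
  </max>
</>
```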

## Caveats

172 changes: 172 additions & 0 deletions priompt/src/components.test.tsx
@@ -938,3 +938,175 @@ describe("Complex PromptElement Regression Tests", () => {
expect(rendered).toMatchSnapshot();
});
});

describe("Max Block Tests", () => {
const tokenizer = getTokenizerByName_ONLY_FOR_OPENAI_TOKENIZERS("cl100k_base");

it("should render a PromptElement with max block that enforces token limits", async () => {
function MaxBlockPrompt(props: PromptProps): PromptElement {
return (
<>
<SystemMessage>
System message outside max block
</SystemMessage>
<max tokenLimit={50}>
<UserMessage>
This is a user message inside max block that is quite long and should be truncated
<scope p={900}>High priority content</scope>
<scope p={800}>Medium priority content</scope>
<scope p={700}>Low priority content that should be excluded when token limit is hit</scope>
</UserMessage>
</max>
<AssistantMessage>
<scope p={950}>Very high priority content outside max block</scope>
<scope p={850}>High priority content outside max block</scope>
</AssistantMessage>
</>
);
}

const rendered = await Priompt.render(MaxBlockPrompt({}), {
tokenLimit: 1000,
tokenizer,
shouldBuildSourceMap: true,
});

delete rendered.durationMs;
expect(rendered).toMatchSnapshot();
});

it("should include max block children in global priority calculation unlike isolate", async () => {
function MaxVsIsolatePrompt(props: PromptProps): PromptElement {
return (
<>
<SystemMessage>
System message
</SystemMessage>
<max tokenLimit={200}>
<UserMessage>
<scope p={950}>Max block high priority content</scope>
<scope p={850}>Max block medium priority content</scope>
<scope p={750}>Max block low priority content</scope>
</UserMessage>
</max>
<isolate tokenLimit={200}>
<UserMessage>
<scope p={940}>Isolate block high priority content</scope>
<scope p={840}>Isolate block medium priority content</scope>
<scope p={740}>Isolate block low priority content</scope>
</UserMessage>
</isolate>
<AssistantMessage>
<scope p={900}>Global high priority content</scope>
<scope p={800}>Global medium priority content</scope>
<scope p={700}>Global low priority content</scope>
</AssistantMessage>
</>
);
}

const rendered = await Priompt.render(MaxVsIsolatePrompt({}), {
tokenLimit: 300, // Tight limit to test priority behavior
tokenizer,
shouldBuildSourceMap: true,
});

delete rendered.durationMs;
expect(rendered).toMatchSnapshot();
});

it("should handle nested max blocks correctly", async () => {
function NestedMaxBlockPrompt(props: PromptProps): PromptElement {
return (
<>
<SystemMessage>
Outer system message
</SystemMessage>
<max tokenLimit={100}>
<UserMessage>
Outer max block
<max tokenLimit={50}>
<scope p={900}>Inner max block content</scope>
<scope p={800}>More inner content</scope>
</max>
</UserMessage>
</max>
<AssistantMessage>
<scope p={950}>Outside max blocks</scope>
</AssistantMessage>
</>
);
}

const rendered = await Priompt.render(NestedMaxBlockPrompt({}), {
tokenLimit: 500,
tokenizer,
shouldBuildSourceMap: true,
});

delete rendered.durationMs;
expect(rendered).toMatchSnapshot();
});

it("should handle max block with different priority levels correctly", async () => {
function MaxPriorityTestPrompt(props: PromptProps): PromptElement {
return (
<>
<SystemMessage>
System message
</SystemMessage>
<max tokenLimit={30}>
<UserMessage>
{/* These should be prioritized based on their p values even within max block */}
<scope p={1000}>Highest priority in max block</scope>
<scope p={900}>High priority in max block</scope>
<scope p={800}>Medium priority in max block</scope>
<scope p={700}>Low priority in max block</scope>
</UserMessage>
</max>
<AssistantMessage>
<scope p={950}>High priority outside max</scope>
</AssistantMessage>
</>
);
}

const rendered = await Priompt.render(MaxPriorityTestPrompt({}), {
tokenLimit: 200,
tokenizer,
shouldBuildSourceMap: true,
});

delete rendered.durationMs;
expect(rendered).toMatchSnapshot();
});

it("should return empty when max block cannot fit any content", async () => {
function EmptyMaxBlockPrompt(props: PromptProps): PromptElement {
return (
<>
<SystemMessage>
System message
</SystemMessage>
<max tokenLimit={5}> {/* Very small limit */}
<UserMessage>
This is a very long message that definitely cannot fit within 5 tokens and should result in an empty max block
</UserMessage>
</max>
<AssistantMessage>
This should still be included
</AssistantMessage>
</>
);
}

const rendered = await Priompt.render(EmptyMaxBlockPrompt({}), {
tokenLimit: 1000,
tokenizer,
shouldBuildSourceMap: true,
});

delete rendered.durationMs;
expect(rendered).toMatchSnapshot();
});
});
92 changes: 89 additions & 3 deletions priompt/src/lib.ts
@@ -6,7 +6,7 @@
import { ChatCompletionRequestMessage, ChatCompletionFunctions, ChatCompletionResponseMessage, CreateChatCompletionResponse, Content, StreamChatCompletionResponse, } from './openai';
import { CHATML_PROMPT_EXTRA_TOKEN_COUNT_CONSTANT, CHATML_PROMPT_EXTRA_TOKEN_COUNT_LINEAR_FACTOR, } from './openai';
import { OpenAIMessageRole, PriomptTokenizer, numTokensForImage } from './tokenizer';
import { BaseProps, Node, ChatPrompt, Empty, First, RenderedPrompt, PromptElement, Scope, FunctionDefinition, FunctionPrompt, TextPrompt, ChatAndFunctionPromptFunction, ChatPromptMessage, ChatUserSystemMessage, ChatAssistantMessage, ChatFunctionResultMessage, Capture, OutputHandler, PromptProps, CaptureProps, BasePromptProps, ReturnProps, Isolate, RenderOutput, RenderOptions, PromptString, Prompt, BreakToken, PromptContentWrapper, PromptContent, ChatImage, ImagePromptContent, Config, ConfigProps, ChatToolResultMessage, SourceMap, ToolPrompt, ToolDefinition, ChatAndToolPromptToolFunction, AbsoluteSourceMap, RenderunCountTokensFast_UNSAFE } from './types';
import { BaseProps, Node, ChatPrompt, Empty, First, RenderedPrompt, PromptElement, Scope, FunctionDefinition, FunctionPrompt, TextPrompt, ChatAndFunctionPromptFunction, ChatPromptMessage, ChatUserSystemMessage, ChatAssistantMessage, ChatFunctionResultMessage, Capture, OutputHandler, PromptProps, CaptureProps, BasePromptProps, ReturnProps, Isolate, Max, RenderOutput, RenderOptions, PromptString, Prompt, BreakToken, PromptContentWrapper, PromptContent, ChatImage, ImagePromptContent, Config, ConfigProps, ChatToolResultMessage, SourceMap, ToolPrompt, ToolDefinition, ChatAndToolPromptToolFunction, AbsoluteSourceMap, RenderunCountTokensFast_UNSAFE } from './types';
import { NewOutputCatcher } from './outputCatcher.ai';
import { PreviewManager } from './preview';
import { statsd } from './statsd';
@@ -393,6 +393,25 @@ export function createElement(tag: ((props: BaseProps & Record<string, unknown>)
name: (props !== null && typeof props.name === 'string') ? props.name : undefined,
};
}
case 'max':
{
// must have tokenLimit
if (!props || typeof props.tokenLimit !== 'number') {
throw new Error(`max tag must have a tokenLimit prop, got ${JSON.stringify(props)}`);
}

return {
type: 'scope',
children: [{
type: 'max',
tokenLimit: props.tokenLimit,
children: children.flat(),
}],
absolutePriority: (typeof props.p === 'number') ? props.p : undefined,
relativePriority: (typeof props.prel === 'number') ? props.prel : undefined,
name: (props !== null && typeof props.name === 'string') ? props.name : undefined,
};
}
case 'capture':
{
if (children.length > 0) {
@@ -1125,7 +1144,7 @@ type NormalizedFunctionDefinition = FunctionDefinition & {
type NormalizedToolDefinition = ToolDefinition & {
cachedCount: number | undefined;
}
type NormalizedNode = NormalizedFirst | NormalizedScope | BreakToken | Config | Empty | Isolate | Capture | NormalizedChatMessage | NormalizedString | ChatImage | NormalizedFunctionDefinition | NormalizedToolDefinition;
type NormalizedNode = NormalizedFirst | NormalizedScope | BreakToken | Config | Empty | Isolate | Max | Capture | NormalizedChatMessage | NormalizedString | ChatImage | NormalizedFunctionDefinition | NormalizedToolDefinition;
type NormalizedPromptElement = NormalizedNode[];
function normalizePrompt(elem: PromptElement): NormalizedPromptElement {
// we want to merge all the strings together
@@ -1157,6 +1176,7 @@ function normalizePrompt(elem: PromptElement): NormalizedPromptElement {
case 'config':
case 'capture':
case 'isolate':
case 'max':
case 'breaktoken':
case 'image':
case 'empty': {
@@ -1408,6 +1428,43 @@ async function renderWithLevelAndCountTokens(elem: NormalizedNode[] | Normalized
config: emptyConfig(),
}
}
case 'max': {
// For max blocks, we render the children with priority level consideration
// and then truncate if needed
const childrenPrompt = await renderWithLevelAndCountTokens(elem.children, level, tokenizer);

// If within token limit, return as-is
if (childrenPrompt.tokenCount <= elem.tokenLimit) {
return childrenPrompt;
}

// If exceeding the limit, do a simple truncation by raising the priority cutoff
// This is a simplified implementation - the full version would integrate with global optimization
const priorities = new Set<number>();
computePriorityLevels(elem.children, BASE_PRIORITY, priorities);
// Sort ascending so the lowest cutoffs (most content) are tried first; the
// first level that fits then keeps the most content within the limit
const sortedPriorityLevels = Array.from(priorities).sort((a, b) => a - b);

// Raise the priority cutoff until the rendered children fit within the limit
for (const testLevel of sortedPriorityLevels) {
// levels at or below the current one were already covered by the render above
if (testLevel > level) {
const testPrompt = await renderWithLevelAndCountTokens(elem.children, testLevel, tokenizer);
if (testPrompt.tokenCount <= elem.tokenLimit) {
return testPrompt;
}
}
}

// If nothing fits, return empty
return {
prompt: undefined,
tokenCount: 0,
emptyTokenCount: 0,
outputHandlers: [],
streamHandlers: [],
streamResponseObjectHandlers: [],
config: emptyConfig(),
};
}
case 'chat': {
const p = await renderWithLevelAndCountTokens(elem.children, level, tokenizer);
if (isChatPrompt(p.prompt)) {
@@ -1667,6 +1724,12 @@ function renderWithLevelAndEarlyExitWithTokenEstimation(elem: PromptElement, lev
emptyTokenCount += elem.cachedRenderOutput.tokensReserved;
return;
}
case 'max': {
// For max blocks in this estimation pass, we simply recurse into the children;
// the token limit itself is enforced in renderWithLevelAndCountTokens
await renderWithLevelAndEarlyExitWithTokenEstimation(elem.children, level, prompt, emptyTokenCount, tokenizer);
return;
}
case 'scope': {
if (elem.absolutePriority === undefined) {
throw new Error(`BUG!! computePriorityLevels should have set absolutePriority for all scopes`);
@@ -1825,6 +1888,7 @@ function hydrateEmptyTokenCount(elem: PromptElement, tokenizer: PriomptTokenizer
case 'capture':
case 'image':
case 'isolate':
case 'max':
case 'breaktoken':
case 'config':
case 'toolDefinition':
@@ -1892,6 +1956,11 @@ function hydrateIsolates(elem: PromptElement, tokenizer: PriomptTokenizer, shoul
}
return;
}
case 'max': {
// Max blocks don't need pre-rendering like isolate blocks
// They participate in global priority calculation
return hydrateIsolates(elem.children, tokenizer, shouldBuildSourceMap);
}
case 'chat': {
return hydrateIsolates(elem.children, tokenizer, shouldBuildSourceMap);
}
@@ -2047,6 +2116,10 @@ function renderWithLevel(
result.streamResponseObjectHandlers.push(...elem.cachedRenderOutput.streamResponseObjectHandlers);
return elem.cachedRenderOutput.sourceMap;
}
case 'max': {
// For max blocks, we build source maps recursively for children
return await renderWithLevelAndEarlyExitWithTokenEstimation2(elem.children, level, result, tokenizer, shouldBuildSourceMap);
}
case 'scope': {
if (elem.absolutePriority === undefined) {
throw new Error(`BUG!! computePriorityLevels should have set absolutePriority for all scopes`);
@@ -2358,6 +2431,7 @@ function validateNoUnhandledTypes(elem: PromptElement): void {
return;
}
case 'isolate':
case 'max':
case 'breaktoken':
case 'config':
case 'capture':
@@ -2391,6 +2465,7 @@ function validateNotBothAbsoluteAndRelativePriority(elem: PromptElement): void {
switch (elem.type) {
case 'chat':
case 'isolate':
case 'max':
case 'first': {
for (const child of elem.children) {
validateNotBothAbsoluteAndRelativePriority(child);
@@ -2439,7 +2514,8 @@ function validateNoChildrenHigherPriorityThanParent(elem: PromptElement, parentP

switch (elem.type) {
case 'chat':
case 'first': {
case 'first':
case 'max': {
for (const child of elem.children) {
validateNoChildrenHigherPriorityThanParent(child, parentPriority);
}
@@ -2521,6 +2597,14 @@ function computePriorityLevels(elem: AnyNode[] | AnyNode, parentPriority: number
// nothing happens because we fully re-render
return;
}
case 'max': {
// Max blocks participate in global priority calculation
// So we compute priority levels for their children
for (const child of elem.children) {
computePriorityLevels(child, parentPriority, levels);
}
return;
}
case 'scope': {
// compute the priority of this scope
// the absolutePriority takes precedence over the relativePriority
@@ -2610,6 +2694,7 @@ function computePriorityLevelsTokensMapping(elem: NormalizedNode[] | NormalizedN
return;
}
case 'isolate':
case 'max':
case 'breaktoken':
case 'capture':
case 'config':
@@ -3126,6 +3211,7 @@ export function getPromptElementNodeCount(elem: PromptElement): number {
return 1;
case 'first':
case 'isolate':
case 'max':
case 'scope':
case 'chat':
return 1 + getPromptElementNodeCount(elem.children);