
Commit f8eb87d

Merge branch 'main' into main
2 parents a9b893d + 0c293f1 commit f8eb87d

File tree

8 files changed: +940 -8 lines

README.md

Lines changed: 131 additions & 8 deletions
@@ -117,8 +117,18 @@ prompt-learning/
 
 ### Installation
 
+Install the `prompt-learning` package via pip:
+
 ```bash
-pip install -r requirements.txt
+pip install prompt-learning
+```
+
+Or install from source for development:
+
+```bash
+git clone https://github.com/priyanjindal/prompt-learning.git
+cd prompt-learning
+pip install -e .
 ```
 
 ### Environment Setup
@@ -131,29 +141,142 @@ export OPENAI_API_KEY="your-api-key-here"
 
 ```python
 import pandas as pd
-from optimizer_sdk.prompt_learning_optimizer import PromptLearningOptimizer
+from prompt_learning import PromptLearningOptimizer
 
 # Create dataset with English feedback
 dataset = pd.DataFrame({
-    'input': ["Generate a tech company's career page"],
-    'output': ["{incorrect JSON output}"],
-    'feedback': ["The generated JSON breaks several rules: missing 'updatedAt' field, top-level key should be 'page'"]
+    'query': [
+        "I can't log in to my account anymore",
+        "My password reset email never arrived",
+        "I was charged twice for the same order",
+    ],
+    'output': [
+        "Login Issues",
+        "Password Reset",
+        "Billing Inquiry",
+    ],
+    'feedback': [
+        "correct",
+        "correct",
+        "correct",
+    ]
 })
 
+# Define your prompt with template variables
+prompt = """You are a customer support classifier.
+Classify the query into a category.
+
+Query: {query}
+
+Category:"""
+
 # Initialize optimizer
 optimizer = PromptLearningOptimizer(
-    prompt="You are an expert in JSON webpage creation. Generate: {input}",
-    model_choice="gpt-4"
+    prompt=prompt,
+    model_choice="gpt-4o"
 )
 
-# Optimize the prompt using English feedback
+# Optimize the prompt using feedback
 optimized_prompt = optimizer.optimize(
     dataset=dataset,
     output_column='output',
     feedback_columns=['feedback']
 )
+
+print(optimized_prompt)
+```
+
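Beyond hand-written feedback strings, a feedback column can also be derived programmatically when ground-truth labels are available. A minimal sketch (a hypothetical helper, not part of the SDK) of turning label comparisons into natural-language feedback:

```python
import pandas as pd

# Hypothetical helper: derive a 'feedback' column by comparing model outputs
# against known labels. Any mismatch becomes a natural-language correction.
df = pd.DataFrame({
    "query": ["I can't log in to my account anymore", "I was charged twice for the same order"],
    "output": ["Login Issues", "Login Issues"],
    "label": ["Login Issues", "Billing Inquiry"],
})
df["feedback"] = [
    "correct" if out == lab else f"incorrect: expected '{lab}'"
    for out, lab in zip(df["output"], df["label"])
]
print(df["feedback"].tolist())
```

The resulting frame has the same `query`/`output`/`feedback` shape the Quick Start expects.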
+### Advanced Usage
+
+#### Using Custom Evaluators
+
+You can run evaluators on your dataset before optimization:
+
+```python
+from prompt_learning import PromptLearningOptimizer
+
+optimizer = PromptLearningOptimizer(
+    prompt="Your prompt with {variables}",
+    model_choice="gpt-4o"
+)
+
+# Run evaluators first
+dataset, feedback_columns = optimizer.run_evaluators(
+    dataset=dataset,
+    evaluators=[your_custom_evaluator],
+    feedback_columns=["existing_feedback"]
+)
+
+# Then optimize
+optimized_prompt = optimizer.optimize(
+    dataset=dataset,
+    output_column='output',
+    feedback_columns=feedback_columns
+)
 ```
 
+#### Using Annotations
+
+Generate detailed annotations to guide optimization:
+
+```python
+annotations = optimizer.create_annotation(
+    prompt=prompt,
+    template_variables=["query"],
+    dataset=dataset,
+    feedback_columns=["feedback"],
+    annotator_prompts=["Analyze why the model made errors and suggest improvements."],
+    output_column="output"
+)
+
+optimized_prompt = optimizer.optimize(
+    dataset=dataset,
+    output_column='output',
+    feedback_columns=['feedback'],
+    annotations=annotations
+)
+```
+
+#### Optimizing Rulesets
+
+For coding agents or complex systems, optimize dynamic rulesets instead of the full prompt:
+
+```python
+optimized_ruleset = optimizer.optimize(
+    dataset=dataset,
+    output_column='output',
+    feedback_columns=['feedback'],
+    ruleset="- Rule 1: Always check for edge cases\n- Rule 2: Validate inputs"
+)
+```
+
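The optimized ruleset comes back as a plain string, so wiring it into an agent is string assembly. A minimal sketch (a hypothetical helper, not part of the SDK) of splicing dynamic rules under an agent's static prompt:

```python
# Hypothetical helper: append the optimized dynamic ruleset under the agent's
# static rules, mirroring the static/dynamic split described above.
def with_dynamic_rules(static_prompt: str, ruleset: str) -> str:
    return f"{static_prompt}\n\nDynamic rules:\n{ruleset}"

static_prompt = "You are a coding agent.\nStatic rules:\n- Follow the repository style."
optimized_ruleset = "- Rule 1: Always check for edge cases\n- Rule 2: Validate inputs"
print(with_dynamic_rules(static_prompt, optimized_ruleset))
```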
+### API Reference
+
+#### `PromptLearningOptimizer`
+
+**Constructor:**
+```python
+PromptLearningOptimizer(
+    prompt: Union[PromptVersion, str, List[Dict[str, str]]],
+    model_choice: str = "gpt-4",
+    openai_api_key: Optional[str] = None,
+    meta_prompt: Optional[str] = None,
+    rules_meta_prompt: Optional[str] = None,
+)
+```
+
+- `prompt`: The prompt to optimize. Can be a string, a list of messages, or a Phoenix `PromptVersion`.
+- `model_choice`: OpenAI model to use (default: `"gpt-4"`)
+- `openai_api_key`: API key (or set via the `OPENAI_API_KEY` env var)
+- `meta_prompt`: Custom meta-prompt template (optional)
+- `rules_meta_prompt`: Custom meta-prompt for ruleset optimization (optional)
+
+**Methods:**
+
+- `optimize(dataset, output_column, feedback_columns, ...)`: Optimize the prompt using feedback data
+- `run_evaluators(dataset, evaluators, feedback_columns)`: Run evaluators on the dataset
+- `create_annotation(...)`: Generate annotations for optimization guidance
+
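Per the constructor signature above, `prompt` may also be given as a list of chat messages rather than a plain string (a Phoenix `PromptVersion` is the third option). A sketch of the message-list form; the role/content shape shown is an assumption based only on the `List[Dict[str, str]]` type hint:

```python
# Message-list form of `prompt`, per the List[Dict[str, str]] type hint.
# The exact roles accepted are an assumption; OpenAI-style roles are shown.
messages_prompt = [
    {"role": "system", "content": "You are a customer support classifier."},
    {"role": "user", "content": "Query: {query}\n\nCategory:"},
]
```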
 ## Contributing
 
 You can contribute to the optimizer SDK itself within the optimizer_sdk notebook. You can also add notebooks, datasets, or other additional material.

src/prompt_learning/__init__.py

Lines changed: 20 additions & 0 deletions (new file)

```python
"""
prompt-learning: A prompt optimization SDK using meta-prompt approaches.

This package provides tools for optimizing LLM prompts using feedback
and evaluation data.
"""

from .prompt_learning_optimizer import PromptLearningOptimizer
from .annotator import Annotator
from .meta_prompt import MetaPrompt
from .tiktoken_splitter import TiktokenSplitter

__version__ = "0.1.0"

__all__ = [
    "PromptLearningOptimizer",
    "Annotator",
    "MetaPrompt",
    "TiktokenSplitter",
]
```

src/prompt_learning/annotator.py

Lines changed: 90 additions & 0 deletions (new file)

```python
from typing import List, Optional
import os

import openai
import pandas as pd

from .constants import END_DELIM, START_DELIM


class Annotator:
    def __init__(self, annotations_prompt_template: str):
        self.annotations_prompt_template = annotations_prompt_template

    def construct_content(
        self,
        batch_df: pd.DataFrame,
        baseline_prompt: str,
        template_variables: List[str],
        feedback_columns: List[str],
        output_column: str,
        ground_truth_column: Optional[str] = None,
    ) -> str:
        """
        Generate annotations based on the evaluation results.

        Args:
            batch_df: DataFrame containing the evaluation data
            baseline_prompt: The original prompt that was evaluated
            template_variables: List of template variable names
            feedback_columns: List of feedback column names
            output_column: Name of the output column
            ground_truth_column: Optional name of the ground-truth column

        Returns:
            Formatted prompt string for annotation generation
        """
        content = self.annotations_prompt_template
        content = content.replace("{baseline prompt}", baseline_prompt)

        examples = ""
        # Iterate over the batch of data and populate the template with actual values
        for ind, row in batch_df.iterrows():
            row_dict = row.to_dict()
            output_value = row_dict[output_column]
            if output_value is not None and isinstance(output_value, str):
                output_value = output_value.replace(START_DELIM, " ").replace(END_DELIM, " ")
            else:
                output_value = "None"
            if ground_truth_column is not None:
                ground_truth_value = row_dict[ground_truth_column]
            else:
                ground_truth_value = "N/A"
            current_example = f"""
Example {ind}

Input: {[row_dict[temp_var] for temp_var in template_variables]}

Output: {output_value}

Ground Truth: {ground_truth_value}

Feedback:
"""

            for feedback_column in feedback_columns:
                feedback_value = row_dict[feedback_column]
                if feedback_value is not None:
                    # Cast to string to handle integers and other types
                    feedback_value = str(feedback_value)
                    feedback_value = feedback_value.replace(START_DELIM, " ").replace(END_DELIM, " ")
                else:
                    feedback_value = "None"
                current_example += f"\n{feedback_column}: {feedback_value}"
            examples += current_example

        content = content.replace("{examples}", examples)
        return content

    def generate_annotation(
        self,
        prompt: str,
    ) -> str:
        client = openai.OpenAI(api_key=os.getenv("OPENAI_API_KEY"))
        response = client.chat.completions.create(
            model="gpt-4o",
            messages=[
                {"role": "user", "content": prompt},
            ],
        )
        return response.choices[0].message.content
```

Note: the `Ground Truth` field now uses the `ground_truth_value` computed from `ground_truth_column`; the original interpolated a hard-coded `row_dict.get('ground_truth', 'N/A')`, leaving the computed value dead.
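A standalone sketch of the delimiter-sanitization step in `construct_content`: values are cast to strings and curly braces are blanked, so that a later placeholder substitution on the assembled content cannot be confused by braces inside user data (re-implemented here for illustration):

```python
START_DELIM, END_DELIM = "{", "}"  # same delimiters as constants.py

def sanitize(value):
    # Mirror of the per-value cleanup inside Annotator.construct_content.
    if value is None:
        return "None"
    value = str(value)  # handle integers and other non-string types
    return value.replace(START_DELIM, " ").replace(END_DELIM, " ")

print(sanitize('{"label": "correct"}'))
```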

src/prompt_learning/constants.py

Lines changed: 113 additions & 0 deletions (new file)

```python
# Constants for the prompt-learning-sdk module.

# Delimiters for template variables
START_DELIM = "{"
END_DELIM = "}"

SUPPORTED_MODELS = [
    "o1",
    "o3",
    "gpt-4o",
    "gpt-4",
    "gpt-3.5-turbo",
    "gpt-3.5",
]

# Meta prompt template sections
META_PROMPT_TEMPLATE = """
You are an expert in prompt optimization. Given the original baseline prompt and the following associated metadata (such as model inputs, outputs, evaluation labels and explanations),
generate a revised version of the original prompt that would likely improve results with respect to the evaluation labels.
Your goal is to align the prompt with the feedback and evaluation criteria.

BELOW IS THE ORIGINAL BASELINE PROMPT
************* start prompt *************

{baseline_prompt}

************* end prompt *************

BELOW ARE THE EXAMPLES USING THE ABOVE PROMPT
************* start example data *************

{examples}

************* end example data *************

HERE ARE SOME ANNOTATIONS THAT MAY BE HELPFUL:
{annotations}

FINAL INSTRUCTIONS
Iterate on the original prompt (above) with a new prompt that will improve the results, based on the examples and feedback above.

A common best practice in prompt optimization is to add guidelines and the most helpful few shot examples.

Note: Make sure to include the variables from the original prompt, which are wrapped in either single brackets or double brackets (e.g. {var}). If you fail to include these variables, the LLM will not be able to access the required data.
Do not add any single or double brackets around anything other than the variables from the original prompt. The only curly brackets that should be used are the ones that wrap the variables from the original prompt.
Make sure to copy paste the exact return instructions from the original prompt. Do not add any brackets here.

YOUR NEW PROMPT:
"""

CODING_AGENT_META_PROMPT_TEMPLATE = """
You are an expert in coding agent prompt optimization.
Your goal is to improve the dynamic ruleset that guides the coding agent.

Process:
1. Carefully review the baseline prompt, the current dynamic ruleset, examples, and annotations.
2. Identify high-level issues in the baseline prompt and dynamic ruleset -- focus on missing guidance, vague constraints, or areas where rules could be made more robust.
3. Revise the dynamic ruleset so it is stronger, more reliable, and generalizes well beyond the provided examples.

BELOW IS THE ORIGINAL BASELINE PROMPT WITH STATIC RULESET
************* start prompt *************

{baseline_prompt}

************* end prompt *************

BELOW IS THE CURRENT DYNAMIC RULESET (CHANGE THESE OR ADD NEW RULES)
************* start ruleset *************

{ruleset}

************* end ruleset *************

Now you will be given data examples that use the above prompt and ruleset. Each example consists of:
- problem_statement: the problem statement
- coding agent patch: a patch generated by the coding agent, which is supposed to fix the problem
- ground truth patch: a ground-truth solution/patch to the problem
- test patch: a test patch that the coding agent's output should pass, which directly addresses the issue in the problem statement
- pass_or_fail: either "pass" or "fail", indicating whether the coding agent's code changes passed the unit tests (i.e., whether the coding agent's output is correct)
- explanation: an explanation of why the coding agent's output is or is not correct, why the coding agent may have taken that approach, and general suggestions for improving its output

BELOW ARE THE EXAMPLES USING THE ABOVE PROMPT AND RULESET
************* start example data *************

{examples}

************* end example data *************

FINAL INSTRUCTIONS
Iterate on the **dynamic ruleset only**. You may:
- Add new rules
- Edit or strengthen existing rules

Important constraints:
- Do **not** modify the static rules in the baseline prompt.
- Do **not** add rules that request user input, confirmations, or follow-up questions (e.g., `ask_followup_question`). The coding agent should always act autonomously.
- Keep the ruleset concise and relevant -- avoid unnecessary rules that don't match the general types of problems the coding agent is likely to encounter, or overly specific rules that only patch the given examples.
- Remember that you are writing GENERAL rules. They should not be specific to the repositories or problems you are given; they should be general rules that improve the overall ability of the coding agent.

Output format:
- Return only the final, revised dynamic ruleset as a bullet-point list.
- Do not include any extra commentary, explanations, or text outside the ruleset.

New ruleset:
"""

# Template placeholders
EXAMPLES_PLACEHOLDER = "{examples}"

# Example formatting constants
EXAMPLE_HEADER = "Example {index}"
ORIGINAL_TEMPLATE_LABEL = "Original Template With Variables from the Baseline Prompt Populated:"
OUTPUT_LABEL = "Output from the LLM using the template above:"
FEEDBACK_LABEL = "Feedback from the evaluator using the template above and the output above:"
```