
Commit 8ee8ade

Merge branch 'main' into evaluate-cleanup
2 parents b630c62 + 33da561 commit 8ee8ade

File tree

9 files changed: +327 -9 lines changed


docs/docs/learn/evaluation/data.md

Lines changed: 1 addition & 1 deletion

@@ -72,7 +72,7 @@ print("Example object with Non-Input fields only:", non_input_key_only)

 **Output**
 ```
-Example object with Input fields only: Example({'article': 'This is an article.'}) (input_keys=None)
+Example object with Input fields only: Example({'article': 'This is an article.'}) (input_keys={'article'})
 Example object with Non-Input fields only: Example({'summary': 'This is a summary.'}) (input_keys=None)
 ```
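For context, the corrected line reflects that `Example.with_inputs(...)` marks fields as inputs, after which `inputs()` returns a view whose `input_keys` are populated. A minimal sketch that would produce output of this shape (variable names are assumptions mirroring the tutorial):

```python
import dspy

qa_pair = dspy.Example(article="This is an article.", summary="This is a summary.")
qa_pair = qa_pair.with_inputs("article")  # mark 'article' as the input field

input_key_only = qa_pair.inputs()      # keeps only input fields; input_keys={'article'}
non_input_key_only = qa_pair.labels()  # keeps only non-input fields

print("Example object with Input fields only:", input_key_only)
print("Example object with Non-Input fields only:", non_input_key_only)
```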

docs/docs/tutorials/conversation_history/index.md

Lines changed: 219 additions & 0 deletions

@@ -0,0 +1,219 @@
# Managing Conversation History

Maintaining conversation history is a fundamental feature when building AI applications such as chatbots. While DSPy does not provide automatic conversation history management within `dspy.Module`, it offers the `dspy.History` utility to help you manage conversation history effectively.

## Using `dspy.History` to Manage Conversation History

The `dspy.History` class can be used as an input field type. It has a `messages: list[dict[str, Any]]` attribute that stores the conversation history, where each entry is a dictionary with keys corresponding to the fields defined in your signature. See the example below:

```python
import os

import dspy

os.environ["OPENAI_API_KEY"] = "{your_openai_api_key}"

dspy.settings.configure(lm=dspy.LM("openai/gpt-4o-mini"))


class QA(dspy.Signature):
    question: str = dspy.InputField()
    history: dspy.History = dspy.InputField()
    answer: str = dspy.OutputField()


predict = dspy.Predict(QA)
history = dspy.History(messages=[])

while True:
    question = input("Type your question, end conversation by typing 'finish': ")
    if question == "finish":
        break
    outputs = predict(question=question, history=history)
    print(f"\n{outputs.answer}\n")
    # Record the turn: include every input and output field for this exchange.
    history.messages.append({"question": question, **outputs})

dspy.inspect_history()
```
There are two key steps when using conversation history:

- **Include a field of type `dspy.History` in your Signature.**
- **Maintain a history instance at runtime, appending new conversation turns to it.** Each entry should include all relevant input and output field information.

A sample run might look like this:

```
Type your question, end conversation by typing 'finish': do you know the competition between pytorch and tensorflow?

Yes, there is a notable competition between PyTorch and TensorFlow, which are two of the most popular deep learning frameworks. PyTorch, developed by Facebook, is known for its dynamic computation graph, which allows for more flexibility and ease of use, especially in research settings. TensorFlow, developed by Google, initially used a static computation graph but has since introduced eager execution to improve usability. TensorFlow is often favored in production environments due to its scalability and deployment capabilities. Both frameworks have strong communities and extensive libraries, and the choice between them often depends on specific project requirements and personal preference.

Type your question, end conversation by typing 'finish': which one won the battle? just tell me the result, don't include any reasoning, thanks!

There is no definitive winner; both PyTorch and TensorFlow are widely used and have their own strengths.

Type your question, end conversation by typing 'finish': finish




[2025-07-11T16:35:57.592762]

System message:

Your input fields are:
1. `question` (str):
2. `history` (History):
Your output fields are:
1. `answer` (str):
All interactions will be structured in the following way, with the appropriate values filled in.

[[ ## question ## ]]
{question}

[[ ## history ## ]]
{history}

[[ ## answer ## ]]
{answer}

[[ ## completed ## ]]
In adhering to this structure, your objective is:
Given the fields `question`, `history`, produce the fields `answer`.


User message:

[[ ## question ## ]]
do you know the competition between pytorch and tensorflow?


Assistant message:

[[ ## answer ## ]]
Yes, there is a notable competition between PyTorch and TensorFlow, which are two of the most popular deep learning frameworks. PyTorch, developed by Facebook, is known for its dynamic computation graph, which allows for more flexibility and ease of use, especially in research settings. TensorFlow, developed by Google, initially used a static computation graph but has since introduced eager execution to improve usability. TensorFlow is often favored in production environments due to its scalability and deployment capabilities. Both frameworks have strong communities and extensive libraries, and the choice between them often depends on specific project requirements and personal preference.

[[ ## completed ## ]]


User message:

[[ ## question ## ]]
which one won the battle? just tell me the result, don't include any reasoning, thanks!

Respond with the corresponding output fields, starting with the field `[[ ## answer ## ]]`, and then ending with the marker for `[[ ## completed ## ]]`.


Response:

[[ ## answer ## ]]
There is no definitive winner; both PyTorch and TensorFlow are widely used and have their own strengths.

[[ ## completed ## ]]
```
Notice how each user input and assistant response is appended to the history, allowing the model to maintain context across turns.

The actual prompt sent to the language model is a multi-turn message, as shown by the output of `dspy.inspect_history`: each conversation turn is represented as a user message followed by an assistant message.
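Conceptually, the adapter unrolls each entry in `history.messages` into a user/assistant pair placed before the current question. A simplified sketch of that expansion (illustrative only, not DSPy's actual implementation):

```python
def expand_history(history, current_question: str) -> list[dict[str, str]]:
    """Illustration: one user/assistant message pair per past turn."""
    messages = []
    for turn in history.messages:
        messages.append({"role": "user", "content": f"[[ ## question ## ]]\n{turn['question']}"})
        messages.append({"role": "assistant", "content": f"[[ ## answer ## ]]\n{turn['answer']}"})
    # The current question becomes the final user message.
    messages.append({"role": "user", "content": f"[[ ## question ## ]]\n{current_question}"})
    return messages
```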
## History in Few-shot Examples

You may notice that, in the prompt above, `history` does not appear as its own `[[ ## history ## ]]` section in the user messages, even though it is listed as an input field (e.g., "2. `history` (History):" in the system message); the live history is expanded into multiple turns instead. Few-shot examples are handled differently: when formatting demos that include conversation history, DSPy does not expand the history into multiple turns. Instead, to remain compatible with the OpenAI standard message format, each few-shot example is represented as a single turn.

For example:

```python
import dspy

dspy.settings.configure(lm=dspy.LM("openai/gpt-4o-mini"))


class QA(dspy.Signature):
    question: str = dspy.InputField()
    history: dspy.History = dspy.InputField()
    answer: str = dspy.OutputField()


predict = dspy.Predict(QA)
history = dspy.History(messages=[])

predict.demos.append(
    dspy.Example(
        question="What is the capital of France?",
        history=dspy.History(
            messages=[{"question": "What is the capital of Germany?", "answer": "The capital of Germany is Berlin."}]
        ),
        answer="The capital of France is Paris.",
    )
)

predict(question="What is the capital of America?", history=dspy.History(messages=[]))
dspy.inspect_history()
```
The resulting history will look like this:

```
[2025-07-11T16:53:10.994111]

System message:

Your input fields are:
1. `question` (str):
2. `history` (History):
Your output fields are:
1. `answer` (str):
All interactions will be structured in the following way, with the appropriate values filled in.

[[ ## question ## ]]
{question}

[[ ## history ## ]]
{history}

[[ ## answer ## ]]
{answer}

[[ ## completed ## ]]
In adhering to this structure, your objective is:
Given the fields `question`, `history`, produce the fields `answer`.


User message:

[[ ## question ## ]]
What is the capital of France?

[[ ## history ## ]]
{"messages": [{"question": "What is the capital of Germany?", "answer": "The capital of Germany is Berlin."}]}


Assistant message:

[[ ## answer ## ]]
The capital of France is Paris.

[[ ## completed ## ]]

User message:

[[ ## question ## ]]
What is the capital of America?

Respond with the corresponding output fields, starting with the field `[[ ## answer ## ]]`, and then ending with the marker for `[[ ## completed ## ]]`.


Response:

[[ ## answer ## ]]
The capital of America is Washington, D.C.

[[ ## completed ## ]]
```
As you can see, the few-shot example does not expand the conversation history into multiple turns. Instead, it represents the history as JSON data within its own field section:

```
[[ ## history ## ]]
{"messages": [{"question": "What is the capital of Germany?", "answer": "The capital of Germany is Berlin."}]}
```

This approach ensures compatibility with standard prompt formats while still providing the model with relevant conversational context.

docs/docs/tutorials/core_development/index.md

Lines changed: 3 additions & 0 deletions
@@ -4,6 +4,9 @@ This section covers essential DSPy features and best practices for professional

 ## Integration and Tooling

+### [Managing Conversation History](../conversation_history/index.md)
+Learn how to manage conversation history in DSPy applications.
+
 ### [Use MCP in DSPy](../mcp/index.md)
 Learn to integrate Model Context Protocol (MCP) with DSPy applications. This tutorial shows how to leverage MCP for enhanced context management and more sophisticated AI interactions.

docs/docs/tutorials/index.md

Lines changed: 1 addition & 0 deletions

@@ -33,6 +33,7 @@ Welcome to DSPy tutorials! We've organized our tutorials into three main categor
     - [Finetuning Agents](games/index.ipynb)

 - Tools, Development, and Deployment
+    - [Managing Conversation History](conversation_history/index.md)
     - [Use MCP in DSPy](mcp/index.md)
     - [Output Refinement](output_refinement/best-of-n-and-refine.md)
     - [Saving and Loading](saving/index.md)

docs/docs/tutorials/streaming/index.md

Lines changed: 61 additions & 0 deletions
@@ -188,6 +188,67 @@ Final output: Prediction(
)
```

### Streaming the Same Field Multiple Times (as in `dspy.ReAct`)

By default, a `StreamListener` automatically closes itself after completing a single streaming session. This design helps prevent performance issues: every token is broadcast to all configured stream listeners, so having too many active listeners can introduce significant overhead.

However, in scenarios where a DSPy module is used repeatedly in a loop, such as with `dspy.ReAct`, you may want to stream the same field from each prediction, every time it is produced. To enable this behavior, set `allow_reuse=True` when creating your `StreamListener`. See the example below:
```python
import asyncio

import dspy

lm = dspy.LM("openai/gpt-4o-mini", cache=False)
dspy.settings.configure(lm=lm)


def fetch_user_info(user_name: str):
    """Get user information like name, birthday, etc."""
    return {
        "name": user_name,
        "birthday": "2009-05-16",
    }


def get_sports_news(year: int):
    """Get sports news for a given year."""
    if year == 2009:
        return "Usain Bolt broke the world record in the 100m race."
    return None


react = dspy.ReAct("question->answer", tools=[fetch_user_info, get_sports_news])

stream_listeners = [
    # dspy.ReAct has a built-in output field called "next_thought".
    dspy.streaming.StreamListener(signature_field_name="next_thought", allow_reuse=True),
]
stream_react = dspy.streamify(react, stream_listeners=stream_listeners)


async def read_output_stream():
    output = stream_react(question="What sports news happened in the year Adam was born?")
    return_value = None
    async for chunk in output:
        if isinstance(chunk, dspy.streaming.StreamResponse):
            print(chunk)
        elif isinstance(chunk, dspy.Prediction):
            return_value = chunk
    return return_value


print(asyncio.run(read_output_stream()))
```
247+
248+
In this example, by setting `allow_reuse=True` in the StreamListener, you ensure that streaming for "next_thought" is
249+
available for every iteration, not just the first. When you run this code, you will see the streaming tokens for `next_thought`
250+
output each time the field is produced.
251+
191252
#### Handling Duplicate Field Names
192253

193254
When streaming fields with the same name from different modules, specify both the `predict` and `predict_name` in the `StreamListener`:
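A minimal sketch of that pattern, assuming a module with two predictors that both emit a field named `answer` (the module and attribute names here are illustrative assumptions, not from this diff):

```python
import dspy


class AnswerThenSimplify(dspy.Module):
    def __init__(self):
        super().__init__()
        self.answerer = dspy.Predict("question -> answer")
        self.simplifier = dspy.Predict("draft_answer -> answer")  # same output field name

    def forward(self, question: str):
        draft = self.answerer(question=question)
        return self.simplifier(draft_answer=draft.answer)


module = AnswerThenSimplify()
stream_listeners = [
    # Disambiguate the two "answer" fields by pointing each listener at a
    # specific predictor instance (`predict`) and its attribute name (`predict_name`).
    dspy.streaming.StreamListener(signature_field_name="answer", predict=module.answerer, predict_name="answerer"),
    dspy.streaming.StreamListener(signature_field_name="answer", predict=module.simplifier, predict_name="simplifier"),
]
stream_module = dspy.streamify(module, stream_listeners=stream_listeners)
```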

docs/mkdocs.yml

Lines changed: 1 addition & 0 deletions
@@ -51,6 +51,7 @@ nav:
     - RL for Multi-Hop Research: tutorials/rl_multihop/index.ipynb
     - Tools, Development, and Deployment:
       - Overview: tutorials/core_development/index.md
+      - Managing Conversation History: tutorials/conversation_history/index.md
       - Use MCP in DSPy: tutorials/mcp/index.md
       - Output Refinement: tutorials/output_refinement/best-of-n-and-refine.md
       - Saving and Loading: tutorials/saving/index.md

dspy/adapters/types/code.py

Lines changed: 18 additions & 4 deletions
@@ -1,7 +1,8 @@
 import re
-from typing import Any
+from typing import Any, ClassVar

 import pydantic
+from pydantic import create_model

 from dspy.adapters.types.base_type import Type

@@ -23,7 +24,7 @@ class CodeGeneration(dspy.Signature):
     '''Generate python code to answer the question.'''

     question: str = dspy.InputField(description="The question to answer")
-    code: dspy.Code = dspy.OutputField(description="The code to execute")
+    code: dspy.Code["java"] = dspy.OutputField(description="The code to execute")


 predict = dspy.Predict(CodeGeneration)

@@ -43,7 +44,7 @@ class CodeGeneration(dspy.Signature):
 class CodeAnalysis(dspy.Signature):
     '''Analyze the time complexity of the function.'''

-    code: dspy.Code = dspy.InputField(description="The function to analyze")
+    code: dspy.Code["python"] = dspy.InputField(description="The function to analyze")
     result: str = dspy.OutputField(description="The time complexity of the function")

@@ -64,6 +65,8 @@ def sleepsort(x):

     code: str

+    language: ClassVar[str] = "python"
+
     def format(self):
         return f"{self.code}"

@@ -76,7 +79,8 @@ def serialize_model(self):
     def description(cls) -> str:
         return (
             "Code represented in a string, specified in the `code` field. If this is an output field, the code "
-            "should follow the markdown code block format, e.g. \n```python\n{code}\n``` or \n```cpp\n{code}\n```."
+            "field should follow the markdown code block format, e.g. \n```python\n{code}\n``` or \n```cpp\n{code}\n```"
+            f"\nProgramming language: {cls.language}"
         )

     @pydantic.model_validator(mode="before")

@@ -115,3 +119,13 @@ def _filter_code(code: str) -> str:
         return match.group(1).strip()
     # Fallback case
     return code
+
+
+# Patch __class_getitem__ directly on the class to support dspy.Code["python"] syntax
+def _code_class_getitem(cls, language):
+    code_with_language_cls = create_model(f"{cls.__name__}_{language}", __base__=cls)
+    code_with_language_cls.language = language
+    return code_with_language_cls
+
+
+Code.__class_getitem__ = classmethod(_code_class_getitem)
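With this patch, subscripting `dspy.Code` yields a dynamically created subclass whose class-level `language` is set, which `description()` then reports to the LM. A quick sketch of the resulting behavior, derived from the diff above:

```python
import dspy

# dspy.Code["java"] creates a pydantic subclass named "Code_java" whose
# `language` attribute is "java"; the base class defaults to "python".
JavaCode = dspy.Code["java"]
print(JavaCode.__name__)   # -> Code_java
print(JavaCode.language)   # -> java
print(dspy.Code.language)  # -> python
```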

dspy/adapters/utils.py

Lines changed: 8 additions & 1 deletion
@@ -171,7 +171,14 @@ def parse_value(value, annotation):

     try:
         return TypeAdapter(annotation).validate_python(candidate)
-    except pydantic.ValidationError:
+    except pydantic.ValidationError as e:
+        if issubclass(annotation, Type):
+            try:
+                # For dspy.Type, try parsing from the original value in case it has a custom parser
+                return TypeAdapter(annotation).validate_python(value)
+            except Exception:
+                raise e
+
         if origin is Union and type(None) in get_args(annotation) and str in get_args(annotation):
             return str(candidate)
         raise
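This retry path matters for `Type` subclasses that define their own pydantic parsing of raw values. A hypothetical sketch of such a type (the `Temperature` class is an assumption for illustration, using the same `Type` import the diff uses):

```python
import pydantic

from dspy.adapters.types.base_type import Type


class Temperature(Type):
    """Hypothetical custom type that can parse a raw string like '21.5C'."""

    celsius: float

    @pydantic.model_validator(mode="before")
    @classmethod
    def parse(cls, value):
        # Custom parser: accept a raw string in addition to {"celsius": ...}.
        if isinstance(value, str):
            return {"celsius": float(value.strip().rstrip("Cc"))}
        return value

    def format(self):
        return f"{self.celsius}C"
```

If the adapter's pre-processed `candidate` fails validation, `parse_value` now retries with the original `value`, giving a validator like this one a chance to run on the raw string.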
