Message compaction / summarisation for long-running conversation #1277
atensity started this conversation in Show and tell
Replies: 1 comment
- This is gold!!!
A lot of the time, when a conversation goes on for very long, or when I use gptel-agent and there are a ton of tool calls (e.g. Read tool calls), it becomes both expensive and difficult for the LLM to keep all the (correct) information in its context. Many other providers, such as Anthropic/Claude, have message compaction systems to avoid context rot.
So I thought I could try implementing something similar here. More or less: we set a context-percentage limit at which compaction should occur (e.g. 75% of the context limit), estimate the token count from the character count (roughly a 1:4 ratio), and hook into gptel-post-response-functions to determine whether compaction needs to be done. I have been using it so far, and it seems to do a decent job. One thing I noticed is that gptel-agent's sub-agents initially weren't affected by this, so I've made adjustments that try to run the same compaction in long-running sub-agents invoked via the "Agent" tool.
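The trigger logic described above can be sketched roughly like this; apart from gptel's gptel-post-response-functions hook, all names here (the my/ variables and the my/compact-conversation summarisation step) are hypothetical placeholders, not the actual implementation:

```elisp
;; Rough sketch of the compaction trigger.  Only
;; `gptel-post-response-functions' is a real gptel hook; everything
;; else is a made-up name for illustration.
(defvar my/compaction-context-limit 200000
  "Assumed context window of the model, in tokens.")

(defvar my/compaction-threshold 0.75
  "Fraction of the context window at which compaction should occur.")

(defun my/estimate-token-count ()
  "Estimate the token count of the current buffer.
Uses the rough heuristic of one token per four characters."
  (/ (buffer-size) 4))

(defun my/maybe-compact (_beg _end)
  "Trigger conversation compaction once past the threshold.
Meant to go on `gptel-post-response-functions', which calls it with
the start and end positions of the response (ignored here)."
  (when (> (my/estimate-token-count)
           (* my/compaction-threshold my/compaction-context-limit))
    (my/compact-conversation)))  ; hypothetical summarisation step

(add-hook 'gptel-post-response-functions #'my/maybe-compact)
```

The actual summarisation step would replace older messages with an LLM-generated summary; how exactly to splice that back into the buffer is the fiddly part.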
Note: A lot of this was done with the help of LLMs (using gptel + introspector, of course!), as my Emacs Lisp skills aren't up to par. I'm sure there's a lot of room for improvement, but I thought I'd share it anyway in case people can get some use out of it, or even improve upon it!