@kahkeng kahkeng commented May 27, 2023

No description provided.


RE_COMMAND = re.compile(r"\<\|(?P<command>[^(]+)\((?P<params>[^)<{}]*)\)\|\>")

TEMPLATE = '''<user>{question}<hist>{chat_history}<task>{task_info}<bot>'''

@dogbox this is still very early WIP, but I was thinking inference could be as simple as doing something like this. This file is just a copy of RephraseWidgetSearch2 but with the TEMPLATE modified.
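For illustration, here is a minimal sketch of what filling the template at inference time might look like. The `build_prompt` helper and the sample inputs are hypothetical and not part of this PR; only `TEMPLATE` is taken from the diff.

```python
# TEMPLATE is copied from the diff; everything else is a hypothetical sketch.
TEMPLATE = '''<user>{question}<hist>{chat_history}<task>{task_info}<bot>'''


def build_prompt(question: str, chat_history: str = "", task_info: str = "") -> str:
    """Fill the template with the verbatim pieces (no compression)."""
    # str.format only interprets braces in TEMPLATE itself, so braces that
    # appear inside the substituted values are passed through unchanged.
    return TEMPLATE.format(
        question=question,
        chat_history=chat_history,
        task_info=task_info,
    )


prompt = build_prompt(
    "What is the price of ETH?",
    chat_history="User: hi\nBot: hello",
)
print(prompt)
```

The prompt ends with `<bot>`, so the fine-tuned model would be expected to continue from that marker.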

I wonder whether training data generation could use this constant, or whether it should be the other way around, with inference referencing something defined for training. We could probably also add some kind of "compression" to the chat_history/task_info pieces, but for simplicity, keeping them verbatim should probably work.
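One way the sharing could work, as a rough sketch: both training data generation and inference import the same `TEMPLATE`, and a small helper stands in for the "compression" step. The names `compress_history` and `make_training_example`, the prompt/completion record layout, and the keep-last-N-turns strategy are all assumptions made for illustration, not something in this branch.

```python
from typing import Dict, List

# TEMPLATE is copied from the diff; the helpers below are hypothetical.
TEMPLATE = '''<user>{question}<hist>{chat_history}<task>{task_info}<bot>'''


def compress_history(turns: List[str], max_turns: int = 4) -> str:
    """Naive stand-in for 'compression': keep only the most recent turns."""
    if max_turns <= 0:
        return ""
    return "\n".join(turns[-max_turns:])


def make_training_example(
    question: str, turns: List[str], task_info: str, completion: str
) -> Dict[str, str]:
    """Build one prompt/completion record using the shared TEMPLATE."""
    prompt = TEMPLATE.format(
        question=question,
        chat_history=compress_history(turns),
        task_info=task_info,
    )
    return {"prompt": prompt, "completion": completion}


example = make_training_example(
    "price of ETH?",
    ["t1", "t2", "t3", "t4", "t5"],
    "",
    "<|fetch-price(ETH)|>",  # hypothetical widget command
)
print(example["prompt"])
```

Whichever side owns the constant, keeping a single definition means the prompt seen at inference matches the one used in training.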

I decided not to use the <|im_start|> format of ChatML (https://github.com/openai/openai-python/blob/main/chatml.md) because it might get confusing with the widget commands having the same format.
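To make the concern concrete, here is a small sketch using the `RE_COMMAND` pattern from the diff. The widget name `fetch-price` and the sample strings are hypothetical. A lone ChatML token does not match, but when a ChatML token precedes a widget command, the greedy `command` group can swallow the token along with the real command name.

```python
import re

# Pattern copied from the diff; the sample strings below are hypothetical.
RE_COMMAND = re.compile(r"\<\|(?P<command>[^(]+)\((?P<params>[^)<{}]*)\)\|\>")

# A widget command on its own parses as expected.
widget = "<|fetch-price(ETH)|>"
m = RE_COMMAND.search(widget)
print(m.group("command"), m.group("params"))

# A ChatML token alone has no parentheses, so it does not match.
chatml_only = "<|im_start|>"
print(RE_COMMAND.search(chatml_only))

# Mixed output: the match starts at the ChatML token's "<|" and the
# command group runs until the first "(", capturing the token as well.
mixed = "<|im_start|>assistant\n<|fetch-price(ETH)|>"
m2 = RE_COMMAND.search(mixed)
print(repr(m2.group("command")))
```

Using distinct markers such as `<user>` and `<bot>` avoids this overlap without having to change the command regex.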

If you're working on a branch, I could try to follow along with what you have so far.

@kahkeng kahkeng force-pushed the kahkeng/finetuned branch from 3974b68 to 47dba52 Compare May 28, 2023 02:42
@kahkeng kahkeng force-pushed the kahkeng/finetuned branch from 8db93c5 to b767497 Compare May 28, 2023 22:37
@kahkeng kahkeng closed this Jun 28, 2023