Commit 6de1ca4

Imported changes from repo VowpalWabbit/rl_chain into rl_chain directory
1 parent dda5b1e commit 6de1ca4

19 files changed, with 3261 additions and 0 deletions
Lines changed: 23 additions & 0 deletions
@@ -0,0 +1,23 @@
+name: Unit Tests
+
+on:
+  push:
+    branches:
+    - main
+  pull_request:
+    branches:
+    - '*'
+
+jobs:
+  python-unit-test:
+    container:
+      image: python:3.8
+    runs-on: ubuntu-latest
+    steps:
+    - uses: actions/checkout@v1
+    - name: Run Tests
+      shell: bash
+      run: |
+        pip install -r requirements.txt
+        pip install pytest
+        python -m pytest tests/
Lines changed: 6 additions & 0 deletions
@@ -0,0 +1,6 @@
+**/__pycache__/**
+models/*
+logs/*
+**/*.vw
+.venv
+
Lines changed: 21 additions & 0 deletions
@@ -0,0 +1,21 @@
+MIT License
+
+Copyright (c) 2023 Vowpal Wabbit
+
+Permission is hereby granted, free of charge, to any person obtaining a copy
+of this software and associated documentation files (the "Software"), to deal
+in the Software without restriction, including without limitation the rights
+to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+copies of the Software, and to permit persons to whom the Software is
+furnished to do so, subject to the following conditions:
+
+The above copyright notice and this permission notice shall be included in all
+copies or substantial portions of the Software.
+
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+SOFTWARE.
Lines changed: 25 additions & 0 deletions
@@ -0,0 +1,25 @@
+# VW in a langchain chain
+
+Install `requirements.txt`
+
+[VowpalWabbit](https://github.com/VowpalWabbit/vowpal_wabbit)
+
+There is an example notebook (rl_chain.ipynb) with basic usage of the chain.
+
+TLDR:
+
+- Chain is initialized and creates a Vowpal Wabbit instance - only Contextual Bandits and Slates are supported for now
+- You can change the arguments at chain creation time
+- There is a default prompt but it can be changed
+- There is a default reward function that is triggered automatically and in turn triggers learning
+- This can be turned off and the score can be specified explicitly
+
+Flow:
+
+- Developer: creates chain
+- Developer: sets actions
+- Developer: calls chain with context and other prompt inputs
+- Chain: calls VW with the context and selects an action
+- Chain: action (and other vars) are passed to the LLM with the prompt
+- Chain: if the default reward is set, the LLM is called to judge the response and give it a reward score based on the context
+- Chain: VW learn is triggered with that score
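
The flow described in the README above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the rl_chain implementation: `ToyBandit` is a hypothetical epsilon-greedy stand-in for the Vowpal Wabbit instance, `ToyChain`, `set_actions`, `run`, and `learn_with_score` are invented names, and the LLM is any callable that maps a prompt string to a response string.

```python
import random
from typing import Callable, Dict, List, Optional, Tuple


class ToyBandit:
    """Hypothetical stand-in for the VW instance (not the VW API).

    Keeps a running mean reward per (context, action) pair and picks
    actions epsilon-greedily.
    """

    def __init__(self, epsilon: float = 0.1, seed: Optional[int] = None) -> None:
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        self.totals: Dict[Tuple[str, str], float] = {}
        self.counts: Dict[Tuple[str, str], int] = {}

    def _mean(self, context: str, action: str) -> float:
        n = self.counts.get((context, action), 0)
        return self.totals.get((context, action), 0.0) / n if n else 0.0

    def select(self, context: str, actions: List[str]) -> str:
        # Explore with probability epsilon, otherwise exploit the best mean.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(actions)
        return max(actions, key=lambda a: self._mean(context, a))

    def learn(self, context: str, action: str, score: float) -> None:
        key = (context, action)
        self.totals[key] = self.totals.get(key, 0.0) + score
        self.counts[key] = self.counts.get(key, 0) + 1


class ToyChain:
    """Hypothetical chain following the README flow: select, prompt, score, learn."""

    def __init__(
        self,
        llm: Callable[[str], str],
        bandit: ToyBandit,
        prompt: str = "Context: {context}\nSuggested action: {action}",
        reward_fn: Optional[Callable[[str, str], float]] = None,
    ) -> None:
        self.llm = llm
        self.bandit = bandit
        self.prompt = prompt  # a default prompt that the caller may replace
        self.reward_fn = reward_fn  # None means automatic scoring is off
        self.actions: List[str] = []

    def set_actions(self, actions: List[str]) -> None:
        self.actions = list(actions)

    def run(self, context: str, **inputs: str) -> dict:
        action = self.bandit.select(context, self.actions)
        text = self.prompt.format(context=context, action=action, **inputs)
        response = self.llm(text)
        score = None
        if self.reward_fn is not None:
            # Automatic path: score the response, then trigger learning.
            score = self.reward_fn(context, response)
            self.bandit.learn(context, action, score)
        return {"action": action, "response": response, "score": score}

    def learn_with_score(self, context: str, action: str, score: float) -> None:
        # Explicit path: the developer supplies the score.
        self.bandit.learn(context, action, score)
```

With `reward_fn` left as `None`, nothing is learned until `learn_with_score` is called, which mirrors the README's note that automatic scoring can be turned off and the score given explicitly. In the real chain the reward would come from an LLM judging the response; here it is whatever callable the caller passes in.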
