Merged
97 commits
5dad714
feat: add kubernetes app role selection
kyteinsky Mar 5, 2026
089d27a
feat: add thread start and stop logic
kyteinsky Mar 5, 2026
64ffdaf
wip: migrate the indexing process
kyteinsky Mar 9, 2026
03a3f43
wip: parallelize file parsing and processing based on cpu count
kyteinsky Mar 9, 2026
0dc404b
ci: use the kubernetes branch of context_chat
kyteinsky Mar 10, 2026
c733982
fix typo
kyteinsky Mar 10, 2026
dda312f
migrate the update process to be thread based
kyteinsky Mar 11, 2026
b09a93c
fix pydantic types
kyteinsky Mar 11, 2026
11b436c
fix: use a dedicated event to allow app halt without app being disabled
kyteinsky Mar 11, 2026
c88e153
fix fetch url and pydantic types
kyteinsky Mar 11, 2026
cd5241e
fix: use the correct file id
kyteinsky Mar 11, 2026
4958d1d
fix: wip: improve embeddings exception handling
kyteinsky Mar 11, 2026
a049121
fix(ci): update to the latest changes
kyteinsky Mar 11, 2026
795380c
fix(ci): use file to store stderr
kyteinsky Mar 12, 2026
7bc0ed7
fix(ci): add cron jobs
kyteinsky Mar 12, 2026
d94c687
fix(ci): do a occ files scan before cron jobs
kyteinsky Mar 12, 2026
dadc8fa
feat: record indexing errors in content decode function
kyteinsky Mar 16, 2026
f9d86dc
chore: move file fetch inside injest
kyteinsky Mar 17, 2026
1ade191
fix: truly parallel file parsing and indexing
kyteinsky Mar 18, 2026
12fd1ca
initial pass at request processing
marcelklehr Mar 24, 2026
8aa2471
implement request processing
marcelklehr Mar 25, 2026
2093936
request processing fixes
kyteinsky Mar 26, 2026
36b5f02
chore: drop commented code
kyteinsky Mar 26, 2026
85d29f1
fix(ci): parse json output from the stats command
kyteinsky Mar 26, 2026
4c6d01b
fix: seek to 0 to read the full buffer
kyteinsky Mar 26, 2026
51774ff
fix(ci): 3% tolerance
kyteinsky Mar 26, 2026
c81b675
fix(ci): wait longer for EM server
kyteinsky Mar 26, 2026
6817f89
fix: don't process files or requests until the EM server is healthy
kyteinsky Mar 30, 2026
104a37a
tests: Increase testing time to allow backend to injest more sources
marcelklehr Apr 1, 2026
b3b461a
fix: More log statements
marcelklehr Apr 1, 2026
a4a88da
tests: Set wait time back to 90
marcelklehr Apr 1, 2026
0c52747
fix: Reduce worker count on github actions
marcelklehr Apr 1, 2026
e676c32
fix(exec_in_proc): Raise RuntimeError if exitcode is non-zero
marcelklehr Apr 1, 2026
b027ff3
fix(indexing): Reduce memory pressure on gh actions
marcelklehr Apr 1, 2026
19b773f
fix(indexing): Fallback to batch_size=1 if embed_sources is killed
marcelklehr Apr 1, 2026
bde0bc5
fix: log stdout and stderr from subprocesses
kyteinsky Apr 2, 2026
4de591f
fix: don't raise before std* is captured
kyteinsky Apr 2, 2026
4deda84
feat: log cpu count and memory info of the system
kyteinsky Apr 2, 2026
ad0eac7
fix: catch BaseException in subprocess
kyteinsky Apr 2, 2026
36bcfb7
fix(utils): Improve exec_in_proc to handle more failure modes
marcelklehr Apr 2, 2026
47eaf72
one more stab at a fix
kyteinsky Apr 3, 2026
309ab2b
do not throw away the valid result even with exitcode 1
kyteinsky Apr 3, 2026
e1763ac
fix: use forkserver as process start method
kyteinsky Apr 3, 2026
3301652
fix(ci): consider eligible files as the total files count
kyteinsky Apr 3, 2026
32aa374
fix: use logging config in forkserver and other fixes
kyteinsky Apr 3, 2026
33ee38a
fix: remove extra diagnostics
kyteinsky Apr 3, 2026
d9ebdac
fix: use zip on the subset of filtered sources
kyteinsky Apr 3, 2026
ea77480
fix(em): use tcp socket connection check
kyteinsky Apr 3, 2026
1ce237a
fix(ci): remove github CI restrictions
kyteinsky Apr 3, 2026
d82e01b
fix: remove unused code and some de-duplication
kyteinsky Apr 3, 2026
286db22
fix(mp): run repairs and config file check only in MainProcess
kyteinsky Apr 3, 2026
726eb64
fix: attach source_ids as keys in json logs
kyteinsky Apr 7, 2026
073f9d0
fix(ci): upload db dump artifacts
kyteinsky Apr 7, 2026
13ea740
fix: retry PGVector object creation if table already exists
kyteinsky Apr 7, 2026
dcb04e7
fix: unique db dump artifact id
kyteinsky Apr 7, 2026
dc1d57b
fix(ci): log stats before exit
kyteinsky Apr 7, 2026
eae1cd4
fix: mark unembeddable files as such
kyteinsky Apr 9, 2026
7b10b27
chore: migrate default values in the type definition
kyteinsky Apr 9, 2026
8b4d260
chore(config): add config entries for tunables
kyteinsky Apr 9, 2026
e4be682
fix: ignore SIGTERM and SIGINT for subprocesses
kyteinsky Apr 9, 2026
d7c9e4f
fix: cleanup request processing, get template from config
kyteinsky Apr 9, 2026
da680e3
fix: explicit check for non-None response
kyteinsky Apr 10, 2026
ecf07c4
fix: add default value of limit in search task type
kyteinsky Apr 15, 2026
531e581
fix(bg_threads): Poll app enabled state every 30s in all threads
marcelklehr Apr 15, 2026
7337d17
fix(app_enabled): centralize app_enabled check
marcelklehr Apr 15, 2026
7c1cf45
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Apr 15, 2026
5e9eb76
pyright and ruff fixes
kyteinsky Apr 20, 2026
04d7fe1
fix(k8s): do not register task proc trigger endpoint
kyteinsky Apr 20, 2026
2fbf9fc
fix(k8s): do not start internal pgsql in k8s env
kyteinsky Apr 20, 2026
cf98ab2
fix(k8s): log exclusively to stderr for k8s env
kyteinsky Apr 21, 2026
6e7f20b
chore: separate out updates processing in an app role
kyteinsky Apr 21, 2026
309de32
fix(k8s): app role fixes
kyteinsky Apr 21, 2026
7b5020e
fix: scoped context search fixes
kyteinsky Apr 21, 2026
bdd842f
chore: better naming of app roles
kyteinsky Apr 22, 2026
f691743
fix(app_enabled): Add lock for app enabled check
marcelklehr Apr 22, 2026
17aa810
feat: build llama cpp python and add cpu/cuda/vulkan builds
kyteinsky Apr 23, 2026
10092cb
feat(ci): add kubernetes integration test
kyteinsky Apr 23, 2026
6819902
fix: checkout app_api
kyteinsky Apr 23, 2026
2376535
fix: cache docker build image
kyteinsky Apr 23, 2026
cf6ba4c
fix: correct info.xml path + register command fixes
kyteinsky Apr 23, 2026
7413e5e
fix: use gha as cache backend for docker images
kyteinsky Apr 23, 2026
4f86758
fix: replace role names in info.xml
kyteinsky Apr 23, 2026
9ca5dd7
fix: use local tag so image is not pulled from remote
kyteinsky Apr 23, 2026
41c85fe
chore: show HaRP container's logs
kyteinsky Apr 23, 2026
cfbc2a1
fix: add ghcr.io to the docker image name
kyteinsky Apr 23, 2026
25ba688
tests(k8s): make php listen on all interfaces
marcelklehr Apr 23, 2026
3df8fdc
fix(ci): use NODE_IP to reach the vector db
kyteinsky Apr 28, 2026
fd12d84
fix(ci): increase timeout for context chat stats and handle exit stat…
kyteinsky Apr 28, 2026
98d2765
fix(ci): checkout the correct branch of app_api
kyteinsky Apr 28, 2026
5054791
fix(ci): show all k8s pods' logs
kyteinsky Apr 28, 2026
f74908b
fix(ci): app_api branch translation + only run k8s for master
kyteinsky Apr 28, 2026
d3f9575
fix(ci): separate prompt responses in groups
kyteinsky Apr 28, 2026
c9edba6
fix(ci): always dump db
kyteinsky Apr 28, 2026
7da999a
fix(context): break the loop after the first chunk does not fit in th…
kyteinsky Apr 29, 2026
4bbb08c
chore(context): increase default context size to 16384
kyteinsky Apr 29, 2026
205dba7
chore: increase context chunks fetched to 30
kyteinsky Apr 29, 2026
1c6a929
feat(ci): add kubernetes integration test (#293)
marcelklehr Apr 30, 2026
450 changes: 450 additions & 0 deletions .github/workflows/integration-test-k8s.yml

Large diffs are not rendered by default.

144 changes: 115 additions & 29 deletions .github/workflows/integration-test.yml
@@ -89,7 +89,7 @@ jobs:
POSTGRES_USER: root
POSTGRES_PASSWORD: rootpassword
POSTGRES_DB: nextcloud
options: --health-cmd pg_isready --health-interval 5s --health-timeout 2s --health-retries 5
options: --health-cmd pg_isready --health-interval 5s --health-timeout 2s --health-retries 5 --name postgres --hostname postgres

steps:
- name: Checkout server
@@ -120,6 +120,14 @@ jobs:
path: context_chat_backend/
persist-credentials: false

- name: Checkout app_api
uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4
with:
repository: nextcloud/app_api
ref: ${{ matrix.server-versions == 'master' && 'main' || matrix.server-versions }}
path: apps/app_api
persist-credentials: false
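
The `ref:` expression in the "Checkout app_api" step uses the GitHub Actions `cond && a || b` idiom to translate the server branch name: `master` maps to `main`, and every other matrix value is passed through unchanged. A minimal shell sketch of the same mapping, where the function name `app_api_ref` is hypothetical and not part of the workflow:

```shell
# Hypothetical helper mirroring the `ref:` expression of the checkout step.
# master -> main; any other server version is used as the app_api branch as-is.
app_api_ref() {
  if [ "$1" = "master" ]; then
    echo "main"
  else
    echo "$1"
  fi
}
```

For example, `app_api_ref master` prints `main`, while a stable branch name is echoed back unchanged.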

- name: Get app version
id: appinfo
uses: skjnldsv/xpath-action@7e6a7c379d0e9abc8acaef43df403ab4fc4f770c # master
@@ -167,6 +175,10 @@ jobs:
cd ..
rm -rf documentation

- name: Run files scan
run: |
./occ files:scan --all

- name: Setup python 3.11
uses: actions/setup-python@42375524e23c412d93fb67b49958b491fce71c38 # v5
with:
@@ -195,42 +207,109 @@ jobs:
timeout 10 ./occ app_api:daemon:register --net host manual_install "Manual Install" manual-install http localhost http://localhost:8080
timeout 120 ./occ app_api:app:register context_chat_backend manual_install --json-info "{\"appid\":\"context_chat_backend\",\"name\":\"Context Chat Backend\",\"daemon_config_name\":\"manual_install\",\"version\":\"${{ fromJson(steps.appinfo.outputs.result).version }}\",\"secret\":\"12345\",\"port\":10034,\"scopes\":[],\"system_app\":0}" --force-scopes --wait-finish
ls -la context_chat_backend/persistent_storage/*
sleep 30 # Wait for the em server to get ready

- name: Scan files, baseline
run: |
./occ files:scan admin
./occ context_chat:scan admin -m text/plain

- name: Check python memory usage
- name: Initial memory usage check
run: |
ps -p $(cat pid.txt) -o pid,cmd,%mem,rss --sort=-%mem
ps -p $(cat pid.txt) -o %mem --no-headers > initial_mem.txt

- name: Scan files
- name: Run cron jobs
run: |
# every 10 seconds indefinitely
while true; do
php cron.php
sleep 10
done &
sleep 30
# list all the bg jobs
./occ background-job:list
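
The "Run cron jobs" step starts an unbounded background loop and leaves it running for the rest of the job. If the loop ever needs to be stopped from a later step (each step runs in a fresh shell, so `$!` is not available there), one option is to record its PID in a file. A minimal bash sketch under that assumption; the helper names and the PID-file path are hypothetical and not part of the workflow:

```shell
# Hypothetical helpers; names and the PID-file location are illustrative only.

# start_periodic PIDFILE INTERVAL COMMAND [ARGS...]
# Runs COMMAND every INTERVAL seconds in a background subshell and
# writes the subshell's PID to PIDFILE.
start_periodic() {
  local pidfile=$1 interval=$2
  shift 2
  ( while true; do "$@"; sleep "$interval"; done ) &
  echo $! > "$pidfile"
}

# stop_periodic PIDFILE
# Stops the loop started by start_periodic; a missing PID file is not an error.
stop_periodic() {
  local pidfile=$1 pid
  pid=$(cat "$pidfile" 2>/dev/null) || return 0
  kill "$pid" 2>/dev/null || true
  rm -f "$pidfile"
}
```

In the workflow this would look like `start_periodic /tmp/cron_loop.pid 10 php cron.php` in one step and `stop_periodic /tmp/cron_loop.pid` in a later one.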

- name: Initial dump of DB with context_chat_queue populated
if: always()
run: |
./occ files:scan admin
./occ context_chat:scan admin -m text/markdown &
./occ context_chat:scan admin -m text/x-rst
docker exec postgres pg_dump nextcloud > /tmp/0_pgdump_nextcloud

- name: Check python memory usage
- name: Periodically check context_chat stats for 15 minutes to allow the backend to index the files
run: |
ps -p $(cat pid.txt) -o pid,cmd,%mem,rss --sort=-%mem
ps -p $(cat pid.txt) -o %mem --no-headers > after_scan_mem.txt
success=0
echo "::group::Checking stats periodically for 15 minutes to allow the backend to index the files"
for i in {1..90}; do
echo "Checking stats, attempt $i..."

stats_err=$(mktemp)
stats_exit=0
stats=$(timeout 30 ./occ context_chat:stats --json 2>"$stats_err") || stats_exit=$?
echo "Stats output:"
echo "$stats"
if [ -s "$stats_err" ]; then
echo "Stderr:"
cat "$stats_err"
fi
echo "---"
rm -f "$stats_err"

# Check for critical errors in output
if [ $stats_exit -ne 0 ] || echo "$stats" | grep -q "Error during request"; then
echo "Backend connection error detected (exit=$stats_exit), retrying..."
sleep 10
continue
fi

# Extract total eligible files
total_eligible_files=$(echo "$stats" | jq '.eligible_files_count' || echo "")

# Extract indexed documents count (files__default)
indexed_count=$(echo "$stats" | jq '.vectordb_document_counts.files__default' || echo "")

echo "Total eligible files: $total_eligible_files"
echo "Indexed documents (files__default): $indexed_count"

diff=$((total_eligible_files - indexed_count))
threshold=$((total_eligible_files * 3 / 100))

# Check if difference is within tolerance
if [ $diff -le $threshold ]; then
echo "Indexing within 3% tolerance (diff=$diff, threshold=$threshold)"
success=1
break
else
progress=$((diff * 100 / total_eligible_files))
echo "Outside 3% tolerance: diff=$diff (${progress}%), threshold=$threshold"
fi

# Check if backend is still alive
ccb_alive=$(ps -p $(cat pid.txt) -o cmd= | grep -c "main.py" || echo "0")
if [ "$ccb_alive" -eq 0 ]; then
echo "Error: Context Chat Backend process is not running. Exiting."
exit 1
fi

sleep 10
done

echo "::endgroup::"

if [ $success -ne 1 ]; then
echo "Max attempts reached"
exit 1
fi
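
Two details of the polling step above look fragile, as far as can be told from the diff. First, `jq` prints `null` for a missing key, and bash arithmetic treats `null` as an unset variable that evaluates to 0, so a stats payload without the expected keys would yield `diff=0`, `threshold=0` and count as success. Second, `grep -c` already prints `0` (and exits non-zero) when nothing matches, so `|| echo "0"` produces a two-line value that `[ ... -eq 0 ]` cannot parse, and the liveness check would not fire. A hedged bash sketch of stricter versions of both checks; the function names are hypothetical and the code assumes a Linux `ps` that accepts `-o cmd=`, as the workflow does:

```shell
# Hypothetical helpers; not part of the workflow.

# True only for a non-negative integer (rejects "", "null", "12a").
is_uint() {
  case "$1" in
    ''|*[!0-9]*) return 1 ;;
    *) return 0 ;;
  esac
}

# within_tolerance TOTAL INDEXED PERCENT
# Returns 0 when (TOTAL - INDEXED) <= TOTAL * PERCENT / 100,
# 1 when outside the tolerance, and 2 when the inputs are unusable
# (non-numeric values or a TOTAL of zero).
within_tolerance() {
  local total=$1 indexed=$2 percent=$3
  is_uint "$total" && is_uint "$indexed" && is_uint "$percent" || return 2
  [ "$total" -gt 0 ] || return 2
  local diff=$(( total - indexed ))
  local threshold=$(( total * percent / 100 ))
  [ "$diff" -le "$threshold" ]
}

# process_alive PID PATTERN
# Returns 0 when a process with that PID exists and its command line
# contains PATTERN; returns 1 otherwise.
process_alive() {
  local cmd
  cmd=$(ps -p "$1" -o cmd= 2>/dev/null) || return 1
  case "$cmd" in
    *"$2"*) return 0 ;;
    *) return 1 ;;
  esac
}
```

In the loop, `within_tolerance "$total_eligible_files" "$indexed_count" 3` could replace the inline arithmetic, with return code 2 handled like the "retrying" branch, and `process_alive "$(cat pid.txt)" main.py || exit 1` could replace the `grep -c` check.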

- name: Run the prompts
run: |
./occ background-job:worker 'OC\TaskProcessing\SynchronousBackgroundJob' > worker1_logs 2>&1 &
./occ background-job:worker 'OC\TaskProcessing\SynchronousBackgroundJob' > worker2_logs 2>&1 &

echo ::group::English prompt
OUT1=$(./occ context_chat:prompt admin "Which factors are taken into account for the Ethical AI Rating?")
echo "$OUT1"
echo '--------------------------------------------------'
echo "$OUT1" | grep -q "If all of these points are met, we give a Green label." || exit 1
echo ::endgroup::

echo ::group::German prompt
OUT2=$(./occ context_chat:prompt admin "Welche Faktoren beeinflussen das Ethical AI Rating?")
echo "$OUT2"

echo "$OUT1" | grep -q "If all of these points are met, we give a Green label." || exit 1
echo "$OUT2" | grep -q "If all of these points are met, we give a Green label." || exit 1
echo ::endgroup::

- name: Check python memory usage
run: |
@@ -250,18 +329,10 @@ jobs:
echo "Memory usage during scan is stable. No memory leak detected."
fi
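
The memory checks in this workflow compare two `%mem` readings, which `ps` reports as decimals, so plain `[ ... -gt ... ]` does not apply; the `bc -l` form is visible in the "Compare memory usage and detect leak" step. A small sketch of the same comparison using `awk`, for runners where `bc` may be missing. The function name `mem_increased` is hypothetical:

```shell
# Hypothetical helper; not part of the workflow.
# mem_increased BEFORE AFTER
# Returns 0 when AFTER is strictly greater than BEFORE (decimal values allowed).
mem_increased() {
  awk -v before="$1" -v after="$2" 'BEGIN { exit !((after + 0) > (before + 0)) }'
}
```

Usage would follow the existing pattern: `if mem_increased "$initial_mem" "$final_mem"; then echo "Possible memory leak detected!"; fi`.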

- name: Compare memory usage and detect leak
- name: Final dump of DB with vectordb populated
if: always()
run: |
initial_mem=$(cat after_scan_mem.txt | tr -d ' ')
final_mem=$(cat after_prompt_mem.txt | tr -d ' ')
echo "Initial Memory Usage: $initial_mem%"
echo "Memory Usage after prompt: $final_mem%"

if (( $(echo "$final_mem > $initial_mem" | bc -l) )); then
echo "Memory usage has increased during prompt. Possible memory leak detected!"
else
echo "Memory usage during prompt is stable. No memory leak detected."
fi
docker exec postgres pg_dump nextcloud > /tmp/1_pgdump_nextcloud

- name: Show server logs
if: always()
@@ -298,6 +369,21 @@ jobs:
run: |
tail -v -n +1 context_chat_backend/persistent_storage/logs/em_server.log* || echo "No logs in logs directory"

- name: Upload database dumps
uses: actions/upload-artifact@v4
if: always()
with:
name: database-dumps-${{ matrix.server-versions }}-php@${{ matrix.php-versions }}
path: |
/tmp/0_pgdump_nextcloud
/tmp/1_pgdump_nextcloud

- name: Final stats log
if: always()
run: |
./occ context_chat:stats
./occ context_chat:stats --json

summary:
permissions:
contents: none