Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
19 changes: 19 additions & 0 deletions .github/scripts/clean_up_stream_table.sh
Original file line number Diff line number Diff line change
@@ -0,0 +1,19 @@
#!/bin/bash

aws kinesis delete-stream --stream-name $STREAM_NAME || true

# Delete all tables
for i in {1..10}; do
echo "Deleting table $APP_NAME"
aws dynamodb delete-table --table-name $APP_NAME && break ||
echo "Table deletion failed, attempt $i/10. Retrying DynamoDB Table deletion in $((i * 3)) seconds" && sleep $((i * 3))
done
Comment on lines +6 to +10
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Recommend making a bash function like

delete_table()
  table_name=$1
  if aws dynamodb describe-table --table-name $table_name &>/dev/null; then
    echo "Deleting table $table_name"
    # Do table deletion retries here
  else
    echo "Table $table_name does not exist and does not need to be cleaned up"
  fi

So the primary script can just make an array with the three KCL tables and call the function to reduce the code duplication

for SUFFIX in "-CoordinatorState" "-WorkerMetricStats"; do
if aws dynamodb describe-table --table-name $APP_NAME$SUFFIX &>/dev/null; then
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need to describe on these tables but not the main lease table?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The describe is used to check if the tables exist, and the main lease table basically always exists. In some cases either the CoordinatorState table or the WorkerMetricStats table aren't created by the end of the allotted runtime, and would cause problems when trying to delete a non-existent table. I now realize I could just have a blank suffix for the main lease table and have them all be in the same clean-up loop.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Its possible for the main lease table to not exist if there is a runtime error right at the beginning of KCL app and nothing is able to start up. There is a LeaseCoordinator thread which runs which creates the lease table if it doesn't exist

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll account for the edge case of the main lease table not existing by also checking for its existence

echo "Deleting table $APP_NAME$SUFFIX"
for i in {1..10}; do
aws dynamodb delete-table --table-name $APP_NAME$SUFFIX && break ||
echo "Table deletion failed, attempt $i/10. Retrying DynamoDB Table deletion in $((i * 3))s" && sleep $((i * 3))
done
fi
done
12 changes: 12 additions & 0 deletions .github/scripts/create_stream.sh
Original file line number Diff line number Diff line change
@@ -0,0 +1,12 @@
#!/bin/bash
set -e

for i in {1..10}; do
if aws kinesis create-stream --stream-name $STREAM_NAME --shard-count 1; then
break
else
echo "Stream creation failed, attempt $i/10. Waiting $((i * 3)) seconds..."
sleep $((i * 3))
fi
done
aws kinesis wait stream-exists --stream-name $STREAM_NAME
29 changes: 29 additions & 0 deletions .github/scripts/manipulate_properties.sh
Original file line number Diff line number Diff line change
@@ -0,0 +1,29 @@
#!/bin/bash
set -e

# Manipulate sample.properties file that the KCL application pulls properties from (ex: streamName, applicationName)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we ever change the sample properties, some of this may not work anymore right? I see there are some assumptions of what the sample properties should be and some explicit find and replace. Feels finicky and can be easily broken in the future if for example we add a new property. Would it be easier to just create a folder which contains the properties files we are going to use for the github actions and have the workflow use that file?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is true. I had an aversion to adding any new files since I wanted the tests to use the same files that the customers would from the readme to avoid any additional overhead/work for future changes, but this is fragile implementation and would (imo) justify "maintaining" two different samples.properties files.

# Depending on the OS, different properties need to be changed
if [[ "$RUNNER_OS" == "macOS" ]]; then
sed -i "" "s/kclpysample/$STREAM_NAME/g" samples/sample.properties
sed -i "" "s/PythonKCLSample/$APP_NAME/g" samples/sample.properties
sed -i "" 's/us-east-5/us-east-1/g' samples/sample.properties
grep -v "idleTimeBetweenReadsInMillis" samples/sample.properties > samples/temp.properties
echo "idleTimeBetweenReadsInMillis = 250" >> samples/temp.properties
mv samples/temp.properties samples/sample.properties
elif [[ "$RUNNER_OS" == "Linux" || "$RUNNER_OS" == "Windows" ]]; then
sed -i "s/kclpysample/$STREAM_NAME/g" samples/sample.properties
sed -i "s/PythonKCLSample/$APP_NAME/g" samples/sample.properties
sed -i 's/us-east-5/us-east-1/g' samples/sample.properties
sed -i "/idleTimeBetweenReadsInMillis/c\idleTimeBetweenReadsInMillis = 250" samples/sample.properties

if [[ "$RUNNER_OS" == "Windows" ]]; then
echo '@echo off' > samples/run_script.bat
echo 'python %~dp0\sample_kclpy_app.py %*' >> samples/run_script.bat
sed -i 's/executableName = sample_kclpy_app.py/executableName = samples\/run_script.bat/' samples/sample.properties
fi
else
echo "Unknown OS: $RUNNER_OS"
exit 1
fi

cat samples/sample.properties
15 changes: 15 additions & 0 deletions .github/scripts/put_words_to_stream.sh
Original file line number Diff line number Diff line change
@@ -0,0 +1,15 @@
#!/bin/bash
set -e

sample_kinesis_wordputter.py --stream $STREAM_NAME -w cat -w dog -w bird -w lobster -w octopus

# Get records from stream to verify they exist before continuing
SHARD_ITERATOR=$(aws kinesis get-shard-iterator --stream-name $STREAM_NAME --shard-id shardId-000000000000 --shard-iterator-type TRIM_HORIZON --query 'ShardIterator' --output text)
INITIAL_RECORDS=$(aws kinesis get-records --shard-iterator $SHARD_ITERATOR)
RECORD_COUNT_BEFORE=$(echo $INITIAL_RECORDS | jq '.Records | length')

if [ "$RECORD_COUNT_BEFORE" -eq 0 ]; then
echo "No records found in stream. Test cannot proceed."
exit 1
fi
echo "Found $RECORD_COUNT_BEFORE records in stream before KCL start"
28 changes: 28 additions & 0 deletions .github/scripts/start_kcl.sh
Original file line number Diff line number Diff line change
@@ -0,0 +1,28 @@
#!/bin/bash
set -e
set -o pipefail

chmod +x samples/sample.properties
chmod +x samples/sample_kclpy_app.py

# Get records from stream to verify they exist before continuing
SHARD_ITERATOR=$(aws kinesis get-shard-iterator --stream-name $STREAM_NAME --shard-id shardId-000000000000 --shard-iterator-type TRIM_HORIZON --query 'ShardIterator' --output text)
INITIAL_RECORDS=$(aws kinesis get-records --shard-iterator $SHARD_ITERATOR)
RECORD_COUNT_BEFORE=$(echo $INITIAL_RECORDS | jq '.Records | length')

echo "Found $RECORD_COUNT_BEFORE records in stream before KCL start"
Comment on lines +8 to +13
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we do this a second time? I see this is also done in put_words_stream.sh

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is a redundant check to make sure the $STREAM_NAME variable has records in the stream. Its not necessary and could be removed.


if [[ "$RUNNER_OS" == "macOS" ]]; then
brew install coreutils
KCL_COMMAND=$(amazon_kclpy_helper.py --print_command --java $(which java) --properties samples/sample.properties)
gtimeout $RUN_TIME_SECONDS $KCL_COMMAND 2>&1 | tee kcl_output.log || [ $? -eq 124 ]
elif [[ "$RUNNER_OS" == "Linux" || "$RUNNER_OS" == "Windows" ]]; then
KCL_COMMAND=$(amazon_kclpy_helper.py --print_command --java $(which java) --properties samples/sample.properties)
timeout $RUN_TIME_SECONDS $KCL_COMMAND 2>&1 | tee kcl_output.log || [ $? -eq 124 ]
else
echo "Unknown OS: $RUNNER_OS"
exit 1
fi

echo "==========ERROR LOGS=========="
grep -i error kcl_output.log || echo "No errors found in logs"
Copy link
Contributor

@ethkatnic ethkatnic Dec 2, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wonder if there's anything else we would care about besides "error" here, like "fatal", "failed" or something even more general like "exception". Don't think its needed for now and we can always look at failed runs in the future and add stuff to this grep if we find ones that time out but don't show error logs.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I hadn't found many other interesting key terms to search for other than "error", usually when something fails or is of note we don't need to check this section anyway. Like you said in the future we could expand the search but for now I think it should be good.

20 changes: 20 additions & 0 deletions .github/scripts/verify_kcl.sh
Original file line number Diff line number Diff line change
@@ -0,0 +1,20 @@
#!/bin/bash
set -e

NUM_LEASES_FOUND=$(aws dynamodb scan --table-name $APP_NAME --select "COUNT" --query "Count" --output text || echo "0")
NUM_CHECKPOINTS_FOUND=$(aws dynamodb scan --table-name $APP_NAME --select "COUNT" --filter-expression "attribute_exists(checkpoint) AND checkpoint <> :trim_horizon" --expression-attribute-values '{":trim_horizon": {"S": "TRIM_HORIZON"}}' --query "Count" --output text || echo "0")

echo "Found $NUM_LEASES_FOUND leases and $NUM_CHECKPOINTS_FOUND non-TRIM-HORIZON checkpoint in DynamoDB"

echo "Printing checkpoint values"
aws dynamodb scan --table-name $APP_NAME --projection-expression "leaseKey,checkpoint" --output json

if [ "$NUM_LEASES_FOUND" -gt 0 ] && [ "$NUM_CHECKPOINTS_FOUND" -gt 0 ]; then
echo "Test passed: Found both leases and non-TRIM_HORIZON checkpoints in DDB (KCL is fully functional)"
exit 0
else
echo "Test failed: KCL not fully functional"
echo "Lease(s) found: $NUM_LEASES_FOUND"
echo "non-TRIM_HORIZON checkpoint(s) found: $NUM_CHECKPOINTS_FOUND"
exit 1
fi
112 changes: 0 additions & 112 deletions .github/workflows/privileged-run.yml

This file was deleted.

Loading