Add CI workflows to provision Azure VM and run mshv unit tests #213

.github/workflows/mshv-infra.yaml
@@ -0,0 +1,229 @@
name: MSHV Infra Setup

on:
  workflow_call:
    inputs:
      ARCH:
        description: 'Architecture for the VM'
        required: true
        type: string
      KEY:
        description: 'SSH Key Name'
        required: true
        type: string
      OS_DISK_SIZE:
        description: 'OS Disk Size in GB'
        required: true
        type: string
      RG:
        description: 'Resource Group Name'
        required: true
        type: string
      VM_SKU:
        description: 'VM SKU'
        required: true
        type: string
    secrets:
      MI_CLIENT_ID:
        required: true
      RUNNER_RG:
        required: true
      STORAGE_ACCOUNT_PATHS:
        required: true
      ARCH_SOURCE_PATH:
        required: true
      USERNAME:
        required: true
    outputs:
      PRIVATE_IP:
        description: 'Private IP of the VM'
        value: ${{ jobs.infra-setup.outputs.PRIVATE_IP }}

concurrency:
  group: ${{ github.workflow }}-${{ github.ref }}
  cancel-in-progress: true

jobs:
  infra-setup:
    name: ${{ inputs.ARCH }} VM Provision
    runs-on:
      - self-hosted
      - Linux
    outputs:
      PRIVATE_IP: ${{ steps.get-vm-ip.outputs.PRIVATE_IP }}
    steps:
      - name: Install & login to AZ CLI
        env:
          MI_CLIENT_ID: ${{ secrets.MI_CLIENT_ID }}
        run: |
          set -e
          echo "Installing Azure CLI if not already installed"
          if ! command -v az &>/dev/null; then
            curl -sL https://aka.ms/InstallAzureCLIDeb | sudo bash
          else
            echo "Azure CLI already installed"
          fi
          az --version
          echo "Logging into Azure CLI using Managed Identity"
          az login --identity --client-id ${MI_CLIENT_ID}

      - name: Get Location
        id: get-location
        env:
          SKU: ${{ inputs.VM_SKU }}
          STORAGE_ACCOUNT_PATHS: ${{ secrets.STORAGE_ACCOUNT_PATHS }}
        run: |
          set -e
          # Extract vCPU count from SKU (e.g., "Standard_D2s_v3" => 2)
          vcpu=$(echo "$SKU" | sed -n 's/^Standard_[A-Za-z]\+\([0-9]\+\).*/\1/p')
          if [[ -z "$vcpu" ]]; then
            echo "Cannot extract vCPU count from SKU: $SKU"
            exit 1
          fi

          SUPPORTED_LOCATIONS=$(echo "$STORAGE_ACCOUNT_PATHS" | jq -r 'to_entries[] | .key')

          for location in $SUPPORTED_LOCATIONS; do
            family=$(az vm list-skus --size "$SKU" --location "$location" --resource-type "virtualMachines" --query '[0].family' -o tsv)
            if [[ -z "$family" ]]; then
              echo "Cannot determine VM family for SKU: $SKU in $location"
              continue
            fi

            usage=$(az vm list-usage --location "$location" --query "[?name.value=='$family'] | [0]" -o json)
            current=$(echo "$usage" | jq -r '.currentValue')
            limit=$(echo "$usage" | jq -r '.limit')

            if [[ $((limit - current)) -ge $vcpu ]]; then
              echo "Sufficient quota found in $location"
              echo "location=$location" >> "$GITHUB_OUTPUT"
              exit 0
            fi
          done

          echo "No location found with sufficient vCPU quota for SKU: $SKU"
          exit 1

      - name: Create Resource Group
        id: rg-setup
        env:
          LOCATION: ${{ steps.get-location.outputs.location }}
          RG: ${{ inputs.RG }}
          STORAGE_ACCOUNT_PATHS: ${{ secrets.STORAGE_ACCOUNT_PATHS }}
        run: |
          set -e
          echo "Creating Resource Group: $RG"
          # Create the resource group
          echo "Creating resource group in location: ${LOCATION}"
          az group create --name ${RG} --location ${LOCATION}
          echo "Resource group created successfully."

      - name: Generate SSH Key
        id: generate-ssh-key
        env:
          KEY: ${{ inputs.KEY }}
        run: |
          set -e
          echo "Generating SSH key: $KEY"
          mkdir -p ~/.ssh
          ssh-keygen -t rsa -b 4096 -f ~/.ssh/${KEY} -N ""

      - name: Create VM
        id: vm-setup
        env:
          KEY: ${{ inputs.KEY }}
          LOCATION: ${{ steps.get-location.outputs.location }}
          OS_DISK_SIZE: ${{ inputs.OS_DISK_SIZE }}
          RG: ${{ inputs.RG }}
          RUNNER_RG: ${{ secrets.RUNNER_RG }}
          USERNAME: ${{ secrets.USERNAME }}
          VM_SKU: ${{ inputs.VM_SKU }}
          VM_IMAGE_NAME: ${{ inputs.ARCH }}_${{ steps.get-location.outputs.location }}_image
          VM_NAME: ${{ inputs.ARCH }}_${{ steps.get-location.outputs.location }}_${{ github.run_id }}
        run: |
          set -e
          echo "Creating $VM_SKU VM: $VM_NAME"

          # Extract subnet ID from the runner VM
          echo "Retrieving subnet ID..."
          SUBNET_ID=$(az network vnet list --resource-group ${RUNNER_RG} --query "[?contains(location, '${LOCATION}')].{SUBNETS:subnets}" | jq -r ".[0].SUBNETS[0].id")
          if [[ -z "${SUBNET_ID}" ]]; then
            echo "ERROR: Failed to retrieve Subnet ID."
            exit 1
          fi

          # Extract image ID from the runner VM
          echo "Retrieving image ID..."
          IMAGE_ID=$(az image show --resource-group ${RUNNER_RG} --name ${VM_IMAGE_NAME} --query "id" -o tsv)
          if [[ -z "${IMAGE_ID}" ]]; then
            echo "ERROR: Failed to retrieve Image ID."
            exit 1
          fi

          # Create VM
          az vm create \
            --resource-group ${RG} \
            --name ${VM_NAME} \
            --subnet ${SUBNET_ID} \
            --size ${VM_SKU} \
            --location ${LOCATION} \
            --image ${IMAGE_ID} \
            --os-disk-size-gb ${OS_DISK_SIZE} \
            --public-ip-sku Standard \
            --storage-sku Premium_LRS \
            --public-ip-address "" \
            --admin-username ${USERNAME} \
            --ssh-key-value ~/.ssh/${KEY}.pub \
            --security-type Standard \
            --output json

          echo "VM creation process completed successfully."

      - name: Get VM Private IP
        id: get-vm-ip
        env:
          RG: ${{ inputs.RG }}
          VM_NAME: ${{ inputs.ARCH }}_${{ steps.get-location.outputs.location }}_${{ github.run_id }}
        run: |
          set -e
          echo "Retrieving VM private IP address..."
          # Retrieve VM private IP address
          PRIVATE_IP=$(az vm show -g ${RG} -n ${VM_NAME} -d --query privateIps -o tsv)
          if [[ -z "$PRIVATE_IP" ]]; then
            echo "ERROR: Failed to retrieve private IP address."
            exit 1
          fi
          echo "PRIVATE_IP=$PRIVATE_IP" >> $GITHUB_OUTPUT

      - name: Wait for SSH availability
        env:
          KEY: ${{ inputs.KEY }}
          PRIVATE_IP: ${{ steps.get-vm-ip.outputs.PRIVATE_IP }}
          USERNAME: ${{ secrets.USERNAME }}
        run: |
          echo "Waiting for SSH to be accessible..."
          timeout 120 bash -c 'until ssh -o StrictHostKeyChecking=no -i ~/.ssh/${KEY} ${USERNAME}@${PRIVATE_IP} "exit" 2>/dev/null; do sleep 5; done'
          echo "VM is accessible!"

      - name: Remove Old Host Key
        env:
          PRIVATE_IP: ${{ steps.get-vm-ip.outputs.PRIVATE_IP }}
        run: |
          set -e
          echo "Removing the old host key"
          ssh-keygen -R $PRIVATE_IP

      - name: SSH into VM and Install Dependencies
        env:
          KEY: ${{ inputs.KEY }}
          PRIVATE_IP: ${{ steps.get-vm-ip.outputs.PRIVATE_IP }}
          USERNAME: ${{ secrets.USERNAME }}
        run: |
          set -e
          ssh -i ~/.ssh/${KEY} -o StrictHostKeyChecking=no ${USERNAME}@${PRIVATE_IP} << EOF
            set -e
            echo "Logged in successfully."
            echo "Installing dependencies..."
            sudo tdnf install -y git moby-engine moby-cli clang llvm pkg-config make gcc glibc-devel
            echo "Installing Rust..."
            curl -sSf https://sh.rustup.rs | sh -s -- --default-toolchain stable --profile default -y
            export PATH="\$HOME/.cargo/bin:\$PATH"
            cargo --version
          EOF

@@ -0,0 +1,96 @@
name: Build & Test MSHV Crate

on:
  pull_request:
  workflow_dispatch:
    inputs:
      branch:
        description: 'Branch to build and test'
        required: true
        default: 'main'

jobs:
  infra-setup:
    name: MSHV Infra Setup (x86_64)
    uses: ./.github/workflows/mshv-infra.yaml
    with:
      ARCH: x86_64
      KEY: azure_key_${{ github.run_id }}
      OS_DISK_SIZE: 512
      RG: RUST-VMM-MSHV-${{ github.run_id }}
      VM_SKU: Standard_D16s_v5
    secrets:
      MI_CLIENT_ID: ${{ secrets.MSHV_MI_CLIENT_ID }}
      RUNNER_RG: ${{ secrets.MSHV_RUNNER_RG }}
      STORAGE_ACCOUNT_PATHS: ${{ secrets.MSHV_STORAGE_ACCOUNT_PATHS }}
      ARCH_SOURCE_PATH: ${{ secrets.MSHV_X86_SOURCE_PATH }}
      USERNAME: ${{ secrets.MSHV_USERNAME }}

  build-test:
    name: Build & test
    needs: infra-setup
    if: ${{ always() && needs.infra-setup.result == 'success' }}
    runs-on:
      - self-hosted
      - Linux
    steps:
      - name: Determine branch to build
        run: |
          echo "Determining branch to build and test..."
          if [[ "${{ github.event_name }}" == "pull_request" ]]; then
            echo "BRANCH=${{ github.event.pull_request.head.ref }}" >> $GITHUB_ENV
          else
            echo "BRANCH=${{ inputs.branch }}" >> $GITHUB_ENV
          fi

      - name: Build & Run tests on remote VM
        env:
          BRANCH_NAME: ${{ env.BRANCH }}
          KEY: azure_key_${{ github.run_id }}
          PR_NUMBER: ${{ github.event.pull_request.number }}
          PRIVATE_IP: ${{ needs.infra-setup.outputs.PRIVATE_IP }}
          RG: MSHV-${{ github.run_id }}
          USERNAME: ${{ secrets.MSHV_USERNAME }}
        run: |
          set -e
          echo "Connecting to the VM via SSH..."
          ssh -i ~/.ssh/${KEY} -o StrictHostKeyChecking=no ${USERNAME}@${PRIVATE_IP} << EOF
            set -e
            echo "Logged in successfully."
            export PATH="\$HOME/.cargo/bin:\$PATH"
            echo "${BRANCH_NAME}"
            git clone --depth 1 --single-branch --branch "$BRANCH_NAME" https://github.com/rust-vmm/mshv.git
            cd mshv
            cargo build --all-features --workspace
            cargo test --all-features --workspace
          EOF
          echo "Build and test completed successfully."

  cleanup:
    name: Cleanup
    needs: build-test
    if: always()
    runs-on:
      - self-hosted
      - Linux
    steps:
      - name: Delete RG
        env:
          RG: RUST-VMM-MSHV-${{ github.run_id }}
        run: |
          if az group exists --name ${RG}; then
            az group delete --name ${RG} --yes --no-wait

[Inline review thread on the line above]
This can fail right? Do we have another service which periodically goes through all the stale resource groups and cleans them up?
I mean, cleanup executes.
Agreed, we can add a separate cleanup workflow (cron) for this. We can have a separate PR for this. (A sketch of such a scheduled workflow follows this file's diff.)

          else
            echo "Resource Group ${RG} does not exist. Skipping deletion."
          fi
          echo "Cleanup process completed."

      - name: Delete SSH Key
        env:
          KEY: azure_key_${{ github.run_id }}
        run: |
          if [ -f ~/.ssh/${KEY} ]; then
            rm -f ~/.ssh/${KEY} ~/.ssh/${KEY}.pub
            echo "SSH key deleted successfully."
          else
            echo "SSH key does not exist. Skipping deletion."
          fi
          echo "Cleanup process completed."

Review comments:

Just one meta question: all these runs are cancellable, right? So if I update the PR, the previous run gets cancelled, the resource group gets deleted, and so on?

Yes, these runs can be cancelled. The cleanup job runs at the end by default even if the workflow is cancelled by the latest PR update. So, if a new commit is pushed, GitHub automatically cancels the previous run.
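
For context on that answer, the pattern it relies on is already in this PR: the concurrency block in mshv-infra.yaml cancels an in-progress run when a new commit lands on the same ref, and the always()-gated cleanup job still runs after a failure or cancellation of build-test. A minimal, self-contained illustration of that pattern (not part of this PR; the runner label and step bodies are placeholders):

# Illustration only: cancel superseded runs, but always run cleanup.
name: Cancel And Cleanup Pattern

on:
  pull_request:

concurrency:
  group: ${{ github.workflow }}-${{ github.ref }}
  cancel-in-progress: true

jobs:
  work:
    runs-on: ubuntu-latest
    steps:
      - run: echo "provision VM, build, test..."

  cleanup:
    needs: work
    # always() keeps this job scheduled even if "work" fails or the run is cancelled.
    if: always()
    runs-on: ubuntu-latest
    steps:
      - run: echo "delete resource group, remove SSH key..."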