diff --git a/ai-quick-actions/multimodel-deployment-tips.md b/ai-quick-actions/multimodel-deployment-tips.md
index 191c5c63..7cdc1005 100644
--- a/ai-quick-actions/multimodel-deployment-tips.md
+++ b/ai-quick-actions/multimodel-deployment-tips.md
@@ -1,40 +1,167 @@
-# **AI Quick Actions MultiModel Deployment (Available through CLI only)**
+# **Multi-Model Deployment with AI Quick Actions (AQUA)**
+Multi-Model inference and serving refers to efficiently hosting and managing multiple large language models (LLMs) on a single compute shape on Oracle Cloud Infrastructure. This allows inference requests to be served by multiple models hosted on one compute instance, through a single model deployment endpoint.
-# Table of Contents
-- # Introduction to MultiModel Deployment and Serving
-- [Models](#models)
+We use the `model` parameter within request payloads to reach a specific model. The Data Science service has a prebuilt **vLLM service container** that makes deploying and serving multiple large language models on a **single GPU Compute shape** very easy, simplifying the deployment process and reducing operational complexity. This container comes with a preinstalled [**LiteLLM proxy server**](https://docs.litellm.ai/docs/simple_proxy), which routes requests to the appropriate model, ensuring seamless prediction.
+
+![Multi-Model Deployment with a base model and a fine-tuned model](web_assets/mmd-ex.png)
+
+**Note**
+
+The diagram above shows one base LLM and one fine-tuned model (fine-tuned models are available only through the AQUA CLI).
+
+For fine-tuned models, requests specifying the base model name (e.g., model: meta-llama/Llama-3.2-1B-Instruct) are routed to the base LLM, while requests specifying the fine-tuned model name (e.g., model: tunedModel_meta-llama/Llama-3.2-1B) are routed to the fine-tuned model (the base model with LoRA weights applied).
+
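+For illustration, a minimal sketch of two inference calls against the same deployment endpoint, differing only in the `model` value (the endpoint URI is a placeholder; the model names follow the example above; see [Multi-Model Inferencing](#multi-model-inferencing) for full details):
+
+```bash
+# Request routed to the base model
+oci raw-request --http-method POST \
+  --target-uri <model_deployment_url>/predict \
+  --request-body '{"model": "meta-llama/Llama-3.2-1B-Instruct", "prompt": "Hello", "max_tokens": 50}'
+
+# Request routed to the fine-tuned model (base model with LoRA weights applied)
+oci raw-request --http-method POST \
+  --target-uri <model_deployment_url>/predict \
+  --request-body '{"model": "tunedModel_meta-llama/Llama-3.2-1B", "prompt": "Hello", "max_tokens": 50}'
+```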
+
+# Models supported by Multi-Model Deployment
+
+**Multi-Model Deployment is currently in beta. At this time, the following is supported:**
+- **[AQUA User Interface on Notebooks](#using-aqua-ui-interface-for-multi-model-deployment)**
+  - only Multi-Model Deployments with **base service LLM models (text-generation)** can be created through the AQUA UI.
+  - fine-tuned/registered/custom models in a Multi-Model Deployment must be deployed through the [AQUA CLI](cli-tips.md)
+- **[AQUA CLI](#using-aqua-cli-for-multi-model-deployment)**
+  - along with base service LLM models, **fine-tuned models, [custom models](#custom-models), and multi-modal models** (image-text-to-text models) can be used in Multi-Model Deployments
+
+## Contents
+- [**Multi-Model Deployment with AI Quick Actions (AQUA)**](#multi-model-deployment-with-ai-quick-actions-aqua)
+- [Models supported by Multi-Model Deployment](#models-supported-by-multi-model-deployment)
+ - [Contents](#contents)
+ - [Setup](#setup)
+ - [For AQUA CLI](#for-aqua-cli)
+- [Using AQUA UI Interface for Multi-Model Deployment](#using-aqua-ui-interface-for-multi-model-deployment)
+ - [Select the 'Create deployment' Button](#select-the-create-deployment-button)
+ - [Select 'Deploy Multi Model'](#select-deploy-multi-model)
+ - [Inferencing with Multi-Model Deployment](#inferencing-with-multi-model-deployment)
+- [Using AQUA CLI for Multi-Model Deployment](#using-aqua-cli-for-multi-model-deployment)
+ - [1. Obtain Model OCIDs](#1-obtain-model-ocids)
+ - [Service Managed Models](#service-managed-models)
+ - [Fine-Tuned Models](#fine-tuned-models)
- [Custom Models](#custom-models)
-- [MultiModel Deployment](#multimodel-deployment)
- - [List Available Shapes](#list-available-shapes)
- - [Get MultiModel Configuration](#get-multimodel-configuration)
- - [Create MultiModel Deployment](#create-multimodel-deployment)
- - [Manage MultiModel Deployments](#manage-multimodel-deployments)
-- [MultiModel Inferencing](#multimodel-inferencing)
-- [MultiModel Evaluation](#multimodel-evaluation)
- - [Create Model Evaluation](#create-model-evaluations)
-- [Limitation](#limitations)
+ - [2. Before Deployment, Check Resource Limits](#2-before-deployment-check-resource-limits)
+ - [List Available Shapes](#list-available-shapes)
+ - [Usage](#usage)
+ - [Optional Parameters](#optional-parameters)
+ - [Example](#example)
+ - [CLI Output](#cli-output)
+ - [Obtain Model Configurations for Multi-Model Deployment](#obtain-model-configurations-for-multi-model-deployment)
+ - [Usage](#usage-1)
+ - [Required Parameters](#required-parameters)
+ - [Optional Parameters](#optional-parameters-1)
+ - [Example](#example-1)
+ - [CLI Output](#cli-output-1)
+ - [3. Create Multi-Model Deployment](#3-create-multi-model-deployment)
+ - [Description](#description)
+ - [Usage](#usage-2)
+ - [Required Parameters](#required-parameters-1)
+ - [Example with Base Models:](#example-with-base-models)
+ - [Example with Fine-Tuned-Model:](#example-with-fine-tuned-model)
+ - [Example with Multi-Modal \& Embedding Models:](#example-with-multi-modal--embedding-models)
+ - [Optional Parameters](#optional-parameters-2)
+ - [Example](#example-2)
+ - [Create Multi-Model deployment with `/v1/completions`](#create-multi-model-deployment-with-v1completions)
+ - [CLI Output](#cli-output-2)
+ - [Create Multi-Model deployment with `/v1/chat/completions`](#create-multi-model-deployment-with-v1chatcompletions)
+ - [CLI Output](#cli-output-3)
+ - [Create Multi-Model (1 Embedding Model, 1 LLM) deployment with `/v1/completions`](#create-multi-model-1-embedding-model-1-llm-deployment-with-v1completions)
+ - [Manage Multi-Model Deployments](#manage-multi-model-deployments)
+- [Multi-Model Inferencing](#multi-model-inferencing)
+ - [Using oci-cli](#using-oci-cli)
+ - [Using Python SDK (without streaming)](#using-python-sdk-without-streaming)
+ - [Using Python SDK (with streaming)](#using-python-sdk-with-streaming)
+ - [Using Python SDK for /v1/chat/completions endpoint](#using-python-sdk-for-v1chatcompletions-endpoint)
+ - [Using Java (with streaming)](#using-java-with-streaming)
+ - [Multiple Inference endpoints](#multiple-inference-endpoints)
+- [Multi-Model Evaluations](#multi-model-evaluations)
+ - [Create Model Evaluations](#create-model-evaluations)
+ - [Description](#description-1)
+ - [Usage](#usage-3)
+ - [Required Parameters](#required-parameters-2)
+ - [Optional Parameters](#optional-parameters-3)
+ - [Example](#example-3)
+ - [CLI Output](#cli-output-4)
+- [Limitations](#limitations)
- [Supported Service Models](#supported-service-models)
-- [Tensor Parallelism VS Multi-Instance GPU (MIG)](#tensor-parallelism-vs-multi-instance-gpu)
+- [Tensor Parallelism VS Multi-Instance GPU](#tensor-parallelism-vs-multi-instance-gpu)
+ - [Key Differences](#key-differences)
+ - [Summary](#summary)
+
+
+## Setup
+
+- Ensure the requirements (AQUA policies & OCI Bucket Versioning) listed [here](register-tips.md#Prerequisites) are met.
+
+#### For AQUA CLI
+- Install the latest version of Accelerated Data Science (ADS) to run Multi-Model Deployments; installation instructions are available [here](https://accelerated-data-science.readthedocs.io/en/latest/user_guide/cli/quickstart.html).
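+
+For example, a minimal sketch (see the linked instructions for the authoritative steps and authentication setup):
+
+```bash
+python3 -m pip install --upgrade oracle-ads
+```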
+
+# Using AQUA UI Interface for Multi-Model Deployment
+Only Multi-Model Deployments with **base service LLM models (text-generation)** can be created through the AQUA UI.
+- For fine-tuned, registered, or custom models in a Multi-Model Deployment, use the AQUA CLI as detailed [here](#using-aqua-cli-for-multi-model-deployment)
+
+### Select the 'Create deployment' Button
+
+![Create deployment button](web_assets/create-deployment.png)
+
+### Select 'Deploy Multi Model'
+- Based on the 'models' field, a Compute Shape will be recommended to accommodate the selected models.
+- Select logging and an endpoint (/v1/completions | /v1/chat/completions).
+- Submit the form via the 'Deploy' button at the bottom.
+
+![Deploy Multi Model form](web_assets/deploy-mmd.png)
+
+### Inferencing with Multi-Model Deployment
+
+There are two ways to send inference requests to models within a Multi-Model Deployment:
+1. Python SDK (recommended) - see [here](#multi-model-inferencing)
+2. AQUA UI (see below; suitable for testing)
-# Introduction to MultiModel Deployment and Serving
+Once the deployment is active, view the model deployment details and inferencing form by clicking the 'Deployments' tab and selecting the deployment in the Model Deployment list.
-MultiModel inference and serving refers to efficiently hosting and managing multiple large language models simultaneously to serve inference requests using shared resources. The Data Science server has prebuilt **vLLM service container** that make deploying and serving multiple large language model on **single GPU Compute shape** very easy, simplifying the deployment process and reducing operational complexity. This container comes with preinstalled [**LiteLLM proxy server**](https://docs.litellm.ai/docs/simple_proxy) which routes requests to the appropriate model, ensuring seamless prediction.
+Use the dropdown labeled 'Model parameters' to select a specific model for inference.
-**MultiModel Deployment is currently in beta and is only available through the CLI. At this time, only base service LLM models are supported, and fine-tuned/registered models cannot be deployed.**
+
+![Multi-Model deployment details](web_assets/mmd-details.png)
+
-This document provides documentation on how to use ADS CLI to create MultiModel deployment using AI Quick Actions (AQUA) model deployments, and evaluate the models. you'll need the latest version of ADS to run these, installation instructions are available [here](https://accelerated-data-science.readthedocs.io/en/latest/user_guide/cli/quickstart.html).
+# Using AQUA CLI for Multi-Model Deployment
+This section describes how to use the ADS CLI to create Multi-Model deployments using AI Quick Actions (AQUA) and evaluate the deployed models. There are three steps:
+## 1. Obtain Model OCIDs
-# Models
+Again, only the OCIDs of **base service managed LLM models (text-generation)** can be obtained through the AQUA User Interface on OCI Notebooks.
+- use the **[AQUA CLI](cli-tips.md)** for **[fine-tuned models](#fine-tuned-models), [custom models](#custom-models), and multi-modal models (image-text-to-text models)**
-First step in process is to get the OCIDs of the desired base service LLM AQUA models, which are required to initiate the MultiModel deployment process. Refer to [AQUA CLI tips](cli-tips.md) for detailed instructions on how to obtain the OCIDs of base service LLM AQUA models.
+### Service Managed Models
-You can also obtain the OCID from the AQUA user interface by clicking on the model card and selecting the `Copy OCID` button from the `More Options` dropdown in the top-right corner of the screen.
+- Obtain the OCID from the AQUA user interface by clicking on the model card and selecting the `Copy OCID` button from the `More Options` dropdown in the top-right corner of the screen.
+- Refer to [AQUA CLI tips](cli-tips.md#get-service-model-ocid) for detailed instructions on how to obtain the OCIDs of service-managed models via CLI.
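+
+For a quick sketch via the CLI (assuming the `ads aqua model list` command covered in [AQUA CLI tips](cli-tips.md)), each entry in the output includes the model's OCID:
+
+```bash
+ads aqua model list
+```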
-## Custom Models
+### Fine-Tuned Models
+To use a fine-tuned model in a Multi-Model Deployment, first create the fine-tuned model as described below:
+- **Register the Base Model**
-Out of the box, MultiModel Deployment currently supports only AI Quick Actions service LLM models (see [requirements](#introduction-to-multimodel-deployment-and-serving) above). However, it is also possible to enable support for *custom-registered* models by manually adding a deployment configuration to the model artifact folder in Object Storage.
+  Register the **base model** using either the **AI Quick Actions UI** or **ADS CLI**. This uploads the model to Object Storage and associates it with a Model Catalog record.
+
+- **Fine-Tune the Model**
+
+  Fine-tune your model using the **[AI Quick Actions UI](fine-tuning-tips.md#Fine-Tune-a-Model)** or **[ADS CLI](cli-tips.md#Model-Fine-Tuning)**. This creates a Data Science Job that uses your dataset (.jsonl file) and produces a LoRA module, which is uploaded to the specified bucket.
+
+- **Obtain the Base Model OCID**
+ - On the model tab, obtain the OCID from the AQUA user interface by clicking on the **base** model card and selecting the `Copy OCID` button from the `More Options` dropdown in the top-right corner of the screen.
+ - To obtain the OCIDs of Service/ User-Registered Models **via CLI**: [docs](cli-tips.md#get-service-model-ocid)
+
+- **Obtain Fine-Tuned Model OCID, Fine-Tuned Model Name**
+ - On the model tab, select the Fine-Tuned tab, and obtain the OCID & Model Name from the AQUA user interface by clicking on the **Fine-Tuned** model card and selecting the `Copy OCID` button from the `More Options` dropdown in the top-right corner of the screen.
+ - To Obtain Fine-Tuned Model OCID via CLI: [docs](cli-tips.md#get-fine-tuned-model-ocid)
+ - To Obtain Fine-Tuned Model Name via CLI: [docs](cli-tips.md#get-model-details)
+ - The "name" field in the CLI output will be the Fine-Tuned Model Name
+ - Use the Fine-Tuned Model OCID for CLI command
+
+- **Use Fine-Tuned model within a Multi-Model Deployment via AQUA CLI**
+
+ - Complete the following two steps:
+    - [Ensure](#2-before-deployment-check-resource-limits) that the models within the Multi-Model Deployment are compatible with the selected compute shape and that the shape is available in your tenancy.
+ - **Note** that the GPU requirement is determined by the base model that the Fine-Tuned model uses.
+    - [Using the AQUA CLI](#3-create-multi-model-deployment), create a Multi-Model Deployment with the Fine-Tuned Model as shown [here](#example-with-fine-tuned-model)
+
+### Custom Models
+
+Using the AQUA CLI (not supported in the AQUA UI), it is also possible to enable support for *custom-registered* models by manually adding a deployment configuration to the model artifact folder in Object Storage.
Follow the steps below to enable MultiModel Deployment support for your custom models:
@@ -131,27 +258,28 @@ Follow the steps below to enable MultiModel Deployment support for your custom m
Use the [MultiModel Configuration command](#get-multimodel-configuration) to check whether the selected model is compatible with multi-model deployment now.
-# MultiModel Deployment
+## 2. Before Deployment, Check Resource Limits
+Before starting a Multi-Model Deployment, we must check the following resource limits:
+- Determine the compute shapes available for model deployment in your OCI environment, as described [here](#list-available-shapes).
+- After selecting an available compute shape, check that the selected models are compatible with that shape, as described [here](#obtain-model-configurations-for-multi-model-deployment). Both checks are sketched below.
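+
+A minimal sketch of the two checks (both commands are detailed in the subsections that follow; the model OCIDs are placeholders):
+
+```bash
+# 1) List the compute shapes available for model deployment
+ads aqua deployment list_shapes
+
+# 2) Check that the chosen models fit a compatible shape
+ads aqua deployment get_multimodel_deployment_config \
+  --model_ids '["<model_ocid_1>","<model_ocid_2>"]'
+```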
-## List Available Shapes
-
-### Description
+### List Available Shapes
Lists the available **Compute Shapes** with basic information such as name, configuration, CPU/GPU specifications, and memory capacity for the shapes supported by the Model Deployment service in your compartment.
-### Usage
+#### Usage
```bash
ads aqua deployment list_shapes
```
-### Optional Parameters
+#### Optional Parameters
`--compartment_id [str]`
The compartment OCID where model deployment is to be created. If not provided, then it defaults to user's compartment.
-### Example
+#### Example
```bash
ads aqua deployment list_shapes
```
@@ -184,25 +312,24 @@ ads aqua deployment list_shapes
```
-## Get MultiModel Configuration
+### Obtain Model Configurations for Multi-Model Deployment
-### Description
+Retrieves the deployment configuration for multiple models and calculates the GPU allocations for all compatible shapes.
+- **For Fine-Tuned Models**, use the OCID from the fine-tuned model card / fine-tuned model catalog entry (not the base model OCID)
-Retrieves the deployment configuration for multiple base Aqua service models and calculates the GPU allocations for all compatible shapes.
-
-### Usage
+#### Usage
```bash
ads aqua deployment get_multimodel_deployment_config [OPTIONS]
```
-### Required Parameters
+#### Required Parameters
`--model_ids [list]`
A list of OCIDs for the Aqua models
-### Optional Parameters
+#### Optional Parameters
`--primary_model_id [str]`
@@ -221,7 +348,7 @@ If B is the primary model, the gpu allocation is [2, 4, 2] as B always gets the
The compartment OCID to retrieve the models and available model deployment shapes.
-### Example
+#### Example
```bash
ads aqua deployment get_multimodel_deployment_config --model_ids '["ocid1.datasciencemodel.oc1.iad.","ocid1.datasciencemodel.oc1.iad."]'
@@ -343,7 +470,7 @@ ads aqua deployment get_multimodel_deployment_config --model_ids '["ocid1.datasc
}
```
-## Create MultiModel Deployment
+## 3. Create Multi-Model Deployment
-Only **base service LLM models** are supported for MultiModel Deployment. All selected models will run on the same **GPU shape**, sharing the available compute resources. Make sure to choose a shape that meets the needs of all models in your deployment using [MultiModel Configuration command](#get-multimodel-configuration)
+All selected models will run on the same **GPU shape**, sharing the available compute resources. Make sure to choose a shape that meets the needs of all models in your deployment using the [Multi-Model Configuration command](#obtain-model-configurations-for-multi-model-deployment).
@@ -362,8 +489,53 @@ ads aqua deployment create [OPTIONS]
`--models [str]`
-The String representation of a JSON array, where each object defines a model’s OCID and the number of GPUs assigned to it. The gpu count should always be a **power of two (e.g., 1, 2, 4, 8)**.
-Example: `'[{"model_id":"", "gpu_count":1},{"model_id":"", "gpu_count":1}]'` for `VM.GPU.A10.2` shape.
+The String representation of a JSON array, where each object defines:
+- the model's OCID
+- the number of GPUs assigned to it
+- fine-tuned weights (fine-tuned models only)
+
+The GPU count should always be a **power of two (e.g., 1, 2, 4, 8)**.
+
+#### Example with Base Models:
+
+`'[{"model_id":"", "gpu_count":1},{"model_id":"", "gpu_count":1}]'` for `VM.GPU.A10.2` shape.
+
+#### Example with Fine-Tuned-Model:
+
+For **fine-tuned models**, specify the OCID of the **base model** in the top-level "model_id" field.
+List your fine-tuned models in the **"fine_tune_weights" array**, where each entry includes the **OCID and name of a fine-tuned model** derived from the base model. In the example below, the first entry carries two sets of fine-tuned weights and the second entry is a plain base model:
+```
+'[
+ {
+ "model_id": "ocid1.datasciencemodel.oc1.iad.",
+ "gpu_count": 1,
+ "model_name": "meta-llama/Meta-Llama-3.1-8B",
+ "model_task": "text_generation",
+ "fine_tune_weights": [
+ {
+ "model_id": "ocid1.datasciencemodel.oc1.iad.<>",
+ "model_name": "meta-llama/Meta-Llama-3.1-8B-FT1"
+ },
+ {
+ "model_id": "ocid1.datasciencemodel.oc1.iad.<>",
+ "model_name": "meta-llama/Meta-Llama-3.1-8B-FT2"
+ }
+ ]
+  },
+ {
+ "model_task": "text_generation",
+ "model_id": "ocid1.datasciencemodel.oc1.iad.",
+ "gpu_count": 1
+ }
+ ]'
+```
+#### Example with Multi-Modal & Embedding Models:
+
+`'[{"model_id":"", "gpu_count":1, "model_task": "text_generation"},{"model_id":"", "gpu_count":1, "model_task": "image_text_to_text"}]'` for `VM.GPU.A10.2` shape.
+
+When deploying multi-modal and embedding models, model_task must be specified; as a best practice, supply it for every model. (Supported tasks: text_generation, image_text_to_text, code_synthesis, text_embedding)
+
`--instance_shape [str]`
@@ -434,14 +606,15 @@ The private endpoint id of model deployment.
### Example
-#### Create MultiModel deployment with `/v1/completions`
+#### Create Multi-Model deployment with `/v1/completions`
```bash
ads aqua deployment create \
--container_image_uri "dsmc://odsc-vllm-serving:0.6.4.post1.2" \
--models '[{"model_id":"ocid1.log.oc1.iad.", "gpu_count":1}, {"model_id":"ocid1.log.oc1.iad.", "gpu_count":1}]' \
--instance_shape "VM.GPU.A10.2" \
- --display_name "modelDeployment_multmodel_model1_model2"
+ --display_name "modelDeployment_multmodel_model1_model2" \
+ --env_var '{"MODEL_DEPLOY_PREDICT_ENDPOINT": "/v1/completions"}'
```
@@ -493,7 +666,7 @@ ads aqua deployment create \
"MODEL_DEPLOY_ENABLE_STREAMING": "true",
```
-#### Create MultiModel deployment with `/v1/chat/completions`
+#### Create Multi-Model deployment with `/v1/chat/completions`
```bash
ads aqua deployment create \
@@ -501,7 +674,8 @@ ads aqua deployment create \
--models '[{"model_id":"ocid1.log.oc1.iad.", "gpu_count":1}, {"model_id":"ocid1.log.oc1.iad.", "gpu_count":1}]' \
--env-var '{"MODEL_DEPLOY_PREDICT_ENDPOINT":"/v1/chat/completions"}' \
--instance_shape "VM.GPU.A10.2" \
- --display_name "modelDeployment_multmodel_model1_model2"
+ --display_name "modelDeployment_multmodel_model1_model2" \
+ --env_var '{"MODEL_DEPLOY_PREDICT_ENDPOINT": "/v1/chat/completions"}'
```
@@ -552,19 +726,33 @@ ads aqua deployment create \
"MULTI_MODEL_CONFIG": "{\"models\": [{\"params\": \"--served-model-name mistralai/Mistral-7B-v0.1 --seed 42 --tensor-parallel-size 1 --max-model-len 4096\", \"model_path\": \"service_models/Mistral-7B-v0.1/78814a9/artifact\"}, {\"params\": \"--served-model-name tiiuae/falcon-7b --seed 42 --tensor-parallel-size 1 --trust-remote-code\", \"model_path\": \"service_models/falcon-7b/f779652/artifact\"}]}",
"MODEL_DEPLOY_ENABLE_STREAMING": "true",
```
+#### Create Multi-Model (1 Embedding Model, 1 LLM) deployment with `/v1/completions`
+Note: you will need to pass `{"route": "/v1/embeddings"}` as a header on all inference requests to the embedding model, e.g.
-## Manage MultiModel Deployments
+```
+headers={'route':'/v1/embeddings','Content-Type':'application/json'}
+```
+- for /v1/chat/completions, set "MODEL_DEPLOY_PREDICT_ENDPOINT" to "/v1/chat/completions" in the command below
+```bash
+ads aqua deployment create \
+ --container_image_uri "dsmc://odsc-vllm-serving:0.6.4.post1.2" \
+  --models '[{"model_id":"ocid1.datasciencemodel.oc1.iad.", "gpu_count":1, "model_task": "text_embedding"}, {"model_id":"ocid1.datasciencemodel.oc1.iad.", "gpu_count":1, "model_task": "text_generation"}]' \
+ --instance_shape "VM.GPU.A10.2" \
+ --display_name "modelDeployment_multmodel_model1_model2" \
+ --env_var '{"MODEL_DEPLOY_PREDICT_ENDPOINT": "/v1/completions"}'
-### Description
+```
+
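+Once the deployment is active, a sketch of an inference call to the embedding model (assuming the `oci raw-request` command used elsewhere in this document and its `--request-headers` option; the endpoint URI and model name are placeholders):
+
+```bash
+oci raw-request --http-method POST \
+  --target-uri <model_deployment_url>/predict \
+  --request-headers '{"route": "/v1/embeddings", "Content-Type": "application/json"}' \
+  --request-body '{"model": "<embedding_model_name>", "input": ["Sample text to embed"]}'
+```
+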
+## Manage Multi-Model Deployments
-To list all AQUA deployments (both MultiModel and single-model) within a specified compartment or project, or to get detailed information on a specific MultiModel deployment, kindly refer to the [AQUA CLI tips](cli-tips.md) documentation.
+To list all AQUA deployments (both Multi-Model and single-model) within a specified compartment or project, or to get detailed information on a specific Multi-Model deployment, refer to the [AQUA CLI tips](cli-tips.md) documentation.
-Note: MultiModel deployments are identified by the tag `"aqua_multimodel": "true",` associated with them.
+Note: Multi-Model deployments are identified by the `"aqua_multimodel": "true"` tag associated with them.
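+
+For example, a sketch (assuming the `ads aqua deployment list` command described in [AQUA CLI tips](cli-tips.md)) to spot Multi-Model deployments by that tag:
+
+```bash
+ads aqua deployment list | grep -B 5 '"aqua_multimodel": "true"'
+```
+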
-# MultiModel Inferencing
+# Multi-Model Inferencing
-The only change required to infer a specific model from a MultiModel deployment is to update the value of `"model"` parameter in the request payload. The values for this parameter can be found in the Model Deployment details, under the field name `"model_name"`. This parameter segregates the request flow, ensuring that the inference request is directed to the correct model within the MultiModel deployment.
+The only change required to infer against a specific model in a Multi-Model deployment is to update the value of the `"model"` parameter in the request payload. The values for this parameter can be found in the Model Deployment details, under the field name `"model_name"`. This parameter segregates the request flow, ensuring that the inference request is directed to the correct model within the Multi-Model deployment.
## Using oci-cli
@@ -872,7 +1060,7 @@ oci raw-request \
```
-# MultiModel Evaluations
+# Multi-Model Evaluations
## Create Model Evaluations
diff --git a/ai-quick-actions/web_assets/create-deployment.png b/ai-quick-actions/web_assets/create-deployment.png
new file mode 100644
index 00000000..159b2673
Binary files /dev/null and b/ai-quick-actions/web_assets/create-deployment.png differ
diff --git a/ai-quick-actions/web_assets/deploy-mmd.png b/ai-quick-actions/web_assets/deploy-mmd.png
new file mode 100644
index 00000000..ad8be421
Binary files /dev/null and b/ai-quick-actions/web_assets/deploy-mmd.png differ
diff --git a/ai-quick-actions/web_assets/mmd-details.png b/ai-quick-actions/web_assets/mmd-details.png
new file mode 100644
index 00000000..8a1fa253
Binary files /dev/null and b/ai-quick-actions/web_assets/mmd-details.png differ
diff --git a/ai-quick-actions/web_assets/mmd-ex.png b/ai-quick-actions/web_assets/mmd-ex.png
new file mode 100644
index 00000000..d7e8890e
Binary files /dev/null and b/ai-quick-actions/web_assets/mmd-ex.png differ