Multi-Vector Semantic Search with Amazon Nova Multimodal Embeddings Model and OpenSearch

Demonstrating the use of Amazon Nova Multimodal Embeddings and TwelveLabs Pegasus 1.2 models on Amazon Bedrock along with Amazon OpenSearch Serverless to perform semantic search.

Architecture

Usage Instructions

Prerequisites

  • Python 3.12+
  • AWS credentials
  • Amazon S3 bucket
  • Amazon OpenSearch Serverless collection (optional)
  • FFmpeg (optional, for keyframe generation)

Installation

Clone the repository:

git clone https://github.com/garystafford/nova-mm-embedding-model-demo.git
cd nova-mm-embedding-model-demo

Rename the environment file so that python-dotenv can load it:

Mac:

mv env.txt .env

Windows:

rename env.txt .env

Enter the following environment variables in the .env file:

AWS_ACCESS_KEY_ID=<Your AWS Access Key ID>
AWS_SECRET_ACCESS_KEY=<Your AWS Secret Access Key>
AWS_SESSION_TOKEN=<Your AWS Session Token>

S3_VIDEO_STORAGE_BUCKET=<Your S3 Bucket Name>
OPENSEARCH_ENDPOINT=<Your OpenSearch Endpoint>
CLOUDFRONT_URL=<Your Amazon CloudFront Distribution>
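
Before running the scripts, it can help to confirm that the .env file defines each variable listed above. The following is a minimal sketch that uses only the standard library. The repository itself relies on python-dotenv, and the helper names `parse_env_text` and `missing_settings` are hypothetical, not part of the project. The list treats all six variables as expected; trim it if, for example, you skip OpenSearch or do not use a session token.

```python
from pathlib import Path

# Variable names copied from the .env listing above.
EXPECTED_KEYS = [
    "AWS_ACCESS_KEY_ID",
    "AWS_SECRET_ACCESS_KEY",
    "AWS_SESSION_TOKEN",
    "S3_VIDEO_STORAGE_BUCKET",
    "OPENSEARCH_ENDPOINT",
    "CLOUDFRONT_URL",
]


def parse_env_text(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip()
    return settings


def missing_settings(settings: dict[str, str]) -> list[str]:
    """Return expected keys that are absent, empty, or still <placeholders>."""
    missing = []
    for key in EXPECTED_KEYS:
        value = settings.get(key, "")
        if not value or (value.startswith("<") and value.endswith(">")):
            missing.append(key)
    return missing


if __name__ == "__main__":
    env_path = Path(".env")
    text = env_path.read_text() if env_path.exists() else ""
    print("Missing settings:", missing_settings(parse_env_text(text)))
```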

Create a Python virtual environment for the Jupyter Notebook:

Mac:

python -m pip install virtualenv -Uq
python -m venv .venv
source .venv/bin/activate

python -m pip install -r requirements.txt -Uq

Windows:

python -m venv .venv
.venv\Scripts\activate

python -m pip install pip -Uq
python -m pip install -r requirements.txt -Uq

Check for FFmpeg:

ffmpeg -version

Upload the Videos and Keyframes to S3

Upload the videos and keyframes to your Amazon S3 bucket in us-east-1.
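
One way to script the upload is sketched below. It assumes the third-party boto3 package is installed and that credentials are available in the environment. The function names, the local directory layout, and the key prefix are all hypothetical; the scripts in this repository may expect a different S3 layout.

```python
from pathlib import Path


def plan_uploads(local_dir: str, key_prefix: str) -> list[tuple[str, str]]:
    """Pair each file under local_dir with an S3 object key under key_prefix."""
    root = Path(local_dir)
    prefix = key_prefix.strip("/")
    pairs = []
    for path in sorted(root.rglob("*")):
        if path.is_file():
            relative = path.relative_to(root).as_posix()
            key = f"{prefix}/{relative}" if prefix else relative
            pairs.append((str(path), key))
    return pairs


def upload_all(bucket: str, pairs: list[tuple[str, str]]) -> None:
    """Upload each (local path, object key) pair to the bucket with boto3."""
    import boto3  # imported here so plan_uploads works without boto3 installed

    s3 = boto3.client("s3", region_name="us-east-1")
    for local_path, key in pairs:
        s3.upload_file(local_path, bucket, key)
```

For example, `upload_all(os.environ["S3_VIDEO_STORAGE_BUCKET"], plan_uploads("videos", "videos"))` would upload a local `videos` folder, assuming such a folder exists.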

Run the Code

Run the following Python scripts in the order shown:

# Extract keyframes from videos
python ./extract_keyframe.py

# Generate embeddings using Amazon Nova Multimodal Embeddings
python ./generate_embeddings.py

# Generate video analyses using TwelveLabs Pegasus 1.2
python ./generate_analyses.py

# Prepare combined OpenSearch documents
python ./prepare_documents.py
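
If you prefer a single entry point, the four scripts above can be chained with a small wrapper. This is a sketch rather than part of the repository. It assumes the scripts are meant to run in the listed order and that a non-zero exit code means failure. The `runner` parameter is a hypothetical hook that makes the function easy to test.

```python
import subprocess
import sys

# Script names copied from the commands above.
SCRIPTS = [
    "extract_keyframe.py",
    "generate_embeddings.py",
    "generate_analyses.py",
    "prepare_documents.py",
]


def run_pipeline(scripts=SCRIPTS, runner=subprocess.run) -> list[str]:
    """Run each script with the current interpreter; stop at the first failure.

    Returns the names of the scripts that completed successfully.
    """
    completed = []
    for script in scripts:
        result = runner([sys.executable, script], check=False)
        if result.returncode != 0:
            print(f"{script} exited with code {result.returncode}; stopping.")
            break
        completed.append(script)
    return completed
```

Call `run_pipeline()` from the repository root with the virtual environment activated.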

Open the Jupyter Notebook, nova-mm-emd-opensearch-demo.ipynb, for all OpenSearch-related code.

Alternative: Running OpenSearch in Docker

As an alternative to AWS, you can run OpenSearch locally using Docker. This is intended for development environments only and is not secure.

Mac:

docker swarm init

SWARM_ID=$(docker node ls --format "{{.ID}}")
docker stack deploy -c docker-compose.yml $SWARM_ID

docker service ls

Windows:

docker swarm init

for /f "delims=" %x in ('docker node ls --format "{{.ID}}"') do set SWARM_ID=%x
docker stack deploy -c docker-compose.yml %SWARM_ID%

docker service ls
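
After the stack is deployed, a quick reachability check from Python can confirm that OpenSearch is answering. The sketch below makes several assumptions that are not confirmed by this README: that the container publishes port 9200 on localhost, and that it serves plain HTTP without authentication. Adjust `base_url` to match your docker-compose.yml. The function name `opensearch_info` is hypothetical.

```python
import http.client
import json
import urllib.request


def opensearch_info(base_url: str = "http://localhost:9200", timeout: float = 3.0):
    """Return the root endpoint's JSON as a dict, or None if unreachable."""
    try:
        with urllib.request.urlopen(base_url, timeout=timeout) as response:
            return json.loads(response.read().decode("utf-8"))
    except (OSError, ValueError, http.client.HTTPException):
        # Connection refused, timeout, HTTP error, or a non-JSON response.
        return None


if __name__ == "__main__":
    info = opensearch_info()
    print(info if info is not None else "OpenSearch is not reachable")
```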

Basic OpenSearch Commands

You can interact with your OpenSearch index in the Dev Tools tab of the OpenSearch Dashboards UI.

GET tv-commercials-index-nova-mm/_settings

GET tv-commercials-index-nova-mm/_count

GET tv-commercials-index-nova-mm/_search
{
  "query": {
    "match_all": {}
  }
}

GET tv-commercials-index-nova-mm/_search
{
  "query": {
    "terms": {
      "keywords": [
        "car",
        "city"
      ]
    }
  },
  "_source": false,
  "fields": ["title", "durationSec"]
}

GET tv-commercials-index-nova-mm/_search
{
  "query": {
    "nested": {
      "path": "embeddings",
      "query": {
        "knn": {
          "embeddings.embedding": {
            "vector": [
              0.059814453125,
              -0.017333984375,
              0.01153564453125,
              ...
            ],
            "k": 6
          }
        }
      }
    }
  },
  "size": 6,
  "_source": {
    "excludes": [
      "embeddings.embedding"
    ]
  }
}
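
The nested k-NN request above can also be assembled in Python, which is useful when the query vector comes from an embedding call rather than being pasted by hand. The sketch below only builds the request body and mirrors the field names used in the query above. The function name `build_nested_knn_query` is hypothetical. The query vector is assumed to come from the same embedding model, with the same dimension, as the vectors stored in the index.

```python
import json


def build_nested_knn_query(vector: list[float], k: int = 6) -> dict:
    """Build a nested k-NN search body matching the Dev Tools example."""
    return {
        "query": {
            "nested": {
                "path": "embeddings",
                "query": {
                    "knn": {
                        "embeddings.embedding": {"vector": vector, "k": k}
                    }
                },
            }
        },
        "size": k,
        # Leave the large embedding arrays out of the returned documents.
        "_source": {"excludes": ["embeddings.embedding"]},
    }


if __name__ == "__main__":
    # Toy three-value vector for illustration only; real vectors are longer.
    body = build_nested_knn_query([0.0598, -0.0173, 0.0115], k=6)
    print(json.dumps(body, indent=2))
```

With the opensearch-py client, the resulting dictionary could be passed as `client.search(index="tv-commercials-index-nova-mm", body=body)`, assuming a configured client.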

Previews from Notebook

Television commercials used in video

Previews

Preview of search results with keyframe previews

Result Grid

“Elbow” method to help select the optimal number of clusters

Search Results

All video segments plotted using t-SNE and K-Means clustering

t-SNE 2D Plot


The contents of this repository represent my viewpoints and not those of my past or current employers, including Amazon Web Services (AWS). All third-party libraries, modules, plugins, and SDKs are the property of their respective owners.
