Skip to content

Bedrock embeddings #392

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

way2hemanthkumar
Copy link

Description

Note

Please provide a description of the work completed in this PR below

Added support for Amazon Bedrock as a new embedding backend.

  • Introduced BedrockEmbedding class.
  • Added unit tests for new backend.
  • Performed manual tests to confirm expected embedding outputs.
  • Signed the CLA.
  • Updated CHANGELOG.md.

Type of Change

  • New feature
  • Bug fix
  • Breaking change
  • Documentation update
  • Project configuration change

Complexity

Note

Please provide an estimated complexity of this PR of either Low, Medium or High

Complexity: Medium

How Has This Been Tested?

  • Unit tests
  • E2E tests
  • Manual tests

Checklist

The following requirements should have been met (depending on the changes in the branch):

  • Documentation has been updated
  • Unit tests have been updated
  • E2E tests have been updated
  • Examples have been updated
  • New files have copyright header
  • CLA (https://neo4j.com/developer/cla/) has been signed
  • CHANGELOG.md updated if appropriate

@way2hemanthkumar way2hemanthkumar requested a review from a team as a code owner July 26, 2025 08:07
Copy link
Contributor

@NathalieCharbel NathalieCharbel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for your contribution!
I’ve left a few suggestions below to further refine the embedder class, but overall this looks great, we really appreciate your work on this :)

- Added support for Amazon Bedrock embeddings via `BedrockEmbedding` class.
- Users can now leverage Bedrock-hosted Embedding models for vector generation.
- Added unit test and conducted Unit test

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We usually add one entry for each added feature, so it is better to group the first two in one item. The last one can be skipped.

Embedder implementation using Amazon Bedrock's Titan Text Embedding model.

This class integrates with AWS Bedrock via `boto3` and uses the Titan Embedding
model (`amazon.titan-embed-text-v2:0`) to generate 1536-dimensional vector
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you please mention here that amazon.titan-embed-text-v2:0 is the default model used as other models could be accepted as well.

body=json.dumps({"inputText": text})
)
body = json.loads(response['body'].read())
time.sleep(0.05) # To prevent throttling
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd rather remove this from here as we have a RateLimitHandler interface that could be handling these kind of issues (for now it is only applied on LLM calls for answer generation, but we have a plan to apply it to the embedders as well. The user may also want to customise the behaviour so fixing the behaviour here won't help.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sure. I will remove the
time.sleep(0.05)

Thanks

self,
model_id: str = 'amazon.titan-embed-text-v2:0',
region: str = 'us-east-1'
):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it would be also good to allow other optional parameters in form of **kwargs (same as in the other embedder classes). same applies to the embed_query function below

Initialize the BedrockEmbeddings instance.

Args:
model_id (str): Identifier for the Bedrock Titan embedding model.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe for the wrapper we can name it model instead of model_id (for consistency with the other embedding wrappers?)


from abc import ABC, abstractmethod
from typing import List
import boto3
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this further involves 1) adding this package in the list of optional dependencies in the .toml file, 2) wrapping the import in a try-except block to avoid crashing the whole application if the module is not installed (please check how similar issue is handled in the other embedders e.g., here)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants