Add support for Amazon Bedrock Citations Option within DocumentBlock object #8511

@lewwolfe

Description

The AI SDK should support the citations option that AWS Bedrock recently added to its DocumentBlock API. The option improves PDF parsing, particularly for image-based PDFs, and provides native citation generation on models that support it (such as Anthropic's Claude).

Background

AWS Bedrock now supports a citations configuration option in the DocumentBlock as documented in the [AWS API Reference](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_runtime_DocumentBlock.html). This feature enables:

  1. Enhanced PDF Processing: Bedrock can natively scan image PDFs for supported models
  2. Automatic Citation Generation: The model can generate citations that reference source documents
  3. Better Document Understanding: Improved contextual understanding of document content

Proposed Implementation

The citations configuration should be added as an optional parameter with the following structure:

export interface BedrockDocumentBlock {
  document: {
    format: BedrockDocumentFormat;
    name: string;
    source: {
      bytes: string;
    };
    citations?: {
      enabled: boolean;
    };
  };
}

The implementation can follow the same pattern as the existing getCachePoint function:

function getCachePoint(
  providerMetadata: SharedV2ProviderMetadata | undefined,
): BedrockCachePoint | undefined {
  return providerMetadata?.bedrock?.cachePoint as BedrockCachePoint | undefined;
}

function getCitationsConfig(
  providerMetadata: SharedV2ProviderMetadata | undefined,
): BedrockCitationsConfig | undefined {
  return providerMetadata?.bedrock?.citations as BedrockCitationsConfig | undefined;
}
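
The helper above relies on a type cast, so a malformed value would pass through unchecked. As a minimal sketch of a stricter variant, the following validates the shape at runtime. The `ProviderMetadata` alias is a local stand-in for `SharedV2ProviderMetadata`, and `BedrockCitationsConfig` is a hypothetical type that this issue references but does not define; neither is taken from the SDK.

```typescript
// Local stand-in for the SDK's SharedV2ProviderMetadata (assumed shape:
// provider name -> arbitrary option values).
type ProviderMetadata = Record<string, Record<string, unknown>>;

// Hypothetical type: mirrors the `citations` field proposed in this issue.
interface BedrockCitationsConfig {
  enabled: boolean;
}

function getCitationsConfig(
  providerMetadata: ProviderMetadata | undefined,
): BedrockCitationsConfig | undefined {
  const citations = providerMetadata?.bedrock?.citations;

  // Accept only an object carrying a boolean `enabled` flag; anything else
  // is treated as "not configured" instead of being cast blindly.
  if (
    typeof citations === 'object' &&
    citations !== null &&
    typeof (citations as { enabled?: unknown }).enabled === 'boolean'
  ) {
    return { enabled: (citations as { enabled: boolean }).enabled };
  }
  return undefined;
}
```

Whether the extra validation is worth it depends on how the rest of the provider treats `providerMetadata`; the cast-based version matches the existing getCachePoint pattern more closely.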

This helper can then be used when pushing a BedrockDocumentBlock onto bedrockContent, in the same way getCachePoint is used today:

bedrockContent.push({
  document: {
    format: getBedrockDocumentFormat(part.mediaType),
    name: generateDocumentName(),
    source: { bytes: convertToBase64(part.data) },
    citations: getCitationsConfig(providerOptions),
  },
});
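
From the caller's side, the option would presumably be set through `providerOptions` on a file part. The sketch below illustrates that flow end to end under stated assumptions: `FilePartSketch`, `DocumentBlockSketch`, and `toDocumentBlock` are hypothetical names, the media-type mapping is deliberately simplified, and none of it is the SDK's actual code.

```typescript
// Simplified stand-ins for SDK types; these shapes are assumptions.
type DocumentFormat = 'pdf' | 'txt';

interface FilePartSketch {
  type: 'file';
  mediaType: string;
  data: string; // assumed to be base64-encoded already
  providerOptions?: { bedrock?: { citations?: { enabled: boolean } } };
}

interface DocumentBlockSketch {
  document: {
    format: DocumentFormat;
    name: string;
    source: { bytes: string };
    citations?: { enabled: boolean };
  };
}

// Hypothetical helper: maps one file part to a Bedrock-style document block.
function toDocumentBlock(part: FilePartSketch, name: string): DocumentBlockSketch {
  const citations = part.providerOptions?.bedrock?.citations;
  return {
    document: {
      // Simplified mapping for illustration; the real provider handles more formats.
      format: part.mediaType === 'application/pdf' ? 'pdf' : 'txt',
      name,
      source: { bytes: part.data },
      // Add the key only when the caller supplied a citations config.
      ...(citations ? { citations } : {}),
    },
  };
}

const block = toDocumentBlock(
  {
    type: 'file',
    mediaType: 'application/pdf',
    data: 'JVBERi0=', // placeholder payload
    providerOptions: { bedrock: { citations: { enabled: true } } },
  },
  'document-1',
);
console.log(JSON.stringify(block.document.citations)); // prints {"enabled":true}
```

Parts without a `citations` entry produce a block with no `citations` key, so existing requests would be unchanged.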

Benefits

  • Improved PDF Support: Native handling of image-based PDFs without external OCR
  • Better Attribution: Automatic citation generation for document-based responses
  • Enhanced Accuracy: Models can better reference specific parts of source documents
  • Future-Proofing: Aligns with AWS Bedrock's roadmap and capabilities

References

This feature is particularly valuable for applications that work with complex documents, research materials, and scenarios where source attribution is critical.

AI SDK Version

  • AI: Latest
  • AI/Bedrock: Latest

Code of Conduct

  • I agree to follow this project's Code of Conduct
