ai-solution-dev

Description

This PR integrates the Polaris AI DataInsight Document Loader into LangChain.js.
The loader sends documents to Polaris AI's document extraction service, which handles various document formats and extracts text, images, tables, charts, and mathematical equations.


Key Features

  • Added Polaris AI DataInsight Document Loader support in LangChain.js

    • Load documents either from the file system or buffer data
    • Extract text, images, tables, charts, and equations from various document formats
    • Process documents through Polaris AI's DataInsight API service
  • Flexible document loading modes

    • element: Load each element in the pages as a separate Document object
    • page: Load each page in the document as a separate Document object
    • single: Load the entire document as a single Document object
  • Comprehensive document element support

    • Text content extraction
    • Table processing with HTML output
    • Chart data with CSV format and metadata
    • Image handling with file path references
  • Source: Located under libs/langchain-community

  • Includes comprehensive testing and examples for the integration
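
The three loading modes listed above differ only in how extracted elements are grouped into Document objects. The sketch below illustrates that grouping with self-contained code. The `ParsedElement` and `Doc` types and the `groupElements` function are hypothetical names invented for this illustration. They are not the loader's actual types or implementation, which this description does not show.

```typescript
// Hypothetical shapes for illustration only; not the loader's real types.
type Mode = "element" | "page" | "single";

interface ParsedElement {
  page: number; // 1-based page number (assumed)
  type: "text" | "table" | "chart" | "image";
  content: string; // text, HTML table, CSV chart data, or an image file path
}

interface Doc {
  pageContent: string;
  metadata: Record<string, unknown>;
}

// Group extracted elements into documents according to the chosen mode.
function groupElements(elements: ParsedElement[], mode: Mode): Doc[] {
  if (mode === "element") {
    // One document per element.
    return elements.map((el) => ({
      pageContent: el.content,
      metadata: { page: el.page, type: el.type },
    }));
  }
  if (mode === "page") {
    // One document per page: collect element contents by page number.
    const byPage = new Map<number, string[]>();
    for (const el of elements) {
      const parts = byPage.get(el.page) ?? [];
      parts.push(el.content);
      byPage.set(el.page, parts);
    }
    return Array.from(byPage.entries())
      .sort((a, b) => a[0] - b[0])
      .map(([page, parts]) => ({
        pageContent: parts.join("\n"),
        metadata: { page },
      }));
  }
  // "single": the whole document becomes one Document object.
  return [
    {
      pageContent: elements.map((el) => el.content).join("\n"),
      metadata: {},
    },
  ];
}
```

With two elements on page 1 and one on page 2, `element` mode yields three documents, `page` mode yields two, and `single` mode yields one.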


Example Usage

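import { PolarisAIDataInsightLoader } from "@langchain/community/document_loaders/web/polaris_ai_datainsight";
import fs from "fs";

// Load from a file path
const loaderFromPath = new PolarisAIDataInsightLoader({
  filePath: "path/to/file.docx",
  apiKey: process.env.POLARIS_AI_DATA_INSIGHT_API_KEY,
  resourcesDir: "path/to/save/resources/",
  mode: "single",
});
const docsFromPath = await loaderFromPath.load();

// Load from buffer data
// Assumption: the `file` and `filename` option names are not shown in this description.
const loaderFromBuffer = new PolarisAIDataInsightLoader({
  file: fs.readFileSync("path/to/file.docx"),
  filename: "file.docx",
  apiKey: process.env.POLARIS_AI_DATA_INSIGHT_API_KEY,
  resourcesDir: "path/to/save/resources/",
  mode: "single",
});
const docsFromBuffer = await loaderFromBuffer.load();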

…s and examples

    - Add PolarisAIDataInsightLoader for extracting documents from Polaris AI DataInsight API
    - Implement load(), file handling, unzip, and resource mapping
    - Match error messages and comments with Python implementation
    - Provide success and failure unit tests covering element, page, and single modes
    - Add example script demonstrating usage of PolarisAIDataInsightLoader
    - Ensure compliance with LangChain.js coding and linting guidelines
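
The "resource mapping" step in the commit message is not spelled out here. One plausible reading is that each extracted resource (such as an image) is mapped to the path where it is saved under `resourcesDir`. The following is a minimal sketch under that assumption. The `mapResourcePaths` and `baseName` helpers are hypothetical and are not taken from the PR's code.

```typescript
// Hypothetical helper: return the final path segment of an archive entry name.
function baseName(entryName: string): string {
  const parts = entryName.split(/[\\/]/);
  return parts[parts.length - 1] ?? entryName;
}

// Hypothetical helper: map each extracted resource name to its saved location
// under resourcesDir. Using only the base name keeps every file inside
// resourcesDir, even if an archive entry contains directory components.
function mapResourcePaths(
  resourceNames: string[],
  resourcesDir: string
): Record<string, string> {
  const dir = resourcesDir.replace(/[\\/]+$/, ""); // drop trailing separators
  const mapping: Record<string, string> = {};
  for (const name of resourceNames) {
    mapping[name] = `${dir}/${baseName(name)}`;
  }
  return mapping;
}
```

A document's image elements could then carry the mapped path in place of the archive-internal name, which would match the "image handling with file path references" feature described above.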

changeset-bot bot commented Oct 13, 2025

⚠️ No Changeset found

Latest commit: d3029d7

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types


@github-actions github-actions bot added community Issues related to `@langchain/community` examples labels Oct 13, 2025

vercel bot commented Oct 13, 2025

The latest updates on your projects. Learn more about Vercel for GitHub.

1 Skipped Deployment
Project | Deployment | Preview | Updated (UTC)
langchainjs-docs | Ignored | Ignored | Oct 13, 2025 5:56am
