feat(bundles): consolidate long-tail providers into lfx-bundles (45 providers) #13568

Draft
erichare wants to merge 4 commits into bundles/create-lfx-bundles from bundles/bulk-move

@erichare erichare commented Jun 9, 2026

Collaborator

Bundle Separation Phase A — PR 6: bulk-move long-tail into lfx-bundles (engine + 45 providers)

Part of Bundle Separation Phase A — metapackage split (1.11). Stacked on the skeleton PR (bundles/create-lfx-bundles). The high-risk PR — collapsing providers into the metapackage must preserve pip install langflow exactly.

What this does

Adds scripts/migrate/consolidate_bundles.py (the inverse of port_bundle.py: moves in-tree providers into the manifest-less lfx-bundles metapackage) and runs it on 45 providers across three verified tranches:

  • Tranche 1 (5): tavily, exa, wikipedia, yahoosearch, wolframalpha
  • Tranche 2 (30): vector stores (chroma, clickhouse, couchbase, milvus, mongodb, pgvector, pinecone, qdrant, supabase, upstash, weaviate), model providers (groq, mistral, ollama, perplexity, sambanova), tools/memory/data (apify, assemblyai, confluence, firecrawl, git, glean, icosacomputing, mem0, needle, scrapegraph, serpapi, unstructured, youtube, zep)
  • Tranche 3 (10, openai-SDK family, post-PR-8): aiml, azure, cometapi, deepseek, litellm, lmstudio, novita, openrouter, vllm, xai. All ride the langchain-openai wrapper; the openai SDK is declared only where directly imported; lmstudio's lazy NVIDIAEmbeddings path gets langchain-nvidia-ai-endpoints

Per provider:

  • moves src/lfx/src/lfx/components/<p>/ → lfx_bundles/<p>/ (lowercase, per BUNDLE_NAME_RE);
  • leaves a fail-soft import shim (first line # lfx-bundles-shim) so from lfx.components.<p> import X keeps working when lfx-bundles is installed, and raises an actionable ImportError otherwise;
  • merges the provider's third-party deps into a PEP 685-normalized extra + regenerates the all aggregate;
  • appends the 4-entry migration block per Component class (276 entries total = 69 classes × 4, zero bare-name ambiguities) so saved flows referencing lfx.components.<p>.<Class> migrate to ext:<p>:<Class>@official.
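
For illustration, a fail-soft shim of the kind described in the list above might look roughly like this. It is a minimal sketch, not the file the script generates: the helper name `load_from_bundle` and the error wording are assumptions; only the `# lfx-bundles-shim` marker line and the `lfx_bundles.<p>` target come from this description.

```python
# lfx-bundles-shim
"""Sketch of a fail-soft shim left behind at lfx/components/<p>/__init__.py."""
import importlib


def load_from_bundle(provider: str, attr: str):
    """Resolve `attr` from lfx_bundles.<provider>, or raise an actionable ImportError."""
    target = f"lfx_bundles.{provider}"
    try:
        module = importlib.import_module(target)
    except ImportError as exc:
        # Hypothetical wording; the real shim's message may differ.
        msg = (
            f"'{attr}' now lives in the lfx-bundles package ({target}). "
            "Install lfx-bundles to keep using this import path."
        )
        raise ImportError(msg) from exc
    return getattr(module, attr)


# In an actual shim this would sit behind a module-level __getattr__ (PEP 562):
# def __getattr__(name):
#     return load_from_bundle("tavily", name)
```

Routing the lookup through a module-level `__getattr__` keeps the old import path cheap: nothing from lfx-bundles is imported until a component name is actually requested.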

To avoid double registration, the in-tree walk (_load_components_dynamically) skips shimmed provider dirs, and component_index.json is regenerated (355→286 components, 95→50 modules — exactly the moved set; no stale standalone entries).
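
The skip logic can be pictured as a first-line check on each provider directory. The sketch below is an assumption about the shape of that check, not the code inside `_load_components_dynamically`: the function names are hypothetical, and it assumes the marker sits on the first line of the provider's `__init__.py`.

```python
from pathlib import Path

SHIM_MARKER = "# lfx-bundles-shim"


def is_shimmed(provider_dir: Path) -> bool:
    """True when the provider's __init__.py starts with the shim marker line."""
    init_file = provider_dir / "__init__.py"
    if not init_file.is_file():
        return False
    with init_file.open(encoding="utf-8") as fh:
        return fh.readline().strip() == SHIM_MARKER


def in_tree_providers(components_root: Path) -> list[str]:
    """List provider directories the in-tree walk should still load."""
    return sorted(
        child.name
        for child in components_root.iterdir()
        if child.is_dir() and not is_shimmed(child)
    )
```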

Dep parity (the headline risk)

Every spec is sourced from langflow-base's per-provider extras or its direct dependencies; langchain-community providers carry the wrapper plus the SDK the wrapper lazy-imports (e.g. pgvector, atlassian-python-api); requests is declared explicitly where imported (only transitive in today's env); pinecone keeps its python_version<'3.14' marker verbatim.

Verified at the resolution level: the uv.lock diff is +220 lines of lfx-bundles extras metadata with zero packages added or removed (the `name =` diff is empty), so pip install langflow resolves the identical set.
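
One way to reproduce the `name =` comparison is to collect the package names from the lock file before and after the move and diff the two sets. This is a sketch under the assumption that every locked package appears as a top-level `name = "..."` line in uv.lock; the function names are illustrative and are not part of the migration script.

```python
import re

# Matches only lines that start with `name = "..."`, i.e. package headers,
# not the inline `{ name = "..." }` tables used for dependency references.
NAME_LINE = re.compile(r'^name = "([^"]+)"', re.MULTILINE)


def locked_package_names(lock_text: str) -> set[str]:
    """Collect the names of all packages pinned in a lock file's text."""
    return set(NAME_LINE.findall(lock_text))


def package_set_delta(before: str, after: str) -> tuple[set[str], set[str]]:
    """Return (added, removed) package names between two lock file texts."""
    old, new = locked_package_names(before), locked_package_names(after)
    return new - old, old - new
```

Dep parity in the sense used here corresponds to `package_set_delta(...)` returning two empty sets, even though the file itself gains metadata lines.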

Test plan

  • Discovery finds all 45 bundles at @official, 69 components, no errors/warnings
  • Shims resolve across categories (qdrant, ollama, youtube, zep, tavily, wikipedia)
  • import lfx.components still works (engine safety)
  • Index drops exactly the moved entries; migration-table hooks (bare-name uniqueness + append-only) pass
  • 449 lfx extension unit tests pass; ruff + format clean
  • Lint/secrets exceptions travel with moved files (mongodb_atlas.py SLF001 per-file-ignore; mem0/mongodb detect-secrets baseline re-keyed)

Deliberately deferred (next tranches)

  • Provider-specific lfx.base.* (composio, huggingface, langwatch) — needs base-move support in the script
  • Case-sensitive names (FAISS, Notion) — BUNDLE_NAME_RE is lowercase-only; palette-name implications need a decision
  • Providers that import from langflow (vlmrun), marker-heavy providers (cuga, altk), and community-wrapper providers with unverified runtime SDKs (baidu, bing, cloudflare, maritalk, vectara, searchapi, redis, cassandra, elastic, nvidia, …)
  • Partner set (openai/anthropic/amazon/datastax/cohere) excluded → PR 8
  • langflow-base per-provider extras left in place (harmless dep duplication; de-dup is a follow-up)

Stack

Base bundles/create-lfx-bundles (PR 2); the stack lands on release-1.11.0 bottom-up via PR 1 (#13563). Draft until QA signs off.

Adds scripts/migrate/consolidate_bundles.py (the inverse of port_bundle.py --
moves in-tree providers into the manifest-less lfx-bundles metapackage) and
runs it on a verified 5-provider tranche: tavily, exa, wikipedia, yahoosearch,
wolframalpha.

Per provider the script:
- moves src/lfx/src/lfx/components/<p>/ -> lfx_bundles/<p>/ (lowercase names);
- leaves a fail-soft import shim (first line `# lfx-bundles-shim`) so
  `from lfx.components.<p> import X` keeps working when lfx-bundles is
  installed, and raises an actionable ImportError otherwise;
- merges the provider's third-party deps into a PEP 685-normalized lfx-bundles
  extra and regenerates the `all` aggregate. Dep parity holds: `uv sync` is a
  no-op because those deps were already pulled via langflow-base[complete];
- appends the 4-entry migration block per Component class (28 entries) so saved
  flows referencing lfx.components.<p>.<Class> migrate to
  ext:<p>:<Class>@official.
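
As a rough picture of the extras merge: PEP 685 normalizes extra names the same way PEP 503 normalizes project names (runs of `-`, `_` and `.` collapse to a single `-`, then lowercase). The sketch below applies that rule and rebuilds an `all` aggregate as a flat union of every provider's specs. The function names and the flat-union shape of `all` are assumptions, not a description of what consolidate_bundles.py emits.

```python
import re


def normalize_extra(name: str) -> str:
    """Normalize an extra name per PEP 685 (same rule as PEP 503)."""
    return re.sub(r"[-_.]+", "-", name).lower()


def merge_extras(provider_deps: dict[str, list[str]]) -> dict[str, list[str]]:
    """Build one extra per provider plus an `all` aggregate of every spec."""
    extras = {
        normalize_extra(provider): sorted(set(deps))
        for provider, deps in provider_deps.items()
    }
    # Assumed shape: `all` is the de-duplicated union of the per-provider lists.
    extras["all"] = sorted({dep for deps in extras.values() for dep in deps})
    return extras
```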

To avoid double registration, the in-tree component walk
(_load_components_dynamically) now skips shimmed provider dirs, and
component_index.json is regenerated (355->348 components, 95->90 modules); the
moved providers load only at @official via lfx.bundles discovery.

Verified: discovery finds all 5 at @official with no errors; shims resolve;
`import lfx.components` still works; the index drops the 7 moved component
entries (residual `tools`-category name refs resolve via the shim); ruff clean.

First tranche proves the engine; the remaining long-tail scales by extending
PROVIDER_DEPS (each provider's deps verified individually -- the careful part).

coderabbitai Bot commented Jun 9, 2026

Contributor

Important

Review skipped

Draft detected.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

@github-actions github-actions Bot added the enhancement New feature or request label Jun 9, 2026
Extends PROVIDER_DEPS with 30 individually-verified providers and runs the
consolidation: vector stores (chroma, clickhouse, couchbase, milvus, mongodb,
pgvector, pinecone, qdrant, supabase, upstash, weaviate), model providers
(groq, mistral, ollama, perplexity, sambanova), and tools/memory/data (apify,
assemblyai, confluence, firecrawl, git, glean, icosacomputing, mem0, needle,
scrapegraph, serpapi, unstructured, youtube, zep).

Dep verification (the careful part): every spec comes from langflow-base's
per-provider extras or its direct dependencies; langchain-community providers
carry the wrapper plus the SDK the wrapper lazy-imports (e.g. pgvector,
atlassian-python-api); requests is declared explicitly where imported (it is
only transitive in today's env); pinecone keeps its python_version<'3.14'
marker verbatim. Tranche excludes: providers with langflow imports (vlmrun),
provider-specific lfx.base dirs (composio/huggingface/langwatch), case-
sensitive names (FAISS/Notion), the openai-SDK family (azure/aiml/deepseek/
litellm/lmstudio/novita/openrouter/vllm/xai/cometapi -- cleaner after PR-8),
and the partner set.

Dep parity verified at the resolution level: the uv.lock diff is +220 lines of
lfx-bundles extras metadata with ZERO packages added or removed (`name =` diff
empty), so pip install langflow resolves the identical set.

Also: 192 append-only migration entries (48 classes x 4, zero bare-name
ambiguities); component_index.json regenerated 348->300 components / 90->60
modules (exactly the moved set, no stale standalone entries); mongodb_atlas.py
SLF001 per-file-ignore and the mem0/mongodb detect-secrets baseline entries
migrated to the new paths (lint/secrets exceptions travel with moved files);
35 bundles now discover at @official with 55 components; shims verified across
categories; 449 extension tests pass.
@github-actions github-actions Bot removed the enhancement New feature or request label Jun 9, 2026
@erichare erichare changed the title feat(bundles): consolidate first long-tail tranche into lfx-bundles feat(bundles): consolidate long-tail providers into lfx-bundles (35 providers) Jun 9, 2026
@github-actions github-actions Bot added enhancement New feature or request and removed enhancement New feature or request labels Jun 9, 2026
10 providers that ride the langchain-openai wrapper, deferred from tranche 2
until the partner graduation settled the openai-SDK dep story: aiml, azure,
cometapi, deepseek, litellm, lmstudio, novita, openrouter, vllm, xai.

Dep verification: every provider declares langchain-openai>=1.1.6; the openai
SDK is declared only where a component imports it directly (aiml, deepseek,
litellm, lmstudio, vllm, xai -- wrapper-transitive elsewhere); requests
declared where imported (cometapi, deepseek, novita, xai); lmstudio's lazy
NVIDIAEmbeddings path gets langchain-nvidia-ai-endpoints~=1.0.0. The litellm
component drives LiteLLM-served endpoints through the OpenAI client and does
NOT import the litellm package -- langflow-base's litellm extra stays put.
These providers use the lazy _dynamic_imports __init__ shape; it survives the
move unchanged (import_mod resolves via __spec__.parent and
lfx.components._importing remains a core helper).
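
To show why this shape survives a relocation, here is a self-contained sketch of the pattern. The `import_mod` below is a stand-in written for this example and is not the helper from lfx.components._importing (its real signature is not shown in this PR); the stdlib `json` package stands in for a provider package so the snippet runs anywhere.

```python
import importlib

# Attribute name -> submodule that defines it (the "_dynamic_imports" shape).
_dynamic_imports = {"JSONDecoder": "decoder"}


def import_mod(attr_name: str, module_name: str, package: str):
    """Stand-in helper: import `.module_name` relative to `package`, return the attribute."""
    module = importlib.import_module(f".{module_name}", package=package)
    return getattr(module, attr_name)


def lazy_getattr(name: str, package: str):
    """What a package-level __getattr__ would do, with the package passed explicitly."""
    if name not in _dynamic_imports:
        raise AttributeError(f"module {package!r} has no attribute {name!r}")
    return import_mod(name, _dynamic_imports[name], package)


# Inside a real __init__.py the package would come from __spec__.parent:
# def __getattr__(name):
#     return lazy_getattr(name, __spec__.parent)
```

Because the submodule is resolved relative to whatever package the `__init__` currently belongs to, moving the directory changes the value of `__spec__.parent` but not the code.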

Dep parity: uv.lock diff has zero package additions/removals (all specs
already resolved via langflow-base[complete]).

Also: 56 append-only migration entries (14 classes x 4, zero ambiguities);
component_index.json regenerated 300->286 components / 60->50 modules (exactly
the moved set); detect-secrets baseline re-keyed; 45 bundles now discover at
@official with 69 components; shims verified (azure/xai/litellm/deepseek);
ruff clean.
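
The per-tranche figures quoted across the three commits can be cross-checked with a few lines of arithmetic. The numbers below are taken from the commit messages above; the dictionary layout and function name are just for this check.

```python
ENTRIES_PER_CLASS = 4

# (providers moved, component classes moved) per tranche, from the commit messages.
TRANCHES = {
    "tranche 1": (5, 7),
    "tranche 2": (30, 48),
    "tranche 3": (10, 14),
}


def reconcile(start_components: int = 355, start_modules: int = 95) -> dict[str, int]:
    """Accumulate the tranche counts and derive the final index sizes."""
    providers = sum(p for p, _ in TRANCHES.values())
    classes = sum(c for _, c in TRANCHES.values())
    return {
        "providers": providers,
        "classes": classes,
        "migration_entries": classes * ENTRIES_PER_CLASS,
        "components_left": start_components - classes,
        "modules_left": start_modules - providers,
    }


print(reconcile())
# {'providers': 45, 'classes': 69, 'migration_entries': 276,
#  'components_left': 286, 'modules_left': 50}
```

The totals match the PR description: 45 providers, 69 classes, 276 migration entries, and an index of 286 components across 50 modules.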
@erichare erichare changed the title feat(bundles): consolidate long-tail providers into lfx-bundles (35 providers) feat(bundles): consolidate long-tail providers into lfx-bundles (45 providers) Jun 9, 2026