From 253b193c0d23444320734acca8921b848a067306 Mon Sep 17 00:00:00 2001
From: kallebysantos
Date: Wed, 21 May 2025 12:11:32 +0100
Subject: [PATCH 01/12] stamp: starting with `ai` docs - to rebase later

---
 ext/ai/README.md | 25 +++++++++++++++++++++++++
 1 file changed, 25 insertions(+)
 create mode 100644 ext/ai/README.md

diff --git a/ext/ai/README.md b/ext/ai/README.md
new file mode 100644
index 000000000..d4627b347
--- /dev/null
+++ b/ext/ai/README.md
@@ -0,0 +1,25 @@
+# Supabase AI module
+
+This crate is part of the Supabase Edge Runtime stack and implements AI related
+features for the `Supabase.ai` namespace.
+
+## Model Execution Engine
+
+`Supabase.ai` uses [onnxruntime](https://onnxruntime.ai/) as its internal model
+execution engine, backed by the [ort pyke](https://ort.pyke.io/) Rust bindings.
+
+TODO: add photo
+
+Following there's specific documentation for both "lands":
+
+ Javascript/Frontend +
+ +
+ Rust/Backend +
+ +onnxruntime: + +the Session class: From 9b5cb8a4e408d7f4b3b151a3f379dc750d88ae9a Mon Sep 17 00:00:00 2001 From: kallebysantos Date: Wed, 21 May 2025 15:00:03 +0100 Subject: [PATCH 02/12] stamp: trying with assets rebase later --- assets/docs/ai/onnx-backend-dark.svg | 4 ++++ assets/docs/ai/onnx-backend.svg | 4 ++++ ext/ai/README.md | 8 +++++++- 3 files changed, 15 insertions(+), 1 deletion(-) create mode 100644 assets/docs/ai/onnx-backend-dark.svg create mode 100644 assets/docs/ai/onnx-backend.svg diff --git a/assets/docs/ai/onnx-backend-dark.svg b/assets/docs/ai/onnx-backend-dark.svg new file mode 100644 index 000000000..146c97388 --- /dev/null +++ b/assets/docs/ai/onnx-backend-dark.svg @@ -0,0 +1,4 @@ + + +Onnx Backend(nlp tasks + ort rust)Shipped AssetsModel loadingDownload and cacheorShared instanceWorker 1Warm- Model InferenceInit- Model Lazy LoadingWorker 2Warm- Model InferenceInit- Model Lazy LoadingExecution time:- 1º request: 1s / 5s- 2º request: 400ms Execution time:- 1º request: 400ms- 2º request: 400ms /api/api \ No newline at end of file diff --git a/assets/docs/ai/onnx-backend.svg b/assets/docs/ai/onnx-backend.svg new file mode 100644 index 000000000..9a2397d4d --- /dev/null +++ b/assets/docs/ai/onnx-backend.svg @@ -0,0 +1,4 @@ + + +Onnx Backend(nlp tasks + ort rust)Shipped AssetsModel loadingDownload and cacheorShared instanceWorker 1Warm- Model InferenceInit- Model Lazy LoadingWorker 2Warm- Model InferenceInit- Model Lazy LoadingExecution time:- 1º request: 1s / 5s- 2º request: 400ms Execution time:- 1º request: 400ms- 2º request: 400ms /api/api \ No newline at end of file diff --git a/ext/ai/README.md b/ext/ai/README.md index d4627b347..4e7bc5bdf 100644 --- a/ext/ai/README.md +++ b/ext/ai/README.md @@ -8,7 +8,13 @@ features for the `Supabase.ai` namespace. `Supabase.ai` uses [onnxruntime](https://onnxruntime.ai/) as internal model execution engine, backend by [ort pyke](https://ort.pyke.io/) rust bindings. -TODO: add photo +

+<!-- picture: "ONNX Backend illustration" (light: assets/docs/ai/onnx-backend.svg, dark: assets/docs/ai/onnx-backend-dark.svg) -->

Following there's specific documentation for both "lands": From 16551d38fc875fa5cddc3684913c475b4ac2c62c Mon Sep 17 00:00:00 2001 From: kallebysantos Date: Wed, 21 May 2025 16:08:08 +0100 Subject: [PATCH 03/12] add: onnxruntime instructions --- ext/ai/README.md | 59 +++++++++++++++++++++++++++++++++++++++++++++--- 1 file changed, 56 insertions(+), 3 deletions(-) diff --git a/ext/ai/README.md b/ext/ai/README.md index 4e7bc5bdf..dac405e5b 100644 --- a/ext/ai/README.md +++ b/ext/ai/README.md @@ -5,9 +5,6 @@ features for the `Supabase.ai` namespace. ## Model Execution Engine -`Supabase.ai` uses [onnxruntime](https://onnxruntime.ai/) as internal model -execution engine, backend by [ort pyke](https://ort.pyke.io/) rust bindings. -

@@ -16,10 +13,66 @@ execution engine, backend by [ort pyke](https://ort.pyke.io/) rust bindings.

+`Supabase.ai` uses [onnxruntime](https://onnxruntime.ai/) as its internal model
+execution engine, backed by the [ort pyke](https://ort.pyke.io/) Rust bindings.
+
 Following there's specific documentation for both "lands":
 
Javascript/Frontend + +The **onnxruntime** API is available from `globalThis` and shares similar specs of [onnxruntime-common](https://github.com/microsoft/onnxruntime/tree/main/js/common). + +The available items are: + +- `Tensor`: represent a basic tensor with specified dimensions and data type. -- "The AI input/output" +- `InferenceSession`: represent the inner model session. -- "The AI model itself" + +### Usage + +It can be used from the exported `globalThis[Symbol.for("onnxruntime")]` -- +but manipulating it directly is not trivial, so in the future you may use the [Inference API #501](https://github.com/supabase/edge-runtime/pull/501) for a more user friendly API. + +```typescript +const { InferenceSession, Tensor } = globalThis[Symbol.for("onnxruntime")]; + +// 'create()' supports an url string buffer or the binary data +const modelUrlBuffer = new TextEncoder().encode("https://huggingface.co/Supabase/gte-small/resolve/main/onnx/model_quantized.onnx"); +const session = await InferenceSession.create(modelUrlBuffer); + +// Example only, in real 'feature-extraction' tensors must be created from the tokenizer step. +const inputs = { + input_ids: new Tensor('float32', [1, 2, 3...], [1, 384]), + attention_mask: new Tensor('float32', [...], [1, 384]), + token_types_ids: new Tensor('float32', [...], [1, 384]) +}; + +const { last_hidden_state } = await session.run(inputs); +console.log(last_hidden_state); +``` + +### Third party libs + +Originaly this backend was created to implicit integrate with [transformers.js](https://github.com/huggingface/transformers.js/). This way users can still consuming a high-level lib at same time they benefits of all Supabase's Model Execution Engine features, like model optimization and caching. For further information pleas check the [PR #436](https://github.com/supabase/edge-runtime/pull/436) + +> [!WARNING] +> At this moment users need to explicit target `device: 'auto'` to enable the platform compatibility. 
+
+```typescript
+import { env, pipeline } from 'https://cdn.jsdelivr.net/npm/@huggingface/transformers@3.0.1';
+
+// Browser cache is now supported for `onnx` models
+env.useBrowserCache = true;
+env.allowLocalModels = false;
+
+const pipe = await pipeline('feature-extraction', 'supabase/gte-small', { device: 'auto' });
+
+const output = await pipe("This embed will be generated from rust land", {
+  pooling: 'mean',
+  normalize: true
+});
+```
+
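Once generated, such embeddings are typically compared with cosine similarity. Below is a standalone sketch of that step in plain TypeScript, independent of any runtime API (and since the pipeline above uses `normalize: true`, the similarity effectively reduces to a dot product):

```typescript
// Cosine similarity between two embedding vectors, e.g. as produced by the
// `feature-extraction` pipeline above. For unit-length (normalized) vectors
// this is just the dot product.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}
```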
From dd2ddad190bfab4bdc66930cebd12784ca4c5eef Mon Sep 17 00:00:00 2001 From: kallebysantos Date: Wed, 21 May 2025 16:19:12 +0100 Subject: [PATCH 04/12] stamp: self-hosting onnxruntime instructions --- ext/ai/README.md | 23 +++++++++++++++++++++++ 1 file changed, 23 insertions(+) diff --git a/ext/ai/README.md b/ext/ai/README.md index dac405e5b..f93915f29 100644 --- a/ext/ai/README.md +++ b/ext/ai/README.md @@ -73,6 +73,29 @@ const output = await pipe("This embed will be generated from rust land", { }); ``` +### Self-Hosting + +**Caching filepath**: +The `EXT_AI_CACHE_DIR` environment variable can be use to set a custom cache path + +**Memory clean up**: +For Self-Hosting users an extra method is available for `main/index.ts` scope and should be used to clean up unused sessions, consider adding it into your main entrypoint file: + +```typescript +// cleanup unused sessions every 30s +setInterval(async () => { + try { + const cleanupCount = await EdgeRuntime.ai.tryCleanupUnusedSession(); + if (cleanupCount == 0) { + return; + } + console.log('EdgeRuntime.ai.tryCleanupUnusedSession', cleanupCount); + } catch (e) { + console.error(e.toString()); + } +}, 30 * 1000); +``` +
From 7c6c0b565e118403f29daa4577de3e5f5aee570a Mon Sep 17 00:00:00 2001 From: kallebysantos Date: Fri, 23 May 2025 20:46:48 +0100 Subject: [PATCH 05/12] stamp: adding `Session` information --- ext/ai/README.md | 21 ++++++++++----------- 1 file changed, 10 insertions(+), 11 deletions(-) diff --git a/ext/ai/README.md b/ext/ai/README.md index f93915f29..d3e6acf85 100644 --- a/ext/ai/README.md +++ b/ext/ai/README.md @@ -16,17 +16,15 @@ features for the `Supabase.ai` namespace. `Supabase.ai` uses [onnxruntime](https://onnxruntime.ai/) as internal model execution engine, backend by [ort pyke](https://ort.pyke.io/) rust bindings. -Following there's specific documentation for both "lands": -
- Javascript/Frontend + Javascript docs The **onnxruntime** API is available from `globalThis` and shares similar specs of [onnxruntime-common](https://github.com/microsoft/onnxruntime/tree/main/js/common). The available items are: -- `Tensor`: represent a basic tensor with specified dimensions and data type. -- "The AI input/output" -- `InferenceSession`: represent the inner model session. -- "The AI model itself" +- `Tensor`: Represent a basic tensor with specified dimensions and data type. -- "The AI input/output" +- `InferenceSession`: Represent the inner model session. -- "The AI model itself" ### Usage @@ -53,7 +51,7 @@ console.log(last_hidden_state); ### Third party libs -Originaly this backend was created to implicit integrate with [transformers.js](https://github.com/huggingface/transformers.js/). This way users can still consuming a high-level lib at same time they benefits of all Supabase's Model Execution Engine features, like model optimization and caching. For further information pleas check the [PR #436](https://github.com/supabase/edge-runtime/pull/436) +Originaly this backend was created to implicit integrate with [transformers.js](https://github.com/huggingface/transformers.js/). This way users can still consuming a high-level lib at same time they benefits of all Supabase's Model Execution Engine features, like model optimization and caching. For further information please check the [PR #436](https://github.com/supabase/edge-runtime/pull/436) > [!WARNING] > At this moment users need to explicit target `device: 'auto'` to enable the platform compatibility. @@ -98,10 +96,11 @@ setInterval(async () => {
-
- Rust/Backend -
+## The `Session` class -onnxruntime: +Prior versions has [introduced](https://supabase.com/blog/ai-inference-now-available-in-supabase-edge-functions) the `Session` class as alternative to `transformers.js` for *gte-small* model and then was used to provide a [LLM interface](https://supabase.com/docs/guides/functions/ai-models?queryGroups=platform&platform=ollama#using-large-language-models-llm) for Ollama and some other providers. -the Session class: +Since the **Model Execution Engine** was created the `Session` class now can focus on LLM interface while the `Session('gte-small')` is for compatibility purposes only. + +> [!WARNING] +> Docs for Session class will end here - There's a open [PR #539](https://github.com/supabase/edge-runtime/pull/539) that may change a lot of things for it. From d5e0a4ed4e72576b37408ec23f62b8b6aa2c28b0 Mon Sep 17 00:00:00 2001 From: kallebysantos Date: Fri, 23 May 2025 21:11:52 +0100 Subject: [PATCH 06/12] stamp: adding useful links to main Readme --- README.md | 11 +++++++++++ crates/base/README.md | 0 ext/node/README.md | 10 +++++++++- ext/node/polyfills/README.md | 7 ++++--- 4 files changed, 24 insertions(+), 4 deletions(-) create mode 100644 crates/base/README.md diff --git a/README.md b/README.md index 412c0ac73..3c0b9e14c 100644 --- a/README.md +++ b/README.md @@ -32,6 +32,17 @@ The edge runtime can be divided into two runtimes with different purposes. - Limits are required to be set such as: Memory and Timeouts. - Has access to environment variables explictly allowed by the main runtime. +### Edge Runtime in Deep + +#### Conceptual + +- [EdgeRuntime Base](/crates/base/README.md): Overalls about how EdgeRuntime is based on Deno. + +#### Extension Modules + +- [AI](/ext/ai/README.md): Implements AI related features. +- [NodeJs](/ext/node/README.md) & [NodeJs Polyfills](/ext/node/polyfills/README.md): Implements the NodeJs compatibility layer. 
+ ## Developers To learn how to build / test Edge Runtime, visit [DEVELOPERS.md](DEVELOPERS.md) diff --git a/crates/base/README.md b/crates/base/README.md new file mode 100644 index 000000000..e69de29bb diff --git a/ext/node/README.md b/ext/node/README.md index d154d8cb6..87d08a664 100644 --- a/ext/node/README.md +++ b/ext/node/README.md @@ -1,3 +1,11 @@ -# deno_node +# Supabase Node module + +This crate is part of the Supabase Edge Runtime stack and implements NodeJs +related features. + +To see all compatible features, please check the +[NodeJs Polyfills](/ext/node/polyfills/README.md) section. + +## deno_node `require` and other node related functionality for Deno. diff --git a/ext/node/polyfills/README.md b/ext/node/polyfills/README.md index 26527278e..1e6e82bdc 100644 --- a/ext/node/polyfills/README.md +++ b/ext/node/polyfills/README.md @@ -1,6 +1,7 @@ -# Deno Node.js compatibility +# Supabase Node.js compatibility module -This module is meant to have a compatibility layer for the +This crate is part of the Supabase Edge Runtime stack and implements a +compatibility layer for the [Node.js standard library](https://nodejs.org/docs/latest/api/). **Warning**: Any function of this module should not be referred anywhere in the @@ -59,7 +60,7 @@ Deno standard library as it's a compatibility module. - [x] worker_threads - [ ] zlib -* [x] node globals _partly_ +- [x] node globals _partly_ ### Deprecated From 4af69852a606e226703b958e4cc0c70f6f4d0b4b Mon Sep 17 00:00:00 2001 From: kallebysantos Date: Fri, 23 May 2025 21:20:06 +0100 Subject: [PATCH 07/12] stamp: improving --- ext/ai/README.md | 16 +++++++--------- 1 file changed, 7 insertions(+), 9 deletions(-) diff --git a/ext/ai/README.md b/ext/ai/README.md index d3e6acf85..aece5d0a4 100644 --- a/ext/ai/README.md +++ b/ext/ai/README.md @@ -16,19 +16,17 @@ features for the `Supabase.ai` namespace. 
`Supabase.ai` uses [onnxruntime](https://onnxruntime.ai/) as its internal model
execution engine, backed by the [ort pyke](https://ort.pyke.io/) Rust bindings.
 
-
- Javascript docs
-
 The **onnxruntime** API is available from `globalThis` and shares similar specs of [onnxruntime-common](https://github.com/microsoft/onnxruntime/tree/main/js/common).
 
 The available items are:
 
-- `Tensor`: Represent a basic tensor with specified dimensions and data type. -- "The AI input/output"
-- `InferenceSession`: Represent the inner model session. -- "The AI model itself"
+- `Tensor`: Represents a basic tensor with specified dimensions and data type. - "The AI input/output"
+- `InferenceSession`: Represents the inner model session. - "The AI model itself"
 
-### Usage
+
+Usage
 
-It can be used from the exported `globalThis[Symbol.for("onnxruntime")]` --
+It can be used from the exported `globalThis[Symbol.for("onnxruntime")]` -
 but manipulating it directly is not trivial, so in the future you may use the [Inference API #501](https://github.com/supabase/edge-runtime/pull/501) for a more user-friendly API.
 
 ```typescript
 const { InferenceSession, Tensor } = globalThis[Symbol.for("onnxruntime")];
 
 // 'create()' supports a url string buffer or the binary data
 const modelUrlBuffer = new TextEncoder().encode("https://huggingface.co/Supabase/gte-small/resolve/main/onnx/model_quantized.onnx");
 const session = await InferenceSession.create(modelUrlBuffer);
 
 // Example only, in real 'feature-extraction' tensors must be created from the tokenizer step.
 const inputs = {
   input_ids: new Tensor('float32', [1, 2, 3...], [1, 384]),
   attention_mask: new Tensor('float32', [...], [1, 384]),
   token_type_ids: new Tensor('float32', [...], [1, 384])
 };
 
 const { last_hidden_state } = await session.run(inputs);
 console.log(last_hidden_state);
 ```
 
+
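The `last_hidden_state` above is a flat buffer with shape `[1, sequence_length, 384]`; to obtain a single sentence embedding it is usually mean-pooled over the sequence axis (a real implementation would also weight by the attention mask). A standalone sketch of that step, with illustrative shapes:

```typescript
// Mean-pool a flat [1, seq, dims] hidden-state buffer into a [dims] embedding
// by averaging over the sequence (token) axis.
function meanPool(data: Float32Array, seq: number, dims: number): number[] {
  const pooled = new Array(dims).fill(0);
  for (let t = 0; t < seq; t++) {
    for (let d = 0; d < dims; d++) {
      pooled[d] += data[t * dims + d] / seq;
    }
  }
  return pooled;
}
```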
+ ### Third party libs Originaly this backend was created to implicit integrate with [transformers.js](https://github.com/huggingface/transformers.js/). This way users can still consuming a high-level lib at same time they benefits of all Supabase's Model Execution Engine features, like model optimization and caching. For further information please check the [PR #436](https://github.com/supabase/edge-runtime/pull/436) @@ -94,8 +94,6 @@ setInterval(async () => { }, 30 * 1000); ``` -
-
 
 ## The `Session` class
 
 Prior versions has [introduced](https://supabase.com/blog/ai-inference-now-available-in-supabase-edge-functions) the `Session` class as alternative to `transformers.js` for *gte-small* model and then was used to provide a [LLM interface](https://supabase.com/docs/guides/functions/ai-models?queryGroups=platform&platform=ollama#using-large-language-models-llm) for Ollama and some other providers.

From e08be8f84f5e28ba949f3693c11d5f034eb13341 Mon Sep 17 00:00:00 2001
From: kallebysantos
Date: Fri, 23 May 2025 21:27:09 +0100
Subject: [PATCH 08/12] stamp(ai): adding reference to tests folder

---
 ext/ai/README.md | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/ext/ai/README.md b/ext/ai/README.md
index aece5d0a4..7633f327e 100644
--- a/ext/ai/README.md
+++ b/ext/ai/README.md
@@ -51,7 +51,8 @@ console.log(last_hidden_state);
 
 ### Third party libs
 
-Originaly this backend was created to implicit integrate with [transformers.js](https://github.com/huggingface/transformers.js/). This way users can still consuming a high-level lib at same time they benefits of all Supabase's Model Execution Engine features, like model optimization and caching. For further information please check the [PR #436](https://github.com/supabase/edge-runtime/pull/436)
+Originally this backend was created to implicitly integrate with [transformers.js](https://github.com/huggingface/transformers.js/). This way users can still consume a high-level lib while benefiting from all of Supabase's Model Execution Engine features, like model optimization and caching.
+For further information please check [PR #436](https://github.com/supabase/edge-runtime/pull/436) as well as the [tests folder](/crates/base/test_cases/ai-ort-rust-backend/transformers-js)
 
 > [!WARNING]
 > At this moment users need to explicit target `device: 'auto'` to enable the platform compatibility.
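The model caching mentioned above can be pictured as a promise cache keyed by model name: the first request pays the load cost, while concurrent and later requests share the same instance. A hypothetical sketch of that idea (names and types are illustrative, not the runtime's actual implementation):

```typescript
// Cache sessions by model name so the expensive load runs once and
// concurrent callers share the same in-flight promise.
type Session = { model: string };

const sessions = new Map<string, Promise<Session>>();

async function loadSession(model: string): Promise<Session> {
  // Stand-in for downloading and initializing an ONNX model.
  return { model };
}

function getSession(model: string): Promise<Session> {
  let session = sessions.get(model);
  if (session === undefined) {
    session = loadSession(model);
    sessions.set(model, session); // concurrent callers reuse this promise
  }
  return session;
}
```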
From 6dd5f1c0cee9b1e40ac87223f9fb919a8cb6c5f3 Mon Sep 17 00:00:00 2001 From: kallebysantos Date: Mon, 26 May 2025 11:27:16 +0100 Subject: [PATCH 09/12] stamp: improving edge-runtime diagram --- README.md | 8 +++++++- assets/edge-runtime-diagram-dark.svg | 4 ++++ assets/edge-runtime-diagram.svg | 25 ++++--------------------- 3 files changed, 15 insertions(+), 22 deletions(-) create mode 100644 assets/edge-runtime-diagram-dark.svg diff --git a/README.md b/README.md index 3c0b9e14c..41a5bec8f 100644 --- a/README.md +++ b/README.md @@ -15,7 +15,13 @@ Options** ## Architecture -![Sequence diagram of Edge Runtime request flow](assets/edge-runtime-diagram.svg?raw=true) +

+<!-- picture: "Sequence diagram of Edge Runtime request flow" (light: assets/edge-runtime-diagram.svg, dark: assets/edge-runtime-diagram-dark.svg) -->

The edge runtime can be divided into two runtimes with different purposes. diff --git a/assets/edge-runtime-diagram-dark.svg b/assets/edge-runtime-diagram-dark.svg new file mode 100644 index 000000000..e440ef85c --- /dev/null +++ b/assets/edge-runtime-diagram-dark.svg @@ -0,0 +1,4 @@ + + +Incomming http reqSb. Worker IsolateMain WorkerSb extensionsUser Worker/endpoint-AUser Worker/endpoint-B Deno CoreSb. InterceptorsHyperOutcome http resEdge Runtime \ No newline at end of file diff --git a/assets/edge-runtime-diagram.svg b/assets/edge-runtime-diagram.svg index 63cfa5ffe..73a032d70 100644 --- a/assets/edge-runtime-diagram.svg +++ b/assets/edge-runtime-diagram.svg @@ -1,21 +1,4 @@ - - - - - - - - - - - - - - - - - - - - - + + +Incomming http reqSb. Worker IsolateMain WorkerSb extensionsUser Worker/endpoint-AUser Worker/endpoint-B Deno CoreSb. InterceptorsHyperOutcome http resEdge Runtime \ No newline at end of file From 0e66325487bdd98e88af0528423b5b28a5ea066b Mon Sep 17 00:00:00 2001 From: Kalleby Santos <105971119+kallebysantos@users.noreply.github.com> Date: Mon, 26 May 2025 11:28:01 +0100 Subject: [PATCH 10/12] fix: diagram width --- README.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/README.md b/README.md index 41a5bec8f..86ea30b97 100644 --- a/README.md +++ b/README.md @@ -19,7 +19,7 @@ Options** - Sequence diagram of Edge Runtime request flow + Sequence diagram of Edge Runtime request flow

From e9271a3d45641455af1bc66a5c984a7fc9a703d9 Mon Sep 17 00:00:00 2001 From: kallebysantos Date: Mon, 26 May 2025 11:31:47 +0100 Subject: [PATCH 11/12] stamp: add diagram caption --- assets/edge-runtime-diagram-dark.svg | 4 ++-- assets/edge-runtime-diagram.svg | 4 ++-- 2 files changed, 4 insertions(+), 4 deletions(-) diff --git a/assets/edge-runtime-diagram-dark.svg b/assets/edge-runtime-diagram-dark.svg index e440ef85c..f19e8b5d2 100644 --- a/assets/edge-runtime-diagram-dark.svg +++ b/assets/edge-runtime-diagram-dark.svg @@ -1,4 +1,4 @@ -Incomming http reqSb. Worker IsolateMain WorkerSb extensionsUser Worker/endpoint-AUser Worker/endpoint-B Deno CoreSb. InterceptorsHyperOutcome http resEdge Runtime \ No newline at end of file +Incomming http reqSb. Worker IsolateMain WorkerSb extensionsUser Worker/endpoint-AUser Worker/endpoint-B Deno CoreSb. InterceptorsHyperOutcome http resEdge Runtimerusttypescript / javascript \ No newline at end of file diff --git a/assets/edge-runtime-diagram.svg b/assets/edge-runtime-diagram.svg index 73a032d70..bc4036c70 100644 --- a/assets/edge-runtime-diagram.svg +++ b/assets/edge-runtime-diagram.svg @@ -1,4 +1,4 @@ -Incomming http reqSb. Worker IsolateMain WorkerSb extensionsUser Worker/endpoint-AUser Worker/endpoint-B Deno CoreSb. InterceptorsHyperOutcome http resEdge Runtime \ No newline at end of file +Incomming http reqSb. Worker IsolateMain WorkerSb extensionsUser Worker/endpoint-AUser Worker/endpoint-B Deno CoreSb. 
InterceptorsHyperOutcome http resEdge Runtimerusttypescript / javascript \ No newline at end of file From 6b0f3b5954ca5b0fb6f42eb3ba64befddf2e0f6b Mon Sep 17 00:00:00 2001 From: kallebysantos Date: Mon, 18 Aug 2025 12:59:47 +0100 Subject: [PATCH 12/12] stamp: add base runtime diagram --- README.md | 2 +- crates/base/README.md | 14 ++++++++++++++ 2 files changed, 15 insertions(+), 1 deletion(-) diff --git a/README.md b/README.md index 86ea30b97..0304d9f56 100644 --- a/README.md +++ b/README.md @@ -36,7 +36,7 @@ The edge runtime can be divided into two runtimes with different purposes. - User runtime: - An instance for the _user runtime_ is responsible for executing users' code. - Limits are required to be set such as: Memory and Timeouts. - - Has access to environment variables explictly allowed by the main runtime. + - Has access to environment variables explicitly allowed by the main runtime. ### Edge Runtime in Deep diff --git a/crates/base/README.md b/crates/base/README.md index e69de29bb..a1374b042 100644 --- a/crates/base/README.md +++ b/crates/base/README.md @@ -0,0 +1,14 @@ +# Supabase EdgeRuntime base + +This crate is part of the Supabase Edge Runtime stack and implements the runtime +core features. + +## Architecture + +

+<!-- picture: "Sequence diagram of Edge Runtime request flow" (light: assets/edge-runtime-diagram.svg, dark: assets/edge-runtime-diagram-dark.svg) -->