From df3381e364d2ce688114bc682aae7e19f4e71d91 Mon Sep 17 00:00:00 2001 From: burtenshaw Date: Fri, 22 Aug 2025 10:21:40 +0200 Subject: [PATCH 1/9] tutorial using qwen imge edit and inference providers --- .../guides/image-editor.md | 324 ++++++++++++++++++ 1 file changed, 324 insertions(+) create mode 100644 docs/inference-providers/guides/image-editor.md diff --git a/docs/inference-providers/guides/image-editor.md b/docs/inference-providers/guides/image-editor.md new file mode 100644 index 000000000..4c1642bbc --- /dev/null +++ b/docs/inference-providers/guides/image-editor.md @@ -0,0 +1,324 @@ +# Building an AI Image Editor with Gradio and Inference Providers + +In this guide, we'll build an AI-powered image editor that lets users upload images and edit them using natural language prompts. This project demonstrates how to combine Inference Providers with image-to-image models like [Qwen's Image Edit](https://huggingface.co/Qwen/Qwen-Image-Edit). + +Our app will: + +1. **Accept image uploads** through a web interface +2. **Process natural language prompts** editing instructions like "Turn the cat into a tiger" +3. **Transform images** using Qwen Image Edit +4. **Display results** in a Gradio interface + + + +This guide assumes you have a Hugging Face account. If you don't have one, you can create one for free at [huggingface.co](https://huggingface.co). + + + +## Step 1: Set Up Authentication + +Before we start coding, authenticate with Hugging Face using your token: + +```bash +# Get your token from https://huggingface.co/settings/tokens +export HF_TOKEN="your_token_here" +``` + +When you set this environment variable, it handles authentication automatically for all your inference calls. You can generate a token from [your settings page](https://huggingface.co/settings/tokens/new?ownUserPermissions=inference.serverless.write&tokenType=fineGrained). 
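A quick way to fail fast when the token is missing is to read it explicitly before building the client. This is a minimal sketch; the `get_hf_token` helper is our own, not part of `huggingface_hub`:

```python
import os

def get_hf_token() -> str:
    """Read the HF token from the environment, failing fast with a clear message."""
    token = os.environ.get("HF_TOKEN", "").strip()
    if not token:
        raise RuntimeError(
            "HF_TOKEN is not set. Run: export HF_TOKEN=your_token_here"
        )
    return token
```

Passing the result to `InferenceClient(api_key=...)` gives a clearer error up front than a failed API call deep inside the app.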

## Step 2: Project Setup

Create a new project directory and initialize it with uv:

```bash
mkdir image-editor-app
cd image-editor-app
uv init
```

This creates a basic project structure with a `pyproject.toml` file. Now add the required dependencies (the version specifiers are quoted so the shell doesn't treat `>=` as a redirection):

```bash
uv add "huggingface-hub>=0.34.4" "gradio>=5.0.0" "pillow>=11.3.0"
```

The dependencies are now installed and ready to use! Also, `uv` will create a handy `pyproject.toml` file for you to manage your dependencies as a project.

<Tip>

We're using `uv` because it's a fast Python package manager that handles dependency resolution and virtual environment management automatically. It's much faster than pip and provides better dependency resolution. If you're not familiar with `uv`, check it out [here](https://docs.astral.sh/uv/).

</Tip>

## Step 3: Build the Core Image Editing Function

Now let's create the main logic for our application - the image editing function that transforms images using AI.

Create `main.py` then import the necessary libraries and instantiate the InferenceClient. We're using the `fal-ai` provider for fast image processing, but other providers are available.

```python
import os
import gradio as gr
from huggingface_hub import InferenceClient
import io

# Initialize the client with fal-ai provider for fast image processing
client = InferenceClient(
    provider="fal-ai",
    api_key=os.environ["HF_TOKEN"],
)
```

Now let's create the image editing function. This function takes an input image and a prompt, and returns an edited image. We also want to handle errors gracefully and return the original image if there's an error, so our UI always shows something.

```python
def edit_image(input_image, prompt):
    """
    Edit an image using the given prompt.
+ + Args: + input_image: PIL Image object from Gradio + prompt: String prompt for image editing + + Returns: + PIL Image object (edited image) + """ + if input_image is None: + return None + + if not prompt or prompt.strip() == "": + return input_image + + try: + # Convert PIL Image to bytes + img_bytes = io.BytesIO() + input_image.save(img_bytes, format="PNG") + img_bytes = img_bytes.getvalue() + + # Use the image_to_image method with Qwen's image editing model + edited_image = client.image_to_image( + img_bytes, + prompt=prompt.strip(), + model="Qwen/Qwen-Image-Edit", + ) + + return edited_image + + except Exception as e: + print(f"Error editing image: {e}") + return input_image +``` + + + +We're using the `fal-ai` provider with the `Qwen/Qwen-Image-Edit` model. The fal-ai provider offers fast inference times, perfect for interactive applications. In some use cases, you might want to switch between providers for maximum performance. Whilst in others you might want to go for the consistency of a single provider. + +You can experiment with different providers for various performance characteristics: + +```python +client = InferenceClient(provider="replicate", api_key=os.environ["HF_TOKEN"]) +client = InferenceClient(provider="auto", api_key=os.environ["HF_TOKEN"]) # Automatic selection +``` + + + +## Step 4: Create the Gradio Interface + +Now let's build a simple user-friendly interface using Gradio. + +```python +# Create the Gradio interface +with gr.Blocks(title="Image Editor", theme=gr.themes.Soft()) as interface: + gr.Markdown( + """ + # 🎨 AI Image Editor + Upload an image and describe how you want to edit it using natural language! 
+ """ + ) + + with gr.Row(): + with gr.Column(): + input_image = gr.Image(label="Upload Image", type="pil", height=400) + prompt = gr.Textbox( + label="Edit Prompt", + placeholder="Describe how you want to edit the image...", + lines=2, + ) + edit_btn = gr.Button("✨ Edit Image", variant="primary", size="lg") + + with gr.Column(): + output_image = gr.Image(label="Edited Image", type="pil", height=400) + + # Example images and prompts + with gr.Row(): + gr.Examples( + examples=[ + ["cat.png", "Turn the cat into a tiger"], + ["cat.png", "Make it look like a watercolor painting"], + ["cat.png", "Change the background to a forest"], + ], + inputs=[input_image, prompt], + outputs=output_image, + fn=edit_image, + cache_examples=False, + ) + + # Event handlers + edit_btn.click(fn=edit_image, inputs=[input_image, prompt], outputs=output_image) + + # Allow Enter key to trigger editing + prompt.submit(fn=edit_image, inputs=[input_image, prompt], outputs=output_image) +``` + +In this app we'll use some practical Gradio features to make a user-friendly app + +- We'll use blocks to create a two column layout with the image upload and the edited image. +- We'll drop some markdown into to explain what the app does. +- And, we'll use `gr.Examples` to show some example inputs to give the user some inspiration. + +Finally, add the launch configuration at the end of `main.py`: + +```python +if __name__ == "__main__": + interface.launch( + share=True, # Creates a public link + server_name="0.0.0.0", # Allow external access + server_port=7860, # Default Gradio port + show_error=True, # Show errors in the interface + ) +``` + +Now run your application: + +```bash +python main.py +``` + +Your app will launch locally at `http://localhost:7860` and Gradio will also provide a public shareable link! + + +## Complete Working Code + +
+📋 Click to view the complete main.py file + +```python +import os +import gradio as gr +from huggingface_hub import InferenceClient +from PIL import Image +import io + +# Initialize the client +client = InferenceClient( + provider="fal-ai", + api_key=os.environ["HF_TOKEN"], +) + +def edit_image(input_image, prompt): + """ + Edit an image using the given prompt. + + Args: + input_image: PIL Image object from Gradio + prompt: String prompt for image editing + + Returns: + PIL Image object (edited image) + """ + if input_image is None: + return None + + if not prompt or prompt.strip() == "": + return input_image + + try: + # Convert PIL Image to bytes + img_bytes = io.BytesIO() + input_image.save(img_bytes, format="PNG") + img_bytes = img_bytes.getvalue() + + # Use the image_to_image method + edited_image = client.image_to_image( + img_bytes, + prompt=prompt.strip(), + model="Qwen/Qwen-Image-Edit", + ) + + return edited_image + + except Exception as e: + print(f"Error editing image: {e}") + return input_image + +# Create Gradio interface +with gr.Blocks(title="Image Editor", theme=gr.themes.Soft()) as interface: + gr.Markdown( + """ + # 🎨 AI Image Editor + Upload an image and describe how you want to edit it using natural language! 
+ """ + ) + + with gr.Row(): + with gr.Column(): + input_image = gr.Image(label="Upload Image", type="pil", height=400) + prompt = gr.Textbox( + label="Edit Prompt", + placeholder="Describe how you want to edit the image...", + lines=2, + ) + edit_btn = gr.Button("✨ Edit Image", variant="primary", size="lg") + + with gr.Column(): + output_image = gr.Image(label="Edited Image", type="pil", height=400) + + # Example images and prompts + with gr.Row(): + gr.Examples( + examples=[ + ["cat.png", "Turn the cat into a tiger"], + ["cat.png", "Make it look like a watercolor painting"], + ["cat.png", "Change the background to a forest"], + ], + inputs=[input_image, prompt], + outputs=output_image, + fn=edit_image, + cache_examples=False, + ) + + # Event handlers + edit_btn.click(fn=edit_image, inputs=[input_image, prompt], outputs=output_image) + + # Allow Enter key to trigger editing + prompt.submit(fn=edit_image, inputs=[input_image, prompt], outputs=output_image) + +if __name__ == "__main__": + interface.launch( + share=True, # Creates a public link + server_name="0.0.0.0", # Allow external access + server_port=7860, # Default Gradio port + show_error=True, # Show errors in the interface + ) +``` + +

## Deploy on Hugging Face Spaces

1. **Create a new Space**: Go to [huggingface.co/new-space](https://huggingface.co/new-space)
2. **Choose Gradio SDK** and make it public
3. **Upload your files**: Upload `main.py` and any example images
4. **Add your token**: In Space settings, add `HF_TOKEN` as a secret
5. **Launch**: Your app will be live at `https://huggingface.co/spaces/your-username/your-space-name`


## Next Steps

Congratulations! You've created a production-ready AI image editor. Now that you have a working image editor, here are some ideas to extend it:

- **Batch processing**: Edit multiple images at once
- **Object removal**: Remove unwanted objects from images
- **Provider comparison**: Benchmark different providers for your use case

Happy building! And remember to share your app with the community on the hub.

From 96230bbe636ddda59b4cdc4b3bc7f446a7614bf5 Mon Sep 17 00:00:00 2001
From: burtenshaw
Date: Fri, 22 Aug 2025 10:25:28 +0200
Subject: [PATCH 2/9] add to toc

---
 docs/inference-providers/_toctree.yml | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/docs/inference-providers/_toctree.yml b/docs/inference-providers/_toctree.yml
index 23a7c71f7..3feaac75f 100644
--- a/docs/inference-providers/_toctree.yml
+++ b/docs/inference-providers/_toctree.yml
@@ -54,6 +54,8 @@
       title: Function Calling
     - local: guides/gpt-oss
       title: How to use OpenAI gpt-oss
+    - local: guides/image-editor
+      title: Build an Image Editor
   - title: API Reference

From 93962d90082cea5b9396af14b0983a1c929d1257 Mon Sep 17 00:00:00 2001
From: burtenshaw
Date: Fri, 22 Aug 2025 11:59:39 +0200
Subject: [PATCH 3/9] add flux kontext as an example

---
 docs/inference-providers/guides/image-editor.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/inference-providers/guides/image-editor.md b/docs/inference-providers/guides/image-editor.md
index 4c1642bbc..646ff204e 100644
--- a/docs/inference-providers/guides/image-editor.md
+++
b/docs/inference-providers/guides/image-editor.md @@ -1,6 +1,6 @@ # Building an AI Image Editor with Gradio and Inference Providers -In this guide, we'll build an AI-powered image editor that lets users upload images and edit them using natural language prompts. This project demonstrates how to combine Inference Providers with image-to-image models like [Qwen's Image Edit](https://huggingface.co/Qwen/Qwen-Image-Edit). +In this guide, we'll build an AI-powered image editor that lets users upload images and edit them using natural language prompts. This project demonstrates how to combine Inference Providers with image-to-image models like [Qwen's Image Edit](https://huggingface.co/Qwen/Qwen-Image-Edit) and [Black Forest Labs' Flux Kontext](https://huggingface.co/black-forest-labs/FLUX.1-Kontext-dev). Our app will: From 7f78e08c93d9d94efc1c6ca313511c97185aeb09 Mon Sep 17 00:00:00 2001 From: burtenshaw Date: Mon, 25 Aug 2025 13:49:31 +0200 Subject: [PATCH 4/9] Update docs/inference-providers/guides/image-editor.md Co-authored-by: Pedro Cuenca --- docs/inference-providers/guides/image-editor.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/inference-providers/guides/image-editor.md b/docs/inference-providers/guides/image-editor.md index 646ff204e..0ea26121a 100644 --- a/docs/inference-providers/guides/image-editor.md +++ b/docs/inference-providers/guides/image-editor.md @@ -6,7 +6,7 @@ Our app will: 1. **Accept image uploads** through a web interface 2. **Process natural language prompts** editing instructions like "Turn the cat into a tiger" -3. **Transform images** using Qwen Image Edit +3. **Transform images** using Qwen Image Edit or FLUX.1 Kontext 4. 
**Display results** in a Gradio interface From 5e3d610097f8f2a16ba298b95394b1852ff0028c Mon Sep 17 00:00:00 2001 From: burtenshaw Date: Mon, 25 Aug 2025 13:54:02 +0200 Subject: [PATCH 5/9] add requirements txt --- .../guides/image-editor.md | 20 ++++++++++++++++++- 1 file changed, 19 insertions(+), 1 deletion(-) diff --git a/docs/inference-providers/guides/image-editor.md b/docs/inference-providers/guides/image-editor.md index 0ea26121a..80f968f03 100644 --- a/docs/inference-providers/guides/image-editor.md +++ b/docs/inference-providers/guides/image-editor.md @@ -306,9 +306,27 @@ if __name__ == "__main__": ## Deploy on Hugging Face Spaces +Let's deploy our app to Hugging Face Spaces. + +First, we will export our dependencies to a requirements file. + +```bash +uv export --format requirements-txt --output-file requirements.txt +``` + +This creates a `requirements.txt` file with all your project dependencies and their exact versions from the lockfile. + + + +The `uv export` command ensures that your Space will use the exact same dependency versions that you tested locally, preventing deployment issues caused by version mismatches. + + + +Now you can deploy to Spaces: + 1. **Create a new Space**: Go to [huggingface.co/new-space](https://huggingface.co/new-space) 2. **Choose Gradio SDK** and make it public -3. **Upload your files**: Upload `main.py` and any example images +3. **Upload your files**: Upload `main.py`, `requirements.txt`, and any example images 4. **Add your token**: In Space settings, add `HF_TOKEN` as a secret 5. 
**Launch**: Your app will be live at `https://huggingface.co/spaces/your-username/your-space-name` From dd4a0b64ee1c4369ceca11d9973345e679119948 Mon Sep 17 00:00:00 2001 From: burtenshaw Date: Mon, 25 Aug 2025 13:54:48 +0200 Subject: [PATCH 6/9] Apply suggestions from code review Co-authored-by: Pedro Cuenca --- docs/inference-providers/guides/image-editor.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/docs/inference-providers/guides/image-editor.md b/docs/inference-providers/guides/image-editor.md index 80f968f03..b6c398ded 100644 --- a/docs/inference-providers/guides/image-editor.md +++ b/docs/inference-providers/guides/image-editor.md @@ -42,7 +42,7 @@ This creates a basic project structure with a `pyproject.toml` file. Now add the uv add huggingface-hub>=0.34.4 gradio>=5.0.0 pillow>=11.3.0 ``` -The dependencies are now installed and ready to use! Also, `uv` will create a handy `pyproject.toml` file for you to manage your dependencies as a project. +The dependencies are now installed and ready to use! Also, `uv` will maintain the `pyproject.toml` file for you as you add dependencies. @@ -339,4 +339,4 @@ Congratulations! You've created a production-ready AI image editor. Now that you - **Object removal**: Remove unwanted objects from images - **Provider comparison**: Benchmark different providers for your use case -Happy building! And remember to share your app with the community on the hub. +Happy building! And remember to share your app with the community on the Hub. 
From 0d2a5d04466d25f34aca8b0cb48b6659dd90a605 Mon Sep 17 00:00:00 2001 From: burtenshaw Date: Tue, 26 Aug 2025 14:26:17 +0200 Subject: [PATCH 7/9] Apply suggestions from code review Co-authored-by: vb --- docs/inference-providers/guides/image-editor.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/docs/inference-providers/guides/image-editor.md b/docs/inference-providers/guides/image-editor.md index b6c398ded..c885764df 100644 --- a/docs/inference-providers/guides/image-editor.md +++ b/docs/inference-providers/guides/image-editor.md @@ -54,7 +54,7 @@ We're using `uv` because it's a fast Python package manager that handles depende Now let's create the main logic for our application - the image editing function that transforms images using AI. -Create `main.py` then import the necessary libraries and instantiate the InferenceClient. We're using the `fal-ai` provider for fast image processing, but other providers are available. +Create `main.py` then import the necessary libraries and instantiate the InferenceClient. We're using the `fal-ai` provider for fast image processing, but other providers like `replicate` are also available. ```python import os @@ -111,9 +111,9 @@ def edit_image(input_image, prompt): -We're using the `fal-ai` provider with the `Qwen/Qwen-Image-Edit` model. The fal-ai provider offers fast inference times, perfect for interactive applications. In some use cases, you might want to switch between providers for maximum performance. Whilst in others you might want to go for the consistency of a single provider. +We're using the `fal-ai` provider with the `Qwen/Qwen-Image-Edit` model. The fal-ai provider offers fast inference times, perfect for interactive applications. 
-You can experiment with different providers for various performance characteristics: +However, you can experiment with different providers for various performance characteristics: ```python client = InferenceClient(provider="replicate", api_key=os.environ["HF_TOKEN"]) From fe522044cfa15c1895582889a42fc7ed8a007263 Mon Sep 17 00:00:00 2001 From: burtenshaw Date: Tue, 26 Aug 2025 19:54:47 +0200 Subject: [PATCH 8/9] add tldr --- docs/inference-providers/guides/image-editor.md | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-) diff --git a/docs/inference-providers/guides/image-editor.md b/docs/inference-providers/guides/image-editor.md index b6c398ded..bf797dbf9 100644 --- a/docs/inference-providers/guides/image-editor.md +++ b/docs/inference-providers/guides/image-editor.md @@ -11,7 +11,7 @@ Our app will: -This guide assumes you have a Hugging Face account. If you don't have one, you can create one for free at [huggingface.co](https://huggingface.co). +TLDR; this guide will show you how to build an AI image editor with Gradio and Inference Providers, just like [this one](https://huggingface.co/spaces/Qwen/Qwen-Image-Edit). @@ -24,6 +24,12 @@ Before we start coding, authenticate with Hugging Face using your token: export HF_TOKEN="your_token_here" ``` + + +This guide assumes you have a Hugging Face account. If you don't have one, you can create one for free at [huggingface.co](https://huggingface.co). + + + When you set this environment variable, it handles authentication automatically for all your inference calls. You can generate a token from [your settings page](https://huggingface.co/settings/tokens/new?ownUserPermissions=inference.serverless.write&tokenType=fineGrained). 
## Step 2: Project Setup From e02582583e08f0ce46b6e6844dffb9f4561c10f4 Mon Sep 17 00:00:00 2001 From: burtenshaw Date: Tue, 26 Aug 2025 19:55:11 +0200 Subject: [PATCH 9/9] format tldr --- docs/inference-providers/guides/image-editor.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/inference-providers/guides/image-editor.md b/docs/inference-providers/guides/image-editor.md index 16e6d8357..e9a919a97 100644 --- a/docs/inference-providers/guides/image-editor.md +++ b/docs/inference-providers/guides/image-editor.md @@ -11,7 +11,7 @@ Our app will: -TLDR; this guide will show you how to build an AI image editor with Gradio and Inference Providers, just like [this one](https://huggingface.co/spaces/Qwen/Qwen-Image-Edit). +TL;DR - this guide will show you how to build an AI image editor with Gradio and Inference Providers, just like [this one](https://huggingface.co/spaces/Qwen/Qwen-Image-Edit).