5 changes: 5 additions & 0 deletions .ci/spellcheck/.pyspelling.wordlist.txt
@@ -42,6 +42,7 @@ arxiv
ASPP
ASR
asr
ASRModel
ASYM
async
AsyncInferQueue
@@ -158,6 +159,7 @@ coors
coreference
CoSENT
CosyVoice
Conv
cpm
cpp
cpu
@@ -316,9 +318,11 @@ finetuned
finetuning
FireRedTTS
FLAC
flac
FLD
floyd
fn
ForcedAligner
foley
Formatter
formatter
@@ -1162,6 +1166,7 @@ Vladlen
vlm
VLM
VLModel
vLLM
VLMPipeline
VLMs
VL’s
40 changes: 40 additions & 0 deletions notebooks/qwen3-asr/README.md
@@ -0,0 +1,40 @@
# Qwen3-ASR Speech Recognition with OpenVINO™

The Qwen3-ASR family includes Qwen3-ASR-1.7B and Qwen3-ASR-0.6B, which support language identification and ASR for 52 languages and dialects. Both leverage large-scale speech training data and the strong audio understanding capability of their foundation model, Qwen3-Omni. Experiments show that the 1.7B version achieves state-of-the-art performance among open-source ASR models and is competitive with the strongest proprietary commercial APIs. Here are the main features:

* **All-in-one**: Qwen3-ASR-1.7B and Qwen3-ASR-0.6B support language identification and speech recognition for 30 languages and 22 Chinese dialects, as well as English accents from multiple countries and regions.

* **Excellent and Fast**: The Qwen3-ASR family of ASR models maintains high-quality, robust recognition under complex acoustic environments and challenging text patterns. Qwen3-ASR-1.7B achieves strong performance on both open-source and internal benchmarks, while the 0.6B version offers a favorable accuracy-efficiency trade-off, reaching 2000x throughput at a concurrency of 128. Both models unify streaming and offline inference in a single model and support transcribing long audio.

* **Novel and strong forced-alignment solution**: We introduce Qwen3-ForcedAligner-0.6B, which supports timestamp prediction for arbitrary units within up to 5 minutes of speech in 11 languages. Evaluations show its timestamp accuracy surpasses that of E2E-based forced-alignment models.

* **Comprehensive inference toolkit**: In addition to open-sourcing the architectures and weights of the Qwen3-ASR series, we also release a powerful, full-featured inference framework that supports vLLM-based batch inference, asynchronous serving, streaming inference, timestamp prediction, and more.

<p align="center">
<img src="https://qianwen-res.oss-cn-beijing.aliyuncs.com/Qwen3-ASR-Repo/overview.jpg" width="100%"/>
</p>

More details can be found in the original [repository](https://github.com/QwenLM/Qwen3-ASR) and [model card](https://huggingface.co/Qwen/Qwen3-ASR-0.6B).

### Notebook Contents

In this tutorial, we consider how to run and optimize Qwen3-ASR using OpenVINO.

The tutorial consists of the following steps:

- Install prerequisites
- Convert model to OpenVINO intermediate representation (IR) format
- Prepare OpenVINO Inference pipeline
- Run Speech Recognition
- Launch interactive demo
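
The conversion and inference steps above follow the standard OpenVINO workflow: convert a PyTorch (sub)module to IR with `ov.convert_model`, save it, compile it for a target device, and run inference. Below is a minimal, generic sketch of that flow; the placeholder module, input shape, and file name are assumptions for illustration only and do not reproduce the notebook's actual export of the Qwen3-ASR submodels.

```python
# Minimal, generic OpenVINO conversion/inference sketch (not the notebook's exact code).
import openvino as ov
import torch

core = ov.Core()

# Stand-in for one of the model's submodules (e.g. the audio encoder).
placeholder_encoder = torch.nn.Conv1d(128, 256, kernel_size=3, padding=1)
example_input = torch.randn(1, 128, 3000)  # placeholder mel-spectrogram features

# Convert the PyTorch module to OpenVINO IR and save it to disk.
ov_model = ov.convert_model(placeholder_encoder, example_input=example_input)
ov.save_model(ov_model, "audio_encoder.xml")

# Compile the IR for a target device (e.g. "CPU", "GPU") and run inference.
compiled = core.compile_model("audio_encoder.xml", "CPU")
output = compiled(example_input.numpy())[0]
print(output.shape)  # -> (1, 256, 3000)
```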

## Installation Instructions

This is a self-contained example that relies solely on its own code.<br/>
We recommend running the notebook in a virtual environment. You only need a Jupyter server to start.
For further details, please refer to [Installation Guide](../../README.md).

⚠️ **EXPERIMENTAL NOTEBOOK**

This notebook demonstrates a model that has not been fully validated with OpenVINO. It may be fully supported and validated in the future.
<img referrerpolicy="no-referrer-when-downgrade" src="https://static.scarf.sh/a.png?x-pxid=5b5a4db0-7875-4bfb-bdbd-01698b5b1a77&file=notebooks/qwen3-asr/README.md" />