
Commit 2ef06c2

Installation guide for installing release branches (#637)
Added installation guide for installing release branches
---------
Signed-off-by: Rishin Raj <[email protected]>
Signed-off-by: Abukhoyer Shaik <[email protected]>
Co-authored-by: Abukhoyer Shaik <[email protected]>
1 parent e4c8878 commit 2ef06c2

File tree: 2 files changed, +27 −1 lines changed


README.md

Lines changed: 5 additions & 1 deletion
```diff
@@ -93,9 +93,13 @@ python3.10 -m venv qeff_env
 source qeff_env/bin/activate
 pip install -U pip
 
-# Clone and Install the QEfficient Repo.
+# Clone and Install the QEfficient repository from the mainline branch
 pip install git+https://github.com/quic/efficient-transformers
 
+# Clone and Install the QEfficient repository from a specific branch, tag or commit by appending @ref
+# Release branch (e.g., release/v1.20.0):
+pip install "git+https://github.com/quic/efficient-transformers@release/v1.20.0"
+
 # Or build wheel package using the below command.
 pip install build wheel
 python -m build --wheel --outdir dist
```
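The `@ref` suffix used in the pip URLs of this diff is plain string concatenation onto the base repository URL, and the same form works for branches, tags, and commit SHAs. A minimal sketch of building such requirement strings (the tag `v1.20.0` is a hypothetical example; `2ef06c2` is this commit's SHA):

```python
BASE = "git+https://github.com/quic/efficient-transformers"

# Appending "@<ref>" pins pip to a branch, tag, or commit.
# The tag below is hypothetical; the SHA is this commit's.
refs = ["release/v1.20.0", "v1.20.0", "2ef06c2"]
requirements = [f"{BASE}@{ref}" for ref in refs]

for req in requirements:
    print(req)
```

Each printed line is a valid argument to `pip install`, e.g. `pip install "git+https://github.com/quic/efficient-transformers@release/v1.20.0"`.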

docs/source/quick_start.md

Lines changed: 22 additions & 0 deletions
````diff
@@ -221,4 +221,26 @@ Benchmark the model on Cloud AI 100, run the infer API to print tokens and tok/s
 tokenizer = AutoTokenizer.from_pretrained(model_name)
 qeff_model.generate(prompts=["My name is"],tokenizer=tokenizer)
 ```
+
+### Local Model Execution
+If the model and tokenizer are already downloaded, we can directly load them from local path.
+
+```python
+from QEfficient import QEFFAutoModelForCausalLM
+from transformers import AutoTokenizer
+
+# Local path to the downloaded model. You can find downloaded HF models in:
+# - Default location: ~/.cache/huggingface/hub/models--{model_name}/snapshots/{snapshot_id}/
+local_model_repo = "~/.cache/huggingface/hub/models--gpt2/snapshots/607a30d783dfa663caf39e06633721c8d4cfcd7e"
+
+# Load model from local path
+model = QEFFAutoModelForCausalLM.from_pretrained(pretrained_model_name_or_path=local_model_repo)
+
+model.compile(num_cores=16)
+
+# Load tokenizer from the same local path
+tokenizer = AutoTokenizer.from_pretrained(pretrained_model_name_or_path=local_model_repo)
+
+model.generate(prompts=["Hi there!!"], tokenizer=tokenizer)
+```
 End to End demo examples for various models are available in [**notebooks**](https://github.com/quic/efficient-transformers/tree/main/notebooks) directory. Please check them out.
````
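One caveat with the local-path snippet added in this diff: a path beginning with `~` is shell shorthand, and `from_pretrained` may not expand it, depending on the library version. Expanding it in Python first is the safer pattern. A minimal sketch (the snapshot path is the hypothetical example from the diff):

```python
import os

# Hypothetical snapshot path copied from the diff above; "~" is shell
# shorthand that from_pretrained may not expand, so expand it explicitly.
local_model_repo = os.path.expanduser(
    "~/.cache/huggingface/hub/models--gpt2/snapshots/607a30d783dfa663caf39e06633721c8d4cfcd7e"
)

# The expanded path is absolute and no longer starts with "~".
print(local_model_repo)
```

The expanded path can then be passed to `QEFFAutoModelForCausalLM.from_pretrained` and `AutoTokenizer.from_pretrained` exactly as shown in the diff.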
