This repository was archived by the owner on Jun 3, 2025. It is now read-only.

Commit 8a0b464

Edit root readme (#304)
* altered emoji and title font sizes to match other readmes
* fixed yaml code block indentation
* aligned indentation a second time
* fixed yaml indentation
* edited tables to sync with docs, added urls for new readmes, and edited grammar
* removed border
* fixed resources section
* altered urls to tasks in the nlp inference section
* edited grammar and URL issues
* edited grammar
* updated squad model stubs
1 parent f6d27d5 commit 8a0b464

File tree

1 file changed: +34 -41 lines changed

README.md

Lines changed: 34 additions & 41 deletions
@@ -76,25 +76,25 @@ pip install deepsparse
 
 ## 🔌 DeepSparse Server
 
-The DeepSparse Server allows you to serve models and pipelines in deployment in CLI. The server runs on top of the popular FastAPI web framework and Uvicorn web server. Install the server using the following command:
+The DeepSparse Server allows you to serve models and pipelines from the terminal. The server runs on top of the popular FastAPI web framework and Uvicorn web server. Install the server using the following command:
 
 ```bash
 pip install deepsparse[server]
 ```
 
-** Single Model**
+### Single Model
 
 Once installed, the following example CLI command is available for running inference with a single BERT model:
 
 ```bash
 deepsparse.server \
     --task question_answering \
-    --model_path "zoo:nlp/question_answering/bert-base/pytorch/huggingface/squad/base-none"
+    --model_path "zoo:nlp/question_answering/bert-base/pytorch/huggingface/squad/12layer_pruned80_quant-none-vnni"
 ```
 
 To look up arguments run: `deepsparse.server --help`.
 
-** Multiple Models**
+### Multiple Models
 To serve multiple models in your deployment you can easily build a `config.yaml`. In the example below, we define two BERT models in our configuration for the question answering task:
 
 ```yaml
@@ -104,7 +104,7 @@ models:
     batch_size: 1
     alias: question_answering/base
   - task: question_answering
-    model_path: zoo:nlp/question_answering/bert-base/pytorch/huggingface/squad/pruned_quant-aggressive_95
+    model_path: zoo:nlp/question_answering/bert-base/pytorch/huggingface/squad/12layer_pruned80_quant-none-vnni
     batch_size: 1
     alias: question_answering/pruned_quant
 ```
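Only part of the configuration is visible in this hunk. For orientation, a full two-model `config.yaml` would look roughly like the sketch below; the first model's stub line falls outside the hunk, so the `base-none` stub here is an assumption:

```yaml
models:
  - task: question_answering
    # Assumed stub; the first model's line is not visible in this hunk
    model_path: zoo:nlp/question_answering/bert-base/pytorch/huggingface/squad/base-none
    batch_size: 1
    alias: question_answering/base
  - task: question_answering
    model_path: zoo:nlp/question_answering/bert-base/pytorch/huggingface/squad/12layer_pruned80_quant-none-vnni
    batch_size: 1
    alias: question_answering/pruned_quant
```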
@@ -113,6 +113,9 @@ Finally, after your `config.yaml` file is built, run the server with the config
 ```bash
 deepsparse.server --config_file config.yaml
 ```
+
+See [Getting Started with the DeepSparse Server](https://github.com/neuralmagic/deepsparse/tree/main/src/deepsparse/server) for more info.
+
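Once running, the server accepts HTTP requests. A minimal client sketch, assuming the default port of 5543 and a Hugging Face-style JSON body (both are assumptions; the linked guide documents the exact routes):

```python
import requests

# Hypothetical route and port; verify against the server's startup logs
url = "http://localhost:5543/predict"

response = requests.post(
    url,
    json={"question": "What's my name?", "context": "My name is Snorlax"},
)
print(response.json())
```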
 ## 📜 DeepSparse Benchmark
 
 The benchmark tool is available on your CLI to run expressive model benchmarks on the DeepSparse Engine with minimal parameters.
@@ -124,27 +127,26 @@ deepsparse.benchmark [-h] [-b BATCH_SIZE] [-shapes INPUT_SHAPES]
                     [-ncores NUM_CORES] [-s {async,sync}] [-t TIME]
                     [-nstreams NUM_STREAMS] [-pin {none,core,numa}]
                     [-q] [-x EXPORT_PATH]
-model_path
+                    model_path
 
 ```
 
 [Getting Started with CLI Benchmarking](https://github.com/neuralmagic/deepsparse/tree/main/src/deepsparse/benchmark_model) includes examples of select inference scenarios:
 - Synchronous (Single-stream) Scenario
 - Asynchronous (Multi-stream) Scenario
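As a concrete illustration, a short synchronous run against the SQuAD stub used earlier in this README, with flags taken from the usage string above, might look like:

```bash
deepsparse.benchmark \
    "zoo:nlp/question_answering/bert-base/pytorch/huggingface/squad/12layer_pruned80_quant-none-vnni" \
    -b 1 -s sync -t 10
```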
-__ __
 
-## 👩‍💻 NLP Inference | Question Answering
+
+## 👩‍💻 NLP Inference Example
 
 ```python
 from deepsparse.transformers import pipeline
 
 # SparseZoo model stub or path to ONNX file
-onnx_filepath="zoo:nlp/question_answering/bert-base/pytorch/huggingface/squad/pruned-aggressive_98"
+model_path = "zoo:nlp/question_answering/bert-base/pytorch/huggingface/squad/12layer_pruned80_quant-none-vnni"
 
 qa_pipeline = pipeline(
     task="question-answering",
-    model_path=onnx_filepath,
-    num_cores=None, # uses all available CPU cores by default
+    model_path=model_path,
 )
 
 my_name = qa_pipeline(question="What's my name?", context="My name is Snorlax")
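The deepsparse pipeline mirrors Hugging Face's question-answering interface, so (an assumption about this version) the result can be read out as a dict:

```python
# Assuming Hugging Face-style QA output: a dict with "answer" and "score"
print(my_name["answer"])  # expected: "Snorlax"
```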
@@ -154,20 +156,19 @@ NLP Tutorials:
 - [Getting Started with Hugging Face Transformers 🤗](https://github.com/neuralmagic/deepsparse/tree/main/examples/huggingface-transformers)
 
 Tasks Supported:
-- Text Classification (Sentiment Analysis)
-- Question Answering
-- Masked Language Modeling (MLM)
-
-__ __
+- [Token Classification: Named Entity Recognition](https://neuralmagic.com/use-cases/sparse-named-entity-recognition/)
+- [Text Classification: Multi-Class](https://neuralmagic.com/use-cases/sparse-multi-class-text-classification/)
+- [Text Classification: Binary](https://neuralmagic.com/use-cases/sparse-binary-text-classification/)
+- [Text Classification: Sentiment Analysis](https://neuralmagic.com/use-cases/sparse-sentiment-analysis/)
+- [Question Answering](https://neuralmagic.com/use-cases/sparse-question-answering/)
 
 ## 🦉 SparseZoo ONNX vs. Custom ONNX Models
 
 DeepSparse can accept ONNX models from two sources:
 
-1. `SparseZoo ONNX`: our open-source collection of sparse models available for download. [SparseZoo](https://github.com/neuralmagic/sparsezoo) hosts inference-optimized models, trained on repeatable sparsification recipes using state-of-the-art techniques from [SparseML.](https://github.com/neuralmagic/sparseml)
-
-2. `Custom ONNX`: Your own ONNX model, can be dense or sparse. Plug in your model to compare performance with other solutions.
+- **SparseZoo ONNX**: our open-source collection of sparse models available for download. [SparseZoo](https://github.com/neuralmagic/sparsezoo) hosts inference-optimized models, trained on repeatable sparsification recipes using state-of-the-art techniques from [SparseML](https://github.com/neuralmagic/sparseml).
 
+- **Custom ONNX**: your own ONNX model, can be dense or sparse. Plug in your model to compare performance with other solutions.
 
 ```bash
 > wget https://github.com/onnx/models/raw/main/vision/classification/mobilenet/model/mobilenetv2-7.onnx
@@ -188,15 +189,13 @@ inputs = generate_random_inputs(onnx_filepath, batch_size)
 engine = compile_model(onnx_filepath, batch_size)
 outputs = engine.run(inputs)
 ```
-Compatibility/Support Notes
+Compatibility/Support Notes:
 - ONNX version 1.5-1.7
 - ONNX opset version 11+
 - ONNX IR version has not been tested at this time
 
 The [GitHub repository](https://github.com/neuralmagic/deepsparse) includes package APIs along with examples to quickly get started benchmarking and inferencing sparse models.
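The hunk above shows only the tail of the engine example. For context, the full snippet in the README at this commit reads roughly as follows (import paths per the DeepSparse docs of this era; `mobilenetv2-7.onnx` is the file fetched by the `wget` command above):

```python
from deepsparse import compile_model
from deepsparse.utils import generate_random_inputs

onnx_filepath = "mobilenetv2-7.onnx"
batch_size = 16

# Generate random sample inputs matching the model's input shapes
inputs = generate_random_inputs(onnx_filepath, batch_size)

# Compile for the local CPU and run inference
engine = compile_model(onnx_filepath, batch_size)
outputs = engine.run(inputs)
```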
 
-__ __
-
 ## Scheduling Single-Stream, Multi-Stream, and Elastic Inference
 
 The DeepSparse Engine offers up to three types of inferences based on your use case. Read more details here: [Inference Types](https://github.com/neuralmagic/deepsparse/blob/main/docs/source/scheduler.md).
@@ -216,7 +215,6 @@ PRO TIP: The most common use cases for the multi-stream scheduler are where para
 3 ⚡ Elastic scheduling: requests execute in parallel, but not multiplexed on individual NUMA nodes.
 
 Use Case: A workload that might benefit from the elastic scheduler is one in which multiple requests need to be handled simultaneously, but where performance is hindered when those requests have to share an L3 cache.
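A scheduler is chosen at compile time. A minimal sketch, assuming the `Scheduler` enum described in the linked Inference Types doc (exact names are an assumption):

```python
from deepsparse import Scheduler, compile_model

# Sketch: compile with the multi-stream scheduler for concurrent requests
engine = compile_model(
    "mobilenetv2-7.onnx",
    batch_size=1,
    scheduler=Scheduler.multi_stream,
)
```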
-__ __
 
 ## 🧰 CPU Hardware Support
 
@@ -233,34 +231,29 @@ Here is a table detailing specific support for some algorithms over different mi
 
 ## Resources
 
-<table>
-<tr><th> Documentation </th><th> &emsp;&emsp;&emsp;Versions </th><th> Info </th></tr>
-<tr><td>
-
-[DeepSparse](https://docs.neuralmagic.com/deepsparse/)
-
-[SparseML](https://docs.neuralmagic.com/sparseml/)
 
-[SparseZoo](https://docs.neuralmagic.com/sparsezoo/)
+#### Libraries
+- [DeepSparse](https://docs.neuralmagic.com/deepsparse/)
 
-[Sparsify](https://docs.neuralmagic.com/sparsify/)
+- [SparseML](https://docs.neuralmagic.com/sparseml/)
 
-</td><td>
+- [SparseZoo](https://docs.neuralmagic.com/sparsezoo/)
 
-&emsp;stable : : [DeepSparse](https://pypi.org/project/deepsparse)
+- [Sparsify](https://docs.neuralmagic.com/sparsify/)
 
-&emsp;nightly (dev) : : [DeepSparse-Nightly](https://pypi.org/project/deepsparse-nightly/)
 
-&emsp;releases : : [GitHub](https://github.com/neuralmagic/deepsparse/releases)
+#### Versions
+- [DeepSparse](https://pypi.org/project/deepsparse) | stable
 
-</td><td>
+- [DeepSparse-Nightly](https://pypi.org/project/deepsparse-nightly/) | nightly (dev)
 
-[Blog](https://www.neuralmagic.com/blog/)
+- [GitHub](https://github.com/neuralmagic/deepsparse/releases) | releases
 
-[Resources](https://www.neuralmagic.com/resources/)
+#### Info
 
-</td></tr> </table>
+- [Blog](https://www.neuralmagic.com/blog/)
 
+- [Resources](https://www.neuralmagic.com/resources/)
 
 
 ## Community
@@ -270,7 +263,7 @@ Here is a table detailing specific support for some algorithms over different mi
 
 Contribute with code, examples, integrations, and documentation as well as bug reports and feature requests! [Learn how here.](https://github.com/neuralmagic/deepsparse/blob/main/CONTRIBUTING.md)
 
-For user help or questions about DeepSparse, sign up or log in to our [**Deep Sparse Community Slack**](https://join.slack.com/t/discuss-neuralmagic/shared_invite/zt-q1a1cnvo-YBoICSIw3L1dmQpjBeDurQ). We are growing the community member by member and happy to see you there. Bugs, feature requests, or additional questions can also be posted to our [GitHub Issue Queue.](https://github.com/neuralmagic/deepsparse/issues) You can get the latest news, webinar and event invites, research papers, and other ML Performance tidbits by [subscribing](https://neuralmagic.com/subscribe/) to the Neural Magic community.
+For user help or questions about DeepSparse, sign up or log in to our **[Deep Sparse Community Slack](https://join.slack.com/t/discuss-neuralmagic/shared_invite/zt-q1a1cnvo-YBoICSIw3L1dmQpjBeDurQ)**. We are growing the community member by member and happy to see you there. Bugs, feature requests, or additional questions can also be posted to our [GitHub Issue Queue.](https://github.com/neuralmagic/deepsparse/issues) You can get the latest news, webinar and event invites, research papers, and other ML Performance tidbits by [subscribing](https://neuralmagic.com/subscribe/) to the Neural Magic community.
 
 For more general questions about Neural Magic, complete this [form.](http://neuralmagic.com/contact/)
 