Commit f6d41b1

Merge pull request #345 from raosukrit67/main
Porting Legacy Notebooks to NeMo Microservices v25.09 Design Spec
2 parents 82b5008 + 3bbae42 commit f6d41b1

21 files changed, +9683 -2230 lines changed

nemo/NeMo-Data-Designer/README.md

Lines changed: 49 additions & 5 deletions
@@ -2,11 +2,7 @@
 
 This directory contains the tutorial notebooks for getting started with NeMo Data Designer.
 
-## 🐳 Deploy the NeMo Data Designer microservice locally
-
-In order to run these notebooks, you must have the NeMo Data Designer microservice deployed locally via docker compose. See the [deployment guide](http://docs.nvidia.com/nemo/microservices/latest/set-up/deploy-as-microservices/data-designer/docker-compose.html) for more details.
-
-## 📦 Set up the environment
+## 📦 Set Up the Environment
 
 We will use the `uv` package manager to set up our environment and install the necessary dependencies. If you don't have `uv` installed, you can follow the installation instructions from the [uv documentation](https://docs.astral.sh/uv/getting-started/installation/).
 

@@ -23,3 +19,51 @@ source .venv/bin/activate
 ```
 
 Be sure to select this virtual environment as your kernel when running the notebooks.
+
+## 🚀 Deploying the NeMo Data Designer Microservice
+
+To run these notebooks, you'll need the NeMo Data Designer microservice. You have two deployment options:
+
+### ⚙️ Using the NeMo Data Designer Managed Service
+We have a [managed service of NeMo Data Designer](https://build.nvidia.com/nemo/data-designer) to help you get started quickly.
+
+Please refer to the [intro-tutorials](./intro-tutorials/) notebooks to learn how to connect to this service.
+
+**Note**: This managed service of NeMo Data Designer is intended only to help you get started. As a result, it can only be used to launch `preview` jobs. It **cannot** be used to launch long-running jobs. If you need to launch long-running jobs, please deploy an instance of [NeMo Data Designer locally](#-deploy-the-nemo-data-designer-microservice-locally).
+
+
+### 🐳 Deploy the NeMo Data Designer Microservice Locally
+
+Alternatively, you can deploy the NeMo Data Designer microservice locally via Docker Compose.
+
+To run the tutorial notebooks in the [advanced](./advanced/) directory, you will need to have NeMo Data Designer deployed locally. Please see the [deployment guide](http://docs.nvidia.com/nemo/microservices/latest/set-up/deploy-as-microservices/data-designer/docker-compose.html) for more details.
+
+
+## 📚 Tutorial Directory
+
+### 🚀 Intro Tutorials
+
+| Notebook | Description |
+|---------------------------------------------------|----------------------------------------------------------------------------------|
+| [1-the-basics.ipynb](./intro-tutorials/1-the-basics.ipynb) | Learn the basics of Data Designer by generating a simple product review dataset |
+| [2-structured-outputs-and-jinja-expressions.ipynb](./intro-tutorials/2-structured-outputs-and-jinja-expressions.ipynb) | Explore advanced data generation using structured outputs and Jinja expressions |
+| [3-seeding-with-a-dataset.ipynb](./intro-tutorials/3-seeding-with-a-dataset.ipynb) | Discover how to seed synthetic data generation with an external dataset |
+| [4-custom-model-configs.ipynb](./intro-tutorials/4-custom-model-configs.ipynb) | Master creating and using custom model configurations |
+
+### 🎯 Advanced Tutorials
+
+| Notebook | Domain | Description |
+|---------------------------------------------------|---------------------|-----------------------------------------------------------------|
+| [person-sampler-tutorial.ipynb](./advanced/person-samplers/person-sampler-tutorial.ipynb) | Persona Samplers | Generate realistic personas using the person sampler |
+| [clinical-trials.ipynb](./advanced/healthcare-datasets/clinical-trials.ipynb) | Healthcare | Build synthetic clinical trial datasets with realistic PII for testing data protection |
+| [insurance-claims.ipynb](./advanced/healthcare-datasets/insurance-claims.ipynb) | Healthcare | Create synthetic insurance claims datasets with realistic claim data and processing information |
+| [physician-notes-with-realistic-personal-details.ipynb](./advanced/healthcare-datasets/physician-notes-with-realistic-personal-details.ipynb) | Healthcare | Generate realistic patient data and physician notes with embedded personal information |
+| [w2-dataset.ipynb](./advanced/forms/w2-dataset.ipynb) | Forms & Documents | Generate synthetic W-2 tax form datasets with realistic employee and employer information |
+| [multi-turn-conversation.ipynb](./advanced/multi-turn-chat/multi-turn-conversation.ipynb) | Conversational AI | Build synthetic conversational data with realistic person details and multi-turn dialogues |
+| [visual-question-answering-using-vlm.ipynb](./advanced/multimodal/visual-question-answering-using-vlm.ipynb) | Multimodal | Create visual question answering datasets using Vision Language Models |
+| [product-question-answer-generator.ipynb](./advanced/qa-generation/product-question-answer-generator.ipynb) | Q&A Generation | Build product information datasets with corresponding questions and answers |
+| [generate-rag-evaluation-dataset.ipynb](./advanced/rag-examples/generate-rag-evaluation-dataset.ipynb) | RAG & Retrieval | Generate diverse RAG evaluation datasets for testing retrieval-augmented generation systems |
+| [reasoning-traces.ipynb](./advanced/reasoning/reasoning-traces.ipynb) | Reasoning | Build synthetic reasoning traces to demonstrate step-by-step problem-solving processes |
+| [text-to-python.ipynb](./advanced/text-to-code/text-to-python.ipynb) | Text-to-Code | Generate Python code from natural language instructions with validation and evaluation |
+| [text-to-python-evol.ipynb](./advanced/text-to-code/text-to-python-evol.ipynb) | Text-to-Code | Build advanced Python code generation with evolutionary improvements and iterative refinement |
+| [text-to-sql.ipynb](./advanced/text-to-code/text-to-sql.ipynb) | Text-to-Code | Create SQL queries from natural language descriptions with validation and testing |
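
The deployment rules this README change describes (the managed service launches only `preview` jobs, and the notebooks under `./advanced/` need a local deployment) can be sketched as a small decision helper. This is an illustrative sketch only, not part of the repository or of any NeMo client library: the function name `required_deployment`, the `job_kind` parameter, and any `job_kind` value other than `"preview"` are hypothetical.

```python
from pathlib import PurePosixPath

# Hypothetical labels for the two deployment options named in the README.
MANAGED = "managed"
LOCAL = "local"


def required_deployment(notebook_path: str, job_kind: str = "preview") -> str:
    """Return the deployment a notebook run needs under the README's rules.

    Assumption: notebook paths are repo-relative POSIX paths, as they appear
    in the tutorial tables (e.g. "./advanced/forms/w2-dataset.ipynb").
    """
    parts = PurePosixPath(notebook_path).parts
    # Notebooks in the advanced directory need a local deployment.
    if "advanced" in parts:
        return LOCAL
    # The managed service can only launch `preview` jobs.
    if job_kind != "preview":
        return LOCAL
    return MANAGED


if __name__ == "__main__":
    print(required_deployment("./intro-tutorials/1-the-basics.ipynb"))
    print(required_deployment("./advanced/forms/w2-dataset.ipynb"))
```

Under these assumptions, an intro tutorial run as a preview job resolves to the managed service, while anything in `./advanced/` or any non-preview job resolves to a local Docker Compose deployment.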
