Commit a5739b4

mksuni and ntrogh authored
Adding documentation for Microsoft Fabric VS Code experience for data science (#8946)
* Add Microsoft Fabric quickstart guide for VS Code * Fix typo in Microsoft Fabric section headings * Update Microsoft Fabric quickstart documentation * added Git and MCP details * updated fabric extension doc * updated fabric data science documentation * added image for UDF * Update docs/datascience/microsoft-fabric-quickstart.md Co-authored-by: Nick Trogh <[email protected]> * Update microsoft-fabric-quickstart.md * Update docs/datascience/microsoft-fabric-quickstart.md Co-authored-by: Nick Trogh <[email protected]> * Update docs/datascience/microsoft-fabric-quickstart.md Co-authored-by: Nick Trogh <[email protected]> * Update docs/datascience/microsoft-fabric-quickstart.md Co-authored-by: Nick Trogh <[email protected]> * Update docs/datascience/microsoft-fabric-quickstart.md Co-authored-by: Nick Trogh <[email protected]> * Update docs/datascience/microsoft-fabric-quickstart.md Co-authored-by: Nick Trogh <[email protected]> * Update docs/datascience/microsoft-fabric-quickstart.md Co-authored-by: Nick Trogh <[email protected]> * Update docs/datascience/microsoft-fabric-quickstart.md Co-authored-by: Nick Trogh <[email protected]> * Update docs/datascience/microsoft-fabric-quickstart.md Co-authored-by: Nick Trogh <[email protected]> * Update docs/datascience/microsoft-fabric-quickstart.md Co-authored-by: Nick Trogh <[email protected]> * Update microsoft-fabric-quickstart.md * Update docs/datascience/microsoft-fabric-quickstart.md Co-authored-by: Nick Trogh <[email protected]> * Update docs/datascience/microsoft-fabric-quickstart.md Co-authored-by: Nick Trogh <[email protected]> * Update docs/datascience/microsoft-fabric-quickstart.md Co-authored-by: Nick Trogh <[email protected]> * Update docs/datascience/microsoft-fabric-quickstart.md Co-authored-by: Nick Trogh <[email protected]> * Update microsoft-fabric-quickstart.md * Update toc.json * Update microsoft-fabric-quickstart.md * Update image captions in quickstart guide --------- Co-authored-by: Nick Trogh <[email 
protected]>
1 parent 98b5043 commit a5739b4


7 files changed

+236
-1
lines changed

5 files, each with lines changed: 3 additions & 0 deletions (content not loaded)
Lines changed: 219 additions & 0 deletions
@@ -0,0 +1,219 @@
1+
---
2+
ContentId: 99a5d36e-ce14-4040-b1cf-7345b7fa2c7d
3+
DateApproved: 10/9/2025
4+
MetaDescription: Get started with Microsoft Fabric extensions for Visual Studio Code to develop data engineering and analytics solutions
5+
MetaSocialImage: images/datascience/fabric-social.png
6+
---
7+
8+
# Data science in Microsoft Fabric using Visual Studio Code
9+
10+
You can build and develop data science and data engineering solutions for [Microsoft Fabric](https://learn.microsoft.com/fabric/) within VS Code. [Microsoft Fabric](https://marketplace.visualstudio.com/items?itemName=fabric.vscode-fabric) extensions for VS Code provide an integrated development experience for working with Fabric artifacts, lakehouses, notebooks, and user data functions.
11+
12+
## What is Microsoft Fabric?
13+
14+
[Microsoft Fabric](http://app.fabric.microsoft.com/) is an enterprise-ready, end-to-end analytics platform. It unifies data movement, data processing, ingestion, transformation, real-time event routing, and report building. It supports these capabilities with integrated services like Data Engineering, Data Factory, Data Science, Real-Time Intelligence, Data Warehouse, and Databases. [Sign up for free](https://app.fabric.microsoft.com/?pbi_source=learn-vscodedocs-microsoft-fabric-quickstart) and explore Microsoft Fabric for 60 days — no credit card required.
15+
16+
![Diagram that shows what Microsoft Fabric is](images/microsoft-fabric/microsoft-fabric.png)
17+
18+
## Prerequisites
19+
20+
Before you get started with Microsoft Fabric extensions for VS Code, you need:
21+
22+
* **Visual Studio Code**: Install the latest version of [VS Code](https://code.visualstudio.com/).
* **Microsoft Fabric account**: You need access to a Microsoft Fabric workspace. You can [sign up for a free trial](https://app.fabric.microsoft.com/?pbi_source=learn-vscodedocs-microsoft-fabric-quickstart) to get started.
* **Python**: Install [Python 3.8 or later](https://python.org/downloads/) to work with [notebooks](https://learn.microsoft.com/fabric/data-engineering/author-notebook-with-vs-code) and [user data functions](https://learn.microsoft.com/fabric/data-engineering/user-data-functions/create-user-data-functions-vs-code) in VS Code.
25+
26+
## Installation and setup
27+
28+
You can find and install the extensions from the [Visual Studio Marketplace](https://marketplace.visualstudio.com/VSCode) or directly in VS Code. Select the **Extensions** view (`kb(workbench.view.extensions)`) and search for **Microsoft Fabric**.
29+
30+
### Which extensions to use
31+
32+
| Extension | Best for | Key features | Recommended for you if… | Documentation |
|-----------------------------|-----------------------------|-----------------------------|--------------------------|--------------------------|
| **Microsoft Fabric extension** | General workspace management, item management, and working with item definitions | - Manage Fabric items (Lakehouses, Notebooks, Pipelines)<br>- Microsoft account sign-in & tenant switching<br>- Unified or grouped item views<br>- Edit Fabric notebooks with IntelliSense<br>- Command Palette integration (`Fabric:` commands) | You want a single extension to manage workspaces, notebooks, and items in Fabric directly from VS Code. | [What is the Fabric VS Code extension](https://learn.microsoft.com/fabric/data-engineering/set-up-fabric-vs-code-extension) |
| **Fabric User data functions** | Developers building custom transformations & workflows | - Author serverless functions in Fabric<br>- Local debugging with breakpoints<br>- Manage data source connections<br>- Install/manage Python libraries<br>- Deploy functions directly to Fabric workspace | You build automation or data transformation logic and need debugging + deployment from VS Code. | [Develop user data functions in VS Code](https://learn.microsoft.com/fabric/data-engineering/user-data-functions/create-user-data-functions-vs-code) |
| **Fabric Data Engineering** | Data engineers working with large-scale data & Spark | - Explore Lakehouses (tables, raw files)<br>- Develop/debug Spark notebooks<br>- Build/test Spark job definitions<br>- Sync notebooks between local VS Code & Fabric<br>- Preview schemas & sample data | You work with Spark, Lakehouses, or large-scale data pipelines and want to explore, develop, and debug locally. | [Develop Fabric notebooks in VS Code](https://learn.microsoft.com/fabric/data-engineering/setup-vs-code-extension) |
37+
38+
## Getting started
39+
Once you have installed the extensions and signed in, you can start working with Fabric workspaces and items. In the Command Palette (`kb(workbench.action.showCommands)`), type **Fabric** to list the commands that are specific to Microsoft Fabric.
![Screenshot that shows all Microsoft Fabric commands](images/microsoft-fabric/fabric-command-palette.png)
41+
42+
## Fabric Workspace and items explorer
43+
44+
The Fabric extensions provide a seamless way to work with both remote and local Fabric items.
45+
- In the Fabric extension, the **Fabric Workspaces** section lists all items from your remote workspace, organized by type (Lakehouses, Notebooks, Pipelines, and more).
46+
- In the Fabric extension, the **Local folder** section shows the Fabric item folders that are open in VS Code. It reflects the structure of the Fabric item definition for each item type that is open in VS Code. This enables you to develop locally and publish your changes to the current workspace or to a new one.
47+
48+
![Screenshot that shows how to view your workspaces and items](images/microsoft-fabric/view-workspaces-and-items.png)
49+
50+
## Use user data functions for data science
51+
52+
1. In the Command Palette (`kb(workbench.action.showCommands)`), type **Fabric: Create Item**.
53+
2. Select your workspace and select **User data function**. Provide a name and select **Python** language.
54+
3. When you are prompted to set up the Python virtual environment, continue and set it up locally.
55+
4. Install the libraries using `pip install` or select the user data function item in the Fabric extension to add libraries. Update the `requirements.txt` file to specify the dependencies:
56+
57+
```txt
fabric-user-data-functions ~= 1.0
pandas == 2.3.1
numpy == 2.3.2
requests == 2.32.5
scikit-learn == 1.2.0
joblib == 1.2.0
```
65+
66+
5. Open `functions_app.py`. Here's an example of developing a User Data Function for data science using scikit-learn:
67+
68+
```python
import datetime
import fabric.functions as fn
import logging

# Import additional libraries
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
import joblib

udf = fn.UserDataFunctions()

@udf.function()
def train_churn_model(data: list, targetColumn: str) -> dict:
    '''
    Description: Train a Random Forest model to predict customer churn using pandas and scikit-learn.

    Args:
    - data (list): List of dictionaries containing customer features and churn target
      Example: [{"Age": 25, "Income": 50000, "Churn": 0}, {"Age": 45, "Income": 75000, "Churn": 1}]
    - targetColumn (str): Name of the target column for churn prediction
      Example: "Churn"

    Returns: dict: Model training results including accuracy and feature information
    '''
    # Convert data to DataFrame
    df = pd.DataFrame(data)

    # Prepare features and target (the target column must be numeric)
    numeric_features = df.select_dtypes(include=['number']).columns.tolist()
    numeric_features.remove(targetColumn)

    X = df[numeric_features]
    y = df[targetColumn]

    # Split and scale data
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
    scaler = StandardScaler()
    X_train_scaled = scaler.fit_transform(X_train)
    X_test_scaled = scaler.transform(X_test)

    # Train model
    model = RandomForestClassifier(n_estimators=100, random_state=42)
    model.fit(X_train_scaled, y_train)

    # Evaluate and save
    accuracy = accuracy_score(y_test, model.predict(X_test_scaled))
    joblib.dump(model, 'churn_model.pkl')
    joblib.dump(scaler, 'scaler.pkl')

    return {
        'accuracy': float(accuracy),
        'features': numeric_features,
        'message': f'Model trained with {len(X_train)} samples and {accuracy:.2%} accuracy'
    }

@udf.function()
def predict_churn(customer_data: list) -> list:
    '''
    Description: Predict customer churn using trained Random Forest model.

    Args:
    - customer_data (list): List of dictionaries containing customer features for prediction
      Example: [{"Age": 30, "Income": 60000}, {"Age": 55, "Income": 80000}]

    Returns: list: Customer data with churn predictions and probability scores
    '''
    # Load saved model and scaler
    model = joblib.load('churn_model.pkl')
    scaler = joblib.load('scaler.pkl')

    # Convert to DataFrame and scale features
    df = pd.DataFrame(customer_data)
    X_scaled = scaler.transform(df)

    # Make predictions
    predictions = model.predict(X_scaled)
    probabilities = model.predict_proba(X_scaled)[:, 1]

    # Add predictions to a copy of each row so the caller's input is not mutated
    results = [dict(row) for row in customer_data]
    for i, (pred, prob) in enumerate(zip(predictions, probabilities)):
        results[i]['churn_prediction'] = int(pred)
        results[i]['churn_probability'] = float(prob)

    return results
```
157+
158+
6. Test your functions locally by pressing `kbstyle(F5)`.
7. In the Fabric extension, in **Local folder**, select the function and publish it to your workspace.
![Screenshot that shows how to publish your user data functions item](./images/microsoft-fabric/publish-user-data-function.png)
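
The `requirements.txt` step above pins each library with a version operator such as `==` or `~=`. A single `=` is not a valid operator, so `pip` rejects such a line. As a rough sketch, a check like the following could catch that before publishing. `find_invalid_pins` is a hypothetical helper, not part of the Fabric extension or `pip`, and it only recognizes the simple `name <operator> version` form, so unpinned names, extras, and URLs are reported too.

```python
import re

# Hypothetical helper for illustration; not part of pip or the Fabric extension.
# Matches lines of the form "name <operator> version" using the standard
# comparison operators (==, ~=, !=, <=, >=, <, >, ===).
_PINNED_REQUIREMENT = re.compile(
    r"^\s*[A-Za-z0-9][A-Za-z0-9._-]*\s*(===|==|~=|!=|<=|>=|<|>)\s*[A-Za-z0-9.*+!_-]+\s*$"
)


def find_invalid_pins(requirements_text: str) -> list[str]:
    """Return non-empty, non-comment lines that are not 'name <operator> version'."""
    invalid = []
    for line in requirements_text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # skip blank lines and comments
        if not _PINNED_REQUIREMENT.match(stripped):
            invalid.append(stripped)
    return invalid


sample = """\
fabric-user-data-functions ~= 1.0
pandas == 2.3.1
scikit-learn=1.2.0
"""
print(find_invalid_pins(sample))  # ['scikit-learn=1.2.0']
```

The check is purely syntactic. It does not verify that the pinned versions exist or that they are compatible with each other or with your Python version.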
161+
162+
Learn more about invoking the function from:
163+
- [Fabric Data pipelines](https://learn.microsoft.com/fabric/data-engineering/user-data-functions/create-functions-activity-data-pipelines)
164+
- [Fabric Notebooks](https://learn.microsoft.com/fabric/data-engineering/notebook-utilities#user-data-function-udf-utilities)
165+
- [An external application](https://learn.microsoft.com/fabric/data-engineering/user-data-functions/tutorial-invoke-from-python-app)
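
Whichever way the function is invoked, `train_churn_model` expects `data` to be a non-empty list of dictionaries in which every row contains the target column. A caller could check a payload before sending it, along these lines. This is a minimal standard-library sketch. `check_training_rows` is a hypothetical helper name, and its rule for what counts as a numeric feature only approximates what pandas infers inside the function.

```python
from numbers import Number


def check_training_rows(data: list, target_column: str) -> list:
    """Return the feature names that are numeric in every row, excluding the target.

    Raises ValueError when the payload is empty or a row lacks the target column.
    Hypothetical helper for illustration only.
    """
    if not data:
        raise ValueError("data must contain at least one row")
    for index, row in enumerate(data):
        if target_column not in row:
            raise ValueError(f"row {index} is missing target column {target_column!r}")

    features = []
    for name in data[0]:
        if name == target_column:
            continue
        values = [row.get(name) for row in data]
        # bool is a subclass of int, so exclude it explicitly
        if all(isinstance(v, Number) and not isinstance(v, bool) for v in values):
            features.append(name)
    return features


rows = [
    {"Age": 25, "Income": 50000, "Plan": "basic", "Churn": 0},
    {"Age": 45, "Income": 75000, "Plan": "pro", "Churn": 1},
]
print(check_training_rows(rows, "Churn"))  # ['Age', 'Income']
```

An empty list of features signals that the payload has nothing the model can train on.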
166+
167+
## Use Fabric notebooks for data science
168+
A Fabric notebook is an interactive workbook in Microsoft Fabric that combines code, visualizations, and Markdown side by side. Notebooks support multiple languages (Python, Spark, SQL, Scala, and more) and are ideal for data exploration, transformation, and model development on your existing data in OneLake.
169+
170+
### Example
171+
172+
The cell below reads a CSV with Spark, converts it to pandas, and trains a logistic regression model with scikit-learn. Replace column names and path with your dataset values.
173+
174+
```python
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split


def train_logistic_from_spark(spark, csv_path):
    # Read CSV with Spark, convert to pandas
    sdf = spark.read.option("header", "true").option("inferSchema", "true").csv(csv_path)
    df = sdf.toPandas().dropna()

    # Adjust these to match your dataset
    X = df[['feature1', 'feature2']]
    y = df['label']

    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
    model = LogisticRegression(max_iter=200)
    model.fit(X_train, y_train)

    preds = model.predict(X_test)
    return {'accuracy': float(accuracy_score(y_test, preds))}

# Example usage in a Fabric notebook cell
# train_logistic_from_spark(spark, '/path/to/data.csv')
```
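
The cell above fails with a `KeyError` if the CSV lacks the `feature1`, `feature2`, or `label` columns. As a small sketch, the header can be checked against the expected names first. `missing_columns` is a hypothetical helper that uses only the standard library and works on CSV text, so it runs outside Spark.

```python
import csv
import io


def missing_columns(csv_text: str, required: list[str]) -> list[str]:
    """Return the required column names that do not appear in the CSV header row."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader, [])  # empty input yields an empty header
    present = {name.strip() for name in header}
    return [name for name in required if name not in present]


sample = "feature1,feature2,label\n0.1,0.7,1\n0.4,0.2,0\n"
print(missing_columns(sample, ["feature1", "feature2", "label"]))  # []
print(missing_columns(sample, ["feature1", "tenure", "label"]))    # ['tenure']
```

Inside a notebook, the same comparison can be made against `sdf.columns` after the Spark read instead of parsing the file a second time.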
198+
199+
Refer to [Microsoft Fabric Notebooks](https://learn.microsoft.com/fabric/data-engineering/how-to-use-notebook) documentation to learn more.
200+
201+
## Git integration
202+
Microsoft Fabric supports Git integration, which enables version control and collaboration across data and analytics projects. You can connect a Fabric workspace to a Git repository, primarily in Azure DevOps or GitHub. Only supported item types are synced. Git integration also supports CI/CD workflows, so teams can manage releases efficiently and maintain high-quality analytics environments.
203+
204+
![GIF that shows how to use Git integration with User data functions](./images/microsoft-fabric/fabric-git-integration.gif)
205+
206+
## Next steps
207+
208+
Now that you have Microsoft Fabric extensions set up in VS Code, explore these resources to deepen your knowledge:
209+
210+
### Learn more about Microsoft Fabric
211+
* [Learn about Microsoft Fabric for Data Science](https://learn.microsoft.com/en-us/fabric/data-science/tutorial-data-science-introduction)
212+
* [Set up your Fabric trial capacity](https://learn.microsoft.com/fabric/fundamentals/fabric-trial)
213+
* [Microsoft Fabric fundamentals](https://learn.microsoft.com/fabric/fundamentals/fabric-overview)
214+
215+
### Community and support
216+
217+
* [Microsoft Fabric community forums](https://community.fabric.microsoft.com/)
218+
* [Fabric samples and templates](https://github.com/microsoft/fabric-samples)
219+
* [Visual Studio Marketplace reviews and feedback](https://marketplace.visualstudio.com/items?itemName=ms-fabric.vscode-fabric)

docs/toc.json

Lines changed: 2 additions & 1 deletion
@@ -361,7 +361,8 @@
 ["PyTorch Support", "/docs/datascience/pytorch-support"],
 ["Azure Machine Learning", "/docs/datascience/azure-machine-learning"],
 ["Manage Jupyter Kernels", "/docs/datascience/jupyter-kernel-management"],
-["Jupyter Notebooks on the Web", "/docs/datascience/notebooks-web"]
+["Jupyter Notebooks on the Web", "/docs/datascience/notebooks-web"],
+["Data science in Microsoft Fabric", "/docs/datascience/microsoft-fabric-quickstart"]
 ]
 },
 {
