This repository showcases custom Spark data sources built using the new [**Python Data Source API**](https://spark.apache.org/docs/4.0.0/api/python/tutorial/sql/python_data_source.html) introduced in Apache Spark 4.0.
For an in-depth understanding of the API, please refer to the [API source code](https://github.com/apache/spark/blob/master/python/pyspark/sql/datasource.py).
-Note this repo is **demo only** and please be aware that it is not intended for production use.
+Note this repo is demo only and please be aware that it is not intended for production use.
Contributions and feedback are welcome to help improve the examples.
@@ -30,26 +30,31 @@ from pyspark_datasources.fake import FakeDataSource
|**HuggingFace Datasets**|[@huggingface/pyspark_huggingface](https://github.com/huggingface/pyspark_huggingface)| Production-ready Spark Data Source for 🤗 Hugging Face Datasets | • Stream datasets as Spark DataFrames<br>• Select subsets/splits with filters<br>• Authentication support<br>• Save DataFrames to Hugging Face<br> |
+| Data Source | Repository | Description | Features|
|**HuggingFace Datasets**|[@huggingface/pyspark_huggingface](https://github.com/huggingface/pyspark_huggingface)| Production-ready Spark Data Source for 🤗 Hugging Face Datasets | • Stream datasets as Spark DataFrames<br>• Select subsets/splits with filters<br>• Authentication support<br>• Save DataFrames to Hugging Face<br> |
## Contributing
We welcome and appreciate any contributions to enhance and expand the custom data sources.:
@@ -62,8 +67,8 @@ We welcome and appreciate any contributions to enhance and expand the custom dat