- `__init__(self, options: Dict[str, str])` - Initialize with user options (Optional; base class provides default)
- `name() -> str` - Return format name (Optional to override; defaults to class name)
- `schema() -> StructType` - Define data source schema (Required)
- `reader(schema: StructType) -> DataSourceReader` - Create batch reader (Required if batch read is supported)
- `writer(schema: StructType, overwrite: bool) -> DataSourceWriter` - Create batch writer (Required if batch write is supported)
- `streamReader(schema: StructType) -> DataSourceStreamReader` - Create streaming reader (Required if streaming read is supported and `simpleStreamReader` is not implemented)
- `streamWriter(schema: StructType, overwrite: bool) -> DataSourceStreamWriter` - Create streaming writer (Required if streaming write is supported)
- `simpleStreamReader(schema: StructType) -> SimpleDataSourceStreamReader` - Create simple streaming reader (Required if streaming read is supported and `streamReader` is not implemented)
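As a rough illustration of how the methods listed above fit together, here is a minimal sketch of a batch-read-only source. The names `CounterDataSource`, `CounterReader`, and the `rows` option are hypothetical and exist only for this example. The sketch returns the schema as a DDL string, which PySpark accepts in place of a `StructType`, and the `except ImportError` branch defines small stand-ins only so the code can be exercised without PySpark installed.

```python
from typing import Dict, Iterator, Tuple

try:
    from pyspark.sql.datasource import DataSource, DataSourceReader
except ImportError:
    # Stand-ins so the sketch runs without PySpark installed.
    # They mimic only the pieces used below, not the real base classes.
    class DataSource:
        def __init__(self, options: Dict[str, str]) -> None:
            self.options = options

    class DataSourceReader:
        pass


class CounterReader(DataSourceReader):
    """Hypothetical reader that yields `rows` consecutive integers."""

    def __init__(self, options: Dict[str, str]) -> None:
        # Options arrive as strings, so convert explicitly.
        self.rows = int(options.get("rows", "3"))

    def read(self, partition) -> Iterator[Tuple[int]]:
        # Each yielded tuple is one row matching the declared schema.
        for i in range(self.rows):
            yield (i,)


class CounterDataSource(DataSource):
    """Hypothetical source that supports batch reads only."""

    @classmethod
    def name(cls) -> str:
        # Overrides the default format name (the class name).
        return "counter"

    def schema(self) -> str:
        return "id INT"

    def reader(self, schema) -> DataSourceReader:
        # self.options is set by the base class __init__.
        return CounterReader(self.options)
```

With an active `SparkSession` named `spark`, a source like this would typically be registered with `spark.dataSource.register(CounterDataSource)` and then read with `spark.read.format("counter").option("rows", "5").load()`. The write and streaming methods are left out because this sketch does not support them.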

#### DataSourceReader
Abstract base class for reading data from sources.

**Key Methods:**
- `read(partition) -> Iterator` - Read data from partition, returns tuples/Rows/pyarrow.RecordBatch (Required)
- `partitions() -> List[InputPartition]` - Return input partitions for parallel reading (Optional; defaults to a single partition)
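
The relationship between `partitions()` and `read()` can be sketched as follows. `RangeReader` and `RangePartition` are hypothetical names for this example, and the splitting strategy (equal-sized integer ranges) is just one possible choice. As in the previous sketch, the `except ImportError` branch provides stand-ins only so the code can be run without PySpark installed.

```python
from typing import Iterator, List, Tuple

try:
    from pyspark.sql.datasource import DataSourceReader, InputPartition
except ImportError:
    # Stand-ins so the sketch runs without PySpark installed.
    class InputPartition:
        def __init__(self, value=None) -> None:
            self.value = value

    class DataSourceReader:
        pass


class RangePartition(InputPartition):
    """Hypothetical partition describing the half-open range [start, end)."""

    def __init__(self, start: int, end: int) -> None:
        self.start = start
        self.end = end


class RangeReader(DataSourceReader):
    """Hypothetical reader that splits `total` integers across partitions."""

    def __init__(self, total: int, num_partitions: int) -> None:
        self.total = total
        self.num_partitions = num_partitions

    def partitions(self) -> List[InputPartition]:
        # Ceiling division so every row lands in some partition.
        step = max(1, -(-self.total // self.num_partitions))
        return [
            RangePartition(start, min(start + step, self.total))
            for start in range(0, self.total, step)
        ]

    def read(self, partition: RangePartition) -> Iterator[Tuple[int]]:
        # Called once per partition; yields only that partition's rows.
        for i in range(partition.start, partition.end):
            yield (i,)
```

In Spark, `partitions()` is called once during planning and each returned partition object is then passed to a separate `read()` call, so partition objects need to be picklable. A reader that omits `partitions()` gets a single `read()` call covering all the data.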
#### DataSourceStreamReader
Abstract base class for streaming data sources with offset management.