System Info
(lerobot) root@di-20260304202240-z9d4k:/e2e-data/evad-tech-vla/xizob/LeRobot/any4lerobot/ds_version_convert/v30_to_v21# python convert_dataset_v30_to_v21.py --repo-id=your_id --root=/e2e-data/evad-tech-vla/xizob/datasets/AgiBotWorld-Beta/agibotworld/task_327
INFO 2026-03-04 14:01:31 0_to_v21.py:117 Converting info.json metadata to v2.1 schema
INFO 2026-03-04 14:01:31 30_to_v21.py:93 Converting tasks parquet to legacy JSONL
INFO 2026-03-04 14:01:31 0_to_v21.py:160 Converting consolidated parquet files back to per-episode files
convert data files: 0%| | 0/1 [00:00<?, ?it/s]
Traceback (most recent call last):
File "/e2e-data/evad-tech-vla/xizob/LeRobot/any4lerobot/ds_version_convert/v30_to_v21/convert_dataset_v30_to_v21.py", line 489, in <module>
convert_dataset(**vars(args))
File "/e2e-data/evad-tech-vla/xizob/LeRobot/any4lerobot/ds_version_convert/v30_to_v21/convert_dataset_v30_to_v21.py", line 460, in convert_dataset
convert_data(root, new_root, episode_records)
File "/e2e-data/evad-tech-vla/xizob/LeRobot/any4lerobot/ds_version_convert/v30_to_v21/convert_dataset_v30_to_v21.py", line 192, in convert_data
Dataset.from_pandas(episode_table).to_parquet(dest_path)
File "/e2e-data/evad-tech-vla/xizob/miniconda/envs/lerobot/lib/python3.10/site-packages/datasets/arrow_dataset.py", line 861, in from_pandas
table = InMemoryTable.from_pandas(
File "/e2e-data/evad-tech-vla/xizob/miniconda/envs/lerobot/lib/python3.10/site-packages/datasets/table.py", line 720, in from_pandas
return cls(pa.Table.from_pandas(*args, **kwargs))
File "pyarrow/table.pxi", line 4796, in pyarrow.lib.Table.from_pandas
File "/e2e-data/evad-tech-vla/xizob/miniconda/envs/lerobot/lib/python3.10/site-packages/pyarrow/pandas_compat.py", line 651, in dataframe_to_arrays
arrays = [convert_column(c, f)
File "/e2e-data/evad-tech-vla/xizob/miniconda/envs/lerobot/lib/python3.10/site-packages/pyarrow/pandas_compat.py", line 651, in <listcomp>
arrays = [convert_column(c, f)
File "/e2e-data/evad-tech-vla/xizob/miniconda/envs/lerobot/lib/python3.10/site-packages/pyarrow/pandas_compat.py", line 639, in convert_column
raise e
File "/e2e-data/evad-tech-vla/xizob/miniconda/envs/lerobot/lib/python3.10/site-packages/pyarrow/pandas_compat.py", line 633, in convert_column
result = pa.array(col, type=type_, from_pandas=True, safe=safe)
File "pyarrow/array.pxi", line 365, in pyarrow.lib.array
File "pyarrow/array.pxi", line 87, in pyarrow.lib._ndarray_to_array
File "pyarrow/array.pxi", line 75, in pyarrow.lib._ndarray_to_type
File "pyarrow/error.pxi", line 155, in pyarrow.lib.pyarrow_internal_check_status
File "pyarrow/error.pxi", line 92, in pyarrow.lib.check_status
pyarrow.lib.ArrowTypeError: ('Did not pass numpy.dtype object', 'Conversion failed for column observation.states.end.orientation with type array[float32]')
Reproduction
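python convert_dataset_v30_to_v21.py --repo-id=your_id --root=/e2e-data/evad-tech-vla/xizob/datasets/AgiBotWorld-Beta/agibotworld/task_327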
Expected behavior
Hi, thank you very much for your work on this project and for maintaining the dataset conversion tools. I really appreciate the effort that goes into keeping everything compatible across versions.
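I have tested datasets versions from 2.18.0 to 4.2.0 and pyarrow versions from 13.0.0 to 23.0.1, and the error occurs with every combination. The failing column, observation.states.end.orientation, appears to have the pandas extension dtype array[float32] (an ExtensionArray rather than a plain NumPy dtype), which matches the "Did not pass numpy.dtype object" message. Converting the column's values to Python lists with .tolist() before calling Dataset.from_pandas() resolves the issue.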
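A minimal sketch of that workaround, in case it helps with triage. The helper names `to_plain_list` and `listify_array_columns` are hypothetical and not part of the conversion script; the sketch assumes `frame` is a pandas DataFrame and that every cell in the affected columns exposes `.tolist()`, as NumPy arrays do. I have not checked it against every pandas version.

```python
def to_plain_list(value):
    """Return plain Python lists for array-like cells; pass other values through."""
    # NumPy arrays expose .tolist(), which also unpacks nested dimensions.
    if hasattr(value, "tolist"):
        return value.tolist()
    return value


def listify_array_columns(frame, columns):
    """Return a copy of `frame` whose named columns hold Python lists.

    Assumption: `frame` is a pandas DataFrame. The input frame is left unchanged.
    """
    out = frame.copy()
    for name in columns:
        # Series.map applies the helper cell by cell and yields an object column.
        out[name] = out[name].map(to_plain_list)
    return out
```

In the script, this could be applied to `episode_table` for the affected column (here `observation.states.end.orientation`) just before the `Dataset.from_pandas(episode_table)` call shown in the traceback.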
Could you please clarify whether this is expected behavior? Should Dataset.from_pandas() support pandas array[float32] dtype directly, or is explicit conversion required before writing to parquet?