Format videos to `poseinterface` spec. Extract clip function. by sfmig · Pull Request #39 · neuroinformatics-unit/poseinterface

sfmig · 2026-03-27T13:56:34Z

Description

What is this PR

Bug fix
Addition of a new feature
Other

What does this PR do?

Added video_to_poseinterface to the io module, to convert a video to poseinterface format.
- It renames videos following the spec and reencodes them if required.
Added a basic utility to extract a clip and the corresponding cliplabels.json file, given a video in poseinterface format, its full-video annotations in cliplabels.json format, and a range of frames.
- Exposed as the entry point extract-clip.

References

How has this PR been tested?

Tests pass locally and in CI.

Is this a breaking change?

No.

Does this PR require an update to the documentation?

Yes, docstrings.

Checklist:

The code has been tested locally
Tests have been added to cover all new functionality
The documentation has been updated to reflect any changes
The code has been formatted with pre-commit

niksirbi · 2026-03-31T17:03:14Z

+
+    # Slice clip and save as mp4
+    clip = video[start_frame : start_frame + duration]
+    clip_path = (


This should be video_path.stem not video.stem, right?

    clip_path = (
        clips_dir / f"{video_path.stem}_start-{start_frame}_dur-{duration}.mp4"
    )

both are equivalent, video.stem gets the filename from the sleap-io Video object

niksirbi

Thanks @sfmig.

I tried using the new functions in #40. They mostly worked, but I've stumbled on a few issues that need to be addressed before merging (see inline comments). Happy to do another round of review after we resolve these (I skipped the tests in this round).

By the way, if we want the new public functions to appear in the API references, we have to add them manually in api_index.rst (we haven't yet set up the automatic machinery we have in movement).

niksirbi · 2026-04-16T12:29:52Z

+REENCODING_PARAMS = {
+    **EXPECTED_ENCODING,
+    "codec": "libx264",  # overwrite with encoder to use
+    "crf": 25,


SLEAP's (and therefore OCTRON's) magic incantation uses crf 23. Any specific reason for going with 25 here?

Relatedly, I wonder whether we should expose crf to the user as an optional kwarg in video_to_poseinterface.
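
A minimal sketch of what exposing crf could look like, not the PR's implementation. The contents of EXPECTED_ENCODING are not shown in this thread, so the dictionary below is a hypothetical stand-in, and build_reencoding_params is an illustrative helper name. The default of 23 follows the value mentioned above, and the 0 to 51 range is the CRF scale of 8-bit libx264.

```python
# Hypothetical stand-in: the real EXPECTED_ENCODING is not shown in this thread.
EXPECTED_ENCODING = {"codec": "h264", "pixelformat": "yuv420p"}

DEFAULT_CRF = 23  # value mentioned in the review comment above


def build_reencoding_params(crf=DEFAULT_CRF):
    """Return re-encoding parameters with a caller-selectable CRF."""
    # 8-bit libx264 accepts CRF values from 0 (lossless) to 51 (worst quality).
    if not 0 <= crf <= 51:
        raise ValueError(f"crf must be between 0 and 51, got {crf}")
    return {
        **EXPECTED_ENCODING,
        "codec": "libx264",  # overwrite with the encoder to use
        "crf": crf,
    }
```

With this shape, video_to_poseinterface could accept crf as an optional keyword argument and pass it straight through, so existing callers keep the default behaviour.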

niksirbi · 2026-04-16T12:31:49Z

@@ -1,10 +1,15 @@
+"""Functions to convert annotations and videos to PoseInterface format."""


Everywhere else we style the package's name as lowercase, usually monospace a la movement. I recommend we don't go into CamelCase, unless we make an explicit decision to do so project-wide.

Suggested change

-"""Functions to convert annotations and videos to PoseInterface format."""
+"""Functions to convert annotations and videos to ``poseinterface`` format."""

niksirbi · 2026-04-16T12:40:24Z

+    if encoding != EXPECTED_ENCODING:
+        logging.warning(
+            f"Video encoding {encoding} does not match "
+            f"expected {EXPECTED_ENCODING}. Please reencode "


Since re-encoding happens automatically if needed, should we reframe this as 'Will reencode' instead of 'Please reencode'?

Also since this is actually an expected action (documented in the docstring of video_to_poseinterface), I wonder whether this should be an INFO instead of WARNING.
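
A minimal sketch of the reworded message at INFO level, assuming the check stays a plain dictionary comparison as in the diff above. The function name needs_reencoding and the logger name are hypothetical and only serve the illustration.

```python
import logging

# Hypothetical logger name, used only for this sketch.
logger = logging.getLogger("poseinterface_sketch")


def needs_reencoding(encoding, expected):
    """Return True if the video must be re-encoded, logging why at INFO level."""
    if encoding == expected:
        return False
    # Re-encoding is a documented, automatic step, so INFO rather than WARNING.
    logger.info(
        "Video encoding %s does not match expected %s. Will reencode.",
        encoding,
        expected,
    )
    return True
```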

niksirbi · 2026-04-16T12:43:51Z

@@ -0,0 +1,185 @@
+"""Functions to extract clips from poseinterface videos."""


Suggested change

-"""Functions to extract clips from poseinterface videos."""
+"""Functions to extract clips from ``poseinterface`` videos."""

niksirbi · 2026-04-16T15:46:18Z

+    return clip_path, clip_json
+
+
+def _extract_cliplabels(video_path, clips_dir, start_frame, duration):


Quoting our spec on cliplabels.json:

Clip labels follow the same COCO keypoints format as frame labels, but with different conventions for image id and file_name values:

Each image id must be the 0-based index of the frame within the clip (i.e. 0, 1, 2, ...), not the index in the session video.

Each file_name must follow the same pattern as frame image filenames, but without the extension. The frame field in the file_name must correspond to the index of that frame in the session video.

This means that each entry in the images array encodes two pieces of information: the id gives the local position within the clip, while the frame field in file_name gives the global position in the session video. Note that in both cases the indices are 0-based.

For a clip starting at frame 1000 with a duration of 5 frames, the images array would be:

[
  {"id": 0, "file_name": "sub-M708149_ses-20200317_cam-topdown_frame-1000", "width": 1300, "height": 1028},
  {"id": 1, "file_name": "sub-M708149_ses-20200317_cam-topdown_frame-1001", "width": 1300, "height": 1028},
  {"id": 2, "file_name": "sub-M708149_ses-20200317_cam-topdown_frame-1002", "width": 1300, "height": 1028},
  {"id": 3, "file_name": "sub-M708149_ses-20200317_cam-topdown_frame-1003", "width": 1300, "height": 1028},
  {"id": 4, "file_name": "sub-M708149_ses-20200317_cam-topdown_frame-1004", "width": 1300, "height": 1028}
]

This function correctly selects the images corresponding to the clip, but there is a step missing: the ids of the extracted images must be changed to start at 0 within the extracted clip (i.e. subtract start_frame). The file_names should be left as they are, to keep a reference to the global frame index.

A similar problem applies to the annotation ids inside the extracted cliplabels file. For a clip starting at frame 1000, the first annotation entry returned by the current implementation is as follows:

"annotations": [
  {
    "id": 1001,
    "image_id": 1000,
    "category_id": 1,
    "keypoints": [529.621887207031, 494.971038818359, 2, 543.039184570313, 501.402648925781, 2, 544.258728027344, 482.781982421875, 2, 599.23681640625, 496.673095703125, 2, 593.133361816406, 527.087524414063, 2, 604.429321289063, 470.6630859375, 2, 669.785888671875, 507.377227783203, 2, 673.556396484375, 613.862365722656, 2],
    "num_keypoints": 8,
    "bbox": [529.621887207031, 470.6630859375, 143.934509277344, 143.199279785156],
    "area": 20611.3180647455,
    "iscrowd": 0
  },

Since our spec recommends that annotation IDs are 1-indexed, the "id" here should probably be 1, with the "image_id" being 0, as per my previous comment.
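
A minimal sketch of the missing reindexing step, not the PR's code. It assumes that the full-video labels use the global frame index as each image id (consistent with the image_id of 1000 in the excerpt above) and that entries are already in frame order. The name reindex_cliplabels is hypothetical.

```python
def reindex_cliplabels(video_labels, start_frame, duration):
    """Select a clip's labels and renumber ids to be local to the clip.

    Assumes each image id in `video_labels` is the 0-based frame index
    in the session video.
    """
    end_frame = start_frame + duration

    images = []
    for image in video_labels["images"]:
        if start_frame <= image["id"] < end_frame:
            # Local 0-based id; file_name is untouched so it keeps the
            # global frame index.
            images.append({**image, "id": image["id"] - start_frame})

    annotations = []
    for annotation in video_labels["annotations"]:
        if start_frame <= annotation["image_id"] < end_frame:
            annotations.append(
                {
                    **annotation,
                    # Annotation ids are renumbered from 1 within the clip.
                    "id": len(annotations) + 1,
                    "image_id": annotation["image_id"] - start_frame,
                }
            )

    # All other top-level keys (e.g. categories) are carried over unchanged.
    return {**video_labels, "images": images, "annotations": annotations}
```

The function builds new dictionaries instead of editing entries in place, so the full-video labels can be reused to extract several clips.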

niksirbi · 2026-04-20T13:31:22Z

+
+    # Slice clip and save as mp4
+    clip = video[start_frame : start_frame + duration]
+    clip_path = (


Not sure these are exactly equivalent. While trying this PR on some real data, I encountered the following error:

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
Cell In[10], line 28
     25 session_dir = benchmark_base_dir / split / project_name / sub_ses_prefix
     27 for start_frame in start_frames:
---> 28     clip_path, clip_json = extract_clip(
     29         video_path=(session_dir / f"{sub_ses_cam_prefix}.mp4"),
     30         start_frame=start_frame,
     31         duration=duration,
     32     )
     33     print(f"Extracted clip: {clip_path}, {clip_json}")

File ~/Code/NIU/poseinterface/poseinterface/clips.py:85, in extract_clip(video_path, start_frame, duration)
     82     # Slice clip and save as mp4
     83     clip = video[start_frame : start_frame + duration]
     84     clip_path = (
---> 85         clips_dir / f"{video.stem}_start-{start_frame}_dur-{duration}.mp4"
     86     )
     87     sio.save_video(clip, clip_path, fps=video.fps)
     89     # Generate cliplabels.json from the full video labels

AttributeError: 'Video' object has no attribute 'stem'

The error goes away when using my version of clip_path (from my previous comment).
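
A minimal sketch of deriving the clip name from the path rather than from the loaded video object, which is what the suggested clip_path does. The helper name build_clip_path is hypothetical; it only relies on pathlib, so it does not depend on which attributes the video object exposes.

```python
from pathlib import Path


def build_clip_path(video_path, clips_dir, start_frame, duration):
    """Return the output path for a clip, named after the source video file."""
    video_path = Path(video_path)
    # Path.stem is the filename without its final extension.
    name = f"{video_path.stem}_start-{start_frame}_dur-{duration}.mp4"
    return Path(clips_dir) / name
```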

niksirbi · 2026-04-20T15:48:09Z

+    clip_json = _extract_cliplabels(
+        video_path, clips_dir, start_frame, duration
+    )


I think this step should be optional: only extract cliplabels if a corresponding (appropriately named) .json file is found in the same folder as the video; otherwise, just extract the video clip. This will make extract_clip broadly useful in all sorts of contexts (beyond the specific purpose of generating clips for poseinterface benchmarks).

As things are, you can't really use extract_clip, unless you have the companion .json file.
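
A minimal sketch of how the labels step could be made optional, assuming the companion file sits next to the video and shares its stem. The helper name find_labels_file is hypothetical, and the default suffix simply mirrors the one used in the current diff, which may still change.

```python
from pathlib import Path


def find_labels_file(video_path, suffix="_cliplabels.json"):
    """Return the companion labels file for a video, or None if it is absent."""
    video_path = Path(video_path)
    candidate = video_path.parent / f"{video_path.stem}{suffix}"
    return candidate if candidate.is_file() else None
```

extract_clip could then call the labels extraction only when this returns a path, and return None in place of clip_json otherwise.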

niksirbi · 2026-04-20T15:50:39Z

+def _extract_cliplabels(video_path, clips_dir, start_frame, duration):
+    """Extract clip labels from the video cliplabels.json file."""
+    # Read file with labels for the whole video
+    video_json = video_path.parent / f"{video_path.stem}_cliplabels.json"


I'm no longer certain about the suffix of this json file; see this comment in #10 and the discussion in the PR review for #45.

niksirbi · 2026-04-20T15:59:21Z

+        video_path, clips_dir, start_frame, duration
+    )
+
+    return clip_path, clip_json


Would be nice to log an INFO message about this function's success before returning, to signal that it has actually worked.

sfmig changed the base branch from main to auto-file-name March 27, 2026 13:57

niksirbi mentioned this pull request Mar 30, 2026

Update example for converting DLC project to benchmark #40

Draft

7 tasks

niksirbi reviewed Mar 31, 2026

lochhh force-pushed the auto-file-name branch from 02f385c to 8a89217 Compare April 1, 2026 09:43

sfmig changed the base branch from auto-file-name to main April 1, 2026 13:58

sfmig force-pushed the video-utils branch 2 times, most recently from 747985c to 3ade24a Compare April 1, 2026 14:24

sfmig marked this pull request as ready for review April 1, 2026 14:51

sfmig requested a review from niksirbi April 1, 2026 14:51

niksirbi mentioned this pull request Apr 20, 2026

Utility for extracting a clip from a video #10

Open

niksirbi requested changes Apr 20, 2026

sfmig added 15 commits May 8, 2026 11:10

Draft clip extraction CLI

f14f544

Rename _to_poseinterface. Add video conversion bits

18b5379

Types and docstring cleanup

859a73e

Revamp clips module

0ae13a8

Fixes

acb9fa9

Clamp duration if it exceeds video length

02b5c7c

Add minimal tests for video conversion

65b8d2a

Add test for extracting cliplabels from video

b0a50c2

Add tests for clip extraction

dec92d0

Factor out common fixtures. Test invalid inputs for extract_clip

79963ce

Add docstrings

b3107a2

Add CLI and entrypoint tests

d7e3560

Remove for clarity in review

f0f0251

Add sub_ses_cam_id fixture

ce9f02a

Remove for diff with main

cf6b501

sfmig force-pushed the video-utils branch from 23eb0df to cf6b501 Compare May 8, 2026 10:10

2 participants