Hand-Object-Interaction

A collection of papers and code on hand-object interaction (HOI).

HOI Dataset
Dexterous Dataset
Hand Motion Reconstruction
Hand Motion Prior
Hand Motion Refinement
Reconstruct Hand Object from RGB Images Videos
Reconstruct Hand Object from RGB-D Images Videos
Hand Object Motion Synthesis
Generate HOI Images Videos
HOI Augmentation
HOI Reenactment
HOI Prediction
Human to Robotics

HOI Dataset

FreiHAND: A Dataset for Markerless Capture of Hand Pose and Shape from Single RGB Images. [ICCV 2019] [Paper] [Code] [Project Page]
Learning Joint Reconstruction of Hands and Manipulated Objects. [CVPR 2019] [Paper] [Code] [Project Page]
HOnnotate: A method for 3D Annotation of Hand and Object Poses. [CVPR 2020] [Paper] [Code] [Project Page]
GanHand: Predicting Human Grasp Affordances in Multi-Object Scenes. [CVPR 2020 Oral] [Paper] [Code] [Project Page]
Understanding Human Hands in Contact at Internet Scale. [CVPR 2020 Oral] [Paper] [Code] [Project Page]
ContactPose: A Dataset of Grasps with Object Contact and Hand Pose. [ECCV 2020] [Paper] [Code] [Project Page]
GRAB: A Dataset of Whole-Body Human Grasping of Objects. [ECCV 2020] [Paper] [Code] [Project Page]
DexYCB: A Benchmark for Capturing Hand Grasping of Objects. [CVPR 2021] [Paper] [Code] [Project Page]
HOI4D: A 4D Egocentric Dataset for Category-Level Human-Object Interaction. [CVPR 2022] [Paper] [Code] [Project Page]
OakInk: A Large-scale Knowledge Repository for Understanding Hand-Object Interaction. [CVPR 2022] [Paper] [Code] [Project Page]
ARCTIC: A Dataset for Dexterous Bimanual Hand-Object Manipulation. [CVPR 2023] [Paper] [Code] [Project Page]
InterCap: Joint Markerless 3D Tracking of Humans and Objects in Interaction. [IJCV 2024] [Project Page]
ContactArt: Learning 3D Interaction Priors for Category-level Articulated Object and Hand Poses Estimation. [3DV 2024 Oral] [Paper] [Project Page]
GraspXL: Generating Grasping Motions for Diverse Objects at Scale. [ECCV 2024] [Paper] [Code] [Project Page]
SemGrasp: Semantic Grasp Generation via Language Aligned Discretization. [arxiv 2024] [Paper] [Project Page]
TACO: Benchmarking Generalizable Bimanual Tool-ACtion-Object Understanding. [CVPR 2024] [Paper] [Code] [Project Page]
Introducing HOT3D: An Egocentric Dataset for 3D Hand and Object Tracking. [CVPR 2025] [Paper] [Code] [Project Page]
GigaHands: A Massive Annotated Dataset of Bimanual Hand Activities. [CVPR 2025] [Paper]

Dexterous Dataset

DexGraspNet: A Large-Scale Robotic Dexterous Grasp Dataset for General Objects Based on Simulation. [ICRA 2023] [Paper] [Code] [Project Page]
RealDex: Towards Human-like Grasping for Robotic Dexterous Hand. [IJCAI 2024] [Paper] [Code] [Project Page]

Hand Motion Reconstruction

End-to-End Human Pose and Mesh Reconstruction with Transformers. [CVPR 2021] [Paper] [Code]
HaMeR: Hand Mesh Recovery. [CVPR 2024] [Paper] [Code] [Project Page]
WiLoR: End-to-end 3D Hand Localization and Reconstruction in-the-wild. [CVPR 2025] [Paper] [Code] [Project Page]
Dyn-HaMR: Recovering 4D Interacting Hand Motion from a Dynamic Camera. [CVPR 2025] [Paper] [Code] [Project Page]
HaWoR: World-Space Hand Motion Reconstruction from Egocentric Videos. [CVPR 2025] [Paper] [Code] [Project Page]

Hand Motion Prior

HMP: Hand Motion Priors for Pose and Shape Estimation from Video. [WACV 2024] [Paper] [Code] [Project Page]

Hand Motion Refinement

TOCH: Spatio-Temporal Object-to-Hand Correspondence for Motion Refinement. [ECCV 2022] [Paper] [Code] [Project Page]
GeneOH Diffusion: Towards Generalizable Hand-Object Interaction Denoising via Denoising Diffusion. [ICLR 2024] [Paper] [Code] [Project Page]
Physics-aware Hand-object Interaction Denoising. [CVPR 2024] [Paper]

Reconstruct Hand Object from RGB Images Videos

What's in your hands? 3D Reconstruction of Generic Objects in Hands. [CVPR 2022] [Paper] [Code] [Project Page]
AlignSDF: Pose-Aligned Signed Distance Fields for Hand-Object Reconstruction. [ECCV 2022] [Paper] [Code] [Project Page]
Reconstructing Hand-Held Objects from Monocular Video. [SIGGRAPH Asia 2022 Conference Track] [Paper] [Code] [Project Page]
Diffusion-Guided Reconstruction of Everyday Hand-Object Interaction Clips. [ICCV 2023 Oral] [Paper] [Code] [Project Page]
gSDF: Geometry-Driven Signed Distance Functions for 3D Hand-Object Reconstruction. [CVPR 2023] [Paper] [Code] [Project Page]
HOLD: Category-agnostic 3D Reconstruction of Interacting Hands and Objects from Video. [CVPR 2024] [Paper] [Code] [Project Page]
MOHO: Learning Single-view Hand-held Object Reconstruction with Multi-view Occlusion-Aware Supervision. [CVPR 2024] [Paper] [Code]
NCRF: Neural Contact Radiance Fields for Free-Viewpoint Rendering of Hand-Object Interaction. [arxiv 2024] [Paper]
EasyHOI: Unleashing the Power of Large Models for Reconstructing Hand-Object Interactions in the Wild. [CVPR 2025] [Paper] [Code] [Project Page]

Reconstruct Hand Object from RGB-D Images Videos

Single Depth View Based Real-Time Reconstruction of Hand-Object Interactions. [ACM Transactions on Graphics (TOG) 2021] [Paper]
Physical Interaction: Reconstructing Hand-object Interactions with Physics. [SIGGRAPH Asia 2022 Conference] [Paper] [Code]
BundleSDF: Neural 6-DoF Tracking and 3D Reconstruction of Unknown Objects. [CVPR 2023] [Paper] [Code] [Project Page]

Hand Object Motion Synthesis

Hand-Object Contact Consistency Reasoning for Human Grasps Generation. [ICCV 2021] [Paper] [Code] [Project Page]
D-Grasp: Physically Plausible Dynamic Grasp Synthesis for Hand-Object Interactions. [CVPR 2022] [Paper] [Code] [Project Page]
CAMS: CAnonicalized Manipulation Spaces for Category-Level Functional Hand-Object Manipulation Synthesis. [CVPR 2023] [Paper] [Code] [Project Page]
SynH2R: Synthesizing Hand-Object Motions for Learning Human-to-Robot Handovers. [arxiv 2023] [Paper]
Physically Plausible Full-Body Hand-Object Interaction Synthesis. [arxiv 2023] [Paper]
IMoS: Intent-Driven Full-Body Motion Synthesis for Human-Object Interactions. [EUROGRAPHICS 2023] [Paper] [Code] [Project Page]
MACS: Mass Conditioned 3D Hand and Object Motion Synthesis. [3DV 2024] [Paper] [Project Page]
FürElise: Capturing and Physically Synthesizing Hand Motions of Piano Performance. [SIGGRAPH Asia 2024] [Paper] [Project Page]
DiffH2O: Diffusion-Based Synthesis of Hand-Object Interactions from Textual Descriptions. [SIGGRAPH Asia 2024] [Paper] [Project Page]
Controllable Human-Object Interaction Synthesis. [ECCV 2024 Oral] [Paper] [Code] [Project Page]
UGG: Unified Generative Grasping. [ECCV 2024] [Paper] [Code] [Project Page]
GRIP: Generating Interaction Poses Using Spatial Cues and Latent Consistency. [3DV 2024] [Paper] [Code] [Project Page]
GEARS: Local Geometry-aware Hand-object Interaction Synthesis. [CVPR 2024] [Paper] [Code] [Project Page]
G-HOP: Generative Hand-Object Prior for Interaction Reconstruction and Grasp Synthesis. [CVPR 2024] [Paper] [Code] [Project Page]
ArtiGrasp: Physically Plausible Synthesis of Bi-Manual Dexterous Grasping and Articulation. [3DV 2024] [Paper] [Code] [Project Page]
GraspXL: Generating Grasping Motions for Diverse Objects at Scale. [ECCV 2024] [Paper] [Code] [Project Page]
Omnigrasp: Grasping Diverse Objects with Simulated Humanoids. [NeurIPS 2024] [Paper] [Code] [Project Page]
Text2HOI: Text-guided 3D Motion Generation for Hand-Object Interaction. [CVPR 2024] [Paper] [Code] [Project Page]
Task-Oriented Human-Object Interactions Generation with Implicit Neural Representations. [WACV 2024] [Paper]
Human-Object Interaction from Human-Level Instructions. [arxiv 2024] [Paper]
SemGrasp: Semantic Grasp Generation via Language Aligned Discretization. [arxiv 2024] [Paper] [Project Page]
ManiDext: Hand-Object Manipulation Synthesis via Continuous Correspondence Embeddings and Residual-Guided Diffusion. [arxiv 2024] [Paper] [Project Page]

Generate HOI Images Videos

Affordance Diffusion: Synthesizing Hand-Object Interactions. [CVPR 2023] [Paper] [Code] [Project Page]
HOIDiffusion: Generating Realistic 3D Hand-Object Interaction Data. [CVPR 2024] [Paper] [Code] [Project Page]
ManiVideo: Generating Hand-Object Manipulation Video with Dexterous and Generalizable Grasping. [CVPR 2025] [Paper]
TASTE-Rob: Advancing Video Generation of Task-Oriented Hand-Object Interaction for Generalizable Robotic Manipulation. [CVPR 2025] [Paper]

HOI Augmentation

HOGSA: Bimanual Hand-Object Interaction Understanding with 3D Gaussian Splatting Based Data Augmentation. [arxiv 2025] [Paper]

HOI Reenactment

HOI-Swap: Swapping Objects in Videos with Hand-Object Interaction Awareness. [NeurIPS 2024] [Paper] [Code] [Project Page]
Re-HOLD: Video Hand Object Interaction Reenactment via adaptive Layout-instructed Diffusion Model. [CVPR 2025] [Paper] [Project Page]

HOI Prediction

HandsOnVLM: Vision-Language Models for Hand-Object Interaction Prediction. [arxiv 2024] [Paper] [Code] [Project Page]

Human to Robotics

Human-to-Robot Imitation in the Wild. [RSS 2022] [Paper] [Project Page]
MimicPlay: Long-Horizon Imitation Learning by Watching Human Play. [CoRL 2023] [Paper] [Code] [Project Page]
Object-Centric Dexterous Manipulation from Human Motion Data. [CVPR 2024] [Paper] [Code] [Project Page]
OKAMI: Teaching Humanoid Robots Manipulation Skills through Single Video Imitation. [CoRL 2024] [Paper] [Project Page]
(GR1) Unleashing Large-Scale Video Generative Pre-training for Visual Robot Manipulation. [ICLR 2024] [Paper] [Code] [Project Page]
GR-2: A Generative Video-Language-Action Model with Web-Scale Knowledge for Robot Manipulation. [arxiv 2024] [Paper] [Project Page]
Bridging the Human to Robot Dexterity Gap through Object-Oriented Rewards. [arxiv 2024] [Paper] [Code] [Project Page]
VLM See, Robot Do: Human Demo Video to Robot Action Plan via Vision Language Model. [arxiv 2024] [Paper] [Code] [Project Page]
ORION: Vision-based Manipulation from Single Human Video with Open-World Object Graphs. [arxiv 2024] [Paper] [Project Page]
Hand-Object Interaction Pretraining from Videos. [ICRA 2025] [Paper] [Code] [Project Page]
Humanoid Policy ∼ Human Policy. [arxiv 2025] [Paper] [Code] [Project Page]
Physics-Driven Data Generation for Contact-Rich Manipulation via Trajectory Optimization. [arxiv 2025] [Paper] [Project Page]
EgoMimic: Scaling Imitation Learning via Egocentric Video. [arxiv 2025] [Paper] [Code] [Project Page]

haonanhe/Hand-Object-Interaction
