Hand And Grasp Recognition Research Plan

Date: 2026-05-05

Goal

Assess whether AI pose/vision methods can recognize clinically relevant hand grasps and object interactions for the LABIS research line Monica and Cristina discussed.

Research Questions

  1. Which grasp taxonomy should be used for the first validation set?
  2. Can OpenPose hand keypoints distinguish the requested grasp types?
  3. Does MediaPipe Hands or another hand model outperform OpenPose for this use case?
  4. What object context is required to avoid confusing hand pose with actual grasp?

Candidate Dataset Matrix

  • Power grasp: bottle, cylinder, handle.
  • Precision pinch: coin, pen, small cube.
  • Spherical grasp: tennis ball, 3D-printed hemisphere.
  • Lateral/key pinch: key/card.
  • Tripod grasp: pen/stylus.
  • Hook grasp: bag handle.

Each row needs:

  • object
  • grasp type
  • hand side
  • camera angle
  • occlusion level
  • expected keypoint visibility
  • model output
  • pass/fail criterion
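The per-row metadata above could be captured as a small record type so every matrix entry is complete and machine-checkable. A minimal sketch in Python; the field names, example values, and output path are assumptions of this sketch, not a fixed schema:

```python
from dataclasses import dataclass, asdict
from typing import Optional

# One row of the candidate dataset matrix; field names and example
# values are illustrative assumptions, not a fixed schema.
@dataclass
class MatrixRow:
    object_name: str            # e.g. "bottle"
    grasp_type: str             # e.g. "power", "precision_pinch"
    hand_side: str              # "left" or "right"
    camera_angle: str           # e.g. "frontal", "lateral_45"
    occlusion_level: str        # "low" / "medium" / "high"
    expected_visibility: float  # expected fraction of the 21 keypoints visible
    model_output: str           # path to the stored keypoint output
    passed: Optional[bool] = None  # pass/fail criterion; None = not yet evaluated

row = MatrixRow("bottle", "power", "right", "frontal", "low", 0.9,
                "outputs/bottle_power_right.json")
print(asdict(row)["grasp_type"])  # → power
```

Serializing rows with `asdict` makes it easy to dump the whole matrix to CSV or JSON for the labeled inventory deliverable.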

Test Strategy

  1. Start with synthetic/sourced image smoke tests to validate pipeline wiring only.
  2. Move to real controlled images/videos for scientific conclusions.
  3. Extract hand keypoints with OpenPose and at least one second model.
  4. Classify grasp candidates with a trained model or rule-based heuristics.
  5. Validate against manually labeled ground truth.
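The rule-based branch of the classification step can be sketched with simple geometric heuristics over 21 hand keypoints. The index layout below (wrist = 0, thumb tip = 4, index tip = 8) follows the common MediaPipe-style convention; the thresholds are placeholder assumptions that would need tuning against the manually labeled ground truth:

```python
import math

# 21-keypoint hand in the common MediaPipe-style ordering:
# 0 = wrist, 4 = thumb tip, 8 = index tip, 9 = middle MCP, 12 = middle tip.
WRIST, THUMB_TIP, INDEX_TIP, MIDDLE_MCP, MIDDLE_TIP = 0, 4, 8, 9, 12

def dist(a, b):
    return math.hypot(a[0] - b[0], a[1] - b[1])

def classify_grasp(kps):
    """Rough rule-based grasp candidate from 2D keypoints.

    Thresholds are illustrative assumptions; they must be tuned
    against the manually labeled ground truth before any conclusions.
    """
    hand_size = dist(kps[WRIST], kps[MIDDLE_MCP])      # normalization scale
    pinch_gap = dist(kps[THUMB_TIP], kps[INDEX_TIP]) / hand_size
    curl = dist(kps[WRIST], kps[MIDDLE_TIP]) / hand_size
    if pinch_gap < 0.3:
        return "precision_pinch"                       # thumb and index nearly touching
    if curl < 1.2:
        return "power"                                 # fingertips pulled toward the palm
    return "open_hand"

# Toy open hand: fingers extended, thumb far from the index tip.
open_hand = {WRIST: (0, 0), MIDDLE_MCP: (0, 1),
             THUMB_TIP: (-1, 1), INDEX_TIP: (0.5, 2), MIDDLE_TIP: (0, 2.2)}
print(classify_grasp(open_hand))  # → open_hand
```

Normalizing by a within-hand distance keeps the rules invariant to image scale, which matters when mixing camera angles and distances across the matrix.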

Public Dataset Fixture Plan

Public datasets can unblock the keypoint pipeline and regression tests, but they do not replace a consented private LABIS dataset for clinical claims. Use them as external fixtures with source URLs, licenses, SHA-256 hashes, and split metadata; do not vendor third-party photos into this repo unless the license explicitly permits redistribution.

Recommended external fixtures:

  • CMU Panoptic HandDB: real hand images with manual 21-keypoint annotations. Best use: small public keypoint fixture. Access gate: verify the dataset README before redistribution.
  • COCO-WholeBody / WholeBody-Hand: in-the-wild hand boxes and hand keypoints over COCO photos. Best use: robustness checks on real photos. Access gate: COCO/Flickr image terms plus the annotation license.
  • FreiHAND: real single-hand RGB with 21 3D keypoints and MANO data. Best use: keypoint validation outside clinical images. Access gate: research-only terms.
  • HaGRID / HaGRIDv2: real gesture/grasp-like labels and hand boxes. Best use: grasp/gesture smoke fixture. Access gate: check the current CC BY-SA variant and attribution requirements.
  • HO-3D / H2O-3D: hand-object interaction with object/hand pose. Best use: object-grasp feasibility study. Access gate: academic/research terms; no redistribution without permission.
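Tracking these fixtures by reference could look like one manifest row per sample, recording the metadata this plan requires (source URL, license, SHA-256, split) without vendoring any files. The column names, the example `external://` reference, and the URL below are hypothetical placeholders:

```python
import csv
import hashlib
import io

# Hypothetical column set for data/external/manifest.csv; names are an
# assumption based on the metadata this plan requires.
FIELDS = ["ref", "source_url", "license", "sha256", "split"]

def sha256_hex(blob: bytes) -> str:
    """Content hash so a re-downloaded fixture can be verified byte-for-byte."""
    return hashlib.sha256(blob).hexdigest()

buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=FIELDS)
writer.writeheader()
writer.writerow({
    "ref": "external://freihand/sample_00001.jpg",   # reference only, never the image
    "source_url": "https://example.org/freihand",    # placeholder URL
    "license": "research-only",
    "sha256": sha256_hex(b"placeholder image bytes"),
    "split": "smoke",
})
print(buf.getvalue().splitlines()[0])  # → ref,source_url,license,sha256,split
```

A CI step could re-hash each downloaded fixture and fail if it no longer matches its manifest entry, keeping regression tests reproducible.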

Minimum reproducible public fixture:

  • 10-20 CMU or COCO-WholeBody hand crops with 21-keypoint ground truth.
  • 10 HaGRID samples across grip, grabbing, fist, palm, and no_gesture.
  • Store only external://... references in data/external/manifest.csv until redistribution rights are verified.
  • Run MediaPipe and OpenPose through the same normalization schema and store only model outputs generated locally.
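The shared normalization schema could, for instance, translate the wrist to the origin and scale by the wrist-to-middle-MCP distance so MediaPipe and OpenPose outputs land in one comparable frame. The reference indices (wrist = 0, middle MCP = 9, per the common 21-keypoint layout) are an assumption of this sketch:

```python
import numpy as np

# Normalize a (21, 2) keypoint array to a model-agnostic frame:
# wrist at the origin, wrist-to-middle-MCP distance scaled to 1.
# Index choices follow the common 21-keypoint layout and are an
# assumption of this sketch.
WRIST, MIDDLE_MCP = 0, 9

def normalize_hand(kps: np.ndarray) -> np.ndarray:
    centered = kps - kps[WRIST]
    scale = np.linalg.norm(centered[MIDDLE_MCP])
    if scale == 0:
        raise ValueError("degenerate hand: wrist and middle MCP coincide")
    return centered / scale

# Toy input: every keypoint at (10, 20) except the middle MCP at (10, 25).
kps = np.array([[10.0, 20.0]] * 21)
kps[MIDDLE_MCP] = [10.0, 25.0]
out = normalize_hand(kps)
print(out[WRIST], out[MIDDLE_MCP])  # → [0. 0.] [0. 1.]
```

Storing only these normalized arrays (plus the raw model JSON generated locally) keeps fixtures model-agnostic and avoids redistributing source imagery.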

Acceptance boundary for HAND-004:

  • Public external fixtures can close the pipeline-data blocker.
  • Clinical/scientific conclusions still require the LABIS consented capture protocol in docs/capture-protocol.md and docs/templates/consent-form.md.

Outputs

  • bibliography summary
  • docs/grasp-taxonomy.md
  • labeled image/video inventory
  • model comparison notebook
  • feasibility report