Hand and Grasp Recognition Research Plan¶
Date: 2026-05-05
Goal¶
Assess whether AI pose/vision methods can recognize clinically relevant hand grasps and object interactions for the LABIS research line Monica and Cristina discussed.
Research Questions¶
- Which grasp taxonomy should be used for the first validation set?
- Can OpenPose hand keypoints distinguish the requested grasp types?
- Does MediaPipe Hands or another hand model outperform OpenPose for this use case?
- What object context is required to avoid confusing hand pose with actual grasp?
Candidate Dataset Matrix¶
- Power grasp: bottle, cylinder, handle.
- Precision pinch: coin, pen, small cube.
- Spherical grasp: tennis ball, 3D-printed semisphere.
- Lateral/key pinch: key/card.
- Tripod grasp: pen/stylus.
- Hook grasp: bag handle.
Each row needs:
- object
- grasp type
- hand side
- camera angle
- occlusion level
- expected keypoint visibility
- model output
- pass/fail criterion
Test Strategy¶
- Start with synthetic/sourced image smoke tests to validate pipeline wiring only.
- Move to real controlled images/videos for scientific conclusions.
- Extract hand keypoints with OpenPose and at least one second model.
- Classify grasp candidates with a trained model or rule-based heuristics.
- Validate against manually labeled ground truth.
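To make the rule-based branch of the strategy concrete, here is a minimal classifier over a 21-point hand skeleton in the MediaPipe landmark ordering (0 = wrist, 4 = thumb tip, 8/12/16/20 = finger tips, 9 = middle-finger MCP). The thresholds and the two rules are illustrative assumptions, not validated values; a real classifier would be tuned against the labeled ground truth.

```python
import math

def dist(a, b):
    return math.hypot(a[0] - b[0], a[1] - b[1])

def classify_grasp(pts):
    """pts: list of 21 (x, y) tuples in image-normalized coordinates.

    Assumes MediaPipe-style landmark ordering; thresholds are placeholders.
    """
    hand_size = dist(pts[0], pts[9]) or 1e-6   # wrist -> middle MCP as scale
    pinch = dist(pts[4], pts[8]) / hand_size   # thumb tip vs. index tip
    # Mean fingertip-to-wrist distance indicates how curled the fingers are.
    curl = sum(dist(pts[0], pts[t]) for t in (8, 12, 16, 20)) / (4 * hand_size)
    if pinch < 0.3:
        return "precision_pinch"
    if curl < 0.9:
        return "power"            # fingers wrapped toward the palm
    return "open_or_other"        # defer to model output or manual labels
```

Normalizing by hand size keeps the rules invariant to camera distance, which matters once both OpenPose and the second model feed the same classifier.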
Public Dataset Fixture Plan¶
Public datasets can unblock the keypoint pipeline and regression tests, but they do not replace a consented private LABIS dataset for clinical claims. Use them as external fixtures with source URLs, licenses, SHA-256 hashes, and split metadata; do not vendor third-party photos into this repo unless the license explicitly permits redistribution.
Recommended external fixtures:
| Source | What it provides | Best use | License/access gate |
|---|---|---|---|
| CMU Panoptic HandDB | Real hand images with manual 21-keypoint annotations | Small public keypoint fixture | Verify dataset README before redistribution |
| COCO-WholeBody / WholeBody-Hand | In-the-wild hand boxes and hand keypoints over COCO photos | Robustness checks on real photos | COCO/Flickr image terms plus annotation license |
| FreiHAND | Real single-hand RGB with 21 3D keypoints and MANO data | Keypoint validation outside clinical images | Research-only terms |
| HaGRID / HaGRIDv2 | Real gesture/grasp-like labels and hand boxes | Grasp/gesture smoke fixture | Check current CC BY-SA variant and attribution requirements |
| HO-3D / H2O-3D | Hand-object interaction with object/hand pose | Object-grasp feasibility study | Academic/research terms; no redistribution without permission |
Minimum reproducible public fixture:
- 10-20 CMU or COCO-WholeBody hand crops with 21-keypoint ground truth.
- 10 HaGRID samples across `grip`, `grabbing`, `fist`, `palm`, and `no_gesture`.
- Store only `external://...` references in `data/external/manifest.csv` until redistribution rights are verified.
- Run MediaPipe and OpenPose through the same normalization schema and store only model outputs generated locally.
Acceptance boundary for HAND-004:
- Public external fixtures can close the pipeline-data blocker.
- Clinical/scientific conclusions still require the LABIS consented capture protocol in `docs/capture-protocol.md` and `docs/templates/consent-form.md`.
Outputs¶
- bibliography summary
- `docs/grasp-taxonomy.md`
- labeled image/video inventory
- model comparison notebook
- feasibility report