Ground Truth Labeling Guide
Date: 2026-05-07 Refs: #36 (HAND-004)
Purpose
This guide defines how to manually label hand grasp images/video for the LABIS dataset. Ground truth labels are essential for validating AI model outputs.
Who Labels
- Primary annotator: trained researcher familiar with grasp taxonomy (see grasp-taxonomy.md).
- If inter-annotator agreement is needed, two independent annotators label the same sample and a third annotator resolves any disagreements.
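As a rough illustration of the two-annotator workflow, the sketch below lists the samples that would need to go to the third annotator. The function name and the dict-of-labels input are hypothetical. It also assumes that only a differing grasp_id counts as a disagreement, while a differing confidence does not; the guide itself does not define this.

```python
def find_disagreements(labels_a: dict[str, str], labels_b: dict[str, str]) -> list[str]:
    """Return sample IDs labeled by both annotators whose grasp_id differs.

    Each dict maps a sample ID to a '{grasp_id}:{confidence}' string.
    Assumption: confidence differences alone are not treated as disagreements.
    """
    shared = labels_a.keys() & labels_b.keys()
    return sorted(
        sid for sid in shared
        if labels_a[sid].split(":")[0] != labels_b[sid].split(":")[0]
    )


annotator_1 = {"s1": "power_grasp:certain", "s2": "lateral_pinch:probable"}
annotator_2 = {"s1": "power_grasp:probable", "s2": "precision_pinch:certain"}
to_review = find_disagreements(annotator_1, annotator_2)
```

Here only s2 would be passed to the third annotator, since both annotators chose power_grasp for s1.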
Label Schema
Each sample in the manifest requires a ground_truth_label with the following format:
{grasp_id}:{confidence}
Where:
- grasp_id: One of the 6 taxonomy IDs (power_grasp, precision_pinch, spherical_grasp, lateral_pinch, tripod_grasp, hook_grasp) or no_grasp / ambiguous.
- confidence: certain, probable, or uncertain.
Examples
- power_grasp:certain - clearly a power grasp, no ambiguity.
- precision_pinch:probable - looks like a precision pinch, but with slight uncertainty about finger position.
- ambiguous:uncertain - cannot determine the grasp type from this angle/frame.
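The label format above is simple enough to check mechanically before a manifest is accepted. The following is a minimal sketch, not part of the LABIS tooling: the names GRASP_IDS, CONFIDENCES, GroundTruthLabel, and parse_label are hypothetical, and the allowed values are copied from the schema in this guide.

```python
from typing import NamedTuple

# Allowed values, copied from the Label Schema section of this guide.
GRASP_IDS = {
    "power_grasp", "precision_pinch", "spherical_grasp",
    "lateral_pinch", "tripod_grasp", "hook_grasp",
    "no_grasp", "ambiguous",
}
CONFIDENCES = {"certain", "probable", "uncertain"}


class GroundTruthLabel(NamedTuple):
    grasp_id: str
    confidence: str


def parse_label(raw: str) -> GroundTruthLabel:
    """Split a '{grasp_id}:{confidence}' string and validate both parts."""
    parts = raw.strip().split(":")
    if len(parts) != 2:
        raise ValueError(f"expected 'grasp_id:confidence', got {raw!r}")
    grasp_id, confidence = parts
    if grasp_id not in GRASP_IDS:
        raise ValueError(f"unknown grasp_id: {grasp_id!r}")
    if confidence not in CONFIDENCES:
        raise ValueError(f"unknown confidence: {confidence!r}")
    return GroundTruthLabel(grasp_id, confidence)


label = parse_label("precision_pinch:probable")
```

Rejecting unknown values with an error, rather than silently correcting them, keeps typos such as a misspelled grasp ID from entering the ground truth.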
Labeling Procedure
- Open image or video frame.
- Identify which grasp type is being performed using grasp-taxonomy.md definitions.
- Assess confidence based on visibility of key landmarks.
- Record the label in the ground_truth_label column of the manifest CSV.
- If the grasp does not match any taxonomy entry, use no_grasp or ambiguous.
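The recording step can also be scripted. This is a sketch under stated assumptions: the guide only names the ground_truth_label column, so the sample_id column used to find the row, the function name record_label, and the file layout are all hypothetical.

```python
import csv
from pathlib import Path


def record_label(manifest_path: str, sample_id: str, label: str) -> None:
    """Write `label` into the ground_truth_label column for one sample."""
    path = Path(manifest_path)
    with path.open(newline="", encoding="utf-8") as f:
        reader = csv.DictReader(f)
        fieldnames = list(reader.fieldnames or [])
        rows = list(reader)

    # Add the column if this manifest does not have it yet.
    if "ground_truth_label" not in fieldnames:
        fieldnames.append("ground_truth_label")

    matched = False
    for row in rows:
        # "sample_id" is an assumed column name, not defined by this guide.
        if row.get("sample_id") == sample_id:
            row["ground_truth_label"] = label
            matched = True
    if not matched:
        raise KeyError(f"sample_id not found in manifest: {sample_id!r}")

    # Rows without a label yet are written with an empty cell.
    with path.open("w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames, restval="")
        writer.writeheader()
        writer.writerows(rows)
```

A call such as record_label("manifest.csv", "s2", "hook_grasp:probable") would rewrite the file in place, so it is worth keeping the manifest under version control or backing it up first.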
Quality Criteria
- Label only frames where the hand is in a stable grasp position (not transitioning).
- If occlusion prevents identification, label as ambiguous:uncertain.
- For video: label the frame with the clearest grasp and note the timestamp.
- Never guess based on the object alone; the hand posture must confirm the grasp type.
Common Mistakes
- Labeling based on object identity rather than observed hand posture.
- Confusing lateral_pinch (thumb-to-index-side) with precision_pinch (tip-to-tip).
- Labeling transitional frames where the grasp is not yet formed.