Ground Truth Labeling Guide

Date: 2026-05-07
Refs: #36 (HAND-004)

Purpose

This guide defines how to manually label hand-grasp images and video for the LABIS dataset. Ground truth labels are the reference against which AI model outputs are validated.

Who Labels

Primary annotator: trained researcher familiar with grasp taxonomy (see grasp-taxonomy.md).
When inter-annotator agreement is needed: two independent annotators label the same sample, and a third annotator resolves any disagreements.
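
As an illustration of the two-annotator workflow, the sketch below lists the samples a third annotator would need to resolve and reports the raw agreement rate. It assumes each annotator's labels are available as a mapping from a sample identifier to a label string, and that any difference in the full label (grasp type or confidence) counts as a disagreement. The function name, the sample IDs, and both assumptions are hypothetical, not part of the LABIS tooling.

```python
def compare_annotators(labels_a: dict[str, str], labels_b: dict[str, str]):
    """Return (disagreeing sample IDs, raw agreement rate) for shared samples.

    Assumption: two labels disagree if the full label strings differ.
    """
    shared = sorted(labels_a.keys() & labels_b.keys())
    disagreements = [s for s in shared if labels_a[s] != labels_b[s]]
    rate = (1.0 - len(disagreements) / len(shared)) if shared else 0.0
    return disagreements, rate


# Hypothetical labels from two independent annotators.
annotator_a = {"s001": "power_grasp:certain", "s002": "lateral_pinch:probable"}
annotator_b = {"s001": "power_grasp:certain", "s002": "precision_pinch:probable"}

to_resolve, agreement = compare_annotators(annotator_a, annotator_b)
print(to_resolve, agreement)
```

Samples in `to_resolve` would go to the third annotator. Raw agreement does not correct for chance agreement, so a chance-corrected statistic may be preferable when agreement is reported formally.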

Label Schema

Each sample in the manifest requires a ground_truth_label with the following format:

{grasp_id}:{confidence}

Where:

- grasp_id: one of the 6 taxonomy IDs (power_grasp, precision_pinch, spherical_grasp, lateral_pinch, tripod_grasp, hook_grasp) or no_grasp / ambiguous.
- confidence: certain, probable, or uncertain.
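
The schema can be checked mechanically before labels enter the manifest. The following is a minimal sketch of such a check, not an existing LABIS tool: the allowed values are copied from the schema above, while the names `GroundTruthLabel` and `parse_label` are hypothetical.

```python
from typing import NamedTuple

# Allowed values, copied from the label schema.
GRASP_IDS = {
    "power_grasp", "precision_pinch", "spherical_grasp",
    "lateral_pinch", "tripod_grasp", "hook_grasp",
    "no_grasp", "ambiguous",
}
CONFIDENCE_LEVELS = {"certain", "probable", "uncertain"}


class GroundTruthLabel(NamedTuple):
    grasp_id: str
    confidence: str


def parse_label(raw: str) -> GroundTruthLabel:
    """Split a '{grasp_id}:{confidence}' string and validate both parts."""
    parts = raw.strip().split(":")
    if len(parts) != 2:
        raise ValueError(f"expected '<grasp_id>:<confidence>', got {raw!r}")
    grasp_id, confidence = parts
    if grasp_id not in GRASP_IDS:
        raise ValueError(f"unknown grasp_id: {grasp_id!r}")
    if confidence not in CONFIDENCE_LEVELS:
        raise ValueError(f"unknown confidence: {confidence!r}")
    return GroundTruthLabel(grasp_id, confidence)
```

Calling `parse_label("power_grasp:certain")` returns the two fields as a tuple, while a malformed or misspelled label raises `ValueError` so it can be corrected at entry time.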

Examples

power_grasp:certain — clearly a power grasp, no ambiguity.
precision_pinch:probable — looks like precision pinch but slight finger position uncertainty.
ambiguous:uncertain — cannot determine grasp type from this angle/frame.

Labeling Procedure

Open the image or video frame.
Identify which grasp type is being performed using grasp-taxonomy.md definitions.
Assess confidence (certain, probable, or uncertain) based on how clearly the key landmarks are visible.
Record the label in the ground_truth_label column of the manifest CSV.
If the grasp does not match any taxonomy entry, use no_grasp or ambiguous.
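
The recording step could be scripted along the following lines. This is a sketch under stated assumptions: only the `ground_truth_label` column name comes from this guide, while the `sample_id` and `path` columns, the sample values, and the function name `set_ground_truth` are hypothetical. It works on CSV text in memory so it can be tried without touching a real manifest.

```python
import csv
import io


def set_ground_truth(manifest_csv: str, sample_id: str, label: str) -> str:
    """Return a copy of the manifest text with one row's ground_truth_label set."""
    reader = csv.DictReader(io.StringIO(manifest_csv))
    fieldnames = list(reader.fieldnames or [])
    # Add the label column if the manifest does not have it yet.
    if "ground_truth_label" not in fieldnames:
        fieldnames.append("ground_truth_label")

    rows = list(reader)
    matched = False
    for row in rows:
        # "sample_id" is an assumed key column, not defined by this guide.
        if row["sample_id"] == sample_id:
            row["ground_truth_label"] = label
            matched = True
    if not matched:
        raise KeyError(f"sample_id not in manifest: {sample_id}")

    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=fieldnames, restval="", lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()


# Hypothetical two-row manifest without labels.
manifest = "sample_id,path\ns001,a.jpg\ns002,b.mp4\n"
updated = set_ground_truth(manifest, "s002", "hook_grasp:probable")
print(updated)
```

Rows that have not been labeled yet keep an empty `ground_truth_label` field, which makes unlabeled samples easy to find later.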

Quality Criteria

Label only frames where the hand is in a stable grasp position (not transitioning).
If occlusion prevents identification, label as ambiguous:uncertain.
For video: label the frame with the clearest grasp and note its timestamp.
Never guess based on the object alone; the hand posture must confirm the grasp type.

Common Mistakes

Labeling based on object identity rather than observed hand posture.
Confusing lateral_pinch (thumb-to-index-side) with precision_pinch (tip-to-tip).
Labeling transitional frames where the grasp is not yet formed.