Multi-modal Leap Motion dataset for Hand Gesture Recognition

Description

The dataset is composed of 15 different hand gestures (shown below, where the first row represents dynamic gestures and the second row represents static ones) performed by 25 different subjects (8 women and 17 men). Each instance of a gesture consists of about 200 near-infrared frames, and the gestures are performed at different locations in the image.

Gestures

Dynamic gestures:

  • A: go down
  • B: go left
  • C: go right
  • D: go up

Static gestures:

  • A: l
  • B: fist m
  • C: index
  • D: ok
  • E: c
  • F: heavy
  • G: hang
  • H: two
  • I: three
  • J: four
  • K: five
  • L: palm
  • M: down
  • N: palm m
  • O: palm u
  • P: up


The dataset is structured as follows:

/near-infrared (near-infrared images)
  -->/00 (subject with identifier 00)
    -->/test_gesture (hand gesture test images for subject 00)
      -->/02_l (test samples of the hand gesture with identifier 02_l)
        -->/00 (images for sample 00 of hand gesture 02_l)
          -->/00/frame_4312_l.png,...,frame_4459_r.png,... (images that correspond to one repetition of the L hand gesture performed by the subject with identifier 00)
        -->/02
        ...
        -->/05
      -->/04_fist_moved
      ...
      -->/22_up
    -->/test_pose (hand pose test images for subject 00)
      -->/02_l
      ...
      -->/22_up
    -->/train_pose (hand pose training images for subject 00)
      -->/02_l
      ...
      -->/22_up
  -->/01
  -->/02
  ...
  -->/14 (last subject with identifier 14)
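
As an illustration of how this layout might be traversed, the following Python sketch parses the folder and file names shown above and lists the frames of one repetition. The function names are hypothetical, and reading the `_l`/`_r` suffix as the left/right camera is an assumption that this description does not state.

```python
import re
from pathlib import Path

# Matches image names such as frame_4312_l.png. Interpreting the suffix as
# the left/right camera is an assumption, not something stated in the text.
FRAME_RE = re.compile(r"^frame_(\d+)_([lr])\.png$")


def parse_frame_name(name):
    """Return (frame_number, suffix) for an image file name, or None."""
    match = FRAME_RE.match(name)
    if match is None:
        return None
    return int(match.group(1)), match.group(2)


def parse_gesture_folder(name):
    """Split a gesture folder such as '04_fist_moved' into (4, 'fist_moved')."""
    number, _, label = name.partition("_")
    return int(number), label


def list_sample_frames(root, subject, split, gesture, sample):
    """List (frame_number, suffix, path) for one repetition, sorted by frame."""
    sample_dir = Path(root) / "near-infrared" / subject / split / gesture / sample
    frames = []
    for path in sample_dir.glob("frame_*.png"):
        parsed = parse_frame_name(path.name)
        if parsed is not None:
            frames.append((parsed[0], parsed[1], path))
    return sorted(frames, key=lambda item: (item[0], item[1]))
```

For example, `list_sample_frames("path/to/dataset", "00", "test_gesture", "02_l", "00")` would collect the images of sample 00 of gesture 02_l for subject 00, where `path/to/dataset` is a placeholder for the local copy.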


/skeletal (skeletal information stored in XML files)
  -->/00 (subject with identifier 00)
    -->/test_gesture (hand gesture test skeletal information for subject 00)
      -->/02_l (test samples of the hand gesture with identifier 02_l)
        -->/00 (skeletal information for sample 00 of hand gesture 02_l)
          -->/00/frame_4312.xml,...,frame_4459.xml,... (XML files with the skeletal information that corresponds to one repetition of the L hand gesture performed by the subject with identifier 00)
        -->/02
        ...
        -->/05
      -->/04_fist_moved
      ...
      -->/22_up
    -->/test_pose (hand pose test skeletal information for subject 00)
      -->/02_l
      ...
      -->/22_up
    -->/train_pose (hand pose training skeletal information for subject 00)
      -->/02_l
      ...
      -->/22_up
  -->/01
  -->/02
  ...
  -->/24 (last subject with identifier 24)
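
Since the skeletal tree uses the same folder names as the near-infrared tree, an image path can plausibly be mapped to its skeletal file. The sketch below assumes that `frame_N_l.png` and `frame_N_r.png` both correspond to `frame_N.xml` in the mirrored folder; that correspondence is inferred from the matching frame numbers in the examples and is not confirmed by this description. The function name is hypothetical.

```python
from pathlib import Path


def skeletal_path_for(image_path):
    """Map .../near-infrared/.../frame_N_l.png to .../skeletal/.../frame_N.xml.

    Assumes the two trees mirror each other and share frame numbers.
    """
    image_path = Path(image_path)
    parts = list(image_path.parts)
    # Swap the top-level modality folder; raises ValueError if it is absent.
    parts[parts.index("near-infrared")] = "skeletal"
    # Drop the trailing _l/_r suffix from the file stem.
    base, _, suffix = image_path.stem.rpartition("_")
    if suffix not in ("l", "r"):
        raise ValueError(f"unexpected image name: {image_path.name}")
    parts[-1] = base + ".xml"
    return Path(*parts)
```

The XML file may not exist for every image, so checking `skeletal_path_for(p).exists()` before reading is advisable.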


The xml with the skeletal information is structured as follows:

Frame (name of the Leap Motion structure that encloses the skeletal information)
  -->Images (separates the information obtained by the two infrared cameras of the Leap Motion)
    -->RightImage (information from the right camera)
      -->Hands (information about the hands detected by the Leap Motion)
        -->Right (information about the right hand; the left hand is not included because the dataset only uses right-hand gestures)
          -->Center (position of the palm center)
          -->Normal (normal vector of the palm)
          -->Direction (direction in which the palm is pointing)
          -->Velocity (velocity of the hand)
          -->SphereCenter (position of the hand center, also taking the fingers into account)
          -->Confidence (indicates whether the hand is well detected)
          -->PinchStrength (hand pinch strength)
          -->GrabStrength (hand grab strength)
          -->SphereRadius (radius of the sphere defined by the hand, centered at SphereCenter)
          -->Fingers (information about the fingers)
            -->Thumb (thumb information)
              -->Type (finger type, in this case Thumb)
              -->TipPosition (instantaneous tip position, in mm from the Leap Motion origin)
              -->TipDirection (current pointing direction vector)
              -->TipVelocity (instantaneous tip velocity)
              -->TipLength (apparent length of the finger)
              -->dipPosition (position of the distal interphalangeal (DIP) joint)
              -->pipPosition (position of the proximal interphalangeal (PIP) joint)
              -->mcpPosition (position of the metacarpophalangeal (MCP) joint)
            -->Index (index finger information)
            -->Middle (middle finger information)
            -->Ring (ring finger information)
            -->Pinky (pinky finger information)
    -->LeftImage (information from the left camera)
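
A skeletal file with this structure could be read with Python's standard `xml.etree.ElementTree` module, as in the sketch below. Only the element names come from the description above. The nesting path `Images/RightImage/Hands/Right`, the storage of vectors as whitespace- or comma-separated element text, and the sample values are all assumptions made for illustration and should be checked against a real file.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: the element names follow the description, but the value
# encoding (element text, separators) and the numbers are assumptions.
SAMPLE_XML = """
<Frame>
  <Images>
    <RightImage>
      <Hands>
        <Right>
          <Center>1.0 2.0 3.0</Center>
          <Fingers>
            <Thumb>
              <TipPosition>4.0 5.0 6.0</TipPosition>
            </Thumb>
          </Fingers>
        </Right>
      </Hands>
    </RightImage>
  </Images>
</Frame>
"""


def read_vector(element):
    """Parse element text such as '1.0 2.0 3.0' into a tuple of floats."""
    return tuple(float(v) for v in element.text.replace(",", " ").split())


def right_hand_summary(root, image="RightImage"):
    """Return the palm center and fingertip positions, or None if no hand."""
    hand = root.find(f"Images/{image}/Hands/Right")
    if hand is None:
        return None
    center = hand.find("Center")
    tips = {}
    for finger in hand.findall("Fingers/*"):
        tip = finger.find("TipPosition")
        if tip is not None and tip.text:
            tips[finger.tag] = read_vector(tip)
    has_center = center is not None and center.text
    return {"center": read_vector(center) if has_center else None, "tips": tips}


root = ET.fromstring(SAMPLE_XML.strip())
summary = right_hand_summary(root)
```

For a file on disk, `ET.parse(path).getroot()` would replace the `ET.fromstring` call. Returning `None` when the `Right` element is missing keeps frames without a detected hand easy to skip.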