© 2018 IEEE. Hand gesture recognition (HGR) from sequences of depth maps is a challenging computer vision task because of low inter-class and high intra-class variability, the different execution rates of each gesture, and the highly articulated nature of the human hand. In this paper, a multilevel temporal sampling (MTS) method is first proposed that is based on the motion energy of keyframes of depth sequences. As a result, long, middle, and short sequences are generated that contain the relevant gesture information. MTS increases the intra-class similarity while raising the inter-class dissimilarity. The weighted depth motion map (WDMM) is then proposed to extract spatiotemporal information from the generated summarized sequences by accumulating the weighted absolute differences of consecutive frames. The histogram of gradients and local binary pattern are used to extract features from the WDMM. The obtained results define the current state of the art for 3D HGR on three public benchmark datasets: MSR Gesture 3D, SKIG, and MSR Action 3D. We also achieve competitive results on the NTU action dataset.
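The accumulation step behind the WDMM can be illustrated with a minimal sketch. The abstract does not specify the weighting scheme, so the linearly increasing weights below are purely an assumption, as are the function name `weighted_depth_motion_map` and its `weights` parameter; this is not the authors' implementation.

```python
import numpy as np


def weighted_depth_motion_map(frames, weights=None):
    """Accumulate weighted absolute differences of consecutive depth frames.

    frames:  array-like of shape (T, H, W) holding T depth maps.
    weights: optional array of T-1 weights, one per frame difference.
             ASSUMPTION: if omitted, linearly increasing weights are used so
             that later motion counts more; the paper's scheme may differ.
    """
    frames = np.asarray(frames, dtype=np.float64)
    if frames.ndim != 3 or frames.shape[0] < 2:
        raise ValueError("expected at least two frames shaped (T, H, W)")

    # Absolute difference between each pair of consecutive frames: (T-1, H, W).
    diffs = np.abs(np.diff(frames, axis=0))

    if weights is None:
        n = diffs.shape[0]
        weights = np.arange(1, n + 1, dtype=np.float64) / n
    weights = np.asarray(weights, dtype=np.float64)
    if weights.shape != (diffs.shape[0],):
        raise ValueError("need exactly one weight per consecutive-frame difference")

    # Weighted sum over the time axis yields a single (H, W) motion map.
    return np.tensordot(weights, diffs, axes=1)


# Toy sequence of three 2x2 depth maps with uniform values 0, 1, and 3.
toy_frames = np.zeros((3, 2, 2))
toy_frames[1] = 1.0
toy_frames[2] = 3.0
toy_wdmm = weighted_depth_motion_map(toy_frames)
```

In the pipeline described above, a map like this would be computed for each of the long, middle, and short sampled sequences before descriptors are extracted from it.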
|Journal||IEEE Transactions on Circuits and Systems for Video Technology|
|Publication status||Published - 1 Jun 2019|
- Hand gesture recognition
- Multilevel temporal sampling
- Spatiotemporal description
- VLAD encoding
- Weighted depth motion map