Give your voice assistant a thumbs-up!

Ilya Zhuravlev
2 min read · Sep 10, 2019

Surfing through the trending GitHub repositories, I came across the OpenPose Project.

Testing the Crazy Uptown Funk flashmob in Sydney video sequence with OpenPose

OpenPose, by the CMU Perceptual Computing Lab, is a real-time multi-person keypoint detection library for body, face, hand, and foot estimation. It is written in C++ and runs on Linux (Ubuntu 14 and 16), Mac OS X, and Windows 8/10. Put simply, it makes it possible to detect and recognize gestures, poses, facial expressions, and movements in real time.

What are the features?

OpenPose can detect human bodies in 2D video using 15-, 18-, or 25-keypoint body and foot estimation, 2×21-keypoint hand estimation, and 70-keypoint face estimation, for one or more people in the frame. With multiple cameras connected, it can recover the position of each keypoint by triangulation, making the captured poses and gestures three-dimensional. Video input can come from synchronized FLIR (formerly Point Grey) cameras or even ordinary webcams. It is also possible to select and track a specific person in the input for a further speed-up or visual smoothing.

The output produced from the video input consists of the original image with the keypoints drawn on it, which can be displayed or saved in common image formats, and the keypoints themselves, which can be saved in data-exchange formats (such as JSON or XML) and/or accessed as an array class.
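To make the JSON output more concrete, here is a minimal Python sketch of reading it back. It assumes the layout that recent OpenPose releases write: a top-level "people" list in which each person has a flat "pose_keypoints_2d" list of x, y, confidence values. Older versions may use different field names, so treat the keys as assumptions. The function name parse_people and the sample string are made up for illustration.

```python
import json


def parse_people(json_text):
    """Return, for each detected person, a list of (x, y, confidence) body keypoints."""
    data = json.loads(json_text)
    people = []
    for person in data.get("people", []):
        # Assumed field name; the list is flat: [x0, y0, c0, x1, y1, c1, ...]
        flat = person.get("pose_keypoints_2d", [])
        triples = [tuple(flat[i:i + 3]) for i in range(0, len(flat) - 2, 3)]
        people.append(triples)
    return people


# Hand-written sample with one person and two keypoints (not real OpenPose output).
sample = '{"version": 1.3, "people": [{"pose_keypoints_2d": [100.0, 50.0, 0.9, 110.0, 80.0, 0.8]}]}'

for index, keypoints in enumerate(parse_people(sample)):
    print("person", index, "has", len(keypoints), "keypoints; first:", keypoints[0])
```

The same loop would work for the hand and face lists if they are stored in the same flat x, y, confidence form.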

Why OpenPose?

Inference time comparison between the 3 available pose estimation libraries.

OpenPose shows the best speed results among the available pose estimation libraries it is compared with, Alpha-Pose and Mask R-CNN. According to the comparison in the project's GitHub repository, the runtime of the two other systems grows linearly with the number of people in the image, while the runtime of OpenPose stays roughly constant: it takes around a quarter of a second to process 40 people at maximum accuracy.

What can we do with that?

OpenPose can have a great impact on video games and virtual reality by providing fast body-motion and pose capture. Beyond the gaming industry, voice control has become a very popular alternative way of interacting with our gadgets and personal computers, but, as we all know, not everyone can speak and hear. Gesture capture could become a new way to interact with technology for people with hearing or speech impairments, and it can enrich the user experience for everybody else. So, give your voice assistant a thumbs-up, and it will answer “Hooray!”.
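As a rough illustration of the thumbs-up idea, here is a small Python heuristic that works on one hand's 21 keypoints. It is only a sketch under stated assumptions, not something shipped with OpenPose: it assumes the hand layout is 0 = wrist, 1 to 4 = thumb, then four keypoints per finger, that image y grows downward, and that the hand is upright in the frame. The function names, the confidence threshold, and the synthetic hand are all invented for the example.

```python
import math

WRIST, THUMB_TIP = 0, 4
# (middle joint, fingertip) index pairs for index, middle, ring and little finger (assumed layout).
FINGERS = [(6, 8), (10, 12), (14, 16), (18, 20)]


def distance(a, b):
    return math.hypot(a[0] - b[0], a[1] - b[1])


def is_thumbs_up(hand, min_conf=0.3):
    """hand: list of 21 (x, y, confidence) tuples for a single hand."""
    if len(hand) != 21 or any(c < min_conf for _, _, c in hand):
        return False
    wrist, thumb_tip = hand[WRIST], hand[THUMB_TIP]
    # Thumb tip must be the highest point of the hand (smallest y in image coordinates).
    thumb_highest = all(thumb_tip[1] < p[1] for i, p in enumerate(hand) if i != THUMB_TIP)
    # A curled finger has its tip closer to the wrist than its middle joint.
    curled = all(distance(hand[tip], wrist) < distance(hand[mid], wrist) for mid, tip in FINGERS)
    return thumb_highest and curled


def make_fist(thumb_dy):
    """Synthetic fist for the demo; thumb_dy < 0 points the thumb up, > 0 points it down."""
    hand = [(100.0, 200.0, 1.0)]  # wrist
    hand += [(100.0, 200.0 + thumb_dy * k, 1.0) for k in range(1, 5)]  # thumb joints
    for f in range(4):  # four curled fingers, tips folded back toward the palm
        y = 170.0 + 10 * f
        hand += [(120.0, y, 1.0), (150.0, y, 1.0), (150.0, y + 10, 1.0), (125.0, y + 10, 1.0)]
    return hand


print("Hooray!" if is_thumbs_up(make_fist(-20.0)) else "No thumbs-up detected")
```

A real assistant would need something sturdier, such as a classifier trained on many hands and rotations, but the rule shows how little is needed once the keypoints are available.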

More information on the OpenPose project:
https://github.com/CMU-Perceptual-Computing-Lab/openpose
