Simplest Static Hand Gesture Recognition using OpenCV Python

In this article we are going to implement one of the simplest methods described in the research literature for static hand gesture recognition, using OpenCV and Python. We will focus on the simplicity and authenticity of the algorithm and on the basic usage of static hand gestures. First of all, we have to know what gesture recognition means in computer vision. Gesture recognition is a technique for identifying a user's gestures so that they can serve as input to a computer, much like an input device. In our case, we will use static hand poses as the gestures. The system built here will not act as a full input device; to keep things simple, it will just output the recognized result for the detected hand gesture.

Hand Gesture Recognition Methodology

Let’s discuss the overall methodology we are going to use in this article. We will be creating a real-time hand gesture recognition system using OpenCV and Python. We will use Python 3.11 and the latest OpenCV 4 release for the computer vision tasks.

The methodology is histogram based: it segments the hand from the background, isolates the detected hand, and then applies contour analysis to it. After that, the biggest blob is extracted, which is assumed to be the hand. We will apply the convex hull to that blob and detect the number of fingers present in the hand. Once we have the number of fingers, we will use a simple rule-based system to classify the static hand gesture.
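The final rule-based step can be sketched in a few lines. This is only an illustration: the label names and the `classify_gesture` helper are our own assumptions, not part of any library, and you would adapt the table to the gestures you care about.

```python
# Hypothetical lookup table: number of detected fingers -> gesture label.
# The labels are placeholders chosen for this sketch.
GESTURE_LABELS = {
    0: "fist",
    1: "one",
    2: "two",
    3: "three",
    4: "four",
    5: "open palm",
}


def classify_gesture(finger_count: int) -> str:
    """Map a finger count to a gesture label, or 'unknown' if out of range."""
    return GESTURE_LABELS.get(finger_count, "unknown")


print(classify_gesture(2))
print(classify_gesture(7))
```

A plain dictionary keeps the rules easy to read and extend, which fits the goal of keeping the whole pipeline as simple as possible.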

We have created some solid background projects with OpenCV Python that are similar in places to today’s hand gesture recognition project. You can check them out at the following links:

Real Time MOG Background Subtractor OpenCV Python

Background Subtraction Article using MOG background subtracting

Real Time Contours Detection findcontours OpenCV Python

Contour Detection in OpenCV Python

Real-time Hand Detection with OpenCV

We will read the live feed from the video camera to make this a real-time hand gesture detector. Once each frame is grabbed, we will use preprocessing techniques to detect the hand. Normally these techniques convert the basic BGR image to grayscale or HSV (Hue, Saturation, Value), apply some Gaussian blur to suppress salt-and-pepper noise, and then use masking to separate the skin color of the hand from the rest of the image. In the code below, the skin mask is computed first in HSV space, and the blur and threshold are then applied to that mask. Here is the basic code for the above-mentioned steps.

import cv2
import numpy as np

# Capture video from webcam
cap = cv2.VideoCapture(0)

while True:
    ret, frame = cap.read()
    if not ret:
        # Stop if the camera did not return a frame
        break

    hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
    
    # Define skin color range and create mask
    lower_skin = np.array([0, 20, 70], dtype=np.uint8)
    upper_skin = np.array([20, 255, 255], dtype=np.uint8)
    mask = cv2.inRange(hsv, lower_skin, upper_skin)

    # Blur the mask and apply thresholding
    mask = cv2.GaussianBlur(mask, (5, 5), 0)
    _, binary = cv2.threshold(mask, 127, 255, cv2.THRESH_BINARY)

    # Show the processed image
    cv2.imshow('Hand Detection', binary)

    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

cap.release()
cv2.destroyAllWindows()


Morphological Operations

Now we need to apply some morphological operations to remove the remaining small binary objects in the image. We also need to fill the holes and repair the broken parts of the hand region. We can use the cv2.morphologyEx method for this purpose, which also needs a kernel (structuring element). Note that the closing operation used below fills holes and bridges gaps; removing small isolated specks would additionally call for an opening (cv2.MORPH_OPEN). We can do this with the following extension of our previous code.

import cv2
import numpy as np

# Capture video from webcam (index 1 selects a second camera; use 0 for the default one)
cap = cv2.VideoCapture(1)

# Define a structuring element for morphological operations
kernel = np.ones((5, 5), np.uint8)

while True:
    ret, frame = cap.read()
    if not ret:
        break
    
    hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
    
    # Define skin color range and create mask
    lower_skin = np.array([0, 20, 70], dtype=np.uint8)
    upper_skin = np.array([20, 255, 255], dtype=np.uint8)
    mask = cv2.inRange(hsv, lower_skin, upper_skin)

    # Blur the mask and apply thresholding
    mask = cv2.GaussianBlur(mask, (5, 5), 0)
    _, binary = cv2.threshold(mask, 127, 255, cv2.THRESH_BINARY)

    # Morphological operations
    # Closing (dilation followed by erosion) bridges small gaps in the hand region
    closing = cv2.morphologyEx(binary, cv2.MORPH_CLOSE, kernel)

    # Second closing pass to fill remaining small holes inside the hand mask
    # (with the same kernel this changes little, since closing is idempotent)
    hole_filled = cv2.morphologyEx(closing, cv2.MORPH_CLOSE, kernel)

    # Show the processed image with hole filling and closing
    cv2.imshow('Hand Detection - Area Closing and Hole Filling', hole_filled)

    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

cap.release()
cv2.destroyAllWindows()


Hand Image after Morphological Operations

Hand Segmentation with YCbCr Color Space

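For human skin color segmentation, I found the YCbCr color space more accurate for Asian skin tones. Here is the code that converts the BGR image into the YCbCr color space and then applies the morphological operations to the resulting mask. Note that OpenCV's conversion flag is cv2.COLOR_BGR2YCrCb, so the channels come out in the order Y, Cr, Cb, and the threshold values below follow that order. Pressing s saves the current frame to disk.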

import cv2
import numpy as np

# Capture video from webcam
cap = cv2.VideoCapture(1)

# Define a structuring element for morphological operations
kernel = np.ones((5, 5), np.uint8)

# Counter to save multiple images with different names
image_counter = 0

while True:
    ret, frame = cap.read()
    if not ret:
        break
    
    # Convert the frame to YCbCr color space
    ycbcr = cv2.cvtColor(frame, cv2.COLOR_BGR2YCrCb)
    
    # Define skin color range and create mask (OpenCV channel order is Y, Cr, Cb)
    lower_skin = np.array([0, 133, 77], dtype=np.uint8)  # Lower bounds for Y, Cr, Cb
    upper_skin = np.array([255, 173, 127], dtype=np.uint8)  # Upper bounds for Y, Cr, Cb
    mask = cv2.inRange(ycbcr, lower_skin, upper_skin)

    # Blur the mask and apply thresholding
    mask = cv2.GaussianBlur(mask, (5, 5), 0)
    _, binary = cv2.threshold(mask, 127, 255, cv2.THRESH_BINARY)

    # Morphological operations
    # Closing (dilation followed by erosion) bridges small gaps in the hand region
    closing = cv2.morphologyEx(binary, cv2.MORPH_CLOSE, kernel)

    # Second closing pass to fill remaining small holes inside the hand mask
    hole_filled = cv2.morphologyEx(closing, cv2.MORPH_CLOSE, kernel)

    # Show the processed image with hole filling and closing
    cv2.imshow('Hand Detection - YCbCr Color Space', hole_filled)

    key = cv2.waitKey(1) & 0xFF

    # If 'q' is pressed, exit the loop
    if key == ord('q'):
        break
    
    # If 's' is pressed, save the current frame
    if key == ord('s'):
        image_name = f"hand_image_{image_counter}.png"
        cv2.imwrite(image_name, frame)
        print(f"Image saved as {image_name}")
        image_counter += 1

# Release the video capture and close windows
cap.release()
cv2.destroyAllWindows()


Contour Analysis using OpenCV Python

Now we will apply simple contour analysis to the binary image to pick out our ROI, which is the hand in this case. We need a single contour of interest so that we can later apply the convex hull operation to it and build the recognition mask used to detect hand gestures. Here is the code that applies contour analysis and gives us the object with the biggest area in the hole-filled binary image.

import cv2
import numpy as np

# Capture video from webcam
cap = cv2.VideoCapture(1)

# Define a structuring element for morphological operations
kernel = np.ones((5, 5), np.uint8)

# Counter to save multiple images with different names
image_counter = 0

while True:
    ret, frame = cap.read()
    if not ret:
        break
    
    # Convert the frame to YCbCr color space
    ycbcr = cv2.cvtColor(frame, cv2.COLOR_BGR2YCrCb)
    
    # Define skin color range and create mask (OpenCV channel order is Y, Cr, Cb)
    lower_skin = np.array([0, 133, 77], dtype=np.uint8)  # Lower bounds for Y, Cr, Cb
    upper_skin = np.array([255, 173, 127], dtype=np.uint8)  # Upper bounds for Y, Cr, Cb
    mask = cv2.inRange(ycbcr, lower_skin, upper_skin)

    # Blur the mask and apply thresholding
    mask = cv2.GaussianBlur(mask, (5, 5), 0)
    _, binary = cv2.threshold(mask, 127, 255, cv2.THRESH_BINARY)

    # Morphological operations
    # Closing (dilation followed by erosion) bridges small gaps in the hand region
    closing = cv2.morphologyEx(binary, cv2.MORPH_CLOSE, kernel)

    # Second closing pass to fill remaining small holes inside the hand mask
    hole_filled = cv2.morphologyEx(closing, cv2.MORPH_CLOSE, kernel)

    # Find contours in the binary image
    contours, _ = cv2.findContours(hole_filled, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)

    # Check if any contours were found
    if contours:
        # Find the largest contour by area
        largest_contour = max(contours, key=cv2.contourArea)

        # Draw the largest contour on the original frame
        cv2.drawContours(frame, [largest_contour], -1, (0, 255, 0), 3)

        # Optionally, you can draw a bounding rectangle around the largest contour
        x, y, w, h = cv2.boundingRect(largest_contour)
        cv2.rectangle(frame, (x, y), (x + w, y + h), (255, 0, 0), 2)

    # Show the original frame with the largest contour highlighted
    cv2.imshow('Hand Detection - Largest Contour', frame)

    key = cv2.waitKey(1) & 0xFF

    # If 'q' is pressed, exit the loop
    if key == ord('q'):
        break
    
    # If 's' is pressed, save the current frame
    if key == ord('s'):
        image_name = f"hand_image_{image_counter}.png"
        cv2.imwrite(image_name, frame)
        print(f"Image saved as {image_name}")
        image_counter += 1

# Release the video capture and close windows
cap.release()
cv2.destroyAllWindows()

Hand image segmentation with OpenCV Python contour analysis

What is Convex Hull?

Let’s first try to understand what a convex hull is. The convex hull is the smallest convex shape that completely contains a set of points, such as the outline of a hand in a picture.

This makes it a good fit for a gesture recognition task, because it forms the outermost boundary of the hand without following the gaps between the fingers. Those gaps, the places where the hand contour dips away from the hull, are what we can later use to count fingers.

Detecting Convex Hull from Binary Image in OpenCV

We will first separate the hand from the background using OpenCV, then find its contour and compute its convex hull. From there, we can identify gestures such as the number of fingers raised, or particular hand poses, by examining the contour of the hand.

Other Common Methodologies:

Hand gesture recognition is a crucial aspect of human-computer interaction, enabling intuitive and natural communication with machines. Various methodologies have been developed to recognize static hand gestures using OpenCV and Python. Here are the most commonly used approaches:

Histogram-Based Segmentation and Contour Detection:

This is one of the simplest approaches: the hand is segmented with histogram-based segmentation techniques, and the ROI is then found with contour detection. Convex hulls are then used to locate finger and palm positions, and the resulting hull is used to count the number of fingers present in the hand, which is the final output of this simplest hand gesture recognition approach. [1]

Haar-Cascade Classifier:

Haar-cascade classifiers are used to detect the presence of the hand in a frame before further image processing, narrowing the work to the region of interest (ROI). The rest of the recognition method can be the same as above, using contours and the convex hull, or any other ML-based recognizer. [2]

Hand Skeleton Recognition with MediaPipe:

MediaPipe is a very popular framework in Python for detecting hand key points, which can later be fed to a gesture recognizer to detect the final gesture. The library can be installed with this simple command: pip install mediapipe. This is what the official documentation says:

MediaPipe Solutions provides a suite of libraries and tools for you to quickly apply artificial intelligence (AI) and machine learning (ML) techniques in your applications. You can plug these solutions into your applications immediately, customize them to your needs, and use them across multiple development platforms. MediaPipe Solutions is part of the MediaPipe open source project, so you can further customize the solutions code to meet your application needs.

In [3], the MediaPipe library is used for hand skeleton recognition, enabling functionalities like virtual keyboards and volume control.
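Once you have the 21 hand landmarks that MediaPipe's hand model produces, a very simple gesture rule can be written without any further library calls. The sketch below is our own illustration, not taken from the cited paper: it takes the landmarks as plain (x, y) tuples in normalized image coordinates, assumes an upright hand, ignores the thumb, and treats a finger as raised when its tip is above its middle (PIP) joint.

```python
# Landmark indices in MediaPipe's 21-point hand model
FINGER_TIPS = (8, 12, 16, 20)   # index, middle, ring, pinky fingertips
FINGER_PIPS = (6, 10, 14, 18)   # the matching PIP joints


def count_raised_fingers(landmarks):
    """Count raised fingers (thumb excluded) from 21 (x, y) landmarks.

    Image y grows downward, so a raised fingertip has a smaller y
    than its PIP joint. Assumes the hand is upright in the image.
    """
    if len(landmarks) != 21:
        raise ValueError("expected 21 hand landmarks")
    return sum(
        1
        for tip, pip in zip(FINGER_TIPS, FINGER_PIPS)
        if landmarks[tip][1] < landmarks[pip][1]
    )


# Synthetic landmarks: start with every finger curled (tip below its PIP joint)
landmarks = [(0.5, 0.5)] * 21
for tip in FINGER_TIPS:
    landmarks[tip] = (0.5, 0.6)

# Raise the index and middle fingers
landmarks[8] = (0.45, 0.2)
landmarks[12] = (0.5, 0.15)

print("raised fingers:", count_raised_fingers(landmarks))
```

In a real application the tuples would be filled from MediaPipe's output for each frame instead of the synthetic values used here.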

AI Based Gesture Recognizers

There are other hand gesture recognition techniques based on ANNs, machine learning, and deep learning that are popular in the literature. We have summarized a few of them below [4].

Artificial Neural Networks (ANNs):

ANNs are used for classification, leveraging background learning algorithms to improve recognition accuracy.

Machine Learning and Computer Vision Techniques:

Machine learning techniques combined with computer vision methods are used to distinguish hand gestures from video streams.

Platform-Independent Applications:

Applications using OpenCV and PyTorch libraries are developed for recognizing gestures in still images and video sequences.

FAQ:

What is hand gesture recognition using OpenCV and Python?

Hand gesture recognition using OpenCV and Python involves using computer vision techniques to detect and interpret hand gestures through a camera. OpenCV provides tools for image processing, while Python allows for coding the logic that recognizes gestures in real-time.

How do you set up OpenCV in Python?

To set up OpenCV in Python, install OpenCV using `pip install opencv-python`, then import it into your Python script with `import cv2`. You’ll also need a camera feed for capturing hand movements, which OpenCV will process to detect gestures.

Which algorithms are used for detecting hand gestures in OpenCV?

OpenCV typically uses contour detection and convex hull algorithms to detect the shape of the hand. These algorithms identify the hand’s outline and convex points to recognize gestures like raising fingers or forming a fist.

Can I implement hand gesture recognition in real-time using OpenCV?

Yes, you can implement hand gesture recognition in real-time with OpenCV by processing each video frame from the camera feed, applying image filters, and using gesture recognition logic to provide real-time feedback.

What are the practical applications of hand gesture recognition?

Hand Gestures Recognition Application Areas

Hand gesture recognition has many practical applications, including touchless control systems, virtual reality, gaming, sign language recognition, and assistive technologies for people with disabilities.

Bibliography

[1] Harini, V., Prahelika, V., Sneka, I., & Ebenezer, P. (2018). Hand Gesture Recognition Using OpenCV and Python. New Trends in Computational Vision and Bio-inspired Computing. https://doi.org/10.1007/978-3-030-41862-5_174
[2] Ismail, A., Aziz, F., Kasim, N., & Daud, K. (2021). Hand gesture recognition on python and OpenCV. IOP Conference Series: Materials Science and Engineering, 1045. https://doi.org/10.1088/1757-899X/1045/1/012043
[3] Patel, S., & Deepa, R. (2023). Hand Gesture Recognition Used for Functioning System Using OpenCV. Advances in Science and Technology, 124, 3-10. https://doi.org/10.4028/p-4589o3
[4] Deepika, M., Choudhary, S., Kumar, S., & Srinivas, K. (2023). Machine Learning-Based Approach for Hand Gesture Recognition. 2023 International Conference on Disruptive Technologies (ICDT), 264-268. https://doi.org/10.1109/ICDT57929.2023.10150843