Technical Deep Dive: Multi-Method Face and Person Detection in Python

In this technical post, we’ll dissect a Python script integrating several libraries and techniques for detecting faces and people in video footage. This script is an excellent example of how diverse computer vision tools can be merged to produce a robust solution for image analysis.

# import the necessary packages
import numpy as np
import cv2
import sys
import os
from datetime import datetime
import face_recognition
import dlib

inputVideo = sys.argv[1]
basenameVideo = os.path.basename(inputVideo)
outputDirectory = sys.argv[2]
datetimeNow = datetime.now().strftime("%m-%d-%Y %H:%M:%S")

# Creating the folder to save the output
videoOutputDirectory = outputDirectory + '/' + datetimeNow + '/' + basenameVideo + '/'
os.makedirs(videoOutputDirectory)

##METHOD 1 -- START
# initialize the HOG descriptor/person detector
hog = cv2.HOGDescriptor()
hog.setSVMDetector(cv2.HOGDescriptor_getDefaultPeopleDetector())
##METHOD 1 -- STOP

##METHOD 2 -- START
faceCascade = cv2.CascadeClassifier(cv2.data.haarcascades + "haarcascade_frontalface_default.xml");
##METHOD 2 -- STOP

##METHOD 5 -- START
# Initialize dlib's frontal face detector
faceDetector = dlib.get_frontal_face_detector()
##METHOD 5 -- STOP

cv2.startWindowThread()

## open webcam video stream
#cap = cv2.VideoCapture(0)
# create a VideoCapture object
cap = cv2.VideoCapture(inputVideo)

frameIndex = 0

while True:
	# Capture frame-by-frame
	ret, frame = cap.read()

	# Stop once the video runs out of frames
	if not ret:
		break

	# Using a grayscale picture, also for faster detection; OpenCV reads frames in BGR
	gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

##METHOD 1 -- START
	if True:
		# detect people in the image
		persons, weights = hog.detectMultiScale(frame, winStride=(8, 8))

		persons = np.array([[x, y, x + w, y + h] for (x, y, w, h) in persons])
		print("[INFO][1][{0}] Found {1} Persons.".format(frameIndex, len(persons)))

		for (left, top, right, bottom) in persons:
			print("A person is located at pixel location Top: {}, Left: {}, Bottom: {}, Right: {}".format(top, left, bottom, right))
			match_image = frame[top:bottom, left:right]
			cv2.imwrite(videoOutputDirectory + str(frameIndex) + '_(' + str(top) + ',' + str(right) + ')(' + str(bottom) + ',' + str(left) + ')_persons_M1.jpg', match_image)
##METHOD 1 -- STOP

##METHOD 2 -- START
	if True:
		faces = faceCascade.detectMultiScale(
			gray,
			scaleFactor=1.05,
			minNeighbors=7,
			minSize=(50, 50)
		)

		faces = np.array([[x, y, x + w, y + h] for (x, y, w, h) in faces])
		print("[INFO][2][{0}] Found {1} Faces.".format(frameIndex, len(faces)))

		for (left, top, right, bottom) in faces:
			print("A face is located at pixel location Top: {}, Left: {}, Bottom: {}, Right: {}".format(top, left, bottom, right))
			match_image = frame[top:bottom, left:right]
			cv2.imwrite(videoOutputDirectory + str(frameIndex) + '_(' + str(top) + ',' + str(right) + ')(' + str(bottom) + ',' + str(left) + ')_faces_M2.jpg', match_image)
##METHOD 2 -- STOP

##METHOD 3 -- START
	if True:
		# face_recognition expects RGB images, while OpenCV frames are BGR
		rgb_frame = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
		faces = face_recognition.face_locations(rgb_frame)
		print("[INFO][3][{0}] Found {1} Faces.".format(frameIndex, len(faces)))

		for (top, right, bottom, left) in faces:
			print("A face is located at pixel location Top: {}, Left: {}, Bottom: {}, Right: {}".format(top, left, bottom, right))
			match_image = frame[top:bottom, left:right]
			cv2.imwrite(videoOutputDirectory + str(frameIndex) + '_(' + str(top) + ',' + str(right) + ')(' + str(bottom) + ',' + str(left) + ')_faces_M3.jpg', match_image)
##METHOD 3 -- STOP

##METHOD 4 -- START
	if True:
		# As above, convert to RGB before handing the frame to face_recognition
		rgb_frame = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
		faces = face_recognition.face_locations(rgb_frame, model="cnn")
		print("[INFO][4][{0}] Found {1} Faces.".format(frameIndex, len(faces)))

		for (top, right, bottom, left) in faces:
			print("A face is located at pixel location Top: {}, Left: {}, Bottom: {}, Right: {}".format(top, left, bottom, right))
			match_image = frame[top:bottom, left:right]
			cv2.imwrite(videoOutputDirectory + str(frameIndex) + '_(' + str(top) + ',' + str(right) + ')(' + str(bottom) + ',' + str(left) + ')_faces_M4.jpg', match_image)
##METHOD 4 -- STOP

##METHOD 5 -- START
	if True:
		# detect faces in image
		faces = faceDetector(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))

		print("[INFO][5][{0}] Found {1} Faces.".format(frameIndex, len(faces)));
		# Now process each face we found
		for k, face in enumerate(faces):
			top = face.top()
			bottom = face.bottom()
			left = face.left()
			right = face.right()
			print("A face is located at pixel location Top: {}, Left: {}, Bottom: {}, Right: {}".format(top, left, bottom, right))
			match_image = frame[top:bottom, left:right];
			cv2.imwrite(videoOutputDirectory + str(frameIndex) + '_(' + str(top) + ',' + str(right) + ')(' + str(bottom) + ',' + str(left) + ')_faces_M5.jpg', match_image);
##METHOD 5 -- STOP
	
	frameIndex += 1

# When everything is done, release the capture
cap.release()
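
To run the script, pass the input video and an output directory as command-line arguments. Assuming it is saved as multi_detect.py (a hypothetical file name):

python3 multi_detect.py ./input.mp4 ./output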

Core Libraries and Initial Setup

The script begins by importing several critical libraries:

  • numpy: Essential for numerical computations in Python.
  • cv2 (OpenCV): A cornerstone in computer vision projects.
  • sys and os: For system-level operations and file management.
  • datetime: To handle date and time operations, crucial for timestamping.
  • face_recognition: A high-level facial recognition library.
  • dlib: A toolkit renowned for its machine learning and image processing capabilities.

Video File Handling

The script processes a video file whose path is passed as a command-line argument. It extracts the file name and prepares a unique output directory using the current date and time. This approach ensures that outputs from different runs are stored separately, avoiding overwrites and confusion.
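
As an illustration (the values are hypothetical), a run started on 01-31-2024 at 13:45:10 against /tmp/videos/demo.mp4 with output directory ./out would place all cropped detections under:

./out/01-31-2024 13:45:10/demo.mp4/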

Methodological Overview

The script showcases five distinct methodologies for detecting faces and people:

  1. HOG Person Detector with OpenCV: Uses the Histogram of Oriented Gradients (HOG) descriptor combined with a Support Vector Machine (SVM) for detecting people.
  2. Haar Cascade for Face Detection: Employs OpenCV’s Haar Cascade classifier, a widely-used method for face detection.
  3. Face Detection Using face_recognition (HOG model): Implements the face_recognition library’s default, HOG-based face detection.
  4. CNN-Based Face Detection Using face_recognition: Utilizes the Convolutional Neural Network (CNN) model offered by the face_recognition library.
  5. Dlib’s Frontal Face Detector: Applies Dlib’s frontal face detector, effective for detecting faces oriented towards the camera.
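
One practical detail when mixing these libraries is that each reports bounding boxes in a different convention, which the script handles per method. A minimal normalization sketch (the helper functions are ours, not part of the script) that maps everything to (top, right, bottom, left):

def to_trbl_from_xywh(x, y, w, h):
    # OpenCV's HOG detector and Haar cascades return (x, y, width, height)
    return (y, x + w, y + h, x)

def to_trbl_from_rect(rect):
    # dlib returns rectangle objects with accessor methods
    return (rect.top(), rect.right(), rect.bottom(), rect.left())

# face_recognition.face_locations() already returns (top, right, bottom, left) tuples.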

Processing Workflow

The script processes the video on a frame-by-frame basis. For each frame, it:

  • Converts the frame to grayscale when necessary. This conversion can speed up detection in methods that don’t require color information.
  • Sequentially applies each of the five detection methods.
  • For each detected face or person, it outputs the coordinates and saves a cropped image of the detection to the output directory.
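
Because every method ends with the same crop-and-save step, that step is a natural candidate for a small helper. A possible refactoring sketch (the helper is our own; it clamps coordinates to the frame, since dlib in particular can report boxes that extend past the image edges):

def save_detection(output_dir, frame, frame_index, top, right, bottom, left, label):
	# Clamp coordinates so the slice stays inside the frame
	top, left = max(top, 0), max(left, 0)
	bottom = min(bottom, frame.shape[0])
	right = min(right, frame.shape[1])
	match_image = frame[top:bottom, left:right]
	name = '{}_({},{})({},{})_{}.jpg'.format(frame_index, top, right, bottom, left, label)
	cv2.imwrite(output_dir + name, match_image)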

Iterative Frame Analysis

The script employs a loop to process each frame of the video. It includes a frame index to keep track of the number of frames processed, which is particularly useful for debugging and analysis purposes.

Resource Management

After processing the entire video, the script releases the video capture object, ensuring that system resources are appropriately freed.
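
As a small hardening sketch (our addition, not in the original script), a try/finally block guarantees the capture is released even if a frame raises an exception, and destroyAllWindows() pairs with the startWindowThread() call made earlier:

cap = cv2.VideoCapture(inputVideo)
try:
	while True:
		ret, frame = cap.read()
		if not ret:
			break
		# ... per-frame detection work goes here ...
finally:
	cap.release()
	cv2.destroyAllWindows()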

Key Takeaways

This script is a rich demonstration of integrating various face and person detection techniques in a single Python application. It highlights the versatility and power of Python in handling complex tasks like video processing and computer vision. This analysis serves as a guide for developers and enthusiasts looking to understand or venture into the realm of image processing with Python.


Bug Fix: Resolving TypeError in Face Descriptor Computation

In the realm of facial recognition software development, it’s not uncommon to encounter a TypeError when integrating different libraries. We recently addressed an issue with the following error message:

TypeError: compute_face_descriptor(): incompatible function arguments.

This error resulted from a misalignment in the expected color format of image data between OpenCV and the face_recognition library. Here’s how we resolved it.

The Problem

OpenCV, a powerful library for image processing, represents images in BGR (Blue, Green, Red) color space by default. Conversely, the face_recognition library, which excels at recognizing and manipulating faces in images, expects images in RGB (Red, Green, Blue) format. When the image data does not match the expected color format, the compute_face_descriptor() function raises a TypeError.

Our Solution

To fix this issue, we modified how we convert the image from OpenCV’s BGR format to the RGB format required by the face_recognition library. Previously, we used a slicing technique to reverse the color channels:

# Incorrect color conversion
rgb_frame = frame[:, :, ::-1]

However, this approach led to the aforementioned TypeError: the reversing slice produces a non-contiguous NumPy view of the frame, which dlib’s compute_face_descriptor() rejects. We realized that using OpenCV’s cvtColor function, which returns a new, contiguous array, provides a more reliable conversion:

# Correct color conversion
rgb_frame = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)

The cvtColor function is explicitly designed for color space conversions, ensuring the image data is properly formatted for the face_recognition library.
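
Put together, a minimal end-to-end sketch of the fix (the file name is an example):

import cv2
import face_recognition

cap = cv2.VideoCapture("input.mp4")
ret, frame = cap.read()

if ret:
    # Convert OpenCV's BGR frame into a new, contiguous RGB array
    rgb_frame = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)

    # face_encodings() calls compute_face_descriptor() internally; with the
    # conversion above it no longer raises the TypeError
    face_locations = face_recognition.face_locations(rgb_frame)
    face_encodings = face_recognition.face_encodings(rgb_frame, face_locations)

cap.release()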

Conclusion

With this small but crucial change, we eliminated the TypeError and improved the robustness of our facial recognition pipeline. This example serves as a reminder that understanding the intricacies of our libraries is essential for creating seamless and error-free integrations. For developers facing similar issues: consider the specific requirements of each function and library you’re using. It’s these details that can make or break your application.


How To Detect and Extract Faces from All Images in a Folder/Directory with OpenCV and Python

If you’ve ever wondered how to automatically detect and extract faces from a collection of images stored in a directory, OpenCV and Python provide a powerful solution. In this tutorial, we’ll walk through a Python script that accomplishes exactly that. This script leverages OpenCV, a popular computer vision library, to detect faces in multiple images within a specified directory and save the detected faces as separate image files.

Prerequisites

Before we dive into the code, make sure you have the following prerequisites:

  • Python installed on your system.
  • OpenCV (cv2) installed. You can install it using pip install opencv-python.
    Alternatively, write the package name in a text file (e.g. requirements.txt) and execute pip install -r requirements.txt.
  • A directory containing the images from which you want to extract faces.

The Python Script

Here’s the Python code for the task:

import cv2
import sys
import os

# Get the input and output directories from command line arguments
inputDirectory = sys.argv[1]
outputDirectory = sys.argv[2]

# Load the face detection cascade classifier once, outside the loop
faceCascade = cv2.CascadeClassifier(cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

# Iterate through the files in the input directory
for filename in os.listdir(inputDirectory):
    path = os.path.join(inputDirectory, filename)
    print("[INFO] Processing: " + path)

    # Read the image and convert it to grayscale
    image = cv2.imread(path)
    if image is None:
        # Skip files that OpenCV cannot read as images
        print("[WARN] Could not read file, skipping.")
        continue
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

    # Detect faces in the grayscale image
    faces = faceCascade.detectMultiScale(
        gray,
        scaleFactor=1.3,
        minNeighbors=3,
        minSize=(30, 30)
    )

    # Print the number of faces found
    print("[INFO] Found {0} Faces.".format(len(faces)))

    # Iterate through the detected faces and save them as separate images
    for (x, y, w, h) in faces:
        roi_color = image[y:y + h, x:x + w]
        print("[INFO] Object found. Saving locally.")
        cv2.imwrite(os.path.join(outputDirectory, filename + '_(' + str(x) + ',' + str(y) + ')[' + str(w) + ',' + str(h) + ']_faces.jpg'), roi_color)

Understanding the Code

Now, let’s break down the code step by step:

  1. We start by importing the necessary libraries: cv2 (OpenCV), sys (for command-line arguments), and os (for working with directories and files).
  2. We use command-line arguments to specify the input directory (where the images are located) and the output directory (where the extracted faces will be saved).
  3. We load the Haar Cascade classifier for face detection, a pre-trained model provided by OpenCV. It is loaded once, before the loop, so the model file is not re-read for every image.
  4. The script then iterates through the files in the input directory, reading each image, skipping any file that OpenCV cannot decode, and converting the rest to grayscale.
  5. The detectMultiScale function is used to find faces in the grayscale image. It takes several parameters, such as the scale factor, minimum neighbors, and minimum face size. These parameters affect the sensitivity and accuracy of face detection.
  6. The script then prints the number of faces found in each image.
  7. Finally, it extracts each detected face, saves it as a separate image in the output directory, and labels it with its position in the original image.
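
Assuming the script is saved as extract_faces.py (a name we chose for this example) and that the output directory already exists, a typical invocation looks like this:

python3 extract_faces.py ./photos/ ./faces/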

Conclusion

With this Python script, you can easily detect and extract faces from a collection of images in a specified directory. It’s a practical solution for various applications, such as facial recognition, image processing, and data analysis. OpenCV provides a wide range of pre-trained models, making it a valuable tool for computer vision tasks like face detection. Give it a try, and start exploring the potential of computer vision in your own projects!


Notes on how to get HarpyTM running on an Ubuntu 20.04 LTS GNU/Linux

Easy Setup – Slow Execution by using the CPUs only

Getting CUDA support in OpenCV for Python can be tricky as you will need to make several changes to your PC. On many occasions, it can fail if you are not comfortable troubleshooting the procedure. For this reason, we are presenting an alternative solution that is easy to implement and should work on most machines. The problem with this solution is that it will ignore your GPU and utilize your CPUs only. This means that executions will most probably be slower than the GPU-enabled one.
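
To confirm which path you are on, you can ask OpenCV how many CUDA devices it can see; with the plain pip/conda packages used below this should print 0 (CPU only):

python3 -c "import cv2; print(cv2.cuda.getCudaEnabledDeviceCount())"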

Conda / Anaconda

First of all, we installed and activated Anaconda on an Ubuntu 20.04 LTS desktop. To do so, we installed the following dependencies from the repositories:

sudo apt-get install libgl1-mesa-glx libegl1-mesa libxrandr2 libxss1 libxcursor1 libxcomposite1 libasound2 libxi6 libxtst6;

Then, we downloaded the 64-Bit (x86) Installer from (https://www.anaconda.com/products/individual#linux).

Using a terminal, we followed the instructions here (https://docs.anaconda.com/anaconda/install/linux/) and performed the installation.

Python environment and OpenCV for Python

Following the previous step, we used the commands below to create a virtual environment for our code. We needed Python version 3.9 (as highlighted here https://www.anaconda.com/products/individual#linux) and OpenCV for Python.

source ~/anaconda3/bin/activate;
conda create --yes --name HarpyTM python=3.9;
conda activate HarpyTM;
pip install numpy scipy scikit-learn opencv-python==4.5.1.48;

Please note that we needed to limit the package for opencv-python to 4.5.1.48 because we were getting the following error on newer versions:

python3 Monitoring.py 
./data/DJI_0406_cut.MP4
[INFO] setting preferable backend and target to CUDA...
Traceback (most recent call last):
  File "/home/bob/Traffic/HarpyTM/Monitoring.py", line 51, in <module>
    detectNet = detector(weights, config, conf_thresh=conf_thresh, netsize=cfg_size, nms_thresh=nms_thresh, gpu=use_gpu, classes_file=classes_file)
  File "/home/bob/Traffic/HarpyTM/src/detector.py", line 24, in __init__
    self.layers = [ln[i[0] - 1] for i in self.net.getUnconnectedOutLayers()]
  File "/home/bob/Traffic/HarpyTM/src/detector.py", line 24, in <listcomp>
    self.layers = [ln[i[0] - 1] for i in self.net.getUnconnectedOutLayers()]
IndexError: invalid index to scalar variable.
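
The root cause is that newer OpenCV releases changed getUnconnectedOutLayers() to return plain indices instead of one-element arrays, so the i[0] indexing fails. If you would rather run a newer opencv-python than pin the version, a commonly suggested adjustment to the detector code (our suggestion, not part of HarpyTM) is:

# src/detector.py -- flatten() tolerates both the old and the new return shape
self.layers = [ln[i - 1] for i in self.net.getUnconnectedOutLayers().flatten()]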

Usage

git clone https://github.com/rafcy/HarpyTM;
cd HarpyTM/;
# We manually created the following folders and set their privileges, as we were getting errors when they were auto-generated by the code of HarpyTM.
mkdir -p results/Detections/videos;
chmod 777 results/Detections/videos;

After successfully cloning the project, we modified the config.ini to meet our needs. Specifically, we changed all paths to be inside the HarpyTM folder and used the full resolution for the test video:

[Video]
video_filename 		= ./data/DJI_0406_cut.MP4
resize 			= True
image_width 		= 1920
image_height 		= 1080
display_video		= False
display_width 		= 1920
display_height 		= 1080


[Export]
save_video_results	= True
video_export_path 	= ./results/Detections/videos/
csv_path 		= ./results/
export			= True
display_track		= True

[Detector]
#darknet
darknet_config 		= ./data/Configs/vehicles_ty3.cfg
darknet_weights 	= ./data/Configs/vehicles_ty3.weights
classes_file 		= ./data/Configs/vehicles.names
cfg_size_width		= 608
cfg_size_height		= 608
iou_thresh 		= 0.3
conf_thresh 		= 0.3
nms_thresh 		= 0.4
use_gpu 		= True 

[Tracker]
flight_height 		= 150
sensor_height 		= 0.455
sensor_width		= 0.617
focal_length		= 0.567
reset_boxes_frames	= 40
calc_velocity_n		= 25
draw_tracks		= True
export_data		= True

Finally, we were able to execute the code as follows:

python3 Monitoring.py;

After the execution was done, we found the video showing the bounding boxes and tracking information in the folder ./results/Detections/videos/.