
Batching on .extract_faces to improve performance and utilize GPU in full #1435

Open · wants to merge 40 commits into master

Conversation

@galthran-wq (Contributor)

Tickets

#1433
#1101
#1434

What has been done

With this PR, .extract_faces can accept a list of images.
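
A minimal usage sketch of the batched call (file names are placeholders; the detector name is taken from the benchmarks below; the list input is what this PR adds):

```python
from deepface import DeepFace

batch = ["img1.jpg", "img2.jpg", "img3.jpg"]
results = DeepFace.extract_faces(img_path=batch, detector_backend="yolov11n")

# On batched input, extract_faces returns a list of lists:
# one inner list of face dicts per input image.
assert len(results) == len(batch)
```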

How to test

make lint && make test

Benchmarking on detecting 50 faces:
[benchmark chart omitted; results summarized below]

For yolov11n, batch size 20 is 59.27% faster than batch size 1.
For yolov11s, batch size 20 is 29.00% faster than batch size 1.
For yolov11m, batch size 20 is 31.73% faster than batch size 1.
For yolov8, batch size 20 is 12.68% faster than batch size 1.
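
Roughly, the comparison above can be reproduced with a timing loop like this (an illustrative sketch, not the author's benchmark script; file names are placeholders):

```python
import time

from deepface import DeepFace

def bench(images, batch_size):
    """Time extract_faces over the same images at a given batch size."""
    start = time.perf_counter()
    for i in range(0, len(images), batch_size):
        chunk = images[i:i + batch_size]
        DeepFace.extract_faces(
            img_path=chunk if batch_size > 1 else chunk[0],
            detector_backend="yolov11n",
            enforce_detection=False,
        )
    return time.perf_counter() - start

images = [f"face_{i}.jpg" for i in range(50)]  # 50 faces, as in the results above
t1, t20 = bench(images, 1), bench(images, 20)
print(f"batch size 20 is {100 * (t1 - t20) / t1:.2f}% faster than batch size 1")
```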

@skyler14

Do you have a branch in your fork that currently combines all the optimizations you've submitted? I'd like to start using them while the approval process is ongoing.

What's been the total speedup you've been able to see?

@galthran-wq (Contributor, Author)

I do. You can check
https://github.com/galthran-wq/deepface/tree/master-enhanced

It combines these two PRs with some other small modifications:

  • .represent uses batched detector inference (in "Batching on .represent to improve performance and utilize GPU in full" #1433, only the embedding step is batched, because batched detection is not yet implemented there)
  • .represent returns a list of lists of dicts if a batch of images is passed. This is necessary to recover which image each resulting face corresponds to. It might be a good idea to include this change in this PR as well; you can check the test in the fork (see the sketch below).
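
An illustrative sketch of that nested return shape, assuming the fork's batched .represent (the model and detector names are just examples):

```python
from deepface import DeepFace

imgs = ["img1.jpg", "img2.jpg"]
results = DeepFace.represent(
    img_path=imgs, model_name="Facenet", detector_backend="yolov11m"
)

# One entry per input image; each entry is a list of dicts, one per face.
for img, faces in zip(imgs, results):
    print(img, "->", len(faces), "faces")
```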

Not all of the detectors (both in this PR and in the fork) currently implement batching. YOLO does, and I've found it to be optimal in terms of performance and inference speed. The only problem is installing both torch and tensorflow with GPU support, but I've managed to do that.

All in all, with the combination of yolov11m and Facenet, both using the GPU, and batch size 100 (the largest I could fit on a 4090), I am seeing around a 15x speed boost, but that is highly dependent on the input images and the GPU (especially its memory size). I've also had a quick peek, and it seems like performance on the CPU is improved as well.

@serengil FYI I would be happy to contribute the aforementioned modifications if we have progress on the PRs.

@serengil (Owner)

I will review this PR this week, I hope.

@serengil (Owner)

Seems this breaks the unit tests. Must be sorted.

@galthran-wq (Contributor, Author)

Should be good now.

@serengil (Owner)

Nope, still failing.

@serengil (Owner)

You implemented OpenCv, Ssd, Yolo, MtCnn and RetinaFace to accept list inputs.

What if I send a list to YuNet, MediaPipe, FastMtCnn, Dlib or CenterFace?

I assume an exception will be thrown, but users should see a meaningful message.
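
Something like the following guard would give that meaningful message (an illustrative sketch, not the PR's actual code; the class and helper names are hypothetical):

```python
import numpy as np

class UnbatchedDetectorMixin:
    """Hypothetical guard for detectors without batch support."""

    def detect_faces(self, img):
        # Fail fast with a clear message instead of a cryptic downstream error.
        if isinstance(img, list) or (isinstance(img, np.ndarray) and img.ndim == 4):
            raise ValueError(
                f"{type(self).__name__} does not support batched inputs yet; "
                "pass a single image or choose a detector with batch support."
            )
        return self._detect_single_image(img)  # hypothetical per-image helper
```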

@serengil (Owner)

@galthran-wq you are pushing too many things; would you please inform me when it is ready?

@galthran-wq (Contributor, Author)

galthran-wq commented Feb 18, 2025

> @galthran-wq you are pushing too many things; would you please inform me when it is ready?

It is ready now; I usually push when it's done.

I've just fixed what you've noted:

  • added pseudo-batching (a for loop) for the other models
  • support for batched numpy array input (dim=4); a sketch follows this list
  • changed extract_faces to return a list of lists on batched input
  • a couple more tests
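
For example, the dim=4 numpy path looks like this (a sketch with arbitrary shapes; random noise contains no faces, hence enforce_detection=False):

```python
import numpy as np

from deepface import DeepFace

batch = np.random.randint(0, 255, size=(4, 224, 224, 3), dtype=np.uint8)  # (N, H, W, C)
results = DeepFace.extract_faces(
    img_path=batch, detector_backend="retinaface", enforce_detection=False
)
assert len(results) == 4  # one list of face dicts per image in the batch
```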

@serengil (Owner)

Is this ready? I can still see my comments not resolved.

@galthran-wq (Contributor, Author)

> Is this ready? I can still see my comments not resolved.

Almost. Tell me if you think the batched numpy array test is okay, and let's also settle on whether a refactoring of the current detect_faces for non-batched detectors is needed (your last comment).

I plan to:

  • add test comments
  • fall back to pseudo-batching on opencv and mtcnn, and then make sure the rtol stays at 0.01%
  • (maybe) refactor the detectors a bit

@galthran-wq (Contributor, Author)

galthran-wq commented Feb 23, 2025

To summarize what's changed:

  • I've added comments and additional checks to the tests.
  • I've made batching on opencv and mtcnn optional (due to the issue above). To enforce batching, a user can set ENABLE_OPENCV_BATCH_DETECTION (or ENABLE_MTCNN_BATCH_DETECTION) to true; see the sketch below.
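
A sketch of that opt-in switch (the environment variable names come from this comment; the file names are illustrative):

```python
import os

# Opt in to true batched detection for opencv (and/or mtcnn) before calling deepface.
os.environ["ENABLE_OPENCV_BATCH_DETECTION"] = "true"
# os.environ["ENABLE_MTCNN_BATCH_DETECTION"] = "true"

from deepface import DeepFace

results = DeepFace.extract_faces(
    img_path=["a.jpg", "b.jpg"], detector_backend="opencv"
)
```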

Unfortunately, this didn't fix the batch extraction case on opencv: the problem is that it occasionally fails (the predictions seem to have some random behaviour, so the results can differ from run to run!). Note that this has nothing to do with batching, because batching is disabled by default. We might add a separate issue and test to reproduce this.

  • I have fixed the special case of a single image in a list input (a batch of size 1). It now indeed returns a list with a single element: the list of faces detected in that image.
  • The detectors that do not implement batching all had repeating logic in detect_faces. I have moved this logic into a default implementation in Detector. Now those detectors only need to implement _process_single_image, and batching is supported by inheritance (see the sketch after this list).
  • If a detector implements batching, it overrides detect_faces with its own logic, just as before.
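
A sketch of that scheme (detect_faces and _process_single_image are named in the comment above; the bodies here are illustrative rather than the PR's exact code):

```python
from abc import ABC, abstractmethod
from typing import List, Union

import numpy as np

class Detector(ABC):
    def detect_faces(
        self, img: Union[np.ndarray, List[np.ndarray]]
    ) -> Union[List, List[List]]:
        # Default pseudo-batching: loop over the batch one image at a time.
        # Detectors with native batch support override this method entirely.
        if isinstance(img, list) or (isinstance(img, np.ndarray) and img.ndim == 4):
            return [self._process_single_image(single_img) for single_img in img]
        return self._process_single_image(img)

    @abstractmethod
    def _process_single_image(self, img: np.ndarray) -> List:
        """Detect faces in a single image; the only method a subclass must provide."""
```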
