-
-
Notifications
You must be signed in to change notification settings - Fork 2.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Batching on .represent
to improve performance and utilize GPU in full
#1433
Conversation
Would you please write unit test to make it clearer while reviewing? |
@@ -81,3 +83,49 @@ def test_max_faces(): | |||
max_faces = 1 | |||
results = DeepFace.represent(img_path="dataset/couple.jpg", max_faces=max_faces) | |||
assert len(results) == max_faces | |||
|
|||
|
|||
@pytest.mark.parametrize("model_name", [ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please do the with only one model - e.g. Facenet
I excluded some of those models from test otherwise it will take too long tests to be performed in github
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it's just that some models have custom forward logic, which I also had to tweak a little bit.
maybe still keep those models(Dlib, SFace, VGGFace), along with some keras one, like, Facenet?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
okay but some are optional (e.g. dlib), these are not installed in github actions. so, your tests will be failed.
There are some linting issues broken the actions |
LGTM Thank you for your contribution |
"dataset/img2.jpg", | ||
"dataset/img3.jpg", | ||
"dataset/img4.jpg", | ||
"dataset/img5.jpg", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if you add couple.jpg here, there are 6 input images but in the response we will have 7 items.
the bad part, we cannot understand that which image has 2 faces. i am creating a PR to store input image's index in the response payload.
we may consider to have List of List of Dict response type for batch inputs in the future.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it is a good idea to have List of List of Dict. I could make a PR now, or perhaps later, when batched detection is merged, because I also had in mind to use batched detection for .represent
(now it is done in the for loop)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will do the initial changes. PRs are always welcome!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i merged a workaround PR for this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
great work!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
will optimize this because if batch size is long, this approach gives O(n^2) complexity.
I plan to do something with dict, which decrease the complexity O(n)
Tickets
#1101
What has been done
With this PR,
.represent
is able to accept a list of paths/numpy arrays and process them all in a batch.How to test
I've made a collab notebook
https://colab.research.google.com/drive/1bV0yyrdT1a0a4dyemoeL5xf28w3ql1fd#scrollTo=lVndmFF5Kls7
which shows the >10x performance improvement and also the fact, that with batch_size=1 the GPU is almost not utilized