Let’s suppose you’d like to let the visitors of your website make a video recording or take a picture of their face. How can you explain to them how close they’re supposed to sit to the camera? You could write lengthy instructions, but you know that today almost nobody reads the instructions. Probably, a better approach to solve this problem would be to use a face detection algorithm that can evaluate where a face is in proportion to the frame, figuring out if the face is too far away or too close.
My development team at Mozilla needed a “video booth” feature where people were able to record themselves talking into the web camera and have those videos uploaded and shared, all in a web application. In cases like this one, the quality problems are usually related to the audio. Ideally people should use a decent quality microphone and sit in a room with minimal echo, but often it isn’t possible. The best way to solve this issue is to ensure the user sits close to the built-in microphone of their laptop. To do that, we have implemented a face detection on the viewfinder, in order to tell them whether they are sitting near enough or not.
In this article, I’ll show you how to build this functionality and learn something about the components involved.