Biometrics and Big Data: Facial Recognition
A key feature of big data is its lack of structure: we’re talking about the stuff that doesn’t fit neatly into Excel columns and isn’t easily described at first glance using numbers or other descriptors. This includes things such as images, satellite data and social media posts, which together comprise the bulk of all data in the world. One of the hottest and most controversial applications of unstructured big data is in the field of biometrics, the analysis of data that can identify us as individual human beings.
This week we will take a look at some of the interesting use cases of facial recognition and explore how the algorithms and technologies function to identify people based on their facial features.
Thanks in part to the trove of images that has been uploaded to social media sites such as Facebook, facial recognition has progressed by leaps and bounds in the past several years. You have likely seen it in action using Google Photos or Facebook, with functionality such as automatic tagging of recognizable faces. It is indeed an impressive capability, albeit somewhat creepy.
That is because the technology is easily put to nefarious or oppressive uses. It is becoming increasingly popular with governments around the world that want to automate or enhance certain aspects of law enforcement by tracking their citizens. A recent Wall Street Journal article detailed how China is using facial recognition to police its citizens, even going as far as “scoring” them based on automated observation of their social behavior. A citizen who jaywalks, for example, can be caught via a street camera and penalized either through a fine or, more disturbingly, a dent in their ‘social score.’
Many retailers have started to employ the technology to spot shoplifters when they attempt to re-enter their stores. By tracking the identities of known shoplifters, loss prevention professionals are able to pinpoint a shoplifter and take measures to avoid repeat offenses. An article in Loss Prevention Media cites a VP of a major retailer, who noted the high recidivism rates among shoplifters and how little retailers could do about repeat offenders before this technology: “We now know that 26 percent of the people we detain, we see again in the brand within one month, on average 13 days later. We never had a way of knowing things like this before. This is stuff that LP associates will salivate over. It’s going to be a game changer.”
So how does this facial recognition technology work?
Facial recognition software is built on highly refined machine learning algorithms that have taught themselves to identify the relative characteristics that are unique to different faces. It first takes an image and isolates the faces that it identifies within (To try this yourself with Python, check this out). The software then analyzes each face to determine whether any reorientation or resizing needs to occur before looking more closely at individual features. If necessary, it will adjust the image so that the key points sit at pixel positions comparable to those in the existing photos in the database. So, for example, the right eyebrow will need to be in approximately the same position as the right eyebrow in database photos.
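As a rough illustration of the reorientation and resizing step, here is a minimal sketch in plain Python. It assumes a face detector has already supplied pixel coordinates for the two eyes, and it computes the rotation, scaling and shift that would move them onto fixed template positions. The function names, the template coordinates and the sample points are all hypothetical; real systems typically use more landmarks and warp the whole image rather than individual points.

```python
import math

def alignment_transform(left_eye, right_eye,
                        target_left=(30.0, 40.0), target_right=(70.0, 40.0)):
    """Return (a, b, tx, ty) for the map (x, y) -> (a*x - b*y + tx, b*x + a*y + ty).

    The map rotates, scales and shifts the image so that the detected eyes
    land on the template positions (hypothetical values chosen for this sketch).
    """
    # Eye-to-eye vector in the photo and in the template
    dx, dy = right_eye[0] - left_eye[0], right_eye[1] - left_eye[1]
    tdx, tdy = target_right[0] - target_left[0], target_right[1] - target_left[1]

    # Resize so the eye spacing matches, rotate so the eye line matches
    scale = math.hypot(tdx, tdy) / math.hypot(dx, dy)
    angle = math.atan2(tdy, tdx) - math.atan2(dy, dx)
    a, b = scale * math.cos(angle), scale * math.sin(angle)

    # Shift so the left eye lands exactly on its template position
    tx = target_left[0] - (a * left_eye[0] - b * left_eye[1])
    ty = target_left[1] - (b * left_eye[0] + a * left_eye[1])
    return a, b, tx, ty

def apply_transform(transform, point):
    """Apply the transform from alignment_transform to one (x, y) point."""
    a, b, tx, ty = transform
    x, y = point
    return (a * x - b * y + tx, b * x + a * y + ty)

# Example: a tilted face whose eyes were detected at these pixel positions
t = alignment_transform((100.0, 120.0), (160.0, 150.0))
print(apply_transform(t, (100.0, 120.0)))  # left eye after alignment
print(apply_transform(t, (160.0, 150.0)))  # right eye after alignment
```

Any other landmark, such as the right eyebrow, can be passed through the same transform, which is what puts it in roughly the same position as the right eyebrow in the database photos.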
The software then looks at the relative positioning of different facial features to create a “faceprint,” the set of unique characteristics that make one face different from every other face. This includes the shape of the eyebrows and their distance from the eyes, the corners of the eyes and mouth, the points of the nose and the shape of the lips and chin. More than 100 other features may be processed to improve the accuracy of the match.
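One simple way to picture a faceprint is as a list of distances between facial landmarks, divided by the distance between the eyes so that the numbers do not change when the photo is larger or smaller. The sketch below does exactly that in plain Python. It is an illustration only: the landmark names and coordinates are made up, and production systems generally learn far richer features than a handful of distances.

```python
import math
from itertools import combinations

def faceprint(landmarks):
    """Build a toy faceprint from a dict of landmark name -> (x, y).

    Every pairwise distance is divided by the eye-to-eye distance, so the
    result describes relative positioning rather than absolute pixel size.
    """
    names = sorted(landmarks)  # fixed ordering so faceprints are comparable
    eye_span = math.dist(landmarks["left_eye"], landmarks["right_eye"])
    return [
        math.dist(landmarks[a], landmarks[b]) / eye_span
        for a, b in combinations(names, 2)
    ]

# Hypothetical landmark positions for one aligned face
face = {
    "left_eye": (30, 40),
    "right_eye": (70, 40),
    "nose_tip": (50, 60),
    "mouth_left": (35, 80),
    "mouth_right": (65, 80),
}
print(faceprint(face))
```

Five landmarks give ten pairwise distances; with the 100 or more features mentioned above, the list simply gets longer.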
Once the faceprint is established, the software looks for a match in the database of existing photos. In many cases, it will identify the face in the image, sometimes more accurately than a human. However, error rates for facial recognition remain high compared to those of fingerprints and retina scans.
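The matching step can be sketched as a nearest-neighbor search: compare the new faceprint against every stored one and accept the closest only if it is close enough. The plain-Python example below assumes faceprints are equal-length lists of numbers; the names, values and the 0.1 threshold are invented for illustration and would need tuning on real data.

```python
import math

def best_match(probe, database, threshold=0.1):
    """Return (name, distance) for the closest stored faceprint.

    If even the closest entry is farther away than the threshold, the name
    is None, meaning the face is treated as unknown. The threshold here is
    an arbitrary illustrative value.
    """
    best_name, best_dist = None, float("inf")
    for name, stored in database.items():
        d = math.dist(probe, stored)  # Euclidean distance between faceprints
        if d < best_dist:
            best_name, best_dist = name, d
    if best_dist <= threshold:
        return best_name, best_dist
    return None, best_dist

# Hypothetical database of enrolled faceprints
known_faces = {
    "alice": [1.0, 0.8, 1.2],
    "bob": [1.0, 1.1, 0.9],
}
print(best_match([1.01, 0.82, 1.19], known_faces))  # close to a stored print
print(best_match([2.0, 2.0, 2.0], known_faces))     # far from every stored print
```

The threshold is where the error rates mentioned above come from: set it too loose and strangers are matched to enrolled people, too strict and genuine matches are missed.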
If you know or are learning Python, here’s a tutorial on how to use OpenCV and Python to perform face recognition!
According to survey data, consumers are afraid of companies and governments that use facial recognition data. Ask Your Target Market conducted a survey in 2016 that found that 62% of people are at least somewhat concerned over how facial recognition might impact their personal privacy. Only 10% of people thought it would be acceptable for companies to use the technology for marketing or advertising purposes.
Indeed, companies employing facial recognition will soon have the ability to track the movements of anyone who enters their stores, and people fear that the information could be used in ways that are unfair to consumers. The data could even be sold to other companies or used against consumers in a legal context.
Unsurprisingly, the primary consumers of this technology are governments, many of which would like to use it to monitor citizen activity in a way that exceeds the requirements of normal law enforcement. The WSJ notes an instance where a vocal government critic was detained by police in southwestern China despite making a concerted effort to hide his location from authorities. The man said that authorities were able to track him using cameras at certain intersections that used his faceprint to identify him.
Biometrics, like much modern technology that once seemed like distant fantasy, is now a staple feature of the new, data-centric reality. This is all built on two key concepts: big data to store massive libraries of faces and machine learning to run and improve the recognition algorithms.
It is expected that as biometric data continues to be captured, the algorithms for processing it will become more accurate. What is less certain is how our society will react to the use of these technologies and how regulation will evolve to prevent (or enable) companies and governments from taking advantage of customers and citizens.
Are you a company looking to do more with biometric data and in need of talent to help you achieve your goals? Dataspace is a vendor-neutral provider of big data staffing and consulting services. Contact us today at 734.761.5962 or email us at email@example.com for more information on how we can work together!