Hi, I'm Angelina Wang, a graduate researcher in the computer science department at Princeton University. Thank you for inviting me to speak today.
I will give a brief overview of the technology behind facial recognition, as well as highlight some of what are, in my view, the most pertinent technical problems with this technology that should prevent it from being deployed.
These days, different kinds of facial recognition tasks are generally accomplished by a model that has been trained using machine learning. What this means is that rather than relying on any sort of hand-coded rules, such as a rule that two faces are more likely to belong to the same person if they have the same eye colour, the model is simply given a very large dataset of faces with annotations and instructed to learn from it. These annotations include things like labels for which images show the same person, and the location of the face in each image. They are typically collected through crowdsourcing on platforms like Amazon Mechanical Turk, which has been known to have homogeneous worker populations and unfavourable working conditions. These datasets are very large, ranging from around 10,000 images at the low end up to many millions. The faces themselves are frequently collected simply by scraping images off the Internet, from places like Flickr. The individuals whose faces are included in these datasets generally do not know their images were used for such a purpose, and may consider this to be a privacy violation. The model uses these massive datasets to automatically learn how to perform facial recognition tasks.
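To make this concrete, here is a minimal sketch of what such a training pipeline can look like in code, written with the PyTorch library. Everything in it is a hypothetical stand-in: random tensors play the role of scraped face images, and the integer labels play the role of the crowdsourced identity annotations; a real system would use far larger datasets and a far more sophisticated architecture.

```python
# A minimal, illustrative sketch of how a face recognition model "learns" from an
# annotated dataset. The random tensors stand in for scraped face images, and the
# integer labels stand in for crowdsourced "which images are the same person" annotations.
import torch
import torch.nn as nn

NUM_IDENTITIES = 100          # each label is one person in the dataset
NUM_IMAGES = 1_000            # real datasets range from ~10,000 images to millions

# Stand-in "dataset": 64x64 greyscale crops with an identity label per image.
images = torch.rand(NUM_IMAGES, 1, 64, 64)
labels = torch.randint(0, NUM_IDENTITIES, (NUM_IMAGES,))

# A small convolutional network that maps a face crop to a score per identity.
model = nn.Sequential(
    nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(),
    nn.MaxPool2d(2),
    nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
    nn.MaxPool2d(2),
    nn.Flatten(),
    nn.Linear(32 * 16 * 16, NUM_IDENTITIES),
)

optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

# The "learning" is just this: repeatedly nudge the model's parameters so its
# predictions better match the annotations. No rules are ever written by hand.
for epoch in range(5):
    for start in range(0, NUM_IMAGES, 64):
        batch_images = images[start:start + 64]
        batch_labels = labels[start:start + 64]
        optimizer.zero_grad()
        loss = loss_fn(model(batch_images), batch_labels)
        loss.backward()
        optimizer.step()
```

The key point is that no rule about faces appears anywhere in this code; whatever patterns the model ends up using are determined entirely by the data it is shown.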
It’s worth noting here that there is also a great deal of pseudoscience around other kinds of facial recognition tasks, such as gender prediction, emotion prediction, and even sexual orientation prediction and criminality prediction. There has been warranted backlash against and criticism of this work, because it is all about predicting attributes that are not visually discernible.
In terms of what some might consider to be more legitimate use cases of facial recognition, these models have been shown over and over to have racial and gender biases. The most prominent work that brought this to light was “Gender Shades” by Joy Buolamwini and Timnit Gebru. While it investigated gender prediction from faces, a task that should generally not be performed, it highlighted a vitally important flaw in these systems. It showed that hiding behind a model’s high overall accuracy were very different performance levels across demographic groups. In fact, the largest gap was a 34.4% accuracy difference between darker skin-toned female people and lighter skin-toned male people. Many different deployed facial recognition models have been shown to perform worse on people with darker skin tones; multiple misidentifications of Black men in America, for example, have led to false arrests.
There are solutions to these kinds of bias problems, such as collecting more diverse and inclusive datasets, and performing disaggregated analyses that look at accuracy rates across different demographic groups rather than at one overall accuracy metric. However, the collection of these diverse datasets is itself exploitative of marginalized groups, violating their privacy in order to collect their biometric data.
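To illustrate what a disaggregated analysis looks like in practice, here is a small sketch. The group names and numbers are hypothetical, chosen only to show how a respectable overall accuracy can conceal a large gap between groups.

```python
# A minimal sketch of a disaggregated analysis: instead of reporting one overall
# accuracy number, report accuracy separately for each demographic group.
from collections import defaultdict

# Hypothetical evaluation records: (demographic group, was the prediction correct?).
results = (
    [("lighter-skinned men", True)] * 950
    + [("lighter-skinned men", False)] * 50
    + [("darker-skinned women", True)] * 650
    + [("darker-skinned women", False)] * 350
)

correct = defaultdict(int)
total = defaultdict(int)
for group, is_correct in results:
    total[group] += 1
    correct[group] += int(is_correct)

overall = sum(correct.values()) / sum(total.values())
print(f"Overall accuracy: {overall:.1%}")                   # 80.0% -- looks reasonable
for group in total:
    print(f"{group}: {correct[group] / total[group]:.1%}")
# lighter-skinned men: 95.0%
# darker-skinned women: 65.0%  -- a 30-point gap hidden inside the overall number
```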
While these kinds of biases are theoretically surmountable with current technology, there are two big problems that the current science does not yet know how to address: brittleness and interpretability. By brittleness, I mean that there are known ways in which these facial recognition models break down, allowing bad actors to circumvent and trick them. Adversarial attacks are one such method, where someone manipulates the face presented to a model in a particular way such that the model is no longer able to identify them, or even misidentifies them as someone completely different. One body of work has shown how simply wearing a pair of glasses painted in a specific pattern can trick a model into thinking one person is someone entirely different.
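As an illustration, here is a minimal sketch of the simplest digital form of such an attack, a single gradient-sign step of the kind studied in the adversarial examples literature. The model and image are hypothetical stand-ins, and physical attacks like the painted glasses are more constrained, but the underlying idea is the same: search for a small change to the input that shifts the model's decision.

```python
# A minimal sketch of a digital adversarial attack (one "fast gradient sign" step).
# The model and image below are hypothetical stand-ins.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Flatten(), nn.Linear(64 * 64, 100))  # toy face classifier
model.eval()

face = torch.rand(1, 1, 64, 64)               # the attacker's own face image
target_identity = torch.tensor([42])          # the person they want to impersonate

face.requires_grad_(True)
loss = nn.functional.cross_entropy(model(face), target_identity)
loss.backward()

# Step each pixel slightly against the gradient of the loss for the target
# identity: the change is capped at epsilon per pixel, so it is barely visible,
# yet it can push the model's prediction toward the chosen target.
epsilon = 0.03
adversarial_face = (face - epsilon * face.grad.sign()).clamp(0, 1).detach()

print("original prediction:   ", model(face).argmax(dim=1).item())
print("adversarial prediction:", model(adversarial_face).argmax(dim=1).item())
```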
The next problem is one of interpretability. As I previously mentioned, these models learn their own sets of patterns and rules from the large dataset they are given. Discovering the precise set of rules a model is using to make its decisions is extremely difficult, and even the engineer or researcher who built the model frequently cannot explain why it performs certain classifications. This means that if someone is misclassified by a facial recognition model, there is no good way to contest the decision or inquire into why it was made. Models frequently rely on something called “spurious correlations,” which is when a model uses a correlation in the data that is unrelated to the actual task in order to perform a classification. For example, medical diagnosis models may be relying on an image artifact of a particular X-ray machine to perform a classification, rather than on the actual contents of the image. I believe it is dangerous to deploy models whose inner workings we understand so poorly in settings as high-stakes as facial recognition.
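Here is a small sketch of how a spurious correlation plays out, using a hypothetical setup in which a single artifact pixel happens to track the label in the training data, much like the X-ray machine example.

```python
# A minimal sketch of a spurious correlation. Every "positive" training image
# carries a bright marker in one corner (standing in for an X-ray machine's
# artifact), so a model can score perfectly by reading that single pixel instead
# of the actual image content -- and then fails once the artifact is absent.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

def make_images(n, with_artifact):
    images = rng.random((n, 8 * 8))          # 8x8 images of pure noise
    labels = rng.integers(0, 2, size=n)      # the "true" label is random too
    if with_artifact:
        images[labels == 1, 0] = 1.0         # artifact pixel tracks the label
        images[labels == 0, 0] = 0.0
    return images, labels

train_x, train_y = make_images(1000, with_artifact=True)
test_x, test_y = make_images(1000, with_artifact=False)

model = LogisticRegression(max_iter=1000).fit(train_x, train_y)
print("train accuracy:", model.score(train_x, train_y))   # ~100%: reads the artifact
print("test accuracy: ", model.score(test_x, test_y))     # ~50%: no better than chance
```

The model looks excellent on data containing the artifact and collapses to chance the moment the artifact disappears, which is exactly the kind of failure that is very hard to detect from the outside.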
Some final considerations I think are worth noting. Facial recognition is an incredibly cheap surveillance technology to deploy, and that makes it very dangerous because of how quickly it can proliferate. Our faces are such a central part of our identities, and generally do not change over time, so this kind of surveillance is very concerning. I have only presented a few technical objections to facial recognition technology today, but taken together with the many other criticisms, I believe the enormous risks of this technology far outweigh any benefits that can be gained.
Thank you.