Racist practice of phrenology rears its lumpy head in facial recognition tech

21 June 2020 - 00:01 By Catherine Stinson
A man in a mask protests against the use of police facial recognition cameras in the UK.
Image: Matthew Horwood/Getty Images

Phrenology has an old-fashioned ring to it — like it should be filed somewhere between bloodletting and velocipedes. But, though we'd like to think judging people's worth based on the shape of their skull is a practice that's well behind us, it is once again rearing its lumpy head.

In recent years, machine-learning algorithms have promised governments and private companies the power to glean all sorts of information from people's appearance. Several startups now claim to be able to use artificial intelligence (AI) to help employers detect the personality traits of job candidates based on their facial expressions.

In China, the government has pioneered the use of surveillance cameras that identify and track ethnic minorities. Reports have emerged of schools installing camera systems that sanction children for not paying attention, based on microexpressions like eyebrow twitches.


Perhaps most notoriously, a few years ago AI researchers Xiaolin Wu and Xi Zhang claimed to have trained an algorithm to identify criminals based on the shape of their faces, with an accuracy of 89.5%.

They didn't go so far as to endorse the ideas about physiognomy and character that circulated in the 19th century, notably from the work of the Italian criminologist Cesare Lombroso: that criminals are under-evolved, subhuman beasts, recognisable from their sloping foreheads and hawk-like noses.

But the recent study's attempt to pick out facial features associated with criminality borrows directly from the "photographic composite method" developed by the Victorian jack-of-all-trades Francis Galton. It involved overlaying the faces of multiple people in a certain category to find the features indicative of qualities like criminality.

Technology commentators have panned new facial-recognition technologies as "literal phrenology". In some cases, the explicit goal is to deny opportunities to those deemed unfit; in others, it might not be the goal, but it's a predictable result.

But when we dismiss algorithms as phrenology, what exactly is the problem we're trying to point out? Are we saying these methods are scientifically flawed and don't really work - or are we saying that it's morally wrong to use them, regardless?

In the recent AI study of criminality, the data were taken from two very different sources: mugshots of convicts and pictures from work websites for non-convicts. That fact alone could account for the algorithm's ability to detect a difference between the groups. In a new preface to the paper, the researchers also admitted that taking court convictions as synonymous with criminality was a "serious oversight".
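How a source mismatch like this can inflate accuracy is easy to demonstrate with a toy experiment. The numbers below are invented for illustration and have nothing to do with the study's actual data: two batches of photos are simulated so that they differ only in an incidental property (here, average background brightness, as might separate mugshots from professional headshots), and a "classifier" that never looks at a face still scores very well.

```python
import random

random.seed(0)

# Toy illustration with invented numbers: two photo sources that differ
# only in an incidental property -- say, average background brightness.
# Simulated "mugshots" have darker backgrounds than simulated "work-site"
# portraits; no facial information is involved at all.
mugshots = [random.gauss(80, 10) for _ in range(1000)]    # mean brightness
portraits = [random.gauss(120, 10) for _ in range(1000)]

# A one-line "classifier": any image darker than the midpoint is labelled
# a convict. It detects the photo source, not criminality.
threshold = 100
correct = sum(b < threshold for b in mugshots) + \
          sum(b >= threshold for b in portraits)
accuracy = correct / 2000
print(f"accuracy: {accuracy:.1%}")  # high, despite learning nothing about faces
```

The point of the sketch is that a headline accuracy figure cannot distinguish a model that has learned something about faces from one that has merely learned which website the photos came from.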

Yet equating convictions with criminality seems to register with the authors mainly as an empirical flaw: using mugshots of convicted criminals, but not of the ones who got away, introduces a statistical bias. They said they were "deeply baffled" at the public outrage in reaction to a paper that was intended "for pure academic discussions".

Notably, the researchers don't comment on the fact that conviction itself depends on the impressions that police, judges and juries form of the suspect - making a person's "criminal" appearance a confounding variable. They also fail to mention how the intense policing of particular communities, and inequality of access to legal representation, skews the dataset.

In their response to criticism, the authors don't back down from the assumption that "being a criminal requires a host of abnormal (outlier) personal traits". Indeed, their framing suggests that criminality is an innate characteristic, rather than a response to social conditions like poverty or abuse. But part of what makes their dataset questionable on empirical grounds is that who gets labelled "criminal" is hardly value-neutral.


One of the strongest moral objections to using facial recognition to detect criminality is that it stigmatises people who are already overpoliced. The authors say that their tool shouldn't be used in law enforcement, but cite only statistical arguments about why it ought not to be deployed.

They note that the false-positive rate (50%) would be very high, but take no notice of what that means in human terms. Those false positives would be individuals whose faces resemble people who have been convicted in the past. Given the racial and other biases that exist in the criminal justice system, such algorithms would overestimate criminality among marginalised communities.
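The human cost behind that false-positive rate can be made concrete with a back-of-envelope calculation. The population figures below are invented for illustration, not drawn from the study: if resemblance to past convicts tracks policing intensity rather than behaviour, the same 50% false-positive rate falls far more heavily on the overpoliced community.

```python
# Back-of-envelope illustration with invented numbers: a 50% false-positive
# rate applied to two equally sized communities that are policed at
# different rates.
population = 100_000
false_positive_rate = 0.50  # the rate the paper itself reports

# Hypothetical fractions of each community whose faces resemble past
# convicts -- driven by policing intensity, not by behaviour.
resembles_convicts = {"overpoliced": 0.10, "other": 0.02}

flagged = {
    community: int(population * fraction * false_positive_rate)
    for community, fraction in resembles_convicts.items()
}

for community, count in flagged.items():
    print(f"{community}: {count} innocent people wrongly flagged")
```

Under these made-up assumptions the tool wrongly flags five times as many innocent people in the overpoliced community, which is the sense in which a purely statistical error rate carries a moral weight the authors do not address.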

The most contentious question seems to be whether reinventing physiognomy is fair game for the purposes of "pure academic discussion". One could object on empirical grounds: eugenicists of the past like Galton and Lombroso ultimately failed to find facial features that predisposed a person to criminality. That's because there are none.

Likewise, psychologists studying the heritability of intelligence, like Cyril Burt and Philippe Rushton, had to play fast and loose with their data to manufacture correlations between skull size, race and IQ. If there were anything to discover, presumably the many people who have tried wouldn't have come up dry.

Some commentators argue that facial recognition should be regulated as tightly as plutonium, because it has so few nonharmful uses. When the dead-end project you want to resurrect was invented for the purpose of propping up colonial and class structures - and when the only thing it's capable of measuring is the racism inherent in those structures - it's hard to justify trying it one more time, just for curiosity's sake.

• This article was originally published at Aeon and has been republished under Creative Commons.


  • Phrenology involved reading the “bumps” on the head and measuring parts of the skull to determine personality traits. It was a fad that fizzled out by the late 1800s when it was pegged as a pseudoscience and practitioners were placed in the same category as fortune tellers. It is now considered an abomination — a racist and dehumanising practice.
  • Between 1820 and 1840, some employers would refuse to hire people unless they went to see a phrenologist first.
  • Phrenology was sometimes used to justify slavery in America.
  • The fad has left us with terms that are still used — “highbrow” and “lowbrow”, for whether something is fancy or simple. Someone with a higher forehead was considered cultured. Lower ones were associated with a lesser class of people.
  • Other terms include “shrink”, slang for psychiatrists, and “well rounded”. Phrenologists wanted to shrink undesirable traits, and people went to “get their heads examined” — literally.

• Source: Ranker
