Can Software Be Racially Biased?
The tech world is notorious for its lack of diversity and its reinforcement of inequalities. Just ask anyone who works in the industry and doesn’t have the privilege of being a white male. But if you’re optimistic, you might assume that the algorithms that track you online, or perhaps more importantly the software that processes job applications, recognizes criminal suspects, or makes housing loan decisions, aren’t biased when it comes to race or gender. That assumption is, unfortunately, wrong: software can be, and is, just as biased as the humans who create it and teach it to do its job.
Measuring the potential for bias
A team of computer scientists from the University of Utah, the University of Arizona, and Haverford College last year discovered a way to determine whether an algorithm — like the ones that are used for hiring decisions, loan approvals, and other tasks with similarly significant effects on people’s lives — discriminates unintentionally and violates the legal standards for fair access to employment, housing, or other opportunities. That’s important because of the far reach of such algorithms.
For instance, many companies use algorithms to filter out job applicants in the hiring process, in part because it’s time-consuming to sort through all of the applications manually. A program can scan resumes, search for keywords and numbers, and then assign a score to an applicant. Machine learning algorithms can learn more as they work, since they can change and adapt to better predict outcomes, but they can also introduce unintentional bias. “The irony is that the more we design artificial intelligence technology that successfully mimics humans,” explained lead researcher Suresh Venkatasubramanian, “the more that A.I. is learning in a way that we do, with all of our biases and limitations.”
The research examines algorithmic bias through the legal concept of disparate impact, a theory in U.S. anti-discrimination law under which a policy can be considered discriminatory if it has an adverse impact on any group based on race, religion, gender, sexual orientation, or other protected status, even when no discrimination is intended. The researchers found a way to test for it: if a person’s race or gender can be accurately predicted from the data being analyzed, even though race and gender have been hidden from that data, then there is potential for bias.
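The idea behind that test can be sketched in a few lines of Python. This is an illustration only, not the researchers’ actual method: the records, the feature names (a neighborhood and a college), the group labels, and the simple lookup-table predictor are all hypothetical, and balanced accuracy is just one reasonable way to score the predictions. The hidden attribute is never given to the predictor as an input; it is used only as the label to be guessed.

```python
from collections import Counter, defaultdict

def balanced_accuracy(y_true, y_pred):
    """Average of per-group recall; 0.5 means no better than guessing for two groups."""
    recalls = []
    for group in sorted(set(y_true)):
        idx = [i for i, y in enumerate(y_true) if y == group]
        hits = sum(1 for i in idx if y_pred[i] == group)
        recalls.append(hits / len(idx))
    return sum(recalls) / len(recalls)

def fit_lookup(rows, labels):
    """Learn the most common hidden label for each combination of visible features."""
    counts = defaultdict(Counter)
    for row, label in zip(rows, labels):
        counts[row][label] += 1
    fallback = Counter(labels).most_common(1)[0][0]
    table = {row: c.most_common(1)[0][0] for row, c in counts.items()}
    return lambda row: table.get(row, fallback)

# Hypothetical applicant data: (neighborhood, college) is visible, the group is hidden.
train = (
    [(("north", "state_u"), "g1")] * 4 + [(("north", "city_college"), "g1")] * 3
    + [(("south", "city_college"), "g2")] * 4 + [(("south", "state_u"), "g2")] * 2
    + [(("north", "state_u"), "g2")] * 1 + [(("south", "city_college"), "g1")] * 1
)
test = (
    [(("north", "state_u"), "g1")] * 2 + [(("north", "city_college"), "g1")] * 1
    + [(("south", "city_college"), "g2")] * 2 + [(("south", "state_u"), "g2")] * 1
    + [(("south", "state_u"), "g1")] * 1 + [(("north", "city_college"), "g2")] * 1
)

predict = fit_lookup([row for row, _ in train], [group for _, group in train])
true_groups = [group for _, group in test]
guessed_groups = [predict(row) for row, _ in test]

# A score well above 0.5 means the visible data leaks the hidden attribute.
score = balanced_accuracy(true_groups, guessed_groups)
print(f"balanced accuracy when guessing the hidden group: {score:.2f}")
```

In this toy data the hidden group can be guessed correctly three times out of four from the neighborhood and college alone, which is the kind of signal that would flag a dataset as carrying potential for bias.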
Transposing offline prejudices
Claire Cain Miller reported for The New York Times last year that algorithms — which are written and maintained by people, and adjust what they do based on people’s behavior — can reinforce human prejudices. Google’s advertising system showed an ad for a high-income job to men much more often than to women. Ads for arrest records are much more likely to show up on searches for “distinctively black” names or a historically black fraternity. Advertisers can target people who live in low-income neighborhoods with ads for high-interest loans. The results for a Google Images search for “C.E.O.” were 11% images of women, even though 27% of U.S. chief executives are women.
Miller noted that algorithms, which are simply a series of instructions written by programmers, “are often described as a black box; it is hard to know why websites produce certain results. Often, algorithms and online results simply reflect people’s attitudes and behavior. Machine learning algorithms learn and evolve based on what people do online.” And that’s where software can learn to discriminate against groups, reproducing racial, gender, or other biases.
Attorney Rachel Goodman reported for the American Civil Liberties Union that the organization had filed comments with the FTC urging it and the Consumer Financial Protection Bureau to investigate whether big data is being used in online marketing in ways that are racially discriminatory. While the effects of such advertising may not sound like a big deal when it’s, say, electronics or household products that are being advertised, you’ll see the problem if you consider the effects of lenders implementing the same practices.
Goodman noted that if lenders use behavioral targeting to advertise more expensive credit products to people of color, or if they market credit products in ways that cause borrowers of color to receive credit on less favorable terms than equally creditworthy white borrowers, those practices violate the Equal Credit Opportunity Act. The Wall Street Journal revealed in 2010 that big data is used to determine which credit card offer to show a particular user. (The same technique can be applied to deciding which auto loan or mortgage product to display.)
The ACLU determined that while the online marketplace “has incredible potential to render obsolete the discrimination that has all too often infected lending and consumer transactions in this country,” the unregulated use of big data “could spoil that potential by transposing offline biases into the algorithms that shape our digital experiences.”
The algorithms that determine which credit card offer to show you or sort through job applications aren’t the only algorithms with biases — biases that can have a big effect on the trajectory of your life, financial or otherwise. Clare Garvie and Jonathan Frankle report for The Atlantic that the facial recognition algorithms used by police aren’t required to undergo public or independent testing to determine their accuracy or check for biases before being deployed. “More worrying still,” Garvie and Frankle report, “the limited testing that has been done on these systems has uncovered a pattern of racial bias.”
Facial recognition algorithms are more likely to either misidentify or fail to identify African Americans than people of other races — errors that could result in innocent citizens being marked as suspects in crimes. Though the technology is being rolled out by law enforcement agencies across the country, little is being done to explore or correct for the bias.
The conditions in which an algorithm is created, including the racial makeup of its development team and test photo databases, significantly influence the accuracy of its results, in a manner reminiscent of the way that color film stocks and photographic systems were developed for white subjects. But the effect of a biased facial recognition algorithm is that innocent people can become the subjects of criminal investigations.
University of Oxford researcher Reuben Binns recently reported for The Conversation that it isn’t big data that discriminates, but the people who use it. Problems arise when computer models are used to make predictions in areas like insurance, financial loans, and policing. An algorithm might find that members of certain demographic groups have historically been more likely to default on their loans or be convicted of a crime, and the model would then rate members of such groups as riskier.
“That doesn’t necessarily mean that these people actually engage in more criminal behaviour or are worse at managing their money,” Binns writes. “They may just be disproportionately targeted by police and sub-prime mortgage salesmen.” Algorithms can’t tell the difference between just and unjust patterns, and while some suggest that datasets should exclude characteristics that may be used to reinforce existing biases, that’s not always enough. In those cases, such as when algorithms can infer sensitive attributes from a combination of other facts, it’s up to service providers to avoid making unjust assumptions and choices — the kind that are influenced by human bias and ignorance.
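A small Python sketch shows why leaving the sensitive attribute out of a model is not always enough. Everything here is hypothetical: the loan records, the neighborhood names, the groups “a” and “b”, and the deliberately naive model, which scores each applicant by the historical default rate of their neighborhood. The group column is never used for scoring, yet the neighborhood acts as a stand-in for it.

```python
from collections import defaultdict

# Hypothetical historical records: (neighborhood, group, defaulted).
history = (
    [("east", "a", False)] * 7 + [("east", "a", True)] * 1
    + [("east", "b", False)] * 1 + [("east", "b", True)] * 1
    + [("west", "a", False)] * 1 + [("west", "a", True)] * 1
    + [("west", "b", False)] * 4 + [("west", "b", True)] * 4
)

def fit_rate_by_neighborhood(records):
    """Naive risk model: historical default rate per neighborhood, ignoring group."""
    totals = defaultdict(int)
    defaults = defaultdict(int)
    for neighborhood, _group, defaulted in records:
        totals[neighborhood] += 1
        defaults[neighborhood] += int(defaulted)
    return {n: defaults[n] / totals[n] for n in totals}

def mean_risk_by_group(records, risk):
    """Average score each group receives, computed only to audit the model."""
    scores = defaultdict(list)
    for neighborhood, group, _defaulted in records:
        scores[group].append(risk[neighborhood])
    return {g: sum(s) / len(s) for g, s in scores.items()}

risk = fit_rate_by_neighborhood(history)
group_risk = mean_risk_by_group(history, risk)

for group in sorted(group_risk):
    print(f"group {group}: average predicted risk {group_risk[group]:.2f}")
```

Because most of group “b” lives in the neighborhood with the higher recorded default rate, its members receive a higher average risk score even though the model never sees group membership. Whether those recorded defaults reflect real behavior or disproportionate targeting is exactly what the model cannot tell.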
Binns reports that “the moral responsibility lies with those responsible for interpreting and acting on the model, not the model itself.” Algorithms can reinforce discrimination or mitigate it, and they can be used to counter inequality. Data can make people’s lives better or worse. Even when prejudice isn’t built into an algorithm itself, the algorithm can learn biases as it observes and adapts to the datasets that are supposed to teach it to do its job. Data on people’s lives and backgrounds can be easily misused, but perhaps the most productive form of optimism for those in the tech industry is to realize that misuse isn’t inevitable.