Nov 20, 2020 | LIFE | By Joshua Kalenga | Illustration by Patil Khakhamian
At first glance, the concept of a “racist” algorithm might seem absurd. After all, how can we assign any ethical value to what is just a set of calculations?
In a world where artificial intelligence is being used for everything from job hiring to policing, however, questions about how fairly these algorithms deal with different individuals and groups are surely warranted.
“Coded Bias,” a 2020 documentary by Shalini Kantayya, argues that some of the artificial intelligence in use today is biased against women and minorities and is in urgent need of government regulation.
The documentary follows the journey of MIT Media Lab researcher Joy Buolamwini. Her discovery that most facial-recognition software does not accurately identify the faces of women and darker-skinned people inspires her to investigate widespread bias in algorithms.
In one particularly memorable scene, Buolamwini — who is a woman of color — explains to Congress that she had to wear a white mask for Amazon’s facial recognition software to “recognize” her.
In response, Rep. Alexandria Ocasio-Cortez asks Buolamwini, “So, we have a technology that was created and designed by one demographic that is only mostly effective on that one demographic and they’re trying to sell it and impose it on the entirety of the country?”
A 2019 study of almost 200 facial recognition algorithms by the U.S. National Institute of Standards and Technology (NIST) found that “for one-to-one matching, [there were] higher rates of false positives for Asian and African American faces relative to images of Caucasians. The differentials often ranged from a factor of 10 to 100 times, depending on the individual algorithm.”
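To make the jargon concrete: “one-to-one matching” is verification — comparing two photos and asking whether they show the same person — and a “false positive” is a wrong match between two different people. The sketch below is purely illustrative (the similarity scores are invented, and this is not NIST’s methodology); it shows how a false match rate is computed per demographic group, and how a single global decision threshold can produce very different error rates when one group’s scores sit closer to it.

```python
import numpy as np

rng = np.random.default_rng(0)

def false_match_rate(impostor_scores, threshold):
    """Fraction of different-person comparisons wrongly accepted as matches."""
    impostor_scores = np.asarray(impostor_scores)
    return float((impostor_scores >= threshold).mean())

# Toy similarity scores for different-person pairs, split by group.
# These numbers are invented purely to mirror the kind of differential
# NIST reported; they are not drawn from any real system.
scores_group_a = rng.normal(0.30, 0.10, 10_000)  # group the model was tuned on
scores_group_b = rng.normal(0.45, 0.10, 10_000)  # group underrepresented in training

threshold = 0.60  # one global threshold, as deployed systems typically use
fmr_a = false_match_rate(scores_group_a, threshold)
fmr_b = false_match_rate(scores_group_b, threshold)
print(f"false match rate, group A: {fmr_a:.4%}")
print(f"false match rate, group B: {fmr_b:.4%}")
if fmr_a > 0:
    print(f"differential: roughly {fmr_b / fmr_a:.0f}x")
```

With these made-up distributions the differential lands around 50x — squarely in the 10-to-100 range NIST described — even though both groups face the exact same threshold.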
Since facial recognition software is already being employed in crucial facets of everyday life such as policing, the implications of misidentification are non-trivial. The New York Times reported in June 2020 that Robert Julian-Borchak Williams, a Black man, was arrested in Michigan after being wrongly identified by a faulty facial recognition program.
“Coded Bias” also cites an experimental Amazon program, developed in 2014 to screen job applicants for technology roles, as an example of algorithmic bias. The program penalized résumés that mentioned women’s colleges or groups, likely reflecting the gender imbalance in the company’s existing workforce.
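The mechanism behind such a system is easy to reproduce in miniature. The toy sketch below is not Amazon’s program, and the data is entirely invented; it simply shows how a text classifier trained on historically skewed hiring labels ends up assigning a negative weight to words associated with women’s organizations.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression

# Hypothetical toy data: past hiring decisions skew male, so résumés
# mentioning women's groups are labeled "rejected" in the training set.
resumes = [
    "software engineer python chess club captain",       # hired
    "software engineer java hackathon winner",           # hired
    "software engineer python women's coding club",      # rejected
    "software engineer java women's college graduate",   # rejected
    "data scientist c++ robotics team",                  # hired
    "data scientist c++ women's engineering society",    # rejected
]
labels = [1, 1, 0, 0, 1, 0]  # 1 = hired in the historical data

vectorizer = CountVectorizer()
X = vectorizer.fit_transform(resumes)
model = LogisticRegression().fit(X, labels)

# The model has simply memorized the historical pattern: the token
# "women" (from "women's") now carries a negative learned weight.
weights = dict(zip(vectorizer.get_feature_names_out(), model.coef_[0]))
print(sorted(weights.items(), key=lambda kv: kv[1])[:3])  # most negative tokens
```

Notice that gender never appears as an explicit input; the skew re-enters through correlated words, which is why simply deleting a “gender” field does not fix this kind of bias.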
A lack of adequate representation in the underlying data used to build some artificial intelligence systems is a recurring theme in the documentary.
“When you think of A.I., it’s forward-looking,” Buolamwini says in the film. “But A.I. is based on data, and data is a reflection of our history.”
There are other concerning examples of bias in artificial intelligence that are not explicitly discussed in “Coded Bias.” According to a study published in Science, an algorithm widely used in U.S. hospitals was less likely to refer Black patients to programs that provide personalized care than White patients who were equally sick.
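The Science study traced the disparity to a proxy problem: the algorithm predicted future healthcare costs rather than illness itself, and because less money has historically been spent on Black patients at the same level of need, a cost-based score under-ranks them. The numbers below are invented purely to illustrate that effect; they are not the study’s data.

```python
# Invented figures illustrating the proxy problem: scoring patients by
# predicted cost rather than by how sick they actually are.
patients = [
    # (group, chronic_conditions, past_annual_cost_usd)
    ("White", 4, 12_000),
    ("Black", 4, 8_000),   # equally sick, but lower historical spending
    ("White", 2, 9_000),
    ("Black", 2, 4_000),
]

# Rank by the cost proxy versus by a direct measure of need
# (here, the number of chronic conditions).
by_cost = sorted(patients, key=lambda p: p[2], reverse=True)
by_need = sorted(patients, key=lambda p: p[1], reverse=True)

slots = 2  # suppose the care program can only enroll two patients
print("referred via cost proxy:", [p[0] for p in by_cost[:slots]])  # ['White', 'White']
print("referred via true need: ", [p[0] for p in by_need[:slots]])  # ['White', 'Black']
```

Under the cost proxy, the Black patient with four chronic conditions loses a referral slot to a healthier White patient — the same pattern the study documented at scale.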
In spite of the biases in some applications, artificial intelligence remains one of the most sophisticated developments in computer science. From natural language processing to cancer diagnosis, its applications can be both deeply fascinating (at least to a tech nerd like me) and incredibly useful.
While “Coded Bias” certainly doesn’t romanticize the power of artificial intelligence, it doesn’t suggest that the concept is inherently flawed either. Instead, the film seems to simply argue for more racial and gender diversity in the technology industry, as well as more government regulation of widely used algorithms — especially those that have been proven to contain systematic bias.
Following “The Great Hack” and “The Social Dilemma,” Kantayya’s “Coded Bias” is the latest documentary to address the dangers of Big Tech. Perhaps, at best, the directors and casts of these documentaries hope to inspire a global consciousness in us, the viewers. After all, it is our data that powers the algorithms.