Scientists Are Trying to Fix AI Bias
Wed, April 21, 2021

Researchers are analyzing data to identify and remove derogatory words attached to images found online in an effort to eliminate AI bias / Credits: kchung via 123RF


In 2009, ImageNet, a large visual database designed for use in visual object-recognition research, was launched. Researchers collected and labeled more than 14 million images, creating a dataset on which image-recognition systems can learn to identify objects with surprising accuracy. ImageNet has been a success ever since: the project not only helped build the hype around AI but also ushered in new technologies such as automated vehicles and advanced smartphone cameras.

However, things changed as the years passed. Algorithms trained on the data learned its biases. For instance, an algorithm assumes programmers are white men because the pool of images labeled “programmer” consists mostly of white men. Excavating AI, a recent viral web project, revealed prejudices in the labels added to ImageNet, including racial slurs such as “negro” and “gook.” It found that these exist because the people labeling the images sometimes attached a derogatory term to a label like “teacher” or “woman.”
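The skew described above can be detected mechanically: if one demographic group dominates the examples behind a label, a model trained on them will absorb that association. A minimal sketch, using hypothetical annotation records (the field names and values are illustrative, not ImageNet's actual schema):

```python
from collections import Counter

# Hypothetical annotations: (label, annotated demographic attribute).
# These values are illustrative placeholders, not real ImageNet metadata.
annotations = [
    ("programmer", "white_man"), ("programmer", "white_man"),
    ("programmer", "white_man"), ("programmer", "woman"),
    ("teacher", "woman"), ("teacher", "man"),
]

def label_skew(records, label):
    """Return each demographic group's share of the images carrying a label."""
    counts = Counter(group for lbl, group in records if lbl == label)
    total = sum(counts.values())
    return {group: n / total for group, n in counts.items()}

# Three of the four "programmer" images show white men, a 0.75 share:
skew = label_skew(annotations, "programmer")
print(skew)
```

An auditing pass like this, run over every label, is one way to surface which categories need more diverse examples before a model is trained on them.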

The researchers therefore took steps to address the bias. According to Wired, a monthly American magazine that focuses on how emerging technologies affect culture, the economy, and more, the team analyzed their data and used crowdsourcing to identify and remove derogatory words. They are also developing a tool, to be released in the coming months, that would gather a more diverse pool of images, with greater diversity in gender, race, and age, for training AI algorithms.
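The two steps described above, stripping labels flagged as derogatory and rebalancing what remains across demographic groups, can be sketched as a simple filter over image records. The blocklist, record format, and group tags here are assumptions for illustration, not the team's actual tooling:

```python
import random

# Hypothetical crowdsourced blocklist of derogatory label terms (placeholders).
BLOCKLIST = {"slur_a", "slur_b"}

# Hypothetical dataset records: image file, label terms, demographic group tag.
dataset = [
    {"image": "img1.jpg", "labels": ["teacher", "slur_a"], "group": "a"},
    {"image": "img2.jpg", "labels": ["teacher"], "group": "b"},
    {"image": "img3.jpg", "labels": ["teacher"], "group": "a"},
    {"image": "img4.jpg", "labels": ["teacher"], "group": "a"},
]

def clean_and_balance(records, blocklist, seed=0):
    """Strip blocklisted label terms, then downsample so every
    demographic group contributes the same number of images."""
    cleaned = [
        {**r, "labels": [l for l in r["labels"] if l not in blocklist]}
        for r in records
    ]
    by_group = {}
    for r in cleaned:
        by_group.setdefault(r["group"], []).append(r)
    floor = min(len(v) for v in by_group.values())  # size of smallest group
    rng = random.Random(seed)
    balanced = []
    for group_records in by_group.values():
        balanced.extend(rng.sample(group_records, floor))
    return balanced

balanced = clean_and_balance(dataset, BLOCKLIST)
```

Note that downsampling to the smallest group shrinks the toy set from four records to two, which is exactly the trade-off Barbu's quote below warns about: removing bias by balancing can leave very little data.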

This shows that while AI can learn bias, its training data can still be reengineered from the ground up to produce fairer results. It also highlights how dependent AI is on human-supplied training data. Andrei Barbu, a research scientist at MIT who has studied ImageNet, called the work an admirable effort while noting its difficulty. “Creating a data set that lacks certain biases very quickly slices up your data into such small pieces that hardly anything is left,” he said.