AI Bias and Training Data

An interesting article about training-set bias at Google. It’s mostly about Google Translate, but some of it carries over to search.

AI models fundamentally work the same way across tasks, and they inherit bias from the training sets they learn from. Many of these models are used in search engines.

“There’s a lot of similarities on how language works and how visual recognition works,” said Barak Turovsky, head of Google Translate.

While Turovsky didn’t address racial bias, he did say that all types of bias are inherent in AI. 

Thinking in a different way and questioning assumptions begins with language. The inherent bias in language surfaces when text is translated from English into another language, especially when the system has to assign a gender to the query or to individual words. Russian and Spanish are examples of languages with gendered words.
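To make the gender point concrete, here is a minimal sketch of how a purely frequency-based translator could pick up that kind of bias. It is not how Google Translate works. The training pairs and their counts are invented for illustration, and the `train`, `translate`, and `candidates` helpers are hypothetical names.

```python
from collections import Counter

# Hypothetical toy "training set": (English source, Spanish target) pairs.
# The counts are invented for illustration, not measured from a real corpus.
TOY_PAIRS = (
    [("the doctor", "el doctor")] * 8
    + [("the doctor", "la doctora")] * 2
    + [("the nurse", "la enfermera")] * 9
    + [("the nurse", "el enfermero")] * 1
)


def train(pairs):
    """Count how often each target form was seen for each source phrase."""
    table = {}
    for source, target in pairs:
        table.setdefault(source, Counter())[target] += 1
    return table


def translate(table, source):
    """Return the single most frequent target form seen in training."""
    return table[source].most_common(1)[0][0]


def candidates(table, source):
    """Return every gendered form seen in training, most frequent first."""
    return [target for target, _ in table[source].most_common()]


table = train(TOY_PAIRS)

# The English phrases carry no gender, but the skewed counts decide one anyway.
print(translate(table, "the doctor"))   # el doctor
print(translate(table, "the nurse"))    # la enfermera

# Listing all forms keeps the ambiguity visible instead of hiding it.
print(candidates(table, "the doctor"))  # ['el doctor', 'la doctora']
```

Nothing in the toy model is told to prefer one gender. The preference comes entirely from the skew in the examples it was given, which is the sense in which the bias is inherited from the training data.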

Language is bias, Turovsky said. Some societal and custodial challenges that have existed for thousands of years have now been picked up by AI.

