A few months ago, Google stepped firmly into the middle of the US culture wars. Its generative AI model “Gemini” was accused of being too “woke.” When users asked Gemini to generate certain images, its output was counterfactual to say the least: a black man and an Asian woman depicted in the German army in World War II, and a black man as one of the US Founding Fathers.
The algorithm’s output enraged many on the American political right, and even Ben Thompson, author of the newsletter “Stratechery,” one of Silicon Valley’s most widely read publications. In his February 26 newsletter, “Gemini and Google’s Culture,” Thompson called for the removal of Google CEO Sundar Pichai:
Google, quite clearly, needs a similar transformation: the point of the company ought not be to tell users what to think, but to help them make important decisions, as Page once promised. That means, first and foremost, excising the company of employees attracted to Google’s power and its potential to help them execute their political program, and return decision-making to those who actually want to make a good product. That, by extension, must mean removing those who let the former run amok, up to and including CEO Sundar Pichai.
From this observer’s perspective, the concern over Gemini’s output has to do not only with the political nature of the content but also with the immense power Google has over our society. Because the world runs on Google, the decisions Google makes about the output of its algorithms have global consequences.
Although interrogating Google’s market power and values is certainly appropriate, the commentary has glossed over a crucial point: not all biases in a model’s training data are bad. Under the false assumption that “all bias is bad,” Google stripped out information from Gemini’s training data that the algorithm needed to deliver accurate output.
All algorithms, whether they are Large Language Models at the heart of products like Gemini or simpler machine learning models, are built to predict the future based on the past. Algorithms can tell you what is and is not a hot dog; they can also tell you, as Gemini does, what the next word in a sequence will be based on the previous words.
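To make that concrete, here is a deliberately toy sketch in Python – nothing like Gemini’s actual architecture, and the corpus is invented for illustration – of a bigram model that counts which word most often followed each word in past text and uses those counts to guess the next word. The principle, predicting what comes next from patterns in past data, is the same.

```python
# Toy next-word predictor: count which word most often followed each word
# in a small, made-up training corpus, then predict from those counts.
from collections import Counter, defaultdict

corpus = "the model predicts the next word the model learns from the past".split()

# For every word, tally the words that followed it in the training data.
following = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    following[prev][nxt] += 1

def predict_next(word):
    """Return the word that most often followed `word` in the training data."""
    counts = following.get(word)
    return counts.most_common(1)[0][0] if counts else None

print(predict_next("the"))  # -> "model", the most frequent continuation seen in the past
```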
The past we have lived through, however, oftentimes reflects bias that we do not want to replicate in the future. Take the Amazon hiring algorithm from 2018 that was built to predict which candidates would be successful based on successful employees from the past. That algorithm was taken down because its output was biased to privilege white men, the candidates who had been the most successful at Amazon until that time. The bias of the algorithm’s underlying data did not reflect the diverse workforce that Amazon wanted to build in the future.
That said, the data sets on which we train algorithms are always going to be biased – the data was created and curated by humans, who have all sorts of unconscious biases. Moreover, it is necessary to maintain some of the bias in an underlying data set so that the machine can deliver accurate output. In the case of a hiring algorithm, programmers who want to build a system that gives a fair shot to all worthy applicants would want to eliminate gender bias but keep the bias toward candidates with strong coding skills or credentials. The key is not eliminating all bias in the data – it is eliminating the wrong biases.
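Here is a minimal, hypothetical sketch of that choice, assuming pandas and scikit-learn; the data set and column names are invented for illustration. The protected attribute is dropped from the features, while the signals the team does want the model to favor stay in.

```python
import pandas as pd
from sklearn.linear_model import LogisticRegression

# Hypothetical historical hiring data, invented for illustration.
applicants = pd.DataFrame({
    "gender":            ["m", "m", "f", "f", "m", "f"],
    "coding_test_score": [88, 62, 91, 70, 75, 85],
    "years_experience":  [5, 2, 6, 3, 4, 7],
    "hired":             [1, 0, 1, 0, 1, 1],
})

# The "wrong" bias: drop the protected attribute so the model cannot use it directly.
X = applicants.drop(columns=["gender", "hired"])

# The "right" bias stays: coding scores and experience remain as features,
# so stronger candidates get higher predictions.
y = applicants["hired"]

model = LogisticRegression().fit(X, y)
print(model.predict(pd.DataFrame({"coding_test_score": [90], "years_experience": [5]})))
```

In practice, simply dropping a protected column is rarely enough, because other features can act as proxies for it, but the sketch captures the basic idea of deciding which biases to keep.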
Which brings us back to Google. Under the logic that “all bias is bad,” programmers stripped out so much bias that it distorted the model’s output. Data about gender and race are highly correlated with who was and was not a German soldier or a Founding Father. Whether these biases were morally “right” for Germans during WWII and Americans during the early Republic is a matter for historians to debate. However, when it comes to building algorithms, the correlation is clear and should have been reflected in the model’s output.
The story of Gemini’s false start presents a lesson for technologists trying to build ethical products: which biases you keep or eliminate in your model depends on the model’s objective function (i.e., what it is deciding) and the context in which it is deployed.
If you are building a model to help inform decisions made in the future – creditworthiness, recidivism rates, and hiring potential among them – you want to eliminate the underlying bias that would cause decisions based on that model to compound the injustices of the past. However, if you are building a model to identify who is and is not a German soldier, including the biases of the historical period is necessary for the model to be accurate.
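As a rough, hypothetical sketch of that distinction – every name and field below is invented for illustration – the same preprocessing step can be configured differently depending on what the model is deciding and where it is deployed:

```python
from dataclasses import dataclass

@dataclass
class BiasPolicy:
    task: str                        # what the model is deciding
    drop_protected_attributes: bool  # remove demographic features before training?

# Forward-looking decisions: do not let the model compound past injustices.
hiring_policy = BiasPolicy(task="predict hiring potential", drop_protected_attributes=True)

# Historical description: demographic patterns of the period are part of what
# makes the output accurate, so they stay in.
history_policy = BiasPolicy(task="identify WWII German soldiers", drop_protected_attributes=False)

def prepare_features(columns, policy):
    """Keep or drop demographic columns according to the policy."""
    protected = {"gender", "race"}
    if policy.drop_protected_attributes:
        return [c for c in columns if c not in protected]
    return columns

print(prepare_features(["gender", "race", "coding_test_score"], hiring_policy))
# -> ['coding_test_score']
print(prepare_features(["gender", "race", "uniform", "year"], history_policy))
# -> ['gender', 'race', 'uniform', 'year']
```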