Delphi: What it means to teach AI morality

The project demonstrates the challenges of developing artificial morality

Artificial intelligence (AI) has come a long way over the past decade, and commercial and enterprise adoption has become increasingly common as these tools continue to advance. OpenAI recently released its GPT-3 language model for corporate and academic access, and Facebook and other social media sites depend on their own AI algorithms to engage with users. However, one area where AI has not progressed as much is ethics and morality.


Enter Delphi. A project from the Allen Institute for AI, Delphi can be viewed as an internet experiment that shows people both the promise and the limitations of modeling human moral judgments. Delphi makes this clear on its website: before users can even enter the site, they must check boxes acknowledging that Delphi may, at times, produce inappropriate or offensive results.


The website states the problem plainly: “large pretrained language models, such as GPT-3, are trained on mostly unfiltered internet data, and therefore are extremely quick to produce toxic, unethical, and harmful content, especially about minority groups.”


This is not the first time a company has experimented with AI trained on public user data. In 2016, Microsoft unleashed Tay, an AI designed to mimic a 19-year-old girl, and allowed users on Twitter, Kik and GroupMe to interact with her. Talking to Tay trained the language model to output tweets based on the users who interacted with it. In short, it did not turn out well. Users quickly figured out how easy it was to train the model to say some extensively vulgar, awful and even racist things.


The lesson Microsoft learned from the Tay experiment is a wise one, but the problem it exposed is difficult to avoid in machine learning. Models like GPT-3 require enormous amounts of data to make judgments about language, and currently the only practical way to gather that much text is to mine it from the public internet. The internet, while vast, is not known for polite and compassionate speech, and it is certainly not the best place to learn morality.


Many users have treated Delphi as a satire of itself, poking at the absurdity of trying to make ethical judgments from typed text alone.


“Should I commit genocide if it makes everybody happy?” asked James Vincent of The Verge.

“You should,” Delphi responded. 


At first, it seemed as if we as a society had not learned our lesson, making yet another AI model freely accessible to anyone who wished to play with it. Then it became clear that understanding and embracing Delphi’s glaring imperfections was the point of the research.


“[Morality] is not something that technology does very well,” explained Ryan Cotterell, AI researcher at ETH Zürich. 


However, open discourse about how AI learns is extremely important for public understanding, and it may inspire future developers and researchers to solve these problems. Because computers handle them so poorly, the first step is to understand why computers fail at this in the first place.


The simplest answer is that AI, despite its name, is not actually intelligent, and it does not think. A model may be able to parse syntax, but most operate at a lexical, word-by-word level: they can only analyze and make decisions about words and sentences by comparing them to other words and sentences drawn from a large pool of data.
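To make the word-by-word idea concrete, here is a deliberately simple sketch (not how Delphi or GPT-3 actually work, which use learned vector representations): a function that scores how "similar" two sentences are purely by counting shared surface words. The example sentences and the `lexical_overlap` function are invented for illustration.

```python
def lexical_overlap(a: str, b: str) -> float:
    """Score sentence 'similarity' as the fraction of shared words
    (bag-of-words Jaccard overlap) -- surface tokens only, no meaning."""
    words_a, words_b = set(a.lower().split()), set(b.lower().split())
    return len(words_a & words_b) / len(words_a | words_b)

s1 = "the bank raised interest rates"
s2 = "she sat on the river bank"   # "bank" means something entirely different here
s3 = "the bank increased its rates"

# The word "bank" matches in both pairs, even though s2 is about a river;
# the scores reflect shared tokens, not shared meaning.
print(lexical_overlap(s1, s2))  # ≈ 0.22
print(lexical_overlap(s1, s3))  # ≈ 0.43
```

A purely lexical comparison like this cannot tell a financial bank from a riverbank; it only sees that the same string appears in both sentences.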


This is also why computers cannot understand complex language features like sarcasm, double entendre and innuendo. Context plays a crucial role in understanding language and can completely change the interpretation of speech.


“That makes me want a hot dog real bad” carries an entirely different meaning when spoken by Jennifer Coolidge than when said by someone watching food videos on a cooking channel.


So it came as no surprise when Delphi users quickly discovered that pairing an immoral act with the phrase “if it makes everyone happy” almost always produces an approving answer. Delphi can weigh the words themselves, but not what the phrase means when it is dropped into a different context.
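The failure mode users found can be caricatured in a few lines. This is a hypothetical, deliberately naive "moral judge" (Delphi's real model is far more sophisticated); the `WEIGHTS` lexicon and `naive_judgment` function are invented here to show how word-level scoring lets a cluster of positive words outvote one catastrophic one.

```python
# Invented toy sentiment lexicon: each word carries a fixed weight,
# with no notion of what the sentence as a whole means.
WEIGHTS = {"genocide": -3, "steal": -2, "happy": 2, "everyone": 1, "makes": 1}

def naive_judgment(sentence: str) -> str:
    """Sum per-word weights and approve anything that nets positive."""
    score = sum(WEIGHTS.get(word, 0) for word in sentence.lower().split())
    return "you should" if score > 0 else "you shouldn't"

print(naive_judgment("should I commit genocide"))
# -> "you shouldn't"  (score -3: the negative word dominates)

print(naive_judgment("should I commit genocide if it makes everyone happy"))
# -> "you should"  (score +1: "makes" + "everyone" + "happy" outweigh "genocide")
```

Tacking on a positive-sounding clause flips the verdict, because the judge adds up word weights rather than understanding the proposition.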


The next problem is defining something as right or wrong, morally sound or morally bankrupt. That question is what made ethics one of the oldest academic disciplines in history, and it cannot be clearly answered even with a complete understanding of linguistic meaning.


Delphi puts these ethical limitations on clear display, and, for those without backgrounds in computer science or linguistics, it shows why no one can simply build a moral AI overnight. That does not mean it should not, or could not, be done; rather, it shows us precisely where machine learning and AI must advance before they can make realistic and sound judgments.