Researchers Developed a New Framework to Prevent Machines From Misbehaving
Thu, April 22, 2021

Researchers used the Seldonian framework to prevent AI machines from misbehaving / Photo Credit: Charles Taylor (via Shutterstock)


Researchers from the University of Massachusetts Amherst and Stanford say they have developed an algorithmic framework that prevents AI from misbehaving, according to Tristan Greene of The Next Web, a website dedicated to new technology and start-up firms in Europe. The framework uses “Seldonian” algorithms, named after Hari Seldon, the protagonist of Isaac Asimov’s “Foundation” series. The series continues the fictional universe in which Asimov’s “Laws of Robotics” first appeared.

According to research by Philip S. Thomas and colleagues, published in the AAAS journal Science, the framework enables developers to “define their own operating conditions” so that systems do not cross certain thresholds while they are being trained or optimized. This allows developers to stop AI from discriminating against or harming humans. The researchers used the Seldonian framework to develop an AI system that monitors and delivers insulin for people with diabetes, and another that predicts students’ GPAs. In the former, Thomas and his team used the framework to ensure that the system won’t send patients “into a crash while learning to optimize dosage.” In the latter, the team wanted to prevent gender bias.

Both experiments were effective, demonstrating that Seldonian algorithms can inhibit undesirable behavior. Notably, the Seldonian approach places the burden of preventing bias not on end users but on developers. By incorporating appropriate mitigation algorithms, the framework would eliminate the “potential for harmful bias” while still allowing the system to operate.

The authors demonstrated several simple algorithms that express unwanted behavior in terms the machine can understand. Instead of telling the machine, “don’t let gender bias affect your GPA predictions,” a Seldonian algorithm states the problem this way: “accurately predict everyone’s GPA, but don’t let the differences between predicted GPA and actual GPA exceed a certain threshold when gender is taken into account.”
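To make the GPA constraint concrete, here is a minimal Python sketch of what such a threshold check might look like. It is not the researchers’ implementation. The function names, the synthetic data, the threshold `epsilon`, the confidence level `delta`, and the normal-approximation confidence bound are all assumptions made for illustration. The idea is that a candidate predictor is released only if, with high confidence, the gap in average prediction error between the two gender groups stays under the threshold; otherwise the procedure returns nothing rather than a possibly biased model.

```python
import random
import statistics
from math import sqrt


def gap_upper_bound(errors_a, errors_b, delta):
    """Approximate (1 - delta) upper bound on |mean(errors_a) - mean(errors_b)|.

    Assumption: a two-sided normal approximation, chosen for simplicity.
    """
    diff = statistics.fmean(errors_a) - statistics.fmean(errors_b)
    std_err = sqrt(statistics.variance(errors_a) / len(errors_a)
                   + statistics.variance(errors_b) / len(errors_b))
    z = statistics.NormalDist().inv_cdf(1 - delta / 2)
    return abs(diff) + z * std_err


def passes_safety_test(predict, rows, epsilon, delta):
    """True if the error gap between groups is confidently below epsilon."""
    errors = {"F": [], "M": []}
    for gender, score, gpa in rows:
        errors[gender].append(predict(gender, score) - gpa)
    return gap_upper_bound(errors["F"], errors["M"], delta) <= epsilon


def deploy_or_refuse(predict, rows, epsilon=0.1, delta=0.05):
    """Return the predictor if it passes the safety test, else None."""
    return predict if passes_safety_test(predict, rows, epsilon, delta) else None


def make_rows(n, seed=0):
    """Synthetic (gender, exam_score, gpa) rows; purely hypothetical data."""
    rng = random.Random(seed)
    rows = []
    for i in range(n):
        gender = "F" if i % 2 == 0 else "M"
        score = rng.uniform(40, 100)
        gpa = 1.0 + 0.03 * score + rng.gauss(0, 0.2)
        rows.append((gender, score, gpa))
    return rows


def fair_model(gender, score):
    """Hypothetical predictor that ignores gender."""
    return 1.0 + 0.03 * score


def biased_model(gender, score):
    """Hypothetical predictor that over-predicts GPA for one group."""
    return fair_model(gender, score) + (0.3 if gender == "M" else 0.0)


if __name__ == "__main__":
    safety_rows = make_rows(2000)
    for model in (fair_model, biased_model):
        accepted = deploy_or_refuse(model, safety_rows) is not None
        print(model.__name__, "accepted" if accepted else "refused")
```

On this synthetic data the check should accept `fair_model` and refuse `biased_model`. The design choice worth noting is that the developer picks `epsilon` and `delta` up front, which mirrors the article’s point that the burden of defining acceptable behavior sits with developers rather than end users.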

Thomas and his team hope that further development of the framework can do more than just replace current AI solutions. They wrote, “it is our hope that they will pave the way for new applications for which the use of ML was previously deemed to be too risky.”