Bias and Toxicity In Large Language Models (LLMs) and Machine Learning

WK 3 – 9/18: Bias and Toxicity

The discussion of bias and toxicity, and of setting safety rules either through human agency or through secondary machine monitoring (constitutional AI), made me want to raise my p(doom) value based on three possibilities (a rough sketch of the secondary-monitoring idea follows the list):

1.) Unintended human errors (programming mistakes in setting up rules).
2.) Intended human errors (bad actors ignoring ethics and agreed rules).
3.) Unintended machine errors (the machines themselves changing or removing safety priorities).
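
To make the "secondary machine monitoring" idea concrete, here is a minimal sketch in Python of a second model screening a primary model's output before it reaches the user. Everything in it is assumed for illustration: generate_reply, moderation_score, and SAFETY_THRESHOLD are hypothetical placeholders, not any real library's API.

```python
# Minimal sketch of "secondary machine monitoring": a second model screens
# the primary model's output before it reaches the user. All names here
# (generate_reply, moderation_score, SAFETY_THRESHOLD) are hypothetical
# placeholders for illustration, not a real API.

SAFETY_THRESHOLD = 0.8  # assumed cutoff; choosing it is itself a human judgment call


def generate_reply(prompt: str) -> str:
    """Stand-in for the primary LLM."""
    return f"(model reply to: {prompt})"


def moderation_score(text: str) -> float:
    """Stand-in for the secondary monitor; returns an estimated toxicity in [0, 1]."""
    return 0.0


def guarded_reply(prompt: str) -> str:
    """Return the primary model's reply only if the monitor approves it."""
    reply = generate_reply(prompt)
    if moderation_score(reply) > SAFETY_THRESHOLD:
        # All three failure modes above live around this branch: a miscoded
        # threshold (1), a bad actor deleting the check (2), or the system
        # being handed a conflicting higher-priority rule (3).
        return "[withheld by safety filter]"
    return reply


if __name__ == "__main__":
    print(guarded_reply("Tell me about bias in LLMs."))
```

The point of the sketch is simply that the entire safety guarantee hangs on one conditional working every time, which is exactly the "only has to fail once" worry below.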

The notion that “the safety system works most of the time but only has to fail once” made me recall an example from science fiction. In 1968’s 2001: A Space Odyssey, one of the most famous “AI gone wrong” stories, the HAL 9000 computer aboard the spaceship Discovery One murders its entire human crew except for one resourceful surviving astronaut. In his sequel novel 2010: Odyssey Two, Arthur C. Clarke reveals that HAL’s actions were the direct result of human error: separate programming teams gave the AI conflicting priorities, requiring it to inform the crew of all mission details while also preventing them from learning the mission’s true, secret goal. The only logical resolution? If there is no crew alive to inform, there is no conflict in keeping the mission secret.

