Bias and Toxicity in Large Language Models (LLMs) and Machine Learning
WK 3 – 9/18: Bias and Toxicity
- Slack #bias-toxicity: https://78000largelan-ive6005.slack.com/archives/C05NPRA04PN
- Group’s Google Doc: https://docs.google.com/document/d/1AkTf398mIQSr6insIbxzogT3TKifbU0MOgs50ZhNQtM/edit#heading=h.judzwu764h79
- McSweeney Notes: https://docs.google.com/document/d/1gks5aEmxaoOVqA28eEx6XaAUpS5CBoDXWmI5NQa9JzM/edit
The discussion of bias and toxicity (and of setting safety rules either through human agency or through secondary machine monitoring/Constitutional AI) made me want to raise my p(doom) value, based on three possibilities:
1.) Unintended human errors (programming mistakes in setting up rules).
2.) Intentional human errors (bad actors ignoring ethics and agreed-upon rules).
3.) Unintended machine errors (safety priorities changed or removed by the machines themselves).
The notion that “the safety system works most of the time but only has to fail once” made me recall an example from science fiction. In 1968’s 2001: A Space Odyssey, one of the most famous “AI gone wrong” stories, the shipboard computer HAL 9000 murders the entire human crew of the spaceship Discovery One except for one resourceful surviving astronaut. Arthur C. Clarke reveals in his sequel novel, 2010: Odyssey Two, that HAL’s actions were the direct result of human error: separate programming teams gave the AI conflicting priorities, instructing it to inform the crew of all mission details while also preventing them from learning the mission’s true, secret goal. The only logical resolution? If there is no crew alive to inform, there is no conflict in keeping the true mission secret.
Labs for DATA 78000: Large Language Models and ChatGPT
Mondays 6:30p, Room 5417, CUNY Graduate Center, New York, NY
Instructor: Michelle McSweeney, michelleamcsweeney.com
Course Site: https://github.com/michellejm/LLMs-fall-23
Link to this post: https://tinyurl.com/46ykcr68
Prompt Engineering Lab – Stanton for 78000
https://colab.research.google.com/drive/1qWsqeooxflEDoIw5kaTOu3iv7A2aC9zd
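Not part of the lab notebook, but the core move in prompt engineering is shaping the text you hand the model. A minimal few-shot classification prompt (with example reviews I made up) looks something like this:

from typing import List, Tuple

# Invented few-shot examples for illustration; the lab uses its own prompts.
few_shot_examples: List[Tuple[str, str]] = [
    ("The staff were friendly and the room was spotless.", "positive"),
    ("Waited an hour and the food arrived cold.", "negative"),
]

def build_prompt(new_review: str) -> str:
    """Assemble a few-shot sentiment-classification prompt for an LLM."""
    lines = ["Classify each review as positive or negative.", ""]
    for text, label in few_shot_examples:
        lines.append(f"Review: {text}")
        lines.append(f"Sentiment: {label}")
        lines.append("")
    lines.append(f"Review: {new_review}")
    lines.append("Sentiment:")
    return "\n".join(lines)

print(build_prompt("The plot dragged, but the acting was superb."))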
Ngrams Lab – Stanton
https://colab.research.google.com/drive/1HebbqSpe5WXT45j9Oh1y7vOfHk6RO_nw
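The n-grams idea in a nutshell, as a tiny pure-Python sketch on a made-up sentence rather than the lab's data:

from collections import Counter

# Toy corpus (invented for illustration).
text = "the cat sat on the mat and the cat slept"
tokens = text.split()

# Build bigrams: every pair of adjacent tokens.
bigrams = list(zip(tokens, tokens[1:]))
counts = Counter(bigrams)

# Most frequent bigrams, e.g. ('the', 'cat') appears twice.
print(counts.most_common(3))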
Word Vectors Lab – Stanton
https://colab.research.google.com/drive/1B2Qy5AzfZEp_wF34yW4Z8S82lF6LtFbT
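For orientation, training word vectors usually looks roughly like this gensim sketch. The toy corpus is far too small to give meaningful neighbors; it only shows the shape of the API:

from gensim.models import Word2Vec

# Tiny invented corpus; the lab trains on real text.
sentences = [
    ["the", "queen", "ruled", "the", "kingdom"],
    ["the", "king", "ruled", "the", "kingdom"],
    ["the", "cat", "sat", "on", "the", "mat"],
]

# Train a small Word2Vec model (vector_size and window chosen arbitrarily).
model = Word2Vec(sentences, vector_size=50, window=2, min_count=1, epochs=100)

# Words that appear in similar contexts end up with similar vectors.
print(model.wv.most_similar("king", topn=3))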
Tokenization Lab – Stanton for 78000
https://colab.research.google.com/drive/1YXfrKuSNtG1HuWTiQ_-Qh87ru276Bwuu
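A quick illustration of subword tokenization using Hugging Face's BERT tokenizer (the lab may use a different model; this is just an example):

from transformers import AutoTokenizer

# BERT's WordPiece tokenizer, chosen here only for illustration.
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

text = "Tokenization splits unfamiliar words into subword pieces."
print(tokenizer.tokenize(text))   # e.g. ['token', '##ization', 'splits', ...]
print(tokenizer.encode(text))     # integer IDs, with [CLS]/[SEP] added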
BERT Sentiment Via Huggingface – Stanton for 78000
https://colab.research.google.com/drive/1OXrTaE6Ot5CCdpjjIKnl5jE4Rp5CD9W5
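The Hugging Face pipeline version of BERT-style sentiment analysis fits in a few lines. The model named below is a common default and my own assumption, not necessarily the one the lab loads:

from transformers import pipeline

# Pinning a specific model keeps results reproducible across runs.
classifier = pipeline(
    "sentiment-analysis",
    model="distilbert-base-uncased-finetuned-sst-2-english",
)

print(classifier(["I loved this lab.", "This model keeps insulting me."]))
# -> [{'label': 'POSITIVE', 'score': ...}, {'label': 'NEGATIVE', 'score': ...}]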
Fine Tune LLaMa – Stanton for 78000
https://colab.research.google.com/drive/1aAPu6seGLfQAymM-j5-87JFiW2jR8HK-
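As rough orientation only: parameter-efficient fine-tuning with LoRA (via the peft library) typically looks like the sketch below. The model name and LoRA settings are placeholders, not the lab's actual choices, and Llama 2 weights require accepting Meta's license.

from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

# Placeholder model; swap in whichever checkpoint you have access to.
model_name = "meta-llama/Llama-2-7b-hf"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# LoRA: freeze the base weights and learn small low-rank adapter matrices.
lora_config = LoraConfig(
    r=8,                                  # rank of the adapter matrices
    lora_alpha=32,                        # scaling factor
    target_modules=["q_proj", "v_proj"],  # attention projections to adapt
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # a tiny fraction of the 7B parameters

# From here, training proceeds with the usual dataset / Trainer setup.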