AI Is a Black Box. Anthropic Figured Out a Way to Look Inside

Summary

AI researcher Chris Olah has been focused on understanding artificial neural networks
An Anthropic team is working on reverse-engineering large language models to understand their inner workings
Researchers have identified artificial neurons that signify specific concepts or “features”
Manipulating these features can make LLMs safer and more powerful in selected areas
