Summary
- AI researcher Chris Olah has been focused on understanding artificial neural networks
- An Anthropic team is working to reverse-engineer large language models and understand their inner workings
- Researchers have identified combinations of artificial neurons that signify specific concepts, or “features”
- Manipulating these features can make LLMs safer and more powerful in selected areas