Accurate, Focused Research on Law, Technology and Knowledge Discovery Since 2002

Chris Olah on what the hell is going on inside neural networks

80,000 Hours: “Big machine learning models can identify plant species better than any human, write passable essays, beat you at a game of Starcraft 2, figure out how a photo of Tobey Maguire and the word ‘spider’ are related, solve the 60-year-old ‘protein folding problem’, diagnose some diseases, play romantic matchmaker, write solid computer code, and offer questionable legal advice. Humanity made these amazing and ever-improving tools. So how do our creations work? In short: we don’t know. Today’s guest, Chris Olah, finds this both absurd and unacceptable. Over the last ten years he has been a leader in the effort to unravel what’s really going on inside these black boxes. As part of that effort he helped create the famous DeepDream visualisations at Google Brain, reverse engineered the CLIP image classifier at OpenAI, and is now continuing his work at Anthropic, a new $100 million research company that tries to ‘co-develop the latest safety techniques alongside scaling of large ML models’. Despite having a huge fan base thanks to his tweets and lay explanations of ML, today’s episode is the first long interview Chris has ever given. It features his personal take on what we’ve learned so far about what ML algorithms are doing, and what’s next for this research agenda at Anthropic. His decade of work has borne substantial fruit, producing an approach for looking inside the mess of connections in a neural network and backing out what functional role each piece is serving. Among other things, Chris and team found that every visual classifier seems to converge on a number of simple common elements in their early layers, elements so fundamental they may exist in our own visual cortex in some form…”
