New Facebook AI research proposes rethinking the role of class-selective neurons in deep neural networks (DNNs), arguing that DNNs can perform well, and perhaps even better, without them. The researchers say the findings suggest that overreliance on such easy-to-interpret neurons and on intuition-based methods for understanding DNNs can be misleading.
DNNs are built with multiple layers between the input and output layers and are often used to model complex non-linear relationships. Although the machine learning community has achieved state-of-the-art (SOTA) results with DNNs, understanding what these networks actually do remains a challenge. Such interpretability concerns have given rise to the “black box” stigma that continues to impede AI deployment.
In a blog post, the Facebook AI researchers identify class selectivity (how differently individual neurons respond across classes of stimuli or data samples) as a common analytical approach for “understanding” DNNs. They propose, however, that selectivity may impair rather than improve DNN function and could even make networks more susceptible to randomly distorted inputs.
The team says selectivity is currently widely used in part because it is intuitive and easy to understand in human terms, and also because these kinds of interpretable neurons do naturally emerge in networks trained on a variety of tasks. For example, DNNs trained to classify many different kinds of images contain individual neurons that activate most strongly for Labrador retrievers, and DNNs trained to predict individual letters in product reviews contain neurons selective for positive or negative sentiment.
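To make the idea of a class-selective neuron concrete, here is a minimal sketch of one common way to score a single neuron's selectivity. The blog summary above does not give a formula, so the index used here, (mu_max - mu_rest) / (mu_max + mu_rest), is an assumption, as are the function name `class_selectivity_index` and the small constant `eps`. It assumes non-negative activations (for example, after a ReLU).

```python
from collections import defaultdict


def class_selectivity_index(activations, labels, eps=1e-7):
    """Score how class-selective one neuron is (hypothetical helper).

    activations: non-negative activation of the neuron, one per sample
    labels:      class label for each sample
    Returns a value near 0 when the neuron responds equally to all
    classes and near 1 when it responds to a single class only.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for a, y in zip(activations, labels):
        sums[y] += a
        counts[y] += 1

    # Mean activation of the neuron for each class.
    class_means = [sums[c] / counts[c] for c in sums]
    if len(class_means) < 2:
        raise ValueError("need samples from at least two classes")

    # Largest class-conditional mean vs. the mean of all the others.
    ordered = sorted(class_means)
    mu_max = ordered[-1]
    rest = ordered[:-1]
    mu_rest = sum(rest) / len(rest)

    # eps (an assumption) avoids division by zero for silent neurons.
    return (mu_max - mu_rest) / (mu_max + mu_rest + eps)


# A neuron that fires only for class 0 is highly selective.
selective = class_selectivity_index([1, 1, 0, 0, 0, 0], [0, 0, 1, 1, 2, 2])
# A neuron that fires equally for every class is not selective.
uniform = class_selectivity_index([1, 1, 1, 1], [0, 0, 1, 1])
```

Under this definition, a "Labrador retriever neuron" would score close to 1, while a neuron that responds equally to every class would score close to 0.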
The researchers tackled the question of whether such easy-to-interpret neurons are actually necessary by training an image classification network to improve accuracy as usual, while also adding an incentive to decrease or increase the amount of class selectivity in its neurons. They controlled the importance of class selectivity to the network using a single parameter that determines whether interpretable neurons are encouraged or discouraged, and to what degree.
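The single-parameter incentive described above can be sketched as an extra term added to the usual training loss. This is an illustration under stated assumptions, not the researchers' implementation: the names `regularized_loss` and `alpha`, and the sign convention (negative `alpha` discourages selectivity, positive `alpha` encourages it), are hypothetical. In practice the selectivity term would have to be computed inside an automatic-differentiation framework so that gradients flow through it.

```python
import math


def cross_entropy(probs, label):
    """Standard classification loss for one sample."""
    return -math.log(probs[label])


def regularized_loss(task_loss, mean_selectivity, alpha):
    """Task loss plus a selectivity incentive (hypothetical form).

    alpha < 0: selectivity raises the loss, so it is discouraged.
    alpha > 0: selectivity lowers the loss, so it is encouraged.
    alpha = 0: ordinary training, no incentive either way.
    """
    return task_loss - alpha * mean_selectivity


# Same prediction and same average selectivity, three settings of alpha.
task = cross_entropy([0.25, 0.75], label=1)
baseline = regularized_loss(task, mean_selectivity=0.5, alpha=0.0)
discouraged = regularized_loss(task, mean_selectivity=0.5, alpha=-2.0)
encouraged = regularized_loss(task, mean_selectivity=0.5, alpha=2.0)
```

With the task loss held fixed, the setting that discourages selectivity yields a higher loss than the baseline for a selective network, which pushes the optimizer toward less selective neurons; the encouraging setting does the opposite.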
To their surprise, reducing DNNs’ class selectivity had little adverse effect and in some cases even improved performance. Increasing class selectivity meanwhile had a significant negative effect on network performance. The results suggest that, despite its ubiquity across tasks and models, class selectivity is no guarantee that a DNN will function properly, and can even negatively affect its function.
The researchers say the results warn against focusing on the properties of single neurons for understanding DNNs, and they encourage the community to further study the role of class selectivity in this regard.
Facebook AI says the work is part of the company’s broader efforts to further explainability in AI, including open-sourced interpretability tools for ML developers and partnerships with key platforms, and hopes it will help researchers “better understand how complex AI systems work and lead to more robust, reliable, and useful models.”
The work highlights three germane papers: Selectivity Considered Harmful: Evaluating the Causal Impact of Class Selectivity in DNNs (arXiv link), On the Relationship Between Class Selectivity, Dimensionality, and Robustness (arXiv link), and Towards Falsifiable Interpretability Research (arXiv link).
Reporter: Yuan Yuan | Editor: Michael Sarazen