This paper seeks a more rigorous process for defining interpretability.

Why Interpretability

A Taxonomy of Interpretability Evaluation

Application-grounded Evaluation: Real humans, real tasks

Human-grounded Metrics: Real humans, simplified tasks

Functionally-grounded Evaluation: No humans, proxy tasks