Desiderata That Constitute Interpretability
Interpretability has no single, agreed-upon technical definition
Interpretability is needed when the formal objectives of a model diverge from the real-world demands of its deployment
EX: How do we judge whether a model's hiring decisions are ethical, when ethics is not part of its training objective?
Causality
- important; we hope interpretations reveal causal relationships between inputs and outputs, though supervised models strictly learn only associative ones
Transferability
- transferring learned skills from the training data to the test environment or the real world
Informativeness
- provides extra information to support human decision making
- does not require studying the model's inner workings or behavior
Properties of Interpretable Models
Transparency
- if we can contemplate the entire model at once to understand its inner workings
- in reasonable time, a human can step through every calculation required to produce a prediction
- sparse linear models, like lasso regression, are more interpretable: with fewer nonzero features, a human can trace direct relationships between inputs and outputs (see the lasso sketch after this list)
- decision trees can be less interpretable as a whole: the number of nodes can grow much faster than the inference time, i.e. the length of the single root-to-leaf path followed to produce a prediction (see the node-count sketch below), so simulating the full tree becomes infeasible even when any one prediction is easy to trace
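
A minimal sketch of the lasso point above, assuming scikit-learn and synthetic data (the feature count, alpha, and dataset size are illustrative choices, not from the notes): the L1 penalty drives most coefficients to exactly zero, leaving a model small enough to simulate by hand.

```python
# Illustrative sketch: lasso sparsity aids simulatability.
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))  # 10 candidate features (arbitrary choice)
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=0.1, size=200)

model = Lasso(alpha=0.1).fit(X, y)

# Most coefficients are driven to exactly zero, so a human can read off
# the handful of input-output relationships that remain.
for i, w in enumerate(model.coef_):
    if w != 0.0:
        print(f"feature {i}: weight {w:.2f}")

# Simulating a prediction by hand is just a short weighted sum:
x_new = X[0]
manual = model.intercept_ + sum(w * v for w, v in zip(model.coef_, x_new))
print("manual:", manual)
print("model: ", model.predict(x_new.reshape(1, -1))[0])
```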
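And a companion sketch for the decision-tree point, again using scikit-learn on made-up data: a single prediction walks only one root-to-leaf path (at most `max_depth` comparisons), but simulating the whole model means holding every node in mind, and the node count grows much faster than the path length.

```python
# Illustrative sketch: tree node count vs. inference path length.
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(5000, 20))
y = (X[:, :5].sum(axis=1) > 0).astype(int)  # label depends on 5 features

tree = DecisionTreeClassifier(random_state=0).fit(X, y)

# One prediction only traverses a single root-to-leaf path.
print("inference path length (max):", tree.tree_.max_depth)

# But contemplating the *entire* model requires every node, and the node
# count can dwarf the depth of any single path.
print("total nodes:", tree.tree_.node_count)
```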