Evaluating Interpretability in Machine Learning Models

How to evaluate something that does not have uniform metrics? Hint: By designing suitable experiments.
Liked Liked