Metrics

aix360.metrics.local_metrics.faithfulness_metric(model, x, coefs, base)

This metric evaluates the correlation between the importance that the interpretability algorithm assigns to each attribute and the effect of that attribute on the performance of the predictive model. The higher the importance, the larger the effect should be, and vice versa. The metric evaluates this by incrementally removing each of the attributes deemed important by the interpretability algorithm, measuring the effect on model performance, and then calculating the correlation between the weights (importance) of the attributes and the corresponding model performance. [1]

References

[1]	David Alvarez Melis and Tommi Jaakkola. Towards robust interpretability with self-explaining neural networks. In S. Bengio, H. Wallach, H. Larochelle, K. Grauman, N. Cesa-Bianchi, and R. Garnett, editors, Advances in Neural Information Processing Systems 31, pages 7775-7784. 2018.

Parameters:	model – Trained classifier, such as a ScikitClassifier, that implements predict() and predict_proba() methods.
	x (numpy.ndarray) – row of data.
	coefs (numpy.ndarray) – coefficients (weights) corresponding to attribute importance.
	base (numpy.ndarray) – base (default) values of attributes.
Returns:	correlation between attribute importance weights and corresponding effect on classifier.
Return type:	float
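
The procedure described above can be illustrated with a minimal sketch. This is not the library's implementation: `ToyLogisticModel` and `faithfulness_sketch` are hypothetical names introduced here, and two points are assumptions. First, "removing" an attribute is read as replacing that one attribute with its base value while the others keep their original values. Second, the correlation is negated so that a higher score means a more faithful explanation.

```python
import numpy as np


class ToyLogisticModel:
    """Hypothetical stand-in for a trained classifier with predict_proba()."""

    def __init__(self, weights):
        self.weights = np.asarray(weights, dtype=float)

    def predict_proba(self, X):
        # Two-class logistic model: columns are P(class 0), P(class 1).
        z = np.atleast_2d(X) @ self.weights
        p1 = 1.0 / (1.0 + np.exp(-z))
        return np.column_stack([1.0 - p1, p1])


def faithfulness_sketch(model, x, coefs, base):
    """Illustrative only; assumes per-attribute removal and a negated correlation."""
    # Class predicted for the unmodified row.
    pred_class = int(np.argmax(model.predict_proba(x.reshape(1, -1)), axis=1)[0])

    # Probability of that class after each attribute is set to its base value.
    probs = np.zeros(x.shape[0])
    for i in range(x.shape[0]):
        x_removed = x.astype(float).copy()
        x_removed[i] = base[i]
        probs[i] = model.predict_proba(x_removed.reshape(1, -1))[0, pred_class]

    # Removing an important attribute should lower the probability, so the raw
    # correlation is negative for a faithful explanation; negate it (assumption).
    return float(-np.corrcoef(coefs, probs)[0, 1])


weights = np.array([2.0, 1.0, 0.5])
model = ToyLogisticModel(weights)
x = np.array([1.0, 1.0, 1.0])
base = np.zeros(3)

# The model's own weights serve as the attribute importances here.
score = faithfulness_sketch(model, x, weights, base)
print(score)
```

In this toy setup the importances match the model's true weights, so the score comes out close to 1. Passing the importances in reversed order yields a negative score.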

aix360.metrics.local_metrics.monotonicity_metric(model, x, coefs, base)

This metric measures the effect of individual features on model performance by incrementally adding each attribute in order of increasing importance and evaluating model performance after each addition. As each feature is added, the performance of the model should correspondingly increase, thereby resulting in monotonically increasing model performance. [2]

References

[2]	Ronny Luss, Pin-Yu Chen, Amit Dhurandhar, Prasanna Sattigeri, Karthikeyan Shanmugam, and Chun-Chen Tu. Generating Contrastive Explanations with Monotonic Attribute Functions. CoRR abs/1905.13565. 2019.

Parameters:	model – Trained classifier, such as a ScikitClassifier, that implements predict() and predict_proba() methods.
	x (numpy.ndarray) – row of data.
	coefs (numpy.ndarray) – coefficients (weights) corresponding to attribute importance.
	base (numpy.ndarray) – base (default) values of attributes.
Returns:	True if the relationship is monotonic.
Return type:	bool
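
The monotonicity check described above can be sketched as follows. This is an illustration under stated assumptions, not the library's code: `ToyLogisticModel` and `monotonicity_sketch` are hypothetical names, "performance" is taken to be the predicted probability of the class predicted for the original row, and the sequence counts as monotonic when it never decreases.

```python
import numpy as np


class ToyLogisticModel:
    """Hypothetical stand-in for a trained classifier with predict_proba()."""

    def __init__(self, weights):
        self.weights = np.asarray(weights, dtype=float)

    def predict_proba(self, X):
        # Two-class logistic model: columns are P(class 0), P(class 1).
        z = np.atleast_2d(X) @ self.weights
        p1 = 1.0 / (1.0 + np.exp(-z))
        return np.column_stack([1.0 - p1, p1])


def monotonicity_sketch(model, x, coefs, base):
    """Illustrative only; assumes 'monotonic' means non-decreasing probability."""
    # Class predicted for the unmodified row.
    pred_class = int(np.argmax(model.predict_proba(x.reshape(1, -1)), axis=1)[0])

    # Start from the base values and restore attributes one at a time,
    # least important first, keeping earlier restorations in place.
    x_partial = np.array(base, dtype=float)
    probs = []
    for i in np.argsort(coefs):
        x_partial[i] = x[i]
        probs.append(model.predict_proba(x_partial.reshape(1, -1))[0, pred_class])

    # Monotonic if the probability never drops as attributes are added.
    return bool(np.all(np.diff(probs) >= 0))


x = np.array([1.0, 1.0, 1.0])
base = np.zeros(3)
coefs = np.array([2.0, 1.0, 0.5])  # importances supplied by some explainer

consistent = monotonicity_sketch(ToyLogisticModel([2.0, 1.0, 0.5]), x, coefs, base)
inconsistent = monotonicity_sketch(ToyLogisticModel([2.0, -1.0, 0.5]), x, coefs, base)
print(consistent, inconsistent)
```

In the first call every attribute pushes the prediction in the same direction, so the result is True. In the second call the middle attribute has a negative weight that the importances do not reflect, so the probability dips when it is added and the result is False.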