Local White Box Explainers

Contrastive Explainers

class aix360.algorithms.contrastive.CEM.CEMExplainer(model)

CEMExplainer can be used to compute contrastive explanations for image and tabular data. It does so by finding what is minimally sufficient (PP, Pertinent Positive) and what should be necessarily absent (PN, Pertinent Negative) to maintain the original classification. An elastic norm regularizer is used to ensure that both parts of the explanation, the PPs and the PNs, are minimal. An autoencoder can optionally be used to make the explanations more realistic. [1]

References

[1] Amit Dhurandhar, Pin-Yu Chen, Ronny Luss, Chun-Chen Tu, Paishun Ting, Karthikeyan Shanmugam, Payel Das, “Explanations based on the Missing: Towards Contrastive Explanations with Pertinent Negatives,” Advances in Neural Information Processing Systems (NeurIPS), 2018.

Constructor method; initializes the explainer.

Parameters:
  • model – KerasClassifier model whose predictions need to be explained
explain_instance(input_X, arg_mode, AE_model, arg_kappa, arg_b, arg_max_iter, arg_init_const, arg_beta, arg_gamma)

Explains an input instance input_X and returns contrastive explanations. Note that this assumes the classifier was trained with inputs normalized to the [-0.5, 0.5] range.

Parameters:
  • input_X (numpy.ndarray) – input instance to be explained
  • arg_mode (str) – ‘PP’ or ‘PN’
  • AE_model – Auto-encoder model
  • arg_kappa (double) – Confidence gap between the desired class and other classes
  • arg_b (double) – Number of different weightings of the loss function to try
  • arg_max_iter (int) – Number of iterations to search for each weighting of the loss function
  • arg_init_const (double) – Initial weighting of the loss function
  • arg_beta (double) – Weighting of the L1 loss
  • arg_gamma (double) – Weighting of the auto-encoder loss
Returns:

  • adv_X (numpy.ndarray) – perturbed input instance for the PP/PN
  • delta_X (numpy.ndarray) – difference between the input and the perturbed instance
  • INFO (str) – other information about the PP/PN

Return type:

tuple
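
A minimal usage sketch is shown below. The untrained stand-in network, the autoencoder ae_model, and the parameter values are illustrative assumptions, not part of the documented API.

    import numpy as np
    from keras.models import Sequential
    from keras.layers import Dense, Flatten
    from aix360.algorithms.contrastive import CEMExplainer, KerasClassifier

    # Untrained stand-in classifier; in practice, use your trained Keras model.
    ann_model = Sequential([Flatten(input_shape=(28, 28, 1)),
                            Dense(32, activation="relu"),
                            Dense(10)])
    explainer = CEMExplainer(KerasClassifier(ann_model))

    # One instance, normalized to [-0.5, 0.5] as explain_instance assumes.
    input_X = np.random.rand(1, 28, 28, 1).astype(np.float32) - 0.5

    # ae_model, a trained autoencoder over the same inputs, is assumed to exist.
    adv_X, delta_X, info = explainer.explain_instance(
        input_X, arg_mode="PN", AE_model=ae_model, arg_kappa=10,
        arg_b=9, arg_max_iter=1000, arg_init_const=10.0,
        arg_beta=1e-1, arg_gamma=100)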

set_params(*argv, **kwargs)

Set parameters for the explainer.

class aix360.algorithms.contrastive.CEM_MAF.CEM_MAFImageExplainer(model, attributes, aix360_path)

CEM_MAFImageExplainer is a contrastive image explainer that leverages Monotonic Attribute Functions. The main idea is to explain images using high-level, semantically meaningful attributes that may either be directly available or learned through supervised or unsupervised methods. [2]

References

[2] Ronny Luss, Pin-Yu Chen, Amit Dhurandhar, Prasanna Sattigeri, Karthikeyan Shanmugam, Chun-Chen Tu, “Generating Contrastive Explanations with Monotonic Attribute Functions,” arXiv preprint, 2019.

Constructor method; initializes the image explainer.

Currently accepts a model input which is an ImageClassifier.

check_attributes_celebA(attributes, x, y)

Loads attribute classifiers and checks which attributes in the original image x are modified in the adversarial image y.

Parameters:
  • attributes (str list) – list of attributes to load attribute classifiers for
  • x (numpy.ndarray) – original image
  • y (numpy.ndarray) – adversarial image
Returns:

string detailing which attributes were added to (or removed from) x resulting in y

Return type:

str
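
For instance, given an explainer constructed as above, the following hypothetical snippet summarizes which attributes changed between an original image and its pertinent negative (img, adv_img, and attrs are illustrative placeholders):

    # img: original image; adv_img: PN image from explain_instance (see below);
    # attrs: the attribute list passed to the constructor. All are placeholders.
    attr_changes = explainer.check_attributes_celebA(attrs, img, adv_img)
    print(attr_changes)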

explain_instance(sess, input_img, input_latent, arg_mode, arg_kappa, arg_binary_search_steps, arg_max_iterations, arg_initial_const, arg_gamma, arg_beta, arg_attr_reg=1, arg_attr_penalty_reg=1, arg_latent_square_loss_reg=1)

Explains an input instance input_img; e.g., a celebA image has shape (1, 224, 224, 3).

batch_size is hard-coded to 1, i.e., an explanation is computed for one input image at a time. Returns either a pertinent positive or a pertinent negative depending on arg_mode.

Parameters:
  • sess (tensorflow.python.client.session.Session) – Tensorflow session
  • input_img (numpy.ndarray) – image to be explained, of shape (1, size, size, channels)
  • input_latent (numpy.ndarray) – latent-space representation of the image to be explained, of shape (1, size, size, channels)
  • arg_mode (str) – “PN” for pertinent negative or “PP” for pertinent positive
  • arg_kappa (float) – Confidence parameter that controls the difference between the prediction of the PN (or PP) and the original prediction
  • arg_binary_search_steps (int) – Controls the number of random restarts used to find the best PN or PP
  • arg_max_iterations (int) – Maximum number of iterations to run a version of gradient descent on the PN or PP optimization problem from a single random initialization, i.e., the total number of iterations will be arg_binary_search_steps * arg_max_iterations
  • arg_initial_const (int) – Constant used for the upper/lower bounds in the binary search
  • arg_gamma (float) – Penalty parameter encouraging the addition of attributes for the PN or PP
  • arg_beta (float) – Penalty parameter that encourages minimal addition of attributes to the PN, or sparsity of the mask that generates the PP
  • arg_attr_reg (float) – Penalty parameter regularizing the PN to be predicted differently from the original image
  • arg_attr_penalty_reg (float) – Penalty parameter regularizing the PN from becoming too different from the original image
  • arg_latent_square_loss_reg (float) – Penalty parameter regularizing the PN from becoming too different from the original image in the latent space
Returns:

  • adv_img (numpy.ndarray) – the pertinent positive or the pertinent negative image
  • attr_mod (str) – only for PN; a string detailing which attributes were modified from the original image
  • INFO (str) – only for PN; a string of information about original vs PN class and original vs PN prediction probability

Return type:

tuple
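
A hedged usage sketch follows. The trained ImageClassifier, attribute list, latent encoding, and parameter values are application-specific assumptions.

    import tensorflow as tf
    from aix360.algorithms.contrastive.CEM_MAF import CEM_MAFImageExplainer

    # model (an ImageClassifier), attrs (a list of attribute names), and
    # aix360_path are assumed to exist and are application-specific.
    explainer = CEM_MAFImageExplainer(model, attrs, aix360_path)
    sess = tf.Session()  # a TF1-style session is expected

    # input_img is one (1, 224, 224, 3) image; input_latent is its encoding
    # in the latent space of a pretrained generative model (assumed available).
    adv_img, attr_mod, info = explainer.explain_instance(
        sess, input_img, input_latent, arg_mode="PN", arg_kappa=9,
        arg_binary_search_steps=1, arg_max_iterations=1000,
        arg_initial_const=10, arg_gamma=1.0, arg_beta=0.1)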

set_params(*argv, **kwargs)

Set parameters for the explainer.

SHAP Explainers

class aix360.algorithms.shap.shap_wrapper.GradientExplainer(*argv, **kwargs)

This class wraps the source class GradientExplainer available in the SHAP library. Additional variables or functions from the source class can also be accessed via the ‘explainer’ object variable that is initialized in the ‘__init__’ function of this class.

Initialize the SHAP GradientExplainer object.

explain_instance(*argv, **kwargs)

Explain one or more input instances.

set_params(*argv, **kwargs)

Optionally, set parameters for the explainer.
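
A minimal sketch, assuming the constructor arguments are forwarded to shap.GradientExplainer and that explain_instance delegates to the underlying shap_values call; the untrained Keras model and random data are placeholders.

    import numpy as np
    from keras.models import Sequential
    from keras.layers import Dense
    from aix360.algorithms.shap.shap_wrapper import GradientExplainer

    # Untrained stand-in model and random background data, for illustration.
    model = Sequential([Dense(16, activation="relu", input_shape=(4,)),
                        Dense(3, activation="softmax")])
    background = np.random.rand(50, 4).astype(np.float32)

    explainer = GradientExplainer(model, background)

    # Per-class attributions for a batch of five instances.
    shap_values = explainer.explain_instance(np.random.rand(5, 4).astype(np.float32))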

class aix360.algorithms.shap.shap_wrapper.DeepExplainer(*argv, **kwargs)

This class wraps the source class DeepExplainer available in the SHAP library. Additional variables or functions from the source class can also be accessed via the ‘explainer’ object variable that is initialized in the ‘__init__’ function of this class.

Initialize the SHAP DeepExplainer object.

explain_instance(*argv, **kwargs)

Explain one or more input instances.

set_params(*argv, **kwargs)

Optionally, set parameters for the explainer.
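
An analogous sketch with a PyTorch network, under the same forwarding assumption; the untrained network and random data are placeholders.

    import torch
    import torch.nn as nn
    from aix360.algorithms.shap.shap_wrapper import DeepExplainer

    # Untrained stand-in network and a random background batch.
    net = nn.Sequential(nn.Linear(4, 16), nn.ReLU(), nn.Linear(16, 3))
    background = torch.randn(50, 4)

    explainer = DeepExplainer(net, background)

    # Attributions for a batch of five instances.
    shap_values = explainer.explain_instance(torch.randn(5, 4))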

class aix360.algorithms.shap.shap_wrapper.TreeExplainer(*argv, **kwargs)

This class wraps the source class TreeExplainer available in the SHAP library. Additional variables or functions from the source class can also be accessed via the ‘explainer’ object variable that is initialized in the ‘__init__’ function of this class.

Initialize the SHAP TreeExplainer object.

explain_instance(*argv, **kwargs)

Explain one or more input instances.

set_params(*argv, **kwargs)

Optionally, set parameters for the explainer.
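
A minimal sketch with a scikit-learn forest, under the same forwarding assumption; the random dataset is a placeholder.

    import numpy as np
    from sklearn.ensemble import RandomForestClassifier
    from aix360.algorithms.shap.shap_wrapper import TreeExplainer

    # Small random dataset and forest, for illustration only.
    X = np.random.rand(100, 4)
    y = np.random.randint(0, 2, size=100)
    model = RandomForestClassifier(n_estimators=10).fit(X, y)

    explainer = TreeExplainer(model)
    shap_values = explainer.explain_instance(X[:5])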

class aix360.algorithms.shap.shap_wrapper.LinearExplainer(*argv, **kwargs)

This class wraps the source class LinearExplainer available in the SHAP library. Additional variables or functions from the source class can also be accessed via the ‘explainer’ object variable that is initialized in the ‘__init__’ function of this class.

Initialize the SHAP LinearExplainer object.

explain_instance(*argv, **kwargs)

Explain one or more input instances.

set_params(*argv, **kwargs)

Optionally, set parameters for the explainer.
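
A minimal sketch with a fitted scikit-learn linear model, under the same forwarding assumption; the random data is a placeholder.

    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from aix360.algorithms.shap.shap_wrapper import LinearExplainer

    # Random training data and a fitted linear model, for illustration only.
    X = np.random.rand(100, 4)
    y = np.random.randint(0, 2, size=100)
    model = LogisticRegression().fit(X, y)

    # The background data is used to estimate feature expectations.
    explainer = LinearExplainer(model, X)
    shap_values = explainer.explain_instance(X[:5])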