Local Black Box Explainers

SHAP Explainers

class aix360.algorithms.shap.shap_wrapper.KernelExplainer(*argv, **kwargs)

This class wraps the KernelExplainer class from the SHAP library. Additional variables and functions of the source class can be accessed through the ‘explainer’ attribute, which is initialized in the ‘__init__’ function of this class.

Initialize the SHAP KernelExplainer object.

explain_instance(*argv, **kwargs)

Explain one or more input instances.

set_params(*argv, **kwargs)

Optionally, set parameters for the explainer.
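The wrapper pattern described above, where the wrapped source object stays reachable through an ‘explainer’ attribute, can be sketched in plain Python. This is a minimal illustration and not the AIX360 source: SourceExplainer, WrappedExplainer, and the toy attribution logic are hypothetical stand-ins used only to show how arguments are forwarded and how source-class attributes remain accessible.

```python
class SourceExplainer:
    """Hypothetical stand-in for a third-party explainer class."""

    def __init__(self, model, data):
        self.model = model
        self.data = data
        # Baseline: mean model output over the background data.
        self.expected_value = sum(model(row) for row in data) / len(data)

    def explain(self, x):
        # Toy attribution: prediction minus the baseline.
        return self.model(x) - self.expected_value


class WrappedExplainer:
    """Hypothetical wrapper that keeps the source object on `explainer`."""

    def __init__(self, *argv, **kwargs):
        # All arguments are forwarded unchanged to the source class.
        self.explainer = SourceExplainer(*argv, **kwargs)

    def explain_instance(self, *argv, **kwargs):
        # Delegate the explanation call to the wrapped object.
        return self.explainer.explain(*argv, **kwargs)


def model(row):
    # Toy model used only for this illustration.
    return 2.0 * row[0] + row[1]


wrapped = WrappedExplainer(model, [[0.0, 0.0], [1.0, 2.0]])
attribution = wrapped.explain_instance([1.0, 2.0])
# Attributes of the source class stay reachable through `explainer`.
baseline = wrapped.explainer.expected_value
```

Under this pattern, anything the wrapper does not expose directly can still be reached through the ‘explainer’ attribute.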

LIME Explainers

class aix360.algorithms.lime.lime_wrapper.LimeImageExplainer(*argv, **kwargs)

This class wraps the LimeImageExplainer class from the LIME library. Additional variables and functions of the source class can be accessed through the ‘explainer’ attribute, which is initialized in the ‘__init__’ function of this class.

Initialize the LIME image explainer object.

explain_instance(*argv, **kwargs)

Explain one or more input instances.

set_params(*argv, **kwargs)

Optionally, set parameters for the explainer.

class aix360.algorithms.lime.lime_wrapper.LimeTabularExplainer(*argv, **kwargs)

This class wraps the LimeTabularExplainer class from the LIME library. Additional variables and functions of the source class can be accessed through the ‘explainer’ attribute, which is initialized in the ‘__init__’ function of this class.

Initialize the LIME tabular explainer object.

explain_instance(*argv, **kwargs)

Explain one or more input instances.

set_params(verbose=0)

Optionally, set parameters for the explainer.

class aix360.algorithms.lime.lime_wrapper.LimeTextExplainer(*argv, **kwargs)

This class wraps the LimeTextExplainer class from the LIME library. Additional variables and functions of the source class can be accessed through the ‘explainer’ attribute, which is initialized in the ‘__init__’ function of this class.

Initialize the LIME text explainer object.

explain_instance(*argv, **kwargs)

Explain one or more input instances.

set_params(*argv, **kwargs)

Optionally, set parameters for the explainer.

Nearest Neighbor Contrastive Explainer

class aix360.algorithms.nncontrastive.nncontrastive.NearestNeighborContrastiveExplainer(model: Callable = None, n_classes: int = 2, metric: str = 'euclidean', neighbors: int = 3, embedding_type: Union[str, aix360.algorithms.nncontrastive.nncontrastive.EmbeddingType] = <EmbeddingType.UNSUPERVISED: 'unsupervised embedding'>, embedding_dim: int = 8, category_enc_dim: int = 3, category_encoding: str = 'ohe', numeric_scaling: str = None, layers_config: List[int] = [16, 16], encoder_activation: str = 'relu', decoder_activation: str = 'relu', embedding_activation: str = 'tanh', encoder_kernel_regularizer: str = 'l1', encoder_kernel_initializer: str = 'glorot_uniform', encoder_bias_initializer: str = 'zeros', encoder_activity_regularizer: str = None, decoder_kernel_regularizer: str = 'l1', decoder_kernel_initializer: str = 'glorot_uniform', decoder_bias_initializer: str = 'zeros', decoder_activity_regularizer: str = None, decoder_last_layer_activation: str = 'linear', embedding_activity_regularizer: str = None, classifier_layers: List[int] = [16], classifier_activation: str = 'relu', classifier_kernel_regularizer: str = None, classifier_kernel_initilizer: str = 'glorot_uniform', classifier_bias_initializer: str = 'zeros')

The Nearest Neighbor (NN) Contrastive algorithm uses an auto-encoder to learn a low-dimensional representation of each data point, which is then used for the nearest neighbor search. The dimensionality reduction improves the reliability of the neighborhood search. In addition to dimensionality reduction, the current implementation can orient the embedding space according to class structure. For example, in a loan application, high-income and low-income applicants may be evaluated against very different criteria. The auto-encoder can use high-income and low-income tag classes during training, so that instances with the same tag lie in close neighborhoods.

The implementation allows exemplar selection in two ways: (1) contrastive exemplar selection is guided by a model prediction, or (2) the user explicitly provides exemplars with a different class tag than the query instance (model-free). Given a query instance, the resulting explanation is a set of nearest neighbor exemplars whose class tags differ from that of the query instance.
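The model-guided selection described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the library's implementation: it searches the raw feature space with a brute-force Euclidean scan, with no auto-encoder embedding and no KDTree, and the function contrastive_neighbors, the toy predict rule, and the layout of the returned dictionary are all hypothetical.

```python
import math


def contrastive_neighbors(query, exemplars, predict, neighbors=3):
    """Return the nearest exemplars whose predicted class differs from the query's."""
    query_class = predict(query)
    # Keep only exemplars that the model assigns to a different class.
    candidates = [e for e in exemplars if predict(e) != query_class]
    # Rank the contrastive candidates by Euclidean distance to the query.
    ranked = sorted(candidates, key=lambda e: math.dist(query, e))
    top = ranked[:neighbors]
    return {
        "query": query,
        "neighbors": top,
        "distances": [math.dist(query, e) for e in top],
    }


def predict(row):
    # Hypothetical classifier on [income, debt]: class 1 when income - debt >= 10.
    return 1 if row[0] - row[1] >= 10 else 0


exemplars = [[50, 20], [30, 25], [40, 31], [20, 5], [12, 10]]
result = contrastive_neighbors([30, 22], exemplars, predict, neighbors=2)
```

Here the query falls in class 0, so only the two class 1 exemplars are eligible, and they are returned in order of increasing distance.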

NearestNeighborContrastiveExplainer initialization.

Parameters:
  • model (Callable) – Classification Model which will be used for contrastive explanation.
  • n_classes (int) – Number of classes the classification produces.
  • metric (str) – Distance metric used to find the neighborhood in the embedding space. The implementation internally uses the SciPy KDTree for the neighborhood search. See the documentation of scipy.spatial.distance and the metrics listed in distance_metrics for more information. Defaults to euclidean.
  • neighbors (int) – Number of neighbors to fetch for producing the explanation. A higher value is suggested for understanding the variety in the neighborhood profile, at the cost of a larger explanation. A lower value is suggested for a fast, greedy explanation. Defaults to 3.
  • embedding_type (Union[str, EmbeddingType]) – Controls the nature of the embedding produced. It can be set to supervised (EmbeddingType.SUPERVISED) or unsupervised (EmbeddingType.UNSUPERVISED). The unsupervised embedding ensures compliance with the data distribution, while the supervised embedding allows imposing further structural constraints on the embedding through business tags provided during the model fit step. Defaults to EmbeddingType.UNSUPERVISED.
  • embedding_dim (int) – Dimension of the produced embedding. A lower dimension allows faster search at the cost of lossier reconstruction. An appropriate embedding_dim depends on the data complexity and the amount of available data. Defaults to 8.
  • category_enc_dim (int) – The auto-encoder handles categorical variables through an embedding or one-hot encoding. This parameter defines the internal dimension used by the auto-encoder to derive the categorical embedding. Defaults to 3.
  • category_encoding (str) – Strategy for handling categorical variables. Supported values are ‘ohe’ (one-hot encoding) and ‘label’ (uses an embedding). Defaults to “ohe”.
  • numeric_scaling (str) – Data scaling applied to numeric columns for computational stability. The scaling is global, i.e., applied uniformly over the entire training batch for all numeric columns. Supported values are minmax, standard, and quantile. Defaults to None.
  • layers_config (List[int]) – Specification of the auto-encoder internals. The auto-encoder uses MLP layers to derive the embedding; this parameter specifies the number of hidden layers and their respective dimensions. Defaults to [16, 16].
  • encoder_activation (str) – Activation function used by the auto-encoder encoding layers. All activation functions provided by the TensorFlow framework are supported. Defaults to “relu”.
  • decoder_activation (str) – Activation function used by the auto-encoder decoding layers. All activation functions provided by the TensorFlow framework are supported. Defaults to “relu”.
  • embedding_activation (str) – Activation of the embedding layer, which can be specified separately from the hidden layer activation. All TensorFlow activation functions are supported. Defaults to “tanh”.
  • encoder_kernel_regularizer (str) – Regularization for the encoder MLP kernel. Regularization results in a stable prediction model. Defaults to “l1”.
  • encoder_kernel_initializer (str) – Initialization algorithm for the encoder MLP kernel. Defaults to “glorot_uniform”.
  • encoder_bias_initializer (str) – Initialization algorithm for the encoder MLP bias. Defaults to “zeros”.
  • encoder_activity_regularizer (str) – Activity regularizer for the encoder MLP layers. Defaults to None.
  • decoder_kernel_regularizer (str) – Kernel regularizer for the decoder MLP layers. All TensorFlow regularizer algorithms are supported. Defaults to “l1”.
  • decoder_kernel_initializer (str) – Decoder MLP kernel weight initializer algorithm. All TensorFlow initializer algorithms are supported. Defaults to “glorot_uniform”.
  • decoder_bias_initializer (str) – Decoder MLP bias initializer algorithm. All TensorFlow initializer algorithms are supported. Defaults to “zeros”.
  • decoder_activity_regularizer (str) – Decoder activity regularization algorithm. All TensorFlow regularizer algorithms are supported. Defaults to None.
  • decoder_last_layer_activation (str) – Activation of the decoder's last layer, which produces the input reconstruction. All TensorFlow activation functions are supported. Defaults to “linear”.
  • embedding_activity_regularizer (str) – Embedding activity regularization method. All TensorFlow activity regularizer algorithms are supported. Defaults to None.
  • classifier_layers (List[int]) – The supervised auto-encoder uses a classification layer to impose the structural constraint on the embedding. This parameter specifies the dimensions of the MLP layers for this classification task. Defaults to [16].
  • classifier_activation (str) – The supervised auto-encoder uses a classification layer to impose the structural constraint on the embedding. This parameter specifies the activation of the MLP classification layers. Defaults to “relu”.
  • classifier_kernel_regularizer (str) – The supervised auto-encoder uses a classification layer to impose the structural constraint on the embedding. This parameter specifies the kernel regularization of the MLP classification layers. Defaults to None.
  • classifier_kernel_initilizer (str) – The supervised auto-encoder uses a classification layer to impose the structural constraint on the embedding. This parameter specifies the kernel initialization algorithm of the MLP classification layers. Defaults to “glorot_uniform”.
  • classifier_bias_initializer (str) – Bias initialization algorithm for the MLP classification layers of the supervised auto-encoder. Defaults to “zeros”.
explain_instance(x, **kwargs)

Explain (local explanation) the model prediction for provided instance(s).

Parameters:
  • x (Union[pd.DataFrame, np.ndarray]) – Input instance(s) to be explained.
Additional Parameters:
  • neighbors (int) – Number of neighbors. Overrides the neighbors parameter provided in the initializer.
Returns: Explanation object. Dictionary or list of dictionaries with keys: features, categorical_features, query, neighbors, distances.
Return type: Union[List[dict], dict]
fit(x: pandas.core.frame.DataFrame, y: numpy.ndarray = None, features: List[str] = None, categorical_features: List[str] = [], categorical_values: dict = {}, epochs: int = 5, batch_size: int = 128, verbose: int = 0, shuffle: bool = True, validation_fraction: float = 0, max_training_records: int = 10000, exemplars: pandas.core.frame.DataFrame = None, random_seed: int = None, **kwargs)

Fit the explainer.

Parameters:
  • x (pd.DataFrame) – Training data.
  • y (np.ndarray) – If provided, these labels are used to train the supervised auto-encoder. The labels need not be the same as the target classes of the data. Defaults to None.
  • features (List[str]) – Names of the features to be used. If not specified, all the columns in the training data will be used as features. Defaults to None.
  • categorical_features (List[str]) – Names of the categorical features in the data. These must match the column names. Defaults to [].
  • categorical_values (dict) – Lookup dictionary that lists the possible values of each categorical variable. Defaults to {}.
  • epochs (int) – Number of epochs to be used for auto-encoder training. Defaults to 5.
  • batch_size (int) – Batch size for the auto-encoder training. Should be smaller than the number of available training records. Defaults to 128.
  • verbose (int) – Log verbosity. Defaults to 0.
  • shuffle (bool) – Shuffle batch per epoch. Defaults to True.
  • validation_fraction (float) – Fraction specifying the validation split during encoder training. Defaults to 0.
  • max_training_records (int) – Maximum number of records to be used for the auto-encoder training. Enables explainer performance optimization. Defaults to 10000.
  • exemplars (pd.DataFrame) – Exemplar neighbors to be used to compute explanations. If None, training dataset will be used as exemplars. Defaults to None.
  • random_seed (int) – Random seed used to fit the auto-encoder. Defaults to None.
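
As an illustration of the categorical_features and categorical_values arguments, the sketch below collects the distinct values of each categorical column from a few records. The mapping from feature name to a list of values is an assumption based on the parameter description above; the helper build_categorical_values and the sample rows are hypothetical, and plain dictionaries stand in for the pandas DataFrame that fit expects.

```python
def build_categorical_values(rows, categorical_features):
    """Collect the sorted distinct values seen for each categorical column."""
    return {
        name: sorted({row[name] for row in rows})
        for name in categorical_features
    }


# Hypothetical training records; a pandas DataFrame would hold these in practice.
rows = [
    {"income": 52.0, "purpose": "car", "housing": "rent"},
    {"income": 31.5, "purpose": "home", "housing": "own"},
    {"income": 47.2, "purpose": "car", "housing": "own"},
]
categorical_features = ["purpose", "housing"]
categorical_values = build_categorical_values(rows, categorical_features)
```

The names in categorical_features match the column names of the records, as the parameter description requires.
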
get_params(*argv, **kwargs) → dict

Get parameters for the explainer.

set_exemplars(x: Union[pandas.core.frame.DataFrame, numpy.ndarray])

Set user provided exemplars to guide contrastive exploration.

Parameters:
  • x (Union[pd.DataFrame, np.ndarray]) – Exemplar neighbors to be used to compute explanations.
set_params(*argv, **kwargs)

Set parameters for the explainer.

Grouped Conditional Expectation (GroupedCE) Explainer

class aix360.algorithms.gce.gce.GroupedCEExplainer(model: Callable, data: Union[int, List[List[object]]], feature_names: List[str] = None, n_samples: int = 25, features_selected: list = None, top_k_features: int = -1, feature_importance_method: str = 'SHAP', max_dataset_size: int = 10, random_seed: int = None, **kwargs)

Grouped Conditional Expectation plots are generated for a given instance and set of features. They show how the model prediction is affected when a pair of features of a given instance is perturbed. The perturbed features can be either a user-provided subset of the input covariates or the top K features by importance rank obtained from a global explainer (e.g., SHAP). If the user provides a single feature, the algorithm produces a standard ICE (Individual Conditional Expectation) plot in which the selected feature varies over a linspace grid. If the user chooses more than one feature, the explainer produces 3D ICE plots: two features vary simultaneously over a meshgrid, and the output of the model is stored for each pair of values.
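The single-feature and two-feature cases can be sketched in plain Python. This is a minimal illustration under assumptions and not the library's code: the helpers linspace, ice_curve, and grouped_ce are hypothetical, the toy model is invented for the example, and the grid bounds are fixed by hand, whereas the explainer derives feature ranges from the dataset passed at initialization.

```python
def linspace(start, stop, num):
    """Evenly spaced grid of `num` points from start to stop (num >= 2)."""
    step = (stop - start) / (num - 1)
    return [start + i * step for i in range(num)]


def ice_curve(model, instance, feature, grid):
    """Model output as one feature sweeps the grid, other features held fixed."""
    curve = []
    for value in grid:
        perturbed = list(instance)
        perturbed[feature] = value
        curve.append(model(perturbed))
    return curve


def grouped_ce(model, instance, feature_a, feature_b, grid_a, grid_b):
    """Model output for every (a, b) pair on the grid; rows follow grid_a."""
    surface = []
    for a in grid_a:
        row = []
        for b in grid_b:
            perturbed = list(instance)
            perturbed[feature_a] = a
            perturbed[feature_b] = b
            row.append(model(perturbed))
        surface.append(row)
    return surface


def model(x):
    # Toy regression model used only for this illustration.
    return x[0] * x[1] + x[2]


instance = [1.0, 2.0, 3.0]
grid = linspace(0.0, 2.0, 3)
curve = ice_curve(model, instance, 0, grid)
surface = grouped_ce(model, instance, 0, 1, grid, grid)
```

The curve corresponds to a standard ICE plot for feature 0, and the surface holds the values that a 3D plot over features 0 and 1 would display.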

GroupedCEExplainer initialization.

Parameters:
  • model (Callable) – Model prediction (predict/predict_proba) function that returns a real value, such as a probability or a regressed value.
  • data (List[object]) – Input dataset used for model training. Feature ranges are computed from this dataset. The dataset is also used by the selected feature importance method, such as SHAP, to determine the top K features for the grouped explanation.
  • feature_names (List[str]) – List of valid numerical feature names in the input dataset. Defaults to None.
  • n_samples (int, optional) – Number of discrete points sampled per feature. Defaults to 25.
  • features_selected (List[str], optional) – List of features that will be considered in the explanation. If the list contains a single feature, GroupedCEExplainer returns a standard ICE explanation. Otherwise, it returns a grouped explanation for 3D ICE plots.
  • top_k_features (int, optional) – Number of top features by importance to consider if features_selected is an empty list. If top_k_features <= 0, all the features are selected for explanation. Defaults to -1.
  • feature_importance_method (str, optional) – Feature importance method to be used if top_k_features > 0 and features_selected is empty. Defaults to ‘SHAP’.
  • max_dataset_size (int) – Maximum dataset size used by the selected feature importance method (feature_importance_method). Defaults to 10.
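
The interaction between features_selected and top_k_features described above can be sketched as a small selection rule. This is an illustration of the documented behavior only; how the explainer applies it internally is not shown here. The function select_features and the importance scores are hypothetical, with the scores standing in for the output of a feature importance method such as SHAP.

```python
def select_features(feature_names, features_selected=None,
                    top_k_features=-1, importance=None):
    """Pick the features to perturb, following the documented precedence."""
    # An explicit, non-empty selection takes priority.
    if features_selected:
        return list(features_selected)
    # With no selection and top_k_features <= 0, every feature is used.
    if top_k_features <= 0:
        return list(feature_names)
    # Otherwise keep the top K features by importance score.
    ranked = sorted(feature_names, key=lambda name: importance[name], reverse=True)
    return ranked[:top_k_features]


names = ["age", "income", "debt"]
# Hypothetical importance scores standing in for a global explainer's output.
importance = {"age": 0.1, "income": 0.7, "debt": 0.2}

explicit = select_features(names, features_selected=["debt"])
everything = select_features(names)
top_two = select_features(names, top_k_features=2, importance=importance)
```

With an explicit single feature the result corresponds to the standard ICE case, while the top-two selection corresponds to the grouped case.
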
explain_instance(instance: Union[pandas.core.frame.DataFrame, numpy.ndarray], **kwargs)

Produces local explanation of the target model for selected feature(s).

Parameters:
  • instance (Union[pd.DataFrame, np.ndarray]) – Input instance to be explained.
Returns: Explanation object. For an ICE explanation, a dictionary with feature_name, feature_value, ice_value, and current_value. For a GCE explanation, a dictionary with gce_values, x_grid, y_pred, and current_values.
Return type: dict
get_params(*argv, **kwargs) → dict

Get parameters for the explainer.

set_params(*argv, **kwargs)

Set parameters for the explainer.