Directly Interpretable Supervised Explainers¶

Boolean Rules via Column Generation Explainer¶

class aix360.algorithms.rbm.BRCG.BRCGExplainer(model)¶

Boolean Rule Column Generation explainer. Provides access to aix360.algorithms.rbm.boolean_rule_cg.BooleanRuleCG, which implements a directly interpretable supervised learning method for binary classification that learns a Boolean rule in disjunctive normal form (DNF) or conjunctive normal form (CNF) using column generation (CG). AIX360 implements a heuristic beam search version of BRCG that is less computationally intensive than the published integer programming version [1].
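To make the DNF/CNF distinction concrete, the following minimal sketch evaluates the same set of clauses under both readings in plain Python. It does not call AIX360; the helper names `eval_dnf` and `eval_cnf`, the literals, and the record are hypothetical and chosen only for illustration.

```python
from typing import Callable, Dict, List

Record = Dict[str, float]
Predicate = Callable[[Record], bool]
Clauses = List[List[Predicate]]


def eval_dnf(clauses: Clauses, record: Record) -> bool:
    """DNF: OR over clauses, where each clause is an AND of its literals."""
    return any(all(lit(record) for lit in clause) for clause in clauses)


def eval_cnf(clauses: Clauses, record: Record) -> bool:
    """CNF: AND over clauses, where each clause is an OR of its literals."""
    return all(any(lit(record) for lit in clause) for clause in clauses)


# Hypothetical literals over made-up features.
def age_le_30(r: Record) -> bool:
    return r["age"] <= 30


def income_gt_50(r: Record) -> bool:
    return r["income"] > 50


def owns_home(r: Record) -> bool:
    return r["owns_home"] == 1


clauses = [[age_le_30, income_gt_50], [owns_home]]
record = {"age": 45, "income": 20, "owns_home": 1}

# DNF reading: (age <= 30 AND income > 50) OR (owns_home)
dnf_label = eval_dnf(clauses, record)
# CNF reading: (age <= 30 OR income > 50) AND (owns_home)
cnf_label = eval_cnf(clauses, record)
print(dnf_label, cnf_label)
```

The same clause structure can therefore yield different labels depending on the normal form, which is why the rule's form (DNF or CNF) has to be known before its clauses can be interpreted.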

References

[1]	S. Dash, O. Günlük, D. Wei, “Boolean decision rules via column generation.” Neural Information Processing Systems (NeurIPS), 2018.

Initialize a BRCGExplainer object.

Parameters:	model – model to operate on, instance of `aix360.algorithms.rbm.boolean_rule_cg.BooleanRuleCG`

explain(*argv, **kwargs)¶

Return rules comprising the underlying model.

Parameters:

maxConj (int, optional) – Maximum number of conjunctions to show
prec (int, optional) – Number of decimal places to show for floating-value thresholds

Returns:

Dictionary containing

isCNF (bool): flag signaling whether model is CNF or DNF
rules (list): selected conjunctions formatted as strings

fit(X_train, Y_train, *argv, **kwargs)¶

Fit model to training data.

Parameters:

X_train (DataFrame) – Binarized features with MultiIndex column labels
Y_train (array) – Binary-valued target variable

Returns:	Self
Return type:	BRCGExplainer

predict(X, *argv, **kwargs)¶

Predict class labels.

Parameters:	X (DataFrame) – Binarized features with MultiIndex column labels
Returns:	y – Predicted labels
Return type:	array

set_params(*argv, **kwargs)¶

Set parameters for the explainer.

class aix360.algorithms.rbm.boolean_rule_cg.BooleanRuleCG(lambda0=0.001, lambda1=0.001, CNF=False, iterMax=100, timeMax=100, K=10, D=10, B=5, eps=1e-06, solver='ECOS', verbose=False, silent=False)¶

BooleanRuleCG is a directly interpretable supervised learning method for binary classification that learns a Boolean rule in disjunctive normal form (DNF) or conjunctive normal form (CNF) using column generation (CG). AIX360 implements a heuristic beam search version of BRCG that is less computationally intensive than the published integer programming version [2].

References

[2]	S. Dash, O. Günlük, D. Wei, “Boolean decision rules via column generation.” Neural Information Processing Systems (NeurIPS), 2018.

Parameters:

lambda0 (float, optional) – Complexity - fixed cost of each clause
lambda1 (float, optional) – Complexity - additional cost for each literal
CNF (bool, optional) – CNF instead of DNF
iterMax (int, optional) – Column generation - maximum number of iterations
timeMax (int, optional) – Column generation - maximum runtime in seconds
K (int, optional) – Column generation - maximum number of columns generated per iteration
D (int, optional) – Column generation - maximum degree
B (int, optional) – Column generation - beam search width
eps (float, optional) – Numerical tolerance on comparisons
solver (str, optional) – Linear programming - solver
verbose (bool, optional) – Linear programming - verboseness
silent (bool, optional) – Silence overall algorithm messages
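As a rough illustration of how lambda0 and lambda1 interact, the sketch below computes a complexity penalty under the assumption, suggested by the parameter descriptions above, that each clause costs lambda0 plus lambda1 per literal. The helper `rule_complexity` and the example clauses are hypothetical and not part of AIX360; how the library trades this penalty off against classification error is not shown.

```python
from typing import List


def rule_complexity(
    clauses: List[List[str]], lambda0: float = 0.001, lambda1: float = 0.001
) -> float:
    """Sum over clauses of (fixed cost + per-literal cost * number of literals)."""
    return sum(lambda0 + lambda1 * len(clause) for clause in clauses)


# Two hypothetical clauses: one with two literals, one with a single literal.
example_clauses = [["age <= 30", "income > 50"], ["owns_home"]]
penalty = rule_complexity(example_clauses)
print(round(penalty, 6))
```

Under this reading, raising lambda0 discourages adding clauses at all, while raising lambda1 discourages long clauses.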

compute_conjunctions(X)¶

Compute conjunctions of features as specified in self.z.

Parameters:	X (DataFrame) – Binarized features with MultiIndex column labels
Returns:	A – Conjunction values
Return type:	array
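A minimal sketch of what computing conjunction values can look like, assuming self.z is a 0/1 matrix whose rows index binarized features and whose columns index conjunctions. That layout is an assumption made for illustration; the function `conjunction_values` below is a hypothetical pure-Python stand-in, not the library's implementation.

```python
from typing import List

Matrix = List[List[int]]


def conjunction_values(X: Matrix, z: Matrix) -> Matrix:
    """Return A with A[i][k] = 1 iff every feature selected by column k of z
    equals 1 in row i of X.

    X has shape (n_samples, n_features); z has shape (n_features, n_conj).
    """
    n_conj = len(z[0]) if z else 0
    A = []
    for row in X:
        A.append([
            int(all(row[j] == 1 for j in range(len(row)) if z[j][k] == 1))
            for k in range(n_conj)
        ])
    return A


# Three samples, three binarized features.
X = [[1, 0, 1],
     [1, 1, 1],
     [0, 1, 0]]
# Conjunction 0 uses features 0 and 2; conjunction 1 uses features 1 and 2.
z = [[1, 0],
     [0, 1],
     [1, 1]]
A = conjunction_values(X, z)
print(A)
```

The result has one row per sample and one column per conjunction.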

explain(maxConj=None, prec=2)¶

Return rules comprising the model.

Parameters:

maxConj (int, optional) – Maximum number of conjunctions to show
prec (int, optional) – Number of decimal places to show for floating-value thresholds

Returns:

Dictionary containing

isCNF (bool): flag signaling whether model is CNF or DNF
rules (list): selected conjunctions formatted as strings

fit(X, y)¶

Fit model to training data.

Parameters:

X (DataFrame) – Binarized features with MultiIndex column labels
y (array) – Binary-valued target variable

Returns:	Self
Return type:	BooleanRuleCG

predict(X)¶

Predict class labels.

Parameters:	X (DataFrame) – Binarized features with MultiIndex column labels
Returns:	y – Predicted labels
Return type:	array

Generalized Linear Rule Model Explainer¶

class aix360.algorithms.rbm.GLRM.GLRMExplainer(model)¶

Generalized Linear Rule Model explainer. Provides access to the following directly interpretable supervised learning methods:

Linear Rule Regression: linear regression on rule-based features [3].
Logistic Rule Regression: logistic regression on rule-based features [3].
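The idea of regressing on rule-based features can be sketched as follows: each rule is a 0/1 feature, and the model output is an intercept plus the coefficients of the rules that fire, passed through a sigmoid in the logistic case. The rules, coefficients, and helper names here are hypothetical; this does not reproduce how AIX360 learns or stores its rules.

```python
import math
from typing import Callable, Dict, List, Tuple

Record = Dict[str, float]
Rule = Tuple[float, Callable[[Record], bool]]  # (coefficient, condition)


def linear_rule_score(intercept: float, rules: List[Rule], record: Record) -> float:
    """Intercept plus the coefficient of every rule satisfied by the record."""
    return intercept + sum(coef for coef, cond in rules if cond(record))


def logistic_rule_proba(intercept: float, rules: List[Rule], record: Record) -> float:
    """Sigmoid of the linear rule score, read as P(Y=1)."""
    return 1.0 / (1.0 + math.exp(-linear_rule_score(intercept, rules, record)))


# Hypothetical first-degree and second-degree rules.
rules: List[Rule] = [
    (0.8, lambda r: r["age"] <= 30),
    (-0.5, lambda r: r["income"] > 50 and r["owns_home"] == 1),
]
record = {"age": 25, "income": 80, "owns_home": 1}

score = linear_rule_score(0.2, rules, record)
proba = logistic_rule_proba(0.2, rules, record)
print(round(score, 3), round(proba, 3))
```

Because each coefficient is attached to a readable rule, the fitted model can be inspected as a table of rules and weights.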

References

[3]	D. Wei, S. Dash, T. Gao, O. Günlük, “Generalized linear rule models.” International Conference on Machine Learning (ICML), 2019.

Initialize a GLRMExplainer object.

Parameters:

model –

model to operate on. Instance of either

aix360.algorithms.rbm.linear_regression.LinearRuleRegression or
aix360.algorithms.rbm.logistic_regression.LogisticRuleRegression

explain(maxCoeffs=None, highDegOnly=False, prec=2)¶

Return DataFrame holding model features and their coefficients.

Parameters:

maxCoeffs (int, optional) – Maximum number of rules/numerical features to show
highDegOnly (bool, optional) – Only show higher-degree rules
prec (int, optional) – Number of decimal places to show for floating-value thresholds

Returns:	dfExpl – Rules/numerical features and their coefficients
Return type:	DataFrame

fit(X_train, Y_train, Xstd=None)¶

Fit model to training data.

Parameters:

X_train (DataFrame) – Binarized features with MultiIndex column labels
Y_train (array) – Target variable
Xstd (DataFrame, optional) – Standardized numerical features

Returns:	Self
Return type:	GLRMExplainer

predict(X, Xstd=None)¶

Predict responses.

Parameters:

X (DataFrame) – Binarized features with MultiIndex column labels
Xstd (DataFrame, optional) – Standardized numerical features

Returns:	y – Predicted responses
Return type:	array

predict_proba(X, Xstd=None)¶

Predict probabilities of Y=1. Only available if the underlying model implements a predict_proba method.

Parameters:

X (DataFrame) – Binarized features with MultiIndex column labels
Xstd (DataFrame, optional) – Standardized numerical features

Returns:	p – Predicted probabilities
Return type:	array
Raises:	ValueError – if the model doesn’t implement predict_proba

set_params(*argv, **kwargs)¶

Set parameters for the explainer.

visualize(Xorig, fb, features=None)¶

Plot generalized additive model component, which includes first-degree rules and linear functions of unbinarized ordinal features but excludes higher-degree rules.

Parameters:

Xorig (DataFrame) – Original unbinarized features
fb – FeatureBinarizer object used to binarize features
features (list, optional) – Subset of features to be plotted

class aix360.algorithms.rbm.linear_regression.LinearRuleRegression(lambda0=0.05, lambda1=0.01, useOrd=False, debias=True, K=1, iterMax=200, B=1, wLB=0.5, stopEarly=False, eps=1e-06)¶

Linear Rule Regression is a directly interpretable supervised learning method that performs linear regression on rule-based features.

Parameters:

lambda0 (float, optional) – Regularization - fixed cost of each rule
lambda1 (float, optional) – Regularization - additional cost of each literal in rule
useOrd (bool, optional) – Also use standardized numerical features
debias (bool, optional) – Re-fit final solution without regularization
K (int, optional) – Column generation - maximum number of columns generated per iteration
iterMax (int, optional) – Column generation - maximum number of iterations
B (int, optional) – Column generation - beam search width
wLB (float, optional) – Column generation - weight on lower bound in evaluating nodes
stopEarly (bool, optional) – Column generation - stop after current degree once improving column found
eps (float, optional) – Numerical tolerance on comparisons

compute_conjunctions(X)¶

Compute conjunctions of features as specified in self.z.

Parameters:	X (DataFrame) – Binarized features with MultiIndex column labels
Returns:	A – Feature conjunction values, shape (X.shape[0], self.z.shape[1])
Return type:	array

explain(maxCoeffs=None, highDegOnly=False, prec=2)¶

Return DataFrame holding model features and their coefficients.

Parameters:

maxCoeffs (int, optional) – Maximum number of rules/numerical features to show
highDegOnly (bool, optional) – Only show higher-degree rules
prec (int, optional) – Number of decimal places to show for floating-value thresholds

Returns:	dfExpl – Rules/numerical features and their coefficients
Return type:	DataFrame

fit(X, y, Xstd=None)¶

Fit model to training data.

Parameters:

X (DataFrame) – Binarized features with MultiIndex column labels
y (array) – Target variable
Xstd (DataFrame, optional) – Standardized numerical features

Returns:	Self
Return type:	LinearRuleRegression

predict(X, Xstd=None)¶

Predict responses.

Parameters:

X (DataFrame) – Binarized features with MultiIndex column labels
Xstd (DataFrame, optional) – Standardized numerical features

Returns:	yhat – Predicted responses
Return type:	array

visualize(Xorig, fb, features=None)¶

Plot generalized additive model component, which includes first-degree rules and linear functions of unbinarized ordinal features but excludes higher-degree rules.

Parameters:

Xorig (DataFrame) – Original unbinarized features
fb – FeatureBinarizer object used to binarize features
features (list, optional) – Subset of features to be plotted

class aix360.algorithms.rbm.logistic_regression.LogisticRuleRegression(lambda0=0.05, lambda1=0.01, useOrd=False, debias=True, init0=False, K=1, iterMax=200, B=1, wLB=0.5, stopEarly=False, eps=1e-06, maxSolverIter=100)¶

Logistic Rule Regression is a directly interpretable supervised learning method that performs logistic regression on rule-based features.

Parameters:

lambda0 (float, optional) – Regularization - fixed cost of each rule
lambda1 (float, optional) – Regularization - additional cost of each literal in rule
useOrd (bool, optional) – Also use standardized numerical features
debias (bool, optional) – Re-fit final solution without regularization
init0 (bool, optional) – Initialize with no features
K (int, optional) – Column generation - maximum number of columns generated per iteration
iterMax (int, optional) – Column generation - maximum number of iterations
B (int, optional) – Column generation - beam search width
wLB (float, optional) – Column generation - weight on lower bound in evaluating nodes
stopEarly (bool, optional) – Column generation - stop after current degree once improving column found
eps (float, optional) – Numerical tolerance on comparisons
maxSolverIter (int, optional) – Maximum number of logistic regression solver iterations

compute_conjunctions(X)¶

Compute conjunctions of features as specified in self.z.

Parameters:	X (DataFrame) – Binarized features with MultiIndex column labels
Returns:	A – Feature conjunction values, shape (X.shape[0], self.z.shape[1])
Return type:	array

explain(maxCoeffs=None, highDegOnly=False, prec=2)¶

Return DataFrame holding model features and their coefficients.

Parameters:

maxCoeffs (int, optional) – Maximum number of rules/numerical features to show
highDegOnly (bool, optional) – Only show higher-degree rules
prec (int, optional) – Number of decimal places to show for floating-value thresholds

Returns:	dfExpl – Rules/numerical features and their coefficients
Return type:	DataFrame

fit(X, y, Xstd=None)¶

Fit model to training data.

Parameters:

X (DataFrame) – Binarized features with MultiIndex column labels
y (array) – Target variable
Xstd (DataFrame, optional) – Standardized numerical features

Returns:	Self
Return type:	LogisticRuleRegression

predict(X, Xstd=None)¶

Predict class labels.

Parameters:

X (DataFrame) – Binarized features with MultiIndex column labels
Xstd (DataFrame, optional) – Standardized numerical features

Returns:	yhat – Predicted labels
Return type:	array

predict_proba(X, Xstd=None)¶

Predict probabilities of Y=1.

Parameters:

X (DataFrame) – Binarized features with MultiIndex column labels
Xstd (DataFrame, optional) – Standardized numerical features

Returns:	p – Predicted probabilities
Return type:	array

visualize(Xorig, fb, features=None)¶

Plot generalized additive model component, which includes first-degree rules and linear functions of unbinarized ordinal features but excludes higher-degree rules.

Parameters:

Xorig (DataFrame) – Original unbinarized features
fb – FeatureBinarizer object used to binarize features
features (list, optional) – Subset of features to be plotted

Teaching Explanations for Decisions (TED) Cartesian Product Explainer¶

class aix360.algorithms.ted.TED_Cartesian.TED_CartesianExplainer(model)¶

TED is an explainability framework that leverages domain-relevant explanations in the training dataset to predict both labels and explanations for new instances [4]. This is an implementation of the simplest instantiation of TED, called the Cartesian Product.
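The Cartesian Product idea can be sketched as combining each (label, explanation) pair into a single class, training an ordinary classifier on that combined class, and splitting the prediction back into its two parts. The integer encoding below is an assumption made for illustration and may differ from the encoding AIX360 uses; `encode_ye` and `decode_ye` are hypothetical helpers.

```python
from typing import List, Tuple


def encode_ye(y: int, e: int, num_explanations: int) -> int:
    """Map a (label, explanation) pair to one combined class index."""
    return y * num_explanations + e


def decode_ye(ye: int, num_explanations: int) -> Tuple[int, int]:
    """Recover (label, explanation) from a combined class index."""
    return divmod(ye, num_explanations)


# Hypothetical training labels and explanations, with 3 possible explanations.
Y: List[int] = [0, 1, 1]
E: List[int] = [2, 0, 1]
NUM_EXPLANATIONS = 3

combined = [encode_ye(y, e, NUM_EXPLANATIONS) for y, e in zip(Y, E)]
recovered = [decode_ye(ye, NUM_EXPLANATIONS) for ye in combined]
print(combined, recovered)
```

Any estimator that implements fit and predict could then be trained on the combined classes, with its predictions decoded into a label and an explanation.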

References

[4]	Michael Hind, Dennis Wei, Murray Campbell, Noel C. F. Codella, Amit Dhurandhar, Aleksandra Mojsilovic, Karthikeyan Natesan Ramamurthy, Kush R. Varshney, “TED: Teaching AI to Explain its Decisions,” AAAI/ACM Conference on Artificial Intelligence, Ethics, and Society (AIES-19), 2019.

Parameters:	model (sklearn.base.BaseEstimator) – a binary estimator for classification, i.e., it implements fit and predict.

explain(X)¶

Use the TED-enhanced classifier to provide an explanation (E) for the passed instance.

Parameters:	X (list of ints) – features
Returns:	predicted explanation [0..MaxE]
Return type:	int

fit(X, Y, E)¶

Train a classifier based on features (X), labels (Y), and explanations (E).

Parameters:

X – list of feature vectors
Y – list of labels
E – list of explanations

predict(X)¶

Use the TED-enhanced classifier to provide a prediction (Y) for the passed instance.

Parameters:	X (list of ints) – features
Returns:	predicted label {0,1}
Return type:	int

predict_explain(X)¶

Use the TED-enhanced classifier to predict the label (Y) and explanation (E) for the passed instance.

Parameters:	X (list of ints) – features

Returns:

Y (int) – predicted label {0,1}
E (int) – predicted explanation [0..MaxE]

Return type:	tuple

score(X_test, Y_test, E_test)¶

Evaluate the accuracy (Y and E) of the TED-enhanced classifier using a test dataset

Parameters:

X_test (list of lists) – list of feature vectors
Y_test (list of int) – list of labels {0, 1}
E_test (list of ints) – list of explanations {0, …, NumExplanations - 1}

Returns:

YE_accuracy – the accuracy of predictions when the labels (Y) and explanations (E) are treated as a combined label
Y_accuracy – the prediction accuracy for labels (Y)
E_accuracy – the prediction accuracy of explanations (E)

Return type:

tuple
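The three accuracies returned by score can be illustrated with a small pure-Python sketch. It takes already-predicted labels and explanations as input rather than calling the explainer, and the helper name `ted_accuracies` and the sample data are hypothetical.

```python
from typing import List, Tuple


def ted_accuracies(
    Y_true: List[int], E_true: List[int], Y_pred: List[int], E_pred: List[int]
) -> Tuple[float, float, float]:
    """Return (YE_accuracy, Y_accuracy, E_accuracy).

    YE counts an instance as correct only when both label and explanation match.
    """
    n = len(Y_true)
    y_ok = [yt == yp for yt, yp in zip(Y_true, Y_pred)]
    e_ok = [et == ep for et, ep in zip(E_true, E_pred)]
    ye_ok = [a and b for a, b in zip(y_ok, e_ok)]
    return sum(ye_ok) / n, sum(y_ok) / n, sum(e_ok) / n


# Hypothetical test labels/explanations and predictions for four instances.
Y_true, E_true = [1, 0, 1, 1], [2, 0, 3, 1]
Y_pred, E_pred = [1, 0, 0, 1], [2, 1, 3, 1]

ye_acc, y_acc, e_acc = ted_accuracies(Y_true, E_true, Y_pred, E_pred)
print(ye_acc, y_acc, e_acc)
```

The combined accuracy can never exceed either of the individual accuracies, since it requires both parts to be right for the same instance.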

set_params(*argv, **kwargs)¶

Set parameters for the explainer.

RIPPER Explainer¶

class aix360.algorithms.rule_induction.ripper.RipperExplainer(d: int = 64, k: int = 2, pruning_threshold: int = 20, random_state: int = 0)¶

RIPPER (Repeated Incremental Pruning to Produce Error Reduction) is a heuristic rule induction algorithm based on separate-and-conquer. The explainer outputs a rule set in Disjunctive Normal Form (DNF) for a single target concept.

References

[5]	William W. Cohen, “Fast Effective Rule Induction.” Machine Learning: Proceedings of the Twelfth International Conference, 1995. http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.107.2612&rep=rep1&type=pdf

Parameters:

d (int) – The number of bits that a new rule needs to gain. Defaults to 64.
k (int) – The number of iterations for the optimization loop. Defaults to 2.
pruning_threshold (int) – The minimum number of instances for splitting. Defaults to 20.
random_state (int) – The random seed for the splitting function. Defaults to 0.

explain()¶

Export the rule set from its internal representation to the technical interchange format trxf, for the positive value (i.e. label value) it has been fitted for.

When the internal rule set is empty, an empty DNF rule set with the internal positive value is returned.

Returns:	trxf.DnfRuleSet

explain_multiclass()¶

Export the rules from their internal representation to the technical interchange format trxf. Returns a list of rule sets.

Returns:	Ordered list of rule sets
Return type:	list(trxf.DnfRuleSet)
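One common way to read an ordered list of DNF rule sets is as a decision list: the first rule set with a satisfied conjunction determines the label, and a default label applies otherwise. The sketch below illustrates that reading in plain Python. The first-match semantics, the default label, and every name here are assumptions for illustration, not taken from trxf or AIX360.

```python
from typing import Callable, Dict, List, Tuple

Record = Dict[str, object]
Condition = Callable[[Record], bool]
Dnf = List[List[Condition]]  # OR over conjunctions, AND within a conjunction


def predict_ordered(rule_sets: List[Tuple[str, Dnf]], record: Record, default: str) -> str:
    """Return the label of the first rule set that covers the record."""
    for label, dnf in rule_sets:
        if any(all(cond(record) for cond in conj) for conj in dnf):
            return label
    return default


# Hypothetical rule sets: float features compared by threshold, others by equality.
rule_sets: List[Tuple[str, Dnf]] = [
    ("spam", [
        [lambda r: r["links"] > 5.0],
        [lambda r: r["sender"] == "unknown", lambda r: r["links"] > 1.0],
    ]),
    ("promo", [[lambda r: r["has_coupon"] == "yes"]]),
]

first = predict_ordered(rule_sets, {"links": 2.0, "sender": "unknown", "has_coupon": "yes"}, "ham")
second = predict_ordered(rule_sets, {"links": 0.0, "sender": "friend", "has_coupon": "yes"}, "ham")
third = predict_ordered(rule_sets, {"links": 0.0, "sender": "friend", "has_coupon": "no"}, "ham")
print(first, second, third)
```

Under this reading the order of the list matters: the first record also satisfies the "promo" rule set but is labeled by the earlier one.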

fit(train: pandas.core.frame.DataFrame, y: pandas.core.series.Series, target_label=None)¶

The fit function for the RIPPER algorithm. Its implementation is limited to DataFrame and Series because the RIPPER algorithm needs feature names and has to support nominal data types. Only float dtypes are considered numerical features. All others (including int) are treated as nominal.

If target_label is specified, binary classification is assumed and asserted, and training uses target_label as selection of positive examples.

The induction of rules is deterministic by default as all random choices are initialized with self.random_state, which is 0 by default.

Parameters:

train (pd.DataFrame) – The features of the training set
y (pd.Series) – The labels of the training set
target_label (Any) – The target label to learn for binary classification, among the unique values of y. If not provided, RIPPER will induce a native ordered ruleset with multiple labels/conclusions.

Returns:	self

predict(X: pandas.core.frame.DataFrame) → numpy.ndarray¶

The predict function for the RIPPER algorithm. Its implementation is limited to DataFrame and Series because the RIPPER algorithm needs feature names and has to support nominal data types.

Parameters:	X (pd.DataFrame) – DataFrame of features
Returns:	predicted labels
Return type:	np.array

set_params(**kwargs)¶

Set parameters for the explainer.

target_label¶

The latest positive value RIPPER has been fitted for.

IMD Explainer¶

class aix360.algorithms.imd.imd.IMDExplainer¶

Interpretable Model Differencing to explain the similarities and differences between two classifiers. Provides access to aix360.algorithms.imd.jst.JointSurrogateTree, a novel data structure to compactly represent the differences between the models in terms of rules, and also provides a way to visualize the joint surrogate tree structure.

References

[6]	S. Haldar, D. Saha, D. Wei, R. Nair, E. M. Daly, “Interpretable Differencing of Machine Learning Models.” Uncertainty in Artificial Intelligence (UAI), 2023.

Initialize an IMDExplainer object.

explain(*argv, **kwargs)¶

Return diff-rules.

fit(X_train: pandas.core.frame.DataFrame, Y1, Y2, max_depth, split_criterion=1, alpha=0.0, verbose=True, **kwargs)¶

Fit joint surrogate tree to input data and outputs from two models.

Parameters:

X_train – input dataframe
Y1 – model1 outputs
Y2 – model2 outputs
max_depth – maximum depth of the joint surrogate tree to be built
feature_names – list of input feature names
alpha – parameter to control degree of favouring common nodes vs. separate nodes
split_criterion – which divergence criterion to use (see paper for more details)
verbose –
**kwargs –

Returns:	self

metrics(x_test: pandas.core.frame.DataFrame, y_test1, y_test2, name='test')¶

Compute the precision and recall of the diff region on x_test.

precision = number of actual diff samples inside the diff region / number of test samples inside the diff region

recall = number of diff samples inside the diff region / total number of diff samples

Parameters:

x_test – test data (only x) to compute diff-based metrics
name – string (train or test)

Returns:	a dictionary having precision, recall, num-rules, and num-unique-preds values as obtained from the diff-rules extracted from the jst.
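The precision and recall definitions above can be written out as a short pure-Python sketch. It assumes a sample counts as a diff sample when the two models' outputs disagree, and that membership in the diff region is already known per sample. The helper `diff_metrics`, the zero fallback for empty denominators, and the data are hypothetical and not the library's implementation.

```python
from typing import Dict, List


def diff_metrics(in_region: List[bool], y1: List[int], y2: List[int]) -> Dict[str, float]:
    """Precision and recall of a diff region against true model disagreements."""
    is_diff = [a != b for a, b in zip(y1, y2)]
    n_inside = sum(in_region)
    n_diff_inside = sum(r and d for r, d in zip(in_region, is_diff))
    n_diff_total = sum(is_diff)
    # Fallback of 0.0 on an empty denominator is a convention chosen here.
    precision = n_diff_inside / n_inside if n_inside else 0.0
    recall = n_diff_inside / n_diff_total if n_diff_total else 0.0
    return {"precision": precision, "recall": recall}


# Hypothetical outputs of two models on five test samples.
y1 = [0, 1, 1, 0, 1]
y2 = [1, 1, 0, 0, 1]
# Hypothetical flags marking which samples fall inside the diff region.
in_region = [True, True, False, True, False]

result = diff_metrics(in_region, y1, y2)
print(result)
```

Here one of the three samples inside the region is a true disagreement, and one of the two disagreements overall is captured by the region.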

predict(X, *argv, **kwargs)¶

Predict diff-labels.

set_params(*argv, **kwargs)¶

Set parameters for the explainer.