Basic Usage Examples
This section provides basic usage examples for the ACFX package. These examples demonstrate how to initialize the ACFX class to generate counterfactual explanations.
Initializing ACFX
To begin using ACFX, first import the class and create an instance. This example uses AcfxEBM, which requires an ExplainableBoostingClassifier as its blackbox:
from acfx import AcfxEBM
from interpret.glassbox import ExplainableBoostingClassifier
model = ExplainableBoostingClassifier()
explainer = AcfxEBM(model)
Prepare the data
Prepare some sample data for counterfactual generation:
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
def sample_data():
    data = load_iris(as_frame=True)
    X = data.data
    y = data.target
    return train_test_split(X, y, test_size=0.2, random_state=42)
X_train, X_test, y_train, y_test = sample_data()
Prepare bounds
Compute per-feature bounds for counterfactual generation, here the minimum and maximum of each column in the training data:
pbounds = {col: (X_train[col].min(), X_train[col].max()) for col in X_train.columns}
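To make the structure of pbounds concrete, here is a minimal, dependency-free sketch on a tiny hand-made table. The values are made up for illustration, and within_bounds is a hypothetical helper, not part of ACFX.

```python
# Toy column data standing in for X_train (values are made up for illustration).
columns = {
    "sepal length (cm)": [5.1, 4.9, 6.3],
    "petal length (cm)": [1.4, 1.5, 4.7],
}

# Same structure as pbounds: feature name -> (lower bound, upper bound).
toy_bounds = {name: (min(values), max(values)) for name, values in columns.items()}


def within_bounds(instance, bounds):
    """Hypothetical helper: True if every bounded feature lies inside its range."""
    return all(lo <= instance[name] <= hi for name, (lo, hi) in bounds.items())


inside = within_bounds({"sepal length (cm)": 5.0, "petal length (cm)": 2.0}, toy_bounds)
outside = within_bounds({"sepal length (cm)": 7.5, "petal length (cm)": 2.0}, toy_bounds)
```

A check like this can be useful before explaining an instance, since a query point outside the training range also lies outside the bounds defined here.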
Prepare adjacency matrix and causal order
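Prepare an example adjacency matrix for counterfactual generation. It can encode expert knowledge or be generated by a tool such as DirectLiNGAM. In the helper below, a non-zero entry at row i, column j is read as feature j being a direct cause of feature i.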
import networkx as nx
import numpy as np
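def get_causal_order(adjacency_matrix):
    # Build a directed graph in which an edge j -> i means feature j causes feature i
    G = nx.DiGraph()
    n = adjacency_matrix.shape[0]
    # Register every feature so that nodes without edges still appear in the order
    G.add_nodes_from(range(n))
    for i in range(n):
        for j in range(n):
            if adjacency_matrix[i, j] != 0:
                G.add_edge(j, i)
    # A topological sort places every cause before its effects
    causal_order = list(nx.topological_sort(G))
    return causal_order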
adjacency_matrix = np.array([
    [0.0, 0.0, 0.0, 0.0],
    [0.8, 0.0, 0.0, 0.0],
    [0.0, 0.6, 0.0, 0.0],
    [0.5, 0.0, 0.7, 0.0]
])
causal_order = get_causal_order(adjacency_matrix)
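When the matrix comes from expert knowledge, it can help to check that a given causal order agrees with it. The sketch below is a hypothetical helper, not part of ACFX. It assumes the same convention as get_causal_order (a non-zero entry at row i, column j means feature j causes feature i) and uses plain nested lists so that it runs without extra dependencies.

```python
def is_valid_causal_order(adjacency, order):
    """Hypothetical helper: check that every cause precedes its effect.

    Assumes a non-zero adjacency[i][j] means feature j is a direct
    cause of feature i.
    """
    n = len(adjacency)
    # The order must mention every feature index exactly once.
    if sorted(order) != list(range(n)):
        return False
    position = {node: rank for rank, node in enumerate(order)}
    return all(
        position[j] < position[i]
        for i in range(n)
        for j in range(n)
        if adjacency[i][j] != 0
    )


example_matrix = [
    [0.0, 0.0, 0.0, 0.0],
    [0.8, 0.0, 0.0, 0.0],
    [0.0, 0.6, 0.0, 0.0],
    [0.5, 0.0, 0.7, 0.0],
]
forward_ok = is_valid_causal_order(example_matrix, [0, 1, 2, 3])
reversed_ok = is_valid_causal_order(example_matrix, [3, 2, 1, 0])
```

For the example matrix the edges form the chain 0 -> 1 -> 2 -> 3 (plus 0 -> 3), so the forward order passes the check and the reversed order fails it.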
(Alternatively) prepare adjacency matrix using external tools
Alternatively, a causal discovery tool such as DirectLiNGAM can estimate the adjacency matrix and the causal order from the training data:
import lingam
causal_model = lingam.DirectLiNGAM()
causal_model.fit(X_train)
adjacency_matrix = causal_model.adjacency_matrix_
causal_order = causal_model.causal_order_
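Both approaches yield the causal order as a list of column indices, which is harder to read than feature names. The sketch below translates indices into names. order_to_names is a hypothetical helper, and the order shown is illustrative, not an actual DirectLiNGAM result.

```python
def order_to_names(causal_order, feature_names):
    """Hypothetical helper: translate column indices into feature names."""
    return [feature_names[i] for i in causal_order]


iris_features = [
    "sepal length (cm)",
    "sepal width (cm)",
    "petal length (cm)",
    "petal width (cm)",
]
# An illustrative order, not an actual DirectLiNGAM result.
named_order = order_to_names([2, 0, 3, 1], iris_features)
```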
Fit to initialize the model
Pass the prepared data, bounds, adjacency matrix, and causal order to fit:
features_order = X_train.columns.tolist()
explainer.fit(X=X_train, adjacency_matrix=adjacency_matrix, causal_order=causal_order, pbounds=pbounds,
              y=y_train, features_order=features_order)
Generate Counterfactuals
To generate counterfactual explanations for a given instance:
query_instance = X_test.iloc[0].values
original_class = model.predict([query_instance])[0]
cf = explainer.counterfactual(desired_class=original_class, query_instance=query_instance)
print(cf)
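A counterfactual is easiest to read as a list of changes relative to the query instance. The return type of counterfactual is not described in this section, so the sketch below assumes the result can be flattened into a sequence of feature values in the same order as features_order. changed_features is a hypothetical helper and the numbers are made up.

```python
import math


def changed_features(original, counterfactual, feature_names, tol=1e-9):
    """Hypothetical helper: map each changed feature to (old value, new value)."""
    return {
        name: (old, new)
        for name, old, new in zip(feature_names, original, counterfactual)
        if not math.isclose(old, new, abs_tol=tol)
    }


names = ["sepal length (cm)", "sepal width (cm)", "petal length (cm)", "petal width (cm)"]
original = [6.1, 2.8, 4.7, 1.2]        # made-up query instance
counterfactual = [6.1, 2.8, 5.3, 1.9]  # made-up counterfactual
delta = changed_features(original, counterfactual, names)
```

Here delta contains only the two petal features, because the sepal values are identical in both instances.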
Using custom blackbox
You can use ACFX with a custom blackbox. To do so, you need to provide an optimizer that is compatible with that blackbox.
Example custom optimizer
Below is an example of a custom, model-agnostic counter optimizer:
from typing import Dict, Tuple
from acfx.abstract import ModelBasedCounterOptimizer
import numpy as np
import pandas as pd
from overrides import overrides
class SomeCustomCounterOptimizer(ModelBasedCounterOptimizer):
    def __init__(self, model, X: pd.DataFrame, feature_bounds: Dict[str, Tuple[float, float]], n_iter: int = 100):
        if not hasattr(model, 'predict_proba'):
            raise AttributeError("Model must implement predict_proba()")
        self.model = model
        self.X = X
        self.feature_bounds = feature_bounds
        self.n_iter = n_iter

    @overrides
    def optimize_proba(self, target_class: int, feature_masked: list[str]) -> Dict[str, float]:
        # Start from the feature-wise mean of X and score it for the target class
        base_instance = self.X.mean().copy()
        best_instance = base_instance.copy()
        best_score = self.model.predict_proba([base_instance])[0][target_class]
        # Random search: resample the features listed in feature_masked within their bounds
        for _ in range(self.n_iter):
            candidate = base_instance.copy()
            for feature_name in self.X.columns:
                if feature_name in feature_masked and feature_name in self.feature_bounds:
                    min_val, max_val = self.feature_bounds[feature_name]
                    candidate[feature_name] = np.random.uniform(min_val, max_val)
            score = self.model.predict_proba([candidate])[0][target_class]
            # Keep the candidate with the highest target-class probability so far
            if score > best_score:
                best_score = score
                best_instance = candidate.copy()
        return best_instance.to_dict()
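The search strategy inside optimize_proba can be seen in isolation, without ACFX or a trained model. The sketch below is a simplified stand-in, not the library's implementation. random_search and toy_score are hypothetical names, toy_score plays the role of predict_proba(...)[0][target_class], and a seeded generator is used so that runs are repeatable.

```python
import random


def random_search(score, base, bounds, free_features, n_iter=100, seed=0):
    """Resample only `free_features` within `bounds`; keep the best-scoring candidate."""
    rng = random.Random(seed)
    best, best_score = dict(base), score(base)
    for _ in range(n_iter):
        # Every candidate starts from the same base instance.
        candidate = dict(base)
        for name in free_features:
            if name in bounds:
                lo, hi = bounds[name]
                candidate[name] = rng.uniform(lo, hi)
        candidate_score = score(candidate)
        if candidate_score > best_score:
            best, best_score = candidate, candidate_score
    return best, best_score


def toy_score(instance):
    """Toy stand-in for a class probability: highest when x is close to 3."""
    return 1.0 / (1.0 + (instance["x"] - 3.0) ** 2)


base = {"x": 0.0, "y": 1.0}
bounds = {"x": (0.0, 5.0), "y": (0.0, 5.0)}
best, best_score = random_search(toy_score, base, bounds, free_features=["x"])
```

Only x is resampled, so y keeps its base value. The returned score can never be lower than the score of the base instance, because the base is the initial best candidate.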
…and an example of fitting the ACFX explainer with this custom counter optimizer:
from acfx import AcfxCustom
from sklearn.ensemble import RandomForestClassifier
model = RandomForestClassifier(n_estimators=100)
model.fit(X_train, y_train)
feature_masked = ["sepal width (cm)"]
optimizer = SomeCustomCounterOptimizer(model, X_test, pbounds)
explainer = AcfxCustom(model)
explainer.fit(X=X_train, adjacency_matrix=adjacency_matrix, causal_order=causal_order, pbounds=pbounds,
              features_order=features_order, optimizer=optimizer, masked_features=feature_masked)