Model-agnostic feature importance through ablation

Feature importances are, well, important. They give us a rudimentary level of interpretability: the higher a feature's importance, the more the model relies on it when predicting the target variable. Some machine learning models have an innate way of calculating feature importance (decision trees, for instance). Others do not (for example, support vector machines using an RBF kernel). Further, some models produce a set of coefficients (like linear regression) that are easy to misinterpret, e.g. when two features have dramatically different scales.
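To make the point about scales concrete, here is a small sketch on synthetic data. The two features and their units are made up for illustration: both contribute equally to the target, but one is recorded on a scale 1000 times larger.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
n = 500

# Two hypothetical features that drive the target equally, but one is
# recorded in units 1000 times larger than the other.
small = rng.uniform(0, 1, size=n)
large = rng.uniform(0, 1000, size=n)
y = 3.0 * small + 3.0 * (large / 1000) + rng.normal(0, 0.01, size=n)

X = np.column_stack([small, large])
coef_small, coef_large = LinearRegression().fit(X, y).coef_

# The raw coefficient of the large-scale feature is far smaller,
# even though both features contribute equally to y.
print(coef_small, coef_large)
```

Reading the raw coefficients as importances would wrongly suggest that the first feature matters far more than the second.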

Feature ablation is a technique for calculating feature importances that works for all machine learning models. Given a dataset of n rows and m features, the procedure goes like this:

  1. Train the model on your train set and calculate a score on the test set. You can pick whatever scoring metric you like.
  2. For each of the m features, remove it from both the training and test data, retrain the model on the remaining m - 1 features, and calculate the score on the test set.
  3. Rank the features by the difference between the original score (from the model with all features) and the score for the model using all features but one. The larger the drop in score, the more important the removed feature.
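The three steps above can be sketched as a reusable function. This is a minimal sketch rather than a definitive implementation: ablation_importances is a hypothetical helper, and the Ridge model, the r2_score metric, and the synthetic data are stand-ins chosen only to make it runnable.

```python
import numpy as np
from sklearn.base import clone
from sklearn.linear_model import Ridge
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split


def ablation_importances(estimator, scorer, X_train, X_test, y_train, y_test):
    """Return base_score - ablated_score for each feature (hypothetical helper)."""

    def fit_and_score(columns):
        model = clone(estimator)  # fresh, unfitted copy for every run
        model.fit(X_train[:, columns], y_train)
        return scorer(y_test, model.predict(X_test[:, columns]))

    n_features = X_train.shape[1]
    all_columns = np.arange(n_features)
    base_score = fit_and_score(all_columns)          # step 1
    return np.array([
        base_score - fit_and_score(np.delete(all_columns, i))  # step 2
        for i in range(n_features)
    ])


# Synthetic regression data: only the first two of six features matter.
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 6))
y = 5.0 * X[:, 0] + 2.0 * X[:, 1] + rng.normal(scale=0.5, size=300)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

importances = ablation_importances(Ridge(), r2_score,
                                   X_train, X_test, y_train, y_test)
ranking = np.argsort(importances)[::-1]  # step 3: most important first
print(ranking)
```

Any estimator with fit and predict, and any metric that takes (y_true, y_pred) and is higher when better, should slot into this sketch.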

Example code

Here's an example of how we could actually perform this procedure in Python using scikit-learn.

First, we import a few things: load_digits loads the digits dataset, SVC is the model we'll use, train_test_split is a utility function that splits the dataset into training and testing portions, and sklearn.metrics provides many pre-defined metrics.

from sklearn.datasets import load_digits
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split
import sklearn.metrics as mx

We load and split the data.

data = load_digits()
X_train, X_test, y_train, y_test = train_test_split(data.data, data.target, test_size=0.2, random_state=42)

Now we define a function which will train and score a model for us. Given the data, it creates and trains a support vector machine, then returns the accuracy. Finally, we store the score of our model with all features into base_score.

def score_model(X_train, X_test, y_train, y_test):
    clf = SVC(gamma='scale', kernel='rbf')
    clf.fit(X_train, y_train)
    y_pred = clf.predict(X_test)
    return mx.accuracy_score(y_test, y_pred)

base_score = score_model(X_train, X_test, y_train, y_test)

Then, we iterate through all features, building a boolean mask use_column that selects every column except the one we're currently scoring. We store the score of each ablated model in the list scores.

scores = []

for i in range(X_train.shape[1]):
    # Boolean mask that keeps every column except column i
    use_column = [ndx != i for ndx in range(X_train.shape[1])]
    scores.append(score_model(X_train[:, use_column],
                              X_test[:, use_column],
                              y_train, y_test))

Finally, we get the top 10 features.

sorted(enumerate([base_score - s for s in scores]),
       key=lambda ndx_score: ndx_score[1],
       reverse=True)[:10]

[(12, 0.005555555555555647),
 (21, 0.005555555555555647),
 (5, 0.002777777777777879),
 (10, 0.002777777777777879),
 (17, 0.002777777777777879),
 (18, 0.002777777777777879),
 (20, 0.002777777777777879),
 (34, 0.002777777777777879),
 (37, 0.002777777777777879),
 (46, 0.002777777777777879)]
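The score differences above take only two distinct values, which is easier to interpret once we account for the size of the test set. As a quick sanity check, assuming the digits dataset has 1,797 samples and that train_test_split rounds the test portion up (both are assumptions, not stated above), we can compute how much accuracy a single test prediction is worth:

```python
import math

n_samples = 1797                      # assumed size of the digits dataset
n_test = math.ceil(0.2 * n_samples)   # assumed rounding for test_size=0.2
per_sample = 1 / n_test               # accuracy change from one prediction

print(n_test, per_sample, 2 * per_sample)
```

Under these assumptions, removing either of the top two features costs two extra misclassifications and removing any of the others costs one, so the ranking within each group is effectively a tie.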

Relation to stepwise regression

You may recognize this idea as being similar to backward stepwise regression. Wasserman (2005) describes this technique for model selection as "we start with the biggest model and drop one variable at a time" (p. 221). We drop variables until the score has decreased beyond some acceptable level or until we have reached the desired number of features. He notes that this is a greedy search and is not "guaranteed to find the model with the best score." If we applied feature ablation repeatedly, dropping the least important feature each round (much as scikit-learn's recursive feature elimination does with a model's native coefficients or importances), we would be performing backward stepwise regression.
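A greedy backward search built on ablation might look like the sketch below. It is an illustration under stated assumptions: backward_elimination is a hypothetical helper, it uses only the "desired number of features" stopping rule, the model and data are synthetic stand-ins, and features are scored on a validation split carved out of the available data rather than on a final test set.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split


def backward_elimination(X_fit, X_val, y_fit, y_val, n_keep):
    """Greedily drop one feature per round until n_keep remain (hypothetical helper)."""

    def score(columns):
        model = LinearRegression().fit(X_fit[:, columns], y_fit)
        return r2_score(y_val, model.predict(X_val[:, columns]))

    kept = list(range(X_fit.shape[1]))
    while len(kept) > n_keep:
        # Ablate each remaining feature in turn and score what is left.
        candidates = [[c for c in kept if c != dropped] for dropped in kept]
        # Keep the best-scoring subset, i.e. drop the feature whose
        # removal hurts the score the least.
        kept = max(candidates, key=score)
    return kept


# Synthetic data: only the first two of six features matter.
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 6))
y = 5.0 * X[:, 0] + 2.0 * X[:, 1] + rng.normal(scale=0.5, size=300)
X_fit, X_val, y_fit, y_val = train_test_split(X, y, random_state=0)

selected = backward_elimination(X_fit, X_val, y_fit, y_val, n_keep=2)
print(selected)
```

Because each round commits to a single drop, the search never revisits earlier decisions, which is exactly the greediness Wasserman warns about.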

If you do decide to apply stepwise regression, be careful with the test set used to evaluate the features. If you choose features that optimize the score on the test set, you are overfitting to the test set, and any metrics calculated on it will be optimistically biased. If performing stepwise regression, I would recommend splitting the training set into 5 folds and using cross-validation to select features. After that process, metrics calculated on the test set remain valid, because the test set was not used during training or feature selection.
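One way to follow this advice is scikit-learn's SequentialFeatureSelector, which performs a backward search scored by cross-validation on the training set only. The sketch below is one possible setup, not the only one: the linear model, the synthetic data, and the choice to keep two features are assumptions made for illustration.

```python
import numpy as np
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Synthetic data: only the first two of six features matter.
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 6))
y = 5.0 * X[:, 0] + 2.0 * X[:, 1] + rng.normal(scale=0.5, size=300)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Backward selection, scored with 5-fold cross-validation on the
# training set. The test set is never seen during selection.
selector = SequentialFeatureSelector(
    LinearRegression(),
    n_features_to_select=2,
    direction="backward",
    cv=5,
)
selector.fit(X_train, y_train)
selected = np.flatnonzero(selector.get_support())

# Only now do we touch the test set, for a single final evaluation.
final_model = LinearRegression().fit(X_train[:, selected], y_train)
test_score = final_model.score(X_test[:, selected], y_test)
print(selected, test_score)
```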


This technique provides a general way to calculate feature importances for any classification or regression model (even those that don't natively support them). It's also an element of a feature selection technique called stepwise regression.

Comments? Questions? Concerns? Please tweet me @SamuelDataT or email me (sgt at this domain). Thanks!