Extracting Feature Importances from a Classifier Model

How feature importances aid your classification projects

Aleksandar Gakovic
4 min read · Sep 26, 2020
Photo by Joshua Sortino on Unsplash

During the final weeks of my Flatiron Data Science Bootcamp, I worked on a classification machine learning project with a colleague. We experimented with a few different models, learning to optimise their hyperparameters and output their predictions. We settled, finally, on a Random Forest ensemble model. It wasn't until a month after the project, when I was asked if I knew the feature importances of the final model, that I realised I had completely forgotten to extract them!

Extracting the feature importance values from your classifier is useful in several ways. It can provide business intelligence, creating a deeper understanding of the business and driving exploration into specific areas. It can improve the explainability and interpretability of the model, and it can be used for feature selection that strengthens the final model's performance.

In this article you will learn:

  • How to extract Feature Importances from different classifier models
  • A way to arrange the values in a readable output
  • About dimensionality reduction, and how feature importances aid feature selection

Extracting Feature Importances

The method of extracting feature importances from a random forest ensemble model is the same for stochastic gradient boosting models, decision tree algorithms, and other classification and regression trees (CART).

Once Sklearn has been imported, the data cleaned, and the model instantiated and fit on the training data, model.feature_importances_ is what you need.

# Import the classifier:
from sklearn.ensemble import RandomForestClassifier

# Instantiate the classifier model:
rf_clf = RandomForestClassifier()
# Fit the model to the training data:
rf_clf.fit(X_train, y_train)
# Extract the feature importances from the fitted model:
importance = rf_clf.feature_importances_

The method for extracting from linear and logistic regression models is a little bit different.

# Linear regression (a fitted LinearRegression called model):
importance = model.coef_
# Logistic regression (coef_ is 2-D, so take the first row):
importance = model.coef_[0]
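
For completeness, here is a minimal sketch of fitting a logistic regression on the same training data and pulling out its coefficients. It assumes a binary target; remember that coefficient magnitudes are only comparable across features when the features are on a similar scale.

# Fit a logistic regression on the same training data:
from sklearn.linear_model import LogisticRegression

log_clf = LogisticRegression(max_iter=1000)
log_clf.fit(X_train, y_train)

# For a binary target, coef_ has shape (1, n_features),
# so coef_[0] gives one coefficient per feature:
importance = log_clf.coef_[0]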

A note about what the values represent:

Feature importance values for linear regression and logistic regression actually stem from the coefficient values used in the weighted sum for a prediction. The coefficients stand in as feature importances.

However, with CART and ensemble methods the feature importances are not as clearly defined: they represent how much each feature contributes to the model's decisions. ¹
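
As a quick sanity check on a fitted tree ensemble such as the rf_clf above, scikit-learn's impurity-based importances are non-negative and normalised to sum to one, which is why they can be read as relative contributions:

# Impurity-based importances from the fitted forest are non-negative
# and normalised, so they sum to (approximately) 1:
print(rf_clf.feature_importances_.sum())   # ~1.0
print(rf_clf.feature_importances_.min())   # >= 0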

Arranging the Values in a Readable Output

# Pair each feature index with its importance value:
imp_list = []
for i, v in enumerate(importance):
    imp_list.append((i, v))

# Sort by importance value, largest first:
sorted_list = sorted(imp_list, key=lambda x: x[1], reverse=True)

print('The top 10 feature importances are:\n')
for f, i in sorted_list[:10]:
    print(f'{X_train.columns[f]}, Score: {round(i, 3)}')
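
If you prefer pandas, the same ranking can be produced in one chained expression (a sketch assuming X_train is a DataFrame with named columns):

# Rank importances by value and label them with the column names:
import pandas as pd

top_ten = (pd.Series(importance, index=X_train.columns)
             .sort_values(ascending=False)
             .head(10))
print(top_ten)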

You can also plot the importances quite easily:

# Plot all importances:
import matplotlib.pyplot as plt

plt.bar([x for x in range(len(importance))], importance)
plt.ylim(0, 0.08)
plt.show()
All feature importances

I use list comprehensions to obtain tick labels and height values for the bar chart below. I explain them and other Pythonic refactoring methods in this article.

# Plot the top ten importances:
plt.bar(range(1, 11), [x[1] for x in sorted_list[:10]],
        tick_label=[x[0] for x in sorted_list[:10]],
        color=(0.2, 0.5, 0.7, 0.6))
plt.title('Top Ten Feature Importances')
plt.xlabel('Feature Number')
plt.ylabel('Feature Importance Value')
plt.show()
Top ten feature importances

How Do Feature Importances Aid Feature Selection?

Simply put dimensionality reduction “is the transformation of data from a high-dimensional space into a low-dimensional space so that the low-dimensional representation retains some meaningful properties of the original data, ideally close to its intrinsic dimension”. ²

One way of achieving dimensionality reduction is feature selection: the process of choosing the features of the data that are most relevant to the model performing well. Doing so makes the data input to the model less noisy, lets the model train faster, simplifies the model for interpretation, and improves generalisation to unseen data.

Sklearn provides SelectFromModel, a feature selector that keeps the features of a fitted model with high feature importance values. It is imported like so:

from sklearn.feature_selection import SelectFromModel

Using this feature selection method, you fit the object to the training dataset and apply it as a transform to select however many of the most important features you want. The same transform should be applied to both the training and the test dataset.
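
Here is a minimal sketch, assuming the X_train, X_test, and y_train objects from earlier; threshold='median' is an illustrative choice that keeps the features above the median importance.

# Fit the selector on the training data only:
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectFromModel

selector = SelectFromModel(RandomForestClassifier(), threshold='median')
selector.fit(X_train, y_train)

# Apply the same transform to both the training and test sets:
X_train_selected = selector.transform(X_train)
X_test_selected = selector.transform(X_test)

# The columns that survived the selection:
print(X_train.columns[selector.get_support()])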

Conclusion

Feature importances are integral to a data science classification project. They enable further exploration into features and help develop business understanding. With them you can simplify feature selection, helping to reduce dimensionality and keep the data input to the model relevant.

Thank you for reading and if you enjoyed the article or it helped, consider leaving a clap.

References

  1. Interpreting feature importance values from a RandomForest Classifier — lejlot, Stack Overflow
  2. Dimensionality Reduction: A Comparative Review — paper
  3. Beautiful Pythonic Refactoring — Aleksandar Gakovic, article
  4. Sklearn — homepage
  5. SelectFromModel — sklearn docs
  6. How to calculate feature importances — Jason Brownlee

Aleksandar Gakovic

Practicing Data Scientist. Interested in Games, Gamification, Ocean Sciences, Music, Biology.