XAI/MLI Interpretable Machine Learning Series, Part 1: Open-Source Libraries & Paper Roundup


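I have been following the interpretability field for a while, because it has plenty of real applications in my work:

Model validation: whether feature importance matches expectations matters as much as AUC.
Model explanation: compared with abstract model metrics, explaining the rules the model has learned does more to convince business stakeholders.
Sample explanation: why will these users default, and is there an indicator that could give an early warning?
Decision attribution: sometimes the model is only a way to extract patterns, and what is ultimately needed is an attribution or a decision. For example, could combining HTE models with XAI be one way to put this into practice?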

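I got hooked in 2018 by the interpretable machine learning engine offered by H2O Driverless AI (figure below), and have been interested in this field ever since. But the more I used XAI, the more problems surfaced: a slight tweak to the features can turn the entire feature explanation upside down, and a model that performs very well can give feature explanations that make no sense at all. After coming across causal inference, though, I hope to look at XAI from a different angle, so I am picking this series back up (enter at your own risk: this is a hole I started digging in 2018 and still have not filled)~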

Algo & paper

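Only one open-source library is listed per algorithm. Most are the original author's or ones I have used myself, not necessarily the ones with the most stars. If you know a better source, please leave a comment~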

Algorithm	Paper	GitHub
Permutation Importance	【1】	eli5
Feature Importance	several calculation methods exist【2】	built into LGB/XGB/sklearn
Surrogate Model	【3】	h2o.ai
Local Interpretable Model-agnostic Explanations (LIME)	【4】	lime
Leave One Covariate Out (LOCO)	【5】	h2o.ai
Individual Conditional Expectation (ICE)	【6】	PDPbox
Partial Dependence Plot (PDP)	【7】	PDPbox
Shapley/SHAP	【8】【9】【10】	shap
DeepLift	【11】	deeplift
Layerwise Relevance Propagation (LRP)	【12】	LRP demo
Integrated Gradients	【13】	Integrated-Gradients

【1】Breiman, 2001, Random Forests

【2】There are many methods; see the xgb/lgb documentation.

【3】Osbert Bastani, Carolyn Kim, and Hamsa Bastani, 2017. Interpreting Blackbox Models via Model Extraction.

【4】Ribeiro, Marco Tulio, Sameer Singh, and Carlos Guestrin. Why should I trust you?: Explaining the predictions of any classifier. 2016

【5】Jing Lei, Max G’Sell, Alessandro Rinaldo, Ryan J. Tibshirani, and Larry Wasserman, 2016, Distribution-Free Predictive Inference For Regression

【6】Goldstein, Alex, et al, 2015, Peeking inside the black box: Visualizing statistical learning with plots of individual conditional expectation.

【7】J. H. Friedman, 2001, Greedy function approximation: a gradient boosting machine

【8】Lundberg, Scott M., and Su-In Lee, 2017. A unified approach to interpreting model predictions

【9】Lundberg, Scott M., Gabriel G. Erion, and Su-In Lee, 2018. Consistent individualized feature attribution for tree ensembles.

【10】Sundararajan, Mukund, and Amir Najmi, 2019, The many Shapley values for model explanation

【11】Avanti Shrikumar, Peyton Greenside, and Anshul Kundaje, 2017. Learning important features through propagating activation differences

【12】Sebastian Bach, Alexander Binder, Grégoire Montavon, Frederick Klauschen, Klaus-Robert Müller, and Wojciech Samek, 2015. On pixel-wise explanations for non-linear classifier decisions by layer-wise relevance propagation

【13】Mukund Sundararajan, Ankur Taly, and Qiqi Yan, 2017. Axiomatic attribution for deep networks
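To make the first entry in the table concrete, here is a minimal from-scratch sketch of Permutation Importance in the spirit of 【1】: shuffle one feature column at a time and record how much the model's error grows. This is not the eli5 implementation. The toy model, the toy data, and the helper names (`mse`, `permutation_importance`) are all assumptions made for illustration.

```python
import random


def mse(model, X, y):
    """Mean squared error of `model` (a function row -> prediction) on (X, y)."""
    return sum((model(row) - target) ** 2 for row, target in zip(X, y)) / len(y)


def permutation_importance(model, X, y, n_repeats=10, seed=0):
    """Average increase in MSE after shuffling each feature column.

    A larger increase means the model relies more on that feature.
    """
    rng = random.Random(seed)
    baseline = mse(model, X, y)
    importances = []
    for j in range(len(X[0])):
        increases = []
        for _ in range(n_repeats):
            # Shuffle column j only; every other column stays untouched.
            column = [row[j] for row in X]
            rng.shuffle(column)
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            increases.append(mse(model, X_perm, y) - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances


# Hypothetical toy setup: the target depends on feature 0 only,
# and the "trained" model has learned exactly that.
data_rng = random.Random(42)
X = [[data_rng.uniform(-1, 1), data_rng.uniform(-1, 1)] for _ in range(200)]
y = [3.0 * row[0] for row in X]


def toy_model(row):
    return 3.0 * row[0]


importances = permutation_importance(toy_model, X, y)
print(importances)  # feature 0 gets a positive score, feature 1 gets zero
```

Because the function only needs a callable that maps a row to a prediction, the same sketch can be pointed at any fitted model by wrapping its predict call, which is what makes the method model-agnostic.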

Tutorial

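The tutorials below cover the algorithms above to varying degrees. For work from the last couple of years, you will have to read the papers.

I recommend the first one, which is reportedly a compilation of work from a 2019 LMU student seminar. It brings in causal concepts to analyze the situations in which XAI will "cheat". Most points are only touched on without going deep, but the pitfalls it flags come up in practice quite often >_<. One sentence stuck with me: an interpretability method explains what the model has learned, not how the real-world data behaves.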
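That last point can be illustrated with a small ICE/PDP sketch. The example is constructed for this post and is not taken from the tutorial: in the assumed data-generating process the target depends on `x0`, `x1` is a near copy of `x0`, and the hypothetical fitted model happens to rely on `x1` alone. The helper names `ice_curves` and `partial_dependence` are also assumptions, not the PDPbox API.

```python
import random


def ice_curves(model, X, feature, grid):
    """ICE: one curve per row, sweeping `feature` over `grid` with the rest fixed."""
    curves = []
    for row in X:
        curve = []
        for value in grid:
            modified = list(row)
            modified[feature] = value
            curve.append(model(modified))
        curves.append(curve)
    return curves


def partial_dependence(model, X, feature, grid):
    """PDP: the pointwise average of the ICE curves."""
    curves = ice_curves(model, X, feature, grid)
    return [sum(curve[k] for curve in curves) / len(curves) for k in range(len(grid))]


# Assumed data-generating process: y = 2 * x0, and x1 is a noisy copy of x0.
rng = random.Random(0)
X = []
for _ in range(100):
    x0 = rng.uniform(-1, 1)
    x1 = x0 + rng.gauss(0, 0.01)
    X.append([x0, x1])


def learned_model(row):
    # Hypothetical fitted model: it fits the data almost equally well,
    # but it picked up the proxy x1 instead of the true driver x0.
    return 2.0 * row[1]


grid = [-1.0, 0.0, 1.0]
pdp_x0 = partial_dependence(learned_model, X, 0, grid)
pdp_x1 = partial_dependence(learned_model, X, 1, grid)
print(pdp_x0)  # flat: the model ignores x0
print(pdp_x1)  # slope 2: the model relies on x1
```

The PDP for `x0` is flat even though `x0` drives the target in the assumed data. The plot is a faithful description of the model, and a misleading one of the data.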

Limitations of Interpretable Machine Learning Methods
Interpretable Machine Learning, A Guide for Making Black Box Models Explainable.
O'Reilly, Ideas on interpreting machine learning
Kaggle, Machine Learning Explainability
H2O AI, An-Introduction-to-Machine-Learning-Interpretability-Second-Edition
MLI-source
h2o.ai interpretable_machine_learning_with_python
h2o.ai awesome-machine-learning-interpretability

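The hard part of XAI is not understanding the algorithms themselves. It is knowing when an algorithm will fail once it meets real data, and how to track down the cause when an explanation does not match expectations. Put bluntly, it means looking for rules in something close to metaphysics... So in later posts we will pick a dataset and try things out.

To be continued~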
