Sklearn Pipeline with KernelExplainer and data to predict as DataFrame leads to error

I want to calculate SHAP values for a sklearn pipeline that consists of a preprocessor and a model. When I do it with the code below, I get 0 for all shap_values.

import pandas as pd
import shap


def create_shap_data(pipeline, X_test: pd.DataFrame):
    """
    Create a dataframe with the features and the SHAP values
    """

    def model_predict(data_as_array):
        # KernelExplainer passes a NumPy array, so rebuild the DataFrame the pipeline expects
        data_asframe = pd.DataFrame(data_as_array, columns=X_test.columns)
        return pipeline.predict(data_asframe)

    # X_test is used both as the background data and as the data to explain
    explainer = shap.KernelExplainer(model_predict, X_test)
    shap_values = explainer.shap_values(X_test)
    shap.force_plot(explainer.expected_value, shap_values, X_test)

I want to run it for different models, so the model in the pipeline can vary. The input data can also be different. I don't really need the plot; I just want to make sure I have the right values.
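
Since the goal is to confirm the values rather than to plot them, one check that could help is the additivity property: for each explained row, the SHAP values plus explainer.expected_value should approximately reproduce the output of the explained function. It follows that all-zero SHAP values mean the function returned the baseline value for every row, and when the same data serves as background and as the data to explain, that implies a constant output. The following is a minimal sketch of such a check, assuming a one-dimensional numeric model output. The helper names check_additivity and output_is_constant and the demo numbers are made up for illustration and are not part of shap.

```python
import numpy as np


def check_additivity(shap_values, expected_value, predictions, atol=1e-6):
    """Return True if expected_value + sum of SHAP values matches each prediction."""
    shap_values = np.asarray(shap_values, dtype=float)
    # Assumes numeric predictions; string class labels would need encoding first.
    predictions = np.asarray(predictions, dtype=float)
    reconstructed = expected_value + shap_values.sum(axis=1)
    return bool(np.allclose(reconstructed, predictions, atol=atol))


def output_is_constant(predictions):
    """Return True if the explained function produced a single distinct value."""
    return np.unique(np.asarray(predictions)).size <= 1


# Made-up numbers standing in for real explainer output (2 rows, 2 features).
demo_shap_values = [[0.2, -0.1], [0.0, 0.3]]
demo_expected_value = 0.5
demo_predictions = [0.6, 0.8]

additive = check_additivity(demo_shap_values, demo_expected_value, demo_predictions)
constant = output_is_constant(demo_predictions)
```

Inside create_shap_data, the same helpers could be called with shap_values, explainer.expected_value, and model_predict(X_test.to_numpy()). If output_is_constant returns True for those predictions, zero SHAP values would be the expected result rather than a bug in the explainer.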

When I use SGDClassifier with a preprocessor, the values I get are all 0.
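
One thing that may be worth ruling out here, offered as an assumption and not a confirmed diagnosis: when a DataFrame with mixed column types is converted to a NumPy array, the array has dtype object, and a DataFrame rebuilt from it has object columns instead of the original numeric ones. A preprocessor may treat such columns differently from the data it was fitted on. The sketch below uses only pandas and NumPy to show the effect and one way to restore the dtypes. The helper name rebuild_frame and the X_demo data are hypothetical.

```python
import numpy as np
import pandas as pd


def rebuild_frame(data_as_array, reference: pd.DataFrame) -> pd.DataFrame:
    """Rebuild a DataFrame from an array and restore the reference column dtypes."""
    frame = pd.DataFrame(data_as_array, columns=reference.columns)
    return frame.astype(reference.dtypes.to_dict())


# Hypothetical mixed-type data standing in for X_test.
X_demo = pd.DataFrame(
    {
        "age": [25, 40, 31],
        "income": [1.5, 2.0, 3.25],
        "city": ["a", "b", "a"],
    }
)

# Mixed columns collapse into a single object-dtype array.
as_array = X_demo.to_numpy()

# Rebuilding without a cast leaves the numeric columns as object dtype.
naive = pd.DataFrame(as_array, columns=X_demo.columns)

# Casting back recovers the dtypes of the original frame.
restored = rebuild_frame(as_array, X_demo)
```

In model_predict, rebuild_frame(data_as_array, X_test) could replace the plain pd.DataFrame call. Comparing naive.dtypes with X_demo.dtypes shows whether the round trip changed anything for a given dataset.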



Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow