What is the best way to visualise AutoML results?

Aaron Harris

Mar 8, 2021, 6:17:30 PM
to cloud-automl-tables-discuss
I'm curious what the best way is to visualise the results from AutoML. I am using a linear regression model to predict website conversion rates, and AutoML reports the model feature importance. This tells me how much each feature contributed to the prediction, but not whether it had a negative or positive impact on the target value.

Is there a way I can find this? If I export predictions to BigQuery I can see how each feature was weighted in each local prediction, but not across the whole set. Would one way to do this be to plot the local feature weights against the predicted value?

Thanks

Chenyu Zhao

Mar 9, 2021, 7:11:33 PM
to Aaron Harris, cloud-automl-tables-discuss
Hi Aaron,

It really depends on what question you're trying to answer with these feature importances. The model feature importances are unsigned because an individual feature's contribution can be either positive or negative, depending on the feature value and on the example, so there's no simple way to aggregate it in a signed manner.

This is just the nature of non-linear models; it's difficult (impossible?) to summarize the contribution of individual features or feature values across the entire model.
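A toy illustration of that point, as a minimal sketch: this is not AutoML's attribution method, and the model, data and "attribution" below are all made up for the example. The feature pushes the prediction below average for some rows and above average for others, so the signed contributions cancel while the unsigned ones do not.

```python
import numpy as np

# Made-up data: one feature x, uniformly spread around zero.
rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, size=10_000)

# Made-up non-linear model: the prediction is a U-shaped function of x.
prediction = x ** 2

# Stand-in local attribution (an assumption, not AutoML's method):
# each row's deviation from the average prediction.
local_attribution = prediction - prediction.mean()

# Signed contributions cancel out; unsigned contributions do not.
signed_mean = local_attribution.mean()
unsigned_mean = np.abs(local_attribution).mean()

print(f"signed mean:   {signed_mean:+.4f}")
print(f"unsigned mean: {unsigned_mean:.4f}")
```

A signed average near zero here would wrongly suggest the feature does nothing, which is why a global importance is reported as a magnitude only.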

-Chenyu

Aaron Harris

Mar 10, 2021, 12:05:44 PM
to cloud-automl-tables-discuss
I realise I didn't reply all! Here's my reply for visibility:

Hey Chenyu

Thanks for this! It’s a linear regression model I'm working with. Is it not fair to assume that some features have an overall negative and some an overall positive impact on the target?

I've been referred to evaluation packages like SHAP, but it's a bit over my head: https://towardsdatascience.com/explain-your-model-with-the-shap-values-bc36aac4de3d. Is it possible to take the model that AutoML creates and apply SHAP visualisations to it?

I'm wondering if it makes sense to plot the feature weights that the AutoML batch predict creates for each prediction against the feature values? That way we can say this value contributed X increase or decrease to the target prediction.

Here are two visuals I'm thinking about using to help interpret the features. The first is a box-and-whisker plot of all the local feature weights for each feature. This, I think, is telling me that sessions, cartdetail rate, cartDropOff and avgpricesold make quite large contributions to the predictions?

Screen Shot 2021-03-10 at 16.44.10.png
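For anyone wanting to reproduce a plot like this, here is a minimal sketch. It assumes the local feature weights are already in a pandas DataFrame in long form (one row per prediction per feature); the column names `feature` and `weight` and all the numbers are made up for illustration. `box_summary` computes the quartiles that a box plot draws, and `plot_boxes` is a hypothetical helper that draws them with seaborn.

```python
import numpy as np
import pandas as pd

# Made-up local feature weights in long form (not real AutoML output).
rng = np.random.default_rng(1)
weights = pd.DataFrame({
    "feature": np.repeat(["sessions", "cartDropOff", "avgPriceSold"], 200),
    "weight": np.concatenate([
        rng.normal(-0.02, 0.010, 200),
        rng.normal(0.01, 0.005, 200),
        rng.normal(0.00, 0.001, 200),
    ]),
})


def box_summary(df):
    """Per-feature quartiles: the numbers a box-and-whisker plot draws."""
    q = df.groupby("feature")["weight"].quantile([0.25, 0.5, 0.75]).unstack()
    q.columns = ["q1", "median", "q3"]
    q["iqr"] = q["q3"] - q["q1"]
    # Widest spread first: these features move predictions the most.
    return q.sort_values("iqr", ascending=False)


def plot_boxes(df):
    """Horizontal box plot of local weights, one box per feature."""
    import matplotlib.pyplot as plt  # imported here so the summary works without it
    import seaborn as sns

    ax = sns.boxplot(data=df, x="weight", y="feature")
    ax.axvline(0, color="grey", linewidth=1)  # boxes left of 0 lower the prediction
    plt.tight_layout()
    plt.show()


summary = box_summary(weights)
print(summary)
# plot_boxes(weights)  # uncomment in a notebook with seaborn installed
```

A box that sits mostly on one side of zero suggests the feature tends to push predictions in one direction; a box that straddles zero suggests the direction depends on the row.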

If, say, I wanted to visualise the relationship between sessions and conversion rate predictions, I think I could plot the local feature weights against the feature values, something like this:

Screen Shot 2021-03-10 at 17.03.09.png

It certainly looks nice :) but I'm not sure whether I'm right to read from it that the more sessions there are, the lower the conversion rate prediction is likely to be, which is generally true anyway.
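One way to check that reading numerically, as a rough sketch: rank-correlate the feature values with their local weights and look at the average weight per value bucket. The data and the column names `sessions` and `sessions_weight` are made up, and the result describes how the model uses the feature, not a causal effect.

```python
import numpy as np
import pandas as pd

# Made-up data: local weights for "sessions" that fall as sessions rise.
rng = np.random.default_rng(2)
sessions = rng.integers(100, 10_000, size=500)
local = pd.DataFrame({
    "sessions": sessions,
    "sessions_weight": -0.00001 * sessions + rng.normal(0, 0.01, 500),
})

# Spearman rank correlation, computed as the Pearson correlation of ranks.
# Negative: higher feature values go with lower (more negative) weights.
rho = local["sessions"].rank().corr(local["sessions_weight"].rank())

# Mean local weight in five equal-sized buckets of the feature value.
buckets = pd.qcut(local["sessions"], 5)
trend = local.groupby(buckets, observed=True)["sessions_weight"].mean()

print(f"rank correlation: {rho:.2f}")
print(trend)


def plot_weight_vs_value(df):
    """Scatter of local weight against feature value (needs matplotlib)."""
    import matplotlib.pyplot as plt

    plt.scatter(df["sessions"], df["sessions_weight"], s=8, alpha=0.4)
    plt.axhline(0, color="grey", linewidth=1)
    plt.xlabel("sessions")
    plt.ylabel("local feature weight")
    plt.show()
```

A rank correlation is used rather than a straight-line fit so that a curved but monotonic relationship still shows up clearly.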

Thanks
Aaron



Chenyu Zhao

Mar 18, 2021, 5:57:49 PM
to Aaron Harris, cloud-automl-tables-discuss
Hi Aaron,

Sorry for the late reply. Yes, those visualizations you described are entirely reasonable. We don't automatically generate them, but it should be fairly trivial to generate them in a Python notebook.

You can run batch prediction over your training data with feature importance enabled. See https://cloud.google.com/automl-tables/docs/predict-batch for details.
Then you can use the outputted feature importances to generate those plots.
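As a rough sketch of the reshaping step, assume the batch prediction results have already been loaded into a pandas DataFrame with one column per feature value and one per feature weight. The `_weight` suffix, the column names and the numbers below are hypothetical; check the actual export schema and adjust accordingly.

```python
import pandas as pd

# Hypothetical wide export: feature values next to their local weights.
results = pd.DataFrame({
    "sessions": [1200, 300, 4500],
    "sessions_weight": [-0.012, 0.004, -0.030],
    "avgPriceSold": [25.0, 40.0, 18.5],
    "avgPriceSold_weight": [0.006, 0.011, -0.002],
    "predicted_conversion_rate": [0.021, 0.035, 0.012],
})


def to_long(df, features, suffix="_weight"):
    """Stack (value, weight) pairs into one tidy row per prediction per feature."""
    frames = []
    for name in features:
        frames.append(pd.DataFrame({
            "row": df.index.to_numpy(),
            "feature": name,
            "value": df[name].astype(float),
            "weight": df[name + suffix],
        }))
    return pd.concat(frames, ignore_index=True)


long_df = to_long(results, ["sessions", "avgPriceSold"])
print(long_df)
```

The long form is convenient because most plotting libraries can then group or facet by the `feature` column directly.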

-Chenyu

Aaron Harris

Mar 19, 2021, 4:50:08 AM
to cloud-automl-tables-discuss
Thanks Chenyu. That's really helpful to know, and that's exactly what we did. In the end I used seaborn.FacetGrid in Python to visualise the results, as none of the more basic visualisation tools (e.g. Data Studio or Flourish) could handle the number of values in a single plot.
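For anyone reading later, a sketch of how such a grid might be built; this is not necessarily the exact code used here. It assumes a long-form DataFrame with made-up columns `feature`, `value` and `weight`, and the data is random. The per-feature downsampling is an optional extra, not something discussed above, to keep each panel readable when there are many predictions.

```python
import numpy as np
import pandas as pd

# Made-up long-form data: one row per prediction per feature.
rng = np.random.default_rng(3)
features = ["sessions", "cartDropOff", "avgPriceSold"]
long_df = pd.DataFrame({
    "feature": np.repeat(features, 5_000),
    "value": rng.uniform(0, 1, 15_000),
    "weight": rng.normal(0, 0.01, 15_000),
})


def sample_per_feature(df, n, seed=0):
    """Keep at most n rows per feature so each panel stays readable."""
    parts = [
        group.sample(n=min(n, len(group)), random_state=seed)
        for _, group in df.groupby("feature")
    ]
    return pd.concat(parts, ignore_index=True)


def facet_scatter(df):
    """One scatter panel per feature: local weight against feature value."""
    import matplotlib.pyplot as plt  # imported here so sampling works without it
    import seaborn as sns

    # Independent axes, since features are on different scales.
    grid = sns.FacetGrid(df, col="feature", col_wrap=3, sharex=False, sharey=False)
    grid.map_dataframe(sns.scatterplot, x="value", y="weight", s=8, alpha=0.4)
    grid.set_axis_labels("feature value", "local feature weight")
    plt.show()
    return grid


plot_df = sample_per_feature(long_df, n=1_000)
print(plot_df["feature"].value_counts())
# facet_scatter(plot_df)  # uncomment in a notebook with seaborn installed
```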