Risk Scorecard Development using Random Forest Results


jskill...@gmail.com

Feb 28, 2015, 3:54:08 PM
to h2os...@googlegroups.com
Hi,

I am using DRF and need some guidance in developing a standardized scorecard based on the prediction value.

Can anyone let me know how I should proceed, please?

Thank you.

John

ccl...@gmail.com

Mar 4, 2015, 11:32:55 AM
to h2os...@googlegroups.com, jskill...@gmail.com
How would you proceed with a Random Forest model from any other tool?
Honestly, I don't know how to build a "standardized scorecard" model from an RF model. RF decision trees bottom out in a *vote*, and you have to combine the votes. I'm not sure how that jibes with a scorecard model.
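
To make the "combine the votes" step concrete, here is a minimal sketch in plain Python. It is not H2O's API or its internal aggregation; the function name `combine_votes` and the ten example votes are made up for illustration. It simply tallies one class vote per tree and reports the majority label plus each class's share of the votes.

```python
from collections import Counter


def combine_votes(tree_votes):
    """Combine per-tree class votes into a majority label and vote shares.

    `tree_votes` is a list with one predicted class label per tree.
    (Hypothetical helper for illustration; not part of any library.)
    """
    if not tree_votes:
        raise ValueError("need at least one tree vote")
    counts = Counter(tree_votes)
    total = len(tree_votes)
    # Share of trees voting for each class; usable as a rough class probability.
    shares = {label: n / total for label, n in counts.items()}
    # The forest's predicted class is the label with the most votes.
    majority = max(counts, key=counts.get)
    return majority, shares


# Made-up votes from a 10-tree forest for a single loan applicant.
votes = ["default", "ok", "ok", "default", "ok",
         "ok", "ok", "default", "ok", "ok"]
label, shares = combine_votes(votes)
print(label, shares)
```

The vote share for the class of interest is a single number per applicant, which is the kind of value a score could be derived from.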

Cliff

Erin LeDell

Mar 4, 2015, 12:36:26 PM
to ccl...@gmail.com, h2os...@googlegroups.com, jskill...@gmail.com
John,
Can you define what you mean by "standardized scorecard"?  Are you asking how to convert a predicted value into a more user-friendly value, say, 1-10?  This is not really an H2O question, but rather a general data science question.
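
If the goal is the simpler case of turning a predicted value into a 1-10 value, one possible sketch is below. It assumes the prediction is a probability in [0, 1] and uses ten equal-width bins; both the bin choice and the function name `to_ten_point_score` are assumptions for illustration, not anything H2O provides.

```python
import math


def to_ten_point_score(probability):
    """Map a probability in [0, 1] to an integer score from 1 to 10.

    Uses ten equal-width bins: [0.0, 0.1) -> 1, [0.1, 0.2) -> 2, and so on.
    A probability of exactly 1.0 is placed in the top bin.
    (Hypothetical helper for illustration.)
    """
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    return min(10, math.floor(probability * 10) + 1)


# Made-up predicted probabilities for four customers.
for p in (0.0, 0.37, 0.95, 1.0):
    print(p, to_ten_point_score(p))
```

Equal-width bins are only one option; cut points taken from the quantiles of the predictions on a reference data set would spread customers more evenly across the ten levels.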

-Erin


jskill...@gmail.com

Mar 4, 2015, 7:01:10 PM
to h2os...@googlegroups.com, jskill...@gmail.com, ccl...@gmail.com
Cliff,

Statistica Scorecard seems to be using Random Forests to choose the variables of interest for inclusion in the scorecard model. Hence my inquiry.

Please see:

http://www.statsoft.com/webcasts/Stout_Scorecards/lib/playback.html

The presenter describes the approach at around the 13:50 mark in the webinar.

Best Regards,

John



Erin LeDell

Mar 4, 2015, 7:54:05 PM
to jskill...@gmail.com, h2os...@googlegroups.com, Cliff Click
Hi John,
I see what you are trying to do now.  As far as I'm aware, H2O doesn't generate scorecards automatically, but you can use the H2O RF in the same way that Statistica uses an RF to generate its "scorecard model."  This may help: http://cran.r-project.org/doc/contrib/Sharma-CreditScoring.pdf

However, if your end-goal is to generate a final "score" (or likelihood) to estimate the chance that a customer will default on their loan, for example, then you can just use the predicted values produced by the RF (H2O, or otherwise) directly.
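
If a points-style scale is still wanted on top of the predicted probability, one convention often used in credit scoring is to make the score linear in the log-odds, so that a fixed number of points doubles the odds of being a good payer. The sketch below assumes that convention; the anchor values (600 points at 50:1 odds, 20 points to double the odds), the clipping constant, and the function name `probability_to_score` are illustrative assumptions, not something taken from H2O or Statistica.

```python
import math


def probability_to_score(p_default, base_score=600.0, base_odds=50.0,
                         pdo=20.0, eps=1e-6):
    """Scale a predicted default probability to scorecard-style points.

    score = offset + factor * ln(odds of NOT defaulting)
    factor = pdo / ln(2)   (points needed to double the odds)
    offset is chosen so that odds of `base_odds`:1 map to `base_score`.
    All default parameter values are illustrative assumptions.
    """
    # Clip so that probabilities of exactly 0 or 1 do not produce infinite odds.
    p = min(max(p_default, eps), 1.0 - eps)
    odds_good = (1.0 - p) / p
    factor = pdo / math.log(2.0)
    offset = base_score - factor * math.log(base_odds)
    return offset + factor * math.log(odds_good)


# Made-up RF predictions for three customers.
for p in (0.01, 0.10, 0.50):
    print(p, round(probability_to_score(p), 1))
```

A lower predicted default probability gives a higher score, and the ranking of customers is the same as the ranking by the RF prediction itself; the scaling only changes the units.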

-Erin




