--
You received this message because you are subscribed to the Google Groups "matminer" group.
To unsubscribe from this group and stop receiving emails from it, send an email to matminer+u...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/matminer/7b4f2153-2ae1-4221-b44d-3af1fd3085cf%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
Hey Emily,

The target column doesn’t have to exist in the data frame (if it is in the data frame, it is ignored during “predict”). The target arg is included in predict simply for consistency of syntax and as a sanity check (to make sure you are trying to predict the same quantity the pipeline was trained on; this has saved me many times). It also defines the output column name. Regardless, the target should be the same name as the one you fit on!

So the following should work:

pipe.fit(df_containing_target, target)
predictions = pipe.predict(df_not_containing_target, target)

Thanks,
Alex

On Tue, Jul 9, 2019 at 9:49 AM Emily <emm...@gmail.com> wrote:
Hi,

I have a model for binary classification from automatminer that I'm pretty happy with - I now want to try to run the "predict" function on some unknown compounds. The predict function still needs a target column - can I just fill the target column randomly with 1s and 0s or will that bias the model somehow?

Thanks!
Emily S
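
To make the behaviour described in Alex's reply concrete, here is a minimal sketch in plain Python. It is not automatminer code: `ToyPipe`, the `is_metal` column, the majority-vote "model", and the `"<target> predicted"` output name are all assumptions made up for illustration. It only mimics the three points from the reply: the target name is checked against the one used in fit, any target values already present at predict time are ignored, and the target name determines the output column.

```python
class ToyPipe:
    """Hypothetical stand-in for a fitted pipeline; NOT automatminer's real class."""

    def __init__(self):
        self.target = None
        self.majority = None

    def fit(self, rows, target):
        # Remember which column was learned so predict() can check it later.
        self.target = target
        labels = [row[target] for row in rows]
        # Placeholder "model": always predict the most common training label.
        self.majority = max(set(labels), key=labels.count)
        return self

    def predict(self, rows, target):
        # Sanity check: refuse to predict a quantity other than the one fit on.
        if target != self.target:
            raise ValueError(f"fit on {self.target!r}, asked to predict {target!r}")
        out = []
        for row in rows:
            # Drop any existing target value so it cannot influence the result.
            result = {k: v for k, v in row.items() if k != target}
            # Assumed naming scheme for the output column.
            result[f"{target} predicted"] = self.majority
            out.append(result)
        return out


train = [
    {"formula": "A", "is_metal": 1},
    {"formula": "B", "is_metal": 1},
    {"formula": "C", "is_metal": 0},
]
pipe = ToyPipe().fit(train, "is_metal")

unknown = [{"formula": "D"}]                # no target column at all
filled = [{"formula": "D", "is_metal": 0}]  # placeholder target value

print(pipe.predict(unknown, "is_metal"))
print(pipe.predict(filled, "is_metal"))
```

In this sketch both calls return the same prediction, so filling the target column with placeholder values would be unnecessary but harmless. Whether the real pipeline behaves identically should be checked against the automatminer version in use.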
I would vote for removing "target" from predict; this seems very confusing.

>> simply for consistency of syntax

Why do fit and predict need the same syntax? I'd rather have consistency of syntax with a normal scikit-learn model, which doesn't ask you for a target column in predict (since none is needed!).

>> to make sure you are trying to predict the same quantity the pipeline was trained on, this has saved me many times

This seems more like a problem with the way you are organizing your code. If you are having trouble keeping track of which pipeline was trained on what, you could use descriptive variable names like pipe_bandgap, or comments.

I'd like to hear a good argument as to why "target" is needed for predict.
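
For comparison, the alternative proposed in this reply could look roughly like the sketch below. It follows the scikit-learn convention that predict receives only the input data. `SklearnStylePipe`, the `bandgap` column, and the output column name are hypothetical and exist only for illustration; this is not automatminer's or scikit-learn's actual code.

```python
class SklearnStylePipe:
    """Hypothetical pipeline whose predict() takes no target argument."""

    def fit(self, rows, target):
        # The target name is stored once, at fit time.
        self.target_ = target
        labels = [row[target] for row in rows]
        # Placeholder "model": always predict the most common training label.
        self.majority_ = max(set(labels), key=labels.count)
        return self

    def predict(self, rows):
        # The output column name is derived from what fit() stored.
        out = []
        for row in rows:
            result = {k: v for k, v in row.items() if k != self.target_}
            result[f"{self.target_} predicted"] = self.majority_
            out.append(result)
        return out


# A descriptive variable name records what the pipeline was trained on.
pipe_bandgap = SklearnStylePipe().fit(
    [{"formula": "A", "bandgap": 1.1}, {"formula": "B", "bandgap": 1.1}],
    "bandgap",
)
print(pipe_bandgap.predict([{"formula": "C"}]))
```

The trade-off in this design is that the caller loses the explicit check on the target name at predict time and relies on naming or comments instead.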