How to use Feature/Label padding with NNEstimator?


Phuong LE-HONG

Apr 6, 2023, 5:24:56 AM
to BigDL User Group
Hi all,

I'm looking for a neat way to apply padding to data samples when using the nnframes API, but I have not found one yet.

It seems that the only related API that NNEstimator provides is setSamplePreprocessing().

However, the Preprocessing and FeatureLabelPreprocessing classes do not offer any padding functionality.

(In the old API, we could set padding for features and labels through the Optimizer interface.)

Am I missing something? 

Thanks for your suggestions.

Best regards,

Phuong

huangka...@gmail.com

Apr 11, 2023, 8:20:43 AM
to User Group for BigDL
Hi Phuong,

Sorry for the late reply.

Unfortunately, we haven't exposed the padding strategy in NNEstimator. As a workaround, could you use a UDF to pad the relevant columns of the Spark DataFrame before passing it to NNEstimator?
If you would like more support from us, please share more information, for example the schema of your sample data.

Thanks,
Kai
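
A minimal sketch of the UDF workaround suggested above, assuming PySpark and a feature column that holds variable-length arrays of numbers. The names `pad_or_truncate`, `make_pad_udf`, and the column names `tokens` and `features` are hypothetical and do not come from BigDL; right-padding with a constant is one possible padding strategy, not necessarily the one the old Optimizer API used.

```python
from typing import List, Sequence


def pad_or_truncate(values: Sequence[float], max_len: int,
                    pad_value: float = 0.0) -> List[float]:
    """Return exactly max_len floats: cut long inputs, right-pad short ones."""
    if max_len < 0:
        raise ValueError("max_len must be non-negative")
    # Convert to float so every element matches a FloatType array in Spark.
    head = [float(v) for v in values[:max_len]]
    return head + [float(pad_value)] * (max_len - len(head))


def make_pad_udf(max_len: int, pad_value: float = 0.0):
    """Wrap pad_or_truncate as a Spark UDF.

    pyspark is imported here, so the helper above stays usable without Spark.
    """
    from pyspark.sql.functions import udf
    from pyspark.sql.types import ArrayType, FloatType

    # A null array is treated as an empty sequence and becomes all padding.
    return udf(lambda xs: pad_or_truncate(xs or [], max_len, pad_value),
               ArrayType(FloatType()))


# Hypothetical usage, assuming a DataFrame `df` with an array column "tokens":
#   df = df.withColumn("features", make_pad_udf(50)("tokens"))
```

The resulting fixed-length column can then be used as the features column when fitting NNEstimator. A label column with variable-length sequences could be handled the same way with a second call.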

Kai Huang

Apr 11, 2023, 8:43:10 AM
to Phuong LE-HONG, User Group for BigDL
Hi Phuong,

Yeah, I also think it would be beneficial to have this functionality. I have raised an issue in our repo and we will get it planned. https://github.com/intel-analytics/BigDL/issues/8030 
Thanks for pointing that out!

Thanks,
Kai

Phuong LE-HONG

Apr 14, 2023, 7:17:50 PM
to huangka...@gmail.com, User Group for BigDL
Thanks Kai for your reply. I did exactly what you suggested. I wrote a Spark transformer to pad/truncate feature vectors before fitting NNEstimator. It works smoothly.

IMHO, NNEstimator should support padding for ease of use.

Thanks,

Phuong

