Fitting the model with Spark Dataframe

Hamza Saaidia

Apr 15, 2023, 8:27:10 PM
to User Group for BigDL
Hi team, 

I am fitting my Keras estimator using a Spark DataFrame that I read from HDFS. Unlike a TensorFlow dataset, it seems to load all the data into memory, which causes memory issues. Can you tell me how to optimize memory usage, so that it doesn't load the whole dataset into memory but works batch by batch instead, if that is possible?

Thanks,

Xin Qiu

Apr 17, 2023, 2:06:16 AM
to User Group for BigDL
Does your Keras estimator mean NNEstimator? https://bigdl.readthedocs.io/en/latest/doc/DLlib/Overview/nnframes.html
From the docs, you can enable the DISK_AND_DRAM option to reduce memory use. For example, .setDataCacheLevel("DISK_AND_DRAM", 10) will cache only 10% of your dataset in memory.
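A minimal sketch of how this could be wired up in the DLlib Python API, assuming a working Spark + BigDL environment. The setDataCacheLevel call mirrors the one above; the model, criterion, and DataFrame setup are illustrative assumptions and should be checked against the nnframes docs for your BigDL version:

```python
# Assumed imports from BigDL DLlib (check against your installed version)
from bigdl.dllib.nnframes import NNEstimator
from bigdl.dllib.nn.criterion import MSECriterion

# `model` is your Keras-style BigDL model; `df` is the Spark DataFrame
# read from HDFS, e.g. spark.read.parquet("hdfs://...")
estimator = NNEstimator(model, MSECriterion()) \
    .setBatchSize(256) \
    .setMaxEpoch(5) \
    .setDataCacheLevel("DISK_AND_DRAM", 10)  # keep ~10% of the dataset
                                             # in DRAM, spill the rest to disk

trained_model = estimator.fit(df)
```

Without setDataCacheLevel, the default is to cache the whole training set in memory; the second argument controls how finely the dataset is sliced between DRAM and disk.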

Bests,
-Xin