storage.BlockManager: Removing RDD 30 job taking too long


Rajat Aggarwal

Nov 29, 2019, 3:21:07 AM11/29/19
to User Group for BigDL and Analytics Zoo
I am running a 300 MB file through some transformations in PySpark, and the job is taking 30 minutes.
Most of that time is spent while the log shows "storage.BlockManager: Removing RDD 30"; this step takes too long.
The cluster is 4 nodes with 16 cores in total and 8 GB per node, and I am utilizing the full memory.
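(For reference, a setup like the one described above might be configured roughly as below; this is only a sketch, and the exact split of executors, cores, and memory is an assumption, not taken from the post.)

from pyspark.sql import SparkSession

# Illustrative resource settings only -- assuming one executor per node.
spark = (
    SparkSession.builder
    .appName("file-transform")
    .config("spark.executor.instances", "4")   # 4 nodes (assumed one executor each)
    .config("spark.executor.cores", "4")       # 4 x 4 = 16 cores in total
    .config("spark.executor.memory", "6g")     # leave headroom below 8 GB for overhead
    .getOrCreate()
)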

Wang Yanzhang

Dec 3, 2019, 1:14:52 AM12/3/19
to User Group for BigDL and Analytics Zoo
Hi Rajat,

What transformations did you run?

Thanks,
Yanzhang


On Friday, November 29, 2019 at 4:21:07 PM UTC+8, Rajat Aggarwal wrote:

Rajat Aggarwal

Dec 3, 2019, 4:23:17 AM12/3/19
to User Group for BigDL and Analytics Zoo
The transformations include:
Regex replacement
Date conversion
Numeric conversion
Filtering records based on errors
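
(For illustration only: this is not the original code, and the column names, regex pattern, and date format below are hypothetical. Transformations like these might look roughly as follows in PySpark.)

from pyspark.sql import functions as F

# Hypothetical column names and patterns -- adjust to the actual data.
cleaned = (
    df
    # Regex replacement: strip non-numeric characters from a raw amount column
    .withColumn("amount_str", F.regexp_replace("amount_raw", r"[^0-9.]", ""))
    # Numeric conversion: cast the cleaned string to a double
    .withColumn("amount", F.col("amount_str").cast("double"))
    # Date conversion: parse a string date into a DateType column
    .withColumn("event_date", F.to_date("event_date_raw", "yyyy-MM-dd"))
    # Error-based filtering: drop records whose conversion failed (null results)
    .filter(F.col("amount").isNotNull() & F.col("event_date").isNotNull())
)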

Wang Yanzhang

Dec 3, 2019, 9:45:13 PM12/3/19
to User Group for BigDL and Analytics Zoo

Hi Rajat,

This looks like a general Spark optimization question about how to speed up your jobs. You could try some of the publicly documented tuning techniques.
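
(For example, a minimal sketch of common Spark tuning knobs one might try; the values and the input path are assumptions, not specific to this job.)

from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Lower the shuffle partition count toward the total core count;
# the default of 200 is often too high for a ~300 MB input.
spark.conf.set("spark.sql.shuffle.partitions", "32")

df = spark.read.csv("input.csv", header=True)   # hypothetical input path

# Cache only data that is reused, and release it explicitly when done,
# so the BlockManager does not spend long removing large cached RDDs later.
df = df.repartition(16).cache()
# ... run the transformations and actions ...
df.unpersist(blocking=False)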

Thanks,
Yanzhang

On Tuesday, December 3, 2019 at 5:23:17 PM UTC+8, Rajat Aggarwal wrote:

Afsaar Shiekh

May 6, 2022, 7:08:34 PM5/6/22
to User Group for BigDL
I am also facing the same issue: at the end of the job, while the RDDs are being removed, the executors are stuck for an uncertain period.

Xin Qiu

May 6, 2022, 10:17:01 PM5/6/22
to User Group for BigDL
@Afsaar Shiekh
Could you open a new conversation and describe your question clearly, including the BigDL or Analytics Zoo version, the OS version, and how to reproduce your error? The best way is to provide us with some sample code.

Bests,
-Xin