Adapt PyTorch algorithm to BigDL

Hamza Saaidia

Mar 19, 2023, 7:53:09 PM

to User Group for BigDL

Hi all,

I am trying to adapt my PyTorch semantic segmentation algorithm to BigDL. The code works fine locally. However, when I run it with BigDL, I encounter errors related to dataset preparation. I have tried many things, but none of them worked. This is the output when I call the fit function:

from zoo.orca.learn.trigger import EveryEpoch
est.fit(data=train_loader, epochs=1, validation_data=val_loader,
        checkpoint_trigger=EveryEpoch())

creating: createEveryEpoch
creating: createMaxEpoch
23-03-19 05:45:12 [Thread-4] INFO InternalDistriOptimizer$:1185 - TorchModel[f7d76e53] isTorch is true
23-03-19 05:45:12 [Thread-4] INFO InternalDistriOptimizer$:1191 - torch model will use 1 OMP threads.
2023-03-19 05:45:12 INFO DistriOptimizer$:818 - caching training rdd ...

2023-03-19 05:45:39 INFO DistriOptimizer$:161 - Count dataset

[Stage 41:> (0 + 1) / 1]

Prepending /home/zoo/anaconda3/envs/zoo/lib/python3.7/site-packages/bigdl/share/conf/spark-bigdl.conf to sys.path
Prepending /home/zoo/anaconda3/envs/zoo/lib/python3.7/site-packages/zoo/share/conf/spark-analytics-zoo.conf to sys.path

2023-03-19 05:46:10 WARN DistriOptimizer$:167 - If the dataset is built directly from RDD[Minibatch], the data in each minibatch is fixed, and a single minibatch is randomly selected in each partition. If the dataset is transformed from RDD[Sample], each minibatch will be constructed on the fly from random samples, which is better for convergence.
2023-03-19 05:46:10 INFO DistriOptimizer$:173 - config {
computeThresholdbatchSize: 100
maxDropPercentage: 0.0
warmupIterationNum: 200
isLayerwiseScaled: false
dropPercentage: 0.0
}
2023-03-19 05:46:10 INFO DistriOptimizer$:177 - Shuffle data
2023-03-19 05:46:10 INFO DistriOptimizer$:180 - Shuffle data complete. Takes 1.58994E-4s

[Stage 43:> (0 + 1) / 1]

2023-03-19 05:46:42 ERROR Executor:91 - Exception in task 0.0 in stage 43.0 (TID 43)
jep.JepException: jep.JepException: <class 'RuntimeError'>: one_hot is only applicable to index tensor.
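
For context, this RuntimeError is the message raised by torch.nn.functional.one_hot when it receives a tensor that is not an integer (int64) index tensor. One possible cause, not confirmed from the log above, is that the segmentation masks reach the loss as floats once they pass through the distributed data pipeline. A minimal sketch of the usual workaround, assuming the masks hold whole class ids; NUM_CLASSES, one_hot_mask and the sample mask are hypothetical names for illustration, not taken from the notebook:

```python
import torch
import torch.nn.functional as F

NUM_CLASSES = 3  # hypothetical number of segmentation classes


def one_hot_mask(mask: torch.Tensor, num_classes: int = NUM_CLASSES) -> torch.Tensor:
    """Convert a (N, H, W) class-index mask to a (N, C, H, W) float one-hot tensor."""
    # F.one_hot rejects float tensors, so cast to int64 first. This assumes the
    # mask values are whole class ids (0 .. num_classes - 1) stored as floats.
    index_mask = mask.long()
    encoded = F.one_hot(index_mask, num_classes=num_classes)  # (N, H, W, C)
    # Move the class axis next to the batch axis, as segmentation losses expect.
    return encoded.permute(0, 3, 1, 2).float()


# A float mask, as it might arrive after a round trip through the data pipeline.
float_mask = torch.tensor([[[0.0, 1.0], [2.0, 1.0]]])  # shape (1, 2, 2)
encoded = one_hot_mask(float_mask)
print(encoded.shape)
```

If this is the cause, the cast would go wherever the loss function or dataset calls one_hot on the target, so that the same code runs both locally and under BigDL.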

This is my original notebook: link

Thank you for your time and your support,

Hamza