Label as vector in BigDL 2.1.0


Phuong LE-HONG

Jan 4, 2023, 2:16:07 AM
to BigDL User Group
Hi all,

I'm developing a sequence model that takes a Seq[String] as input and produces output labels as a Seq[Double] or a Vector. I'd like to use the Keras-like API of BigDL on a DataFrame for this purpose.

However, it seems that the current implementation supports only scalar (Float or Double) labels when a single "label" column is used. It does not support a vector-typed label, as the code excerpt below shows.

I'm thinking about creating an array of label columns, each storing one element of the label sequence, so that I can use the SeqToMultipleTensors utility. But that is not a neat solution at all.

Is there a better way to do this?

Thanks,

Phuong

===

val preprocessing = if (labelCols.size == 1) {
  FeatureLabelPreprocessing(featurePreprocessing, ScalarToTensor())
  .asInstanceOf[Preprocessing[(Any, Option[Any]), Sample[T]]]
} else {
  FeatureLabelsPreprocessing(featurePreprocessing, SeqToMultipleTensors(labelSizes))
  .asInstanceOf[Preprocessing[(Any, Option[Any]), Sample[T]]]
}
===

Phuong LE-HONG

Jan 4, 2023, 4:12:57 AM
to BigDL User Group
Well, after digging into the source code, I found that NNEstimator is the best way to go. It lets the user specify the label shape.
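
For anyone landing on this thread, here is a minimal sketch of that approach. It assumes the BigDL 2.x package layout (com.intel.analytics.bigdl.dllib.*) and the NNEstimator factory that takes a feature size and a label size. The toy data, the sizes 4 and 3, the single Dense layer and the object name are all hypothetical and only there to illustrate a vector label; please check the import paths against your BigDL version.

```scala
import com.intel.analytics.bigdl.dllib.NNContext
import com.intel.analytics.bigdl.dllib.keras.layers.Dense
import com.intel.analytics.bigdl.dllib.keras.models.Sequential
import com.intel.analytics.bigdl.dllib.nn.MSECriterion
import com.intel.analytics.bigdl.dllib.nnframes.NNEstimator
import com.intel.analytics.bigdl.dllib.tensor.TensorNumericMath.TensorNumeric.NumericFloat
import com.intel.analytics.bigdl.dllib.utils.Shape
import org.apache.spark.sql.SparkSession

object VectorLabelSketch {
  def main(args: Array[String]): Unit = {
    // Initialise the BigDL engine together with the Spark context.
    val sc = NNContext.initNNContext("VectorLabelSketch")
    val spark = SparkSession.builder().config(sc.getConf).getOrCreate()
    import spark.implicits._

    // Hypothetical toy data: 4 features and a 3-element label vector per row.
    val df = Seq(
      (Array(0.1f, 0.2f, 0.3f, 0.4f), Array(1.0f, 0.0f, 0.0f)),
      (Array(0.5f, 0.6f, 0.7f, 0.8f), Array(0.0f, 1.0f, 0.0f))
    ).toDF("features", "label")

    // Keras-like model whose output size matches the label vector length.
    val model = Sequential[Float]()
    model.add(Dense[Float](3, inputShape = Shape(4)))

    // The last two arguments are the feature shape and the label shape;
    // the label shape is what allows a vector (array) label column.
    val estimator = NNEstimator(model, MSECriterion[Float](), Array(4), Array(3))
      .setFeaturesCol("features")
      .setLabelCol("label")
      .setBatchSize(2) // may need to be a multiple of the total core count
      .setMaxEpoch(1)

    val nnModel = estimator.fit(df)
    nnModel.transform(df).show(false)
  }
}
```

With this setup the label column holds one array per row, so there is no need to split the label into several columns for SeqToMultipleTensors.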

Phuong

Jason Dai

Jan 4, 2023, 4:22:44 AM
to User Group for BigDL
Yes - the preferred approach for DLlib is to use the Keras-like API to build the model, and to use NNFrames (https://bigdl.readthedocs.io/en/latest/doc/DLlib/Overview/nnframes.html) to build the ML pipeline.

Thanks,
-Jason
