17/01/24 17:09:47 ERROR Executor: Exception in task 0.0 in stage 22.0 (TID 17)
java.util.concurrent.ExecutionException: java.lang.IllegalArgumentException: requirement failed: invalid size
at java.util.concurrent.FutureTask.report(FutureTask.java:122)
at java.util.concurrent.FutureTask.get(FutureTask.java:192)
at com.intel.analytics.bigdl.optim.DistriOptimizer$$anonfun$4$$anonfun$7.apply(DistriOptimizer.scala:176)
at com.intel.analytics.bigdl.optim.DistriOptimizer$$anonfun$4$$anonfun$7.apply(DistriOptimizer.scala:176)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:245)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:245)
at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
at scala.collection.TraversableLike$class.map(TraversableLike.scala:245)
at scala.collection.AbstractTraversable.map(Traversable.scala:104)
at com.intel.analytics.bigdl.optim.DistriOptimizer$$anonfun$4.apply(DistriOptimizer.scala:176)
at com.intel.analytics.bigdl.optim.DistriOptimizer$$anonfun$4.apply(DistriOptimizer.scala:125)
at org.apache.spark.rdd.ZippedPartitionsRDD2.compute(ZippedPartitionsRDD.scala:89)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:319)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:283)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:70)
at org.apache.spark.scheduler.Task.run(Task.scala:85)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:274)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Hi, I am sorry, could you give me your input data file? I can't reproduce the problem.
Thanks,
Cherry
Hi,
Two changes are needed to fix the issue:
1. The target (label) used in ClassNLLCriterion is 1-based, while the StringIndexer in Spark is 0-based. You may need to change your labels as follows:
// Shift the 0-based StringIndexer labels up by one to match the 1-based ClassNLLCriterion targets
val vectorizedRdd: RDD[(Array[Double], Double)] = result
  .select("level1_labels", "vectors")
  .rdd
  .map(r => (r(1).asInstanceOf[DenseVector].toArray, r(0).asInstanceOf[Double] + 1.0))
Thanks,
-Jason
val batching = Batching(param.batchSize, Array(param.maxSequenceLength, param.embeddingDim))
val trainingDataSet = DataSet.rdd(trainingRDD) -> batching
I think you can try to use the Batching class for now. The second parameter is the shape of your input, i.e., Array(5, vector_dim).
And of course, converting the input to Sample (take a look at Sample.copy(…)) and then applying SampleToBatch is also a valid solution.
We are in the process of simplifying the input API; the batching and sampling logic will be transparent to the user shortly.
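Roughly, the Sample + SampleToBatch route would look something like the sketch below. This is from memory of the current API, so the Tensor/Sample/SampleToBatch signatures may need adjusting for your BigDL version, and wordVectorRDD is a hypothetical name for your per-word RDD, already padded to a fixed sentence length of 5:

import com.intel.analytics.bigdl.dataset.{DataSet, Sample, SampleToBatch}
import com.intel.analytics.bigdl.numeric.NumericFloat
import com.intel.analytics.bigdl.tensor.{Storage, Tensor}

// wordVectorRDD: RDD[(Array[Array[Double]], Double)] -- one vector per word
val sampleRDD = wordVectorRDD.map { case (wordVectors, label) =>
  // Flatten the word vectors, then shape them into (sentence length, embedding dim)
  val flat = wordVectors.flatten.map(_.toFloat)
  val featureTensor = Tensor(Storage(flat)).resize(Array(wordVectors.length, wordVectors.head.length))
  // One-element label tensor for the class index
  val labelTensor = Tensor(Storage(Array(label.toFloat)))
  Sample(featureTensor, labelTensor)
}

// SampleToBatch then groups the samples into mini-batches
val trainingDataSet = DataSet.rdd(sampleRDD) -> SampleToBatch(param.batchSize)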
From: bigdl-us...@googlegroups.com [mailto:bigdl-us...@googlegroups.com] On Behalf Of alepb...@gmail.com
Sent: Thursday, January 26, 2017 7:33 AM
To: BigDL User Group <bigdl-us...@googlegroups.com>
Cc: alepb...@gmail.com
Subject: Re: [bigdl-user-group] java.util.concurrent.ExecutionException: java.lang.IllegalArgumentException: requirement failed: invalid size
Thanks Jason, your fix solved my problems.
What I'm trying to do now is slightly modify my input data.
So far I have used word2vec to generate ONE single vector of fixed length for a sequence of words. To go back to my previous example:
if I have this input:
String = "the lazy fox jumped over the brown dog" Label = "Comedy"
My output would be of the format (Array[Double], Double):
[0.12,0.45, ... ], 2.0
Now I'd like to produce a vector for each word of my sentence. Following the above example, I'd have
the = [0.12,0.45, ... ]
lazy = [0.42,0.35, ... ]
...
dog = [0.62,0.75, ... ],
and still 2.0 representing its label.
So from (Array[Double], Double) I'd turn it into an (Array[Array[Double]], Double).
What is not clear to me is how I would process the iterators to batch my input. Would you please help me with this?
I'm not sure I could use the one given in the TextClassifier example:
val batching = Batching(param.batchSize, Array(param.maxSequenceLength, param.embeddingDim))
val trainingDataSet = DataSet.rdd(trainingRDD) -> batching
because maxSequenceLength in my case would be the maximum length of my sentence (which of course is going to be fixed anyway; let's say 5, as my inputs are short).
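For concreteness, the reshaping I have in mind would be something like this sketch, where word2vecLookup and embeddingDim are placeholders for my trained word2vec model:

// Per-word reshaping: pads or truncates each sentence to 5 word vectors
val maxSequenceLength = 5
val embeddingDim = 200 // placeholder dimension
def word2vecLookup(word: String): Array[Double] = ??? // placeholder: model lookup
val padVector: Array[Double] = Array.fill(embeddingDim)(0.0)

def toWordVectors(sentence: String, label: Double): (Array[Array[Double]], Double) = {
  val vectors = sentence.split(" ").take(maxSequenceLength).map(word2vecLookup)
  // Pad short sentences with zero vectors so every record has the same shape
  val padded = vectors ++ Array.fill(maxSequenceLength - vectors.length)(padVector)
  (padded, label)
}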
Any input would be highly valuable. Thanks,
Alessandro
On Wednesday, January 25, 2017 at 8:30:02 AM UTC-6, Jason Dai wrote:
Hi,
Two changes are needed to fix the issue:
1. The target (label) used in ClassNLLCriterion is 1-based, while the StringIndexer in Spark is 0-based. You may need to change your labels as follows:
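(The code that followed here in the original message is the label-shift snippet quoted at the top of this thread:)

val vectorizedRdd: RDD[(Array[Double], Double)] = result
  .select("level1_labels", "vectors")
  .rdd
  .map(r => (r(1).asInstanceOf[DenseVector].toArray, r(0).asInstanceOf[Double] + 1.0))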