This may be the same issue that I run into too. When I dig into the resource manager on EMR (using AWS EMR), it looks like the Spark driver is crashing. I typically hit this problem when I try to iterate over more than 1k individual time-series lists to perform individual auto-fits and forecasts.
Maybe it is a resource issue and something is not being released or cleaned up, which is crashing the JVM?
Has anyone tried performing individual forecasting with a map function? I wonder whether my code is at fault and there is a better way to iterate over large sets of time-series lists.
Here is the pseudocode of the way I am doing it:
def createForecast(fKey: String, valuesList: List[Double]): Vector = {
  val ts = Vectors.dense(valuesList.toArray)
  // Search orders up to p = 5, d = 3, q = 5
  val arimaModel = ARIMA.autoFit(ts, maxP = 5, maxD = 3, maxQ = 5)
  arimaModel.forecast(ts, futureSampleCount.toInt)
}
val finalCollection = uniqueKeyList.map { fKey =>
  val values = activityDf
    .filter($"forecastKey" === fKey)
    .select("value") // assuming the measurement column is named "value"
    .collect()
    .map(_.getDouble(0))
    .toList
  createForecast(fKey, values)
}
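On the question of a better way to iterate: one pattern worth considering is grouping all values by key once, then mapping a forecast over each group, instead of filtering the full DataFrame once per key. Here is a minimal plain-Scala sketch of that idea. The `fitAndForecast` function is a hypothetical stub standing in for `ARIMA.autoFit` plus `forecast`, and the `(key, value)` row shape is an assumption mirroring the DataFrame above:

// Sketch of the "group once, then map" pattern in plain Scala collections.
// fitAndForecast is a stub standing in for ARIMA.autoFit + forecast.
def fitAndForecast(values: List[Double], horizon: Int): List[Double] = {
  // Naive stand-in: extend the series by repeating the last observed value.
  val last = values.lastOption.getOrElse(0.0)
  values ++ List.fill(horizon)(last)
}

// Rows shaped like (forecastKey, value), mirroring the DataFrame columns.
val rows: List[(String, Double)] = List(
  ("a", 1.0), ("a", 2.0), ("b", 10.0), ("b", 20.0), ("b", 30.0)
)

// One pass to bucket values per key, instead of one filter per key.
val byKey: Map[String, List[Double]] =
  rows.groupBy(_._1).map { case (k, kvs) => k -> kvs.map(_._2) }

// Map a forecast over each group.
val forecasts: Map[String, List[Double]] =
  byKey.map { case (k, values) => k -> fitAndForecast(values, horizon = 2) }

In Spark terms the equivalent would be something like a `groupByKey` (or `groupBy` + aggregation to a list) followed by a `mapValues`-style transformation on the executors, which also avoids collecting every series to the driver one at a time.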
Thanks.