conceptual question

Benjamin Bluhm

Mar 9, 2018, 8:20:31 AM

to spar...@googlegroups.com

Hi,

I have been working on a project with spark-ts. More recently, for another time series project, I implemented my own Spark distribution logic, which works as follows: I first create a partitioned RDD of time series IDs. I then map these partitions to the worker nodes, where the time series data is imported and the models are trained in Python. The advantage of this approach is that I can use the full range of Python machine learning libraries.
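To make the setup concrete, here is a minimal sketch of the pattern described above, not the actual project code. The names `load_series`, `fit_model`, `train_partition` and `run_on_spark` are hypothetical, the loader returns synthetic data, and the "model" is just the series mean standing in for whatever Python library is used. The Spark wiring assumes PySpark is installed and uses `parallelize` and `mapPartitions` on an RDD of IDs.

```python
from statistics import mean


def load_series(series_id):
    """Hypothetical loader: stands in for reading one series from storage."""
    return [float(series_id + t) for t in range(5)]


def fit_model(values):
    """Placeholder model (the series mean); swap in any Python ML library."""
    return mean(values)


def train_partition(series_ids):
    """Runs on a worker: load and fit every series ID in this partition."""
    for series_id in series_ids:
        yield series_id, fit_model(load_series(series_id))


def run_on_spark(series_ids, num_partitions=4):
    """Distribute the IDs over partitions and train on the workers.

    PySpark is imported lazily so the functions above can be tried locally.
    """
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("per-series-training").getOrCreate()
    # Only the IDs are shipped from the driver; the data is loaded on the workers.
    rdd = spark.sparkContext.parallelize(series_ids, num_partitions)
    return rdd.mapPartitions(train_partition).collect()
```

Calling `list(train_partition([0, 1]))` locally exercises the per-partition logic without a cluster; `run_on_spark(list(range(100)))` would run the same function on the workers.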

What is the disadvantage of this approach relative to using spark-ts?

Many thanks & Kind regards,
Benjamin

Sent from my iPhone
