> **take(n)** — Return an array with the first *n* elements of the dataset. Note that this is currently not executed in parallel. Instead, the driver program computes all the elements.
You called an action whose result is an `Array`, not an RDD, so nothing after that point happens in a Spark way. It's just local operations on the driver node, and `Array` doesn't have a `reduceByKey` method. If you want to use this for testing, you could modify it to:
```scala
sc.parallelize(textFile.take(10))
  .map(line => line.split("\t"))
  .map(fields => (fields(10), 1))
  .reduceByKey(_ + _, 1)
  .collect()
```
which should work, since `sc.parallelize` redistributes the results of the `take` back into an RDD before the pair transformations run.
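Alternatively, if this is just a ten-line sanity check, you can skip re-parallelizing and do the same aggregation with plain Scala collections on the driver. A minimal sketch (the sample lines and column index `2` are hypothetical; the original question uses column index `10`):

```scala
// Driver-side equivalent of the map + reduceByKey pipeline,
// operating on the Array that take(n) would have returned.
val lines = Array("a\tb\tdog", "c\td\tcat", "e\tf\tdog")

val counts: Map[String, Int] = lines
  .map(_.split("\t"))
  .map(fields => (fields(2), 1))     // key on the chosen column
  .groupBy(_._1)                     // local stand-in for reduceByKey
  .map { case (key, pairs) => (key, pairs.map(_._2).sum) }
```

Here `groupBy` plus a per-key sum reproduces what `reduceByKey(_ + _)` does in Spark, just without partitioning or shuffling, which is fine for small local tests.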