How to sort Spark RDD by Key?


Gaurav Dasgupta

Oct 15, 2012, 4:09:40 PM
to spark...@googlegroups.com
Hi,

I have written a MapReduce-style program in Spark that returns its final output (stored in a Spark RDD) like this:

(3, value3)
(2, value2)
(5, value5)
(1, value1)

I want to sort it by key, so that the output looks like this:

(1, value1)
(2, value2)
(3, value3)
(5, value5)

I have seen in the tutorials that there is a sort(c: Comparator[K]) RDD transformation, but I am not sure exactly how to use it in this scenario.
Please help.

Thanks,
Gaurav 

Reynold Xin

Oct 15, 2012, 4:20:21 PM
to spark...@googlegroups.com
Just do 

myrdd.sortByKey(true)

true => ascending
false => descending
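
For illustration, here is a minimal self-contained sketch of that call, using the sample data from the question. It assumes Spark is on the classpath and runs in local mode; the object name SortByKeyExample, the helper sortPairs, and the app name are hypothetical and not from this thread. The SparkContext._ import is only required on older Spark releases, where sortByKey is reached through an implicit conversion.

```scala
import org.apache.spark.SparkContext
import org.apache.spark.SparkContext._ // implicit conversions to pair-RDD functions on older Spark versions

object SortByKeyExample {
  // Hypothetical helper: builds the sample RDD from the question and sorts it by key.
  def sortPairs(sc: SparkContext, ascending: Boolean): Array[(Int, String)] = {
    val myrdd = sc.parallelize(Seq((3, "value3"), (2, "value2"), (5, "value5"), (1, "value1")))
    // sortByKey(true) sorts ascending, sortByKey(false) sorts descending
    myrdd.sortByKey(ascending).collect()
  }

  def main(args: Array[String]): Unit = {
    // Local master with two threads; an assumption made for this sketch
    val sc = new SparkContext("local[2]", "SortByKeyExample")
    try {
      sortPairs(sc, ascending = true).foreach(println)  // keys in order 1, 2, 3, 5
      sortPairs(sc, ascending = false).foreach(println) // keys in order 5, 3, 2, 1
    } finally {
      sc.stop()
    }
  }
}
```

Note that collect() brings the whole sorted result back to the driver, so it is only appropriate for small outputs like this one.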

Gaurav Dasgupta

Oct 15, 2012, 4:25:21 PM
to spark...@googlegroups.com
Thanks very much. It worked.

mwal...@bcmcgroup.com

Dec 28, 2012, 3:10:29 PM
to spark...@googlegroups.com
Is there any built-in method for sorting by value, instead of by key?


Matei Zaharia

Dec 28, 2012, 3:41:12 PM
to spark...@googlegroups.com
No, but you can always use an object's value as its key by doing a map().

Matei
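
A rough sketch of the map() approach described above: swap each pair so the value becomes the key, sort by that key, then swap back. It assumes Spark is on the classpath and runs in local mode; the object name SortByValueExample, the helper sortByValue, and the sample data are hypothetical and only for illustration.

```scala
import org.apache.spark.SparkContext
import org.apache.spark.SparkContext._ // implicit conversions to pair-RDD functions on older Spark versions

object SortByValueExample {
  // Hypothetical helper: sorts (key, value) pairs by value via a key/value swap.
  def sortByValue(sc: SparkContext): Array[(String, Int)] = {
    val pairs = sc.parallelize(Seq(("a", 3), ("b", 1), ("c", 2)))
    pairs
      .map { case (k, v) => (v, k) } // make the value the key
      .sortByKey(true)               // ascending sort on the former value
      .map { case (v, k) => (k, v) } // restore the original (key, value) shape
      .collect()
  }

  def main(args: Array[String]): Unit = {
    // Local master with two threads; an assumption made for this sketch
    val sc = new SparkContext("local[2]", "SortByValueExample")
    try {
      sortByValue(sc).foreach(println) // pairs ordered by value: 1, 2, 3
    } finally {
      sc.stop()
    }
  }
}
```

The final map() keeps the sorted order because it transforms each partition in place without shuffling the data.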

Archit Thakur

Dec 3, 2013, 12:20:56 AM
to spark...@googlegroups.com
Is the myRdd.sortByKey method not available in Spark 0.8.0?
@Reynold Xin, which version of Spark are you talking about?

Reynold Xin

Dec 3, 2013, 12:24:05 AM
to spark...@googlegroups.com
It is in Spark 0.8.0. You need to make sure you

import org.apache.spark.SparkContext._

so you get the implicit conversion to OrderedRDDFunctions, which contains the sortByKey method.

