Spatial Index in GeoSparkSQL


Dimitris Bilidas

Apr 10, 2019, 10:14:22 AM
to GeoSpark Discussion Board
Hello,

Here it says:
"Please consider using GeoSpark core instead of GeoSparkSQL. Due to the limitation of SparkSQL (for instance, not support clustered index), we are not able to expose all features to SparkSQL."

I have some questions regarding this:
So I can still use non-clustered indexes?
Can I create a spatial index on an RDD where one of the columns is a geometry?
What other serious limitations are there in SparkSQL?

I would appreciate it if you could answer or give me pointers to documentation/publications that explain these issues.

Thanks!

Jia Yu

Apr 11, 2019, 9:55:03 AM
to Dimitris Bilidas, GeoSpark Discussion Board
Hi Dimitris,

All spatial indices in GeoSpark are clustered indexes. 

You can use Spatial SQL to create a Spatial DataFrame. You can stay in the DataFrame world, and GeoSparkSQL still provides the optimized Spatial Join Query (it builds a one-time index for each spatial join query): http://datasystemslab.github.io/GeoSpark/tutorial/sql/#dataframe-to-spatialrdd

However, building a one-time index for every query is certainly not efficient. You can convert a Spatial DataFrame to a SpatialRDD and build/cache indexes there.
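To illustrate why rebuilding an index per query costs more than building it once and caching it, here is a minimal sketch in plain Python. It is not GeoSpark code: `GridIndex`, its `builds` counter, and the sample points and query windows are all hypothetical, and a uniform grid stands in for whatever tree index the real system would build.

```python
from collections import defaultdict


class GridIndex:
    """Toy uniform-grid index over 2D points (hypothetical, not a GeoSpark class)."""

    builds = 0  # counts how many times an index has been constructed

    def __init__(self, points, cell=1.0):
        GridIndex.builds += 1
        self.cell = cell
        self.cells = defaultdict(list)
        for x, y in points:
            self.cells[self._key(x, y)].append((x, y))

    def _key(self, x, y):
        # Map a coordinate to the grid cell that contains it.
        return (int(x // self.cell), int(y // self.cell))

    def range_query(self, xmin, ymin, xmax, ymax):
        """Return all points inside the closed rectangle, sorted."""
        kx0, ky0 = self._key(xmin, ymin)
        kx1, ky1 = self._key(xmax, ymax)
        hits = []
        # Only visit cells that overlap the query window.
        for kx in range(kx0, kx1 + 1):
            for ky in range(ky0, ky1 + 1):
                for x, y in self.cells.get((kx, ky), ()):
                    if xmin <= x <= xmax and ymin <= y <= ymax:
                        hits.append((x, y))
        return sorted(hits)


# Made-up sample data and query windows.
points = [(0.5, 0.5), (1.5, 2.5), (3.2, 0.1), (2.9, 2.9)]
windows = [(0, 0, 2, 3), (2, 0, 4, 3), (0, 0, 4, 4)]

# One-time index: a fresh index is built for every query.
GridIndex.builds = 0
one_time = [GridIndex(points).range_query(*w) for w in windows]
one_time_builds = GridIndex.builds

# Cached index: built once, then reused by every query.
GridIndex.builds = 0
cached_index = GridIndex(points)
cached = [cached_index.range_query(*w) for w in windows]
cached_builds = GridIndex.builds
```

Both strategies return the same answers; the only difference is that the first constructs one index per query while the second constructs a single index for all of them.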


Thanks,
Jia

------------------------------------

Jia Yu,

Ph.D. Student in Computer Science

Dimitris Bilidas

Apr 11, 2019, 10:12:06 AM
to GeoSpark Discussion Board
Jia,

Thanks for your answer. So, after I convert to a SpatialRDD and create the permanent index, there is no way to run SQL queries against the data that use this permanent index, right?

Also, all of this concerns the local indexes only, right? I mean that even in the DataFrame layout, the distributed index/spatial partitioning across workers is created.

Dimitris
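
For readers unfamiliar with the two levels this question distinguishes, the following plain-Python sketch separates a global partitioning step (which partition or worker holds a point) from a local index built inside each partition. It only illustrates the terminology and makes no claim about what GeoSparkSQL does on the DataFrame path; `partition_key`, `build_local_index`, the tile size, and the sample points are all hypothetical.

```python
from collections import defaultdict


def partition_key(point, tile=10.0):
    """Global step: decide which partition (worker) a point is sent to."""
    x, y = point
    return (int(x // tile), int(y // tile))


def build_local_index(points):
    """Local step: toy per-partition index (here, just the points sorted by x, then y)."""
    return sorted(points)


# Made-up sample data.
points = [(2, 8), (1, 1), (15, 9), (12, 3), (25, 25)]

# Spatial partitioning: group points by the tile they fall in.
partitions = defaultdict(list)
for p in points:
    partitions[partition_key(p)].append(p)

# Local indexing: each partition builds its own index independently.
local_indexes = {key: build_local_index(pts) for key, pts in partitions.items()}
```

A query would first use the partitioning to pick the relevant partitions and then consult only their local indexes.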