Indexing with Spark SQL


ashishag...@gmail.com

Oct 7, 2020, 1:50:48 PM
to GeoSpark Discussion Board
I am a little confused as to what is not supported in GeoSpark SQL regarding clustered indexes:

" Please consider using GeoSpark core instead of GeoSparkSQL. Due to the limitation of SparkSQL (for instance, not support clustered index), we are not able to expose all features to SparkSQL. " 

However, this page http://sedona.apache.org/api/sql/GeoSparkSQL-Parameter/ suggests we can choose between clustered indexes such as R-tree and quad-tree.

Does the first quote mean something other than "SQL range join and SQL distance join"?

Can you please share some insight on this?


ashishag...@gmail.com

Oct 9, 2020, 8:02:21 AM
to GeoSpark Discussion Board
Can someone please help?

Jia Yu

Oct 9, 2020, 1:28:10 PM
to ashishag...@gmail.com, GeoSpark Discussion Board
In GeoSparkSQL, you won't be able to create a spatial index or invoke spatial partitioning explicitly. Instead, GeoSparkSQL creates a one-time index and spatial partitioning on the fly when you call a distance join or range join.
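To make this concrete, here is a minimal sketch of the SQL side, assuming GeoSpark 1.x. The table names (`polygons`, `points`) and columns are placeholders, and the `geospark.join.indextype` parameter name is taken from the GeoSparkSQL-Parameter page linked above; note there is no `CREATE INDEX` statement, only a runtime setting that controls the index built internally during the join.

```scala
import org.apache.spark.sql.SparkSession
import org.datasyst.geosparksql.utils.GeoSparkSQLRegistrator

val spark = SparkSession.builder()
  .master("local[*]")
  .appName("geospark-sql-join")
  .getOrCreate()

// Register ST_* functions with this SparkSession.
GeoSparkSQLRegistrator.registerAll(spark)

// Pick the index type used by the on-the-fly join
// ("quadtree" or "rtree", per the parameter docs page).
spark.conf.set("geospark.join.indextype", "quadtree")

// GeoSparkSQL plans the spatial partitioning and builds the index
// internally when it executes this range join.
val joined = spark.sql(
  """SELECT *
    |FROM polygons p, points q
    |WHERE ST_Contains(p.geom, q.geom)""".stripMargin)
```

The index and partitioning built here live only for the duration of this one query, which is the limitation the documentation is describing.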

In GeoSpark RDD, you can create an index and do spatial partitioning whenever you want.
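For contrast, a sketch of the explicit RDD-level control, again assuming GeoSpark 1.x core. `pointRDD` and `polygonRDD` are assumed to be already-loaded `SpatialRDD`s; here you choose the partitioning grid and the index type yourself, and both persist for reuse across queries.

```scala
import org.datasyst.geospark.enums.{GridType, IndexType}
import org.datasyst.geospark.spatialOperator.JoinQuery

// Compute the dataset's boundary/statistics needed for partitioning.
pointRDD.analyze()

// Explicit, persistent spatial partitioning (your choice of grid).
pointRDD.spatialPartitioning(GridType.QUADTREE)

// Explicit clustered index on the partitioned RDD
// (true = build on the spatially partitioned RDD).
pointRDD.buildIndex(IndexType.RTREE, true)

// Reuse the same partitioner on the other side of the join.
polygonRDD.spatialPartitioning(pointRDD.getPartitioner)

// useIndex = true, considerBoundaryIntersection = false
val result = JoinQuery.SpatialJoinQuery(pointRDD, polygonRDD, true, false)
```

Because the partitioning and index are built up front, they can be cached and shared by many subsequent range, distance, or join queries, which is what the SQL interface cannot expose.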

------------------------------------

Jia Yu (new email: jia...@wsu.edu)

Assistant Professor

Washington State University School of EECS

Reach me via: Homepage | GitHub

