Loading large GeoTIFF


Ye Wu

Sep 11, 2016, 10:11:19 PM
to geotrellis-user
Hi, everyone!

I started using GeoTrellis several days ago.

I have a question about reading large GeoTIFFs. In geotrellis/spark/src/main/scala/geotrellis/spark/io/hadoop/HdfsUtils.scala, line 165, the length of a file is limited to Int.MaxValue.toLong. This means I can only process GeoTIFF files of up to about 2 GB. Trying to read a larger TIFF file reports the "Cannot read path $path because it's too big..." error.
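
For context, the roughly 2 GB ceiling matches the fact that JVM arrays are indexed by Int, so a single byte array can hold at most Int.MaxValue (2^31 - 1) bytes. Below is a minimal Python sketch of that arithmetic. It assumes the check simply compares the file length against that bound; the function name is hypothetical and this is not the GeoTrellis code itself.

```python
# Scala's Int.MaxValue: the largest index a JVM array can have.
INT_MAX_VALUE = 2**31 - 1


def fits_in_one_byte_array(file_length_bytes: int) -> bool:
    """Hypothetical helper: True if a file of this length could be read
    into a single JVM byte array (length <= Int.MaxValue)."""
    return file_length_bytes <= INT_MAX_VALUE


two_gib = 2 * 1024**3        # 2 GiB, one byte over the limit
hundred_gb = 100 * 10**9     # a 100 GB raster

print(INT_MAX_VALUE)
print(fits_in_one_byte_array(two_gib - 1))
print(fits_in_one_byte_array(two_gib))
print(fits_in_one_byte_array(hundred_gb))
```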

So I am not sure how to process a large GeoTIFF file of hundreds of GB. Thank you in advance for any assistance!

Sincerely,

yewu

Rob Emanuele

Sep 15, 2016, 4:27:41 PM
to geotrel...@googlegroups.com
Hi Ye Wu!

We can currently only work on geotiffs that can fit in memory when reading from the local file system, HDFS, or S3. There's some current work to read windows off of S3, which we will use to allow for large geotiff ingests...the HDFS version will follow (see https://github.com/geotrellis/geotrellis/pull/1617 to track the progress).

For now, I would recommend trying to split the geotiffs into smaller chunks, using something like `gdal_retile.py`. Given that your geotiff is hundreds of gigs (wow!), that might be a pretty intense task. I would say that the very large geotiff use case is something we want to support in the future, but we don't have good support for that right now.
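
As a rough illustration of the splitting approach, here is a small Python sketch that plans the pixel windows a retile would produce and builds a `gdal_retile.py` command line using its `-ps` (tile size in pixels) and `-targetDir` options. The file name, directory name, raster dimensions, and helper names are all hypothetical; treat it as a sketch under those assumptions rather than a tested ingest recipe.

```python
import math


def plan_tiles(width, height, tile_size):
    """Return (cols, rows, windows), where each window is an
    (x_offset, y_offset, window_width, window_height) tuple in pixels.
    Edge tiles are clipped to the raster bounds."""
    cols = math.ceil(width / tile_size)
    rows = math.ceil(height / tile_size)
    windows = []
    for r in range(rows):
        for c in range(cols):
            x, y = c * tile_size, r * tile_size
            windows.append((x, y,
                            min(tile_size, width - x),
                            min(tile_size, height - y)))
    return cols, rows, windows


def retile_command(src, target_dir, tile_size):
    """Build (but do not run) a gdal_retile.py invocation as an argv list."""
    return ["gdal_retile.py",
            "-ps", str(tile_size), str(tile_size),
            "-targetDir", target_dir,
            src]


# Hypothetical 10000 x 5000 pixel raster cut into 4096-pixel tiles.
cols, rows, windows = plan_tiles(10000, 5000, 4096)
print(cols, rows, len(windows))
print(" ".join(retile_command("huge.tif", "tiles", 4096)))
```

The argv list can be passed to `subprocess.run` once GDAL is installed; smaller tiles mean more files, but each one stays well under the in-memory limit.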

Thanks,
Rob

--
You received this message because you are subscribed to the Google Groups "geotrellis-user" group.
To unsubscribe from this group and stop receiving emails from it, send an email to geotrellis-user+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.



--
Robert Emanuele, Tech Lead
Azavea |  990 Spring Garden Street, 5th Floor, Philadelphia, PA
remanuele@azavea.com  | T 215.701.7502  | Web azavea.com  |  @azavea

Ye Wu

Sep 17, 2016, 9:20:59 PM
to geotrellis-user

Thanks, Bob.

I saw the result of building a pyramid from a raster of about 100G with GeoTrellis, on page 19 of https://spark-summit.org/2014/wp-content/uploads/2014/07/Geotrellis-Adding-Geospatial-Capabilities-to-Spark-Ameet-Kini-Rob-Emanuele.pdf. Was that 100G raster a single big GeoTIFF, or a GeoTIFF dataset split across several files?



On Monday, September 12, 2016 at 10:11:19 AM UTC+8, Ye Wu wrote:

Rob Emanuele

Sep 19, 2016, 12:37:49 PM
to geotrel...@googlegroups.com
Hi Ye,

That was 100G of smaller raster tiles, not one single geotiff. 

Best,
Rob
