This looks like an issue in the decompression logic in the GeoTiff reader, which is causing some GeoTiff reads to fail. We need to find the unreadable GeoTiffs and fix our GeoTiff reader. So let's play... *Find the GeoTiff!*

I wrote a unit test that will recurse through a directory and report unreadable GeoTiff files. It's a hack job, but it'll get the job done. You can either check out the branch here:

https://github.com/lossyrob/geotrellis/tree/lets-play-what-geotiff-is-failing

or just add this unit test to the appropriate place in your GeoTrellis checkout:

https://github.com/lossyrob/geotrellis/blob/lets-play-what-geotiff-is-failing/raster-test/src/test/scala/geotrellis/raster/FINDTHEGEOTIFF.scala

Edit the val THE_DIR to point at the directory that contains your GeoTiffs, then drop into sbt:

> ./sbt
geotrellis> project raster-test
raster-test> test-only FINDTHEGEOTIFF
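For reference, the core idea of that test is simple enough to sketch here. This is a simplified, self-contained version, not the exact FINDTHEGEOTIFF.scala linked above: `readFile` stands in for GeoTrellis's GeoTiffReader.read, and the function just partitions the `.tif`/`.tiff` files under a directory by whether reading them throws.

```scala
import java.io.File

// Hedged sketch of the "find the failing GeoTiff" test: recurse a
// directory, collect the TIFF files, and report those a reader fails on.
object FindFailing {
  /** Recursively collect .tif/.tiff files under `dir`. */
  def tiffFiles(dir: File): Seq[File] = {
    val entries = Option(dir.listFiles).map(_.toSeq).getOrElse(Seq.empty)
    val (dirs, files) = entries.partition(_.isDirectory)
    files.filter { f =>
      val name = f.getName.toLowerCase
      name.endsWith(".tif") || name.endsWith(".tiff")
    } ++ dirs.flatMap(tiffFiles)
  }

  /** Return the files for which `readFile` throws an exception. */
  def failing(dir: File)(readFile: File => Unit): Seq[File] =
    tiffFiles(dir).filter { f =>
      try { readFile(f); false } catch { case _: Exception => true }
    }
}
```

In the real test you would pass `f => GeoTiffReader.read(f.getPath)` (or whatever read entry point your GeoTrellis version exposes) as `readFile` and print the resulting list.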
This should report what GeoTiffs fail, and how many.
If you could get me a GeoTiff that fails, that would be really helpful for us to squash this bug.
Thanks!
Rob
On Thu, Feb 19, 2015 at 1:44 PM, André Stumpf <note...@gmail.com> wrote:

Hi,
I'm trying to set up an environment that allows me to ingest tiled rasters into HDFS with GeoTrellis.
I've set up a single-node cluster on my local machine following this guide:
https://districtdatalabs.silvrback.com/creating-a-hadoop-pseudo-distributed-environment
From the previous discussions here and the shell script provided by Rob Emanuele (https://gist.github.com/lossyrob/59f8116b07d37f7f45c5)
I was able to get all the basic parts running. However, when I start the ingest process I get the error below. I have the impression that
it fails on a tile at the border of the raster, which is smaller than the default size of 1000x1000 px, but I'm not sure whether that is related.
Thanks in advance for any advice,
André
student@⋅⋅⋅⋅:~/Bureau$ ./ingestGeoTiff.sh
Spark assembly has been built with Hive, including Datanucleus jars on classpath
19:56:20 Utils: Your hostname, ⋅⋅⋅⋅⋅⋅⋅⋅⋅ resolves to a loopback address: 127.0.1.1; using ⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅ instead (on interface eth0)
19:56:20 Utils: Set SPARK_LOCAL_IP if you need to bind to another address
19:56:21 Slf4jLogger: Slf4jLogger started
19:56:21 Remoting: Starting remoting
19:56:21 Remoting: Remoting started; listening on addresses :[akka.tcp://sparkDriver@⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅:49063]
19:56:21 NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
[Stage 0:===> (15 + 10) / 624]19:56:24 Executor: Exception in task 23.0 in stage 0.0 (TID 23)
java.lang.ArrayIndexOutOfBoundsException: 196
at geotrellis.raster.io.geotiff.reader.decompression.PackBitsDecompression$PackBits.uncompressPackBits(PackBitsDecompression.scala:53)
at geotrellis.raster.io.geotiff.reader.ImageReader$.read(ImageReader.scala:89)
at geotrellis.raster.io.geotiff.reader.GeoTiffReader$.recurReadImageDirectory$1(GeoTiffReader.scala:62)
at geotrellis.raster.io.geotiff.reader.GeoTiffReader$.readImageDirectory$1(GeoTiffReader.scala:52)
at geotrellis.raster.io.geotiff.reader.GeoTiffReader$.read(GeoTiffReader.scala:78)
at geotrellis.spark.io.hadoop.formats.GeotiffRecordReader.initialize(GeotiffInputFormat.scala:56)
at org.apache.spark.rdd.NewHadoopRDD$$anon$1.<init>(NewHadoopRDD.scala:135)
at org.apache.spark.rdd.NewHadoopRDD.compute(NewHadoopRDD.scala:107)
at org.apache.spark.rdd.NewHadoopRDD.compute(NewHadoopRDD.scala:69)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:263)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:230)
at org.apache.spark.rdd.MappedRDD.compute(MappedRDD.scala:31)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:263)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:230)
at org.apache.spark.rdd.MappedRDD.compute(MappedRDD.scala:31)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:263)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:230)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:61)
at org.apache.spark.scheduler.Task.run(Task.scala:56)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:196)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
19:56:24 TaskSetManager: Lost task 23.0 in stage 0.0 (TID 23, localhost): java.lang.ArrayIndexOutOfBoundsException: 196
at geotrellis.raster.io.geotiff.reader.decompression.PackBitsDecompression$PackBits.uncompressPackBits(PackBitsDecompression.scala:53)
at geotrellis.raster.io.geotiff.reader.ImageReader$.read(ImageReader.scala:89)
at geotrellis.raster.io.geotiff.reader.GeoTiffReader$.recurReadImageDirectory$1(GeoTiffReader.scala:62)
at geotrellis.raster.io.geotiff.reader.GeoTiffReader$.readImageDirectory$1(GeoTiffReader.scala:52)
....
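The stack trace points at uncompressPackBits, so for context, here is a minimal sketch of the PackBits scheme from the TIFF specification. This is not GeoTrellis's actual implementation; it is an illustration of the algorithm, written defensively so that a truncated or malformed strip cannot index past the end of the input the way the ArrayIndexOutOfBoundsException above suggests.

```scala
// Hedged sketch of TIFF PackBits decompression (not GeoTrellis's code).
// Control byte n: 0..127 means copy the next n + 1 bytes literally;
// -127..-1 means repeat the next byte 1 - n times; -128 is a no-op.
object PackBits {
  /** Decompress a PackBits-encoded strip into `expectedSize` bytes. */
  def uncompress(input: Array[Byte], expectedSize: Int): Array[Byte] = {
    val out = new Array[Byte](expectedSize)
    var i = 0 // position in input
    var o = 0 // position in output
    while (o < expectedSize && i < input.length) {
      val n = input(i); i += 1
      if (n >= 0) {
        // literal run: copy the next n + 1 bytes, bounds-checked
        var k = 0
        while (k <= n && i < input.length && o < expectedSize) {
          out(o) = input(i); i += 1; o += 1; k += 1
        }
      } else if (n != -128) {
        // repeat run: the next byte occurs 1 - n times
        if (i < input.length) {
          val b = input(i); i += 1
          var k = 0
          while (k < 1 - n && o < expectedSize) { out(o) = b; o += 1; k += 1 }
        }
      } // n == -128: skip, per the spec
    }
    out
  }
}
```

An unchecked version of the inner loops is where an out-of-bounds index like the `196` above can come from, e.g. when a literal run claims more bytes than remain in either the input or the output buffer.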
--
Rob Emanuele, Tech Lead, GeoTrellis
Azavea | 340 N 12th St, Ste 402, Philadelphia, PA
rema...@azavea.com | T 215.701.7692 | F 215.925.2663
Web azavea.com | Blog azavea.com/blogs | Twitter @azavea