Re: [GeoTrellis] Ingest GeoTIFF into HDFS using GeoTrellis

Albert

Jul 17, 2015, 1:19:20 AM
to geotrel...@googlegroups.com, johanst...@gmail.com
Hey Rob,

I added this findTif.scala to GeoTrellis so that I can find out which TIFF files can be read, but I ran into a problem. Is something wrong with where I put findTif.scala?
Here is the problem:



On Friday, February 20, 2015 at 3:44:56 AM UTC+8, Rob Emanuele wrote:
This looks like an issue in the Decompression logic in the GeoTiff reader. This is causing some of the GeoTiff reading to fail.

We need to find the unreadable GeoTiffs and fix our GeoTiff reader. So let's play...

* Find the GeoTiff! *

I wrote a unit test that will recurse through a directory and report unreadable GeoTiff files. It's a hack job, but it'll get the job done.


Edit the val THE_DIR to point to the directory that contains your GeoTiffs.

Then drop into sbt:

> ./sbt

geotrellis > project raster-test

raster-test> test-only FINDTHEGEOTIFF


This should report which GeoTiffs fail, and how many.
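
For readers without the original test file, here is a minimal sketch of the same idea: walk a directory recursively, try to read every .tif/.tiff file, and collect the failures. It is not the test referred to above. The names FindUnreadable, listTiffs and findUnreadable are made up for illustration, and the reader is passed in as a plain function (the `read` parameter) so that no particular GeoTrellis call is assumed.

```scala
import java.io.File
import scala.util.{Failure, Success, Try}

// Hypothetical helper, not part of GeoTrellis.
object FindUnreadable {
  // Recursively collect every file under `dir` whose name ends in .tif or .tiff.
  def listTiffs(dir: File): Seq[File] = {
    // listFiles returns null for non-directories or on I/O errors, hence Option(...)
    val entries = Option(dir.listFiles).map(_.toSeq).getOrElse(Seq.empty)
    val (dirs, files) = entries.partition(_.isDirectory)
    val tiffs = files.filter { f =>
      val name = f.getName.toLowerCase
      name.endsWith(".tif") || name.endsWith(".tiff")
    }
    tiffs ++ dirs.flatMap(listTiffs)
  }

  // Call `read` on every file found and return the files for which it threw,
  // paired with the exception. `read` stands in for the reader under test.
  def findUnreadable(dir: File)(read: File => Any): Seq[(File, Throwable)] =
    listTiffs(dir).flatMap { f =>
      Try(read(f)) match {
        case Success(_) => None
        case Failure(e) => Some(f -> e)
      }
    }
}
```

Calling FindUnreadable.findUnreadable(new File(...)) with the reader of your choice returns one (file, exception) pair per failing file, so the size of the result is the failure count.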

If you could get me a GeoTiff that fails, that would be really helpful for us to squash this bug. 


Thanks!


Rob



On Thu, Feb 19, 2015 at 1:44 PM, André Stumpf <note...@gmail.com> wrote:
Hi,

I'm trying to set up an environment that allows me to ingest tiled rasters into HDFS with GeoTrellis.
I've set up a single-node cluster on my local machine following this guide:

https://districtdatalabs.silvrback.com/creating-a-hadoop-pseudo-distributed-environment

Based on the previous discussions here and the shell script provided by Rob Emanuele (https://gist.github.com/lossyrob/59f8116b07d37f7f45c5),
I was able to get all the basic parts running. However, when I try to start the ingest process I get the error below. I have the impression that
it fails on a tile at the border of the raster that is smaller than the default size of 1000x1000 pixels, but I'm not sure whether this is related.

Thanks in advance for any advice,
André

student@⋅⋅⋅⋅:~/Bureau$ ./ingestGeoTiff.sh
Spark assembly has been built with Hive, including Datanucleus jars on classpath
19:56:20 Utils: Your hostname, ⋅⋅⋅⋅⋅⋅⋅⋅⋅ resolves to a loopback address: 127.0.1.1; using ⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅ instead (on interface eth0)
19:56:20 Utils: Set SPARK_LOCAL_IP if you need to bind to another address
19:56:21 Slf4jLogger: Slf4jLogger started
19:56:21 Remoting: Starting remoting
19:56:21 Remoting: Remoting started; listening on addresses :[akka.tcp://sparkDriver@⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅:49063]
19:56:21 NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
[Stage 0:===>                (15 + 10) / 624]
19:56:24 Executor: Exception in task 23.0 in stage 0.0 (TID 23)
java.lang.ArrayIndexOutOfBoundsException: 196
    at geotrellis.raster.io.geotiff.reader.decompression.PackBitsDecompression$PackBits.uncompressPackBits(PackBitsDecompression.scala:53)
    at geotrellis.raster.io.geotiff.reader.ImageReader$.read(ImageReader.scala:89)
    at geotrellis.raster.io.geotiff.reader.GeoTiffReader$.recurReadImageDirectory$1(GeoTiffReader.scala:62)
    at geotrellis.raster.io.geotiff.reader.GeoTiffReader$.readImageDirectory$1(GeoTiffReader.scala:52)
    at geotrellis.raster.io.geotiff.reader.GeoTiffReader$.read(GeoTiffReader.scala:78)
    at geotrellis.spark.io.hadoop.formats.GeotiffRecordReader.initialize(GeotiffInputFormat.scala:56)
    at org.apache.spark.rdd.NewHadoopRDD$$anon$1.<init>(NewHadoopRDD.scala:135)
    at org.apache.spark.rdd.NewHadoopRDD.compute(NewHadoopRDD.scala:107)
    at org.apache.spark.rdd.NewHadoopRDD.compute(NewHadoopRDD.scala:69)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:263)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:230)
    at org.apache.spark.rdd.MappedRDD.compute(MappedRDD.scala:31)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:263)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:230)
    at org.apache.spark.rdd.MappedRDD.compute(MappedRDD.scala:31)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:263)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:230)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:61)
    at org.apache.spark.scheduler.Task.run(Task.scala:56)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:196)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
19:56:24 TaskSetManager: Lost task 23.0 in stage 0.0 (TID 23, localhost): java.lang.ArrayIndexOutOfBoundsException: 196
    at geotrellis.raster.io.geotiff.reader.decompression.PackBitsDecompression$PackBits.uncompressPackBits(PackBitsDecompression.scala:53)
    at geotrellis.raster.io.geotiff.reader.ImageReader$.read(ImageReader.scala:89)
    at geotrellis.raster.io.geotiff.reader.GeoTiffReader$.recurReadImageDirectory$1(GeoTiffReader.scala:62)
    at geotrellis.raster.io.geotiff.reader.GeoTiffReader$.readImageDirectory$1(GeoTiffReader.scala:52)

....
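
The trace above ends inside a PackBits decompression routine. As background only, here is a minimal sketch of the standard PackBits scheme with explicit bounds checks, so that malformed or unexpectedly short input produces a descriptive error rather than an ArrayIndexOutOfBoundsException. This is not the GeoTrellis implementation, and nothing in this thread establishes what the actual bug is. The name PackBitsSketch and the `expectedSize` parameter (the number of uncompressed bytes the caller expects for one strip or tile) are assumptions made for the example.

```scala
// Hypothetical stand-alone decoder, not GeoTrellis code.
object PackBitsSketch {
  // Decode PackBits-compressed `input` into exactly `expectedSize` bytes.
  def decode(input: Array[Byte], expectedSize: Int): Array[Byte] = {
    val out = new Array[Byte](expectedSize)
    var in = 0 // read position in `input`
    var o = 0  // write position in `out`
    while (o < expectedSize) {
      require(in < input.length,
        s"input exhausted after $o of $expectedSize bytes")
      val n = input(in).toInt // signed header byte
      in += 1
      if (n >= 0) {
        // 0..127: copy the next n + 1 bytes literally
        val count = n + 1
        require(in + count <= input.length, "literal run exceeds input")
        require(o + count <= expectedSize, "literal run exceeds output")
        System.arraycopy(input, in, out, o, count)
        in += count
        o += count
      } else if (n != -128) {
        // -127..-1: repeat the next byte 1 - n times
        val count = 1 - n
        require(in < input.length, "repeat run has no value byte")
        require(o + count <= expectedSize, "repeat run exceeds output")
        java.util.Arrays.fill(out, o, o + count, input(in))
        in += 1
        o += count
      }
      // n == -128 is a no-op by definition
    }
    out
  }
}
```

In a decoder of this shape, the size passed as `expectedSize` matters for segments at the edge of an image, which may hold fewer bytes than a full-sized segment. Whether that is related to the failure reported here is not confirmed.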




--
Rob Emanuele, Tech Lead, GeoTrellis

Azavea |  340 N 12th St, Ste 402, Philadelphia, PA
rema...@azavea.com  | T 215.701.7692  | F 215.925.2663
Web azavea.com  |  Blog azavea.com/blogs  | Twitter @azavea

Rob Emanuele

Jul 17, 2015, 1:23:41 AM
to geotrel...@googlegroups.com, Johan Stenberg
Hi Albert,

Did you import the type into your source file?

import geotrellis.raster.io.geotiff._

These sorts of questions would be best asked on Gitter (https://gitter.im/geotrellis/geotrellis).

Also, when providing stack traces, make sure to include the top part of the stack trace (the part of the error that shows what code threw the error).


You can see that there's no .apply method that takes a file path. We differentiate between reading multi-band and single-band rasters, so you should use one of those types, e.g.


Thanks,
Rob

Albert

Jul 17, 2015, 6:12:51 AM
to geotrel...@googlegroups.com
Hi Rob,
I want to say thank you very much; this was very helpful to me.
Thanks again,
Albert


On Friday, July 17, 2015 at 1:23:41 PM UTC+8, Rob Emanuele wrote: