Caused by: java.io.IOException: Could not read footer for file: FileStatus{path=alluxio://[master]:19998/store_sales/part-00000-e7e8c7ca-84ca-4ea4-b3b3-ad8164ca0525-c000.snappy.parquet; isDirectory=false; length=119342549; replication=0; blocksize=0; modification_time=0; access_time=0; owner=; group=; permission=rw-rw-rw-; isSymlink=false}
at org.apache.spark.sql.execution.datasources.parquet.ParquetFileFormat$$anonfun$readParquetFootersInParallel$1.apply(ParquetFileFormat.scala:498)
at org.apache.spark.sql.execution.datasources.parquet.ParquetFileFormat$$anonfun$readParquetFootersInParallel$1.apply(ParquetFileFormat.scala:485)
at scala.collection.parallel.AugmentedIterableIterator$class.flatmap2combiner(RemainsIterator.scala:132)
at scala.collection.parallel.immutable.ParVector$ParVectorIterator.flatmap2combiner(ParVector.scala:62)
at scala.collection.parallel.ParIterableLike$FlatMap.leaf(ParIterableLike.scala:1072)
at scala.collection.parallel.Task$$anonfun$tryLeaf$1.apply$mcV$sp(Tasks.scala:49)
at scala.collection.parallel.Task$$anonfun$tryLeaf$1.apply(Tasks.scala:48)
at scala.collection.parallel.Task$$anonfun$tryLeaf$1.apply(Tasks.scala:48)
at scala.collection.parallel.Task$class.tryLeaf(Tasks.scala:51)
at scala.collection.parallel.ParIterableLike$FlatMap.tryLeaf(ParIterableLike.scala:1068)
at scala.collection.parallel.AdaptiveWorkStealingTasks$WrappedTask$class.compute(Tasks.scala:152)
at scala.collection.parallel.AdaptiveWorkStealingForkJoinTasks$WrappedTask.compute(Tasks.scala:443)
at scala.concurrent.forkjoin.RecursiveAction.exec(RecursiveAction.java:160)
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at scala.concurrent.forkjoin.ForkJoinTask.doJoin(ForkJoinTask.java:341)
at scala.concurrent.forkjoin.ForkJoinTask.join(ForkJoinTask.java:673)
at scala.collection.parallel.ForkJoinTasks$WrappedTask$class.sync(Tasks.scala:378)
at scala.collection.parallel.AdaptiveWorkStealingForkJoinTasks$WrappedTask.sync(Tasks.scala:443)
at scala.collection.parallel.ForkJoinTasks$class.executeAndWaitResult(Tasks.scala:426)
at scala.collection.parallel.ForkJoinTaskSupport.executeAndWaitResult(TaskSupport.scala:56)
at scala.collection.parallel.ParIterableLike$ResultMapping.leaf(ParIterableLike.scala:958)
at scala.collection.parallel.Task$$anonfun$tryLeaf$1.apply$mcV$sp(Tasks.scala:49)
at scala.collection.parallel.Task$$anonfun$tryLeaf$1.apply(Tasks.scala:48)
at scala.collection.parallel.Task$$anonfun$tryLeaf$1.apply(Tasks.scala:48)
at scala.collection.parallel.Task$class.tryLeaf(Tasks.scala:51)
at scala.collection.parallel.ParIterableLike$ResultMapping.tryLeaf(ParIterableLike.scala:953)
at scala.collection.parallel.AdaptiveWorkStealingTasks$WrappedTask$class.compute(Tasks.scala:152)
at scala.collection.parallel.AdaptiveWorkStealingForkJoinTasks$WrappedTask.compute(Tasks.scala:443)
at scala.concurrent.forkjoin.RecursiveAction.exec(RecursiveAction.java:160)
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
Caused by: java.lang.IllegalArgumentException: Seek position past the end of the read region (block or file). [436207616]
at alluxio.core.client.runtime.com.google.common.base.Preconditions.checkArgument(Preconditions.java:202)
at alluxio.client.block.stream.BlockInStream.seek(BlockInStream.java:316)
at alluxio.client.file.FileInStream.updateStream(FileInStream.java:313)
at alluxio.client.file.FileInStream.read(FileInStream.java:126)
at alluxio.hadoop.HdfsFileInputStream.read(HdfsFileInputStream.java:98)
at java.io.FilterInputStream.read(FilterInputStream.java:83)
at org.apache.parquet.hadoop.util.H1SeekableInputStream.read(H1SeekableInputStream.java:60)
at org.apache.parquet.bytes.BytesUtils.readIntLittleEndian(BytesUtils.java:67)
at org.apache.parquet.hadoop.ParquetFileReader.readFooter(ParquetFileReader.java:472)
at org.apache.parquet.hadoop.ParquetFileReader.readFooter(ParquetFileReader.java:445)
at org.apache.parquet.hadoop.ParquetFileReader.readFooter(ParquetFileReader.java:421)
at org.apache.spark.sql.execution.datasources.parquet.ParquetFileFormat$$anonfun$readParquetFootersInParallel$1.apply(ParquetFileFormat.scala:491)
	... 32 more

Thanks,
Hi José,

It looks to me that this file alluxio://[master]:19998/store_sales/part-00000-e7e8c7ca-84ca-4ea4-b3b3-ad8164ca0525-c000.snappy.parquet might be corrupted for some reason. Can you compare its length as reported by Alluxio, by running

bin/alluxio fs ls /store_sales/part-00000-e7e8c7ca-84ca-4ea4-b3b3-ad8164ca0525-c000.snappy.parquet

and also check its length in the original data source?

- Bin
--
You received this message because you are subscribed to the Google Groups "Alluxio Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to alluxio-user...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/alluxio-users/0be6ac01-05e2-4648-8714-a728bfbe8094%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
--
Hi Bin, I get the same result. I don't know whether it is normal that replication and blocksize are 0 here.