Multipart upload failed.

Kim Chew

Jun 16, 2017, 6:08:55 PM
to JetS3t Users
I have a Spark application whose last job inserts rows into a Hive table, which means it moves files from temporary locations to their destinations. The destinations in this case are 31 partitions, each holding approximately 80 GB of data, so the total data size is around 2 TB.

However, it sometimes stops after processing a few partitions. The log shows something like this:

*****************************************************************************************************************
17/06/16 03:55:32 main INFO s3OperationsLog: Method=PUT ResponseCode=200 URI=http://usage-data-test-dev.s3.amazonaws.com/stress-testing%2Fmerged%2Fmultiple-big-partitions%2F5000-rows%2Fdt%3D15%2Fpart-00111?uploadId=QAo_HF1lIZ4EXXkwWZiJaQLFdyV7qdyIhIy7mCdHl3hADleKz9pk2friKDw9usK1af3UVQzhOC.ZkMWZ.IPun4kgTsMT9XhbZx6xoWWlWuVSPOXRF1tTr1cpo__wWLVaKnNXrbAN7y6hHs3.D3ei1CRrFpS4J0plyr2lXRHFYxk-&partNumber=5
17/06/16 03:55:33 main INFO s3OperationsLog: Method=PUT ResponseCode=200 URI=http://usage-data-test-dev.s3.amazonaws.com/stress-testing%2Fmerged%2Fmultiple-big-partitions%2F5000-rows%2Fdt%3D15%2Fpart-00111?uploadId=QAo_HF1lIZ4EXXkwWZiJaQLFdyV7qdyIhIy7mCdHl3hADleKz9pk2friKDw9usK1af3UVQzhOC.ZkMWZ.IPun4kgTsMT9XhbZx6xoWWlWuVSPOXRF1tTr1cpo__wWLVaKnNXrbAN7y6hHs3.D3ei1CRrFpS4J0plyr2lXRHFYxk-&partNumber=6
17/06/16 03:55:33 main INFO s3OperationsLog: Method=POST ResponseCode=200 URI=http://usage-data-test-dev.s3.amazonaws.com/stress-testing%2Fmerged%2Fmultiple-big-partitions%2F5000-rows%2Fdt%3D15%2Fpart-00111?uploadId=QAo_HF1lIZ4EXXkwWZiJaQLFdyV7qdyIhIy7mCdHl3hADleKz9pk2friKDw9usK1af3UVQzhOC.ZkMWZ.IPun4kgTsMT9XhbZx6xoWWlWuVSPOXRF1tTr1cpo__wWLVaKnNXrbAN7y6hHs3.D3ei1CRrFpS4J0plyr2lXRHFYxk-
17/06/16 03:55:33 main INFO s3OperationsLog: Method=DELETE ResponseCode=204 URI=http://usage-data-test-dev.s3.amazonaws.com/tmp%2Fhive-staging%2Ftmp_hive_2017-06-15_22-15-01_586_8760769274626057205-1%2F-ext-10000%2Fdt%3D15%2Fpart-00111
17/06/16 03:55:33 main INFO s3OperationsLog: Method=HEAD ResponseCode=200 URI=http://usage-data-test-dev.s3.amazonaws.com/tmp%2Fhive-staging%2Ftmp_hive_2017-06-15_22-15-01_586_8760769274626057205-1%2F-ext-10000%2Fdt%3D15%2Fpart-00112
17/06/16 03:55:33 main INFO s3OperationsLog: Method=POST ResponseCode=200 URI=http://usage-data-test-dev.s3.amazonaws.com/stress-testing%2Fmerged%2Fmultiple-big-partitions%2F5000-rows%2Fdt%3D15%2Fpart-00112?uploads
17/06/16 03:55:47 main INFO s3OperationsLog: Method=PUT ResponseCode=200 URI=http://usage-data-test-dev.s3.amazonaws.com/stress-testing%2Fmerged%2Fmultiple-big-partitions%2F5000-rows%2Fdt%3D15%2Fpart-00112?uploadId=K6HHay6y.5pDLM_QzQ.kXFvM.hUi.jYHIkZe4nmHS4bpFEkZo_ZQ9e2IiA.YMpNhODGpxxUef4.DBBmTg_TzC5YN.OjFjfnD0.uPHD9QmY4e_mo7nQHiEbBcBYYiehpL3W0Wc6ykd5nTSzxy1L2t_tF1oVX43q9ITeW6t4Q8dG8-&partNumber=1
17/06/16 03:56:10 main INFO s3OperationsLog: Method=PUT ResponseCode=200 URI=http://usage-data-test-dev.s3.amazonaws.com/stress-testing%2Fmerged%2Fmultiple-big-partitions%2F5000-rows%2Fdt%3D15%2Fpart-00112?uploadId=K6HHay6y.5pDLM_QzQ.kXFvM.hUi.jYHIkZe4nmHS4bpFEkZo_ZQ9e2IiA.YMpNhODGpxxUef4.DBBmTg_TzC5YN.OjFjfnD0.uPHD9QmY4e_mo7nQHiEbBcBYYiehpL3W0Wc6ykd5nTSzxy1L2t_tF1oVX43q9ITeW6t4Q8dG8-&partNumber=2
.....
Exception in thread "main" java.lang.IllegalArgumentException: Null last modified not allowed.
    at org.jets3t.service.model.MultipartPart.<init>(MultipartPart.java:42)
    at org.jets3t.service.impl.rest.XmlResponsesSaxParser$MultipartPartResultHandler.getMultipartPart(XmlResponsesSaxParser.java:1228)
    at org.jets3t.service.impl.rest.XmlResponsesSaxParser.parseMultipartUploadPartCopyResult(XmlResponsesSaxParser.java:355)
    at org.jets3t.service.impl.rest.httpclient.RestS3Service.multipartUploadPartCopyImpl(RestS3Service.java:970)
    at org.jets3t.service.S3Service.multipartUploadPartCopy(S3Service.java:3651)
    at org.jets3t.service.S3Service.moveObjectMaybeAsMultipart(S3Service.java:3348)
    at org.apache.hadoop.fs.s3native.Jets3tNativeFileSystemStore.rename(Jets3tNativeFileSystemStore.java:373)
    at sun.reflect.GeneratedMethodAccessor49.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:250)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:104)
    at org.apache.hadoop.fs.s3native.$Proxy42.rename(Unknown Source)
    at org.apache.hadoop.fs.s3native.NativeS3FileSystem.rename(NativeS3FileSystem.java:1636)
    at org.apache.hadoop.hive.ql.metadata.Hive.renameFile(Hive.java:2880)
    at org.apache.hadoop.hive.ql.metadata.Hive.replaceFiles(Hive.java:3429)
    at org.apache.hadoop.hive.ql.metadata.Hive.loadPartition(Hive.java:1504)
    at org.apache.hadoop.hive.ql.metadata.Hive.loadDynamicPartitions(Hive.java:1822)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
**************************************************************************

It seems to me that some of the part files failed to pass their "lastModified" date back.
Is there a way to remedy this?
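
One possible explanation, offered as an assumption rather than a confirmed diagnosis: S3 documents that a part-copy request can return HTTP 200 while carrying an <Error> document in the body instead of a <CopyPartResult>. A parser that expects <LastModified> would then find none, which would match the "Null last modified" exception in the trace. The sketch below is plain Java with no JetS3t dependency. The class name CopyPartResponseCheck, the method hasLastModified, and the sample XML bodies are hypothetical and only illustrate the check a caller could make before deciding to retry the part copy.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper: not part of JetS3t or Hadoop.
public class CopyPartResponseCheck {

    /**
     * Returns true only if the body is a CopyPartResult document that
     * contains a non-empty LastModified element. An embedded Error
     * document, or a result without LastModified, yields false.
     */
    static boolean hasLastModified(String xmlBody) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        Document doc = factory.newDocumentBuilder()
                .parse(new ByteArrayInputStream(xmlBody.getBytes(StandardCharsets.UTF_8)));
        Element root = doc.getDocumentElement();
        if (!"CopyPartResult".equals(root.getTagName())) {
            return false; // e.g. an <Error> body delivered with HTTP 200
        }
        NodeList nodes = root.getElementsByTagName("LastModified");
        return nodes.getLength() > 0
                && !nodes.item(0).getTextContent().trim().isEmpty();
    }

    public static void main(String[] args) throws Exception {
        // Sample bodies are made up for illustration.
        String ok = "<CopyPartResult>"
                + "<LastModified>2017-06-16T03:55:33.000Z</LastModified>"
                + "<ETag>\"abc\"</ETag>"
                + "</CopyPartResult>";
        String embeddedError = "<Error>"
                + "<Code>InternalError</Code>"
                + "<Message>Please try again.</Message>"
                + "</Error>";
        System.out.println(hasLastModified(ok));
        System.out.println(hasLastModified(embeddedError));
    }
}
```

If this is indeed the cause, retrying the failed part copy (or the whole rename) when the check returns false might be enough, but I have not verified that against the JetS3t source.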

TIA.

Kim