Invalid or corrupt jarfile


Eric Pichon

Dec 22, 2015, 10:16:11 AM
to Web Data Commons
Hi,

I followed instructions from here: http://webdatacommons.org/framework/index.html

I am trying to run the example with the custom processor for "numerical lines" from that tutorial.

When I launch the spot instance workers, nothing happens: the queue size does not go down.

After ssh'ing into one of the spot instance workers, I discovered this error at the end of /var/log/cloud-init-output.log:

Error: Invalid or corrupt jarfile /tmp/start.jar


It turns out /tmp/start.jar is a copy of target/dpef-1.0.4-SNAPSHOT.jar
 
There does not seem to be anything wrong with the file itself (the MD5 hash is the same on the master and the worker).

When I attempt to manually do:

java -jar /tmp/start.jar 

I do get the error. Same thing on the master.

Looking for possible causes for this error message, I see that there is a bug in Java 7 whereby java -jar cannot launch a JAR file that contains more than 64k files.


Expanding the one at hand I indeed find 74505 files.
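
For anyone who wants to check a JAR without unpacking it, here is a minimal Java sketch that counts the entries and compares the count to the 16-bit limit of the classic ZIP format. The class name JarEntryCounter and the threshold constant are assumptions made for illustration; neither is part of the framework. Note that the count includes directory entries, so it may differ slightly from a count of extracted files.

```java
import java.io.IOException;
import java.util.zip.ZipFile;

public class JarEntryCounter {
    // Assumption: the classic ZIP end-of-central-directory record stores the
    // entry count in 16 bits, so archives with 0xFFFF or more entries need
    // the ZIP64 extension.
    public static final int ZIP64_THRESHOLD = 0xFFFF;

    /** Returns true if an archive with this many entries needs ZIP64. */
    public static boolean needsZip64(int entryCount) {
        return entryCount >= ZIP64_THRESHOLD;
    }

    /** Counts all entries (files and directories) in the given archive. */
    public static int countEntries(String jarPath) throws IOException {
        try (ZipFile zip = new ZipFile(jarPath)) {
            return zip.size();
        }
    }

    public static void main(String[] args) throws IOException {
        // Default path is the one from the worker log; pass another as args[0].
        String path = args.length > 0 ? args[0] : "/tmp/start.jar";
        int n = countEntries(path);
        System.out.println(path + ": " + n + " entries, needs ZIP64: " + needsZip64(n));
    }
}
```

With the 74505 files reported above, needsZip64 returns true, which would be consistent with the Java 7 launcher bug.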

Has anyone encountered this problem? If so, any idea on how to fix or work around this?

Many thanks for any insights/ideas,
Eric 

PS: I used the following as a master/build instance:

Instance type
c4.xlarge

Robert Meusel

Dec 22, 2015, 10:37:05 AM
to Web Data Commons
Hi Eric,

Thanks for pointing this out. 
My last extraction was done using 

ubuntu/images/ebs/ubuntu-trusty-14.04-amd64-server-20140416.1 (ami-018c9568)

which worked for me. So in case you can switch I would recommend to use this AMI.

Another option would be to upload the non-fat jar (without all the dependencies) and upload the dependencies separately.
Or you could reduce the number of classes and dependencies (by simply removing packages and their dependencies from the .pom).

Unfortunately, I have not yet tested the framework with Java 8.

Hope this helps,
Robert

Eric PICHON

Dec 22, 2015, 1:25:50 PM
to web-data...@googlegroups.com
Hi Robert,

Many thanks for the quick response.

I re-built everything from scratch with . I get exactly the same thing. The JAR still has 74505 files in it and the log file on the worker instance says:

ubuntu@ip-10-69-5-217:/var/log$ tail cloud-init-output.log
135300K .......... .......... .......... .......... .......... 99%  115M 0s
135350K .......... .......... .......... .......... .......... 99% 84.6M 0s
135400K .......... .......... .......... .......... .......... 99% 65.6M 0s
135450K .......... .......... .......... .......... .......... 99% 50.1M 0s
135500K .......... .......... .......... .......... ...       100% 29.3M=2.8s

2015-12-22 18:08:22 (47.1 MB/s) - '/tmp/start.jar' saved [138796849/138796849]

Error: Invalid or corrupt jarfile /tmp/start.jar
Cloud-init v. 0.7.5 finished at Tue, 22 Dec 2015 18:08:22 +0000. Datasource DataSourceEc2.  Up 105.62 seconds

My Java is a little rusty. I'm also not too clear on how to create a "non-fat" jar, and I don't know which of the dependencies could be removed from the .pom while keeping the framework functional (anything having to do with tests? anything else?)...

What do you recommend? Any idea on why your JAR would pass and this one would not? Could it be that some external dependency grew enough in the past few weeks to push us above the 64k edge?

Thanks,

Eric




Tom Morris

Dec 22, 2015, 2:28:29 PM
to web-data...@googlegroups.com
It seems unlikely that Java 8 would introduce compatibility problems.
Have you tried building with it?

If you need to stick with Java 7, can you use the workaround described
in the bug report?:
http://bugs.java.com/bugdatabase/view_bug.do?bug_id=7194005

java -cp /tmp/start.jar org.webdatacommons.framework.Worker

or is the invocation buried too deeply in the framework?
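
The class name in the command above may not match what the JAR actually declares. As a hedged sketch, the following Java reads the Main-Class attribute from the JAR's manifest and prints the equivalent java -cp command. ManifestMainClass is a hypothetical helper name, not part of the framework.

```java
import java.io.IOException;
import java.util.jar.Attributes;
import java.util.jar.JarFile;
import java.util.jar.Manifest;

public class ManifestMainClass {
    /** Returns the Main-Class attribute, or null if there is none. */
    public static String mainClassOf(Manifest manifest) {
        if (manifest == null) {
            return null; // the JAR has no manifest at all
        }
        return manifest.getMainAttributes().getValue(Attributes.Name.MAIN_CLASS);
    }

    public static void main(String[] args) throws IOException {
        String path = args.length > 0 ? args[0] : "/tmp/start.jar";
        try (JarFile jar = new JarFile(path)) {
            String mainClass = mainClassOf(jar.getManifest());
            if (mainClass == null) {
                System.out.println("No Main-Class attribute found in " + path);
            } else {
                // Equivalent of "java -jar" that names the entry point explicitly.
                System.out.println("java -cp " + path + " " + mainClass);
            }
        }
    }
}
```

This reads the archive through the Java class library rather than through the launcher, which is the part the bug report concerns.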

Tom

Tom Morris

Dec 22, 2015, 2:44:34 PM
to web-data...@googlegroups.com
To answer my own question, the startup script is hardcoded in src/main/java/org/webdatacommons/framework/cli/Master.java 

You'll need to edit this line:

private final String startupScript = "#!/bin/bash \n echo 1 > /proc/sys/vm/overcommit_memory \n aptitude update \n aptitude -y install openjdk-7-jre-headless htop \n wget -O /tmp/start.jar \""
                        + getJarUrl()
                        + "\" \n java -Xmx"
                        + getOrCry("javamemory").trim()
                        + " -jar /tmp/start.jar > /tmp/start.log & \n";


and rebuild to implement the workaround.  Note that the startup script also hardcodes the Java version, so you'll need to mess with this no matter which solution you decide to pursue.
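
As a rough sketch of what the edited script could look like, the standalone Java below builds the same startup script but launches the JVM with -cp and an explicit main class instead of -jar. Everything here is an assumption for illustration: buildStartupScript is a hypothetical helper, the JRE package and main class are passed as parameters rather than taken from the framework, and the main class used in the example is the one from the java -cp command suggested earlier in this thread, which has not been verified against the JAR's manifest.

```java
public class StartupScriptSketch {
    /**
     * Hypothetical helper, not part of the framework: builds a worker startup
     * script that starts the JVM with -cp and an explicit main class.
     */
    public static String buildStartupScript(String jarUrl, String javaMemory,
                                            String jrePackage, String mainClass) {
        return "#!/bin/bash \n"
                + " echo 1 > /proc/sys/vm/overcommit_memory \n"
                + " aptitude update \n"
                + " aptitude -y install " + jrePackage + " htop \n"
                + " wget -O /tmp/start.jar \"" + jarUrl + "\" \n"
                + " java -Xmx" + javaMemory.trim()
                + " -cp /tmp/start.jar " + mainClass
                + " > /tmp/start.log & \n";
    }

    public static void main(String[] args) {
        System.out.print(buildStartupScript(
                "https://example.com/start.jar",          // placeholder URL
                "4G",                                     // placeholder heap size
                "openjdk-7-jre-headless",                 // package from the original script
                "org.webdatacommons.framework.Worker"));  // unverified class name
    }
}
```

Passing the JRE package as a parameter also keeps the Java version out of the hardcoded string.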

Tom

Robert Meusel

Dec 22, 2015, 3:03:32 PM
to Web Data Commons
thanks tom for the help.
let me know if this works, otherwise I can check this deeper by tomorrow.

Robert Meusel

Dec 22, 2015, 3:06:25 PM
to web-data...@googlegroups.com

The workaround should be fine. Let me know if it works.

Eric Pichon

Dec 22, 2015, 4:15:44 PM
to web-data...@googlegroups.com
Robert, Tom,

Many thanks.

I will try this either tomorrow or shortly after Christmas and report back.

Thanks again,
Eric

Robert Meusel

Apr 3, 2016, 12:43:38 PM
to Web Data Commons
Hi all,

We recently moved the framework to Java 8 and tested it with a new extraction. It worked fine, so the bug should be fixed by now.
You can use the most recent version (trunk) or version 1.0.4.

cheers,
robert