Out of memory issue when uploading a large number of files

A. Sij

Jun 22, 2011, 7:18:12 PM

to JetS3t Users

Hi All,

I keep running into an out-of-memory error while using Synchronize,
even though I am using the batch option with 1000 files per batch.

I have also set the heap size to around 2 GB. Any suggestions or
directions to look in would be highly appreciated.

The error log is below:

<code>

Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
    at java.util.Arrays.copyOf(Arrays.java:2882)
    at java.lang.AbstractStringBuilder.expandCapacity(AbstractStringBuilder.java:100)
    at java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:572)
    at java.lang.StringBuilder.append(StringBuilder.java:203)
    at java.io.UnixFileSystem.resolve(UnixFileSystem.java:93)
    at java.io.File.<init>(File.java:207)
    at java.io.File.listFiles(File.java:1056)
    at org.jets3t.service.utils.FileComparer.buildObjectKeyToFilepathMapForDirectory(FileComparer.java:408)
    at org.jets3t.service.utils.FileComparer.buildObjectKeyToFilepathMapForDirectory(FileComparer.java:419)
    at org.jets3t.service.utils.FileComparer.buildObjectKeyToFilepathMapForDirectory(FileComparer.java:419)
    at org.jets3t.service.utils.FileComparer.buildObjectKeyToFilepathMapForDirectory(FileComparer.java:419)
    at org.jets3t.service.utils.FileComparer.buildObjectKeyToFilepathMapForDirectory(FileComparer.java:419)
    at org.jets3t.service.utils.FileComparer.buildObjectKeyToFilepathMapForDirectory(FileComparer.java:419)
    at org.jets3t.service.utils.FileComparer.buildObjectKeyToFilepathMapForDirectory(FileComparer.java:419)
    at org.jets3t.service.utils.FileComparer.buildObjectKeyToFilepathMapForDirectory(FileComparer.java:419)
    at org.jets3t.service.utils.FileComparer.buildObjectKeyToFilepathMap(FileComparer.java:361)
    at org.jets3t.apps.synchronize.Synchronize.run(Synchronize.java:1010)
    at org.jets3t.apps.synchronize.Synchronize.main(Synchronize.java:1611)

</code>

Thanks and Regards
A. Sij

avin sijariya

Jun 22, 2011, 7:29:58 PM

to JetS3t Users

I forgot to mention a few more details.

I am using the latest JetS3t release, jets3t-0.8.1.zip,
on Ubuntu with java version "1.6.0_18".

Looking forward to a quick response.

Regards

A. Sij

James Murty

Jun 22, 2011, 7:31:47 PM

to jets3t...@googlegroups.com

Quick response: Assign more memory to Java, or upload your files in
smaller batches.

James


James Murty

Jun 22, 2011, 7:33:54 PM

to jets3t...@googlegroups.com

http://groups.google.com/group/jets3t-users/msg/f5b3efb7842ec5bf?hl=en

A. Sij

Jun 22, 2011, 7:39:24 PM

to JetS3t Users

OK. The batch size is currently 1000; I will make it smaller, say 100.
I have also set -Xmx to 2.5 GB.

The error occurs in the file comparison stage, as you can see in the
log.

For information, we are uploading 4 million files, of which 50k are
already uploaded. The error started after we put more files in the
repository; there was no issue when the number of files was smaller.

Thanks for the quick response :)

The A* B* approach is good but requires manual effort :(

Regards
A. Sij


A. Sij

Jun 22, 2011, 7:58:49 PM

to JetS3t Users

One more question:

To decrease the batch size, should I use this property?
upload.transformed-files-batch-size=1000
Also, I am not using any kind of transformation, just HTTPS transfer.

Thanks and Regards
A. Sij

A. Sij

Jun 22, 2011, 8:02:28 PM

to JetS3t Users

One question: is upload.transformed-files-batch-size=300 the property
where I can change the batch size?

Also, I am not using any transformation or zip, just the HTTPS protocol.

Regards
A. Sij.


James Murty

Jun 23, 2011, 12:34:23 AM

to jets3t...@googlegroups.com

Batch mode is enabled using the "--batch" option when running the
Synchronize program. The batch size is hard-coded to 1,000 -- the
maximum number of objects that can be listed in S3 in a single
request.

If running Synchronize in batch mode exceeds the available memory,
reducing the size of the batch isn't likely to help, since at that
point it is information about the local files that is most likely
chewing up all the memory. As of version 0.8.1 I've minimized the
memory usage about as far as I can for local files: only the filename
path string and target object key name string are stored, but this can
still add up to a lot of bytes for large numbers of files.
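
To get a feel for how much path text is involved, a rough sketch like the one below can count the files and path characters under a local directory. It is not part of JetS3t; the function name estimate_path_chars is made up for illustration, and the result is only a lower bound on what the JVM would actually hold.

```shell
#!/usr/bin/env bash
# Rough lower-bound estimate of the path text Synchronize has to keep in
# memory for a local directory tree. Illustrative only: estimate_path_chars
# is a hypothetical helper, not a JetS3t tool.

# estimate_path_chars DIR
# Prints "<file count> <total path characters>" for regular files under DIR.
# File names containing newlines would be miscounted.
estimate_path_chars() {
  local dir="$1"
  find "$dir" -type f -print |
    awk '{ n++; chars += length($0) } END { printf "%d %d\n", n, chars }'
}

# Example call (path is a placeholder):
#   estimate_path_chars /path/to/files
```

Java 6 strings use two bytes per character, and each file also needs its object key string plus per-object overhead, so the real heap use is likely to be well above the raw character total printed here.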

If the number of local files exceeds the memory you are able to give
the Synchronize app you will need to manually "batch" your uploads by
uploading smaller subsets of files using multiple command invocations.

Changing the "upload.transformed-files-batch-size" option won't have
any effect if you are not encrypting or gzipping files during upload.

Cheers,
James

A. Sij

Jun 23, 2011, 5:27:13 AM

to JetS3t Users

Hi James,

Thanks for the info. Can you show me how to use an A* B* pattern to
pick selected files?

A sample command would be fine.

Thanks and Regards
A. Sij.

James Murty

Jun 23, 2011, 10:48:12 AM

to jets3t...@googlegroups.com

The best approach is to upload individual sub-directories:

synchronize.sh UP target-bucket/DirA /path/to/files/DirA
synchronize.sh UP target-bucket/DirB /path/to/files/DirB
etc.

Alternatively, you can use normal file wildcards but must be careful
to avoid deleting files in S3 that don't match the wildcard:

synchronize.sh UP target-bucket /path/to/files/A* --nodelete
synchronize.sh UP target-bucket /path/to/files/B* /path/to/files/C* --nodelete
etc.
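
The per-sub-directory approach above could be scripted roughly as follows. This is only a sketch: it assumes synchronize.sh is on the PATH (overridable through a hypothetical SYNC_CMD variable), and sync_subdirs is an illustrative name, not something JetS3t provides. It simply repeats the "UP target-bucket/DirA /path/to/files/DirA" form once per immediate sub-directory.

```shell
#!/usr/bin/env bash
# Run one Synchronize invocation per immediate sub-directory, so that each
# run only has to hold the file list of a single sub-directory in memory.
# SYNC_CMD and sync_subdirs are hypothetical names used for illustration.

SYNC_CMD="${SYNC_CMD:-synchronize.sh}"

# sync_subdirs BUCKET ROOT
# Calls: $SYNC_CMD UP BUCKET/<subdir> ROOT/<subdir> for each sub-directory.
sync_subdirs() {
  local bucket="$1" root="$2" dir name
  for dir in "$root"/*/; do
    [ -d "$dir" ] || continue          # glob matched nothing
    dir="${dir%/}"                     # drop the trailing slash
    name="$(basename "$dir")"
    "$SYNC_CMD" UP "$bucket/$name" "$dir" || return 1   # stop on first failure
  done
}

# Example call (bucket and path are placeholders):
#   sync_subdirs target-bucket /path/to/files
```

Files that sit directly in the root directory are not covered by this loop and would need a separate invocation. If a single sub-directory is still too large, the same idea can be applied one level deeper.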

A. Sij

Jun 24, 2011, 9:07:47 AM

to JetS3t Users

Thanks James,

I have already implemented the strategy.

One last question: can you give me a rough idea of the maximum number
of files that can safely be uploaded under the following conditions?

-Xmx 2.5 GB
Ubuntu
batch option used

Or, in other words, what is the maximum number of files uploaded in a
single command that you have tried or heard of?

Thanks and Regards
A. Sij
