Automatic retry for "Failed to copy content from file"

Zhihai Liu

Mar 1, 2017, 3:05:03 PM
to Alfresco Bulk Import Tool
My bulk import ran into the following exception and failed. I wonder whether the Bulk Import Tool has any automatic retry mechanism for this kind of random runtime exception. I have 80K+ files in the bulk import and would hate to restart the import when this happens. Thank you.


org.alfresco.extension.bulkimport.impl.ItemImportException: Unexpected exception:
 class org.alfresco.service.cmr.repository.ContentIOException: 020125736 Failed to copy content from file: 
   writer: ContentAccessor[ contentUrl=store://2017/3/1/13/41/59d0db89-0c00-42e0-9d53-f40d9329c068.bin, mimetype=application/vnd.openxmlformats-officedocument.wordprocessingml.document, size=0, encoding=UTF-8, locale=en_US]
...
Caused by: java.nio.channels.ClosedByInterruptException
	at java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:202)
	at sun.nio.ch.FileChannelImpl.write(FileChannelImpl.java:215)
	at org.alfresco.repo.content.AbstractContentAccessor$CallbackFileChannel.write(AbstractContentAccessor.java:422)
	at java.nio.channels.Channels.writeFullyImpl(Channels.java:78)
	at java.nio.channels.Channels.writeFully(Channels.java:101)
	at java.nio.channels.Channels.access$000(Channels.java:61)
	at java.nio.channels.Channels$1.write(Channels.java:174)
	at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
	at java.io.BufferedOutputStream.write(BufferedOutputStream.java:126)
	at org.alfresco.repo.content.LimitedStreamCopier.copyStreamsLong(LimitedStreamCopier.java:87)
	at org.alfresco.repo.content.AbstractContentWriter.copyStreams(AbstractContentWriter.java:502)
	at org.alfresco.repo.content.AbstractContentWriter.putContent(AbstractContentWriter.java:479)
	... 20 more

Peter Monks

Mar 1, 2017, 3:40:32 PM
to alfresco-bulk-f...@googlegroups.com
G'day Zhihai,

tl;dr - don't be scared of failures, and don't be scared to restart an import numerous times (with the "Replace existing files" checkbox unchecked)!

Long version:
This may not be a random runtime exception, and even if it were, the tool has no way of knowing one way or the other.  Therefore the tool is intentionally designed to fail fast, so that the operator has an opportunity to RCA the problem and address it in some way.  Of course this means that failures are also common, so the tool includes a "continue where the previous import left off" feature - this is what happens when you turn off (uncheck) the "Replace existing files" option in the initiation UI.

When this checkbox is off (unchecked), the tool skips over files from the source that already exist in the target, and only resumes importing when it finds folders or files that don't yet exist in the target. Skipping files is a R/O operation (in both the source and the target), so it's relatively fast (orders of magnitude faster than importing, which is R/W and I/O heavy).

In my experience, virtually all non-trivial (> 1 million file) imports fail numerous times with these sorts of errors (file permissions issues, previously unknown disk corruption, network I/O issues when the source is a remotely mounted volume, etc. etc.), so ensuring the tool was efficiently restartable has been core to the design from day one.

Cheers,
Peter
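
The skip-or-import behaviour described above can be pictured with a small sketch. This is only an illustration in Python, not the tool's actual code: `plan_import` and its arguments are hypothetical names, and paths are compared as plain strings.

```python
def plan_import(source_paths, target_paths, replace_existing=False):
    """Split source paths into (to_import, skipped).

    Hypothetical illustration of "Replace existing files" being unchecked:
    anything already present in the target is skipped (a read-only check),
    and only missing items are imported.
    """
    existing = set(target_paths)  # set lookup keeps the skip check cheap
    to_import, skipped = [], []
    for path in source_paths:
        if not replace_existing and path in existing:
            skipped.append(path)
        else:
            to_import.append(path)
    return to_import, skipped
```

Under these assumptions, rerunning after a failure only pays the import cost for files that did not make it into the target the previous time.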




--
You received this message because you are subscribed to the Google Groups "Alfresco Bulk Import Tool" group.
To unsubscribe from this group and stop receiving emails from it, send an email to alfresco-bulk-filesystem-import+unsubscribe@googlegroups.com.
To post to this group, send email to alfresco-bulk-filesystem-imp...@googlegroups.com.
Visit this group at https://groups.google.com/group/alfresco-bulk-filesystem-import.
For more options, visit https://groups.google.com/d/optout.

Zhihai Liu

Mar 1, 2017, 4:02:18 PM
to Alfresco Bulk Import Tool
Thanks for the quick response, Peter! It looks like the "failed to copy content from file" error in my case is indeed random/runtime - the files that failed before were imported successfully in subsequent imports. I understand that unchecking the checkbox lets the import pick up from where it left off. Thank you for explaining the rationale behind the design decision.

Peter Monks

Mar 1, 2017, 6:03:23 PM
to alfresco-bulk-f...@googlegroups.com
> It looks like the "failed to copy content from file" error in my case is indeed random/runtime - the files failed before were imported successfully in subsequent imports later.

Is the source directory on a local hardware device (hard drive, RAID array etc.), or is it network mounted?  If the latter, it may indicate problems with your network.

Cheers,
Peter
 


Zhihai Liu

Mar 2, 2017, 8:51:05 AM
to Alfresco Bulk Import Tool
It is network mounted so it could have sporadic problems.

I have a Spring Batch application that prepares the file system data for bulk import. It writes to the network mount as well, but I don't experience these runtime exceptions. That could be because of my Spring Batch retryable-exception-class configuration. Do you think the idea is worth considering for the bulk import tool? I mean, people can get bored really quickly monitoring the status page and rerunning the import... :-)
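
For readers unfamiliar with the idea, retrying only a declared set of exception types can be sketched as follows. This is a generic Python illustration, similar in spirit to a retryable-exception-class setting but not Spring Batch's implementation; `retry_on` and its parameters are hypothetical names.

```python
import functools
import time


def retry_on(retryable, attempts=3, delay_seconds=1.0, sleep=time.sleep):
    """Retry the wrapped call when it raises one of the `retryable` exceptions.

    Any other exception propagates immediately, and the last retryable
    failure is re-raised once `attempts` calls have been made.
    """
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(1, attempts + 1):
                try:
                    return func(*args, **kwargs)
                except retryable:
                    if attempt == attempts:
                        raise
                    # Linear back-off: wait a little longer after each failure.
                    sleep(delay_seconds * attempt)
        return wrapper
    return decorator
```

A writer to a flaky network mount could then be declared with `@retry_on(OSError)`, which hides sporadic I/O failures from the caller while still surfacing persistent ones.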

Peter Monks

Mar 2, 2017, 2:09:34 PM
to alfresco-bulk-f...@googlegroups.com
Restarts are already fully scriptable outside the tool, e.g. via shell scripts that call the status & initiation Web Scripts. I personally like httpie and jq for this kind of thing - they're a great combo!

Cheers,
Peter
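
A minimal sketch of such a restart script, written in Python rather than httpie and jq. Everything specific here is an assumption and not taken from the thread: the base URL, the two Web Script paths, and the "Succeeded" state name should be checked against your installation, and authentication is left out entirely. Only the `processingState` field and the "Failed" state come from the discussion.

```python
import json
import time
import urllib.parse
import urllib.request

BASE_URL = "http://localhost:8080/alfresco/service"  # assumption: adjust to your server
STATUS_PATH = "/bulk/import/status.json"             # assumption: status Web Script path
INITIATE_PATH = "/bulk/import/initiate"              # assumption: initiation Web Script path


def fetch_status(opener=urllib.request.urlopen):
    """GET the status document and return it as a dict (no authentication here)."""
    with opener(BASE_URL + STATUS_PATH) as response:
        return json.load(response)


def initiate_import(params, opener=urllib.request.urlopen):
    """POST the form fields in `params` to the initiation Web Script."""
    data = urllib.parse.urlencode(params).encode("utf-8")
    request = urllib.request.Request(BASE_URL + INITIATE_PATH, data=data, method="POST")
    with opener(request) as response:
        return response.status


def run_until_done(get_status, start_import, max_restarts=10,
                   poll_seconds=30, sleep=time.sleep):
    """Poll the status and restart after each failure; return the restart count."""
    restarts = 0
    while True:
        state = get_status().get("processingState")
        if state == "Succeeded":  # assumption: name of the success state
            return restarts
        if state == "Failed":
            if restarts >= max_restarts:
                raise RuntimeError("import still failing after %d restarts" % restarts)
            start_import()
            restarts += 1
        sleep(poll_seconds)
```

It would be wired together as `run_until_done(fetch_status, lambda: initiate_import(form_fields))`, where `form_fields` is a hypothetical dict holding whatever the initiation form submits in your deployment, with the "Replace existing files" option turned off so each restart continues where the previous run stopped.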


Zhihai Liu

Mar 2, 2017, 3:40:23 PM
to Alfresco Bulk Import Tool
Peter, thank you for the tip. I am a newbie with httpie and jq. Do you mind sharing an example? Even though it is outside the tool, I think any serious Alfresco bulk import tool use case would face this challenge. Thanks!

Zhihai Liu

Mar 3, 2017, 4:56:34 PM
to Alfresco Bulk Import Tool
Peter, 

Can you explain the processing state transitions for status check and restart? For example, if processingState=Failed with lastException, run "initiate". If processingState=Scanning with lastException, run "stop" then "initiate", etc. I got started with the web script API but would like to understand the logic flow. Thanks.



Peter Monks

Mar 4, 2017, 9:25:02 PM
to alfresco-bulk-f...@googlegroups.com
G'day Zhihai,


You should only need to consider the "Failed" state, optionally looking at the last exception if you have some reliable way to interpret that information (for I/O related failures the last exception is likely to be platform specific).

Cheers,
Peter

Apologes for speling & gramar erorrs - sent from mobil deivce
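
The rule in this reply (act only on "Failed", and look at the last exception only if it can be interpreted reliably) might be sketched like this. The function and action names are hypothetical, and the sketch assumes the status has already been parsed into a dict whose `lastException` value is a string.

```python
def next_action(status, is_retryable=None):
    """Decide what a monitoring script should do with one status snapshot.

    Returns "restart", "investigate", or "no_action" (hypothetical labels).
    `is_retryable` is an optional callable that inspects the lastException
    text; without it, every failure is treated as worth a restart.
    """
    if status.get("processingState") != "Failed":
        return "no_action"  # every other state is left alone
    last_exception = status.get("lastException")
    if is_retryable is not None and last_exception and not is_retryable(last_exception):
        return "investigate"  # failure does not look transient: needs a human
    return "restart"
```

A deliberately simple classifier could be `lambda text: "ClosedByInterruptException" in text`, bearing in mind that I/O exception text is likely to be platform specific.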

Zhihai Liu

Mar 7, 2017, 12:58:17 PM
to Alfresco Bulk Import Tool
Peter, I saw that processingState stayed at "Scanning" for a long time with an error in lastException. Should I wait until it eventually transitions to "Failed"?

Zhihai Liu

Mar 10, 2017, 10:50:06 AM
to Alfresco Bulk Import Tool
I put a threshold (5 minutes) in monitoring "Scanning + lastException" state. It moved to "Failed" within the threshold at run time. All good then. Thanks!
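
One way such a threshold could be implemented is sketched below. This is a guess at the approach, not the script actually used: the class name is hypothetical, and `processingState` and `lastException` are assumed to be keys of an already parsed status dict.

```python
import time


class StallWatchdog:
    """Track how long the import has been in "Scanning" with a lastException set."""

    def __init__(self, threshold_seconds=300, clock=time.monotonic):
        self._threshold = threshold_seconds  # 300 s = the 5 minutes mentioned above
        self._clock = clock
        self._suspect_since = None           # when the suspicious state was first seen

    def observe(self, status):
        """Feed one status snapshot; return True once the threshold is exceeded."""
        suspicious = (status.get("processingState") == "Scanning"
                      and bool(status.get("lastException")))
        if not suspicious:
            self._suspect_since = None       # any other state resets the timer
            return False
        now = self._clock()
        if self._suspect_since is None:
            self._suspect_since = now
        return now - self._suspect_since > self._threshold
```

Calling `observe` on every poll lets the monitoring script keep waiting for the normal transition to "Failed" and escalate only if the state lingers past the threshold.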