File By File Logging

Jeffrey Krull

Mar 2, 2018, 10:48:35 AM
to Alfresco Bulk Import Tool
Good morning,

Is it possible to get file-by-file logging for the bulk importer? We intend to run the import tool automatically on a regular schedule (several times daily), and we need to track whether each import had any errors or skipped documents. For auditing purposes, we also need a list of the specific documents in each import and whether each one was imported successfully.

Thanks,
Jeff

Peter Monks

Mar 3, 2018, 6:25:19 PM
to alfresco-bulk-f...@googlegroups.com
G'day Jeff,

The tool is fail-fast, so any failures will cause the entire import to halt.  You can poll the status Web Script to determine whether an import failed or not - the JSON version of this Web Script is specifically designed to be easy to consume by automated tooling (e.g. the scripts that initiate your scheduled imports).  For Unix-style shell scripting, I especially like the combination of httpie and jq for this kind of thing, but the tool itself is agnostic - you can use whatever you prefer.

When an import fails, the status Web Script will include the exception that caused the failure, so you shouldn't have to read log files to figure out which file was problematic.
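As a rough illustration of the polling approach described above, here is a minimal Python sketch using only the standard library. The endpoint path, the JSON field names and the state values are all assumptions made for the example (none of them are stated in this thread), so check them against the status Web Script's actual JSON output on your own instance. Authentication is left out for brevity.

```python
import json
import time
import urllib.request

# ASSUMPTIONS: the URL, field names and state values below are placeholders
# for illustration. Verify them against your own status Web Script output.
STATUS_URL = "http://localhost:8080/alfresco/service/bulk/import/status.json"
STATE_FIELD = "processingState"
ERROR_FIELD = "errorInformation"
RUNNING_STATES = {"Scanning", "Importing", "Paused", "Stopping"}
SUCCEEDED_STATES = {"Succeeded"}
FAILED_STATES = {"Failed"}


def classify(status):
    """Map a decoded status document to 'running', 'succeeded', 'failed' or 'unknown'."""
    state = status.get(STATE_FIELD)
    if state in RUNNING_STATES:
        return "running"
    if state in SUCCEEDED_STATES:
        return "succeeded"
    if state in FAILED_STATES:
        return "failed"
    return "unknown"


def fetch_status(url=STATUS_URL):
    """Fetch and decode the JSON status document (no authentication shown)."""
    request = urllib.request.Request(url, headers={"Accept": "application/json"})
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


def wait_for_import(fetch=fetch_status, poll_seconds=10, sleep=time.sleep):
    """Poll until the import is no longer running.

    Returns (outcome, error), where error is whatever the status document
    holds under ERROR_FIELD, or None if that field is absent.
    """
    while True:
        status = fetch()
        outcome = classify(status)
        if outcome != "running":
            return outcome, status.get(ERROR_FIELD)
        sleep(poll_seconds)
```

A scheduled job could call `wait_for_import()` after starting an import and record the returned outcome and error for auditing. The `fetch` and `sleep` parameters exist only so the loop can be exercised without a live server.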

Because files are loaded in batches, any files in the same batch as the failing file will also be rolled back (even though they may have been written to the repository correctly).  Rather than trying to determine which files were in that batch (which the tool doesn't track, and can't easily report even via the log files) you'd be better off either:
  1. fixing whatever the root cause issue is
    or
  2. pulling the offending file out of the source content set
and then retrying the exact same import, with the "replace existing file" option disabled (turned off).  The reason for this is that the tool is designed to be efficiently re-runnable, and will quickly skip over the files that were already successfully imported, then pick back up at the first file that failed to import (or was successful, but got rolled back as part of a failing batch).
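To make the batch rollback and re-run behaviour described above more concrete, here is a toy Python model. It is not the tool's implementation: the function name, the in-memory `repository` set and the `is_bad` predicate are all hypothetical, and the real tool's batching and skipping details may well differ. It only shows why a file in the same batch as a failing file disappears, and why re-running the same import with replacement turned off picks it back up.

```python
def run_import(source_files, repository, batch_size, is_bad):
    """Toy fail-fast batch import into `repository` (a set of file names).

    Files already in the repository are skipped, which mimics running with
    the "replace existing" option turned off. Returns the name of the file
    that halted the import, or None if everything was imported.
    """
    pending = [name for name in source_files if name not in repository]
    for start in range(0, len(pending), batch_size):
        batch = pending[start:start + batch_size]
        written = []
        for name in batch:
            if is_bad(name):
                # Fail fast: undo everything written in this batch, then halt.
                repository.difference_update(written)
                return name
            repository.add(name)
            written.append(name)
    return None


files = ["a", "b", "c", "d", "e", "f"]
repo = set()

# First run: "d" is broken, so its batch ("c", "d") is rolled back.
failed = run_import(files, repo, batch_size=2, is_bad=lambda name: name == "d")

# Pull the offending file out of the source set and re-run the same import.
remaining = [name for name in files if name != failed]
second = run_import(remaining, repo, batch_size=2, is_bad=lambda name: False)
```

After the first run the repository holds only "a" and "b", even though "c" had been written before "d" failed. The second run skips "a" and "b" and imports "c", "e" and "f".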

So in short, rather than trying to track errors at an individual file level, it's better to simply poll the status Web Script to determine when an import has completed and whether it succeeded or failed.  If it failed, use the exception information to identify the root cause, and once that root cause is corrected (or the offending content removed from the source content set), re-run the exact same import with that same source content set, ensuring that "replace existing files" is disabled (turned off).

Cheers,
Peter

Jeffrey Krull

Mar 5, 2018, 1:06:04 PM
to Alfresco Bulk Import Tool
Hi Peter,

Thanks for the information; that answers our questions! Have a good day!

Jeff