Hi,
I'm using Bareos for NDMP DUMP backups of NetApp in a new installation. For some of the larger NetApp volumes, I want to use the MULTI_SUBTREE_NAMES option to enumerate a set of directories per job. This works, but if the list of directories gets long, it seems that passing the environment between the NetApp and Bareos fails. Specifically, when the list of directories is more than about 250 characters long, the Bareos director fails (either the job dies or the director itself dies). Shortening the directory list works around the problem and allows the job to run.
I'm wondering if anyone has seen this problem or if a bug report should be filed.
More details.
Aside #1: there is a separate 1024 character limit on the environment on the NetApp side. If this is exceeded, the environment gets truncated, which is generally going to cause the job to fail on the NetApp side.
This is my fileset definition (with faked dir names):
FileSet {
  Name = "netapp_j0"
  Include {
    Options {
      AclSupport = Yes
      XattrSupport = Yes
      Meta = "DMP_NAME=netapp_j0"
      Meta = "MULTI_SUBTREE_NAMES=dirA
dirB
dirC
dirD
dirE"
    }
    File = "/filer0/volume0"
  }
}
In the example above, the MULTI_SUBTREE_NAMES string is only about 20 characters long. The problem is seen when it is more than about 250 characters.
Here is the output in the Bareos log:
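For reference, the workaround of shortening the list can be automated by splitting the directories across several jobs. This is only a minimal Python sketch: the helper names split_subtrees and meta_line are made up for illustration, the 240 character default is an arbitrary safety margin below the roughly 250 characters where I see the failure, and it assumes the limit applies to the newline-joined value alone (not to the "MULTI_SUBTREE_NAMES=" prefix), which I have not verified.

```python
def split_subtrees(dirs, max_len=240):
    """Group directory names so that each newline-joined group is at most
    max_len characters long (hypothetical helper, not part of Bareos)."""
    batches, current, current_len = [], [], 0
    for d in dirs:
        if len(d) > max_len:
            raise ValueError(f"directory name longer than max_len: {d!r}")
        # Joining adds one newline before every entry except the first.
        added = len(d) + (1 if current else 0)
        if current and current_len + added > max_len:
            # This entry would push the batch over the limit: start a new one.
            batches.append(current)
            current, current_len = [d], len(d)
        else:
            current.append(d)
            current_len += added
    if current:
        batches.append(current)
    return batches


def meta_line(batch):
    """Format one batch as a Meta line like the one in the FileSet above."""
    return 'Meta = "MULTI_SUBTREE_NAMES=' + "\n".join(batch) + '"'


if __name__ == "__main__":
    # Faked directory names, as in the FileSet example.
    dirs = [f"dir{i:03d}" for i in range(100)]
    for n, batch in enumerate(split_subtrees(dirs)):
        print(f"# job {n}: {len(batch)} directories")
        print(meta_line(batch))
```

Each printed Meta line would go into its own FileSet (and job), one per batch.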
28-Jun 13:43 bareos-dir JobId 337: Async request NDMP4_NOTIFY_DATA_HALTED
28-Jun 13:43 bareos-dir JobId 337: DATA: bytes 1051531213KB left 1753947KB MOVER: written 1051531264KB record 16430176
28-Jun 13:43 bareos-dir JobId 337: Operation done, cleaning up
28-Jun 13:43 bareos-dir JobId 337: Waiting for operation to halt
28-Jun 13:43 bareos-dir JobId 337: Commanding tape drive to NDMP9_MTIO_EOF 2 times
28-Jun 13:43 bareos-dir JobId 337: Commanding tape drive to rewind
28-Jun 13:43 bareos-dir JobId 337: Closing tape drive BIEM-PNPJ-HFNB-MJHH-GNEI-CICE-OOIA-APCC@/filer0/volume0%0
28-Jun 13:43 bareos-dir JobId 337: Operation halted, stopping
28-Jun 13:43 bareos-dir JobId 337: Operation ended OKAY
28-Jun 13:43 bareos-dir JobId 337: Now processing lmdb database
28-Jun 13:43 bareos-dir JobId 337: Insert of attributes batch table with 800001 entries start
28-Jun 13:43 bareos-dir JobId 337: Insert of attributes batch table done
28-Jun 13:43 bareos-dir JobId 337: Insert of attributes batch table with 800001 entries start
28-Jun 13:43 bareos-dir JobId 337: Insert of attributes batch table done
28-Jun 13:43 bareos-dir JobId 337: Processing lmdb database done
28-Jun 13:43 bareos-dir JobId 337: NDMP Environment: ACL_START=1076763765760
28-Jun 13:43 bareos-dir JobId 337: NDMP Environment: ENHANCED_DAR_ENABLED=T
28-Jun 13:43 bareos-dir JobId 337: NDMP Environment: BASE_DATE=-1
28-Jun 13:43 bareos-dir JobId 337: NDMP Environment: PATHNAME_SEPARATOR=/
28-Jun 13:43 bareos-dir JobId 337: NDMP Environment: NDMP_VERSION=4
28-Jun 13:43 bareos-dir JobId 337: NDMP Environment: MULTI_SUBTREE_NAMES=dirA
dirB
dirC
dirD
dirE
If MULTI_SUBTREE_NAMES is too long, the director can die. The returned directory list is shown correctly in the logs (not truncated). I think the NetApp has sent the environment back correctly, but I have not yet arranged a packet capture of the network messages to confirm this.
Successfully completed jobs continue on with the rest of the returned environment, for example:
29-Jun 12:43 bareos-dir JobId 352: NDMP Environment: DMP_NAME=netapp_j0
29-Jun 12:43 bareos-dir JobId 352: NDMP Environment: FILESYSTEM=/filer0/volume0
29-Jun 12:43 bareos-dir JobId 352: NDMP Environment: UPDATE=Y
29-Jun 12:43 bareos-dir JobId 352: NDMP Environment: LEVEL=0
29-Jun 12:43 bareos-dir JobId 352: NDMP Environment: TYPE=dump
29-Jun 12:43 bareos-dir JobId 352: NDMP Environment: HIST=Y
29-Jun 12:43 bareos-sd JobId 352: Elapsed time=02:45:59, Transfer rate=73.46 M Bytes/second
Thanks in advance for any suggestions,
Tom Rockwell