Well, yes and no. The problem is that a large file, say 2 GB, might
take 15 minutes to back up! If that file is being written to, what do you
have when you are done? If it's a sequential file and data is being
written at the end, you might think it's OK, but I wouldn't trust it. BACKUP
retrieves data IN THE ORDER IN WHICH THE BLOCKS APPEAR ON DISK. There is
no guarantee that the beginning of the file is retrieved first and the end
last. If the file is fragmented it might be retrieved, worst case, in
something approximating reverse order!!!
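To make the concern concrete, here is a minimal toy model in Python. It is not BACKUP's actual algorithm; the extent layout, the update pattern, and the function name are all made up for illustration. It assumes the copier reads blocks in on-disk order while an application updates two related blocks of the same file partway through.

```python
# Toy model only: extent_map, updates and copy_in_disk_order are hypothetical.

def copy_in_disk_order(extent_map, updates):
    """extent_map: file block number -> disk address.
    updates: read step -> file blocks rewritten just before that read.
    Returns the read order and the version of each block the copy captured."""
    version = {vbn: 0 for vbn in extent_map}            # current version on disk
    read_order = sorted(extent_map, key=extent_map.get)  # on-disk order
    captured = {}
    for step, vbn in enumerate(read_order):
        for touched in updates.get(step, ()):
            version[touched] += 1                        # application writes
        captured[vbn] = version[vbn]                     # copier reads this block
    return read_order, captured

# Worst case from the post: the file's blocks sit on disk in reverse order.
extent_map = {1: 500, 2: 400, 3: 300, 4: 200, 5: 100, 6: 0}
# The application rewrites blocks 1 and 6 together, midway through the copy.
updates = {3: (1, 6)}

read_order, captured = copy_in_disk_order(extent_map, updates)
print(read_order)  # [6, 5, 4, 3, 2, 1]
print(captured)    # {6: 0, 5: 0, 4: 0, 3: 0, 2: 0, 1: 1}
```

The copy ends up with the new block 1 but the old block 6, a combination that never existed on disk at any single moment.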
If you want a file backed up with assurance that it will make sense
when you are done, close it first.
If you have an application that must be up 24x7, you have a real
problem. If you can shadow or mirror the disk (software or hardware RAID
1) and if you can shut down the application for a couple of minutes, shut
it down, dismount the RAID set, remount it minus one member, and back up
the member you removed. This member had all files flushed to disk and
closed when you dismounted it, and it is a consistent snapshot of the RAID
set at the moment you dismounted it.
RAID is just about a requirement for such an application anyway; otherwise a
hardware failure would take you down for a few hours!
I may well be wrong, but...
Are you sure about this? Why would BACKUP copy data in any order but
from beginning to end? The files are copied in directory order, even in
an image backup (try BACKUP/LIST or a BACKUP/LOG restore from tape).
And once on tape, each file is in only one extent, so I can't see how
the data would be written out of order on a tape. OTOH, I'm not sure
about a disk-to-disk image backup; I've not done too many of those and
it's kind of hard to see what's happening while it is running. But even
in that case, the files will be contiguous on the output volume.
On the third hand :-), I do suppose that BACKUP might copy different
extents asynchronously and then put them together for the output
volume, but wouldn't that just complicate things?
> If you want a file backed up with assurance that it will make sense
> when you are done; close it first.
Agreed. I would also worry about files containing pointers to data in
other files becoming inconsistent.
If your disk isn't in a quiescent state during the backup, and unless
you have very good reason to believe otherwise, there is a risk of getting
inconsistent data in the BACKUP save set.
[raid backup method omitted]
--
NOTE: If you wish to e-mail me, please do NOT use the deja address. It
is not a valid address. Instead, use the address below, removing the
long wrong part first. Thanks.
Disclaimer: JMHO
Alan E. Feldman &-)
afel...@gfigroup.ButItSaidItPrinted.com
On second thought, I'm not sure. Since VMS V5.2, backup has opened as many
files as possible and swept the heads across the entire disk retrieving
blocks from all the files as it goes. Prior to this, backup would open a
file and then bounce the heads all over the disk reading all the blocks of
a single file in sequential order. If you think this must have been slow
you are oh so right! The architect of backup recognized that the access
time was the slowest element and did everything he could to minimize head
movement. If I were doing it, I think I would read the blocks in order
using several passes of the heads to get them all although, given enough
memory, it would be faster to read them in the order they were found and to
sort them into sequential order before writing them to tape.
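The head-movement argument can be put in rough numbers. The sketch below is a made-up example, not a measurement of any real BACKUP version: the two-file layout and the `head_travel` helper are hypothetical, and seek cost is crudely approximated as the distance between block addresses.

```python
# Hypothetical illustration: compare total head travel for two read strategies.

def head_travel(read_sequence, start=0):
    """Sum of distances the head moves to visit the addresses in order."""
    pos, total = start, 0
    for lbn in read_sequence:
        total += abs(lbn - pos)
        pos = lbn
    return total

# Two fragmented files whose extents are interleaved across the disk
# (addresses listed in file order; the layout is invented for this example).
files = {
    "A": [10, 900, 20, 910],
    "B": [905, 15, 915, 25],
}

# Old style: finish each file in file order before opening the next.
per_file = [lbn for blocks in files.values() for lbn in blocks]
# Sweep style: open both files and take the blocks in one pass across the disk.
sweep = sorted(per_file)

print(head_travel(per_file))  # 5355
print(head_travel(sweep))     # 915
```

With this layout the single sweep moves the head less than a fifth as far, at the price of the blocks arriving out of file order and needing to be put back in order before they go to tape.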
FWIW, I wouldn't have trusted the old backup to get a consistent view of
an open file either. I move something like 8 GB/hour these days but then
(slower hardware and software) it was maybe 75 MB per hour!
> On second thought, I'm not sure. Since VMS V5.2, backup has opened as many
> files as possible and swept the heads across the entire disk retrieving
> blocks from all the files as it goes. Prior to this, backup would open a
> file and then bounce the heads all over the disk reading all the blocks of
> a single file in sequential order. If you think this must have been slow
> you are oh so right! The architect of backup recognized that the access
> time was the slowest element and did everything he could to minimize head
> movement. If I were doing it, I think I would read the blocks in order
> using several passes of the heads to get them all although, given enough
> memory, it would be faster to read them in the order they were found and to
> sort them into sequential order before writing them to tape.
I kind of doubt that backup is worrying about read order. The files seem to be accessed in regular directory (alphabetical) order. Backup DOES open as many files at once as possible, but I assume it lets the disk and driver worry about optimizing the ordering. In fact, I don't see how backup could possibly have the info needed to do this. I think it just makes sure it keeps a lot of I/Os queued to the disk, with the goal of keeping the disk utterly busy all the time.
--
Robert Deininger
rdein...@mindspring.com
Backup may access the files in *any* order and it makes little or
no difference. My argument was that the blocks in the files need not be
accessed in sequential order. The blocks appear to be written to the save
set in sequential order but that does not necessarily have anything to do
with the order in which they are read.
> Message text written by Ken Ho
> When doing an /IMAGE backup under VAX/VMS 7.2, if a file is open for
> write when BACKUP gets to it, does it get reliably backed up *as it
> existed at that time*?
> Well, yes and no. The problem is that a large file, say 2Gb, might
> take 15 minutes to backup! If that file is being written to, what do you
> have when you are done? If it's a sequential file and data is being
> written at the end you might think it's ok but I wouldn't trust it. BACKUP
> retrieves data IN THE ORDER IN WHICH THE BLOCKS APPEAR ON DISK. There is
> no guarantee that the beginning of the file is retrieved first and the end
> last. If the file is fragmented it might be retrieved, worst case, in
> something approximating reverse order!!!
Well, perhaps. Most SCSI and MSCP disks/controllers will do some
degree of optimization, generally with some fairness tweak. The common
optimization is the 'elevator' sequence: go from low to high, then
go back, repeat forever... So backup could get to the extents in either
order. Add to this, that IO will be in /block chunks, and that if you
cross a header extent, the IO is split,... Well, you get the idea ;)
So the time between backup reading block N and reading block N+1 could be
several seconds. Positive or NEGATIVE...
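A minimal sketch of the elevator idea in Python, for anyone who wants to see why the order can flip. It is a simplified SCAN policy with no fairness tweak, not the behaviour of any particular controller; the function name and block addresses are invented.

```python
# Simplified elevator (SCAN) ordering; names and addresses are hypothetical.

def elevator_order(pending, head, direction=1):
    """Service order for queued block addresses: keep moving in `direction`
    (+1 = toward higher addresses, -1 = lower), then reverse and sweep back."""
    ahead = sorted(b for b in pending if (b - head) * direction >= 0)
    behind = sorted(b for b in pending if (b - head) * direction < 0)
    if direction > 0:
        return ahead + behind[::-1]   # up through `ahead`, then back down
    return ahead[::-1] + behind       # down through `ahead`, then back up

# File block N lives at address 100 and block N+1 at address 300.
# Both reads are queued while the head is at 200.
print(elevator_order([100, 300], head=200, direction=1))   # [300, 100]
print(elevator_order([100, 300], head=200, direction=-1))  # [100, 300]
```

Same two requests, same queue: whether block N or block N+1 is read first depends only on which way the head happens to be travelling.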
--
Paul Repacholi 1 Crescent Rd.,
+61 (08) 9257-1001 Kalamunda.
West Australia 6076
Raw, Cooked or Well-done, it's all half baked.
> Backup may access the files in *any* order and it makes little or
> no difference. My argument was that the blocks in the files need not be
> accessed in sequential order. The blocks appear to be written to the save
> set in sequential order but that does not necessarily have anything to do
> with the order in which they are read.
Fair enough. I have watched Backup's progress with SHOW DEV/FILE, and while I have seen Backup with >100 files open at once, they were always near neighbors in the directory-order sense.
Why would Backup presume to know the optimum read order better than the driver and disk? Backup knows that it's going to read the whole (non empty portion of) the disk; the lower levels don't know that. But there's never room in RAM to buffer the WHOLE disk, so a global best-read-order optimization doesn't seem possible given the constraint that the save set is in order. I guess Backup concentrates on a "few" files at once, and the headers for some more files. In effect he tells the driver, "I'm gonna want this stuff pretty soon, so if you're in the vicinity please read it."
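The RAM constraint can be sketched as a reorder buffer. This is only an illustration of the trade-off, assuming blocks are numbered in save-set order and must be written strictly in that order; `peak_buffer` is a hypothetical helper, not anything taken from Backup.

```python
# Hypothetical helper: how much must be held in memory if blocks are read in
# some arbitrary order but must be written to the save set in order?

def peak_buffer(arrival_order):
    """Blocks are numbered 0..n-1 in save-set order. Returns the largest
    number of blocks held at once while waiting for earlier blocks."""
    held, next_out, peak = set(), 0, 0
    for blk in arrival_order:
        held.add(blk)
        peak = max(peak, len(held))
        while next_out in held:       # flush everything that is now in order
            held.remove(next_out)
            next_out += 1
    return peak

print(peak_buffer([0, 1, 2, 3]))  # 1
print(peak_buffer([1, 0, 3, 2]))  # 2
print(peak_buffer([3, 2, 1, 0]))  # 4
```

The further the read order strays from save-set order, the more has to sit in memory, which is why working on a limited window of neighboring files at a time is a plausible compromise.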
A look at the source code for Backup would answer this, but the only version I have access to is around VMS 5.2. That old version is probably irrelevant for today's Backup. If you've looked at the guts of backup, then clearly you're on the right track and I'm not.
I have the (dis)advantage of having just read Ken Bates' book "VAX I/O Subsystems: Optimizing Performance", which is now 10 years old. My mind is slightly dizzy with all the tricks they put into the high-end controllers. This was before the "cheap and dumb" era of SCSI, so I don't know how much applies to today's storage systems.
--
Robert Deininger
rdein...@mindspring.com