This confuses me, because this is how our vistaStart script works on our production system, used by systemctl to bring up the service.
#--------------------------------------------------
# If a database is shutdown cleanly there shouldn't be anything in the
# journals to replay, so we can run this without worry
if [ -f ${vista_home}/j/mumps.mjl ]; then
    echo "Recovering old journals..."
    $gtm_dist/mupip journal -recover -backward $vista_home/j/mumps.mjl
fi
$gtm_dist/mupip rundown -region DEFAULT
$gtm_dist/mupip rundown -region TEMPGBL #//kt 4/2/2023
$gtm_dist/mupip set -journal="enable,on,before,f=$vista_home/j/mumps.mjl" -file $vista_home/g/mumps.dat
#--------------------------------------------------
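One side note on the script itself: as written, it does not check whether the recovery step succeeded. Below is a minimal sketch of a more defensive version of that step. The function name recover_journal and the MUPIP override variable are made up for illustration and are not part of vistaStart. The mupip command line is the same one the script already runs, still with no SINCE option.

```shell
# Hypothetical helper, not part of the original vistaStart script.
# Usage: recover_journal <path-to-current-journal-file>
# MUPIP may be set to override the binary (an assumption made for testing);
# it defaults to $gtm_dist/mupip as in the script above.
recover_journal() {
    local mjl="$1"
    local mupip="${MUPIP:-$gtm_dist/mupip}"

    # Nothing to do if there is no current journal file.
    if [ ! -f "$mjl" ]; then
        echo "No journal file at $mjl, skipping recovery"
        return 0
    fi

    echo "Recovering old journals..."
    # Same command as the original script, but the exit status is checked.
    if ! "$mupip" journal -recover -backward "$mjl"; then
        echo "mupip journal -recover failed for $mjl" >&2
        return 1
    fi
}
```

It would be called as recover_journal "$vista_home/j/mumps.mjl", and the startup script could then stop instead of continuing on to the rundown and journal-enable steps when it returns non-zero.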
Notice that the mupip journal command has no SINCE option. I would assume that it defaults to some point in time, but I can't find the default in the online CLI help or on the website.
Furthermore, the documentation says that the recovery "rolls back updates to a checkpoint." Every day we take down our machines for backup. The next morning, we restart the machine and our startup script is run. So is it true that every morning the entire database is rolled back (removing all transactions), and then the transactions are reapplied from the journal files? At times these journal files can be very large (especially when I'm installing VistA patches), so I have a hard time believing that all of those updates are really removed. But that is what the documentation seems to be saying.
And what is a "checkpoint"? I don't think we manually set any checkpoints. What is this talking about? If I search on the page for "checkpoint", much further down on the page I find:
"An epoch is a checkpoint at which YottaDB creates a state where a
database file and its journal file are in complete sync and to which
YottaDB can make a consistent recovery or rollback."
So an epoch is a checkpoint, but are they exactly the same, or are there other types of checkpoints?
Another quote that is confusing me:
"Backward Recovery (roll back to a checkpoint, optionally followed by a subsequent roll forward)"
This sounds like a backward recovery rolls back to a checkpoint, and then, if the user wants, they can also roll forward, presumably with something special needed to achieve this. But again, looking at our site:
$gtm_dist/mupip journal -recover -backward $vista_home/j/mumps.mjl
I don't see anything that tells it to also subsequently roll forward. So are we not rolling forward in cases of hardware crash?
Another point of confusion is the chained nature of journal files. I read that each journal file can point to its predecessor. But due to disk size constraints, we have to delete journal files older than 3 days. Here is that part of our script:
if (( $(find ${vista_home}/j -name '*_*' -mtime +3 -print | wc -l) > 0 )); then
    echo "Deleting old journals..."
    find ${vista_home}/j -name '*_*' -mtime +3 -print -delete
fi
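For anyone who wants to try the pruning logic outside of production, here is a minimal sketch of the same idea as a function that takes the directory and the age as arguments. The name prune_old_journals is made up for illustration, and the -type f test is an addition that the original does not have. It runs find once instead of twice.

```shell
# Hypothetical helper, not part of the original script.
# Usage: prune_old_journals <journal_dir> <days>
prune_old_journals() {
    local dir="$1"
    local days="$2"
    local deleted

    # -name '*_*' matches only files with an underscore in the name,
    # so the current mumps.mjl is left alone.
    # find rounds file age down to whole days, so "-mtime +3" only
    # matches files that are at least 4 full days old.
    deleted=$(find "$dir" -type f -name '*_*' -mtime +"$days" -print -delete)

    if [ -n "$deleted" ]; then
        echo "Deleted old journals:"
        echo "$deleted"
    fi
}
```

Calling prune_old_journals "$vista_home/j" 3 should behave like the snippet above, apart from skipping directories and printing the message after the deletion rather than before it.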
I guess that, with 3 days of journal files, we could recover a database as far back as 3 days. I also guess that if the journal file entries don't extend back as far as the last checkpoint (again, I'm not sure of the details of checkpoints), then the recovery would fail.
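One way to look at the chain directly might be to dump a journal file's header and pick out the predecessor link. The sketch below is untested against a real system and rests on two assumptions: that mupip journal -show=header -backward prints the header for the named file, and that the dump contains a line labelled "Prev journal file name". The function name prev_journal and the MUPIP override variable are made up for illustration.

```shell
# Hypothetical helper: print the predecessor recorded in a journal file header.
# Usage: prev_journal <path-to-journal-file>
# MUPIP may be set to override the binary (an assumption made for testing);
# it defaults to $gtm_dist/mupip.
prev_journal() {
    local mjl="$1"
    local mupip="${MUPIP:-$gtm_dist/mupip}"

    # Assumption: the header dump has a "Prev journal file name" line.
    # stderr is merged in because mupip may write its report there.
    "$mupip" journal -show=header -backward "$mjl" 2>&1 \
        | grep -i 'prev journal file name'
}
```

If the file named on that line has already been deleted by the 3-day cleanup, that would show where the chain is broken.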
Summary:
- How does the backward recovery work if there is no SINCE parameter? Does it just go to the last checkpoint?
- What is really happening with rolling back and rolling forward? Is the mumps.dat file actually changed?
- During a recovery, is only a small part of the journal files used, so that it doesn't matter if older files are deleted?
I think I am also going to have questions about managing journal files during replication, as I read that the formats are different and incompatible. But I'll post about that later.
Thanks in advance,
Kevin