Journaling questions

Kevin Toppenberg

Jan 29, 2026, 7:00:34 PM
to Everything MUMPS
I am working through the journaling portion of the Acculturation guide. One of the links took me into the AdminOpsGuide.

In its description of backward recovery, the text reads:

Backward recovery restores a journaled database to a prior state. Backward processing starts by rolling back updates to a checkpoint (specified by the SINCE option) prior to the desired state and replaying database updates forward till the desired state.

This confuses me, because our vistaStart script, which systemctl uses to bring up the service on our production system, runs a backward recovery on every start:

#--------------------------------------------------
# If a database is shutdown cleanly there shouldn't be anything in the
# journals to replay, so we can run this without worry

if [ -f ${vista_home}/j/mumps.mjl ]; then
  echo "Recovering old journals..."
  $gtm_dist/mupip journal -recover -backward $vista_home/j/mumps.mjl
fi  
$gtm_dist/mupip rundown -region DEFAULT
$gtm_dist/mupip rundown -region TEMPGBL   #//kt 4/2/2023
$gtm_dist/mupip set -journal="enable,on,before,f=$vista_home/j/mumps.mjl" -file $vista_home/g/mumps.dat
#--------------------------------------------------

Notice that there is no SINCE option. I assume it defaults to some point in time, but I can't find the default in the online CLI help or on the website.

Furthermore, the recovery "rolls back updates to a checkpoint." Every day we take down our machines for backup. The next morning, we restart the machine and our startup script runs. So is it true that every morning the entire database is rolled back (removing all transactions), and then the transactions are reapplied from the journal files? At times these journal files can be very large (especially when I'm installing VistA patches), so I have a hard time believing that all of those updates are really removed and then reapplied. But that is what the documentation seems to be saying.

And what is a "checkpoint"?  I don't think we manually set any checkpoints.  What is this talking about?  If I search on the page for "checkpoint", much further down on the page I find: 

"An epoch is a checkpoint at which YottaDB creates a state where a database file and its journal file are in complete sync and to which YottaDB can make a consistent recovery or rollback."

So an epoch is a checkpoint, but are they exactly the same, or are there other types of checkpoints?

Another quote that is confusing me:

Backward Recovery (roll back to a checkpoint, optionally followed by a subsequent roll forward)

This sounds like a backward recovery rolls back to a checkpoint, and then, if the user wants, it can also roll forward, presumably with some extra option to request that. But again, looking at our site:

$gtm_dist/mupip journal -recover -backward $vista_home/j/mumps.mjl

I don't see anything that tells it to also roll forward afterward. So are we not rolling forward after a hardware crash?

Another point of confusion is the chained nature of journal files. I read that each journal file can point to its predecessor. But due to disk size constraints, we have to delete journal files older than 3 days. Here is that part of our script:

if (( $(find ${vista_home}/j -name '*_*' -mtime +3 -print | wc -l) > 0 )); then
  echo "Deleting old journals..."
  find ${vista_home}/j -name '*_*' -mtime +3 -print -delete
fi 
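
To make it easier to see which files that cleanup would remove, here is a dry-run sketch that uses the same find test without -delete. The helper name list_old_journals and its arguments are made up for illustration and are not part of our actual script.

```shell
#!/bin/sh
# Dry-run sketch: list rotated journal files older than N days, deleting nothing.
# list_old_journals is a hypothetical helper name, not part of the production script.
list_old_journals() {
  dir=$1
  days=${2:-3}
  # Same test as the cleanup above, minus -delete. With find, -mtime +3 matches
  # files whose age in whole 24-hour periods is greater than 3 (at least 4 days old).
  # The pattern '*_*' skips the current mumps.mjl because its name has no underscore.
  find "$dir" -name '*_*' -mtime +"$days" -print
}
```

Calling it as list_old_journals "$vista_home/j" 3 should print the same paths the production cleanup would delete.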

I guess that with 3 days of journal files, we could recover a database as far back as 3 days. And I guess that if the journal file entries don't extend back as far as the last checkpoint (again, I'm not sure of the details of checkpoints), then the recovery would fail.
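
To check the chaining for myself, I am thinking of dumping the header of a journal file, which I understand records the link to the previous file. This is only a sketch: show_journal_header is a name I made up, and the -show=header qualifier is my reading of the MUPIP JOURNAL documentation, which I have not yet tried on our production system.

```shell
#!/bin/sh
# Sketch only: dump a journal file's header so the link to its predecessor is visible.
# show_journal_header is a hypothetical helper name. The -show=header qualifier is an
# assumption taken from the MUPIP JOURNAL documentation, not something verified here.
show_journal_header() {
  jnl=$1
  mupip_bin="${gtm_dist:-}/mupip"
  # Refuse to continue if mupip is not where gtm_dist says it should be.
  if [ ! -x "$mupip_bin" ]; then
    echo "mupip not found under gtm_dist='${gtm_dist:-}'" >&2
    return 1
  fi
  # Refuse to continue if the journal file itself is missing.
  if [ ! -f "$jnl" ]; then
    echo "no journal file at $jnl" >&2
    return 1
  fi
  "$mupip_bin" journal -show=header -forward "$jnl"
}
```

The intended use would be show_journal_header "$vista_home/j/mumps.mjl", run before the cleanup step so that the predecessor file is still on disk.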

Summary:
  1. How does backward recovery work if there is no SINCE parameter? Does it just go back to the last checkpoint?
  2. What is really happening with rolling back and rolling forward? Is the mumps.dat file actually changed?
  3. During a recovery, is only a small part of the journal files used, so that it doesn't matter if older files are deleted?

I think I am also going to have questions about managing journal files during replication, as I read that the formats are different and incompatible. But I'll post about that later.

Thanks in advance,
Kevin



