This procedure describes how to reboot an Avamar single-node or multi-node server. It uses the avshutdown.pl script to simplify the task and to help ensure a successful reboot.
1. [ ] Log in to the Utility node as user admin.
2. [ ] Once logged in to the Avamar Utility node as the admin user, change to the /tmp directory:
cd /tmp
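3. [ ] Using pscp, copy the attached avshutdown.pl script to the Avamar Utility node. From a command prompt on the workstation that holds the script, copy and paste the following command, replacing <password> and <IP address> with the admin password and the Utility node address: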
pscp -pw <password> avshutdown.pl admin@<IP address>:/tmp/avshutdown.pl
Note: When you run the command, you may be asked to accept the security key of the Utility node. Press Y to accept it and update the cached SSH key.
Example:
C:\AvamarProcedureGenerator\Repository\Tools>pscp -pw <password> avshutdown.pl admin@<IP ADDRESS>:/tmp/avshutdown.pl
avshutdown.pl | 26 kB | 26.8 kB/s | ETA: 00:00:00 | 100%
Task 1: Run the avshutdown.pl script to suspend garbage collection
Garbage collection should be temporarily suspended the day before the scheduled reboot so that the garbage collection process does not start during the blackout window used for the reboot.
IMPORTANT: Ensure that garbage collection has completed for the day before suspending the garbage collection task.
The following steps describe how to run the “avshutdown.pl” script.
4. [ ] In the SSH session, verify that the avshutdown.pl file was copied without corruption by typing:
md5sum /tmp/avshutdown.pl
The checksum shown in the command shell should match the following exactly:
a7f200ab1e40e17a470b0fd774b94775 /tmp/avshutdown.pl
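The comparison in step 4 can also be left to the shell instead of being done by eye. The following is a minimal sketch, not part of the Avamar tooling: verify_md5 is a hypothetical helper name, and the expected value passed to it is assumed to be the checksum printed above.

```shell
# Hypothetical helper (not part of avshutdown.pl): compare a file's MD5
# checksum with an expected value.
# Usage: verify_md5 <file> <expected-md5>
verify_md5() {
    file=$1
    expected=$2
    if [ ! -r "$file" ]; then
        echo "ERROR: cannot read $file" >&2
        return 1
    fi
    # md5sum prints "<checksum>  <path>"; keep only the checksum field.
    actual=$(md5sum "$file" | awk '{print $1}')
    if [ "$actual" = "$expected" ]; then
        echo "OK: $file matches $expected"
        return 0
    fi
    echo "MISMATCH: $file has $actual, expected $expected" >&2
    return 1
}
```

For example: verify_md5 /tmp/avshutdown.pl a7f200ab1e40e17a470b0fd774b94775. If the function reports a mismatch, copy the script again before continuing.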
5. [ ] Modify the permissions of the avshutdown.pl script by typing:
chmod a+x /tmp/avshutdown.pl
6. [ ] Verify that permissions are properly set by typing:
ls -l /tmp/avshutdown.pl
Output will look similar to the following:
ls -l /tmp/avshutdown.pl
-rwxrwxr-x 1 admin admin 24505 Jan 25 16:32 avshutdown.pl
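The permission check in step 6 can also be expressed as a shell test rather than by reading the ls -l output. This is a small sketch only; is_executable is a hypothetical helper name, and it checks the execute permission for the current user rather than for all users.

```shell
# Hypothetical helper: succeed only if the path is a regular file that
# the current user is allowed to execute.
is_executable() {
    [ -f "$1" ] && [ -x "$1" ]
}
```

For example: is_executable /tmp/avshutdown.pl && echo "script is ready to run".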
7. [ ] Change to the /tmp directory by typing:
cd /tmp
8. [ ] Run the avshutdown.pl script from the /tmp directory.
The following menu appears in the command shell:
sysdown.pl menu v1.4
=============================
1) Suspend Garbage Collection
2) Shutdown Avamar Software
3) Power Down Nodes
4) Reboot Nodes
5) Startup Avamar Software
q) Quit
Enter your choice:
9. [ ] Enter 1 to run “Suspend Garbage Collection”. Output will look similar to the following examples:
Example 1: Garbage collection suspended successfully:
sysdown.pl menu v1.4
=============================
1) Suspend Garbage Collection
2) Shutdown Avamar Software
3) Power Down Nodes
4) Reboot Nodes
5) Startup Avamar Software
q) Quit
Enter your choice: 1
Please Wait - Gathering Info...
Garbage Collection was successfully suspended
Example 2: Garbage collection was previously suspended on this system
sysdown.pl menu v1.4
=============================
1) Suspend Garbage Collection
2) Shutdown Avamar Software
3) Power Down Nodes
4) Reboot Nodes
5) Startup Avamar Software
q) Quit
Enter your choice: 1
Please Wait - Gathering Info...
WARNING: GC is already suspended
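Both examples above leave garbage collection suspended, so either result is acceptable for Task 1. If the session output is saved to a file, the two outcomes can be told apart with a sketch like the following. It assumes the messages appear exactly as in the examples; gc_suspend_status is a hypothetical helper name.

```shell
# Hypothetical helper: read captured menu output on stdin and report
# which of the two documented outcomes it contains.
gc_suspend_status() {
    output=$(cat)
    case $output in
        *"Garbage Collection was successfully suspended"*)
            echo "suspended" ;;
        *"WARNING: GC is already suspended"*)
            echo "already-suspended" ;;
        *)
            # Neither documented message was found.
            echo "unknown"
            return 1 ;;
    esac
}
```

For example: gc_suspend_status < session.log, where session.log is an assumed name for the saved output. A result of "unknown" means the output should be reviewed manually.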
Task 2: Prepare system for shutdown
On the day of the scheduled reboot, during the blackout window, follow the previous procedure to open an SSH session to the Avamar Utility node.
10. [ ] Run the avshutdown.pl script from the /tmp directory.
The following menu appears in the command shell:
sysdown.pl menu v1.4
=============================
1) Suspend Garbage Collection
2) Shutdown Avamar Software
3) Power Down Nodes
4) Reboot Nodes
5) Startup Avamar Software
q) Quit
Enter your choice:
11. [ ] Enter 2 to begin shutdown of the Avamar software.
The system will now run a set of tests to determine whether the Avamar software can be shut down safely. If the system has not completed a recent checkpoint, you will be prompted to create one, as shown below. If prompted, enter y to create a checkpoint. If you are not prompted, skip to step 13.
sysdown.pl menu v1.4
=============================
1) Suspend Garbage Collection
2) Shutdown Avamar Software
3) Power Down Nodes
4) Reboot Nodes
5) Startup Avamar Software
q) Quit
Enter your choice: 2
Please Wait - Gathering Info...
ERROR: No checkpoint in past 4 hours. Last one is Tue Jan 17 11:46:14 2012
ERROR: No recent checkpoint.
Would you like to take a checkpoint now? (y/n) y
12. [ ] The system will suspend the maintenance scheduler, unload the index caches and begin taking a checkpoint:
Would you like to take a checkpoint now? (y/n) y
Unloading index caches...Please wait this may take some time
Still going...
Checkpoint Started
Processed 1709 of 18624
Processed 8197 of 18624
Processed 11902 of 18624
Processed 17384 of 18624
Processed 18624 of 18624
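The "Processed N of M" lines show how far the checkpoint has progressed. As a rough illustration, the following sketch converts such lines into whole percentages. It assumes the line format shown above; checkpoint_progress is a hypothetical helper name.

```shell
# Hypothetical helper: read checkpoint output on stdin and print one
# percentage per "Processed <done> of <total>" line, rounded down.
checkpoint_progress() {
    awk '$1 == "Processed" && $3 == "of" && $4 > 0 {
        printf "%d%%\n", ($2 * 100) / $4
    }'
}
```

Lines that do not match the pattern, such as "Still going...", are ignored.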
13. [ ] If the system detects a recent checkpoint, it will run additional tests. Output will look similar to the following examples:
Example 1: One or more tests report an error
PASSED. Last Checkpoint is only <1 hours old
ERROR: No HFSCheck in past 48 hours. Last one is Wed Jan 18 11:08:19 2012
PASSED. Last MCS Backup is only <1 hours old
PASSED. Last MCS Backup is only <1 hours old
PASSED. Filesystem space is 53.3%
PASSED. GSAN space is 49.6%
PASSED. HFS Check is not in waitcgsan mode
Check FAILED. Please contact support
sysdown.pl menu v1.4
=============================
1) Suspend Garbage Collection
2) Shutdown Avamar Software
3) Power Down Nodes
4) Reboot Nodes
5) Startup Avamar Software
q) Quit
Enter your choice:
Example 2: All tests passed
PASSED. Last Checkpoint is only <1 hours old
PASSED. Last HFS Check is only 3 hours old
PASSED. Last MCS Backup is only <1 hours old
PASSED. Last MCS Backup is only <1 hours old
PASSED. Filesystem space is 52.9%
PASSED. GSAN space is 49.6%
PASSED. HFS Check is not in waitcgsan mode
All checks passed
Are you sure you want to shut down the Avamar software? (Y/N)
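The two examples differ in whether any line begins with ERROR: and in the final summary line. If the output is saved to a file, that reading can be sketched as follows. The sketch assumes the output format shown in the examples; preflight_ok is a hypothetical helper name and is not part of avshutdown.pl.

```shell
# Hypothetical helper: read the pre-shutdown test output on stdin.
# Succeeds only if no line starts with "ERROR:" and the summary line
# "All checks passed" is present.
preflight_ok() {
    output=$(cat)
    # grep -c prints the count; "|| true" absorbs its exit status of 1
    # when the count is zero.
    errors=$(printf '%s\n' "$output" | grep -c '^[[:space:]]*ERROR:' || true)
    echo "errors: $errors"
    [ "$errors" -eq 0 ] || return 1
    printf '%s\n' "$output" | grep -q '^[[:space:]]*All checks passed'
}
```

A non-zero return corresponds to Example 1, in which case the shutdown should not proceed.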
14. [ ] If any test reported an error, do not proceed; contact support as indicated by the script. If all tests passed, enter y to proceed with the shutdown:
Are you sure you want to shut down the Avamar software? (Y/N) y
15. [ ] The script will bring down the Avamar software.
Example output is shown below:
Are you sure you want to shut down the Avamar software? (Y/N) y
Identity added: /home/admin/.ssh/dpnid (/home/admin/.ssh/dpnid)
dpnctl: INFO: Suspending backup scheduler...
dpnctl: INFO: Backup scheduler suspended.
dpnctl: INFO: Checking for active checkpoint maintenance...
dpnctl: INFO: Terminating hfs integrity maintenance (hfscheck)...
dpnctl: INFO: Shutting down EMS...
dpnctl: INFO: EMS shut down.
dpnctl: INFO: Shutting down MCS...
dpnctl: INFO: MCS shut down.
dpnctl: INFO: Shutting down gsan...
dpnctl: INFO: gsan shut down.
dpnctl: INFO: axionfs is already down
sysdown.pl menu v1.4
=============================
1) Suspend Garbage Collection
2) Shutdown Avamar Software
3) Power Down Nodes
4) Reboot Nodes
5) Startup Avamar Software
q) Quit
Enter your choice:
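Before moving on to the reboot, it can help to confirm that the shutdown output reports EMS, MCS, and gsan as shut down. The following sketch checks saved output for the three messages shown in the example above. It assumes those exact messages; a service that was already down may be reported with different wording (as axionfs is in the example), in which case the output should be reviewed manually. software_is_down is a hypothetical helper name.

```shell
# Hypothetical helper: read the dpnctl shutdown output on stdin and
# confirm the three shutdown messages from the example are present.
software_is_down() {
    output=$(cat)
    for marker in "EMS shut down." "MCS shut down." "gsan shut down."; do
        case $output in
            *"dpnctl: INFO: $marker"*)
                ;;  # message found, check the next one
            *)
                echo "missing: $marker" >&2
                return 1 ;;
        esac
    done
    echo "EMS, MCS and gsan reported shut down"
}
```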
Task 3: Reboot the Avamar Server
16. [ ] Once the software has shut down successfully, enter 4 to reboot the nodes. You will receive a message that the system is going down for reboot. Your session will be disconnected when the utility node reboots.
sysdown.pl menu v1.4
=============================
1) Suspend Garbage Collection
2) Shutdown Avamar Software
3) Power Down Nodes
4) Reboot Nodes
5) Startup Avamar Software
q) Quit
Enter your choice: 4
Please Wait - Gathering Info...
PASSED CHECK POSTMASTER NOT RUNNING
Broadcast message from root (Thu Aug 23 12:40:06 2012):
The system is going down for reboot NOW!
sysdown.pl menu v1.4
=============================
1) Suspend Garbage Collection
2) Shutdown Avamar Software
3) Power Down Nodes
4) Reboot Nodes
5) Startup Avamar Software
q) Quit
Enter your choice:
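Because the SSH session drops when the utility node reboots, reconnecting is a matter of retrying until the node answers. The following sketch polls from another Linux or UNIX host. It is an illustration only: probe_host and wait_for_host are hypothetical names, and the probe assumes key-based SSH login for the admin user, which may not be configured in every environment.

```shell
# Hypothetical probe: succeeds when a non-interactive SSH login works.
# BatchMode=yes disables password prompts; ConnectTimeout bounds each try.
probe_host() {
    ssh -o BatchMode=yes -o ConnectTimeout=5 "admin@$1" true 2>/dev/null
}

# Usage: wait_for_host <host> <attempts> <delay-seconds>
wait_for_host() {
    host=$1
    attempts=$2
    delay=$3
    i=0
    while [ "$i" -lt "$attempts" ]; do
        if probe_host "$host"; then
            echo "$host is reachable"
            return 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    echo "$host did not respond after $attempts attempts" >&2
    return 1
}
```

For example, wait_for_host <IP address> 60 30 retries every 30 seconds for up to 30 minutes. If password login is the only option, replace the body of probe_host with any check that suits the environment.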
Task 4: Restart the Avamar Software
17. [ ] After the Avamar Server is back up, reconnect to the utility node and rerun the avshutdown.pl script.
18. [ ] Enter 5 to start the Avamar software:
sysdown.pl menu v1.4
=============================
1) Suspend Garbage Collection
2) Shutdown Avamar Software
3) Power Down Nodes
4) Reboot Nodes
5) Startup Avamar Software
q) Quit
Enter your choice: 5
19. [ ] When prompted, enter y to confirm that the software should be started. Startup can take several hours, which is expected. Output will look similar to the following:
Please Wait - Gathering Info...
All checks passed. Starting software
Identity added: /home/admin/.ssh/dpnid (/home/admin/.ssh/dpnid)
- - - - - - - - - - - - - - - - - - - -
Action: starting all
Have you contacted Avamar Technical Support to ensure that this
is the right thing to do?
Answering y(es) proceeds with starting all;
n(o) or q(uit) exits
y(es), n(o), q(uit/exit): y
dpnctl: INFO: Checking that gsan was shut down cleanly...
dpnctl: INFO: Restarting the gsan (this may take some time)...
dpnctl: INFO: To monitor progress, run in another window: tail -f /tmp/dpnctl-gsan-restart-output-7088
dpnctl: INFO: Restarting gsan succeeded.
dpnctl: INFO: gsan started.
dpnctl: INFO: Starting MCS...
dpnctl: INFO: To monitor progress, run in another window: tail -f /tmp/dpnctl-mcs-start-output-7088
dpnctl: INFO: MCS started.
dpnctl: INFO: Starting EMS...
dpnctl: INFO: To monitor progress, run in another window: tail -f /tmp/dpnctl-ems-start-output-7088
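The dpnctl messages above name a log file for each component that can be followed with tail -f from a second session. As a convenience, the following sketch pulls those paths out of saved output. It assumes the message wording shown above; progress_logs is a hypothetical helper name.

```shell
# Hypothetical helper: read dpnctl output on stdin and print the log
# path from each "To monitor progress ... tail -f <path>" line.
progress_logs() {
    sed -n 's/^.*run in another window: tail -f \(.*\)$/\1/p'
}
```

Each printed path can then be passed to tail -f in another window while the startup continues.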
20. [ ] If prompted to roll the system back, select option 1 to roll back to the most recent checkpoint, whether or not it has been validated.
NOTE: Rolling back to the wrong checkpoint could cause substantial data loss on the system. If in doubt, contact support.
21. [ ] Once the software has restarted, enter q to exit the shutdown script. The system will be returned to a production state automatically.
22. [ ] Run a proactive health check to confirm that the system is functioning properly. For critical failures (hardware issues, suspended schedulers, and so on), create a new service request. For non-critical issues such as patches, create a new service request for follow-up.
23. [ ] Follow up the next day to review the status of the grid and confirm that the next hfscheck completes successfully. If there are any issues with the hfscheck, create a new service request.