Torque Resource Manager Download


Elis Riebow


Apr 19, 2024, 2:03:13 PM

to bleakresccufu

The Torque cluster is a pool of high-end computers (also referred to as compute nodes) managed by a resource manager called Torque and a job scheduler called Moab. Instead of logging in to one computer and running computations freely, users submit their computations in the form of jobs to the Torque cluster. The picture below sketches how jobs are managed by the Torque server and scheduled by its companion, the Moab server, to perform computations on the compute nodes in the cluster.


Download File → https://t.co/nDWiT30G4k

To optimise the utilisation of the resources of the Torque cluster, certain resource-sharing and job prioritisation policies are applied to jobs submitted to the cluster. The implications for users fall into three aspects: job queues, throttling policies for resource usage, and job prioritisation.

In the cluster, several job queues are made available in order to arrange jobs by resource requirements. Those queues are summarised in the table below. Queues are mainly distinguished by the wall time and memory limitations. Some queues, such as matlab, vgl and interactive, have their own special purpose for jobs with additional resource requirements.

In the Torque cluster at DCCN, throttling policies are applied to limit the amount of resources a user can allocate at the same time. This prevents a single user from occupying the resources of the entire cluster. The policies are defined in two scopes:

The cluster-wise policies overrule the queue-wise policies. This implies that if the resource utilisation of your currently running jobs reaches one of the cluster-wise limits, your additional jobs have to wait in the queue even if there are still available resources in the cluster and you have not reached the queue-wise limits.
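The interplay of the two scopes can be sketched as a simple check: a new job starts only when it fits within both the cluster-wise and the queue-wise limit. This is a minimal illustration, not the scheduler's actual logic; the `Limit` class, the function names and all numbers are invented for the example.

```python
from dataclasses import dataclass


@dataclass
class Limit:
    """Hypothetical per-user cap on running jobs and allocated memory."""
    max_jobs: int
    max_mem_gb: int


def fits(limit: Limit, running_jobs: int, used_mem_gb: int, job_mem_gb: int) -> bool:
    """True if starting one more job of job_mem_gb stays within the limit."""
    return (running_jobs + 1 <= limit.max_jobs
            and used_mem_gb + job_mem_gb <= limit.max_mem_gb)


def can_start(cluster: Limit, queue: Limit,
              cluster_jobs: int, cluster_mem_gb: int,
              queue_jobs: int, queue_mem_gb: int,
              job_mem_gb: int) -> bool:
    """A job starts only if BOTH scopes allow it."""
    return (fits(cluster, cluster_jobs, cluster_mem_gb, job_mem_gb)
            and fits(queue, queue_jobs, queue_mem_gb, job_mem_gb))


# Invented numbers, for illustration only.
CLUSTER = Limit(max_jobs=200, max_mem_gb=500)
QUEUE = Limit(max_jobs=100, max_mem_gb=400)

# The user already holds 496 GB cluster-wide but only 50 GB in this queue:
# the queue-wise limit is far away, yet the job has to wait.
print(can_start(CLUSTER, QUEUE, 120, 496, 10, 50, job_mem_gb=8))  # False
print(can_start(CLUSTER, QUEUE, 120, 400, 10, 50, job_mem_gb=8))  # True
```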

The qsub command is used to submit jobs to the Torque job manager. The first and simplest way of using qsub is piping a command-line string to it. Assuming that we want to display the hostname of the compute node on which the job will run, we issue the following command:

Here we echo the command we want to run (i.e. /bin/hostname -f) as a string and pass it to qsub as the content of our job. In addition, we use the -l option to request 1 processor with 128 megabytes of RAM for a walltime of 10 minutes.
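The exact command line is not reproduced in this post, so here is a hedged sketch that assembles one matching the description. It only builds and prints the text and does not call qsub. The helper names are made up, and writing "1 processor" as `nodes=1:ppn=1` is an assumption about how the request was spelled.

```python
import shlex


def resource_list(nodes: int, ppn: int, mem: str, walltime: str) -> str:
    """Format a Torque -l value such as nodes=1:ppn=1,mem=128mb,walltime=00:10:00."""
    return f"nodes={nodes}:ppn={ppn},mem={mem},walltime={walltime}"


def piped_qsub(command: str, resources: str) -> str:
    """Return the shell text that pipes `command` into qsub (nothing is executed)."""
    return f"echo {shlex.quote(command)} | qsub -l {shlex.quote(resources)}"


resources = resource_list(nodes=1, ppn=1, mem="128mb", walltime="00:10:00")
print(piped_qsub("/bin/hostname -f", resources))
# echo '/bin/hostname -f' | qsub -l nodes=1:ppn=1,mem=128mb,walltime=00:10:00
```

Building the string with shlex.quote keeps a command containing spaces intact when the line is pasted into a shell.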

Note: the resource usage of interactive jobs is also monitored by the Torque system. The job will be killed (i.e. you will be kicked out of the shell) when the computation exceeds the amount of resources requested at job submission time.

Each job submitted to the cluster comes with a resource requirement. The job scheduler and resource manager of the cluster make sure that the needed resources are allocated for the job. To allow the job to complete successfully, it is important that an appropriate and sufficient amount of resources is specified at job submission time.

When submitting jobs with the qsub command, one uses the -l option to specify the required resources. The value of the -l option follows a certain syntax; details of the syntax can be found in the Torque documentation. Below are a few useful and commonly used examples for jobs requiring:

The examples below only show the option of the qsub command for resource specification (-l); therefore they are NOT complete commands. You need to make the command complete by adding either a -I option for an interactive job or passing a script to be run as a batch job.

Here we explicitly ask for 500gb of job-specific scratch space on the compute node. This could, for instance, be requested when submitting an fmriprep job that requires lots of local disk space for computation. The more jobs are running, the longer it can take for torque to find a node with enough free disk space to run the job. The maximum that can be requested is 3600gb.
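Before submitting, a scratch request can be checked against the 3600gb ceiling mentioned above. The following is a small sketch under assumptions: the function names are hypothetical, and sizes are assumed to be written as a whole number followed by mb, gb or tb with 1024-based units.

```python
import re

MAX_SCRATCH_GB = 3600  # upper bound quoted in the text

# Assumed 1024-based unit factors, expressed in gigabytes.
_UNITS_IN_GB = {"mb": 1 / 1024, "gb": 1, "tb": 1024}


def size_in_gb(text: str) -> float:
    """Convert a size such as '500gb' to gigabytes."""
    match = re.fullmatch(r"(\d+)(mb|gb|tb)", text.strip().lower())
    if match is None:
        raise ValueError(f"unrecognised size: {text!r}")
    number, unit = match.groups()
    return int(number) * _UNITS_IN_GB[unit]


def scratch_request_ok(text: str) -> bool:
    """True if the requested scratch size is positive and within the maximum."""
    return 0 < size_in_gb(text) <= MAX_SCRATCH_GB


print(scratch_request_ok("500gb"))  # True
print(scratch_request_ok("4tb"))    # False, 4096 GB exceeds the maximum
```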

As we have mentioned, every job has attributes specifying the resources required for its computation. Based on those attributes, the job scheduler allocates resources for jobs. The more precisely these requirements are specified, the more efficiently the resources are used. Therefore, we encourage all users to estimate the resource requirements before submitting massive jobs to the cluster.

Computing resources in the cluster are reserved for jobs in terms of size (e.g. amount of requested memory and CPU cores) and duration (e.g. the requested walltime). Under-estimating the requirement causes the job to be killed before completion, so the resources it has already consumed are wasted; over-estimating blocks resources that other jobs could have used.

If your analysis tool (or script) is commonly used in your research field, consulting your colleagues may be an efficient way to get a general idea of the tool's resource requirements.

Adjust the rough requirement gradually based on the usage information and resubmit the test job with the new requirement. In a few iterations, you will be able to determine the actual usage of your analysis job. A rule of thumb for specifying the resource requirement for production jobs is to add a 10-20% buffer on top of the actual usage as a safety margin.
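The rule of thumb is plain arithmetic. The sketch below (the function name and the example figures are illustrative) rounds the request up to the next whole unit and uses integer arithmetic to avoid floating-point surprises.

```python
def with_buffer(measured: int, buffer_percent: int = 20) -> int:
    """Return `measured` plus a safety margin, rounded up to a whole unit."""
    if measured < 0 or buffer_percent < 0:
        raise ValueError("measured usage and buffer must be non-negative")
    # Integer ceiling of measured * (100 + buffer_percent) / 100.
    return (measured * (100 + buffer_percent) + 99) // 100


# A test job that peaked at 3200 MB of memory and ran for 45 minutes:
print(with_buffer(3200, 10))  # 3520 (MB, 10% margin)
print(with_buffer(3200, 20))  # 3840 (MB, 20% margin)
print(with_buffer(45, 20))    # 54 (minutes of walltime)
```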

During the Oct. 24-28 maintenance window, GACRC implemented the Slurm software for job scheduling and resource management on Sapelo2. Slurm replaced the Torque (PBS) resource manager and the Moab scheduling system that Sapelo2 used before then.

As with Torque, job options and resource requests in Slurm can be set in the job submission script or as options to the job submission command. However, the syntax used to request resources is different and the table below summarizes some of the options that are frequently used.
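The summary table is not included in this post. As a rough illustration of how the syntax differs, the sketch below translates a Torque -l resource list into commonly used sbatch options. The function name is hypothetical and the mapping is one common convention, not GACRC's table. In particular, mapping ppn to --ntasks-per-node is an assumption (some sites use --cpus-per-task instead), and Slurm's --mem is per node, so it is not always a drop-in replacement for Torque's mem.

```python
def torque_to_slurm(resource_list: str) -> list[str]:
    """Translate e.g. 'nodes=2:ppn=4,mem=8gb,walltime=01:00:00' to sbatch options."""
    options = []
    for item in resource_list.split(","):
        key, _, value = item.partition("=")
        if key == "nodes":
            count, *properties = value.split(":")
            options.append(f"--nodes={count}")
            for prop in properties:
                if prop.startswith("ppn="):
                    options.append(f"--ntasks-per-node={prop[len('ppn='):]}")
        elif key == "mem" and value.lower().endswith("b"):
            # Torque writes 8gb; sbatch expects a single-letter suffix such as 8G.
            options.append(f"--mem={value[:-1].upper()}")
        elif key == "walltime":
            options.append(f"--time={value}")
        else:
            raise ValueError(f"no mapping known for {item!r}")
    return options


print(torque_to_slurm("nodes=2:ppn=4,mem=8gb,walltime=01:00:00"))
# ['--nodes=2', '--ntasks-per-node=4', '--mem=8G', '--time=01:00:00']
```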

In our setup, we keep nvidia-smi running in daemon mode and set each GPU to compute-exclusive mode, so that it permits at most one context per GPU. We added a consumable resource to the scheduler representing the number of GPUs per node (either 0 or 1 in our case), and jobs that require a GPU request that resource. The scheduler then places jobs that request a GPU on GPU nodes, and when no GPUs are free it keeps such jobs queued until the required number becomes available. When there are no GPU jobs on the cluster, the CUDA nodes just behave like regular nodes.

You can check the current status of your submitted jobs and their job ids with the following shell command. The most common states for a job are queued Q (the job is waiting for free nodes), running R (the jobscript is currently being executed) or on hold H (the job is currently stopped, but is not waiting for resources). The command also shows the time elapsed since your job started running and the time limit.
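The state letters can be pulled out of such a listing with a few lines of Python. This is a sketch only: the sample text is invented, and the six-column layout (job id, name, user, time used, state, queue) is assumed here, so adjust the column index if your site's output differs.

```python
STATE_MEANING = {
    "Q": "queued: waiting for free nodes",
    "R": "running: the jobscript is being executed",
    "H": "on hold: stopped, not waiting for resources",
}

# Invented sample listing; real output may use different columns.
SAMPLE = """\
Job ID      Name     User   Time Use S Queue
----------- -------- ------ -------- - -----
101.server  align    alice  00:12:41 R batch
102.server  stats    alice  0        Q batch
103.server  render   alice  0        H batch
"""


def job_states(listing: str) -> dict[str, str]:
    """Map each job id to its one-letter state, skipping the two header lines."""
    states = {}
    for line in listing.splitlines()[2:]:
        fields = line.split()
        if len(fields) >= 6:
            states[fields[0]] = fields[4]
    return states


for job_id, state in job_states(SAMPLE).items():
    print(job_id, STATE_MEANING.get(state, "other state"))
```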

I also agree that all are more or less fine once they're up and working, and the main way to decide which to use would be to either (a) just pick something future users are familiar with, or (b) pick some very specific things you want to be able to accomplish with the resource manager/scheduler and start finding out which best support those features/workflows.

As for Condor, I've never seen it used within a cluster; it was designed back in the day for farming out jobs between diverse resources (e.g., workstations after hours) and would have a lot of overhead for working within a homogeneous cluster. Scheduling jobs between clusters, maybe?

I'm not a huge rocks fan personally, but one huge advantage, especially (but not only) if you have researchers who use XSEDE compute resources in the US, is that you can use the XSEDE campus bridging rocks rolls, which bundle up a large number of relevant software packages as well as the cluster management stuff. That also means that you can directly use XSEDE's extensive training materials to help get the cluster's new users up to speed.

Use the qstat command to check the status of your jobs. You can see whether your job is queued or running, along with information about requested resources. If the job is running you can see elapsed time and resources used.

Situations may arise in which you want to delete one of your jobs from the PBS queue. Perhaps you set the resource limits incorrectly, neglected to copy an input file, or had incorrect or missing commands in the batch file. Or maybe the program is taking too long to run (infinite loop). The PBS command to delete a batch job is qdel. It applies to both queued and running jobs.

This process requires software in two different roles: the resource manager, responsible for accepting jobs to the queue and running jobs on worker nodes, and the scheduler, responsible for deciding when and where jobs in the queue should be run in order to optimize resources. I'll be using Torque for the resource manager and Maui for the scheduler. Both of these are open source projects.

Because torque branched off from PBS, it still retains a lot of the old commands and names. PBS stands for portable batch system, and from here, I'll still call it torque, but commands may have "pbs" in them rather than "torque".

Now we need to install a smaller version of torque, called pbs_mom, on all of the worker nodes. Move back into the directory we untarred earlier, /usr/local/src/torque*. There's a handy way to create the packages for the torque clients. Run

You'll see some new files in the directory now if you run an ls. The one we're interested in is torque-package-mom-linux-*.sh where the * is your architecture. We need to copy that file to all the worker nodes. You can either copy it over to a shared NFS mount, or see my Cluster Time-saving Tricks on how to copy a file to all the nodes using the rsync command. I'm copying it over to my NFS mount with

Jobs are submitted to the job queue run by torque with the qsub command. Maui monitors that queue and decides when and where each job should run; torque then tells the pbs_mom client on the worker node that maui picked to run the job.

Qsub can also take input in the form of files. These files can give all sorts of specifications to torque about how long the job will run and what resources it needs. (To learn more about qsub submission files, see Torque Qsub Scripts.) We'll write just a simple one. Open your favorite text editor and enter the contents of my Standard Output/Error For Loop Script and save this file to submission. This script has a simple for loop that runs from 1 to 10. If the number is less than 5, it will print a statement to standard output. If the number is greater than or equal to 5, it will print a statement to standard error.
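The referenced script itself is not included in this post. As a stand-in, here is a Python sketch of the behaviour described: a loop from 1 to 10 that writes to standard output below 5 and to standard error otherwise. The message wording is invented, and the original was presumably a shell script; submitting a Python file this way relies on its first line naming the interpreter.

```python
#!/usr/bin/env python3
import sys


def classify(number: int) -> tuple[str, str]:
    """Return the target stream name and the message for one loop value."""
    if number < 5:
        return "stdout", f"{number} is less than 5"
    return "stderr", f"{number} is greater than or equal to 5"


def main() -> None:
    for number in range(1, 11):  # 1 to 10 inclusive
        stream, message = classify(number)
        target = sys.stdout if stream == "stdout" else sys.stderr
        print(message, file=target)


if __name__ == "__main__":
    main()
```

Run as a job, the first four lines would be expected in the job's output file and the remaining six in its error file.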