PBSPro submission errors


Jonathan Chico

Jul 10, 2019, 9:26:06 AM
to aiidausers
Hello aiidausers

I have been working on setting up AiiDA to run on an elastic cluster hosted in Azure, using the CycleCloud service.

The cluster is a CentOS 7 system with PBSPro 19.1.1-0 as its scheduler. This works without problems most of the time, but right now my calculations are failing due to a scheduler problem. The scheduler is extended with support for slot_types, which collides with the usual syntax:
#PBS -l select=1:mpiprocs=1

I have used the custom_scheduler_commands option to try to add the necessary slot_type:
builder.metadata.options.custom_scheduler_commands = '#PBS -l slot_type=execute'

However, this still fails with the following error:
aiida.schedulers.scheduler.SchedulerError: Error during submission, retval=5
stdout=
stderr=qsub: "-lresource=" cannot be used with "select" or "place", resource is: slot_type

I was wondering whether it is possible to suppress the select line in the generated submission script, or whether some other option allows a different PBS syntax for the resource allocation.

Thank you very much

Best Regards


chris sewell

Jul 10, 2019, 10:06:39 AM
to aiida...@googlegroups.com

Hey Jonathan,

It's probably easiest to just make a custom subclass of `PbsBaseClass`. I had to do this for the Imperial HPC cluster:

```
# Import path as in aiida-core 1.x; adjust it if your version differs.
from aiida.schedulers.plugins.pbsbaseclasses import PbsBaseClass


class PbsproICLScheduler(PbsBaseClass):
    """
    Subclass to support the PBSPro scheduler
    (http://www.pbsworks.com/).
    But altered to fit the Imperial College London cx scheduler spec,
    which requires ncpus and mem to be defined.
    See:
    https://www.imperial.ac.uk/admin-services/ict/self-service/research-support/rcs/computing/high-throughput-computing/job-sizing/

    I redefine only what needs to change from the base class
    """

    def _get_resource_lines(self, num_machines, num_mpiprocs_per_machine,
                            num_cores_per_machine, max_memory_kb,
                            max_wallclock_seconds):
        """
        Return the lines for machines, memory and wallclock relative
        to pbspro.
        """
        # Note: num_cores_per_machine is not used here but is provided by
        #       the parent class ('_get_submit_script_header') method

        return_lines = []

        select_string = "select={}".format(num_machines)

        if (num_mpiprocs_per_machine is not None
                and num_mpiprocs_per_machine > 0):
            select_string += ":ncpus={}".format(num_mpiprocs_per_machine)
        else:
            raise ValueError(
                "num_mpiprocs_per_machine must be greater than 0! "
                "It is instead '{}'".format(num_mpiprocs_per_machine))

        if max_wallclock_seconds is not None:
            try:
                tot_secs = int(max_wallclock_seconds)
                if tot_secs <= 0:
                    raise ValueError
            except ValueError:
                raise ValueError("max_wallclock_seconds must be "
                                 "a positive integer (in seconds)! "
                                 "It is instead '{}'"
                                 "".format(max_wallclock_seconds))
            hours = tot_secs // 3600
            # Despite its name, this holds the seconds left after whole hours
            tot_minutes = tot_secs % 3600
            minutes = tot_minutes // 60
            seconds = tot_minutes % 60
            return_lines.append(
                "#PBS -l walltime={:02d}:{:02d}:{:02d}".format(
                    hours, minutes, seconds))

        if not max_memory_kb:
            max_memory_kb = 1e6  # use a default memory of 1gb

        try:
            # Convert kB to whole GB (rounds down)
            virtual_memory_gb = int(max_memory_kb * 1e-6)
            if virtual_memory_gb <= 0:
                raise ValueError
        except ValueError:
            raise ValueError("max_memory_kb must be "
                             "a positive integer (in kB) >= 1e6 kb! "
                             "It is instead '{}'"
                             "".format((max_memory_kb)))
        select_string += ":mem={}gb".format(virtual_memory_gb)

        # All chunk-level resources go into a single select statement
        return_lines.append("#PBS -l {}".format(select_string))
        return return_lines
```
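
For the slot_type case in the original question, the same kind of override could fold the extra resource into the select statement instead of emitting a separate `#PBS -l slot_type=...` line, which is what qsub rejects. Below is a minimal standalone sketch of that logic with no AiiDA dependency. The function name `build_resource_lines` and its `slot_type` argument are made up for illustration, and it assumes the cluster accepts `slot_type` as a chunk-level resource inside `select`; check this against your scheduler configuration.

```python
def build_resource_lines(num_machines, num_mpiprocs_per_machine,
                         max_wallclock_seconds=None, slot_type=None):
    """Return '#PBS -l' lines with all chunk resources inside 'select'.

    Hypothetical helper: 'slot_type' is assumed to be a valid chunk-level
    resource on the target cluster.
    """
    if num_machines is None or num_machines < 1:
        raise ValueError("num_machines must be a positive integer")
    if num_mpiprocs_per_machine is None or num_mpiprocs_per_machine < 1:
        raise ValueError("num_mpiprocs_per_machine must be a positive integer")

    lines = []

    # walltime is a job-wide resource, so it keeps its own line
    if max_wallclock_seconds is not None:
        total = int(max_wallclock_seconds)
        if total <= 0:
            raise ValueError("max_wallclock_seconds must be positive")
        hours, remainder = divmod(total, 3600)
        minutes, seconds = divmod(remainder, 60)
        lines.append("#PBS -l walltime={:02d}:{:02d}:{:02d}".format(
            hours, minutes, seconds))

    # Everything that describes a chunk goes into one select statement
    select = "select={}:mpiprocs={}".format(
        num_machines, num_mpiprocs_per_machine)
    if slot_type:
        select += ":slot_type={}".format(slot_type)
    lines.append("#PBS -l {}".format(select))
    return lines


if __name__ == "__main__":
    print("\n".join(build_resource_lines(1, 1, 3600, slot_type="execute")))
```

In an actual plugin this logic would sit in the `_get_resource_lines` method of a `PbsBaseClass` subclass, with the slot type either hard-coded for the cluster or passed in some other way of your choosing.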

 

Then you can make a small package containing it, and add the entry point to the setup.py:

 

    entry_points={
        "aiida.schedulers": [
            "pbspro_icl = aiidapy_fes.cx1.schedulers.pbspro_icl:PbsproICLScheduler"
        ]
    },

Jonathan Chico

Jul 10, 2019, 10:18:30 AM
to aiidausers
Hi Chris

Yes, that is a good solution!

Thanks for the help.

Cheers

