*************************
Creating a slurm template
*************************

We tried to simplify as much as we could the process of generating a template 
that can be used by pyssianutils to generate slurm files. In this example we 
will show the steps of how the already packaged slurm script "example" was 
included.

.. contents::
   :local:

1. Find/create a script that already works
------------------------------------------

For this task we do not need something generic, we need something that we can 
test with a dummy gaussian calculation, that we can submit quickly and see 
if there are any errors in the submission or not. If the HPC has documentation 
available for their users they typically have working slurm scripts for 
running gaussian calculations, but sometimes you have to make your own. We 
already had some experience so we started from the following one: 

.. code-block:: bash
   :caption: test.slurm

   #!/bin/bash
   #SBATCH -c 2
   #SBATCH --mem 4GB
   #SBATCH -t 00:10:00
   #SBATCH --partition generic
   #SBATCH --job-name test
   #SBATCH -o %x.slurm.o%j
   #SBATCH -e %x.slurm.e%j
   
   #Set up gaussian environment
   module load gaussian/g16
   
   export GAUSS_SCRDIR=${SCRATCHGLOBAL}/job_${SLURM_JOBID}
   mkdir -p ${GAUSS_SCRDIR}; 
   
   job_name=test;
   
   g16 < ${job_name}.com > ${job_name}.log
   
   rm -r ${GAUSS_SCRDIR};

2. Duplicate any already existing brakets
-----------------------------------------

We are going to profit of python's format strings for the template creation, 
so in order to ensure that the curly brakets that are already in the slurm script
are retained we need to duplicate them: 

.. code-block:: none
   :caption: Duplicating Curly Brakets

   #!/bin/bash
   #SBATCH -c 2
   #SBATCH --mem 4GB
   #SBATCH -t 00:10:00
   #SBATCH --partition generic
   #SBATCH --job-name test
   #SBATCH -o %x.slurm.o%j
   #SBATCH -e %x.slurm.e%j
   
   #Set up gaussian environment
   module load gaussian/g16
   
   export GAUSS_SCRDIR=${{SCRATCHGLOBAL}}/job_${{SLURM_JOBID}}
   mkdir -p ${{GAUSS_SCRDIR}}; 
   
   job_name=test;
   
   g16 < ${{job_name}}.com > ${{job_name}}.log
   
   rm -r ${{GAUSS_SCRDIR}};


3. Identify the parts that you may want to change
-------------------------------------------------

Typically we want to be able to change the number of **cores**, **memory** and
**partition**. Also, if there are various versions of gaussian in the HPC, we 
may be interested in being able to switch those also which involves changing the
:code:`module load gaussian/g16` line as well as maybe changing the name of the 
gaussian executable :code:`g16`. Finally, the name of the job would also be 
interesting. 

4. Update the special parameters
--------------------------------

We will start by replacing all parameters by :code:`{parameter}`, however
from the previously identified parameters, pyssian only gives a special treatment
to the partition (as it becomes a required keyword for the template) and the 
module loading and the executable. These require a special name so we will first
update those: 

.. code-block:: none
   :caption: Special Parameters update

   #!/bin/bash
   #SBATCH -c 2
   #SBATCH --mem 4GB
   #SBATCH -t 00:10:00
   #SBATCH --partition {partition}
   #SBATCH --job-name test
   #SBATCH -o %x.slurm.o%j
   #SBATCH -e %x.slurm.e%j
   
   #Set up gaussian environment
   module load {moduleload}
   
   export GAUSS_SCRDIR=${{SCRATCHGLOBAL}}/job_${{SLURM_JOBID}}
   mkdir -p ${{GAUSS_SCRDIR}}; 
   
   job_name=test;
   
   {gauexe} < ${{job_name}}.com > ${{job_name}}.log
   
   rm -r ${{GAUSS_SCRDIR}};

Next we can replace the remaining ones:

.. code-block:: none
   :caption: Other Parameters update

   #!/bin/bash
   #SBATCH -c {cores}
   #SBATCH --mem {memory}
   #SBATCH -t {walltime}
   #SBATCH --partition {partition}
   #SBATCH --job-name {jobname}
   #SBATCH -o %x.slurm.o%j
   #SBATCH -e %x.slurm.e%j
   
   #Set up gaussian environment
   module load {moduleload}
   
   export GAUSS_SCRDIR=${{SCRATCHGLOBAL}}/job_${{SLURM_JOBID}}
   mkdir -p ${{GAUSS_SCRDIR}}; 
   
   job_name={jobname};
   
   {gauexe} < ${{job_name}}.com > ${{job_name}}.log
   
   rm -r ${{GAUSS_SCRDIR}};

5. check the template
---------------------

Now we will proceed to save the file and check if the contents are correct and 
if all the parameters are detected. Assuming we save the file with the name 
:code:`newtemplate.txt` we would run: 

.. code:: shell-session

   $ pyssianutils slurm check-template newtemplate.txt
      Keywords found in template newtemp.txt:
      * cores
      * gauexe
      * jobname
      * memory
      * moduleload
      * partition
      * walltime
   /home/user/somewhere/pyssianutils/submit/slurm.py:186: UserWarning:
   
   Key "in_suffix" was not found in the template
   
   /home/user/somewhere/pyssianutils/submit/slurm.py:186: UserWarning:
   
   Key "out_suffix" was not found in the template

First we ensure that all the parameters that we added were detected, which in 
this case those are correct. Next, we observe that there were no errors, but 
two warnings. This means that we could use the template as is, but it will 
probably benefit us if we include the :code:`in_suffix` and the :code:`out_suffix`
these two will be added based on our user defaults in pyssianutils, specifically
in the "common" section. So including them as parameters in the template could
potentially be usefull if one day we decide to start using .out instead of .log
for our gaussian outputs. Thus we modify again the template: 

.. code-block:: none
   :caption: Adding in_suffix and out_suffix

   #!/bin/bash
   #SBATCH -c {cores}
   #SBATCH --mem {memory}
   #SBATCH -t {walltime}
   #SBATCH --partition {partition}
   #SBATCH --job-name {jobname}
   #SBATCH -o %x.slurm.o%j
   #SBATCH -e %x.slurm.e%j
   
   #Set up gaussian environment
   module load {moduleload}
   
   export GAUSS_SCRDIR=${{SCRATCHGLOBAL}}/job_${{SLURM_JOBID}}
   mkdir -p ${{GAUSS_SCRDIR}}; 
   
   job_name={jobname};
   
   {gauexe} < ${{job_name}}{in_suffix} > ${{job_name}}{out_suffix}
   
   rm -r ${{GAUSS_SCRDIR}};

Now we re-run the check 

.. code:: shell-session

   $ pyssianutils slurm check-template newtemplate.txt
   Keywords found in template newtemp.txt:
       * cores
       * gauexe
       * in_suffix
       * jobname
       * memory
       * moduleload
       * out_suffix
       * partition
       * walltime

This time we get no warnings. Now we request the generation of a json file.


.. code:: shell-session

   $ pyssianutils slurm check-template newtemplate.txt --json newtemplate.json
   Keywords found in template newtemp.txt:
       * cores
       * gauexe
       * in_suffix
       * jobname
       * memory
       * moduleload
       * out_suffix
       * partition
       * walltime

It will generate the file: 

.. code-block:: json
   :caption: newtemplate.json

   {
       "defaults": {
           "partition": "default",
           "module": "default",
           "optionwithchoices": "choice0",
           "cores": "default",
           "jobname": "default",
           "memory": "default",
           "walltime": "default"
       },
       "descriptions": {
           "partition": "partition name of the HPC",
           "module": "alias for the executable and environment modules",
           "optionwithchoices": "Description of the option",
           "cores": "description of cores",
           "jobname": "description of jobname",
           "memory": "description of memory",
           "walltime": "description of walltime"
       },
       "partition": {
           "default": {
               "name": "default",
               "max_walltime": "00-00:00:00",
               "mem_per_cpu": 2000
           }
       },
       "module": {
           "default": {
               "exe": "g16",
               "load": "module0 gaussianmodule/version"
           }
       },
       "optionwithchoices": {
           "choice0": "choice0value",
           "choice1": "choice1value"
       }
   }

6. Personalize the template
---------------------------

The personalization of the template happens at the json file that we just 
generated. Although it might look scary it is easier to change than expected. 
We will open it with a text editor and modify the corresponding text. 

We will start by adding information about the HPC, lets say that there are two 
partitions available: short and long with max walltimes of 6h and 5 days 
respectively and both allow 4096 MB per core ( If we do not know this number 
we can use either 1000 or 2000 and those are typically below the actual maximum
memory per cpu in modern HPCs) 

.. code-block:: json
   :caption: partitions

       "partition": {
           "short": {
               "name": "short",
               "max_walltime": "00-06:00:00",
               "mem_per_cpu": 2000
           },
           "long": {
               "name": "long",
               "max_walltime": "05-00:00:00",
               "mem_per_cpu": 2000
           }
       }

Next we can add the different gaussian versions available and how to load them,
lets say that the hypothetical cluster has g09 and g16 and for both you also  
need to load a module named "presets". 

.. code-block:: json
   :caption: modules

       "module": {
           "g09": {
               "exe": "g09",
               "load": "presets gaussian/09"
           },
           "g16": {
               "exe": "g16",
               "load": "presets gaussian/16"
           }
       },

Now lets say that we want to restrict the calculations that we use to 
use only 2,8 or 16 cores. We can specify said constraint by adding : 

.. code-block:: json
   :caption: cores

       "cores": [2,8,16]

or if we want to give more other names to the options we can use the template 
"optionwithchoices" that is provided in the generated json

.. code-block:: json
   :caption: cores with names

           "cores": {
           "small": 2,
           "medium": 8,
           "large": 16
       }

If we want we can remove the descriptions, but those are what will appear in the
command line so it might be usefull to have simple descriptions of the parameters

.. code-block:: json
   :caption: descriptions

    "descriptions": {
        "cores": "size of the molecule for the calculation",
        "partition": "Partition name / Queue that will be used for the calculation",
        "module": "alias of the gaussian version and how to load it",
        "walltime": "Time requested for the calculation",
        "jobname": "default name for the job"
    }


Finally, we need to update the default values in case we ever become lazy and 
do not want to specify every single value through the command line. It is 
important that for those options that we have a small subset of choices, the 
default value matches one of the choices. All together it would look like: 

.. code-block:: json
   :caption: final.json

   {
       "defaults": {
           "partition": "short",
           "module": "g16",
           "cores": "small",
           "jobname": "dummy_name",
           "memory": "2GB",
           "walltime": "00-03:00:00"
       },
       "descriptions": {
           "cores": "size of the molecule for the calculation",
           "partition": "Partition name / Queue that will be used for the calculation",
           "module": "alias of the gaussian version and how to load it",
           "walltime": "Time requested for the calculation",
           "jobname": "default name for the job"
       },
       "partition": {
           "short": {
               "name": "short",
               "max_walltime": "00-06:00:00",
               "mem_per_cpu": 2000
           },
           "long": {
               "name": "long",
               "max_walltime": "05-00:00:00",
               "mem_per_cpu": 2000
           }
       },
       "module": {
           "g09": {
               "exe": "g09",
               "load": "presets gaussian/09"
           },
           "g16": {
               "exe": "g16",
               "load": "presets gaussian/16"
           }
       },
       "cores": {
           "small": 2,
           "medium": 8,
           "large": 16
       }
   }

7. Check that both files match
------------------------------

Now that we have the slurm and the json match. We run again the check template

.. code:: shell-session

   $ pyssianutils slurm check-template newtemplate.txt --json newtemplate.json
   Keywords found in template newtemp.txt:
       * cores
       * gauexe
       * in_suffix
       * jobname
       * memory
       * moduleload
       * out_suffix
       * partition
       * walltime
   Templates are compatible

Note that a new message, indicating that both templates are compatible has 
appeared.

8. Add the new template to your templates
-----------------------------------------

Finally we need to store it with out pyssianutils user data. For that we simply 
have to run: 

.. code:: shell-session

   $ pyssianutils slurm add-template newtemplate.txt myhpc --json newtemplate.json
   Successfully added /home/user/.pyssianutils/templates/slurm/myhpc.txt
   Successfully added /home/user/.pyssianutils/templates/slurm/myhpc.json

If we now type in: 

.. code:: shell-session 
   
   $ pyssianutils slurm 
   Available slurm templates:
       example
       myhpc

the newly added template shows up, meaning that we can now run 

.. code:: shell-session

   $ pyssianutils slurm myhpc --help 
   usage: pyssianutils myhpc [-h] [-l | -r] [-o OUTDIR] [--suffix SUFFIX] [-ow | --skip] [--memory MEMORY | --memory-per-cpu]
                             [--walltime WALLTIME | --use-max-walltime] [--guess-cores] [--guess-mem] [--add-memory | --rm-memory]
                             [--add-nprocs | --rm-nprocs] [--partition {short,long}] [--module {g09,g16}]
                             [--cores {small,medium,large}] [--jobname JOBNAME]
                             [inputfiles ...]
   
   positional arguments:
     inputfiles            Gaussian input files. If none is provided, it will create a 'dummy_job.slurm' to use as template.
   
   options:
     -h, --help            show this help message and exit
     -l, --listfile        When enabled instead of considering the files provided as the gaussian output files considers the file
                           provided as a list of gaussian output files
     -r, --folder          Takes the folder and its subfolder hierarchy and creates a new folder with the same subfolder structure.
                           Finds all the .log, attempts to find their companion .com files and creates the new inputs in their
                           equivalent locations in the new folder tree structure.
     -o OUTDIR, --outdir OUTDIR
                           Where to create the new files, defaults to the current directory
     --suffix SUFFIX       suffix of the generated files
     -ow, --overwrite      When creating the new files if a file with the same name exists overwrites its contents. (The default
                           behaviour is to raise an error to notify the user before overwriting).
     --skip                Skip the creation of slurm templates that already exist
     --memory MEMORY       Memory requested for the calculation. If None is provided it will attempt to guess it from the gaussian
                           input file.
     --memory-per-cpu      It will use the max memory per cpu of the partition
     --walltime WALLTIME   Fixed value of walltime in DD-HH:MM:SS format. If none is provided it will use the default value of
                           '00-03:00:00'
     --use-max-walltime    If enabled, use the selected partition's max walltime
     --guess-cores         attempt to guess the number of cores from the gaussian input file
     --guess-mem           attempt to guess the memory from the gaussian input file
     --partition {short,long}
                           Partition name / Queue that will be used for the calculation
     --module {g09,g16}    alias of the gaussian version and how to load it
     --cores {small,medium,large}
                           size of the molecule for the calculation
     --jobname JOBNAME     default name for the job
   
   inplace:
     Arguments to modify in-place the provided gaussian input files
   
     --add-memory          Add the "%mem" Link0 option to the provided files
     --rm-memory           Remove the "%mem" Link0 option of the provided files
     --add-nprocs          Add the "%nprocshared" Link0 option to the provided files
     --rm-nprocs           Remove the "%nprocshared" Link0 option of the provided files

that shows all options that we can do with our new template but also we can now
use it to generate slurm scripts for gaussian calculations, lets say we have 
the file myfile.com, which does not have the :code:`%nprocshared` set but has 
the :code:`%mem` set up to 48GB. We can now: 

.. code:: shell-session

   $ pyssianutils slurm myhpc --cores small --guess-mem myfile.com --add-nprocs --use-max-walltime
   Creating file myfile.slurm

The contents of :code:`myfile.slurm` will be now: 

.. code:: bash

   #!/bin/bash
   #SBATCH -c 2
   #SBATCH --mem 48GB
   #SBATCH -t 00-06:00:00
   #SBATCH --partition short
   #SBATCH --job-name myfile
   #SBATCH -o %x.slurm.o%j
   #SBATCH -e %x.slurm.e%j
   
   #Set up gaussian environment
   module load presets gaussian/16
   
   export GAUSS_SCRDIR=${SCRATCHGLOBAL}/job_${SLURM_JOBID}
   mkdir -p ${GAUSS_SCRDIR};
   
   job_name=myfile;
   
   g16 < ${job_name}.com > ${job_name}.log
   
   rm -r ${GAUSS_SCRDIR};