This function submits array jobs to SLURM clusters.
The input must be a function that can carry out a full computation by itself, plus a data.frame where each row holds the inputs that this function expects. The input data.frame is dumped to a file, and the input function is wrapped inside an automatically generated R script that gathers its inputs from the command line. A SLURM submission script is then generated for a bash shell. The function can either run the SLURM command `sbatch` to submit the job, or just generate the required files and prompt the user to submit the job via the shell. SLURM parameters are provided as a list; modules and custom filenames for the generated scripts can be provided as well.
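The mechanism described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the script that `run_SLURM` actually generates: the helper `run_task()`, the toy `FUN` and the temporary file are hypothetical, and on a cluster the task index would come from the command line (or from SLURM's `SLURM_ARRAY_TASK_ID` environment variable) rather than being passed by hand.

```r
# Hypothetical stand-ins for the real inputs
FUN <- function(x, y) x + y
PARAMS <- data.frame(x = c(1, 2, 3), y = c(10, 20, 30))

# Dump the inputs with no header and no row names
input_file <- tempfile(fileext = ".csv")
write.table(PARAMS, input_file, sep = ",",
            row.names = FALSE, col.names = FALSE)

# Hypothetical helper: run FUN on the row that matches one array task
run_task <- function(task_id, input_file, FUN) {
  inputs <- read.csv(input_file, header = FALSE)
  # The file has no header, so arguments are matched by position
  args <- unname(as.list(inputs[task_id, ]))
  do.call(FUN, args)
}

# Task 2 uses the second row of PARAMS (x = 2, y = 20)
result <- run_task(2, input_file, FUN)
```

Because the dumped file carries no column names, this sketch passes arguments positionally, so the column order of `PARAMS` has to match the argument order of `FUN`.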
Usage
run_SLURM(
  FUN,
  PARAMS,
  SBATCH_config = default_SBATCH_config(),
  modules = c("R/4.1.0"),
  extra_commands = NULL,
  input_file = "EASYPAR_SLURM_input_jobarray.csv",
  R_script = "EASYPAR_SLURM_Run.R",
  Submission_script = "EASYPAR_SLURM_submission.sh",
  output_folder = ".",
  per_task = 1,
  N_simultaneous_jobs = NULL,
  run = FALSE
)
Arguments
- FUN
A function that takes any arguments as input and performs a computation. This function should be runnable as a standalone R script.
- PARAMS
A data.frame where each row represents inputs for `FUN`. An array job with as many tasks as rows in `PARAMS` is generated.
- SBATCH_config
A list of SBATCH options for the SLURM cluster. The default input is obtained from a call to `default_SBATCH_config()`. The queue and the project ID should always be provided, as they are cluster-specific; otherwise, the default values will cause errors when submitting the job.
- modules
A list of modules that will be added as dependencies of the SLURM submission script. For instance, `modules = 'R/3.5.0'` will generate the dependency for a specific R version as `"module load R/3.5.0"`.
- extra_commands
Extra set of commands that will be executed in the submission script right after the modules declaration.
- input_file
The name of the data.frame input file that is generated from `PARAMS`. This file contains no header and no row names.
- R_script
The name of the R script file that contains the definition of `FUN`, and some other autogenerated R code to call the function with input parameters from the command line. `FUN` is given a placeholder name in this script.
- Submission_script
The name of the SLURM script file that contains the submission routines.
- output_folder
The output of this function will be sent to this folder.
- run
If `TRUE`, the function will attempt to invoke `sbatch` and submit the array jobs. Otherwise, it will print to screen the instructions to run the job manually through the console.
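As a rough illustration of how `modules` and `extra_commands` could end up in the submission script, the sketch below assembles the corresponding lines by hand. It is an assumption about the layout, not the package's actual template: the second module name and the exported variable are hypothetical, and the `#SBATCH` directives are left out because their exact form depends on `SBATCH_config`.

```r
# Inputs as they would be passed to run_SLURM()
modules <- c("R/4.1.0", "gcc/9.3.0")           # second module is hypothetical
extra_commands <- "export OMP_NUM_THREADS=1"   # hypothetical extra command

# One "module load" line per module, then the extra commands
script_lines <- c(
  "#!/bin/bash",
  # (#SBATCH directives from SBATCH_config would go here)
  paste("module load", modules),
  extra_commands
)

cat(script_lines, sep = "\n")
```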
Value
Nothing; this function just generates the inputs required to submit an array job to a SLURM cluster. If requested, it also attempts to submit the jobs.
Note
The queue and the project ID in `SBATCH_config` should always be provided, as they are cluster-specific. The default values will cause errors when submitting the job. In addition, we have found that automatic job submission can sometimes produce `command not found` errors. Manual submission is generally the safest way to submit SLURM jobs.
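One way to guard against `command not found` errors before trying an automatic submission is to check that `sbatch` is visible from the R session. The snippet below is a sketch of such a check, not part of the package; it assumes the files were generated with the default `Submission_script` name in the current directory.

```r
submission_script <- "EASYPAR_SLURM_submission.sh"

# Sys.which() returns "" when the command is not on the PATH
sbatch_path <- unname(Sys.which("sbatch"))
can_submit <- nzchar(sbatch_path) && file.exists(submission_script)

if (can_submit) {
  # Equivalent to typing `sbatch EASYPAR_SLURM_submission.sh` in the shell
  system2(sbatch_path, submission_script)
} else {
  message("Cannot submit from R; from a shell on the cluster, run: sbatch ",
          submission_script)
}
```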
Examples
# minimal dummy example function: prints its two inputs
FUN = function(x, y){ print(c(x, y)) }
# input for 25 array jobs
PARAMS = data.frame(x = runif(25), y = runif(25))
if (FALSE) {
# call - not run since it's cluster-specific
run_SLURM(FUN, PARAMS)
}