This function submits array jobs to LSF clusters.
The input must be a function that can carry out a full computation on its own, plus a data.frame in which each row holds the inputs that the function expects. The input data.frame is written to a file, and the input function is wrapped in an automatically generated R script that reads its inputs from the command line. An LSF submission script is generated for a bash shell. The function can also run the LSF command `bsub` to submit the job, or just generate the required files and prompt the user to submit the job from the shell. LSF parameters are provided as a list; modules and custom filenames for the generated scripts can be set in the same call.
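The dump-and-dispatch idea described above can be sketched in a few lines of R. This is a minimal illustration under stated assumptions, not the script that `run_lsf()` actually generates: the helper `run_task()` and the way the task index reaches it are hypothetical.

```r
# Hypothetical sketch; NOT the code generated by run_lsf().
FUN <- function(x, y) x + y
PARAMS <- data.frame(x = c(1, 2, 3), y = c(10, 20, 30))

# Dump the inputs with no header and no row names, one row per array task.
input_file <- tempfile(fileext = ".csv")
write.table(PARAMS, file = input_file, sep = ",",
            row.names = FALSE, col.names = FALSE)

# Hypothetical helper: read the dumped file, pick the row for one task,
# and pass its values to FUN as positional arguments.
run_task <- function(task_id, input_file, FUN) {
  inputs <- read.csv(input_file, header = FALSE)
  do.call(FUN, unname(as.list(inputs[task_id, ])))
}

# Task 2 receives row 2 of PARAMS, so FUN computes 2 + 20 = 22.
result <- run_task(2L, input_file, FUN)
print(result)
```

In a real array job the task index would come from the command line (for example through `commandArgs(trailingOnly = TRUE)`) rather than being hard-coded as it is here.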
Usage
run_lsf(
  FUN,
  PARAMS,
  BSUB_config = default_BSUB_config(),
  modules = c("R/3.5.0"),
  extra_commands = NULL,
  input_file = "EASYPAR_LSF_input_jobarray.csv",
  R_script = "EASYPAR_LSF_Run.R",
  Submission_script = "EASYPAR_LSF_submission.sh",
  output_folder = ".",
  run = FALSE
)
Arguments
- FUN
A function that takes any number of arguments as input and performs a computation. This function should be runnable as a standalone R script.
- PARAMS
A data.frame where each row holds the inputs for `FUN`. An array job with as many tasks as `PARAMS` has rows is generated.
- BSUB_config
A list of BSUB options for the LSF cluster. The default is obtained from a call to `default_BSUB_config()`. The queue and the project ID should always be provided, as they are cluster-specific; the default values will cause errors when the job is submitted.
- modules
A list of modules that will be added as dependencies of the LSF submission script. For instance, `modules = 'R/3.5.0'` will generate the dependency on a specific R version as `module load R/3.5.0`.
- extra_commands
An extra set of commands that will be executed in the submission script right after the module declarations.
- input_file
The name of the data.frame input file that is generated from `PARAMS`. This file contains no header and no row names.
- R_script
The name of the R script file that contains the definition of `FUN`, plus some autogenerated R code that calls the function with input parameters from the command line. `FUN` is given a fake name in this script.
- Submission_script
The name of the LSF script file that contains the submission routines.
- output_folder
The output of this function will be sent to this folder.
- run
If `TRUE`, the function will attempt to invoke `bsub` and submit the array jobs. Otherwise, it will print instructions for running the job manually from the console.
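To make the roles of `PARAMS`, `modules`, and `R_script` more concrete, here is a rough sketch of how a bash submission script for an array job could be assembled in R. The helper `make_submission_script()`, its `job_name` argument, and the script layout are assumptions made for illustration; the script that `run_lsf()` writes may differ. The `#BSUB -J`, `-q`, and `-P` directives and the `LSB_JOBINDEX` variable are standard LSF features.

```r
# Hypothetical helper; the layout is an assumption, not run_lsf()'s real output.
make_submission_script <- function(n_jobs, queue, project, modules, r_script,
                                   job_name = "EASYPAR") {
  c(
    "#!/bin/bash",
    # One array task per row of PARAMS.
    sprintf("#BSUB -J \"%s[1-%d]\"", job_name, n_jobs),
    # Queue and project are cluster-specific and must be set by the user.
    paste("#BSUB -q", queue),
    paste("#BSUB -P", project),
    # One "module load" line per requested module.
    paste("module load", modules),
    # LSF exposes the index of the current array task as LSB_JOBINDEX.
    paste("Rscript", r_script, "$LSB_JOBINDEX")
  )
}

# Placeholder queue and project names; replace them with real ones.
script_lines <- make_submission_script(
  n_jobs = 25L, queue = "my_queue", project = "my_project",
  modules = "R/3.5.0", r_script = "EASYPAR_LSF_Run.R"
)
cat(script_lines, sep = "\n")
```

A script of this shape is typically submitted by hand with `bsub < EASYPAR_LSF_submission.sh`, which corresponds to the manual route taken when `run = FALSE`.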
Value
Nothing. This function just generates the inputs required to submit an array job to an LSF cluster. If requested, it also attempts to submit the jobs.
Note
The queue and the project ID in `BSUB_config` should always be provided, as they are cluster-specific. The default values will cause errors when the job is submitted. In addition, we have found that automatic job submission can sometimes produce `command not found` errors. Manual submission is generally the safest way to submit LSF jobs.
Examples
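# very dummy example function; c() is needed because the second
# positional argument of print() is `digits`, not a value to print
FUN = function(x, y){ print(c(x, y)) }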
# input for 25 array jobs
PARAMS = data.frame(x = runif(25), y = runif(25))
if (FALSE) {
  # call - not run since it's cluster-specific
  run_lsf(FUN, PARAMS)
}