This function submits array jobs to LSF clusters.

The input is a function that can carry out a full computation by itself, plus a data.frame in which each row holds one set of inputs for that function. The data.frame is dumped to a file, and the function is wrapped in an automatically generated R script that reads its inputs from the command line. An LSF submission script is also generated for a bash shell. The function can either run the LSF command `bsub` to submit the job, or just generate the required files and tell the user how to submit the job from the shell. LSF parameters are passed as a list; modules and custom filenames for the generated scripts are set through their own arguments.
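The mechanism described above can be illustrated with a minimal sketch. This is not the code that the package generates: the helper `run_task`, the comma separator, and the temporary file are assumptions made for illustration, and the task index is passed directly instead of being read from the command line.

```r
# Illustrative sketch only; not the script generated by run_lsf().
FUN <- function(x, y) x + y
PARAMS <- data.frame(x = c(1, 2, 3), y = c(10, 20, 30))

# Dump the inputs without header or row names (separator is an assumption).
input_file <- tempfile(fileext = ".csv")
write.table(PARAMS, input_file, sep = ",",
            row.names = FALSE, col.names = FALSE)

# Hypothetical helper: task `index` reads its own row and calls `fun` on it.
run_task <- function(input_file, index, fun) {
  inputs <- read.csv(input_file, header = FALSE)
  # Drop the auto-generated column names so arguments match by position.
  args <- unname(as.list(inputs[index, , drop = FALSE]))
  do.call(fun, args)
}

# Simulate the second task of the array job.
result <- run_task(input_file, 2, FUN)
```

In a real array job the index would typically come from the scheduler, for example through `commandArgs(trailingOnly = TRUE)` in the wrapper script.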

Usage

run_lsf(
  FUN,
  PARAMS,
  BSUB_config = default_BSUB_config(),
  modules = c("R/3.5.0"),
  extra_commands = NULL,
  input_file = "EASYPAR_LSF_input_jobarray.csv",
  R_script = "EASYPAR_LSF_Run.R",
  Submission_script = "EASYPAR_LSF_submission.sh",
  output_folder = ".",
  run = FALSE
)

Arguments

FUN

A function that takes any number of arguments as input and performs a computation. This function should be runnable as a standalone R script.

PARAMS

A data.frame where each row represents inputs for FUN. An array job with one task per row of PARAMS is generated.

BSUB_config

A list of BSUB options for the LSF cluster. The default is obtained from a call to default_BSUB_config(). The queue and the project ID should always be set explicitly because they are cluster-specific; the default values will cause errors when the job is submitted.
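As a rough illustration of how a list of BSUB options can map onto a submission script, the sketch below renders a list as `#BSUB` directive lines. The list layout (names are `bsub` flags, values are their arguments) and the values `my_queue` and `my_project` are assumptions; the actual element names should be checked by inspecting the output of default_BSUB_config().

```r
# Hypothetical configuration: names are bsub flags, values are their arguments.
config <- list("-q" = "my_queue",    # queue (cluster-specific placeholder)
               "-P" = "my_project",  # project ID (cluster-specific placeholder)
               "-n" = 1)             # number of slots

# Render each entry as a "#BSUB <flag> <value>" directive line.
directives <- sprintf("#BSUB %s %s", names(config), unlist(config))
```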

modules

A list of modules that will be loaded in the LSF submission script. For instance, modules = 'R/3.5.0' adds the dependency on a specific R version as "module load R/3.5.0".

extra_commands

An extra set of commands that will be executed in the submission script right after the module declarations.
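The ordering of module declarations and extra commands can be pictured with the short sketch below. It only assembles text lines; the shebang line and the example command are assumptions, and the script actually generated by run_lsf may differ.

```r
modules <- c("R/3.5.0")
# Hypothetical extra command, placed after the module declarations.
extra_commands <- c("export OMP_NUM_THREADS=1")

script_lines <- c(
  "#!/bin/bash",
  paste("module load", modules),  # one "module load" line per module
  extra_commands
)
```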

input_file

The name of the data.frame input file that is generated from PARAMS. This file contains no header, and no row names.

R_script

The name of the R script file that contains the definition of FUN, plus autogenerated R code that calls the function with input parameters from the command line. FUN is given a placeholder name in this script.

Submission_script

The name of the LSF script file that contains the submission routines.

output_folder

The output of this function will be written to this folder.

run

If `TRUE`, the function will attempt to invoke `bsub` and submit the array job. Otherwise, it will print instructions for running the job manually from the console.

Value

Nothing. This function only generates the files required to submit an array job to an LSF cluster. If requested, it also attempts to submit the job.

Note

The queue and the project ID in `BSUB_config` should always be provided, as they are cluster-specific. The default values will cause errors when the job is submitted. In addition, we have found that automatic job submission can sometimes produce `command not found` errors. Manual submission generally seems the safest way to submit LSF jobs.

See also

See default_BSUB_config, which generates the default parameters for LSF jobs.

Examples

# trivial example function: print both inputs
# (print(x, y) would pass y as the `digits` argument of print)
FUN = function(x, y){ print(c(x, y)) }

# input for 25 array jobs
PARAMS = data.frame(x = runif(25), y = runif(25))

if (FALSE) {
# call - not run since it's cluster-specific
run_lsf(FUN, PARAMS)
}