I have been using the batchtools R package in PeakSegPipeline/R/jobs.R in order to run several steps of my machine learning ChIP-seq software on compute cluster. The batchtools package actually supports dependencies between jobs, as I wrote about previously.

I have also been using future.apply in another unrelated context. In penaltyLearning/R/IntervalRegression.R I use future.apply::future_lapply to parallel K-fold cross-validation when training an L1-regularized interval regression model. It is user-friendly because it acts as a drop-in replacement for the base lapply function. So I typically use it via

LAPPLY <- if(requireNamespace("future.apply")){

The user of my package can customize the parallel backend using future::plan before calling my penaltyLearning::IntervalRegressionCV function (which uses LAPPLY as above).

Today I tried using future.batchtools, which is a future backend that uses the batchtools package. In my script I used

future.apply::future_lapply(some.values, function(value){

The slurm.tmpl file is a bash script that is used to launch the jobs. It gets filled in with values specified in the resources list.

Executing future_lapply in R ran the function on the slurm cluster, as expected. Some jobs finished, but others ran out of time (the walltime I specified was too low for some of them). I even got an informative error at the R terminal:

Error: BatchtoolsExpiration: Future ('<none>') expired ...
slurmstepd: error: *** JOB 16590036 ON cdr699 CANCELLED DUE TO TIME LIMIT ***

So in summary, future/future.apply/future.batchtools provide a user-friendly interface for submitting jobs on the cluster using R. Most of the details of the batchtools package (creating a registry, etc) are taken care of automatically. For full control, using batchtools directly should be preferred. But for a simple interface, future.batchtools is great!