Run R code in parallel in a shell without having an R file

Roshan Mahes

I've got the following .sh file which can be run on a cluster computer using sbatch:

Shell.sh

#!/bin/bash
#
#SBATCH -p smp # partition (queue)
#SBATCH -N 2 # number of nodes
#SBATCH -n 2 # number of tasks
#SBATCH --mem 2000 # memory pool for all cores
#SBATCH -t 5-0:00 # time (D-HH:MM)
#SBATCH -o out.out # STDOUT
#SBATCH -e err.err # STDERR

module load R
srun -N1 -n1 R CMD BATCH ./MyFile.R &
srun -N1 -n1 R CMD BATCH ./MyFile2.R &
wait

My problem is that MyFile.R and MyFile2.R look almost the same:

MyFile.R

source("Experiment.R")
Experiment(args1) # some arguments

MyFile2.R

source("Experiment.R")
Experiment(args2) # some arguments

In fact, I need to do this for about 100 files. Since they all load some R file and then run the experiment with different arguments, I was wondering whether I could do this without creating a new file for each run. I want to run all processes in parallel, so I can't just create one single R file, I think.

My question is: is there some way to run the process directly from the shell, without having an R file for each run? So can I do something like

srun -N1 -n1 R cmd BATCH 'source("Experiment.R"); Experiment(args1)' &
srun -N1 -n1 R cmd BATCH 'source("Experiment.R"); Experiment(args2)' &
wait

instead of the last three lines in Shell.sh?

Katia

Your batch script still needs one line per R process, but every line can call the same R file and pass the run-specific arguments on the command line:

module load R
srun -N1 -n1 Rscript ./MyFile.R args1_1 args1_2 &
srun -N1 -n1 Rscript ./MyFile.R args2_1 args2_2 &
wait

Then within your R file:

source("Experiment.R")
# Get arguments from the command line (only those after the script name)
argv <- commandArgs(trailingOnly = TRUE)

# Use the command-line values if both were supplied; otherwise fall back to defaults
if (length(argv) >= 2) {
  nSim <- as.numeric(argv[1])
  meanVal <- as.numeric(argv[2])
} else {
  nSim <- 100  # some default values
  meanVal <- 5
}

Experiment(nSim, meanVal) # some arguments
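
For around 100 runs, the srun lines themselves can be generated by a loop instead of being written out one by one. The following is only a sketch under assumptions: PARAMS, DRY_RUN and launch_all are made-up names, and the parameter values are placeholders. By default it only prints the commands it would run, so it can be tried without a cluster.

```shell
#!/bin/bash
# Sketch: one srun job step per parameter pair.
# PARAMS, DRY_RUN and launch_all are hypothetical names for illustration.

PARAMS=("100 5" "200 5" "100 10")   # "nSim meanVal" pairs (placeholder values)

launch_all() {
  local pair
  for pair in "${PARAMS[@]}"; do
    # Split "nSim meanVal" into the positional arguments $1 and $2.
    set -- $pair
    if [ "${DRY_RUN:-1}" = "1" ]; then
      # Dry run: only show the command that would be launched.
      echo "srun -N1 -n1 Rscript ./MyFile.R $1 $2"
    else
      # Real run: start the job step in the background.
      srun -N1 -n1 Rscript ./MyFile.R "$1" "$2" &
    fi
  done
  wait   # block until all background job steps have finished
}

launch_all
```

Inside the sbatch script, set DRY_RUN=0 before calling launch_all. How many of the steps actually run at the same time still depends on the resources requested with the #SBATCH lines.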

If you prefer to use the R command instead of Rscript, your batch script should look like this:

module load R
srun -N1 -n1 R -q --slave --vanilla --args args1_1 args1_2 < MyFile.R &
srun -N1 -n1 R -q --slave --vanilla --args args2_1 args2_2 < MyFile.R &
wait

You may or may not need to put quotes around the "R -q --slave ... < MyFile.R" part.
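
To come closer to what the question literally asks for (no per-run R file at all), Rscript also accepts an expression through its -e option. The sketch below assumes that Experiment() takes two numeric arguments, as in the example above; r_expr is a hypothetical helper name, and I have not tested this through srun on a cluster.

```shell
#!/bin/bash
# Sketch: build the R expression for one run as a string.
# r_expr is a hypothetical helper; Experiment.R and Experiment() come from the question.
r_expr() {
  printf 'source("Experiment.R"); Experiment(%s, %s)' "$1" "$2"
}

# In the sbatch script the expression would be passed to Rscript -e, e.g.:
#   srun -N1 -n1 Rscript -e "$(r_expr 100 5)" &
#   srun -N1 -n1 Rscript -e "$(r_expr 200 5)" &
#   wait

# Show the expression generated for one run.
r_expr 100 5; echo
```

The double quotes around "$(r_expr ...)" keep the whole expression together as a single argument to Rscript.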
