﻿ How to construct in R a parallel version of nested for loop to compute values for a square matrix where the function is dependent on i and j?

# How to construct in R a parallel version of nested for loop to compute values for a square matrix where the function is dependent on i and j?

I have a function that takes i and j as parameters and returns a single value and I currently have a nested loop designed to compute a value for each entry in a square matrix. But in essence since each individual value can be computed in parallel. Is there a way I can apply lapply in this situation? The resulting matrix must be N X N and the function is dependant on i and j. Thanks

``````for ( i in 1:matrixRowLength ) {
for ( j in 1:matrixColLength ) {
result_matrix[i,j] <- function(i,j) } }
``````

Use

``````apply(matrix,c(1,2),function)
``````

The `foreach` package has a nesting operator that can be useful when parallelizing nested `for` loops. Here's an example:

``````library(doSNOW)
cl <- makeSOCKcluster(3)
registerDoSNOW(cl)

matrixRowLength <- 5
matrixColLength <- 5
fun <- function(i, j) 10 * i + j

result_matrix.1 <-
foreach(j=1:matrixColLength, .combine='cbind') %:%
foreach(i=1:matrixRowLength, .combine='c') %dopar% {
fun(i, j)
}
``````

Note that I reversed the order of the loops so that the matrix is computed column by column. This is generally preferable since matrices in R are stored in column-major order.

The nesting operator is useful if you have large tasks and at least one of the loops may have a small number of iterations. But in many cases, it's safer to only parallelize the outer loop:

``````result_matrix.2 <-
foreach(j=1:matrixColLength, .combine='cbind') %dopar% {
x <- double(matrixRowLength)
for (i in 1:matrixRowLength) {
x[i] <- fun(i, j)
}
x
}
``````

Note that it can also be useful to use chunking in the outer loop to decrease the amount of post processing performed by the master process. Unfortunately, this technique is a bit more tricky:

``````library(itertools)
nw <- getDoParWorkers()
result_matrix.3 <-
foreach(jglobals=isplitIndices(matrixColLength, chunks=nw),
.combine='cbind') %dopar% {
localColLength <- length(jglobals)
m <- matrix(0, nrow=matrixRowLength, ncol=localColLength)
for (j in 1:localColLength) {
for (i in 1:matrixRowLength) {
m[i,j] <- fun(i, jglobals[j])
}
}
m
}
``````

In my experience, this method often gives the best performance.

Thanks for an interesting question / use case. Here's a solution using the future package (I'm the author):

First, define (*):

``````future_array_call <- function(dim, FUN, ..., simplify = TRUE) {
args <- list(...)
idxs <- arrayInd(seq_len(prod(dim)), .dim = dim)
idxs <- apply(idxs, MARGIN = 1L, FUN = as.list)
y <- future::future_lapply(idxs, FUN = function(idx_list) {
do.call(FUN, args = c(idx_list, args))
})
if (simplify) y <- simplify2array(y)
dim(y) <- dim
y
}
``````

This function does not make any assumptions on what data type your function returns, but with the default `simplify = TRUE` it will try to simplify the returned data type iff possible (similar to how `sapply()` works).

Then with your matrix dimensions (**):

``````matrixRowLength <- 5
matrixColLength <- 5
dim <- c(matrixRowLength, matrixColLength)
``````

and function:

``````slow_fun <- function(i, j, ..., a = 1.0) {
Sys.sleep(0.1)
a * i + j
}
``````

you can run calculate `slow_fun(i, j, a = 10)` for all elements as:

``````y <- future_array_call(dim, FUN = slow_fun, a = 10)
``````

To do it in parallel on your local machine, use:

``````library("future")
plan(multiprocess)
y <- future_array_call(dim, FUN = slow_fun, a = 10)
``````

On a cluster of machines (for which you have SSH access with SSH-key authentication), use:

``````library("future")
plan(cluster, workers = c("machine1", "machine2"))
y <- future_array_call(dim, FUN = slow_fun, a = 10)
``````

Footnotes:

(*) If you wonder how it works, just replace the `future::future_lapply()` statement with a regular `lapply()`.

(**) `future_array_call(dim, FUN)` should work for any `length(dim)`, not just for two (= matrices).