How can I construct, in R, a parallel version of a nested for loop that computes the values of a square matrix where the function depends on i and j?

I have a function that takes i and j as parameters and returns a single value, and I currently use a nested loop to compute a value for each entry of a square matrix. Since each entry can be computed independently, the work should parallelize well. Is there a way I can apply lapply (or something similar) in this situation? The resulting matrix must be N x N, and the function is dependent on i and j. Thanks

for (i in 1:matrixRowLength) {
  for (j in 1:matrixColLength) {
    result_matrix[i, j] <- f(i, j)   # f is the function of the indices i and j
  }
}


ANSWERS:


Use outer() with a vectorized version of your function over the two index vectors:

outer(1:matrixRowLength, 1:matrixColLength, Vectorize(f))

(Note that apply(m, c(1, 2), f) does not fit here: it passes each element's value to f, not the indices i and j.)

The foreach package has a nesting operator that can be useful when parallelizing nested for loops. Here's an example:

library(doSNOW)
cl <- makeSOCKcluster(3)
registerDoSNOW(cl)

matrixRowLength <- 5
matrixColLength <- 5
fun <- function(i, j) 10 * i + j

result_matrix.1 <-
  foreach(j=1:matrixColLength, .combine='cbind') %:% 
    foreach(i=1:matrixRowLength, .combine='c') %dopar% {
      fun(i, j)
    }

Note that I reversed the order of the loops so that the matrix is computed column by column. This is generally preferable since matrices in R are stored in column-major order.
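Column-major storage is easy to see with a tiny base-R example:

```r
# matrix() fills column by column, so consecutive memory holds
# column 1, then column 2, and so on
m <- matrix(1:6, nrow = 2)
m
#      [,1] [,2] [,3]
# [1,]    1    3    5
# [2,]    2    4    6

as.vector(m)   # walks down each column in turn: 1 2 3 4 5 6
```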

The nesting operator is useful if you have large tasks and at least one of the loops may have a small number of iterations. But in many cases, it's safer to only parallelize the outer loop:

result_matrix.2 <-
  foreach(j=1:matrixColLength, .combine='cbind') %dopar% {
    x <- double(matrixRowLength)
    for (i in 1:matrixRowLength) {
      x[i] <- fun(i, j)
    }
    x
  }

Note that it can also be useful to use chunking in the outer loop to decrease the amount of post-processing performed by the master process. Unfortunately, this technique is a bit trickier:

library(itertools)
nw <- getDoParWorkers()
result_matrix.3 <-
  foreach(jglobals=isplitIndices(matrixColLength, chunks=nw),
          .combine='cbind') %dopar% {
    localColLength <- length(jglobals)
    m <- matrix(0, nrow=matrixRowLength, ncol=localColLength)
    for (j in 1:localColLength) {
      for (i in 1:matrixRowLength) {
        m[i,j] <- fun(i, jglobals[j])
      }
    }
    m
  }

In my experience, this method often gives the best performance.
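One quick way to convince yourself the chunked variant is correct is to compare it against the plain nested loop. A sketch using the sequential backend (it assumes the foreach and itertools packages are installed; swap registerDoSEQ() for registerDoSNOW(cl) to actually run in parallel):

```r
# Check that the chunked foreach variant matches the plain nested loop.
library(foreach)
library(itertools)
registerDoSEQ()  # sequential backend for illustration

matrixRowLength <- 5
matrixColLength <- 5
fun <- function(i, j) 10 * i + j

# reference result from the straightforward nested loop
ref <- matrix(0, matrixRowLength, matrixColLength)
for (j in seq_len(matrixColLength))
  for (i in seq_len(matrixRowLength))
    ref[i, j] <- fun(i, j)

# chunked variant: each task fills a block of columns
m3 <- foreach(jglobals = isplitIndices(matrixColLength, chunks = 2),
              .combine = 'cbind') %dopar% {
  m <- matrix(0, nrow = matrixRowLength, ncol = length(jglobals))
  for (j in seq_along(jglobals))
    for (i in seq_len(matrixRowLength))
      m[i, j] <- fun(i, jglobals[j])
  m
}

stopifnot(all(m3 == ref))
```

With a parallel backend registered, each chunk is shipped to a worker as one task, so the master performs only one cbind per chunk instead of one per column.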


Thanks for an interesting question / use case. Here's a solution using the future package (I'm the author):

First, define (*):

future_array_call <- function(dim, FUN, ..., simplify = TRUE) {
  args <- list(...)
  idxs <- arrayInd(seq_len(prod(dim)), .dim = dim)
  idxs <- apply(idxs, MARGIN = 1L, FUN = as.list)
  # future_lapply() now lives in the future.apply package
  y <- future.apply::future_lapply(idxs, FUN = function(idx_list) {
    do.call(FUN, args = c(idx_list, args))
  })
  if (simplify) y <- simplify2array(y)
  dim(y) <- dim
  y
}

This function makes no assumptions about the data type your function returns, but with the default simplify = TRUE it will try to simplify the returned data type if possible (similar to how sapply() works).
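To see what that simplify step does, here is a minimal base-R sketch of the same pattern: simplify2array() collapses a list of scalars into an atomic vector, exactly as sapply() would, and assigning dim() reshapes it column by column:

```r
vals <- lapply(1:6, function(k) k^2)   # list of scalar results
y <- simplify2array(vals)              # collapses to the numeric vector 1 4 9 16 25 36
dim(y) <- c(2, 3)                      # reshape, filled in column-major order
y
#      [,1] [,2] [,3]
# [1,]    1    9   25
# [2,]    4   16   36
```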

Then with your matrix dimensions (**):

matrixRowLength <- 5
matrixColLength <- 5
dim <- c(matrixRowLength, matrixColLength)

and function:

slow_fun <- function(i, j, ..., a = 1.0) {
  Sys.sleep(0.1)
  a * i + j
}

you can calculate slow_fun(i, j, a = 10) for all elements as:

y <- future_array_call(dim, FUN = slow_fun, a = 10)

To do it in parallel on your local machine, use:

library("future")
plan(multisession)  # 'multiprocess' is deprecated in recent versions of future
y <- future_array_call(dim, FUN = slow_fun, a = 10)

On a cluster of machines (for which you have SSH access with SSH-key authentication), use:

library("future")
plan(cluster, workers = c("machine1", "machine2"))
y <- future_array_call(dim, FUN = slow_fun, a = 10)

Footnotes:

(*) If you wonder how it works, just replace the future.apply::future_lapply() call with a regular lapply().

(**) future_array_call(dim, FUN) should work for any length(dim), not just for two (= matrices).
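Expanding on footnote (*), here is a serial version of the helper with the future call replaced by a plain lapply() (a sketch; same indexing logic as above):

```r
array_call_serial <- function(dim, FUN, ..., simplify = TRUE) {
  args <- list(...)
  idxs <- arrayInd(seq_len(prod(dim)), .dim = dim)   # one row per (i, j, ...) index
  idxs <- apply(idxs, MARGIN = 1L, FUN = as.list)    # list of index lists
  y <- lapply(idxs, function(idx_list) do.call(FUN, c(idx_list, args)))
  if (simplify) y <- simplify2array(y)
  dim(y) <- dim
  y
}

# e.g. a 2 x 3 matrix of f(i, j) = 10 * i + j
array_call_serial(c(2, 3), function(i, j) 10 * i + j)
#      [,1] [,2] [,3]
# [1,]   11   12   13
# [2,]   21   22   23
```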


