Flexible and consistent simulation of a matrix of Monte Carlo variates

Consider the Monte Carlo simulation of a matrix of i.i.d. normal random variables. We will show how rTRNG can be used to perform a consistent (fair-playing) simulation of a subset of the variables and simulations.

Consistent simulation in R

We rely on the TRNG engines exposed to R as reference classes by rTRNG.

library(rTRNG)

The mcMatR function below performs the full sequential Monte Carlo simulation of nrow normal i.i.d. samples of ncol variables using the yarn2 generator.

mcMatR <- function(nrow, ncol) {
  r <- yarn2$new(12358)
  M <- matrix(rnorm_trng(nrow * ncol, engine = r),
              nrow = nrow, ncol = ncol, byrow = TRUE)
  M
}

A second function mcSubMatR relies on jump and split operations to perform only a chunk [startRow, endRow] of simulations for a subset subCols of the variables.

mcSubMatR <- function(nrow, ncol,
                      startRow, endRow, subCols) {
  r <- yarn2$new(12358)
  r$jump((startRow - 1)*ncol)
  nSubCols <- endRow - startRow + 1
  S <- matrix(0.0, nrow, ncol)
  S[startRow:endRow, subCols] <-
    vapply(subCols,
           function(j) {
             rj = r$copy()
             rj$split(ncol, j)
             rnorm_trng(nSubCols, engine = rj)
           },
           FUN.VALUE = numeric(nSubCols))
  S
}

The parallel nature of the yarn2 generator ensures the sub-simulation obtained via mcSubMatR is consistent with the full sequential simulation.

rows <- 9
cols <- 5
M <- mcMatR(rows, cols)
startRow <- 4
endRow <- 6
subCols <- c(2, 4:5)
S <- mcSubMatR(rows, cols,
               startRow, endRow, subCols)
identical(M[startRow:endRow, subCols],
          S[startRow:endRow, subCols])
## [1] TRUE
M.1 M.2 M.3 M.4 M.5 S.1 S.2 S.3 S.4 S.5
1 0.20256 -0.41401 -0.76749 -0.33344 0.10718 0 0.00000 0 0.00000 0.00000
2 -2.76439 -1.15524 -0.39394 -1.16604 1.61759 0 0.00000 0 0.00000 0.00000
3 -0.42199 -1.13148 -0.30448 0.12741 -0.16111 0 0.00000 0 0.00000 0.00000
4 -0.94448 -1.86384 -1.03244 -0.41155 1.31009 0 -1.86384 0 -0.41155 1.31009
5 -0.09614 -0.16366 -0.31964 0.87053 0.77996 0 -0.16366 0 0.87053 0.77996
6 1.42049 -0.73062 -1.19459 -1.02146 0.07202 0 -0.73062 0 -1.02146 0.07202
7 -0.61202 0.02906 -0.29100 0.10095 -0.74647 0 0.00000 0 0.00000 0.00000
8 1.10246 -0.50507 0.01286 0.63140 -1.28893 0 0.00000 0 0.00000 0.00000
9 -0.08732 -0.32545 0.29099 0.62003 -0.94617 0 0.00000 0 0.00000 0.00000

Consistent simulation with Rcpp

We now use Rcpp to define functions mcMatRcpp and mcSubMatRcpp for the full sequential simulation and the sub-simulation, respectively. The Rcpp::depends attribute makes sure the TRNG library and headers shipped with rTRNG are available to the C++ code. Moreover, Rcpp::plugins(cpp11) enforces the C++11 standard required by TRNG >= 4.22.

// [[Rcpp::depends(rTRNG)]]
// TRNG >= 4.22 requires C++11
// [[Rcpp::plugins(cpp11)]]
#include <Rcpp.h>
#include <trng/normal_dist.hpp>
#include <trng/yarn2.hpp>
using namespace Rcpp;
// [[Rcpp::export]]
NumericMatrix mcMatRcpp(const int nrow, const int ncol) {
  NumericMatrix M(nrow, ncol);
  trng::yarn2 r(12358);
  trng::normal_dist<> normal(0.0, 1.0);
  for (int i = 0; i < nrow; i++) {
    for (int j = 0; j < ncol; j++) {
      M(i, j) = normal(r);
    }
  }
  return M;
}
// [[Rcpp::export]]
NumericMatrix mcSubMatRcpp(const int nrow, const int ncol,
                           const int startRow,
                           const int endRow,
                           const IntegerVector subCols) {
  NumericMatrix M(nrow, ncol);
  trng::yarn2 r(12358), rj;
  trng::normal_dist<> normal(0.0, 1.0);
  r.jump((startRow - 1) * ncol);
  for (IntegerVector::const_iterator jSub = subCols.begin();
       jSub < subCols.end(); jSub++) {
    int j = *jSub - 1;
    rj = r;
    rj.split(ncol, j);
    for (int i = startRow - 1; i < endRow; i++) {
      M(i, j) = normal(rj);
    }
  }
  return M;
}

As seen above for the R case, consistency of the simulation obtained via mcSubMatRcpp with the full sequential simulation is guaranteed.

rows <- 9
cols <- 5
startRow <- 4
endRow <- 6
subCols <- c(2, 4:5)
M <- mcMatRcpp(rows, cols)
S <- mcSubMatRcpp(rows, cols, startRow, endRow, subCols)
identical(M[startRow:endRow, subCols],
          S[startRow:endRow, subCols])
## [1] TRUE
M.1 M.2 M.3 M.4 M.5 S.1 S.2 S.3 S.4 S.5
1 0.20256 -0.41401 -0.76749 -0.33344 0.10718 0 0.00000 0 0.00000 0.00000
2 -2.76439 -1.15524 -0.39394 -1.16604 1.61759 0 0.00000 0 0.00000 0.00000
3 -0.42199 -1.13148 -0.30448 0.12741 -0.16111 0 0.00000 0 0.00000 0.00000
4 -0.94448 -1.86384 -1.03244 -0.41155 1.31009 0 -1.86384 0 -0.41155 1.31009
5 -0.09614 -0.16366 -0.31964 0.87053 0.77996 0 -0.16366 0 0.87053 0.77996
6 1.42049 -0.73062 -1.19459 -1.02146 0.07202 0 -0.73062 0 -1.02146 0.07202
7 -0.61202 0.02906 -0.29100 0.10095 -0.74647 0 0.00000 0 0.00000 0.00000
8 1.10246 -0.50507 0.01286 0.63140 -1.28893 0 0.00000 0 0.00000 0.00000
9 -0.08732 -0.32545 0.29099 0.62003 -0.94617 0 0.00000 0 0.00000 0.00000

Consistent parallel simulation with RcppParallel

The same technique used for generating a sub-set of the simulations can be exploited for performing a parallel simulation in C++. We can embed the body of mcSubMatRcpp above into an RcppParallel::Worker for performing chunks of Monte Carlo simulations in parallel, for any subset subCols of the variables.

struct MCMatWorker : public Worker
{
  RMatrix<double> M;
  const RVector<int> subCols;

  // constructor
  MCMatWorker(NumericMatrix M,
              const IntegerVector subCols)
    : M(M), subCols(subCols) {}

  // operator processing an exclusive range of indices
  void operator()(std::size_t begin, std::size_t end) {
    trng::yarn2 r(12358), rj;
    trng::normal_dist<> normal(0.0, 1.0);
    r.jump((int)begin * M.ncol());
    for (IntegerVector::const_iterator jSub = subCols.begin();
         jSub < subCols.end(); jSub++) {
      int j = *jSub - 1;
      rj = r;
      rj.split(M.ncol(), j);
      for (int i = (int)begin; i < (int)end; i++) {
        M(i, j) = normal(rj);
      }
    }
  }
};
// [[Rcpp::export]]
NumericMatrix mcMatRcppParallel(const int nrow, const int ncol,
                                const IntegerVector subCols) {
  NumericMatrix M(nrow, ncol);
  MCMatWorker w(M, subCols);
  parallelFor(0, M.nrow(), w);
  return M;
}

The parallel nature of the yarn2 generator ensures the parallel simulation is playing fair, i.e. is consistent with the sequential simulation.

M <- mcMatRcpp(rows, cols)
Mp <- mcMatRcppParallel(rows, cols, seq_len(ncol(M)))
identical(M, Mp)
## [1] TRUE

Similarly, we can achieve a consistent parallel simulation of a subset of the variables only.

Sp <- mcMatRcppParallel(rows, cols, subCols)
identical(M[, subCols],
          Sp[, subCols])
## [1] TRUE
M.1 M.2 M.3 M.4 M.5 Sp.1 Sp.2 Sp.3 Sp.4 Sp.5
1 0.20256 -0.41401 -0.76749 -0.33344 0.10718 0 -0.41401 0 -0.33344 0.10718
2 -2.76439 -1.15524 -0.39394 -1.16604 1.61759 0 -1.15524 0 -1.16604 1.61759
3 -0.42199 -1.13148 -0.30448 0.12741 -0.16111 0 -1.13148 0 0.12741 -0.16111
4 -0.94448 -1.86384 -1.03244 -0.41155 1.31009 0 -1.86384 0 -0.41155 1.31009
5 -0.09614 -0.16366 -0.31964 0.87053 0.77996 0 -0.16366 0 0.87053 0.77996
6 1.42049 -0.73062 -1.19459 -1.02146 0.07202 0 -0.73062 0 -1.02146 0.07202
7 -0.61202 0.02906 -0.29100 0.10095 -0.74647 0 0.02906 0 0.10095 -0.74647
8 1.10246 -0.50507 0.01286 0.63140 -1.28893 0 -0.50507 0 0.63140 -1.28893
9 -0.08732 -0.32545 0.29099 0.62003 -0.94617 0 -0.32545 0 0.62003 -0.94617