Parallel computing: comparison and update

I would like to develop a random matrix generator, and I found a very interesting example at https://cran.r-project.org/web/packages/dqrng/vignettes/parallel.html
Since I am rather new to parallel computation, I am unable to choose between the various approaches or to modify the examples.
I would like to generate a random matrix from a uniform distribution by changing the following code:

-----------------------------1-PCG: multiple streams with RcppParallel---------------------------------------------------
```cpp
#include <Rcpp.h>
// [[Rcpp::depends(dqrng, BH)]]
#include <dqrng_generator.h>
#include <dqrng_sample.h>
// [[Rcpp::depends(RcppParallel)]]
#include <RcppParallel.h>

struct RandomFill : public RcppParallel::Worker {
  RcppParallel::RMatrix<int> output;
  uint64_t seed;

  RandomFill(Rcpp::IntegerMatrix output, const uint64_t seed) : output(output), seed(seed) {};

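  void operator()(std::size_t begin, std::size_t end) {
    // One generator per chunk of columns: same seed, with `end` passed as the stream id
    auto rng = dqrng::generator<pcg64>(seed, end);
    for (std::size_t col = begin; col < end; ++col) {
      // Sample output.nrow() integers with replacement (population size 100000)
      auto sampled = dqrng::sample::sample<std::vector<int>, uint32_t>(*rng, 100000, output.nrow(), true);
      // Copy the sampled values into column `col` of the result matrix
      RcppParallel::RMatrix<int>::Column column = output.column(col);
      std::copy(sampled.begin(), sampled.end(), column.begin());
    }
  }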
};

// [[Rcpp::export]]
Rcpp::IntegerMatrix parallel_random_matrix(const int n, const int m, const int ncores) {
  Rcpp::IntegerMatrix res(n, m);
  RandomFill randomFill(res, 42);
  RcppParallel::parallelFor(0, m, randomFill, m/ncores + 1);
  return res;
}
```

-------------------------------------------------------------------------------------------------------------

I would also like to understand the difference between the former (1) and the latter (2) approach:

-----------------------------2-Xo(ro)shiro: jump ahead with OpenMP-------------------------------------
```cpp
#include <Rcpp.h>
// [[Rcpp::depends(dqrng, BH)]]
#include <dqrng_distribution.h>
// [[Rcpp::plugins(openmp)]]
#include <omp.h>

// [[Rcpp::export]]
std::vector<double> random_sum(int n, int m) {
  dqrng::uniform_distribution dist(0.0, 1.0);               // Uniform distribution [0,1)
  auto rng = dqrng::generator<dqrng::xoshiro256plusplus>(); // seeded from R's RNG
  std::vector<double> res(m);
  for (int i = 0; i < m; ++i) {
    double lres(0);
    for (int j = 0; j < n; ++j) {
      lres += dist(*rng);
    }
    res[i] = lres / n;
  }
  return res;
}

// [[Rcpp::export]]
std::vector<double> parallel_random_sum(int n, int m, int ncores) {
  dqrng::uniform_distribution dist(0.0, 1.0);               // Uniform distribution [0,1)
  auto rng = dqrng::generator<dqrng::xoshiro256plusplus>(); // seeded from R's RNG
  std::vector<double> res(m);
  // ok to use rng here
  
  #pragma omp parallel num_threads(ncores)
  {
    // make thread local copy of rng and advance it by 1 ... ncores jumps
    auto lrng = rng->clone(omp_get_thread_num() + 1);

    #pragma omp for
    for (int i = 0; i < m; ++i) {
      double lres(0);
      for (int j = 0; j < n; ++j) {
        lres += dist(*lrng);
      }
      res[i] = lres / n;
    }
  }
  // ok to use rng here
  return res;
}
```
--------------------------------------------------------------------------------------------------

My aim is to understand which of the two solutions is the more efficient and faster.
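
To make the goal concrete, here is a minimal sketch of the kind of uniform matrix fill I have in mind. It uses only the C++ standard library, not dqrng or RcppParallel, so it is a stand-in for approach (1) rather than a modification of it. Everything in it is an assumption for illustration: the names `fill_uniform_columns` and `parallel_uniform_matrix` are hypothetical, the matrix is a plain column-major `std::vector<double>`, and a `std::mt19937_64` seeded per column through `std::seed_seq` takes the place of the pcg64 streams (I am not assuming it offers the same stream guarantees).

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <random>
#include <thread>
#include <vector>

// Fill columns [begin, end) of a column-major n x m matrix with uniform draws.
// Each column gets its own engine, seeded from (seed, column index), so the
// values do not depend on how the columns are split across threads.
void fill_uniform_columns(std::vector<double>& mat, std::size_t n,
                          std::size_t begin, std::size_t end,
                          std::uint64_t seed) {
  std::uniform_real_distribution<double> dist(0.0, 1.0);
  for (std::size_t col = begin; col < end; ++col) {
    // Assumption: mixing (seed, col) through seed_seq is good enough for a
    // sketch; it is not a replacement for proper independent streams.
    const std::uint64_t c = static_cast<std::uint64_t>(col);
    std::seed_seq seq{static_cast<std::uint32_t>(seed),
                      static_cast<std::uint32_t>(seed >> 32),
                      static_cast<std::uint32_t>(c),
                      static_cast<std::uint32_t>(c >> 32)};
    std::mt19937_64 rng(seq);
    for (std::size_t row = 0; row < n; ++row) {
      mat[col * n + row] = dist(rng);
    }
  }
}

// Hypothetical driver: splits the m columns into contiguous chunks, one
// std::thread per chunk, and returns the filled column-major matrix.
std::vector<double> parallel_uniform_matrix(std::size_t n, std::size_t m,
                                            unsigned ncores,
                                            std::uint64_t seed) {
  std::vector<double> mat(n * m);
  ncores = std::max(1u, ncores);
  const std::size_t chunk = (m + ncores - 1) / ncores;  // columns per thread
  std::vector<std::thread> workers;
  for (std::size_t begin = 0; begin < m; begin += chunk) {
    const std::size_t end = std::min(m, begin + chunk);
    workers.emplace_back(fill_uniform_columns, std::ref(mat), n, begin, end, seed);
  }
  for (auto& w : workers) w.join();
  return mat;
}
```

Called as `parallel_uniform_matrix(n, m, ncores, 42)`, this should return the same matrix for any value of `ncores`, because the seeding depends only on the column index. I do not know whether the same property can or should be kept when using the dqrng generators.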
Thank you very much.

