Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use R_unif_index() instead of unif_rand() and move RNGScope #24

Merged
merged 3 commits into from
Apr 18, 2019

Conversation

rstub
Copy link
Member

@rstub rstub commented Apr 16, 2019

The change from unif_rand() to R_unif_index() changes the amount of RNG draws even before R version 3.6. Loading dqrng, which needs 64 bits for seeding, used to advance R's RNG by two steps:

> set.seed(42)
> runif(10)
 [1] 0.9148060 0.9370754 0.2861395 0.8304476 0.6417455 0.5190959 0.7365883
 [8] 0.1346666 0.6569923 0.7050648
> set.seed(42)
> library(dqrng)
> runif(10)
 [1] 0.2861395 0.8304476 0.6417455 0.5190959 0.7365883 0.1346666 0.6569923
 [8] 0.7050648 0.4577418 0.7191123

With this change it advances R's RNG by four steps:

> set.seed(42)
> runif(10)
 [1] 0.9148060 0.9370754 0.2861395 0.8304476 0.6417455 0.5190959 0.7365883
 [8] 0.1346666 0.6569923 0.7050648
> set.seed(42)
> library(dqrng)
> runif(10)
 [1] 0.6417455 0.5190959 0.7365883 0.1346666 0.6569923 0.7050648 0.4577418
 [8] 0.7191123 0.9346722 0.2554288

Overall this will decrease the speed of generateseedVectors and produce different results.
@LTLA Would this be a problem for you?

@LTLA
Copy link
Contributor

LTLA commented Apr 16, 2019

That's fine; neither the speed or the difference in the results are a concern. generateSeedVectors() shouldn't have to do a lot of work anyway (relative to the actual PRNG'ing), and I was prepared for any differences in the results once I realized that the current code might have a problem.

Is it costly to have the RNGScope constructed within R_random_u32()? One would expect that any functions calling R_random_u32() would be wrapped in [[Rcpp::export(rng=TRUE)]], so an instantiation of the RNGScope should be unnecessary. I don't know what happens if a RNGScope is constructed/destroyed within the scope of another RNGScope - I would hope that it's okay, but who knows. There are also some garbage collection issues but I don't think they apply here.

@rstub
Copy link
Member Author

rstub commented Apr 17, 2019

Construction of a RNGScope object should not be very costly. However, I meanwhile think that it is wrong to move the RNGScope. I had remembered that there was an internal counter that made sure that GetRNGState and PutRNGState are called only once. However, this was removed in RcppCore/Rcpp#825 to resolve some other corner case. In our case this could lead to the following effective code

GetRNGState(); // via Rcpp::export attribute
double x = R:runif(0.0, 1.0);
GetRNGState(); // from RNGScope in R_random_u32
double val = R_unif_index(4294967296);
PutRNGState(); // from RNGScope in R_random_u32
PutRNGState(); // via Rcpp::export attribute

Now x and val would be correlated, since they were both generated based on the same RNG state.
I am reverting the move of RNGScope and document the need for shielding.

@LTLA
Copy link
Contributor

LTLA commented Apr 17, 2019

Your reasoning sounds sensible.

On a more general note, though, the lack of protection from nested RNGScopes is concerning. It seems like an avenue for very subtle bugs to arise, especially when interfacing with Rcpp code or libraries written by other people who might be calling RNGScope internally (ostensibly as a courtesy to avoid requiring the user to create RNGScope themselves). Do the Rcpp folks know about this edge case?

@rstub
Copy link
Member Author

rstub commented Apr 18, 2019

I think they are aware that it can be really tricky to get the RNGScope correct, since they are several github issues discussing the topic. If I find the time I might ask on the Rcpp ML if there is a better pattern, but for now I will merge this PR. Thanks for your input!

@rstub rstub merged commit 53ed727 into master Apr 18, 2019
@rstub rstub deleted the feature/unif_index branch April 18, 2019 15:55
@rstub rstub mentioned this pull request Apr 18, 2019
@LTLA
Copy link
Contributor

LTLA commented Apr 19, 2019

Do you have a timeframe for pushing this to CRAN? I don't mean to be pushy, but the next Bioconductor release is coming up soon, and I'd like an opportunity to build all my dqrng-dependent packages with this new generator on the Bioconductor build system. (Especially on Windows, the gift that keeps on giving.)

@rstub
Copy link
Member Author

rstub commented Apr 19, 2019

What's the timeframe for the Bioconductor release? In principle I wanted to add scalar functions (c.f. #25) for the C++ interface and maybe some additional distribution functions (c.f. #22) before the next CRAN release.

@LTLA
Copy link
Contributor

LTLA commented Apr 19, 2019

The BioC release is sync'd to the CRAN release, so it should occur just after R 3.6.0 comes out (schedule here). So, if we could get the next version of dqrng onto CRAN sometime early next week, I should be able to run it through the BioC build system at least once before they do a commit freeze.

@rstub
Copy link
Member Author

rstub commented Apr 22, 2019

Version 0.2.0 is on its way to CRAN.

@LTLA
Copy link
Contributor

LTLA commented Apr 23, 2019

Works like a charm on my local machine - fingers crossed for the Bioc build system.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants