(69069 * 2783094533) %% 2^32
[1] 1
Ralf Stubner
May 29, 2024
Only two weeks after version 0.4.0 I had to publish version 0.4.1 of dqrng on CRAN.
The main reason for this was an “undefined behaviour” error found on CRAN’s UBSAN checks:
dqrng.cpp:222:18: runtime error: signed integer overflow: 4025630150 * 2783094533 cannot be represented in type 'long int'
Looking at the relevant lines, the error is quite apparent:
Int32 unscramble(Int32 u) {
    for (int j = 0; j < 50; ++j) {
        u = ((u - 1) * 2783094533);
    }
    return u;
}
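While u is an Int32, i.e. a 32-bit unsigned integer type, the integer literal 2783094533 does not fit into an int and is therefore interpreted as a signed long int. As a consequence, u is converted to long int and the multiplication is done using signed integer arithmetic, where overflow is undefined. As a fix, we can simply add the UL suffix so that 2783094533 is interpreted as unsigned long int and the multiplication uses unsigned arithmetic, which wraps around instead of overflowing: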
Int32 unscramble(Int32 u) {
    for (int j = 0; j < 50; ++j) {
        u = ((u - 1) * 2783094533UL);
    }
    return u;
}
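The effect of the suffix can be checked at compile time. The following is only a minimal sketch, not code from dqrng: it assumes that Int32 is a 32-bit unsigned type (modelled here as std::uint32_t), and unscramble_step is a hypothetical helper that performs a single round of the loop above.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

// Assumption: Int32 is a 32-bit unsigned type; std::uint32_t stands in for it here.
using Int32 = std::uint32_t;

// An unsuffixed decimal literal always gets a signed type, here one wide
// enough to hold 2783094533 (long or long long, depending on the platform).
static_assert(std::is_signed<decltype(2783094533)>::value,
              "plain literal is signed");
// With the UL suffix the literal is unsigned.
static_assert(std::is_unsigned<decltype(2783094533UL)>::value,
              "suffixed literal is unsigned");

// The usual arithmetic conversions decide the type of the product:
// signed (overflow is undefined) without the suffix, unsigned (wraps) with it.
static_assert(std::is_signed<decltype(Int32{} * 2783094533)>::value,
              "product with plain literal is signed");
static_assert(std::is_unsigned<decltype(Int32{} * 2783094533UL)>::value,
              "product with suffixed literal is unsigned");

// Hypothetical helper: one round of the unscrambling loop.
// The unsigned product wraps, and the cast reduces it modulo 2^32.
inline Int32 unscramble_step(Int32 u) {
    return static_cast<Int32>((u - 1) * 2783094533UL);
}
```

With this definition, unscramble_step(69070) yields 1, because 69069 * 2783094533 leaves remainder 1 modulo 2^32.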
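In case you are wondering why this code is there in the first place: As discussed on StackOverflow, R scrambles the user-supplied seed by applying seed = 69069 * seed + 1 fifty times in 32-bit unsigned arithmetic, which is the mirror image of the loop in unscramble() above. This can be undone using the modular multiplicative inverse of 69069 modulo 2^32, which is 2783094533, since 69069 * 2783094533 ≡ 1 (mod 2^32).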
The purpose of undoing this scrambling is to make the results of set.seed() and dqset.seed() equivalent if one uses dqrng as a user-defined RNG.
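To make the round trip concrete, here is a small self-contained sketch. It is not taken from dqrng or from R: the scramble() function is inferred from the unscramble() loop shown earlier, and inverse_mod_2_32 is a hypothetical helper that recomputes the constant 2783094533 with a Newton iteration instead of hard-coding it.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: inverse of an odd number a modulo 2^32.
// For odd a, a * a ≡ 1 (mod 8), so x = a is correct to 3 bits; every
// Newton step x <- x * (2 - a * x) doubles the number of correct bits.
inline std::uint32_t inverse_mod_2_32(std::uint32_t a) {
    std::uint32_t x = a;
    for (int i = 0; i < 5; ++i) {
        x *= 2u - a * x;  // unsigned arithmetic wraps modulo 2^32
    }
    return x;
}

// Assumed form of the scrambling: 50 rounds of seed = 69069 * seed + 1.
inline std::uint32_t scramble(std::uint32_t seed) {
    for (int j = 0; j < 50; ++j) {
        seed = 69069u * seed + 1u;
    }
    return seed;
}

// Undo the scrambling: subtract 1, then multiply by the inverse, 50 times.
inline std::uint32_t unscramble(std::uint32_t u) {
    const std::uint32_t inv = inverse_mod_2_32(69069u);
    for (int j = 0; j < 50; ++j) {
        u = (u - 1u) * inv;
    }
    return u;
}
```

Because every operand here is a 32-bit unsigned integer, all intermediate results wrap modulo 2^32 and no signed arithmetic is involved. Under these assumptions, unscramble(scramble(s)) returns s for every 32-bit seed s.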
The second change was triggered by a bug report from Sergey Fedorov from the MacPorts project. When building dqrng on a 32-bit PowerPC architecture, certain parts of the included PCG code were now used that are not compatible with this architecture. For one of these issues a fix was possible, but in the end I had to disable PCG on macOS for PowerPC. It will be interesting to see if other architectures are affected by this as well. Probably the best way to find out is to wait until the Debian package is updated.