Different round keys for columns 0,1 and 2,3 in AesGenerator4R (#76)

* this fixes identical sequences of columns 0/2 and 1/3 if their states are the same
* added TestU01 results for AesGenerator1R and AesGenerator4R
* added a note about the reversibility of AesHash1R
tevador 2019-06-22 15:56:01 +02:00 committed by GitHub
parent 118f3054ea
commit 83498cddf2
GPG key ID: 4AEE18F83AFDEB23
5 changed files with 246 additions and 28 deletions


@ -297,17 +297,24 @@ Using less than 256 MiB of memory is not possible due to the use of tradeoff-res
### 3.1 AesGenerator1R
AesGenerator1R was designed for the fastest possible generation of pseudorandom data to fill the Scratchpad. It takes advantage of hardware accelerated AES in modern CPUs. Only one AES round is performed per 16 bytes of output, which results in throughput exceeding 20 GB/s in most modern CPUs. While 1 AES round is not sufficient for a good distribution of random values, this is not an issue because the purpose is just to initialize the Scratchpad with random non-zero data.
AesGenerator1R was designed for the fastest possible generation of pseudorandom data to fill the Scratchpad. It takes advantage of hardware accelerated AES in modern CPUs. Only one AES round is performed per 16 bytes of output, which results in throughput exceeding 20 GB/s in most modern CPUs.
AesGenerator1R gives a good output distribution provided that it's initialized with a sufficiently 'random' initial state (see Appendix F).
### 3.2 AesGenerator4R
AesGenerator4R uses 4 AES rounds to generate pseudorandom data for Program Buffer initialization. Since 2 AES rounds are sufficient for full avalanche of all input bits [[28](https://csrc.nist.gov/csrc/media/projects/cryptographic-standards-and-guidelines/documents/aes-development/rijndael-ammended.pdf)], AesGenerator4R provides an excellent output distribution while maintaining very good performance.
AesGenerator4R uses 4 AES rounds to generate pseudorandom data for Program Buffer initialization. Since 2 AES rounds are sufficient for full avalanche of all input bits [[28](https://csrc.nist.gov/csrc/media/projects/cryptographic-standards-and-guidelines/documents/aes-development/rijndael-ammended.pdf)], AesGenerator4R has excellent statistical properties (see Appendix F) while maintaining very good performance.
The reversible nature of this generator is not an issue since the generator state is always initialized using the output of a non-reversible hashing function (Blake2b).
### 3.3 AesHash1R
AesHash was designed for the fastest possible calculation of the Scratchpad fingerprint. It interprets the Scratchpad as a set of AES round keys, so it's equivalent to AES encryption with 32768 rounds. Two extra rounds are performed at the end to ensure avalanche of all Scratchpad bits in each lane. The output of the AesHash is fed into the Blake2b hashing function to calculate the final PoW hash.
AesHash was designed for the fastest possible calculation of the Scratchpad fingerprint. It interprets the Scratchpad as a set of AES round keys, so it's equivalent to AES encryption with 32768 rounds. Two extra rounds are performed at the end to ensure avalanche of all Scratchpad bits in each lane.
The reversible nature of AesHash1R is not a problem for two main reasons:
* It is not possible to directly control the input of AesHash1R.
* The output of AesHash1R is passed into the Blake2b hashing function, which is not reversible.
### 3.4 SuperscalarHash
@ -468,6 +475,109 @@ This shows that SuperscalarHash has quite low sensitivity to high-order bits and
When calculating a Dataset item, the input of the first SuperscalarHash depends only on the item number. To ensure a good distribution of results, the constants described in section 7.3 of the Specification were chosen to provide unique values of bits 3-53 for *all* item numbers in the range 0-34078718 (the Dataset contains 34078719 items). All initial register values for all Dataset item numbers were checked to make sure bits 3-53 of each register are unique and there are no collisions (source code: [superscalar-init.cpp](../src/tests/superscalar-init.cpp)). While this is not strictly necessary to get unique output from SuperscalarHash, it's a security precaution that mitigates the non-perfect avalanche properties of the randomly generated SuperscalarHash instances.
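The uniqueness check described above can be sketched roughly as follows. This is only an illustration and not the code in superscalar-init.cpp: the helper names `bits3to53` and `bits3to53Unique` are hypothetical, and the caller is assumed to supply the initial register values computed from the constants in section 7.3 of the Specification.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Keep bits 3..53 (inclusive) of a 64-bit value, i.e. 51 bits in total.
// Hypothetical helper, not part of RandomX.
inline std::uint64_t bits3to53(std::uint64_t v) {
    return (v >> 3) & ((std::uint64_t{1} << 51) - 1);
}

// Returns true if no two values share the same bits 3..53.
// The vector is taken by value so the caller's data is left untouched.
inline bool bits3to53Unique(std::vector<std::uint64_t> values) {
    for (auto& v : values) {
        v = bits3to53(v);
    }
    std::sort(values.begin(), values.end());
    return std::adjacent_find(values.begin(), values.end()) == values.end();
}
```

Sorting and scanning for adjacent duplicates keeps memory use at one 64-bit word per item, which matters when one register is checked across all 34078719 item numbers.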
### F. Statistical tests of RNG
Both AesGenerator1R and AesGenerator4R were tested using the TestU01 library [[30](http://simul.iro.umontreal.ca/testu01/tu01.html)] intended for empirical testing of random number generators. The source code is available in [rng-tests.cpp](../src/tests/rng-tests.cpp).
The tests sample about 200 MB ("SmallCrush" test), 500 GB ("Crush" test) or 4 TB ("BigCrush" test) of output from each generator. This is considerably more than the amounts generated in RandomX (2176 bytes for AesGenerator4R and 2 MiB for AesGenerator1R), so failures in the tests don't necessarily imply that the generators are not suitable for their use case.
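To put these volumes in perspective, the ratio between the tested output and the output actually consumed can be computed directly. The sketch below assumes decimal units for MB, GB and TB (the text does not say which convention is meant); the constant and function names are illustrative only.

```cpp
#include <cstdint>

// Amounts sampled by each TestU01 battery, as quoted in the text.
// Decimal units are an assumption; the orders of magnitude do not depend on it.
constexpr std::uint64_t kSmallCrushBytes = 200ULL * 1000 * 1000;
constexpr std::uint64_t kCrushBytes      = 500ULL * 1000 * 1000 * 1000;
constexpr std::uint64_t kBigCrushBytes   = 4ULL * 1000 * 1000 * 1000 * 1000;

// Amounts generated in RandomX, as quoted in the text.
constexpr std::uint64_t kGen4RBytes = 2176;                // AesGenerator4R
constexpr std::uint64_t kGen1RBytes = 2ULL * 1024 * 1024;  // AesGenerator1R, 2 MiB

// How many times larger the tested sample is than the amount actually used
// (integer division, rounded down).
constexpr std::uint64_t sampleRatio(std::uint64_t sampled, std::uint64_t used) {
    return sampled / used;
}
```

Even the smallest battery samples about 95 times the 2 MiB produced by AesGenerator1R, and about 91911 times the 2176 bytes produced by AesGenerator4R.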
#### AesGenerator4R
The generator passes all tests in the "BigCrush" suite when initialized using the Blake2b hash function:
```
$ bin/rng-tests 1
state0 = 67e8bbe567a1c18c91a316faf19fab73
state1 = 39f7c0e0a8d96512c525852124fdc9fe
state2 = 7abb07b2c90e04f098261e323eee8159
state3 = 3df534c34cdfbb4e70f8c0e1826f4cf7
...
========= Summary results of BigCrush =========
Version: TestU01 1.2.3
Generator: AesGenerator4R
Number of statistics: 160
Total CPU time: 02:50:18.34
All tests were passed
```
The generator passes all tests in the "Crush" suite even with an initial state set to all zeroes.
```
$ bin/rng-tests 0
state0 = 00000000000000000000000000000000
state1 = 00000000000000000000000000000000
state2 = 00000000000000000000000000000000
state3 = 00000000000000000000000000000000
...
========= Summary results of Crush =========
Version: TestU01 1.2.3
Generator: AesGenerator4R
Number of statistics: 144
Total CPU time: 00:25:17.95
All tests were passed
```
#### AesGenerator1R
The generator passes all tests in the "Crush" suite when initialized using the Blake2b hash function.
```
$ bin/rng-tests 1
state0 = 67e8bbe567a1c18c91a316faf19fab73
state1 = 39f7c0e0a8d96512c525852124fdc9fe
state2 = 7abb07b2c90e04f098261e323eee8159
state3 = 3df534c34cdfbb4e70f8c0e1826f4cf7
...
========= Summary results of Crush =========
Version: TestU01 1.2.3
Generator: AesGenerator1R
Number of statistics: 144
Total CPU time: 00:25:06.07
All tests were passed
```
When the initial state is set to all zeroes, the generator fails 1 of the 144 tests in the "Crush" suite:
```
$ bin/rng-tests 0
state0 = 00000000000000000000000000000000
state1 = 00000000000000000000000000000000
state2 = 00000000000000000000000000000000
state3 = 00000000000000000000000000000000
...
========= Summary results of Crush =========
Version: TestU01 1.2.3
Generator: AesGenerator1R
Number of statistics: 144
Total CPU time: 00:26:12.75
The following tests gave p-values outside [0.001, 0.9990]:
(eps means a value < 1.0e-300):
(eps1 means a value < 1.0e-15):
       Test                          p-value
 ----------------------------------------------
 12  BirthdaySpacings, t = 3        1 - 4.4e-5
----------------------------------------------
All other tests were passed
```
## References
[1] CryptoNote whitepaper - https://cryptonote.org/whitepaper.pdf
@ -528,3 +638,5 @@ Cryptocurrencies and Password Hashing - https://eprint.iacr.org/2015/430.pdf Tab
[28] J. Daemen, V. Rijmen: AES Proposal: Rijndael - https://csrc.nist.gov/csrc/media/projects/cryptographic-standards-and-guidelines/documents/aes-development/rijndael-ammended.pdf page 28
[29] 7-Zip File archiver - https://www.7-zip.org/
[30] TestU01 library - http://simul.iro.umontreal.ca/testu01/tu01.html


@ -169,41 +169,46 @@ state0 (16 B) state1 (16 B) state2 (16 B) state3 (16 B)
### 3.3 AesGenerator4R
AesGenerator4R works the same way as AesGenerator1R, except it uses 4 rounds per column:
AesGenerator4R works in a similar way to AesGenerator1R, except that it uses 4 rounds per column. Columns 0 and 1 use a different set of keys than columns 2 and 3.
```
state0 (16 B) state1 (16 B) state2 (16 B) state3 (16 B)
| | | |
AES decrypt AES encrypt AES decrypt AES encrypt
(key0) (key0) (key0) (key0)
(key0) (key0) (key4) (key4)
| | | |
v v v v
AES decrypt AES encrypt AES decrypt AES encrypt
(key1) (key1) (key1) (key1)
(key1) (key1) (key5) (key5)
| | | |
v v v v
AES decrypt AES encrypt AES decrypt AES encrypt
(key2) (key2) (key2) (key2)
(key2) (key2) (key6) (key6)
| | | |
v v v v
AES decrypt AES encrypt AES decrypt AES encrypt
(key3) (key3) (key3) (key3)
(key3) (key3) (key7) (key7)
| | | |
v v v v
state0' state1' state2' state3'
```
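The reason for giving columns 0/1 and 2/3 separate keys can be illustrated with a toy model. The sketch below does not use AES: `toyRound` is a hypothetical invertible mixing step standing in for one AES round, and it is only meant to show that two columns with equal states and equal keys produce identical output, while changing a single round key separates them.

```cpp
#include <array>
#include <cstdint>

using Keys = std::array<std::uint64_t, 4>;

// Stand-in for one AES round (NOT AES): XOR the key, multiply by an odd
// constant, then xorshift. Each step is a bijection on 64-bit values.
inline std::uint64_t toyRound(std::uint64_t state, std::uint64_t key) {
    state ^= key;
    state *= 0x9E3779B97F4A7C15ULL;
    state ^= state >> 29;
    return state;
}

// Apply four rounds with four round keys, like one column of the diagram.
inline std::uint64_t fourRounds(std::uint64_t state, const Keys& keys) {
    for (std::uint64_t key : keys) {
        state = toyRound(state, key);
    }
    return state;
}
```

With a shared key set, `fourRounds(s, keys)` is the same for both columns whenever their states are equal, so they stay in lockstep indefinitely. Because every step is a bijection, two key sets that differ in exactly one key are guaranteed to give different outputs for the same input state.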
AesGenerator4R uses the following 4 round keys:
AesGenerator4R uses the following 8 round keys:
```
key0 = 5d 46 90 f8 a6 e4 fb 7f b7 82 1f 14 95 9e 35 cf
key1 = 50 c4 55 6a 8a 27 e8 fe c3 5a 5c bd dc ff 41 67
key2 = a4 47 4c 11 e4 fd 24 d5 d2 9a 27 a7 ac 4a 32 3d
key3 = 2a 3a 0c 81 ff ae a9 99 d9 db d3 42 08 db f6 76
key0 = dd aa 21 64 db 3d 83 d1 2b 6d 54 2f 3f d2 e5 99
key1 = 50 34 0e b2 55 3f 91 b6 53 9d f7 06 e5 cd df a5
key2 = 04 d9 3e 5c af 7b 5e 51 9f 67 a4 0a bf 02 1c 17
key3 = 63 37 62 85 08 5d 8f e7 85 37 67 cd 91 d2 de d8
key4 = 73 6f 82 b5 a6 a7 d6 e3 6d 8b 51 3d b4 ff 9e 22
key5 = f3 6b 56 c7 d9 b3 10 9c 4e 4d 02 e9 d2 b7 72 b2
key6 = e7 c9 73 f2 8b a3 65 f7 0a 66 a9 2b a7 ef 3b f6
key7 = 09 d6 7c 7a de 39 58 91 fd d1 06 0c 2d 76 b0 c0
```
These keys were generated as:
```
key0, key1, key2, key3 = Hash512("RandomX AesGenerator4R keys")
key0, key1, key2, key3 = Hash512("RandomX AesGenerator4R keys 0-3")
key4, key5, key6, key7 = Hash512("RandomX AesGenerator4R keys 4-7")
```
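The byte strings above and the `AES_GEN_4R_KEY*` constants in the source code of this commit describe the same keys in different notations. Judging from the values in this commit, each group of 4 bytes is read as a little-endian 32-bit word and the words are listed from the highest to the lowest. The following sketch shows that conversion; `keyBytesToWords` is a hypothetical helper and not part of RandomX.

```cpp
#include <array>
#include <cstdint>

// Convert 16 key bytes (in the order printed in the specification) into four
// 32-bit words in the order used by the AES_GEN_4R_KEY* macros:
// the word built from bytes 12..15 comes first, the word from bytes 0..3 last.
inline std::array<std::uint32_t, 4> keyBytesToWords(const std::array<std::uint8_t, 16>& bytes) {
    std::array<std::uint32_t, 4> words{};
    for (int w = 0; w < 4; ++w) {
        const int base = 4 * (3 - w);  // byte offset of this word
        words[w] = static_cast<std::uint32_t>(bytes[base])
                 | (static_cast<std::uint32_t>(bytes[base + 1]) << 8)
                 | (static_cast<std::uint32_t>(bytes[base + 2]) << 16)
                 | (static_cast<std::uint32_t>(bytes[base + 3]) << 24);
    }
    return words;
}
```

Applied to the key0 bytes `dd aa 21 64 db 3d 83 d1 2b 6d 54 2f 3f d2 e5 99`, this yields `0x99e5d23f, 0x2f546d2b, 0xd1833ddb, 0x6421aadd`, which matches `AES_GEN_4R_KEY0`.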
### 3.4 AesHash1R


@ -157,10 +157,14 @@ void fillAes1Rx4(void *state, size_t outputSize, void *buffer) {
template void fillAes1Rx4<true>(void *state, size_t outputSize, void *buffer);
template void fillAes1Rx4<false>(void *state, size_t outputSize, void *buffer);
#define AES_GEN_4R_KEY0 0xcf359e95, 0x141f82b7, 0x7ffbe4a6, 0xf890465d
#define AES_GEN_4R_KEY1 0x6741ffdc, 0xbd5c5ac3, 0xfee8278a, 0x6a55c450
#define AES_GEN_4R_KEY2 0x3d324aac, 0xa7279ad2, 0xd524fde4, 0x114c47a4
#define AES_GEN_4R_KEY3 0x76f6db08, 0x42d3dbd9, 0x99a9aeff, 0x810c3a2a
#define AES_GEN_4R_KEY0 0x99e5d23f, 0x2f546d2b, 0xd1833ddb, 0x6421aadd
#define AES_GEN_4R_KEY1 0xa5dfcde5, 0x06f79d53, 0xb6913f55, 0xb20e3450
#define AES_GEN_4R_KEY2 0x171c02bf, 0x0aa4679f, 0x515e7baf, 0x5c3ed904
#define AES_GEN_4R_KEY3 0xd8ded291, 0xcd673785, 0xe78f5d08, 0x85623763
#define AES_GEN_4R_KEY4 0x229effb4, 0x3d518b6d, 0xe3d6a7a6, 0xb5826f73
#define AES_GEN_4R_KEY5 0xb272b7d2, 0xe9024d4e, 0x9c10b3d9, 0xc7566bf3
#define AES_GEN_4R_KEY6 0xf63befa7, 0x2ba9660a, 0xf765a38b, 0xf273c9e7
#define AES_GEN_4R_KEY7 0xc0b0762d, 0x0c06d1fd, 0x915839de, 0x7a7cd609
template<bool softAes>
void fillAes4Rx4(void *state, size_t outputSize, void *buffer) {
@ -168,12 +172,16 @@ void fillAes4Rx4(void *state, size_t outputSize, void *buffer) {
const uint8_t* outputEnd = outptr + outputSize;
rx_vec_i128 state0, state1, state2, state3;
rx_vec_i128 key0, key1, key2, key3;
rx_vec_i128 key0, key1, key2, key3, key4, key5, key6, key7;
key0 = rx_set_int_vec_i128(AES_GEN_4R_KEY0);
key1 = rx_set_int_vec_i128(AES_GEN_4R_KEY1);
key2 = rx_set_int_vec_i128(AES_GEN_4R_KEY2);
key3 = rx_set_int_vec_i128(AES_GEN_4R_KEY3);
key4 = rx_set_int_vec_i128(AES_GEN_4R_KEY4);
key5 = rx_set_int_vec_i128(AES_GEN_4R_KEY5);
key6 = rx_set_int_vec_i128(AES_GEN_4R_KEY6);
key7 = rx_set_int_vec_i128(AES_GEN_4R_KEY7);
state0 = rx_load_vec_i128((rx_vec_i128*)state + 0);
state1 = rx_load_vec_i128((rx_vec_i128*)state + 1);
@ -183,23 +191,23 @@ void fillAes4Rx4(void *state, size_t outputSize, void *buffer) {
while (outptr < outputEnd) {
state0 = aesdec<softAes>(state0, key0);
state1 = aesenc<softAes>(state1, key0);
state2 = aesdec<softAes>(state2, key0);
state3 = aesenc<softAes>(state3, key0);
state2 = aesdec<softAes>(state2, key4);
state3 = aesenc<softAes>(state3, key4);
state0 = aesdec<softAes>(state0, key1);
state1 = aesenc<softAes>(state1, key1);
state2 = aesdec<softAes>(state2, key1);
state3 = aesenc<softAes>(state3, key1);
state2 = aesdec<softAes>(state2, key5);
state3 = aesenc<softAes>(state3, key5);
state0 = aesdec<softAes>(state0, key2);
state1 = aesenc<softAes>(state1, key2);
state2 = aesdec<softAes>(state2, key2);
state3 = aesenc<softAes>(state3, key2);
state2 = aesdec<softAes>(state2, key6);
state3 = aesenc<softAes>(state3, key6);
state0 = aesdec<softAes>(state0, key3);
state1 = aesenc<softAes>(state1, key3);
state2 = aesdec<softAes>(state2, key3);
state3 = aesenc<softAes>(state3, key3);
state2 = aesdec<softAes>(state2, key7);
state3 = aesenc<softAes>(state3, key7);
rx_store_vec_i128((rx_vec_i128*)outptr + 0, state0);
rx_store_vec_i128((rx_vec_i128*)outptr + 1, state1);


@ -241,7 +241,7 @@ int main(int argc, char** argv) {
std::cout << "Calculated result: ";
result.print(std::cout);
if (noncesCount == 1000 && seedValue == 0)
std::cout << "Reference result: 669ae4f2e5e2c0d9cc232ff2c37d41ae113fa302bbf983d9f3342879831b4edf" << std::endl;
std::cout << "Reference result: a925d346195ef38048e714709e0b24a88fef565fa02fa97127e00fac08ee6eb8" << std::endl;
if (!miningMode) {
std::cout << "Performance: " << 1000 * elapsed / noncesCount << " ms per hash" << std::endl;
}

src/tests/rng-tests.cpp (new file, 93 lines)

@ -0,0 +1,93 @@
/*
cd ~
wget http://simul.iro.umontreal.ca/testu01/TestU01.zip
unzip TestU01.zip
mkdir TestU01
cd TestU01-1.2.3
./configure --prefix=`pwd`/../TestU01
make -j8
make install
cd ~/RandomX
g++ -O3 src/tests/rng-tests.cpp -lm -I ~/TestU01/include -L ~/TestU01/lib -L bin/ -l:libtestu01.a -l:libmylib.a -l:libprobdist.a -lrandomx -o bin/rng-tests -DRANDOMX_GEN=4R -DRANDOMX_TESTU01=Crush
bin/rng-tests 0
*/
extern "C" {
#include "unif01.h"
#include "bbattery.h"
}
#include "../aes_hash.hpp"
#include "../blake2/blake2.h"
#include "utility.hpp"
#include <cstdint>
#ifndef RANDOMX_GEN
#error Please define RANDOMX_GEN with a value of 1R or 4R
#endif
#ifndef RANDOMX_TESTU01
#error Please define RANDOMX_TESTU01 with a value of SmallCrush, Crush or BigCrush
#endif
#define STR(x) #x
#define CONCAT(a,b,c) a ## b ## c
#define GEN_NAME(x) "AesGenerator" STR(x)
#define GEN_FUNC(x) CONCAT(fillAes, x, x4)
#define TEST_SUITE(x) CONCAT(bbattery_, x,)
constexpr int GeneratorStateSize = 64;
constexpr int GeneratorCapacity = GeneratorStateSize / sizeof(uint32_t);
static unsigned long aesGenBits(void *param, void *state) {
uint32_t* statePtr = (uint32_t*)state;
int* indexPtr = (int*)param;
int stateIndex = *indexPtr;
if(stateIndex >= GeneratorCapacity) {
GEN_FUNC(RANDOMX_GEN)<false>(statePtr, GeneratorStateSize, statePtr);
stateIndex = 0;
}
uint32_t next = statePtr[stateIndex];
*indexPtr = stateIndex + 1;
return next;
}
static double aesGenDouble(void *param, void *state) {
return aesGenBits (param, state) / unif01_NORM32;
}
static void aesWriteState(void* state) {
char* statePtr = (char*)state;
for(int i = 0; i < 4; ++i) {
std::cout << "state" << i << " = ";
outputHex(std::cout, statePtr + (i * 16), 16);
std::cout << std::endl;
}
}
int main(int argc, char** argv) {
if (argc != 2) {
std::cout << argv[0] << " <seed>" << std::endl;
return 1;
}
uint32_t state[GeneratorCapacity] = { 0 };
int stateIndex = GeneratorCapacity;
char name[] = GEN_NAME(RANDOMX_GEN);
uint64_t seed = strtoull(argv[1], nullptr, 0);
if(seed) {
blake2b(&state, sizeof(state), &seed, sizeof(seed), nullptr, 0);
}
unif01_Gen gen;
gen.state = &state;
gen.param = &stateIndex;
gen.Write = &aesWriteState;
gen.GetU01 = &aesGenDouble;
gen.GetBits = &aesGenBits;
gen.name = (char*)name;
gen.Write(gen.state);
std::cout << std::endl;
TEST_SUITE(RANDOMX_TESTU01)(&gen);
return 0;
}
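The test program selects the generator at compile time through token pasting. The sketch below isolates that mechanism so the expansion is easier to follow. The macros are copied from the file above, while `RANDOMX_GEN` is defined inline instead of via `-DRANDOMX_GEN=4R`, and the two `fillAes*Rx4` templates are hypothetical stubs that only return their round count.

```cpp
#include <cstddef>
#include <string>

#define RANDOMX_GEN 4R  // normally passed on the command line

#define STR(x) #x
#define CONCAT(a,b,c) a ## b ## c
// The argument is macro-expanded before it reaches STR/CONCAT, because
// GEN_NAME and GEN_FUNC themselves do not apply # or ## to it.
#define GEN_NAME(x) "AesGenerator" STR(x)
#define GEN_FUNC(x) CONCAT(fillAes, x, x4)

// Hypothetical stubs standing in for the real generators in aes_hash.hpp.
template<bool softAes>
int fillAes1Rx4(void*, std::size_t, void*) { return 1; }
template<bool softAes>
int fillAes4Rx4(void*, std::size_t, void*) { return 4; }

// Expands to the string literal "AesGenerator" "4R".
inline std::string selectedName() { return GEN_NAME(RANDOMX_GEN); }

// Expands to fillAes4Rx4<false>(nullptr, 0, nullptr).
inline int selectedRounds() { return GEN_FUNC(RANDOMX_GEN)<false>(nullptr, 0, nullptr); }
```

The extra level of indirection through `GEN_NAME` and `GEN_FUNC` is what makes this work: calling `STR(RANDOMX_GEN)` directly would stringify the macro name itself and yield `"RANDOMX_GEN"`.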