mirror of
https://git.wownero.com/wownero/RandomWOW.git
synced 2024-08-15 00:23:14 +00:00
Added DRAM buffer option to rx2c
parent
2ea440d0f5
commit
8f3b145fe6
2 changed files with 35 additions and 19 deletions
@ -18,10 +18,10 @@ The VM has access to 4 GiB of external memory in read-only mode. The DRAM memory
*The DRAM blob can be generated in 0.1-0.3 seconds using 8 threads with hardware-accelerated AES and dual channel DDR3 or DDR4 memory. Dual channel DDR4 memory has enough bandwidth to support up to 16 mining threads.*
#### MMU
The memory management unit (MMU) interfaces the CPU with the DRAM blob. The purpose of the MMU is to translate the random memory accesses generated by the random program into a DRAM-friendly access pattern, where memory reads are not bound by access latency. The MMU accepts a 32-bit address `addr` and outputs a 64-bit value from DRAM. The MMU splits the 4 GiB DRAM blob into 256-byte blocks. Data within one block is always read sequentially in 32 reads (32×8 bytes). When the current block has been consumed, reading jumps to a random block. The address of the next block is calculated 8 reads before the current block is exhausted to enable efficient prefetching. The MMU uses three internal registers:
The memory management unit (MMU) interfaces the CPU with the DRAM blob. The purpose of the MMU is to translate the random memory accesses generated by the random program into a DRAM-friendly access pattern, where memory reads are not bound by access latency. The MMU accepts a 32-bit address `addr` and outputs a 64-bit value from DRAM. The MMU splits the 4 GiB DRAM blob into 256-byte blocks. Data within one block is always read sequentially in 32 reads (32×8 bytes). When the current block has been consumed, reading jumps to a random block. The address of the next block is calculated 16 reads before the current block is exhausted to enable efficient prefetching. The MMU uses three internal registers:
* **m0** - Address of the next quadword to be read from memory (32-bit, 8-byte aligned).
* **m1** - Address of the next block to be read from memory (32-bit, 256-byte aligned).
* **mx** - Random 32-bit counter that determines the address of the next block. After each read, the read address is mixed with the counter: `mx ^= addr`. When the 24th quadword of the current block is read (the value of the `m0` register ends with `0xC0`), the value of the `mx` register is copied into register `m1` and the last 8 bits of `m1` are cleared.
* **mx** - Random 32-bit counter that determines the address of the next block. After each read, the read address is mixed with the counter: `mx ^= addr`. When the 16th quadword of the current block is read (the value of the `m0` register ends with `0x80`), the value of the `mx` register is copied into register `m1` and the last 8 bits of `m1` are cleared.
*When the value of the `m1` register is changed, the memory location can be preloaded into CPU cache using the x86 `PREFETCH` instruction or ARM `PRFM` instruction. Implicit prefetch should ensure that sequentially accessed memory is already in the cache.*
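
The register behaviour described above can be sketched as a small Python model. This is an illustration under stated assumptions, not the reference implementation: the class name `MMU`, the small power-of-two `dram` buffer standing in for the 4 GiB blob, the little-endian quadword layout, and the exact ordering of the update steps inside `read` are all assumptions made for the example.

```python
import struct

MASK32 = 0xFFFFFFFF
BLOCK_SIZE = 256        # bytes per block, i.e. 32 quadwords
PREFETCH_MARK = 0x80    # low byte of m0 once the 16th quadword has been read


class MMU:
    """Toy model of the m0/m1/mx registers (hypothetical helper)."""

    def __init__(self, dram, m0, mx=0):
        size = len(dram)
        # Assumption: a small power-of-two buffer replaces the 4 GiB blob.
        if size < BLOCK_SIZE or size & (size - 1):
            raise ValueError("dram size must be a power of two >= 256")
        self.dram = dram
        self.size_mask = size - 1
        self.m0 = m0 & MASK32 & ~0xFF   # start of a 256-byte aligned block
        self.m1 = self.m0               # overwritten before it is first used
        self.mx = mx & MASK32

    def read(self, addr):
        # Sequential read of the next quadword in the current block.
        (value,) = struct.unpack_from("<Q", self.dram, self.m0 & self.size_mask)
        self.m0 = (self.m0 + 8) & MASK32
        # Mix the requested address into the counter after each read.
        self.mx ^= addr & MASK32
        low = self.m0 & 0xFF
        if low == PREFETCH_MARK:
            # 16 reads remain: fix the next block address (prefetch point).
            self.m1 = self.mx & ~0xFF
        elif low == 0:
            # Current block consumed: jump to the precomputed block.
            self.m0 = self.m1
        return value
```

Note that in this model `addr` only influences which block is read next, never which quadword is returned by the current call; this is what keeps the reads sequential within a block.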
@ -169,7 +169,7 @@ A 32-bit address mask that is used to calculate the write address for the C oper
|147-157|ROR_64|no|64|6|A >>> B|64|
##### 32-bit operations
Instructions ADD_32, SUB_32, AND_32, OR_32, XOR_32 only use the low-order 32 bits of the input operands. The result of these operations are 32 bits long and bits 32-63 of C are zero.
Instructions ADD_32, SUB_32, AND_32, OR_32, XOR_32 only use the low-order 32 bits of the input operands. The result of these operations is 32 bits long and bits 32-63 of C are zero.
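
As an illustration of the rule above, the 32-bit operations can be modelled in Python by masking both inputs and the result. The function names are hypothetical and chosen for this example only; operands are assumed to be non-negative integers holding 64-bit register values.

```python
MASK32 = 0xFFFFFFFF


def add_32(a, b):
    # Only the low 32 bits of each operand take part; the carry is dropped.
    return ((a & MASK32) + (b & MASK32)) & MASK32


def sub_32(a, b):
    # Masking the difference wraps a negative result into 32 bits.
    return ((a & MASK32) - (b & MASK32)) & MASK32


def and_32(a, b):
    return (a & b) & MASK32


def or_32(a, b):
    return (a | b) & MASK32


def xor_32(a, b):
    return (a ^ b) & MASK32
```

Because every result is masked to 32 bits, bits 32-63 of the value written to C are always zero.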
##### Multiplication
There are 5 different multiplication operations. MUL_64 and MULH_64 both take 64-bit unsigned operands, but MUL_64 produces the low 64 bits of the result and MULH_64 produces the high 64 bits. MUL_32 and IMUL_32 use only the low-order 32 bits of the operands and produce a 64-bit result. The signed variant interprets the arguments as signed integers. IMULH_64 takes two 64-bit signed operands and produces the high-order 64 bits of the result.
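
The five variants can be sketched in Python, where integers have arbitrary precision and the full product is available directly. The function names and the `to_signed` helper are hypothetical; operands are assumed to be passed as non-negative 64-bit register values, and every result is returned as an unsigned 64-bit value.

```python
MASK32 = 0xFFFFFFFF
MASK64 = (1 << 64) - 1


def to_signed(x, bits):
    """Reinterpret the low `bits` bits of x as a two's-complement integer."""
    x &= (1 << bits) - 1
    return x - (1 << bits) if x >> (bits - 1) else x


def mul_64(a, b):
    # Low 64 bits of the unsigned 64x64 product.
    return ((a & MASK64) * (b & MASK64)) & MASK64


def mulh_64(a, b):
    # High 64 bits of the unsigned 64x64 product.
    return ((a & MASK64) * (b & MASK64)) >> 64


def mul_32(a, b):
    # Unsigned 32x32 product; it always fits in 64 bits.
    return (a & MASK32) * (b & MASK32)


def imul_32(a, b):
    # Signed 32x32 product, stored as a 64-bit two's-complement value.
    return (to_signed(a, 32) * to_signed(b, 32)) & MASK64


def imulh_64(a, b):
    # High 64 bits of the signed 64x64 product (Python's >> is arithmetic).
    return ((to_signed(a, 64) * to_signed(b, 64)) >> 64) & MASK64
```

For example, with both operands equal to `0xFFFFFFFF`, `mul_32` treats them as 4294967295 and yields `0xFFFFFFFE00000001`, while `imul_32` treats them as -1 and yields 1.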
@ -246,7 +246,7 @@ The RET instruction behaves like "not taken" when the stack is empty. Taken RET
The program is initialized from a 256-bit seed value `S`.
1. A [pcg32](http://www.pcg-random.org/) random number generator is initialized with state `S[63:0]`.
2. The generator is used to generate 128 random bytes `R1`.
3. Integer registers `r0`-`r7` are initialized using bytes 0-63 bytes of `R1`.
3. Integer registers `r0`-`r7` are initialized using bytes 0-63 of `R1`.
4. Floating point registers `f0`-`f7` are initialized using bytes 64-127 of `R1`, interpreted as eight 64-bit signed integers and converted to double-precision floating-point format.
5. The initial value of the `m0` register is set to `S[95:64]` and the last 8 bits are cleared (256-byte aligned).
6. `S` is expanded into 10 AES round keys `K0`-`K9`.
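
Steps 1-5 above can be sketched in Python as follows. Several details are not fixed by the text and are assumptions of this example: the seed `S` is handled as a 256-bit integer with bit 0 least significant, the generator state is set directly to `S[63:0]` without a seeding step, the increment is the commonly used pcg32 default constant, and `R1` is assembled from little-endian 32-bit outputs. The names `Pcg32` and `init_registers` are hypothetical. AES key expansion (step 6) is not covered.

```python
import struct

MASK32 = 0xFFFFFFFF
MASK64 = (1 << 64) - 1
PCG_MULT = 6364136223846793005
PCG_INC = 1442695040888963407   # assumption: default pcg32 stream constant


class Pcg32:
    """Minimal PCG XSH RR 64/32 generator (illustrative stand-in)."""

    def __init__(self, state, inc=PCG_INC):
        self.state = state & MASK64
        self.inc = (inc | 1) & MASK64   # the increment must be odd

    def next_u32(self):
        old = self.state
        self.state = (old * PCG_MULT + self.inc) & MASK64
        xorshifted = (((old >> 18) ^ old) >> 27) & MASK32
        rot = old >> 59
        # 32-bit rotate right by rot
        return ((xorshifted >> rot) | (xorshifted << ((32 - rot) & 31))) & MASK32


def init_registers(seed):
    """Return (r, f, m0) for a 256-bit integer seed."""
    rng = Pcg32(seed & MASK64)                                # step 1: S[63:0]
    r1 = b"".join(struct.pack("<I", rng.next_u32())
                  for _ in range(32))                         # step 2: 128 bytes
    r = list(struct.unpack("<8Q", r1[:64]))                   # step 3: r0-r7
    f = [float(x) for x in struct.unpack("<8q", r1[64:])]     # step 4: f0-f7
    m0 = (seed >> 64) & MASK32 & ~0xFF                        # step 5: S[95:64]
    return r, f, m0
```

A call such as `init_registers(int.from_bytes(seed_bytes, "little"))` yields eight integers, eight doubles, and a 256-byte aligned `m0`; the byte order used to build the integer is part of the same assumption set.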