RandomWOW/src
SChernykh 3c8c7ee097
Optimized dataset read (#211)
* Optimized dataset read

There was a false dependency on readReg2 and readReg3 (caused by the `xor rbp, rax` instruction) when reading a dataset item (see design.md - 4.6.2 Loop execution, steps 5 and 7). This change uses the `ma` register to read the dataset item before the whole `rbp` (`ma` and `mx`) is changed, so superscalar and out-of-order CPUs can start executing the read earlier.

Results: https://i.imgur.com/Bpeq9mx.png

~1% speedup on modern Intel/AMD CPUs.

* ARMv8: optimized dataset read

Break the dependency on readReg2 and readReg3.

* Fixed light mode hashing
2021-05-22 13:54:50 +02:00
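
The reordering described in the commit message above can be illustrated with a small portable model. This is a rough sketch, not the project's JIT code: the names `readSpecOrder`, `readMaFirst`, `itemOffset`, `makeToyDataset` and `ordersAgree`, the toy dataset size, and the address mapping are all assumptions made for illustration. It only shows that the item address depends on `ma` alone, so computing it before `mx` is updated from readReg2 and readReg3 gives the same result while removing the wait on those two registers.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Toy model only: sizes, names and layout are illustrative assumptions.
constexpr std::size_t kWordsPerItem = 8;  // one 64-byte item = 8 x 64-bit words
constexpr std::size_t kItems = 1024;      // toy dataset size, not the real one

using Regs = std::array<std::uint64_t, 8>;
using Dataset = std::vector<std::uint64_t>;

struct MemRegs {
    std::uint32_t ma;
    std::uint32_t mx;
};

// Map a 32-bit memory register to the first word of a 64-byte-aligned item.
std::size_t itemOffset(std::uint32_t reg) {
    return (reg / 64 % kItems) * kWordsPerItem;
}

// Order as listed in the loop description: update mx first, then read via ma.
void readSpecOrder(Regs& r, MemRegs& m, std::size_t readReg2,
                   std::size_t readReg3, const Dataset& ds) {
    m.mx ^= static_cast<std::uint32_t>(r[readReg2] ^ r[readReg3]);
    const std::size_t off = itemOffset(m.ma);
    for (std::size_t i = 0; i < r.size(); ++i) r[i] ^= ds[off + i];
    std::swap(m.mx, m.ma);
}

// Reordered: the address is taken from ma before mx is touched, so it no
// longer has to wait for readReg2 and readReg3 to be available.
void readMaFirst(Regs& r, MemRegs& m, std::size_t readReg2,
                 std::size_t readReg3, const Dataset& ds) {
    const std::size_t off = itemOffset(m.ma);
    m.mx ^= static_cast<std::uint32_t>(r[readReg2] ^ r[readReg3]);
    for (std::size_t i = 0; i < r.size(); ++i) r[i] ^= ds[off + i];
    std::swap(m.mx, m.ma);
}

// Fill the toy dataset with deterministic pseudo-random words (simple LCG).
Dataset makeToyDataset() {
    Dataset ds(kItems * kWordsPerItem);
    std::uint64_t x = 0x9E3779B97F4A7C15ULL;
    for (auto& w : ds) {
        x = x * 6364136223846793005ULL + 1442695040888963407ULL;
        w = x;
    }
    return ds;
}

// Run both orderings side by side and report whether the states match.
bool ordersAgree(std::uint64_t seed) {
    const Dataset ds = makeToyDataset();
    Regs a{};
    for (std::size_t i = 0; i < a.size(); ++i) a[i] = seed * (i + 1) + i;
    Regs b = a;
    MemRegs m1{static_cast<std::uint32_t>(seed),
               static_cast<std::uint32_t>(seed >> 32)};
    MemRegs m2 = m1;
    for (int iter = 0; iter < 16; ++iter) {
        readSpecOrder(a, m1, 2, 3, ds);
        readMaFirst(b, m2, 2, 3, ds);
    }
    return a == b && m1.ma == m2.ma && m1.mx == m2.mx;
}
```

Calling `ordersAgree` with any seed compares the two orderings over several iterations. The model says nothing about timing; the speedup quoted in the commit message comes from the author's own measurements.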
asm Optimized dataset read (#211) 2021-05-22 13:54:50 +02:00
blake2 Fix symbol collisions with blake2b (#145) 2019-10-30 20:09:27 +01:00
tests add --noBatch benchmark option 2020-07-04 14:57:56 +02:00
aes_hash.cpp Combined hash and fill AES loop (#166) 2019-12-01 16:58:38 +01:00
aes_hash.hpp Combined hash and fill AES loop (#166) 2019-12-01 16:58:38 +01:00
allocator.cpp Fix inconsistent class/struct usage 2019-11-19 23:17:55 +01:00
allocator.hpp Relicensed under the 3-clause BSD license 2019-05-18 14:21:47 +02:00
argon2.h use SSSE3 consistently as opposed to SSE3 2019-10-06 23:46:49 +02:00
argon2_avx2.c Optimized Argon2 (SSSE3/AVX2) 2019-10-06 18:07:23 +02:00
argon2_core.c Fix symbol collisions with blake2b (#145) 2019-10-30 20:09:27 +01:00
argon2_core.h Optimized Argon2 (SSSE3/AVX2) 2019-10-06 18:07:23 +02:00
argon2_ref.c Optimized Argon2 (SSSE3/AVX2) 2019-10-06 18:07:23 +02:00
argon2_ssse3.c use SSSE3 consistently as opposed to SSE3 2019-10-06 23:46:49 +02:00
assembly_generator_x86.cpp Refactoring (#95) 2019-07-03 18:13:20 +02:00
assembly_generator_x86.hpp Use 'dst' as the CBRANCH condition register 2019-05-21 08:37:36 +02:00
blake2_generator.cpp Refactoring (#95) 2019-07-03 18:13:20 +02:00
blake2_generator.hpp Refactoring (#95) 2019-07-03 18:13:20 +02:00
bytecode_machine.cpp Refactoring (#95) 2019-07-03 18:13:20 +02:00
bytecode_machine.hpp Regression tests (#73) 2019-06-22 15:54:43 +02:00
common.hpp JIT compiler for ARMv8 (#125) 2019-09-22 21:06:22 +02:00
configuration.h Increase the frequency of CBRANCH (#118) 2019-08-30 09:28:18 +02:00
cpu.cpp Apple silicon: force W^X, enable hardware AES 2020-11-29 20:39:53 +01:00
cpu.hpp Automatic detection of CPU capabilities 2019-10-08 23:09:35 +02:00
dataset.cpp Optimized Argon2 (SSSE3/AVX2) 2019-10-06 18:07:23 +02:00
dataset.hpp Fixes for cmake build with visual studio (#144) 2019-11-22 18:24:16 +01:00
instruction.cpp Code generator fixups 2019-06-23 23:10:29 +02:00
instruction.hpp Use strongly typed enums (#55) 2019-06-10 16:02:25 +02:00
instruction_weights.hpp Regression tests (#73) 2019-06-22 15:54:43 +02:00
instructions_portable.cpp fix test 92 not failing properly on GCC/amd64 2020-05-06 13:48:53 +02:00
intrin_portable.h fix test 92 not failing properly on GCC/amd64 2020-05-06 13:48:53 +02:00
jit_compiler.hpp Apple silicon: force W^X, enable hardware AES 2020-11-29 20:39:53 +01:00
jit_compiler_a64.cpp Fix illegal instruction crash on some ARM systems 2021-02-01 23:19:14 +01:00
jit_compiler_a64.hpp Fix inconsistent class/struct usage 2019-11-19 23:17:55 +01:00
jit_compiler_a64_static.hpp JIT compiler for ARMv8 (#125) 2019-09-22 21:06:22 +02:00
jit_compiler_a64_static.S Optimized dataset read (#211) 2021-05-22 13:54:50 +02:00
jit_compiler_fallback.hpp Fix inconsistent class/struct usage 2019-11-19 23:17:55 +01:00
jit_compiler_x86.cpp remove unnecessary first-load initialization code 2021-01-23 14:56:35 -08:00
jit_compiler_x86.hpp Fix inconsistent class/struct usage 2019-11-19 23:17:55 +01:00
jit_compiler_x86_static.asm remove unnecessary first-load initialization code 2021-01-23 14:56:35 -08:00
jit_compiler_x86_static.hpp remove unnecessary first-load initialization code 2021-01-23 14:56:35 -08:00
jit_compiler_x86_static.S remove unnecessary first-load initialization code 2021-01-23 14:56:35 -08:00
program.hpp Relicensed under the 3-clause BSD license 2019-05-18 14:21:47 +02:00
randomx.cpp Merge pull request #187 from tevador/pr-netbsd 2020-06-28 16:35:19 +02:00
randomx.h Preserve floating point state when calling randomx_calculate_hash 2020-05-06 12:42:30 +02:00
reciprocal.c Sanity checks (#88) 2019-06-29 18:53:49 +02:00
reciprocal.h Regression tests (#73) 2019-06-22 15:54:43 +02:00
soft_aes.cpp Relicensed under the 3-clause BSD license 2019-05-18 14:21:47 +02:00
soft_aes.h Relicensed under the 3-clause BSD license 2019-05-18 14:21:47 +02:00
superscalar.cpp Fix a possible out-of-bounds access in superscalar generator 2019-10-11 11:31:05 +02:00
superscalar.hpp Fix header dependency of superscalar_program.hpp 2019-06-24 13:58:41 +02:00
superscalar_program.hpp Sanity checks (#88) 2019-06-29 18:53:49 +02:00
virtual_machine.cpp Combined hash and fill AES loop (#166) 2019-12-01 16:58:38 +01:00
virtual_machine.hpp fix potential use-after-free when reallocating cache 2020-06-27 20:21:06 +02:00
virtual_memory.cpp Faster W^X policy for apple silicon macs 2021-05-20 20:35:18 +01:00
virtual_memory.hpp Optional W^X policy for JIT pages (#112) 2019-08-25 13:47:40 +02:00
vm_compiled.cpp JIT compiler for ARMv8 (#125) 2019-09-22 21:06:22 +02:00
vm_compiled.hpp Optional W^X policy for JIT pages (#112) 2019-08-25 13:47:40 +02:00
vm_compiled_light.cpp Optional W^X policy for JIT pages (#112) 2019-08-25 13:47:40 +02:00
vm_compiled_light.hpp Optional W^X policy for JIT pages (#112) 2019-08-25 13:47:40 +02:00
vm_interpreted.cpp Regression tests (#73) 2019-06-22 15:54:43 +02:00
vm_interpreted.hpp Regression tests (#73) 2019-06-22 15:54:43 +02:00
vm_interpreted_light.cpp Support Dataset size larger than 4 GiB 2019-05-29 17:27:49 +02:00
vm_interpreted_light.hpp Add Dataset prefetch in interpreted VM (#52) 2019-06-10 16:00:04 +02:00