This avoids error spews from easylogging++ when we try to log
something before easylogging is initialized, which can happen
when errors happen at command line parsing time
EVP_dss1() was deprecated and EVP_sha1() is the direct replacement.
Upstream libunbound already has this patch. Note that I haven't
added a test for HAVE_EVP_DSS1 since that was deprecated quite a
long time ago in OpenSSL, there's really no reason to support it.
Keep the immediate direct deps at the library that depends on them,
declare deps as PUBLIC so that targets that link against that library
get the library's deps as transitive deps.
Break dep cycle between blockchain_db <-> crytonote_core.
No code refactoring, just hide cycle from cmake so that
it doesn't complain (cycles are allowed only between
static libs, not shared libs).
This is in preparation for supproting BUILD_SHARED_LIBS cmake
built-in option for building internal libs as shared.
The split is to make this software more packageable. 'make install'
is used by the package building scripts, and should not be installing
vendored dependencies onto the system.
Also bumped DB VERSION to 1
Another significant speedup and space savings:
Get rid of global_output_indices, remove indirection from output to keys
This is the change warptangent described on irc but never got to finish.
This allows the OpenSSL function checks to compile in unbound's CMake
configuration.
Otherwise, the functions SHA256() and EVP_sha512() won't be called from
libunbound as possible algorithms.
They had not been compiling because static OpenSSL libraries were being
used, along with lack of -ldl. The static library preference is
unnecessary for the checks, so use default suffixes ordering for
CMAKE_FIND_LIBRARY_SUFFIXES when building unbound.
Related files:
configure_checks.cmake
external/unbound/validator/val_secalgo.c
secalgo_ds_digest(), setup_key_digest()
Using libevent seems to have high peaks of file descriptor use,
which can cause failure to create fds in other parts of bitmonerod.
The fallback implementation seems to run fine in a significantly
tighter file descriptor limit.
Bockchain:
1. Optim: Multi-thread long-hash computation when encountering groups of blocks.
2. Optim: Cache verified txs and return result from cache instead of re-checking whenever possible.
3. Optim: Preload output-keys when encoutering groups of blocks. Sort by amount and global-index before bulk querying database and multi-thread when possible.
4. Optim: Disable double spend check on block verification, double spend is already detected when trying to add blocks.
5. Optim: Multi-thread signature computation whenever possible.
6. Patch: Disable locking (recursive mutex) on called functions from check_tx_inputs which causes slowdowns (only seems to happen on ubuntu/VMs??? Reason: TBD)
7. Optim: Removed looped full-tx hash computation when retrieving transactions from pool (???).
8. Optim: Cache difficulty/timestamps (735 blocks) for next-difficulty calculations so that only 2 db reads per new block is needed when a new block arrives (instead of 1470 reads).
Berkeley-DB:
1. Fix: 32-bit data errors causing wrong output global indices and failure to send blocks to peers (etc).
2. Fix: Unable to pop blocks on reorganize due to transaction errors.
3. Patch: Large number of transaction aborts when running multi-threaded bulk queries.
4. Patch: Insufficient locks error when running full sync.
5. Patch: Incorrect db stats when returning from an immediate exit from "pop block" operation.
6. Optim: Add bulk queries to get output global indices.
7. Optim: Modified output_keys table to store public_key+unlock_time+height for single transaction lookup (vs 3)
8. Optim: Used output_keys table retrieve public_keys instead of going through output_amounts->output_txs+output_indices->txs->output:public_key
9. Optim: Added thread-safe buffers used when multi-threading bulk queries.
10. Optim: Added support for nosync/write_nosync options for improved performance (*see --db-sync-mode option for details)
11. Mod: Added checkpoint thread and auto-remove-logs option.
12. *Now usable on 32-bit systems like RPI2.
LMDB:
1. Optim: Added custom comparison for 256-bit key tables (minor speed-up, TBD: get actual effect)
2. Optim: Modified output_keys table to store public_key+unlock_time+height for single transaction lookup (vs 3)
3. Optim: Used output_keys table retrieve public_keys instead of going through output_amounts->output_txs+output_indices->txs->output:public_key
4. Optim: Added support for sync/writemap options for improved performance (*see --db-sync-mode option for details)
5. Mod: Auto resize to +1GB instead of multiplier x1.5
ETC:
1. Minor optimizations for slow-hash for ARM (RPI2). Incomplete.
2. Fix: 32-bit saturation bug when computing next difficulty on large blocks.
[PENDING ISSUES]
1. Berkely db has a very slow "pop-block" operation. This is very noticeable on the RPI2 as it sometimes takes > 10 MINUTES to pop a block during reorganization.
This does not happen very often however, most reorgs seem to take a few seconds but it possibly depends on the number of outputs present. TBD.
2. Berkeley db, possible bug "unable to allocate memory". TBD.
[NEW OPTIONS] (*Currently all enabled for testing purposes)
1. --fast-block-sync arg=[0:1] (default: 1)
a. 0 = Compute long hash per block (may take a while depending on CPU)
b. 1 = Skip long-hash and verify blocks based on embedded known good block hashes (faster, minimal CPU dependence)
2. --db-sync-mode arg=[[safe|fast|fastest]:[sync|async]:[nblocks_per_sync]] (default: fastest:async:1000)
a. safe = fdatasync/fsync (or equivalent) per stored block. Very slow, but safest option to protect against power-out/crash conditions.
b. fast/fastest = Enables asynchronous fdatasync/fsync (or equivalent). Useful for battery operated devices or STABLE systems with UPS and/or systems with battery backed write cache/solid state cache.
Fast - Write meta-data but defer data flush.
Fastest - Defer meta-data and data flush.
Sync - Flush data after nblocks_per_sync and wait.
Async - Flush data after nblocks_per_sync but do not wait for the operation to finish.
3. --prep-blocks-threads arg=[n] (default: 4 or system max threads, whichever is lower)
Max number of threads to use when computing long-hash in groups.
4. --show-time-stats arg=[0:1] (default: 1)
Show benchmark related time stats.
5. --db-auto-remove-logs arg=[0:1] (default: 1)
For berkeley-db only. Auto remove logs if enabled.
**Note: lmdb and berkeley-db have changes to the tables and are not compatible with official git head version.
At the moment, you need a full resync to use this optimized version.
[PERFORMANCE COMPARISON]
**Some figures are approximations only.
Using a baseline machine of an i7-2600K+SSD+(with full pow computation):
1. The optimized lmdb/blockhain core can process blocks up to 585K for ~1.25 hours + download time, so it usually takes 2.5 hours to sync the full chain.
2. The current head with memory can process blocks up to 585K for ~4.2 hours + download time, so it usually takes 5.5 hours to sync the full chain.
3. The current head with lmdb can process blocks up to 585K for ~32 hours + download time and usually takes 36 hours to sync the full chain.
Averate procesing times (with full pow computation):
lmdb-optimized:
1. tx_ave = 2.5 ms / tx
2. block_ave = 5.87 ms / block
memory-official-repo:
1. tx_ave = 8.85 ms / tx
2. block_ave = 19.68 ms / block
lmdb-official-repo (0f4a036437)
1. tx_ave = 47.8 ms / tx
2. block_ave = 64.2 ms / block
**Note: The following data denotes processing times only (does not include p2p download time)
lmdb-optimized processing times (with full pow computation):
1. Desktop, Quad-core / 8-threads 2600k (8Mb) - 1.25 hours processing time (--db-sync-mode=fastest:async:1000).
2. Laptop, Dual-core / 4-threads U4200 (3Mb) - 4.90 hours processing time (--db-sync-mode=fastest:async:1000).
3. Embedded, Quad-core / 4-threads Z3735F (2x1Mb) - 12.0 hours processing time (--db-sync-mode=fastest:async:1000).
lmdb-optimized processing times (with per-block-checkpoint)
1. Desktop, Quad-core / 8-threads 2600k (8Mb) - 10 minutes processing time (--db-sync-mode=fastest:async:1000).
berkeley-db optimized processing times (with full pow computation)
1. Desktop, Quad-core / 8-threads 2600k (8Mb) - 1.8 hours processing time (--db-sync-mode=fastest:async:1000).
2. RPI2. Improved from estimated 3 months(???) into 2.5 days (*Need 2AMP supply + Clock:1Ghz + [usb+ssd] to achieve this speed) (--db-sync-mode=fastest:async:1000).
berkeley-db optimized processing times (with per-block-checkpoint)
1. RPI2. 12-15 hours (*Need 2AMP supply + Clock:1Ghz + [usb+ssd] to achieve this speed) (--db-sync-mode=fastest:async:1000).
Due to a bug in unbound, we were passing a string containing a null
character to ub_ctx_resolvconf and ub_ctx_hosts rather than a NULL
pointer. On *nix this wasn't causing headache, but on Windows this was
causing unbound to not correctly load DNS settings from the OS.
Note on the bug: in a Windows-specific code branch in the function
ub_ctx_hosts(), if the hosts file specified was a NULL pointer, a call
to getenv() was stored in a local char* and later freed. This is
incorrect, as we do not own that data, and caused the program to crash.
Everything except actually *using* BlockchainBDB is wired up, but the db
itself is not yet working. Some error about user mem not large enough.
I think I know what this error means, but I can't determine the cause.
Notes: BerkeleyDB does not allow 0-indexing in its recno type databases,
so block numbers *in the database* will be 1-indexed. Modifications
to indexing have been made as needed.
Forgot that CMake vars set to PARENT_SCOPE will still vanish if that
parent scope goes...out of scope. LMDB vars elevated one more scope to
compensate for moving db_drivers/ into external/
libglim is an Apache-licensed C++ wrapper for lmdb, and rather than
rolling our own it seems prudent to use it.
Note: lmdb is not included in it, and unless something happens as did
with libunbound, should be installed via each OS' package manager or
equivalent.
On Windows, getaddrinfo is part of the Windows API and as such is
__stdcall, not __cdecl, so check_function_exists fails because the
declaration doesn't match the mangling __stdcall has. Instead, use a
header to include the symbol as declared on the system and use
check_symbol_exists instead.
Tested-By: greatwolf on IRC