update_checkpoints() makes a few DNS requests and can take up to 20-30 seconds to complete (3-6 seconds on average). It is currently called from core::handle_incoming_block() which holds m_incoming_tx_lock, so it blocks all incoming transactions and blocks processing while update_checkpoints() is running. This PR moves it to until after a new block has been processed and relayed, to avoid full monerod locking.
A newly synced Alice sends a (typically quite small) list of
txids in the local tpxool to a random peer Bob, who then uses
the existing tx relay system to send Alice any tx in his txpool
which is not in the list Alice sent
If the peer (whether pruned or not itself) supports sending pruned blocks
to syncing nodes, the pruned version will be sent along with the hash
of the pruned data and the block weight. The original tx hashes can be
reconstructed from the pruned txes and theur prunable data hash. Those
hashes and the block weights are hashes and checked against the set of
precompiled hashes, ensuring the data we received is the original data.
It is currently not possible to use this system when not using the set
of precompiled hashes, since block weights can not otherwise be checked
for validity.
This is off by default for now, and is enabled by --sync-pruned-blocks
The blockchain prunes seven eighths of prunable tx data.
This saves about two thirds of the blockchain size, while
keeping the node useful as a sync source for an eighth
of the blockchain.
No other data is currently pruned.
There are three ways to prune a blockchain:
- run monerod with --prune-blockchain
- run "prune_blockchain" in the monerod console
- run the monero-blockchain-prune utility
The first two will prune in place. Due to how LMDB works, this
will not reduce the blockchain size on disk. Instead, it will
mark parts of the file as free, so that future data will use
that free space, causing the file to not grow until free space
grows scarce.
The third way will create a second database, a pruned copy of
the original one. Since this is a new file, this one will be
smaller than the original one.
Once the database is pruned, it will stay pruned as it syncs.
That is, there is no need to use --prune-blockchain again, etc.
To help protect one's privacy from traffic volume analysis
for people using Tor or I2P. This will really fly once we
relay txes on a timer rather than on demand, though.
Off by default for now since it's wasteful and doesn't bring
anything until I2P's in.
It was always returning true, and could not be foreseen to
usefully return errors in the future. This silences CID 162652
as well as saves some checking code in a few places.
Add get_fork_version and add_ideal_fork_version to core so
cryptonote_protocol does not have to need the Blockchain
class directly, as it's not in its dependencies, and add
those to the fake core classes in tests too.
A block queue is now placed between block download and
block processing. Blocks are now requested only from one
peer (unless starved).
Includes a new sync_info coommand.
- fix wrong block being used when a new block is received between
a node elaying a fluffy block and sending a new fluffy block
with txes a peer did not have
- misc a neverending ping pong requesting the same missing txids
when a new block is received in the meantime, causing the top
block to not be the one we need
- send the original fluffy block message block height when sending
a new fluffy block, not the current top height, which might
have been updated since
- avoid sending back the whole block blob when asking for txes,
send only the hash instead
- plus misc cleanup and additional debugging logs
The last relayed time of a transaction is maintained, and
transactions will be relayed again if they are still in the
pool after a certain amount of time, which increases with
the transaction's age. All such transactions are resent,
whether or not they originated on the local node.
Bockchain:
1. Optim: Multi-thread long-hash computation when encountering groups of blocks.
2. Optim: Cache verified txs and return result from cache instead of re-checking whenever possible.
3. Optim: Preload output-keys when encoutering groups of blocks. Sort by amount and global-index before bulk querying database and multi-thread when possible.
4. Optim: Disable double spend check on block verification, double spend is already detected when trying to add blocks.
5. Optim: Multi-thread signature computation whenever possible.
6. Patch: Disable locking (recursive mutex) on called functions from check_tx_inputs which causes slowdowns (only seems to happen on ubuntu/VMs??? Reason: TBD)
7. Optim: Removed looped full-tx hash computation when retrieving transactions from pool (???).
8. Optim: Cache difficulty/timestamps (735 blocks) for next-difficulty calculations so that only 2 db reads per new block is needed when a new block arrives (instead of 1470 reads).
Berkeley-DB:
1. Fix: 32-bit data errors causing wrong output global indices and failure to send blocks to peers (etc).
2. Fix: Unable to pop blocks on reorganize due to transaction errors.
3. Patch: Large number of transaction aborts when running multi-threaded bulk queries.
4. Patch: Insufficient locks error when running full sync.
5. Patch: Incorrect db stats when returning from an immediate exit from "pop block" operation.
6. Optim: Add bulk queries to get output global indices.
7. Optim: Modified output_keys table to store public_key+unlock_time+height for single transaction lookup (vs 3)
8. Optim: Used output_keys table retrieve public_keys instead of going through output_amounts->output_txs+output_indices->txs->output:public_key
9. Optim: Added thread-safe buffers used when multi-threading bulk queries.
10. Optim: Added support for nosync/write_nosync options for improved performance (*see --db-sync-mode option for details)
11. Mod: Added checkpoint thread and auto-remove-logs option.
12. *Now usable on 32-bit systems like RPI2.
LMDB:
1. Optim: Added custom comparison for 256-bit key tables (minor speed-up, TBD: get actual effect)
2. Optim: Modified output_keys table to store public_key+unlock_time+height for single transaction lookup (vs 3)
3. Optim: Used output_keys table retrieve public_keys instead of going through output_amounts->output_txs+output_indices->txs->output:public_key
4. Optim: Added support for sync/writemap options for improved performance (*see --db-sync-mode option for details)
5. Mod: Auto resize to +1GB instead of multiplier x1.5
ETC:
1. Minor optimizations for slow-hash for ARM (RPI2). Incomplete.
2. Fix: 32-bit saturation bug when computing next difficulty on large blocks.
[PENDING ISSUES]
1. Berkely db has a very slow "pop-block" operation. This is very noticeable on the RPI2 as it sometimes takes > 10 MINUTES to pop a block during reorganization.
This does not happen very often however, most reorgs seem to take a few seconds but it possibly depends on the number of outputs present. TBD.
2. Berkeley db, possible bug "unable to allocate memory". TBD.
[NEW OPTIONS] (*Currently all enabled for testing purposes)
1. --fast-block-sync arg=[0:1] (default: 1)
a. 0 = Compute long hash per block (may take a while depending on CPU)
b. 1 = Skip long-hash and verify blocks based on embedded known good block hashes (faster, minimal CPU dependence)
2. --db-sync-mode arg=[[safe|fast|fastest]:[sync|async]:[nblocks_per_sync]] (default: fastest:async:1000)
a. safe = fdatasync/fsync (or equivalent) per stored block. Very slow, but safest option to protect against power-out/crash conditions.
b. fast/fastest = Enables asynchronous fdatasync/fsync (or equivalent). Useful for battery operated devices or STABLE systems with UPS and/or systems with battery backed write cache/solid state cache.
Fast - Write meta-data but defer data flush.
Fastest - Defer meta-data and data flush.
Sync - Flush data after nblocks_per_sync and wait.
Async - Flush data after nblocks_per_sync but do not wait for the operation to finish.
3. --prep-blocks-threads arg=[n] (default: 4 or system max threads, whichever is lower)
Max number of threads to use when computing long-hash in groups.
4. --show-time-stats arg=[0:1] (default: 1)
Show benchmark related time stats.
5. --db-auto-remove-logs arg=[0:1] (default: 1)
For berkeley-db only. Auto remove logs if enabled.
**Note: lmdb and berkeley-db have changes to the tables and are not compatible with official git head version.
At the moment, you need a full resync to use this optimized version.
[PERFORMANCE COMPARISON]
**Some figures are approximations only.
Using a baseline machine of an i7-2600K+SSD+(with full pow computation):
1. The optimized lmdb/blockhain core can process blocks up to 585K for ~1.25 hours + download time, so it usually takes 2.5 hours to sync the full chain.
2. The current head with memory can process blocks up to 585K for ~4.2 hours + download time, so it usually takes 5.5 hours to sync the full chain.
3. The current head with lmdb can process blocks up to 585K for ~32 hours + download time and usually takes 36 hours to sync the full chain.
Averate procesing times (with full pow computation):
lmdb-optimized:
1. tx_ave = 2.5 ms / tx
2. block_ave = 5.87 ms / block
memory-official-repo:
1. tx_ave = 8.85 ms / tx
2. block_ave = 19.68 ms / block
lmdb-official-repo (0f4a036437)
1. tx_ave = 47.8 ms / tx
2. block_ave = 64.2 ms / block
**Note: The following data denotes processing times only (does not include p2p download time)
lmdb-optimized processing times (with full pow computation):
1. Desktop, Quad-core / 8-threads 2600k (8Mb) - 1.25 hours processing time (--db-sync-mode=fastest:async:1000).
2. Laptop, Dual-core / 4-threads U4200 (3Mb) - 4.90 hours processing time (--db-sync-mode=fastest:async:1000).
3. Embedded, Quad-core / 4-threads Z3735F (2x1Mb) - 12.0 hours processing time (--db-sync-mode=fastest:async:1000).
lmdb-optimized processing times (with per-block-checkpoint)
1. Desktop, Quad-core / 8-threads 2600k (8Mb) - 10 minutes processing time (--db-sync-mode=fastest:async:1000).
berkeley-db optimized processing times (with full pow computation)
1. Desktop, Quad-core / 8-threads 2600k (8Mb) - 1.8 hours processing time (--db-sync-mode=fastest:async:1000).
2. RPI2. Improved from estimated 3 months(???) into 2.5 days (*Need 2AMP supply + Clock:1Ghz + [usb+ssd] to achieve this speed) (--db-sync-mode=fastest:async:1000).
berkeley-db optimized processing times (with per-block-checkpoint)
1. RPI2. 12-15 hours (*Need 2AMP supply + Clock:1Ghz + [usb+ssd] to achieve this speed) (--db-sync-mode=fastest:async:1000).