Bitcoin Q&A: Initial Blockchain Download

Becker asks, "Downloading the blockchain…"
This is a beginner's question, very good. Thank you for asking beginner's
questions, that is really useful. There are a lot of people who need basic information,
and beginner's questions are perfect for this Q&A. Becker asks, "Why does it take so long to download
the blockchain? I do have a fast internet connection." "I could download 200 gigabytes in less than an hour." Becker is asking about what is called
the initial blockchain download or IBD, which is the first synchronisation of the Bitcoin node, or any kind of blockchain node, to its blockchain network.

The answer is: while the amount of data you need to download in order to have the full Bitcoin blockchain is about 200 gigabytes or so, you are not simply downloading it and storing it on disk. One of a Bitcoin node's fundamental functions is to validate transactions against the rules of consensus. Your node does that even if you're not trying to do a full sync of the blockchain; every node validates every rule. When you start from the genesis block and download each subsequent block, you are building up to the complete blockchain of today and fully syncing with the rest of the network. With every block, you download all of the transactions. Then your node goes through it and validates everything: all of the signatures, spends, amounts, coinbase rewards, and fees. It recreates and reconstructs every soft fork and upgrade, every change in the code. It replicates the entire history from January 3rd, 2009 through to today. It behaves like a 2009 node for the first period of downloading the blockchain; it counts the votes in a soft fork, changes the rules in real time, then evaluates the next block based on those new rules.
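To make that concrete, here is a minimal sketch in Python of that replay loop. It is a toy model of my own, not Bitcoin Core's actual code: the rule labels are simplified, most validation steps are stubbed out, and the activation heights approximate the real mainnet ones.

    import hashlib

    def sha256d(data: bytes) -> bytes:
        # Bitcoin hashes everything twice with SHA-256.
        return hashlib.sha256(hashlib.sha256(data).digest()).digest()

    # A few soft-fork activation heights on mainnet (simplified labels).
    SOFT_FORKS = {0: "original 2009 rules",
                  173805: "BIP16 pay-to-script-hash",
                  363725: "BIP66 strict DER signatures",
                  481824: "BIP141 segregated witness"}

    def rules_at(height: int) -> str:
        # The node judges each block by the rules in force at its height,
        # effectively living through every upgrade as it replays history.
        return SOFT_FORKS[max(h for h in SOFT_FORKS if h <= height)]

    def validate_block(height: int, prev_hash: bytes, header: bytes) -> bytes:
        # Each header must commit to the hash of the previous block.
        assert header[:32] == prev_hash, "broken chain link"
        # Signatures, amounts, fees, Merkle roots, and proof-of-work would
        # all be checked here, under rules_at(height); omitted in this toy.
        print(f"block {height}: valid under {rules_at(height)}")
        return sha256d(header)

    # Toy chain: each "header" is just the previous hash plus a payload.
    prev = bytes(32)  # the genesis block points at an all-zero hash
    for height in range(3):
        header = prev + f"payload {height}".encode()
        prev = validate_block(height, prev, header)

The important point is that this loop runs the full set of checks for every one of the hundreds of thousands of historical blocks.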

It re-calculates the difficulty and checks whether miners hit the target for blocks mined in 2010. It evaluates every rule as if it were that time, even though it is downloading the blockchain for the first time. It simulates living in 2009, then 2010, and so on. Every bug, every fork, every change.
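As one example of the rules it replays, the difficulty retargeting is simple arithmetic. Here is a sketch, assuming the standard two-week rule; the constants are Bitcoin's real parameters, but the function is my own simplification.

    # Every 2016 blocks, the target is rescaled so that blocks keep
    # averaging ten minutes. Bitcoin Core's actual code also has a
    # well-known off-by-one: it measures the timespan across 2015
    # block intervals rather than 2016.

    TWO_WEEKS = 2016 * 600          # expected seconds per retarget period
    MAX_TARGET = 0xFFFF * 2**208    # easiest allowed target ("difficulty 1")

    def retarget(old_target: int, actual_timespan: int) -> int:
        # Clamp so difficulty never moves more than 4x per period.
        actual = min(max(actual_timespan, TWO_WEEKS // 4), TWO_WEEKS * 4)
        # Blocks too fast -> smaller target (harder); too slow -> easier.
        return min(old_target * actual // TWO_WEEKS, MAX_TARGET)

    # Example: the last 2016 blocks took one week instead of two, so the
    # target halves and mining becomes twice as difficult.
    print(hex(retarget(MAX_TARGET, TWO_WEEKS // 2)))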
That takes more than just bandwidth. It also takes CPU and a significant amount of disk indexing. In order to validate that a transaction hasn't been double-spent or improperly spent, the node must keep in memory, and index, all of the unspent transaction outputs (UTXOs), to evaluate whether each amount was spendable. It must also index transaction IDs: when your transaction refers to a previous transaction, it must look that transaction up by hash. It must reconstruct the Merkle roots of all blocks and check the hash linking each block to the previous one. That is a lot of database indexing.
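As a rough illustration of that bookkeeping, here is a toy UTXO index in Python. It is my own simplification, not Bitcoin Core's chainstate database, but it shows the lookup-by-hash and double-spend check just described.

    # Toy UTXO set: every spendable output, indexed by (txid, output index).
    # Validating a spend means looking up each input by hash, checking it
    # exists and is unspent, removing it, then adding the new outputs.

    utxos: dict = {}  # (txid, vout) -> amount in satoshis

    def apply_tx(txid: str, inputs: list, outputs: list) -> None:
        spent = 0
        for ref in inputs:
            # A missing key means a double spend, or a reference to an
            # output that never existed: the transaction is invalid.
            spent += utxos.pop(ref)
        assert sum(outputs) <= spent, "outputs exceed inputs"
        for vout, amount in enumerate(outputs):
            utxos[(txid, vout)] = amount   # fee = spent - sum(outputs)

    # A coinbase creates new coins; later transactions spend them by hash.
    utxos[("coinbase0", 0)] = 50_0000_0000                 # 50 BTC
    apply_tx("tx1", [("coinbase0", 0)], [30_0000_0000, 19_9999_0000])
    apply_tx("tx2", [("tx1", 1)], [19_9998_0000])
    print(len(utxos), "unspent outputs remain")            # -> 2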
That is what happens with your node. I would guess that your real problem here is not network bandwidth, but the hard drive.

Capacity and performance of the disk, as well as the available memory. A recommended minimum configuration for a node involves four gigabytes of RAM, and that is only if you have a relatively fast solid-state disk (SSD), because of all the index reading and writing on disk. If you don't have a solid-state disk, you need to do a lot more caching in RAM to compensate for the performance of an old mechanical hard drive. In that case, you might need 8-16 gigabytes of RAM. I would guess your bottleneck is disk I/O, perhaps CPU, although if you are running it on a modern 4-core processor, that shouldn't be a problem.
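In Bitcoin Core, for example, that RAM cache is sized with the dbcache option in bitcoin.conf. The snippet below is only an illustration of the idea; the specific values are my assumption, not a recommendation from this answer.

    # bitcoin.conf (illustrative)
    # dbcache sizes the database cache in MiB; the default is a few
    # hundred. A larger cache means far fewer reads and writes hitting
    # a slow mechanical drive during the initial block download.
    dbcache=4000
    # Optionally skip relaying unconfirmed transactions while syncing.
    blocksonly=1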

If you are doing this on a Raspberry Pi with only two gigabytes of RAM, then I can see what your problem is. That will be all of the bottlenecks within the system, rather than your bandwidth.
