Bitcoin behind the curtains: 2 - Sending Transactions
What happens when a transaction is sent to the Bitcoin network.
In the last post, we saw what transactions are made of. If you need a quick refresher, you can take a look at the summary. In this post, we’re gonna understand what needs to be done in order to confirm the transaction.
Assembling a transaction, gathering inputs, outputs, setting transaction fees, and signing the transaction is something that can be done offline. If you wanna go full hardcore mode, you can even assemble and sign your transaction with paper and pen.
But in order to transfer the custody of the money to a new owner, you need to do one extra step besides creating the transaction: broadcast it to the Bitcoin network. After you broadcast it, all you have to do is wait for the confirmation of the transaction.
The Bitcoin Network
The Bitcoin network differs from most common computer networks. Practically all the websites we visit are based on the “client-server” model. In this model, the application architecture is divided in a way that segregates the consumption and supply of resources. That means that the participants of that network are not equally privileged. The server has more permissions and access than the clients.
In order to build permissionless, trustless, and uncensorable money for the internet, we need a different model for the application architecture. In this model, all participants must be equally privileged and equipotent on the network.
Luckily, there’s a network architecture designed just for that. It’s called a “peer-to-peer network” (p2p, for short). In p2p networks, the participants are all equally privileged and equipotent. A p2p network is egalitarian by design. With a couple more ingredients, that network architecture is used to run Bitcoin and enable it to be permissionless, trustless, and uncensorable.
In order to participate in the Bitcoin network, all you have to do is download and run software that can talk in the Bitcoin language (aka the protocol) on your computer. This software enables you to communicate with other participants and thus broadcast information, such as a transaction. We call such software a Bitcoin client.
So a participant of the network is a computer that runs a Bitcoin client and is connected to other computers that are also running bitcoin clients. We call a participant of the network a “node”.
Broadcasting a Transaction
Going back to our transaction timeline, let’s suppose that Alice is transferring the custody of some coins from her to Bob. Alice uses an app to assemble and sign the transaction, and when she clicks “send” the software will broadcast the transaction to all nodes connected to it.
When a node receives a broadcast transaction, it checks the transaction against the rules of the protocol. If the transaction obeys the rules, the node broadcasts it to all the nodes it is connected to, and so on. If the transaction doesn’t obey the rules of the protocol, the node discards it.
Soon enough, if the transaction follows the protocol rules, it will have been broadcast to the majority of the nodes of the network. This technique is called “flooding” and it’s a very common strategy to transmit data around in peer-to-peer networks.
Notice that Alice doesn’t need to be directly connected to Bob in order to make the payment. That’s because there are no funds being moved around; all that is being transferred is the custody of bitcoins. It’s kinda like announcing: “The new owner of these bitcoins must prove that they are the actual owner”. As I told you in the last post, that proof is a digital signature. So there’s no need to be directly connected to the new owner when making a transaction.
It’s also worth noticing that despite the majority of the network being aware of the transaction, it is not yet confirmed, and thus the coins cannot yet be spent by Bob. Unconfirmed transactions are stored in a node’s mempool, a memory pool of unconfirmed transactions. Each node has its own mempool, since not all nodes are aware of the same set of unconfirmed transactions at the same time.
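To make flooding a bit more concrete, here is a minimal sketch in Python. The network layout, the node names, and the is_valid check are all made up for illustration; real nodes apply the full protocol rules and talk over the network rather than through a dictionary.

```python
from collections import deque

# Hypothetical toy network: each node lists the peers it is connected to.
PEERS = {
    "alice": ["n1", "n2"],
    "n1": ["alice", "n3"],
    "n2": ["alice", "n3"],
    "n3": ["n1", "n2", "bob"],
    "bob": ["n3"],
}

def is_valid(tx):
    # Stand-in for the real protocol rules (signatures, unspent inputs, ...).
    return tx.get("signed", False) and tx.get("amount", 0) > 0

def flood(origin, tx, peers=PEERS):
    """Return the set of nodes that end up holding the transaction."""
    accepted = {origin}      # the sender already has the transaction
    queue = deque([origin])  # nodes that still have to relay it
    while queue:
        node = queue.popleft()
        for peer in peers[node]:
            if peer in accepted:
                continue     # this peer already has it, nothing to do
            if not is_valid(tx):
                continue     # the peer checks the rules and discards it
            accepted.add(peer)
            queue.append(peer)  # the peer relays to its own peers next
    return accepted

good_tx = {"signed": True, "amount": 5}
bad_tx = {"signed": False, "amount": 5}
print(sorted(flood("alice", good_tx)))  # every node, including bob
print(sorted(flood("alice", bad_tx)))   # only alice
```

Notice that Bob receives the valid transaction even though Alice is not connected to him, and that the invalid one never gets past Alice’s direct peers.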
In order to confirm the transaction, it must be added to a block of transactions and this block must be linked to the previous block of transactions. This is done via a special process called “mining”.
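A rough sketch of the mempool idea in Python is shown below. The Node class, its method names, and the way the transaction id is computed are assumptions made for this example, not how a real Bitcoin client is organized.

```python
import hashlib
import json

def txid(tx):
    # Hypothetical id: SHA-256 of the JSON-encoded toy transaction.
    encoded = json.dumps(tx, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()

class Node:
    def __init__(self):
        self.mempool = {}    # txid -> transaction that is still unconfirmed
        self.confirmed = {}  # txid -> transaction already included in a block

    def accept_transaction(self, tx):
        """Keep a valid, not yet confirmed transaction in the mempool."""
        tid = txid(tx)
        if tid not in self.confirmed:
            self.mempool[tid] = tx
        return tid

    def connect_block(self, block_transactions):
        """When a block arrives, its transactions leave the mempool."""
        for tx in block_transactions:
            tid = txid(tx)
            self.mempool.pop(tid, None)
            self.confirmed[tid] = tx

tx = {"from": "alice", "to": "bob", "amount": 5}
node_a, node_b = Node(), Node()
tid = node_a.accept_transaction(tx)             # only node_a has heard of it
print(tid in node_a.mempool, tid in node_b.mempool)
node_a.connect_block([tx])                      # a block confirms it
print(tid in node_a.mempool, tid in node_a.confirmed)
```

The two nodes hold different mempools until the transaction reaches both of them, which is why there is no single global mempool.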
Mining Bitcoins
At a high level, mining is the process used in Bitcoin to append a “block” of new transactions in the ledger. Bitcoin mining is a process that requires a lot of energy, thus the node that successfully mines a block is rewarded with all the transaction fees contained in that block, plus a number of new coins.
In the whitepaper, Satoshi described mining as a process similar to gold mining, thus the name “mining”. Like gold mining, Bitcoin mining is a very expensive process that results in new monetary units in the network.
This expensive process is in reality solving an inequation like the one below.
value_computed_by_miner < target_set
You can read this as: “the value computed by the miner must be lower than the target value in order to have a valid solution for the mining problem”.
The value computed by the miner is the output of a hash function. A hash function is a mathematical function that receives an input and outputs a fixed-size number that looks random, even though the same input always produces the same output.
You can play around with a hash function here. Try entering similar inputs and see how the outputs compare. Notice that the output is a string of numbers and letters; this is just a number written in hexadecimal. You can read more about hexadecimal here.
We’ll talk more about hash functions in the future. All that you need to know right now is that if you use very similar inputs, the outputs will be entirely different.
f("cat") = "77af778b51abd4a3c51c5ddd97204a9c3ae614ebccb75a606c3b6865aed6744e"
f("Cat") = "48735c4fae42d1501164976afec76730b9e5fe467f680bdd8daff4bb77674045"
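If you’d rather try this on your own machine, here is a small Python sketch. It assumes that f is SHA-256 (the post doesn’t name the function) and uses Python’s standard hashlib module; differing_hex_digits is a helper made up for this example.

```python
import hashlib

def f(text):
    # Assumption: f is SHA-256 applied to the UTF-8 bytes of the text.
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

def differing_hex_digits(a, b):
    # Count the positions where two equally long hex strings differ.
    return sum(1 for x, y in zip(a, b) if x != y)

lower = f("cat")
upper = f("Cat")
print(lower)
print(upper)
print(differing_hex_digits(lower, upper), "of", len(lower), "hex digits differ")

# The hex string is just a (very large) number, so it can be compared to a target.
print(int(lower, 16))
```

Running it twice gives the same output both times: the function is deterministic, it only looks random.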
This property of hash functions means that the mining process is actually a probability problem. Miners try different inputs until they find a valid solution for the inequation. It makes no sense to compute the hash, check how close you were to solving the inequation, and then make little tweaks to the input expecting an output similar to the last one computed.
Since this is a probability problem, we can expect that after some amount of attempts, a valid input will be found. When this input is found, the miner gains the right of appending the new block of transactions to the ledger.
The hard-working miner will then broadcast the new block to the nodes connected to it. The nodes will verify whether the input provided actually solves the inequation. If the solution is valid, they will broadcast the block forward; otherwise, the block gets discarded. Notice that this is very similar to the “transaction flooding” technique described earlier.
We call the solution to this inequation a “proof-of-work”. A cool insight about that process is its asymmetry between finding a valid solution and verifying its validity. The first is really hard and expensive, like trying to find a needle in a haystack, and the second one is as simple as checking if a number is lower than another. This asymmetry enables Bitcoin to operate without a trusted central authority because the nodes can easily verify the work made by the miner, instead of trusting that the proof-of-work is valid.
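The asymmetry can be sketched in a few lines of Python. This is a toy version under stated assumptions: the block is just a string, the part of the input that the miner varies is a counter (conventionally called a nonce), and the target is chosen to be very easy so the loop finishes quickly. Bitcoin hashes a block header with SHA-256 applied twice; the header layout is left out here.

```python
import hashlib

def hash_value(block_data, nonce):
    # Toy input: block data plus a counter. SHA-256 is applied twice, as Bitcoin does.
    payload = f"{block_data}|{nonce}".encode("utf-8")
    digest = hashlib.sha256(hashlib.sha256(payload).digest()).digest()
    return int.from_bytes(digest, "big")

def mine(block_data, target):
    """Try nonces one by one until value_computed_by_miner < target_set."""
    nonce = 0
    while hash_value(block_data, nonce) >= target:
        nonce += 1
    return nonce, nonce + 1  # the winning nonce and the number of attempts

def verify(block_data, nonce, target):
    """Checking a claimed solution takes a single hash and one comparison."""
    return hash_value(block_data, nonce) < target

# Hypothetical easy target: roughly 1 in 4096 hashes falls below it.
TARGET = 2**256 >> 12
DATA = "block with Alice's transaction"

nonce, attempts = mine(DATA, TARGET)
print("attempts needed to mine:", attempts)
print("hashes needed to verify: 1 ->", verify(DATA, nonce, TARGET))
```

The miner has to grind through many attempts, while any node can check the result with one call to verify.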
As more miners join the network, you could expect that a valid solution would be found quicker, just like 10 people flipping coins have a greater chance of getting a “head” than one person flipping a coin. But Bitcoin actually checks the time interval between blocks produced by miners and adjusts the difficulty of the inequation accordingly, in order to keep the average time interval between blocks constant.
The difficulty adjustment idea was one of the great breakthroughs of Satoshi. If it were not used in the mining process, new coins would be issued faster as more miners join the network and as hardware gets better. Having a constant average time interval between blocks is what enables bitcoin issuance to follow a predictable schedule.
In order to correctly set the difficulty of the mining problem, the software compares the time it took to mine the last 2016 blocks to the expected time (about two weeks, at the 10-minute block interval Bitcoin aims for). If those blocks were found too quickly, the difficulty of the mining problem increases. Otherwise, if the blocks were found too slowly, the difficulty of the mining problem decreases.
Adjustments to the difficulty can be done by tweaking the target_set part of the inequation. If the difficulty must increase, we lower the target_set to a smaller number, therefore the probability of finding a valid solution decreases. If the difficulty must decrease, we increase the target_set to a bigger number, therefore the probability of finding a valid solution increases.
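As a sketch of how the target could be rescaled, the Python snippet below multiplies the old target by the ratio between the actual and the expected time for 2016 blocks. The 10-minute block interval and the limit of a factor of 4 per adjustment match Bitcoin’s rules as far as I know, but the function and constant names are made up here, and details such as the maximum allowed target are left out.

```python
BLOCKS_PER_PERIOD = 2016
EXPECTED_SECONDS = BLOCKS_PER_PERIOD * 10 * 60  # 2016 blocks at 10 minutes each

def retarget(old_target, actual_seconds):
    """Scale target_set by how fast or slow the last 2016 blocks were mined."""
    # Limit a single adjustment to a factor of 4 in either direction.
    clamped = max(EXPECTED_SECONDS // 4, min(actual_seconds, EXPECTED_SECONDS * 4))
    # Blocks found too quickly -> smaller target -> harder problem.
    # Blocks found too slowly  -> bigger target  -> easier problem.
    return old_target * clamped // EXPECTED_SECONDS

old_target = 1 << 220  # hypothetical starting target
print(retarget(old_target, EXPECTED_SECONDS) == old_target)      # on schedule: unchanged
print(retarget(old_target, EXPECTED_SECONDS // 2) < old_target)  # too fast: target shrinks
print(retarget(old_target, EXPECTED_SECONDS * 2) > old_target)   # too slow: target grows
```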
To visualize this concept, I’ve built a minigame. Imagine you’re in a bar with friends. You’re shooting darts at a target, but you’re all completely drunk. So drunk that only luck counts when you’re trying to shoot the dart at the target, no skills involved, only pure luck. You can change the number of friends shooting darts simultaneously and you can also change the size of the target. This way, you can manually play around with difficulty adjustment. Notice what happens to the number of attempts needed to hit the target as you adjust the difficulty.
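If you prefer code to darts, the same experiment can be simulated in Python. This is only a sketch: target_fraction (the chance that a single dart hits) and friends (how many people throw per round) are names invented for this example, and the random seed is fixed so that runs are repeatable.

```python
import random

def rounds_until_hit(rng, target_fraction, friends):
    """Count rounds until at least one friend hits the target by pure luck."""
    rounds = 0
    while True:
        rounds += 1
        # Every friend throws once per round and hits with probability target_fraction.
        if any(rng.random() < target_fraction for _ in range(friends)):
            return rounds

def average_rounds(target_fraction, friends, trials=5000, seed=1):
    rng = random.Random(seed)
    total = sum(rounds_until_hit(rng, target_fraction, friends) for _ in range(trials))
    return total / trials

print("1 friend, big target:     ", average_rounds(0.05, 1))
print("10 friends, big target:   ", average_rounds(0.05, 10))
print("10 friends, small target: ", average_rounds(0.01, 10))
```

More friends means fewer rounds until a hit, and shrinking the target pushes the number of rounds back up. That is the lever the difficulty adjustment pulls.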
Confirmed Transaction
So Alice has created and signed the transaction and broadcast it to the Bitcoin network. After some time, a miner that assembled a block containing Alice’s transaction finds a valid proof-of-work and broadcasts the new block to the Bitcoin network.
Bob’s node will eventually receive the new block with Alice’s transaction that transferred some coins to Bob’s custody. Bob’s node checks the new block’s transactions and sees that there is a transaction associated with Bob’s digital identity. Now the node is aware that Bob’s balance has changed.
Now Bob can also transfer the new coins to someone else, just as Alice did, furthering the chain of custody related to the coins.
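What Bob’s node does with the new block can be sketched in Python as follows. The block and transaction layout here is a made-up simplification (real outputs are locked to keys, not to names like "bob"), so treat it as an illustration of the idea only.

```python
def credits_for(owner, block):
    """Sum the amounts of every output in the block that is assigned to owner."""
    return sum(
        output["amount"]
        for tx in block["transactions"]
        for output in tx["outputs"]
        if output["owner"] == owner
    )

# Hypothetical block: Alice pays 3 to Bob and sends 7 back to herself as change.
new_block = {
    "transactions": [
        {"outputs": [{"owner": "bob", "amount": 3}, {"owner": "alice", "amount": 7}]},
        {"outputs": [{"owner": "carol", "amount": 1}]},
    ]
}

balances = {"bob": 0}
balances["bob"] += credits_for("bob", new_block)
print(balances["bob"])  # 3
```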
Summary
In order to support permissionless, trustless, and uncensorable money for the internet, Bitcoin uses a peer-to-peer network architecture, where each participant is equipotent on the network.
In order to participate in the Bitcoin network, all you have to do is download and run software that can talk in the bitcoin language — aka protocol — on your computer.
We call a computer that is running the Bitcoin software and is connected to other computers that are also running the Bitcoin software a node.
The Bitcoin network uses a technique called “flooding” to transmit data.
Nodes broadcast transactions to connected peers. Peers check the transaction against the rules of the protocol.
Transactions that follow the rules are broadcast forward. Transactions that don’t follow the rules get discarded.
A bitcoin transaction doesn’t move coins around; it only adds an entry to the ledger saying that the bitcoins involved in the transaction now have a new owner.
A broadcast transaction only gets confirmed when a miner appends to the ledger a new block of transactions that contains it.
Adding blocks to the ledger is a very expensive activity, so the miner who did the work and found a valid proof-of-work gets paid with the transaction fees contained in the block plus some new-minted coins.
Finding a valid proof-of-work is hard, but validating the proof provided is quite easy. This asymmetry enables nodes to easily verify the work of the miners.
In order to keep the issuance of new coins on a predictable schedule, the target is raised or lowered in response to miners joining and leaving the network and to better hardware over time.