Reinventing Bitcoin

by Steven J. Owens (unless otherwise attributed)

This is my summary of Michael Nielsen's essay:

http://www.michaelnielsen.org/ddi/how-the-bitcoin-protocol-actually-works/

I wrote this summary to help myself understand Nielsen's essay. While reading his extremely interesting essay, I found myself writing out the sequence of steps as the workflow evolved. This summary has helped, but it's still harder than it needs to be. I think I really need to make some flow diagrams to illustrate it.

Nielse's essay only covers the basic concept of bitcoin, he notes at the end that it doesn't cover how the bitcoin network operates, how servers join and leave the network, or the bitcoin scripting language.

The conceit is that he's explaining bitcoin by reinventing bitcoin, starting with a super simple version, then with each new version adding new features. He calls his fictional bitcoin "infocoin."

Caveat, the whole point of writing this was that I'm not a bitcoin expert, so it's entirely possible I got some things wrong here.

Two other interesting links:

https://www.reddit.com/r/programming/comments/5s7h65/blockchain_for_dummies/https://anders.com/blockchain/

Basic Concepts

You should be able to read up on these in a great many places, so I'm not going to document them here:

For non-technical readers, the really short version is:

Public Key Cryptography

Traditional cryptography is based on using a bit of data to scramble some information, and then using the same bit of data to descramble it. The bit of data is called the secret or key. In modern cryptography it's always a number.

"Public key cryptography" is a shortening of the phrase "public/private key cryptography", because it's based on two numbers, the public key and the private key.

Public key cryptography uses a type of mathematical scrambling that, unlike the math you were taught in high school, cannot be easily run in reverse. If "X + 2 = Y" and you know that X is 3, then Y must equal 5. If you don't know X but know that the result, Y, is 5, then you can run it backward to figure out that X must equal 2.

You can't do this - at least not easily - with the mathematical scrambling used in public key cryptography. This allows you to publish one half of the pair. Then people can use the public key to encrypt a message to you, and only you have the other half of the pair - the private key - to decrypt it.

I'll skip over why, but I'll note that it relies heavily on prime numbers, which are mathematically hard to factor out in a multiplication/division sort of situation. This is why you hear talk about "key size" and "computers getting faster", and why you sometimes (rarely) hear about some new math - some much faster way to factor prime numbers - that threatens to abruptly undermine the effectiveness of widely-used encryption.

Digital Signatures

If you think about it, you can see this also gives you a way to prove that you are the person who controls the private key. Somebody encrypts a secret message with your public key and sends it to you. You decrypt it and tell them what was in the message, thus proving that you know the private key that matches that public key.

A more complicated version of this is what we call a digital signature. It's a way to take a message and generate a number. Then you publish the message and number. People can use the published public key to run a calculation that proves that number could only have been generated using a) the private key that matches the public key and b) an exact copy of that message.

This proves that a) the person who controlled the public key generated that signature and b) the message contents were not altered by somebody in between.

Hashing

Typically for digital signatures, messages are hashed, then the hash is digitally signed.

A hash is sort of like encryption; it's a way of running some data (numbers) through a mathematical formula and reducing it to a much shorter number, a sort of numeric "fingerprint". It's like taking a series of numbers and adding them up. Then somebody else hands you another series of numbers and if they all add up to the same amount, then you know that it's likely that both strings of numbers are the same. But of course simply adding isn't quite effective enough, it's easy to just add or change a few numbers to make sure they both add up to the same. So hashing uses more complicated math, which I won't go into here. A "cryptographically strong" hash uses math which makes it particularly hard to tweak the numbers.

So using a signed hash is:

a) faster to transmit, since the hash is smaller; hash algorithms always produce a number that is so many digits long, and usually that's a fairly small number of digits (32, 64, etc) compared to the size of the document that's fed into the hashing algorithm.

b) better interoperability, since any data (a letter, a digital photo, a spreadsheet or a database, etc) can be reduced to a hash,

c) easier, since without the hash to reduce the size, the message might need to be chunked up and each chunk signed, then you have to manage issues of completeness, sequence, etc.

Let's Start Reinventing

Okay, with those basic concepts out of the way, let's go back to Nielsen's very interesting article and start by looking at the simplest workflow:

A Signed Letter of Intent (an IOU)

Workflow:

  1. Sender composes a transaction message: "I, Alice, am giving Bob one infocoin"
  2. Sender digitally signs the message.
  3. Sender sends the signed message to recipient.
  4. Recipient sends acknowledgement of receipt.

Serial Numbers and a Bank


Problem: Same message can be resent 10 times, is this 10 coins?
Solution: Assign a unique serial number to each message.

Create a central server (a bank) to dole out unique serial numbers.

Note: Of course these aren't actually serial numbers, in the sense that they are numbers in a series: 1, 2, 3, 4, etc. Instead, they're unique identifier numbers that are generated cryptographically. The very name "serial number" is because serial numbers use the sequence (first, second, third, etc) to make sure each transaction gets a unique number to identify it. In the crypto world the numbers actually aren't sequential, because that would make them predictable, and that would mean bad guys could use that predictability to make fake messages. The programming and crypto worlds have different names for these numbers, "uuid" (unique user id), guid (generic unique id), "swiss number", etc. I'm just going to keep calling them serial numbers to make this easier to read, and trust you to remember that they're not sequential and predictable.

New workflow:

  1. Sender gets serial number from bank server.
  2. Sender composes transaction message (which includes the serial number).
  3. Sender digitally signs message.
  4. Sender sends the signed message to recipient.
  5. Recipient contacts the bank.
  6. Bank confirms that sender owns the serial number.
  7. Bank confirms that nobody else has claimed receipt of the serial number.
  8. Recipient sends acknowledgement of receipt.

Decentralizing The Bank

A centralized bank is a very common thing, but in the crypto world centralization raises all sorts of concerns. Therefore, we decentralize. Everyone using infocoin keeps a complete record of who owns which serial number.

We're not completely decentralizing - that would require every single user to keep a full copy of all user transactions. Instead, we set up a whole bunch of servers and they all agree to talk to each other and keep each other up to date.

I've noticed this seems to be a bit of a confusion point for a lot of people. A lot of people seem to think that bitcoin is anonymous - that the "crypto" part of cryptocurrency means that the transactions are kept private. It does not work like that. The decentralized design requires that everybody can see all of the transactions. Which means that the idea of bitcoins being untraceable is silly.

Of course, the connection between any given person and their bitcoin account is kept private - but that's relatively easy to crack with either a subpoena or surveillance.


Transaction: a signed IOU/serial number message.
The Blockchain: a "shared public ledger", showing all bitcoin transactions.

Note: as we'll see later, ultimately the ledger - the Blockchain - shows not just who currently owns which serial number, but ALL transactions, from which you reconstruct who currently owns what.

New workflow:

  1. Sender cryptographically generates their own transaction serial number.
  2. Sender composes transaction with serial number.
  3. Sender digitally signs transaction.
  4. Sender sends transaction to recipient.
  5. Recipient confirms transaction owner in their copy of the blockchain.
  6. Recipient broadcasts sender's transaction & recipient's receipt to entire network.
  7. Everyone else in network updates their copy of the blockchain.

Race Conditions


Problem: race condition; what if sender sends same transaction to two recipients simultaneously?

Both recipients check their blockchain copy, see the serial number is available, broadcast claim of receipt. Some people get one receipt first, other people get the other one first.


Solution: recipient broadcasts request for confirmation check to entire network before claiming receipt.

New workflow:

  1. Sender generates their own transaction serial number.
  2. Sender composes transaction with serial number.
  3. Sender digitally signs transaction.
  4. Sender sends transaction to recipient.
  5. Recipient confirms transaction in their copy of the blockchain.
  6. Recipient broadcasts sender's transaction to entire network.
  7. Other network members confirm transaction in their copy of blockchain.
  8. Other network members respond with confirms.
  9. Once enough network members confirm, recipient broadcasts claim of receipt.
  10. Everyone else in network updates their copy of the blockchain.

This solution has a variety of problems:

How many is "enough"?

How do we figure that out?

Proof of Work


Problem: Scammers can set up a zillion zombie network members to fake "enough" confirms, etc.
Solution: Make it prohibitively expensive to confirm.

1) Define a computationally expensive mathematical process for validating confirmation - that's the "proof of worth".

2) Reward network members for validating - and now we have bitcoin mining.

New workflow:

  1. Sender generates their own transaction serial number.
  2. Sender composes transaction with serial number.
  3. Sender digitally signs transaction.
  4. Sender broadcasts transaction to entire network.
  5. Other network members add transaction to their queue of pending transactions.
  6. Other network members race to provide proof of work first:
  7. Member confirms transaction in their copy of blockchain.
  8. Member calculates validation proof-of-work on transaction.
  9. Member broadcasts validation proof-of-work result.
  10. Other network members verify proof-of-work.
  11. Network members update their copy of blockchain with transaction & validation.
  12. Network members also update their copy of blockchain with reward for validator.

Bitcoin uses a proof-of-work based on the SHA-256 hash function.

Inputs to the hash function are a) the transaction and b) a "nonce".

Note: nonce is a crypto term for some value to be used a single time, it can be anything but in bitcoin it's just a number.

The test is to try running the hash function with different nonces, until you find a nonce that generates a hash value that meets an arbitrary mathematical requirement:

"The Bitcoin proof-of-work puzzle requires the hash of a block's header to be lower than or equal to a number known as The Target. This target is automatically adjusted to ensure that a Bitcoin block takes, on average, about ten minutes to validate."

The nonce can be a randomly generated number, but it looks like it's really just a sequential number - the "miners" keep adding 1 to the number and trying again, until they get the right result.

Mining

To incentivize people to do the validating, bitcoin rewards the first person to publish a given proof-of-work. This reward is where we get the "bitcoin mining" you keep hearing about in the news.

By doing the validation proof-of-work, the validator earns some bitcoins.

Note: It's not clear to me yet, but it looks like those reward bitcoins are just created out of thin air - when everybody records a proof-of-work, they also create an entry with a new bitcoin for the proof-of-work creator.

The reward was initially 50 bitcoins per proof, but for every 210,000 validated proof-of-works, the reward is halved. This should happen about once every 4 years. By 2140 it will drop below the smallest bitcoin unit (a 10^-8 bitcoin, aka a satoshi) and future rewards will use a transaction fee built into the message. I guess that means that the transaction sender has to pay for it.

Here Nielsen's essay gets a bit into the big picture and how this whole scheme limits the amount of damage scammers can do. A key point: your percentage chance of "winning" and finding a coin is about equal to what percentage of the computing power devoted to mining is under your control:

"So, for instance, if a miner controls one percent of the computing power being used to validate Bitcoin transactions, then they have roughly a one percent chance of winning the competition."
Problem: double-spending is not conclusively addressed by all this. Yet. But first...

Blocks (of queued transactions), Not Transactions

To complicate the explanation a bit further, the bitcoin validation proof-of-work doesn't take a single transaction as the input, but rather a "block", which is the entire queue of pending transactions, about ten minutes worth (remember, the bitcoin Target is automatically adjusted so a proof of work takes about ten minutes. So a block is all the transactions within that period).

Hence why it's called The Blockchain.

I'm guessing this is done to optimize the whole scheme, but it does make the language to explain it all a bit more awkward.

The queue-aka-block is one of the inputs to the hash function. Ideally all members of the network hear all transactions and are working on blocks containing all transactions since the last block.

If not, this creates a situation where different validators are starting with a block containing different transactions, so you'd get different proof-of-work blocks being added to the blockchain.

When a new transaction is published to the network, anybody who is currently working on a proof-of-work that had started with the old version of the queue discards that effort, and starts a new mining attempt with the current queue-aka-block.

For example, let's say there are 85 transactions.

The transactions 1-74 are already validated.

You start validating the block of transactions 75 through 85.

While you're working, 5 new transactions come in, but you ignore them, because you're not done with 75-85 yet.

But before you finish, somebody else publishes a proof of work for transactions 75 through 80.

You throw away your in-progress work and start working on a proof of work for the as-yet-unvalidated transactions, 81 through 90.

But what happens if you finish 75-85 and publishes the proof of work at the exact same time that somebody else publishes for 75-80?

This is where we get into forking, which is section after next.

Order of Transactions

New requirement: All blocks include an address - a cryptographic hash - of the last block validated in the blockchain.

This, by the way, is why they call it the block "chain".

Now remember: when a proof-of-work is published, it's broadcast to the entire bitcoin network. This includes the other validators/miners.

Their ongoing proof-of-work includes a hash address for the used-to-be-most-recent block, and now that is no longer the head block.

So they cancel and discard any ongoing work with that former head block, and start a new proof-of-work, with the a hash address for the new-most-recent block.

Nielsen doesn't say it explicitly, but I'm guessing that the miners also discard any transactions that were dealt with by the new-most-recent block.

New Workflow:

  1. Sender generates their own transaction serial number.
  2. Sender composes transaction with serial number.
  3. Sender digitally signs transaction.
  4. Sender broadcasts transaction to entire network.
  5. Other network members add transaction to their queue of pending transactions.
  6. Other network members race to provide proof of work first:
  7. Member A confirms transaction in their copy of blockchain.
  8. Member A calculates validation proof-of-work on transaction.
  9. Member B confirms transaction in their copy of blockchain.
  10. Member B calculates validation proof-of-work on transaction.
  11. Member A broadcasts validation proof-of-work result first.
  12. Other network members verify proof-of-work
  13. Network members update their copy of blockchain with transaction & validation.
  14. Network members also update their copy of blockchain with reward for validator.
  15. Member B confirms member A proof-of-work.
  16. Member B cancels on-going competing proof-of-work.
  17. Member B skips transactions validated by member A.
  18. Member B starts new proof-of-work with remaining transactions.

Forking


Problem: Sometimes two miners finish at the same time (or so nearly the same time as to cause problems).
Solution: At first, the blockchain will record both blocks as an either/or, a "fork". Then whichever side of the fork gets extended first, wins.

Let's call the two sides of the fork block A and block B. The miners will continue working on the next proof-of-work. Since each block has an address for the preceding block, the miner who did block A can only win if they publish a proof-of-work that builds on block A, and same for the miner who did block B.

Note, it doesn't need to be miner A that publishes the validation that builds on block A, they just need somebody to do a validation that builds on block A, before somebody does a validation that builds on block B.

Since the proof-of-work takes a random, but on average significant amount of time, it's highly unlikely that A+1 and B+1 will come in at the same time (or close enough to the same time). One will be finished and broadcast significantly earlier, at which point the losing miner will give up and start building on the winner.

At Least 6 Confirmations

In theory the race condition could happen several times in a row, but it's not likely to happen often. Regardless, sooner or later one will pull ahead and the fork is resolved.

In bitcoin, a transaction is not really confirmed until it's both confirmed and in the longest fork, and 5 more blocks have followed it ('has "6 more confirmations"').

New Workflow:

  1. Sender generates their own transaction serial number.
  2. Sender composes transaction with serial number.
  3. Sender digitally signs transaction.
  4. Sender broadcasts transaction to entire network.
  5. Other network members add transaction to their queue of pending transactions.
  6. Other network members race to provide proof of work first:
  7. Member A confirms transaction in their copy of blockchain.
  8. Member A calculates validation proof-of-work on transaction.
  9. Member B confirms transaction in their copy of blockchain.
  10. Member B calculates validation proof-of-work on transaction.
  11. Members A & B broadcasts validation proof-of-work result simultaneously.
  12. Other network members verify both proofs-of-work as a fork.
  13. New transactions are broadcast, confirmed, validated, validations broadcast.
  14. Member A's validation is extended first.
  15. Network members continue blockchain from member B's validation.
  16. Network members also update their copy of blockchain with reward for Member B.
  17. Member B confirms new transaction proof-of-work.
  18. Member B cancels on-going competing proof-of-work.
  19. Member B skips transactions validated by member A.
  20. Member B starts new proof-of-work with remaining transactions.

Double Spending

One obvious way to double-spend is to generate a block that contains both transactions:

One block: #1 Alice Pays Bob and #2 Alice Pays Charlie

...and win the mining competition for that block.

Remember, the odds of winning are about the same as what percentage of all bitcoin-mining resources is under the scammer's control.

Even if the scammer gets lucky, the double-spend in the block will be painfully obvious and be immediately rejected by everybody else in the blockchain.

Another way to double-spend is to broadcast two separate transactions that spend the same bitcoin.

Block A: Alice pays Bob

Block B: Alice pays Charlie

In essence Alice is forking herself, and the blockchain/forking mechanism will automatically catch and reject one of these forks and the transaction it includes.

Another way to double-spend is to use a fake third party:

Block A: #1 Alice pays Bob-who-is-Alice's-Sock-Puppet

Block B: #2 Alice pays Charlie

Alice waits for Charlie's transaction to confirm, which takes 6 more blocks.

Then Alice tries to fork the blockchain and catch up with the fork containing the Charlie transaction.

However, by this point Alice is way behind, and the rest of the bitcoin community is already hard at work on the Charlie side of the fork. Alice needs to control at least 50% of the bitcoin mining to even keep up.

If Alice gets super-lucky and manages, with say 1% of the network's bitcoin mining network, to find 6 extra blocks in a row before the rest of the network finds any blocks, she might get ahead and get control of the block chain. Nielsen calculates the odds of this are 1/(100^6) or 10^-12,

Problems for the Author

At the end Nielsen lists several things he doesn't understand, like why we need the elaborate bitcoin scheme instead of just using a two-phase commit.

Bitcoin

This part of Nielsen's essay fills in some additional details about how the real bitcoin works. This includes some screen shots of bitcoin apps and discusses some pragmatic details about how people put bitcoin into practice.

Your bitcoin "address" is a hash based on a public/private key, usually generated by your bitcoin "wallet" software.

Multbit (note that's not "multibit", it's "multbit")) is a popular bitcoin wallet app.

You send your bitcoin address (hash of your wallet-generated public key) to the payer.

The payer generates a transaction. The transaction looks like this:

{"hash":"7c4025...", "ver":1, "vin_sz":1, "vout_sz":1, "lock_time":0,"size":224, 
 "in":[  {"prev_out": {"hash":"2007ae...", "n":0}, "scriptSig":"304502... 042b2d..."}  ],
 "out":[  {"value":"0.31900000", "scriptPubKey":"OP_DUP OP_HASH160 a7db6f OP_EQUALVERIFY OP_CHECKSIG"}  ]
}

Nielsen decomposes this line by line, but basically each transaction has:

{"hash":"7c4025...", "ver":1, "vin_sz":1, "vout_sz":1, a hash that identifies this transaction. (The data here also has the version number, input count and output count.)
"lock_time":0, lock_time is how long before transaction is finalized, usually 0).
"size":224, size of the transaction message bytes
"in":[ { One or more inputs.
"prev_out": {"hash":"2007ae...", "n":0}, prev_out is a pointer to an output from previous transaction, identified by hash and index into the outputs. In this example the transaction identified by hash "20007ae..." and the "n":0 means the first (index 0) output from that transaction.
"scriptSig":"304502... 042b2d..." scriptSig is a signature hash and the public key that generated it (space separated). This is a hash of most of the contents of the transaction message.
} ], Input close
"out":[ { One or more outputs.
"value":"0.31900000", value: the number of bitcoins being spent (but again see "making change" below).
"scriptPubKey":"OP_DUP OP_HASH160 a7db6f OP_EQUALVERIFY OP_CHECKSIG" scriptPubKey: a bit of bitcoin scripting language, in this example it identifies who's being paid (by hash) and some other stuff.
} ] output close
} transaction close

A couple caveats:

First, the scriptSig doesn't cover the entire transaction. Some bits are left out (not the sender, receiver, amount). This is a topic of discussion in the bitcoin community.

Second, the transaction spends all of the bitcoins in the referenced prev_out. The sum of all of the outputs has to equal the sum of all of the inputs. If there's any leftover, it gets paid to whichever miner validates the block that contains the transaction.

There's no facility for spending a chunk of a coin, there has to be a "here's your change" output built into the transaction to cover that. He gets into that next.

An interesting observation: pretty much all bitcoin transactions point back to a previous bitcoin transaction. Ultimately you trace back to either a Genesis Block transaction (the original 50 bitcoins distributed when bitcoin was set up) or a coinbase transaction (coins paid to bitcoin miners).

You can have multiple inputs and multiple outputs, but again any leftover value from the inputs is given to the miner who validates it.

You can also have multiple outputs however, so you can add an extra output to spend the remainder back to a bitcoin address you own.

Transactions with Multiple Inputs and Outputs

Nielsen dissects a longer, more complex example. I'm not going to reproduce it here, my main point in writing this was to make the step-by-step lists of the protocol as it evolves.

Conclusion

Nielsen talks about the confusion over anonymity, and about the parts of bitcoin that he didn't cover in his essay.