[CS198.2x Week 6] Blockchain Technology Summary

In our second course CS198.2x we decided to
take a huge step back to look at some of the big problems in the blockchain
space. Having studied Bitcoin
and cryptocurrencies in CS198.1x, the previous course, now we take a step back
and look at the fundamentals issues. And in the first lecture we talked about
achieving trust without trust and more particularly about distributed systems
and consensus. At its core, consensus is a social issue. How do we reach a majority
opinion? Or how do we achieve a general agreement? Consensus is trivial if I'm
the only one agreeing with myself, but the moment when I try to come to
consensus with some other peers then it gets a bit more tricky. The formal study
of distributed consensus started really with the study of digital avionics when
they decided to put computers in very adversarial environments such as on an
airplane. And the formal study of distributed systems
really started with the famous author Leslie Lamport. And today consensus can be found in jet
planes, spacecraft, distributed lock servers in a cloud compute environment,
and also of course the topic of this course blockchains. At their core
distributed systems can be seen as a collection of nodes and messages.

And we
can look at them on a graph where nodes are the actual nodes, the computers — the
processes themselves — and edges are the messages that have been passed. Some of
the properties of distributed systems are that there are concurrent components,
there is no global clock, and most importantly, there's the potential
failure of individual components. So the question is how to achieve a reliable
system from unreliable parts. We looked at some of the categories of properties
including safety and liveness properties of distributed systems, and also defined
correctness in terms of validity agreement and termination. We then
looked at one of the more famous theorems in distributed systems. And that
is the CAP theorem where you can only have two of the three: consistency,
availability, and partition tolerance. We then ran through a quick demo proof
of how we can only have two of the three and also observed that the CAP theorem
isn't really a black or white trade-off, but actually on a spectrum and
especially in the case of blockchains, the trade-off is generally more between
C and P, and A and P.

We then looked at one of the more famous papers
in distributed systems, which is the Byzantine
generals problem in which we have general surrounding a city separated
geographically and they have to agree on whether to attack or retreat. They all have to agree on doing the same
thing and they can only communicate with each other through messengers — and the
messengers can get lost and also there may or may not be a traitor among them. And it turns out to reach agreement there
is no solution for greater than one thirds traitors. The Byzantine generals problem teaches us
to classify faults in two ways being fail stop faults
in which nodes just simply crash and also Byzantine faults nodes that can lie
to each other or send corrupted messages. And the parallels between the Byzantine generals
problem and that of public blockchains are pretty apparent. Then we looked at some of the more
traditional consensus algorithms where nodes explicitly vote. One of the
earliest was Paxos, which told the story of the Paxon parliament, in which
proposers, acceptors, and learners all tried to agree on which decrees to pass.

And the parallels between the Paxon Parliament
and distributed systems are quite similar. And there's a pretty clear correspondence
between the database systems that we know and the Paxon
Parliament. Paxos is used in
practice and many of these companies listed on screen such as Google,
WANdisco, IBM, … for distributed lock servers or of the like. And other distributed
consensus algorithms of the same class include Raft and PBFT. We then looked at
some of the newer classes of consensus mechanisms, in which nodes consume
resources. And this is the class of consensus mechanism
that proof-of-work falls into, and also proof-of-stake, proof-of-activity,
proof-of-burn, proof-of-space, and proof-of-elapsed-time. We then looked into more detail into
proof of stake where instead of miners like in proof-of-work, we have validators. Then looked at some flavors of proof of steak
algorithms including chain-based proof-of-stake and Byzantine fault tolerant
proof-of-stake, and also some implementations. We then looked at federated consensus in which
we can form quorum slices. And the demo here is that if I'm trying to
agree with my friends on where to get lunch, I don't have to trust
anyone other than my immediate close friends, and if anyone else tries to
join, then we can use the intersection of our friend groups to make sure that we
all agree on where to get lunch.

So
following the lecture on consensus mechanisms we took a deep dive into
proof-of-stake and also the cryptoeconomics that secures it. Breaking down cryptoeconomics there's the
cryptography side which aims to secure the past and also the economics side which
aims to secure the future. In terms
of cryptography, we have cryptographic hash functions which give us awesome
properties in terms of the immutability of the blockchain, and also give us
digital signatures. And for economics we have primitives such
as tokens and privileges. We also looked at the carrot and stick model
as well as the concept of security margins, uncoordinated choice models,
and normal form games.

Then
looking back at proof-of-stake we realized that there's problems such as
nothing at stake, which can be solved using slashing. There's also the
long-range attack and stake grinding. In the next lecture we looked at enterprise
blockchains. In the past there was a rise in interest in
private blockchains, or permissioned ledgers. These blockchains are not open and not trustless
and lack the incentive structures like in the previous
lecture where we talked about cryptoeconomics. In analyzing enterprise blockchains it was
useful to step back and classify blockchains in general.

Specifically, blockchains are databases
at their core, and databases we can see them as central
databases and also distributed databases. A subclass of distributed databases are
distributed ledgers, and distributed ledgers which specifically use the
structure of blocks chained together are called blockchains. We then classified
blockchains in terms of their access control. Namely we have public blockchains like Bitcoin
and Ethereum and permissioned ledgers such as consortium
blockchains and private blockchains. We saw that there's various tech companies
that are involved in the blockchain space and also some varying perspectives
on the whole blockchain movement. We clarified some misconceptions and also
some things to look for when evaluating enterprise blockchain platforms. We then
introduced a very fundamental observation that we carry over in the
next three sections, and that is the scalability trilemma,
in which you can't really have all three of security, decentralization, and
scalability: there are some implicit trade-offs that have to happen. For enterprise blockchain platforms we looked
at Ethereum, Hyperledger, Corda, Chain, Ripple, Rootstock, Quorum, and Cosmos
and Tendermint. We then did a quick
survey of the use cases and industries in which blockchains are currently being
deployed: including auto and mobility, big banks and financial institutions, travel
and tourism, digital identity, healthcare, insurance, the Internet of Things, supply
chains, real estate, and foreign aid.

And then after all those we looked at
some enterprise blockchain use case generalizations in which we saw that in
many use cases blockchains are viable but not really necessary. We also looked
at the term "efficiency" when applied to blockchain and how it may be misleading. In general blockchains are good at solving
coordination failures, horizontal integration, and of course pure decentralization. And it turns out there
are advantages of centralized solutions. So the big question is:
Why is using a blockchain better than using a centralized database? And that's
something for you to decide when you're evaluating your own use cases. We then
did a quick survey of the ICO scene, looked at some of the drawbacks, some
statistics, AML, KYC, and also some perspectives on enterprise blockchains. And these spanned states and global as well,
and these perspectives are both good and bad. Having seen all these use cases for blockchain
we then looked at the next lecture which is called Cryptocurrencies
for the Masses: Scaling Blockchain. And this lecture looked at the fundamental
problem of scaling blockchains. Namely at the current problems that we're
facing today as well as dreams of how we can scale these blockchains
in order to power the use cases that we looked at in the last lecture.

Fundamentally, scaling, we're trying to
increase the TPS so we can either increase the volume of transactions or
decrease block time. And we also looked at the size of the blockchain. Blockchains may be scalable and that they're
super fast and can support lots of TPS but when it comes down to it, storing the
blockchain is also a very big problem as well. We then looked back at the scalability trilemma,
having covered the decentralized portion on the bottom left in
the previous lecture on enterprise blockchains and how there's varying categories
of blockchains based on their access controls, now we look at the
bottom right section of the triangle which is on scalability. In terms of scaling techniques there's the
usual vertical, horizontal, and diagonal scaling
and also blockchain specifically, we have layer 1 and layer 2 scaling and that refers
to whetherr scaling solutions belong on the blockchain or off the blockchain,
off chain. A naive solution
might be to increase the size of blocks, thereby increasing the throughput.

The
problem here is that because the time to broadcast blocks stays fixed, then we
have some nasty side effects which include the blockchain increasing in
size too fast and also naturally occurring forks. A way we could possibly
fix this is perhaps using GHOST. We then
look to increase in the block size. Some pros and cons of course, and also of
decreasing transaction size, in which some implementations might be Segregated
Witness like in Bitcoin, where we move the signatures from within transactions
out to a separate structure called a segregated witness and of course some
pros and cons. We then looked at xk-SNARKs and how we could
use recursive SNARKs also for scalability. All these previous scalability techniques
were all on chain scalability techniques, or layer
1 solutions, and the observation now is that we've run out of variables to play with.

We looked at the size of blocks,
size of transactions, and also block creation rate, but that's pretty much all
we can do on chain. So we need to look at something else. So let's just not use the blockchain. We looked back at the fundamental problem
of Bitcoin transactions and how long their delays are. The idea of payment
channels is to bypass the blockchain and instead create a private payment channel
where Alice and Bob can pay each other off chain. The way this would be
implemented would be Alice and Bob would open a private balance sheet on the
blockchain — so that's one transaction — conduct as many transactions as they
want off chain, and then settle the final balance back onto the blockchain. And
then there's the idea of the Lightning Network, creating a network of payment
channels.

In terms of scalability, Lightning network
can help us achieve up to tens of thousands of transactions per second,
but there are centralization concerns in terms of the Lightning network. Sharding is a horizontal scaling
solution that is traditionally used in databases, and we can now use them on
blockchains. And the idea behind sharding blockchains is
that now we have multiple parallel blockchains. Of course with all this complexity there
are concerns in terms of the varying node categories and also single shard
takeovers and cross-shard communications. There's also the idea of side chains in
which blockchains plug into each other like hub-and-spoke.

There's a particular side chain scalability
solution called Plasma and Blockchain at Berkeley's FourthState Labs
are implementing Plasma on Cosmos and Cosmos allows us to scale diagonally looking
at both the speed that Tendermint allows us as well as the interoperability
of blockchains in the Cosmos network. In summary we reconsidered the role of consensus
mechanisms in scalability and also categorized all of our scaling solutions
that we looked at. Having looked at the problem of scalability,
another big problem in blockchain is that of privacy. Firstly we resolved the question of
whether anonymity is really only good for cryptocurrencies
in terms of buying drugs. The answer is no. We pose some hypothetical situations
in which anonymity is necessary for a properly functioning currency, as well as
some security concerns. We looked at the fundamental architecture
of public blockchains in which everyone stores everything and saw
that blockchains are mostly pseudonymous.

All of our transactions are stored on the
blockchain for everyone to see and there are methods of linking, or
deanonymization, in which people can link your pseudonym to your actual real world
identity. And even if a technology might be anonymous
or pseudonymous, there are some human factors that we have to consider. We noticed a slight paradox
when trying to achieve anonymity and security on the blockchain,
and saw that it all ties back to the scalability trilemma in which we're
sacrificing some decentralization or some scalability for the purpose of
security. We then looked at the process of linking where
we can use a transaction graph, some clustering, as well
as some heuristics including merging of transaction outputs and also the
existence of change addresses in order to identify organizations or individuals.

We then looked at some more
advanced techniques such as taint analysis. We saw that one way to achieve
anonymity was mixing. We made some formal
definitions of anonymity sets and the goals of mixing, as well as some
desirable properties such as being trustless and plausibly deniable. We
sought to increase the size of our anonymity sets to make it harder to link
our pseudonyms to our real world identities. Some types of mixers include
centralized mixers, altcoin exchange mixing, decentralized mixing protocols,
and privacy focus altcoins. And we looked at all of those. We also
looked at decentralized mixing protocols including CoinJoin, CoinShuffle,
JoinMarket, and CoinParty.

We then looked at some privacy focused altcoins
including Dash with its masternode network, Monero which
uses CryptoNote technology for ring signatures to have untraceability and
also unlinkability through one-time addresses. We also took a look at Zcash, which uses zk-SNARKs
as well as a zero knowledge security layer. There are also some advanced anonymity solutions,
one of them being MimbleWimble, which aim to simplify
the Bitcoin protocol to make it much more secure. And in simplifying Bitcoin we see once again
we're making a compromise in terms of the scalability trilemma. And now we've made it all the
way to the end this is the last lecture in the blockchain fundamentals program
in which we talk about a blockchain powered future.

As found on YouTube

You May Also Like