Friday, July 19, 2024

Analysis of Safety vs Resilience - PoS protocols have higher safety than PoW protocols, with the tradeoff of lower resilience

TL;DR:

Lower resilience is the big tradeoff of having higher security.

In the past, Bitcoin (in 2010 and 2013) and PoW Ethereum (twice in 2016) were each successfully 51% attacked twice in order to fix catastrophic bugs and issues. It would be extremely difficult, if not impossible, to accomplish this in reasonable time under PoS Ethereum and most other decentralized PoS blockchains today.

Most PoS blockchains are much more secure than PoW blockchains, but they usually require a chain split or bailout to undo a catastrophe. There are ways to compensate for lack of resilience through offchain governance.

  • Security is the ability to protect against malicious attackers
  • Resilience is the ability to restore the chain after an attack or catastrophic bug

Successful PoW attacks have been common in the wild, but successful PoS attacks are virtually non-existent.

Summary

In general, PoS consensus is much safer than PoW consensus, but PoW is much more resilient during disaster recovery because it's easier for honest miners to re-attack PoW blockchains to revert mistakes. It's built into the PoW protocol.


Security

There are only 2 main categories of exploitable consensus-level blockchain attacks: reorganizations (which include forks and double-spends) and censorship.

  • Liveness threshold: the percent of malicious actors above which censorship can occur
  • Safety threshold: the percent of malicious actors above which reorgs can occur

If the Safety threshold is N%, then the Liveness threshold is (100-N)%. For PoW, these are both 50%. For traditional BFT, safety is 67%, and liveness is 33%. For PoS, safety is at least 67%. The stronger a network is against safety attacks, the weaker it is against liveness attacks. But there are other bigger factors that can increase security overall, like increasing centralization.
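The relationship between the two thresholds can be sketched in a few lines of Python. This is only an illustration of the arithmetic above; the function name `liveness_threshold` is made up for this example and does not come from any client or library.

```python
def liveness_threshold(safety_threshold_pct: float) -> float:
    """Return the liveness threshold (in percent) implied by a safety threshold.

    Assumes the simple complement relationship described in the text:
    liveness = 100 - safety.
    """
    if not 0 <= safety_threshold_pct <= 100:
        raise ValueError("safety threshold must be between 0 and 100 percent")
    return 100.0 - safety_threshold_pct


# The percentages below are the ones quoted in the text.
for name, safety in [("PoW", 50.0), ("Traditional BFT", 67.0)]:
    print(f"{name}: safety {safety}%, liveness {liveness_threshold(safety)}%")
```

Raising the safety threshold in this model always lowers the liveness threshold by the same amount, which is the tradeoff described above.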

Nearly all crypto networks are alike in that they do not allow for bad transactions with invalid signatures. This is true for all consensus protocols (PoW, PoS, PoA, etc). Even if the network is reorged, 51%-attacked, 33%/67% attacked, or censored, an attacker still can't add invalid transactions. The bad transaction/block would be ignored and skipped by the rest of the network because no honest participant (e.g. validator, node, wallet, CEX, RPC, etc.) would ever accept those transactions.

Historically

  • There have been numerous (30+) successful malicious 51% consensus attacks on various PoW blockchains
  • There have been no reported successful PoS consensus attacks

(Please correct me if you know of a PoS one)

Proof of Work (PoW) Blockchains

PoW's heaviest-chain and longest-chain rules are fundamentally vulnerable to 51% attacks by design. The security budget of PoW miners is usually orders of magnitude lower than the native token's market cap, so it doesn't cost anywhere near as much to attack a network as the amount of damage done. Also, miners can often jump from chain to chain as long as the hashing algorithm is similar. Many successful 51% attacks occurred when large mining operations switched from a larger chain to a smaller one in a form of bullying to disrupt the smaller chain.

There are ways to reduce the effectiveness of block-withholding attacks, which are by far the most common type of 51% attack. One method is to use finality checkpoints, under which blocks older than a certain point in history are considered final. But this method uses arbitrary factors and only prevents long-range attacks, not short-to-mid range attacks. In fact, it makes short-range attacks much more dangerous and reduces resilience. If an attacker pulled off a successful short-range attack, it would be impossible to revert the chain after the finality checkpoint. Thus checkpoints do not meaningfully increase security under PoW other than for preventing long-range attacks.
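To make the checkpoint tradeoff concrete, here is a minimal sketch of a checkpoint rule. It is not taken from any real client; the function name `reorg_allowed` and the block heights are hypothetical and chosen only for illustration.

```python
def reorg_allowed(fork_point_height: int, checkpoint_height: int) -> bool:
    """Decide whether a node following a checkpoint rule would accept a reorg.

    A reorg keeps every block up to `fork_point_height` and replaces
    everything after it. Under the checkpoint rule sketched here, it is
    accepted only if the checkpointed block survives.
    """
    return fork_point_height >= checkpoint_height


CHECKPOINT = 990  # hypothetical height of the latest checkpoint

# A long-range attack forking from far back in history is rejected.
print(reorg_allowed(fork_point_height=100, checkpoint_height=CHECKPOINT))  # False

# A short-range reorg near the tip is still possible.
print(reorg_allowed(fork_point_height=995, checkpoint_height=CHECKPOINT))  # True

# An honest attempt to revert an attack block at height 985 is also
# rejected, because the attack is already behind the checkpoint.
print(reorg_allowed(fork_point_height=984, checkpoint_height=CHECKPOINT))  # False
```

The last case is the loss of resilience: the same rule that blocks long-range attackers also blocks honest miners from undoing anything behind the checkpoint.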

PoW has high resilience to attacks because the method to revert a chain is fundamentally built into PoW. All you have to do is beat the attacker at producing the longest or heaviest chain. Thus PoW blockchains are less secure, but they can undo changes more easily. However, most PoW blockchains that get successfully attacked lose their reputation even after the chain is restored.
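A rough sketch of the heaviest-chain rule shows why the revert mechanism comes for free. This is a toy model, not real client code: the `Block` type, the `work` units, and the chain contents are all hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Block:
    block_id: str
    work: int  # hypothetical amount of proof-of-work behind this block


def total_work(chain: List[Block]) -> int:
    """Cumulative work of a chain."""
    return sum(block.work for block in chain)


def choose_chain(current: List[Block], candidate: List[Block]) -> List[Block]:
    """Heaviest-chain rule: switch only if the candidate has strictly more work."""
    return candidate if total_work(candidate) > total_work(current) else current


genesis = Block("genesis", 1)

# Chain containing the attack (or the buggy blocks).
attacked_chain = [genesis, Block("bad-1", 10), Block("bad-2", 10)]

# Honest miners rebuild from before the problem and out-work the bad chain.
recovery_chain = [genesis, Block("fix-1", 10), Block("fix-2", 10), Block("fix-3", 10)]

best = choose_chain(attacked_chain, recovery_chain)
print([block.block_id for block in best])
```

Nodes apply the same rule whether the heavier chain comes from an attacker or from honest miners undoing a mistake, which is why the same property is both the vulnerability and the recovery path.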

Proof of Stake (PoS) Blockchains

There are numerous types of PoS networks, and many of them work very differently for security. Some can be taken over and reorged at 67% of stake. Others like Avalanche's Snowman and Algorand require higher percentages above 80-90% and are extremely hard to attack. PoS has one weak point: It has a lower liveness threshold. If an attacker can reorg a network at 67%, it can censor it at 33%. When censored, depending on the network, it will either stop adding or stop finalizing blocks. For example, Ethereum still produces blocks but stops finalizing them when attackers control 33% of the stake, and it begins an inactivity leak after 4 epochs without finality.
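The liveness weak point can be sketched as a simple supermajority check. This is a simplified model and not Ethereum's actual state transition code: it assumes the 67%/33% figures above are rounded forms of two thirds and one third, and the names `can_finalize`, `inactivity_leak_active`, and the stake numbers are hypothetical.

```python
from fractions import Fraction

# Assumption: finality needs at least two thirds of total stake to participate.
SUPERMAJORITY = Fraction(2, 3)

# From the text: the inactivity leak begins after 4 epochs without finality.
LEAK_AFTER_EPOCHS = 4


def can_finalize(participating_stake: int, total_stake: int) -> bool:
    """True if enough stake is voting for the chain to finalize."""
    if total_stake <= 0:
        raise ValueError("total stake must be positive")
    return Fraction(participating_stake, total_stake) >= SUPERMAJORITY


def inactivity_leak_active(epochs_without_finality: int) -> bool:
    """True once the chain has gone more than LEAK_AFTER_EPOCHS epochs without finality."""
    return epochs_without_finality > LEAK_AFTER_EPOCHS


total = 100  # hypothetical total stake units

# Attackers withhold 30 units: the remaining 70 can still finalize.
print(can_finalize(participating_stake=70, total_stake=total))  # True

# Attackers withhold 40 units: blocks may continue, but finality stalls.
print(can_finalize(participating_stake=60, total_stake=total))  # False
```

In this model the attacker never needs to forge anything; withholding votes from a little over one third of the stake is enough to stall finality.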

PoS attacks are very difficult because acquiring enough stake is often orders of magnitude more expensive than acquiring enough mining power to attack a PoW network. And even if 51% of the staked amount were obtained, it's very unlikely that a PoS attacker would attack a network in which it holds that much stake, since it would be attacking itself. The only realistic vectors of attack for PoS networks are to exploit staking pools and client bugs.

PoW vs PoS

Bitcoin was reorged once in 2010 and once in 2013. Ethereum was reorged twice in 2016. Unlike the malicious attacks, which are common throughout PoW blockchains, these 4 reorgs were done to fix bugs.

Under PoW, it was really easy to gather the top miners (fewer than 5) and convince them to attack and reorg the network. It only took hours to fix the chain, not days or weeks.

This short turnaround time would be virtually impossible under a decentralized PoS blockchain. Most PoS blockchains have deterministic finality after a fixed (sometimes arbitrary) number of seconds or blocks. By protocol, they cannot reorg past finality, so the community basically would have to collectively agree to split the chain, or bail out the network.

Slashing on Ethereum

If the current version of PoS Ethereum were to hit a bug today and erroneously finalize a block past an epoch, it would be catastrophic. There would be no way to revert that block without completely splitting the chain, or slashing the majority of PoS stakeholders. Those validators would lose everything.

This is mostly an Ethereum issue because Ethereum is one of the few blockchains with strict slashing rules. In order to revert the chain after finality, the majority of validators would be slashed. In order to split the chain, all validator and node developer clients would need to release an update, and the whole community and all centralized exchanges would need to agree to support the new chain. Instead of only taking a few hours to revert the chain like under PoW, it would likely take weeks. Ethereum has at least 10 different client developer teams, each making their own clients. Ethereum updates often take quarters and require testing through multiple testnets.

Given that Ethereum has 10 different clients and multiple testnets, it's extremely unlikely that the majority of clients would commit the same error on mainnet. But it isn't impossible, and it only takes one mistake to result in a mass slashing event. Ethereum has lost finality before due to a bug in May 2023, and there have been catastrophic bugs that were fortunately discovered on testnets. I wouldn't expect it to happen on mainnet within a decade, but there is a decent chance of such a catastrophic bug happening within a human lifetime.

One easy way to fix this vulnerability is to reduce the slashing penalty.

Other PoS blockchains

Other PoS blockchains without slashing have it easier because they aren't pressured to revert minor mistakes in a short amount of time. Reorging would be embarrassing, but it would be easier for the community to take their time to recover through a hard fork update when there is no pressure of slashing. Nevertheless, reverting past finality is not easy because the community would still have to get 51% of stakers and nearly all node client developers (validators, wallets, nodes, RPCs, CEXs) to agree, develop clients, and then apply updates to those clients.

Centralization

There are some exceptions where PoS blockchains are also resilient.

If you recall from the blockchain trilemma, increasing centralization allows for scalability and security to increase. Blockchains like Solana and BSC can be halted and restored to a previous checkpoint. Thus they are resilient to reorgs and bugs because they are centralized in this aspect.

Most PoA blockchains are also similar in that they can freeze and revert, giving them high security and resilience with the tradeoff of having low decentralization.

