The Practical Guide to Ethereum Rollups Series – III

Ethereum Rollups: Security Problems and Potential Fixes

The last installment of our Practical Guide to Ethereum Rollups Series looked at the various types of rollups, the factors that affect how the industry classifies them, and overall rollup design. This installment continues that exploration by identifying the challenges Ethereum rollups face and their potential solutions.

These problems are either technological or centralization-related. They may affect a system independently or in combination, undermining what is otherwise a highly secure blockchain network. The aim is to map out measures that can help eliminate them.

For context, we group the risks into areas to highlight the problems that need solving and to show how the proposed solutions address both existing and perceived problems. The risk areas are:

  • Engineering: Risks in the technical implementation that may prevent a rollup from becoming a useful solution. Solutions that address this risk area scrutinize the correctness of the protocol, proof systems, client software, and smart contracts. Essentially, they ensure that bugs do not compromise the rollup.
  • Upgradeability: Risks that arise from upgrading any part of the rollup, and the smart contracts in particular, since an upgrade can change how the infrastructure behaves.
  • System Resilience: Risks that weaken a rollup’s ability to withstand or recover from unexpected faults. Such faults may result from censorship, intentional disruption (Byzantine behavior), or distributed system bugs.

Having highlighted the various risk areas that rollups are vulnerable to, let’s move on to potential solutions.

Engineering

As highlighted earlier, engineering risks and their potential solutions cover the protocol specification and the rollup’s application layer via smart contracts. Below is a closer look at the components that make up this risk area.

  • Protocol Specification: A formal document describing the mathematical workings of a rollup. Specification risks can be covered by formally verifying that the specification’s details match the intended functional behavior.
  • Client: Client-related risks affect the implementation of the node client software and may compromise the operation of the rollup. They are mitigated by ensuring the client implementation faithfully translates the protocol specification into software.
  • Prover System: These risks target a rollup’s validity enforcement system, or prover system. They may affect the rollup’s functionality by interrupting the trust-minimized bridging between it and the underlying L1. The risks can be averted by ensuring that the prover system is implemented correctly, so that trust-minimized bridging works with the underlying L1 or any other monolithic L1 the rollup chooses to bridge to.
  • Smart Contracts: Contract risks may affect trust-minimized bridging or the relay of important data from the L1 to the rollup, delaying the rollup’s syncing process. Adequate auditing and recovery measures for unexpected conditions mitigate these risks.

Defense in Depth (DiD)

DiD secures the system through a multi-layered security approach. Two strategies apply it here, Multi-Provers and Multi-Verifiers, and both improve the trust assumptions of the validity enforcement system. They aim to detect faults in the rollup prover system and halt the functionality that relies on it.

  • Multi-Provers: This approach employs two or more distinct proof systems to assert an equivalent rollup output. If the proof systems generate inconsistent results, this signals that at least one of them has a bug. Knowing this, the rollup can halt functionality and schedule an upgrade to rectify the anomaly.
  • Multi-Verifiers: Verifiers are the smart contracts that let a validating bridge perform trust-minimized verification. As with Multi-Provers, two smart contracts, say, one in Solidity and one in Vyper, can implement the validating bridge. The primary smart contract implementing the bridge can double as a checker that asserts the two agree. If it detects a mismatch, it can initiate a recovery path similar to that of Multi-Provers.
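
The cross-checking idea behind both strategies can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `MultiProverGuard`, `prover_a`, and `prover_b` are hypothetical names, the "provers" are plain functions that hash a batch rather than real proof systems, and the bug in `prover_b` is simulated.

```python
import hashlib
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


def prover_a(batch: bytes) -> str:
    # Stand-in for one proof system: derives a "state root" from the batch.
    return hashlib.sha256(batch).hexdigest()


def prover_b(batch: bytes) -> str:
    # Stand-in for a second, independent proof system.
    # Simulated bug (assumption for the demo): empty batches are mishandled.
    if batch == b"":
        return "0" * 64
    return hashlib.sha256(batch).hexdigest()


@dataclass
class MultiProverGuard:
    """Hypothetical guard that accepts an output only if all provers agree."""
    provers: Sequence[Callable[[bytes], str]]
    halted: bool = False

    def check(self, batch: bytes) -> Optional[str]:
        # Once halted, nothing is accepted until an upgrade resets the guard.
        if self.halted:
            return None
        roots = {prover(batch) for prover in self.provers}
        if len(roots) != 1:
            # Disagreement means at least one prover is faulty: halt.
            self.halted = True
            return None
        return roots.pop()


guard = MultiProverGuard([prover_a, prover_b])
accepted = guard.check(b"tx-batch-1")   # both provers agree
rejected = guard.check(b"")             # provers disagree, guard halts
```

After the disagreement, `guard.halted` stays `True` and every later call returns `None`, which mirrors the "halt functionality and schedule an upgrade" path described above.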

Upgradeability of Ethereum Rollups

The risks around upgradeability are the vulnerabilities that target the smart contracts, whether the bridge on L1 and L2 or the verifier on L1, and that influence the operational continuity of a rollup system. Upgradeability mitigations therefore aim to ensure that users are not coerced into accepting updates to the contracts that facilitate functionality.

The requirement is simple: a rollup wishing to initiate an upgrade should give users a guaranteed notice period in which to choose whether to leave or stay. Such a guarantee should ensure:

  • No instant upgrades, security updates included: An upgrade should take effect only after a delay that is long enough for users to process rollup exit requests.
  • Censorship resistance: Users who choose to exit should not be penalized or prevented from leaving through censorship.
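
The delay requirement can be illustrated with a small timelock model in Python. It is a sketch rather than a contract implementation: `UpgradeTimelock`, its fields, and the seven-day `EXIT_WINDOW_SECONDS` value are assumptions chosen for the example, and time is passed in explicitly as an integer number of seconds.

```python
from dataclasses import dataclass
from typing import Optional

# Assumed value for illustration only; a real rollup picks its own window.
EXIT_WINDOW_SECONDS = 7 * 24 * 60 * 60


@dataclass
class UpgradeTimelock:
    """Hypothetical model of an upgrade that cannot take effect instantly."""
    delay: int = EXIT_WINDOW_SECONDS
    active_impl: str = "v1"
    pending_impl: Optional[str] = None
    ready_at: Optional[int] = None

    def schedule(self, new_impl: str, now: int) -> int:
        # Announce the upgrade and start the exit window.
        self.pending_impl = new_impl
        self.ready_at = now + self.delay
        return self.ready_at

    def execute(self, now: int) -> str:
        # The upgrade is applied only after the full window has elapsed.
        if self.pending_impl is None or self.ready_at is None:
            raise RuntimeError("no upgrade scheduled")
        if now < self.ready_at:
            raise RuntimeError("exit window still open")
        self.active_impl = self.pending_impl
        self.pending_impl = None
        self.ready_at = None
        return self.active_impl


timelock = UpgradeTimelock()
ready_at = timelock.schedule("v2", now=0)   # users may exit until ready_at
```

Calling `timelock.execute(now=...)` with any time before `ready_at` raises, so users who disagree with the change have the whole window to withdraw.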

System Resilience 

Resilience refers to a computer system’s ability to adapt to and recover from unexpected challenges on its own. In the case of rollups, we refer to:

  • Operational Resilience
  • Adversarial Resilience

It is important to note that the two types of system resilience mentioned above are related and not mutually exclusive.

Operational Resilience

Operational Resilience depends almost wholly on system liveness. Therefore, a rollup must autonomously detect and resolve liveness issues without trusted intervention. Liveness issues emerge in:

  • Sequencing: Rollup transaction execution may stall if the sequencer is unavailable and there is no backup plan.
  • Proposing: Refers to publishing rollup state roots to the L1. A liveness failure may result if a rollup system allows only a privileged actor to undertake the process and that actor goes offline.

Systems can ensure operational resilience by providing users with an emergency mechanism that bypasses the sequencer and lets them sequence their own transactions.
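
One common shape for such an emergency mechanism is a forced-inclusion queue, sketched below in Python. The names (`ForcedInclusionQueue`, `timeout`, `force_include`) and the timing rule are assumptions made for this example, not a description of any specific rollup’s contracts.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Deque, List, Tuple


@dataclass
class ForcedInclusionQueue:
    """Hypothetical L1-side queue that lets users bypass an offline sequencer."""
    timeout: int  # assumed: seconds the sequencer has to include a queued tx
    pending: Deque[Tuple[int, str]] = field(default_factory=deque)
    included: List[str] = field(default_factory=list)

    def submit(self, tx: str, now: int) -> None:
        # A user posts the transaction directly to the L1 queue.
        self.pending.append((now, tx))

    def sequencer_include(self, count: int) -> None:
        # Normal path: the sequencer drains up to `count` of the oldest entries.
        for _ in range(min(count, len(self.pending))):
            _, tx = self.pending.popleft()
            self.included.append(tx)

    def force_include(self, now: int) -> List[str]:
        # Emergency path: anyone may include entries older than the timeout.
        forced: List[str] = []
        while self.pending and now - self.pending[0][0] >= self.timeout:
            _, tx = self.pending.popleft()
            self.included.append(tx)
            forced.append(tx)
        return forced


queue = ForcedInclusionQueue(timeout=10)
queue.submit("withdraw-alice", now=0)
too_early = queue.force_include(now=9)    # sequencer still has time
forced = queue.force_include(now=10)      # deadline passed, tx is forced in
```

The timeout gives the sequencer a fair chance to behave, while the permissionless `force_include` path means liveness no longer depends on it.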

Adversarial Resilience

Adversarial Resilience refers to the processes that ensure the system has a defense against malicious attacks. Such attacks may target:

  • Rollup Functionality: Malicious attacks that attempt to force acceptance of invalid state changes.
  • Transaction Censorship: Attempts by a malicious actor to prevent users’ transactions from settling.

Conclusion to the Ethereum Rollups Series

This series of articles aimed to provide a comprehensive understanding of Ethereum rollups. It covered key aspects such as the technical foundations of rollup systems, their classification, and their alignment with Ethereum’s scaling goals. It also looked at security risks, potential solutions, and a structured assessment framework.

As the rollup landscape expands, users and developers must effectively navigate and evaluate different solutions. By establishing these foundational insights, this series seeks to contribute to the growth and maturity of Ethereum’s rollup ecosystem and to help stakeholders make informed decisions about their time, resources, and associated risks.

For more information about rollups and an in-depth look at their attempts to scale Ethereum, refer to the original version of this article that was first published here.