Draft: Mina Data Availability Layer

Hi everyone,

I’m Chiro, founder and CTO of Orochi Network, and a grantee of zkIgnite Cohort #2. My zkDatabase project focuses on solving data availability and data correctness. I’m starting this topic to discuss an improvement to off-chain storage at the protocol level. This proposal is still being drafted, so feel free to discuss and contribute your opinions.

Abstract

This proposal introduces the ability to store data on an off-chain layer. The data itself CANNOT be accessed from a zkApp, but its commitment can, so a much larger amount of data can be served while the data layer acts as a source of truth for a zkApp’s UI.
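To make the commitment idea concrete, here is a minimal sketch of the pattern in Python. It is purely illustrative: a real Mina integration would use a Poseidon-based Merkle tree via o1js rather than SHA-256, and all names here are hypothetical.

```python
import hashlib

def h(b: bytes) -> bytes:
    return hashlib.sha256(b).digest()

def merkle_root(leaves):
    """Commit to a list of off-chain blobs with a binary Merkle tree."""
    level = [h(leaf) for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2:            # duplicate the last node on odd-sized levels
            level.append(level[-1])
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]

# The zkApp would store only this 32-byte root on-chain; the blobs
# themselves live in the off-chain data layer.
blobs = [b"record-0", b"record-1", b"record-2", b"record-3"]
root = merkle_root(blobs)
print(root.hex())
```

Any change to any blob changes the root, so the on-chain commitment pins down the entire off-chain dataset without storing it on-chain.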

Motivation

Implement a Data Availability Layer for the Mina Protocol so that all zk applications and L2 solutions can access off-chain data securely. Improving data availability lets people develop more featureful applications.

Prevent fragmentation of data and the overhead of building temporary, short-term solutions.

Objectives

  • Build a consistent solution for all zk applications
  • Provide data commitments compatible with the Kimchi proof system and o1js
  • Allow zkApps and L2s to rent data storage with the MINA token (blobs should be freed/disposed once the rental deposit runs out)
  • Implement data sharding to reduce the average cost per byte
  • Free developers from implementing short-term solutions themselves
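The rental objective above could work roughly as follows: each blob carries a MINA deposit that is drawn down per slot in proportion to its size, and the node disposes of the blob once the deposit is exhausted. The sketch below is a hypothetical model of that accounting, not a specification; the price constant and all names are invented for illustration.

```python
from dataclasses import dataclass

RENT_PER_BYTE_PER_SLOT = 1  # hypothetical price unit

@dataclass
class Blob:
    data: bytes
    balance: int  # remaining rental deposit

    def charge_slot(self) -> bool:
        """Deduct one slot of rent; return False once the blob should be freed."""
        self.balance -= len(self.data) * RENT_PER_BYTE_PER_SLOT
        return self.balance > 0

storage = {"blob-0": Blob(b"x" * 100, balance=250)}

# At each slot the storage node sweeps expired blobs.
for slot in range(5):
    storage = {k: b for k, b in storage.items() if b.charge_slot()}

print(len(storage))  # the 250-unit deposit covers only two 100-unit slots
```

A real design would also need to settle who pays (zkApp vs. user), refund semantics, and how expiry interacts with the on-chain commitment.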

Specification

Parameters

Data type

Data structure

Data validation

Commitment scheme

Network design

API

Full-node Integration

Proof-system Integration

Security Considerations

Trade-offs


Could someone please move this back to MIP? It isn’t a zkApp, since it requires modifications at the protocol and consensus level to go live. @moderators


Hey @chiro-hiro, I think this is an interesting idea! May I suggest the next steps: host a community call on the topic and set up a working group to flesh out the details of this proposal. That working group could clearly define the desired developer experience, identify the specific use-cases this benefits and unlocks, and more!

Happy to be a part of that and help out 🙂


@chiro-hiro zkDatabase does not provide any data availability guarantees. In fact, if used for data availability, the database provider can completely hide block data, in which case it performs inherently worse than archival nodes.

I do believe that it is a great data storage solution, especially since we can have an off-chain tamper-proof DB, but there’s a large difference between data storage and data availability.

Zeko’s litepaper had an overview of the DA options that they investigated a few months ago; see Section 5. Now that Celestia and EigenLayer have launched, and the cryptography and o1js have advanced considerably, I’m hoping we could get a renewed look at the issue from experts.

I had a Twitter post here where Teddy and maht0rz chipped in on the discussion.


Let’s start by outlining what a DA layer must do and what scenarios it must satisfy. From my understanding, a data availability layer’s core feature is that the application state-tree is always available: a security property that gives users confidence that, should any service ever stop functioning, they can always prove custody of their funds.


I completely agree with @teddyjfpender, we have to decide on what we actually want.
So there is this industry-wide term “data availability” floating around, but most people have different understandings of what that actually means. How I think most prominent projects define it is as something like “guaranteed data observability”. That means that if the DA layer publishes a block, firstly every full node can check if all data was submitted (which is kinda trivial in our context), but additionally, every light client has the ability to download only the block header and the subset of the data it wants, plus some additional data, and verify that all data committed to in the block header is actually available. That additional data enables the light client to trustlessly verify that all the data that should be in that block is actually there, and nothing was changed or omitted by the producer. This is mostly done via sampling over some erasure-coded extension of the data. Remember, the light client shouldn’t have to download all the data in that block to ensure its availability.
So what is a light client in this context? Most would think of ordinary users that want to participate in the network somehow, but in our case, light clients actually are the systems that submit data to the DA layer. For example, if a rollup wants to settle on an L1, it has to prove data availability. It does that by executing the verification steps of a light client and attaching a proof of that to the settlement. Basically, we want some sort of DA proof to come along with L1 settlement. That convinces the L1 that the data corresponding to the settled computation is actually available in some external system.
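The sampling argument above can be made quantitative with a small sketch. Assuming rate-1/2 erasure coding, a producer must withhold at least half of the extended data to make the block unrecoverable, so each uniform random chunk query succeeds with probability at most 0.5 when the block is actually unavailable. The Python below (illustrative only, names invented) shows how quickly the chance of missing a withholding attack shrinks with the number of samples.

```python
def miss_probability(withheld_fraction: float, samples: int) -> float:
    """Chance that `samples` uniform random chunk queries all succeed even
    though `withheld_fraction` of the extended data is unavailable."""
    return (1.0 - withheld_fraction) ** samples

# With rate-1/2 erasure coding, an attack requires withholding >= half of
# the extended data, so the light client evaluates against f = 0.5.
for s in (10, 20, 30):
    print(s, miss_probability(0.5, s))
```

At 30 samples the miss probability is below one in a billion, which is why light clients can get strong availability guarantees without downloading the full block.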

This leads us to the second thing we might want: data storage and retrievability. Storage basically gives some assurance that, for a certain time period, all data that was available at some point will also be stored. This makes sure the data will be stored somewhere even if you weren’t online during the window it was available, without relying on archive nodes.
Retrievability, again, is a different thing, and pretty difficult to create guarantees around without economic or social assumptions. It says that anyone has to be able to retrieve the data that was submitted at some point in the past.
It seems that the industry has settled on data availability providing enough guarantees for the time being, and for retrievability and storage, we can safely rely on archive nodes and such. Although I might add that data storage alone doesn’t help much without retrievability. And since retrievability hasn’t been solved on a technological level, storage doesn’t really add any benefits.


I think there are three discussions going on in parallel here:

  1. How can we get a data storage protocol that is Mina-aligned (that’s what the first post in this thread is about)
  2. How can we integrate existing DA layers with Mina (Celestia, Avail, EigenDA). Notably, I believe this requires writing a groth16 verifier in Mina
  3. How can we have a Mina-aligned DA layer instead of using one of these other DA layers (Cardano is going through the same discussion at the moment and they also use Ouroboros so anybody interested in this may want to check out their new paper on this topic https://twitter.com/rom1_pellerin/status/1719318640498241980 )

Thanks to @teddyjfpender @rpanic @SebastienGllmt for your replies. As there is a long thread of discussions on Twitter, mainly from @rpanic, I’ll try to categorise them here using the sub-discussions outlined by @SebastienGllmt. The initial post of this thread was about data storage, not DA, so I won’t continue that below. Also note that @rpanic’s comments on Twitter may not have covered the how, but more the why, with discussions of the pros and cons of the approaches.

Integrate existing DA layers with Mina (Celestia, Avail, EigenDA). Notably, I believe this requires writing a groth16 verifier in Mina.

@rpanic: “tbh, bridging attestations over from celestia through some quorum of validators is a horrible idea. It removes all the properties of why we built DA in the first place resulting in really bad guarantees. But that is something that I find concerning with current DA archs anyways.”

Is there a way to use existing products to achieve this, e.g. Celestia (probably not?), Avail, EigenLayer?

“Afaik, no. They seem to all rely on some sort of state-root attestation on an L1 contract that is done by some quorum of validators (how that works in detail I don’t know). What we can instead do on Mina is prove the consensus of the DA layer itself (like the Mina L1).
This enables us to remove that trust assumption and at the same time reduce settlement cost.
EigenDA does it best, because they provide signatures of all the attesting validators, so that is pretty strong. Still not as strong as proving consensus itself though.
So the problem with integrating existing solutions is: 1. Prove the state inclusion inside Kimchi (most of them are KZG-based => difficult) 2. Have that attested state root coming from the DA solution on the Mina L1 (either they collaborate on that or we bridge it from ETH).”

A Mina-aligned DA layer instead of using one of these other DA layers (Cardano is going through the same discussion at the moment and they also use Ouroboros so anybody interested in this may want to check out their new paper on this topic https://twitter.com/rom1_pellerin/status/1719318640498241980 )

“The DA I specced out and envision allows appchains to verify availability by simply merging in another proof, settlement cost stays constant. This also means that devs mostly don’t have to change any of their existing appchain-design patterns.”

@teddyjfpender: “What @rpanic46 is saying about the transaction cost must not be lost here. Running an app-chain, bespoke computation layer, L2, etc. need not pay more than anyone else to settle a transaction on the L1 ledger. Transactions need not be replayed, only verified, no gas or wasted computation.”

@rpanic: “And we don’t want to add extra cost for proving DA as well. If we follow the traditional architectures that don’t utilize zk, that would be the case, so we can improve that by building a zk-native DA layer.”

What’s the difficulty and limitation of settling on Mina L1, then, and how would you do it?

@rpanic: “The difficulty is not in the settlement itself, but in what you want to settle. Every Mina smart contract is its own mini-rollup in the end. The problem with DA is that we have quite low limits on events and actions. But this is by design, and DA should always be separated imo.
But the external DA layer has to have good guarantees, which DACs don’t have. Therefore we need a strong, well-designed, L1-aligned DA layer that integrates seamlessly into the current DX.”

@teddyjfpender: “Couldn’t agree more with @rpanic46 he speaks eminent sense. What I would like to know more about, and study, is how a DA-layer can be optimized to specific applications & use-cases; gaming and DeFi might be quite different but perhaps there is a root of common requirements.”


Hi @teddyjfpender, I think it’s a good idea to have a community call for detailing this proposal. Each part of the proposal needs different expertise to build out; no doubt we need more experts to join this discussion.


I’d be happy to host that!
