Draft: Mina Data Availability Layer

Hi everyone,

I’m Chiro, founder and CTO of Orochi Network, and a grantee of zkIgnite Cohort #2. My zkDatabase project focuses on solving data availability and data correctness. I’m starting this topic to discuss an improvement to off-chain storage at the protocol level. This proposal is still being drafted, so feel free to discuss and contribute your opinions.

Abstract

This proposal introduces the ability to store data on an off-chain layer. The data itself CANNOT be accessed from a zkApp, but its commitment can, so a much larger amount of data can be served while the data layer acts as a source of truth for a zkApp’s UI.
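To make the commitment idea concrete, here is a minimal sketch of the pattern in Python. It is purely illustrative: a real Mina integration would use a Poseidon-based Merkle tree via o1js rather than SHA-256, and all names here are hypothetical.

```python
import hashlib

def h(b: bytes) -> bytes:
    return hashlib.sha256(b).digest()

def merkle_root(leaves):
    """Commit to a list of off-chain blobs with a binary Merkle tree."""
    level = [h(leaf) for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2:            # duplicate the last node on odd-sized levels
            level.append(level[-1])
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]

# The zkApp would store only this 32-byte root on-chain; the blobs
# themselves live in the off-chain data layer.
blobs = [b"record-0", b"record-1", b"record-2", b"record-3"]
root = merkle_root(blobs)
print(root.hex())
```

Any change to any blob changes the root, so the on-chain commitment pins down the entire off-chain dataset without storing it on-chain.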

Motivation

Implement a Data Availability Layer for the Mina Protocol so that all zk applications and L2 solutions can access off-chain data securely. Improving data availability lets people develop more featureful applications.

Prevent fragmentation of data and the overhead of building temporary, short-term solutions.

Objectives

  • Build a consistent solution for all zk applications
  • Provide data commitments compatible with the Kimchi proof system and o1js
  • Allow zkApps and L2s to rent data storage with the MINA token (blobs should be freed/disposed once the rental deposit runs out)
  • Implement data sharding to reduce the average cost per byte
  • Free developers from implementing short-term solutions themselves
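The rental objective above could work roughly as follows: each blob carries a MINA deposit that is drawn down per slot in proportion to its size, and the node disposes of the blob once the deposit is exhausted. The sketch below is a hypothetical model of that accounting, not a specification; the price constant and all names are invented for illustration.

```python
from dataclasses import dataclass

RENT_PER_BYTE_PER_SLOT = 1  # hypothetical price unit

@dataclass
class Blob:
    data: bytes
    balance: int  # remaining rental deposit

    def charge_slot(self) -> bool:
        """Deduct one slot of rent; return False once the blob should be freed."""
        self.balance -= len(self.data) * RENT_PER_BYTE_PER_SLOT
        return self.balance > 0

storage = {"blob-0": Blob(b"x" * 100, balance=250)}

# At each slot the storage node sweeps expired blobs.
for slot in range(5):
    storage = {k: b for k, b in storage.items() if b.charge_slot()}

print(len(storage))  # the 250-unit deposit covers only two 100-unit slots
```

A real design would also need to settle who pays (zkApp vs. user), refund semantics, and how expiry interacts with the on-chain commitment.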

Specification

Parameters

Data type

Data structure

Data validation

Commitment scheme

Network design

API

Full-node Integration

Proof-system Integration

Security Considerations

Trade-offs


Could someone please move this back to MIP? It isn’t a zkApp, since it requires modifications at the protocol and consensus level to go live. @moderators


Hey @chiro-hiro, I think this is an interesting idea! May I suggest the next steps: host a community call on the topic and set up a working group to flesh out the details of this proposal. That working group could clearly define the desired developer experience, identify the specific use-cases this benefits and unlocks, and more!

Happy to be a part of that and help out 🙂


@chiro-hiro zkDatabase does not provide any data availability guarantees. In fact, if used for data availability, the database provider can completely hide block data, in which case it performs inherently worse than archival nodes.

I do believe that it is a great data storage solution, especially since we can have an off-chain tamper-proof DB, but there’s a large difference between data storage and data availability.

Zeko’s litepaper had an overview of the DA options that they investigated a few months ago; see Section 5. Now that Celestia and EigenLayer have launched, and the cryptography and o1js have advanced considerably, I’m hoping we could get a renewed look at the issue from experts.

I had a Twitter post here where Teddy and maht0rz chipped in on the discussion.


Let’s start by outlining what a DA layer must do and what scenarios it must satisfy. From my understanding, a data availability layer’s core feature is that the application state-tree is always available: a security property that gives users confidence that, should any service ever stop functioning, they can always prove custody of their funds.


I completely agree with @teddyjfpender, we have to decide on what we actually want.
So there is this industry-wide term “data availability” floating around, but most people have different understandings of what that actually means. How I think most prominent projects define it is as something like “guaranteed data observability”. That means that if the DA layer publishes a block, firstly every full node can check if all data was submitted (which is kinda trivial in our context), but additionally, every light client has the ability to download only the block header and the subset of the data it wants, plus some additional data, and verify that all data committed to in the block header is actually available. That additional data enables the light client to trustlessly verify that all the data that should be in that block is actually there, and nothing was changed or omitted by the producer. This is mostly done via sampling over some erasure-coded extension of the data. Remember, the light client shouldn’t have to download all the data in that block to ensure its availability.
So what is a light client in this context? Most would think of ordinary users that want to participate in the network somehow, but in our case, light clients actually are the systems that submit data to the DA layer. For example, if a rollup wants to settle on an L1, it has to prove data availability. It does that by executing the verification steps of a light client and attaching a proof of that to the settlement. Basically, we want some sort of DA proof to come along with L1 settlement. That convinces the L1 that the data corresponding to the settled computation is actually available in some external system.
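The sampling argument above can be made quantitative with a small sketch. Assuming rate-1/2 erasure coding, a producer must withhold at least half of the extended data to make the block unrecoverable, so each uniform random chunk query succeeds with probability at most 0.5 when the block is actually unavailable. The Python below (illustrative only, names invented) shows how quickly the chance of missing a withholding attack shrinks with the number of samples.

```python
def miss_probability(withheld_fraction: float, samples: int) -> float:
    """Chance that `samples` uniform random chunk queries all succeed even
    though `withheld_fraction` of the extended data is unavailable."""
    return (1.0 - withheld_fraction) ** samples

# With rate-1/2 erasure coding, an attack requires withholding >= half of
# the extended data, so the light client evaluates against f = 0.5.
for s in (10, 20, 30):
    print(s, miss_probability(0.5, s))
```

At 30 samples the miss probability is below one in a billion, which is why light clients can get strong availability guarantees without downloading the full block.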

This leads us to the second thing we might want: data storage and retrievability. Storage basically gives some assurance that, for a certain time period, all data that was available at some point will also be stored. This makes sure the data will be stored somewhere even if you weren’t online during the window it was available, without relying on archive nodes.
Retrievability, again, is a different thing, and pretty difficult to create guarantees around without economic or social assumptions. It says that anyone has to be able to retrieve the data that was submitted at some point in the past.
It seems that the industry has settled on data availability providing enough guarantees for the time being, and for retrievability and storage, we can safely rely on archive nodes and such. Although I might add that data storage alone doesn’t help much without retrievability. And since retrievability hasn’t been solved on a technological level, storage doesn’t really add any benefits.


I think there are three discussions going on in parallel here:

  1. How can we get a data storage protocol that is Mina-aligned (that’s what the first post in this thread is about)
  2. How can we integrate existing DA layers with Mina (Celestia, Avail, EigenDA). Notably, I believe this requires writing a groth16 verifier in Mina
  3. How can we have a Mina-aligned DA layer instead of using one of these other DA layers (Cardano is going through the same discussion at the moment and they also use Ouroboros so anybody interested in this may want to check out their new paper on this topic https://twitter.com/rom1_pellerin/status/1719318640498241980 )

Thanks to @teddyjfpender @rpanic @SebastienGllmt for your replies. As there is a long thread of discussions on Twitter, mainly from @rpanic, I’ll try to categorise them here using the sub-discussions outlined by @SebastienGllmt. The initial post of this thread was about data storage, not DA, so I won’t continue that below. Also note that @rpanic’s comments on Twitter may not have covered the how, but more the why, with discussions of the pros and cons of the approaches.

Integrate existing DA layers with Mina (Celestia, Avail, EigenDA). Notably, I believe this requires writing a groth16 verifier in Mina.

@rpanic: “tbh, bridging attestations over from celestia through some quorum of validators is a horrible idea. It removes all the properties of why we built DA in the first place resulting in really bad guarantees. But that is something that I find concerning with current DA archs anyways.”

Is there a way to use existing products to achieve this, e.g. Celestia (probably not?), Avail, EigenLayer?

“Afaik, no. They seem to all rely on some sort of state-root attestation on an L1 contract that is done by some quorum of validators (how that works in detail I don’t know). What we can instead do on Mina is prove the consensus of the DA layer itself (like the Mina L1).
This enables us to remove that trust assumption and at the same time reduce settlement cost.
EigenDA does it best, because they provide signatures of all the attesting validators, so that is pretty strong. Still not as strong as proving consensus itself though.
So the problem with integrating existing solutions is: 1. Prove the state inclusion inside Kimchi (most of them are KZG-based => difficult) 2. Have that attested state root coming from the DA solution on the Mina L1 (either they collaborate on that or we bridge it from ETH).”

A Mina-aligned DA layer instead of using one of these other DA layers (Cardano is going through the same discussion at the moment and they also use Ouroboros so anybody interested in this may want to check out their new paper on this topic https://twitter.com/rom1_pellerin/status/1719318640498241980 )

“The DA I specced out and envision allows appchains to verify availability by simply merging in another proof, settlement cost stays constant. This also means that devs mostly don’t have to change any of their existing appchain-design patterns.”

@teddyjfpender: “What @rpanic46 is saying about the transaction cost must not be lost here. Running an app-chain, bespoke computation layer, L2, etc. need not pay more than anyone else to settle a transaction on the L1 ledger. Transactions need not be replayed, only verified, no gas or wasted computation.”

@rpanic: “And we don’t want to add extra cost for proving DA as well. If we follow the traditional architectures that don’t utilize zk, that would be the case, so we can improve that by building a zk-native DA layer.”

What’s the difficulty and limitation of settling on Mina L1, then, and how would you do it?

@rpanic: “The difficulty is not in the settlement itself, but in what you want to settle. Every Mina smart contract is its own mini-rollup in the end. The problem with DA is that we have quite low limits on events and actions. But this is by design, and DA should always be separated imo.
But the external DA layer has to have good guarantees, which DACs don’t have. Therefore we need a strong, well-designed, L1-aligned DA layer that integrates seamlessly into the current DX.”

@teddyjfpender: “Couldn’t agree more with @rpanic46 he speaks eminent sense. What I would like to know more about, and study, is how a DA-layer can be optimized to specific applications & use-cases; gaming and DeFi might be quite different but perhaps there is a root of common requirements.”


Hi @teddyjfpender, I think it’s a good idea to have a community call for detailing this proposal. Each part of the proposal needs different expertise to build out; no doubt we need more experts to join this discussion.


I’d be happy to host that!
