zkDatabase as a service

This topic is for discussing the proposal submitted by @chiro-hiro & @magestrio.
Please see below for the details of the proposal and discussion.

10th July, 2024
Current status: Under Consideration.
Opened for community discussion on: 10th July.

Title

zkDatabase as a service (zkDatabase aaS)

Project Background

zkDatabase as a Service represents a significant leap forward in ensuring data integrity and ease of use, addressing the critical challenges of slow client-side proving and data retrieval in distributed storage. It provides a platform that offloads complex tasks to the server, ensuring data privacy and verifiability.

This enhances the user experience and accelerates the development of zkApps, paving the way for broader adoption and setting new standards in the industry. zkDatabase empowers developers with efficient, secure, and verifiable data handling, propelling the pace of development and reshaping the future of decentralized applications.

With a focus on standardization and user-friendliness, zkDatabase simplifies data management through a graphical user interface (GUI).

Proposal Overview

Orochi Network is developing an off-chain storage solution for all zkApps on Mina Protocol: zkDatabase as a Service (zkDatabase aaS). This project combines the advantages of ZKP and a centralized NoSQL database. We provide server-side proving, which simplifies the work of zkApps/zkPrograms, while verification remains possible through ZKP, and MongoDB Atlas Clusters provide data availability.

Problems

Problem: Difficulty in tracking, managing and debugging offchain storage data

  • Solution: Offchain Storage Visualization: Track your data and manage accessibility and proof status through an intuitive browser UI. Debugging: Visual representation can help identify and resolve bugs more effectively.

Problem: Lack of a standardized approach for off-chain storage

  • Solution: New Off-Chain Storage Standard: Implemented in a NoSQL style to enhance flexibility and performance.

Problem: Slow proof generation process and inefficient zkApp development due to browser limitations.

  • Solution: Delegated Proof Generation: Enhance proof generation efficiency by server-side proving, starting with 20 parallel provers and scaling up to 128 parallel workers in production. Efficient zkApp Development: Streamline zkApp development by offloading proof generation to our service, eliminating browser inefficiencies.

Problem: Complex and inefficient interaction with stored data

  • Solution: User-Friendly SDK: Interact with your stored data effortlessly using a robust and intuitive SDK.

Problem: No easy way to create an oracle

  • Solution: Oracle Functionality: Easily act as a zkOracle by requesting data and verifying proofs. Anyone can create their own database that functions as a zkOracle service.

Problem: Lack of control over data access and difficulty in sharing off-chain storage across zkApps

  • Solution: Fine-Grained Access Control: Gain precise control over who can access your data and effortlessly share off-chain storage across zkApps.

Impact

zkDatabase offers numerous use cases and applications, encouraging people to use the Mina blockchain to access the features we provide. This means that developers will not only adopt zkDatabase because they are already developing on Mina, but they will also choose Mina specifically for the advanced features that zkDatabase offers.

A simple and intuitive SDK that allows users to maintain their own coding style, rather than forcing them to adhere to a specific zkdb style, will undoubtedly win the hearts of developers.

With our management tool, users will be able to generate their first zk-proof without writing a single line of code. However, they will still need to sign a transaction to fully immerse themselves in the world of zero knowledge on Mina.

Almost all zkApps require off-chain storage, and we provide a solution that can generate proofs for their data. Take, for example, the SocialCap app. It is an excellent demonstration of zkDB usage. Users can create a document containing all necessary information, and a Merkle proof and zk-proof will be generated for this document. They can then verify this proof on-chain, ensuring the integrity and authenticity of the stored data.

Furthermore, users can authorize access to the zkDatabase UI, where they can see their document and verify its proof. Permissions can be set so that only the document owner and specific groups can view the document. Alternatively, the document can be shared and utilized by other projects.
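
The owner/group visibility rule described above could be modeled roughly as follows. This is only an illustrative sketch: the `DocumentPermissions` shape, the `Action` names, and the `isAllowed` helper are hypothetical and are not taken from the zkDatabase SDK.

```typescript
// Hypothetical permission model: owner, one group, and everyone else.
type Action = "read" | "write";

interface DocumentPermissions {
  owner: string;       // id of the document owner
  group: string;       // name of the group the document is shared with
  ownerCan: Action[];  // actions allowed for the owner
  groupCan: Action[];  // actions allowed for members of the group
  othersCan: Action[]; // actions allowed for everyone else
}

interface User {
  id: string;
  groups: string[];
}

// Checks the most specific rule first: owner, then group, then others.
function isAllowed(user: User, perms: DocumentPermissions, action: Action): boolean {
  if (user.id === perms.owner) return perms.ownerCan.includes(action);
  if (user.groups.includes(perms.group)) return perms.groupCan.includes(action);
  return perms.othersCan.includes(action);
}
```

With `othersCan` left empty, only the owner and the chosen group can view the document; adding `"read"` to `othersCan` corresponds to sharing it with other projects.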

Audience

Developers, businesses, and individuals interested in blockchain technology, zkApps, and Web3 solutions.

Architecture & Design

Detailed Design/Architecture

Overview

The zkDBaaS architecture consists of the following key components:

  • zkApp: Supports all zkApps on the Mina Protocol built on o1js.
  • zkDatabase Smart Contract: ZK circuit to perform data proving based on o1js.
  • zkDatabase Client/SDK: An NPM package that provides ODM and API to interact with the zkDatabase instance.
  • zkDatabase as a Service:
    • Data commitment on the Mina blockchain.
    • Proof accumulation and verification.
    • Data storage and retrieval using MongoDB Atlas Clusters.
    • Digital signature authorization.
    • Permission management.
  • MongoDB Replica Set: The underlying database used for storing data and committed metadata, providing high availability and scalability.

Component Interactions

Data Updates

  1. Initiation:
    • The zkApp initiates a data update request through the zkDatabase Client.
  2. Forwarding:
    • The zkDatabase Client forwards the request to zkDatabase as a Service.
  3. Processing by zkDatabase as a Service:
    • Verifies authorization using digital signature authentication.
    • Persists data to MongoDB.
    • Generates a zero-knowledge proof of the updates.
    • Commits the Merkle Tree root to the Mina blockchain through the zkDatabase Smart Contract.
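
The update steps above can be sketched with a small in-memory mock. Everything here is a stand-in chosen for illustration: an Ed25519 signature from Node's `crypto` module replaces Mina's signature scheme, SHA-256 replaces a SNARK-friendly hash such as Poseidon, a `Map` replaces MongoDB, and the `committedRoot` field replaces the on-chain state. The class name `MockZkDatabaseService` is hypothetical, and no zero-knowledge proof is actually generated.

```typescript
import { createHash, generateKeyPairSync, sign, verify, KeyObject } from "node:crypto";

// SHA-256 stands in for the circuit-friendly hash a real service would use.
const sha256 = (data: string): string =>
  createHash("sha256").update(data).digest("hex");

// Merkle root over leaf hashes; an unpaired node is hashed with itself.
function merkleRoot(leaves: string[]): string {
  if (leaves.length === 0) return sha256("");
  let level = leaves;
  while (level.length > 1) {
    const next: string[] = [];
    for (let i = 0; i < level.length; i += 2) {
      next.push(sha256(level[i] + (level[i + 1] ?? level[i])));
    }
    level = next;
  }
  return level[0];
}

interface UpdateRequest {
  key: string;
  value: string;
  signature: Buffer; // signature over `${key}:${value}`
}

class MockZkDatabaseService {
  private owner: KeyObject;
  private store = new Map<string, string>(); // stand-in for MongoDB
  committedRoot: string = merkleRoot([]);    // stand-in for the on-chain root

  constructor(owner: KeyObject) {
    this.owner = owner;
  }

  update(req: UpdateRequest): string {
    // 1. Verify authorization via the digital signature.
    const message = Buffer.from(`${req.key}:${req.value}`);
    if (!verify(null, message, this.owner, req.signature)) {
      throw new Error("unauthorized");
    }
    // 2. Persist the data.
    this.store.set(req.key, req.value);
    // 3. A real service would generate a proof of the update here;
    //    this mock only recomputes the Merkle root over all documents.
    const leaves = [...this.store.entries()]
      .sort(([a], [b]) => (a < b ? -1 : a > b ? 1 : 0))
      .map(([k, v]) => sha256(`${k}:${v}`));
    // 4. "Commit" the new root.
    this.committedRoot = merkleRoot(leaves);
    return this.committedRoot;
  }
}
```

A caller would sign the payload with `sign(null, Buffer.from("user1:alice"), privateKey)` and pass the result as `signature`; a request with a bad signature is rejected before anything is persisted.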

Data Queries

  1. Initiation:
    • The zkApp initiates a data query request through the zkDatabase Client/SDK.
  2. Forwarding:
    • The zkDatabase Client/SDK forwards the request to zkDatabase as a Service.
  3. Processing by zkDatabase as a Service:
    • Retrieves data from MongoDB.
    • Generates a zero-knowledge proof of the query result.
    • Returns the proof to the zkDatabase Client/SDK.
  4. Verification:
    • The zkDatabase Client/SDK verifies the proof and returns the verified data to the zkApp.
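
The verification step on the client can be illustrated with a plain Merkle inclusion proof. This is a simplified sketch, not the zkDatabase protocol: SHA-256 stands in for a circuit-friendly hash, and the helper names `buildLevels`, `prove`, and `verifyInclusion` are hypothetical. In the real system the proof returned by the service would be a zero-knowledge proof rather than a bare sibling path.

```typescript
import { createHash } from "node:crypto";

const hash = (data: string): string =>
  createHash("sha256").update(data).digest("hex");

// One step of a Merkle path: the sibling hash and which side it sits on.
interface PathStep {
  sibling: string;
  siblingOnLeft: boolean;
}

// Builds every level of the tree, leaves first; an unpaired node is hashed with itself.
function buildLevels(leaves: string[]): string[][] {
  const levels: string[][] = [leaves];
  while (levels[levels.length - 1].length > 1) {
    const prev = levels[levels.length - 1];
    const next: string[] = [];
    for (let i = 0; i < prev.length; i += 2) {
      next.push(hash(prev[i] + (prev[i + 1] ?? prev[i])));
    }
    levels.push(next);
  }
  return levels;
}

// Server side: collect the sibling hashes from the leaf up to the root.
function prove(levels: string[][], index: number): PathStep[] {
  const path: PathStep[] = [];
  let idx = index;
  for (let depth = 0; depth < levels.length - 1; depth++) {
    const level = levels[depth];
    const isRight = idx % 2 === 1;
    const siblingIdx = isRight ? idx - 1 : idx + 1;
    path.push({ sibling: level[siblingIdx] ?? level[idx], siblingOnLeft: isRight });
    idx = Math.floor(idx / 2);
  }
  return path;
}

// Client side: recompute the root from the leaf and the path, then compare
// it with the root the client trusts (for example, the one committed on-chain).
function verifyInclusion(leaf: string, path: PathStep[], root: string): boolean {
  let acc = leaf;
  for (const step of path) {
    acc = step.siblingOnLeft ? hash(step.sibling + acc) : hash(acc + step.sibling);
  }
  return acc === root;
}
```

The client needs only the returned document, the path, and the trusted root; it never has to download the other documents.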

This architecture ensures secure, efficient, and scalable interactions between the various components, leveraging zero-knowledge proofs and blockchain technology to maintain data integrity and privacy.

Vision

Phase 2: zkDatabase as a Service (On-going)

  • Objective: Phase 2 aims to lay a strong groundwork for the zkDatabase project by introducing it to the public, testing its core technologies, and fostering community engagement. This phase marks the shift from theoretical development to practical application and user involvement, preparing for real-world implementation and feedback.
  • Key Activities:
    • Merkle Tree Feature: Using Merkle Trees to maintain the accuracy and integrity of data, ensuring it is tamper-proof within the blockchain network.
    • Proof Accumulation: Delving into the accumulation process, this section explains how the Mina Blockchain efficiently processes numerous transactions simultaneously, ensuring quick and secure transaction verification.
    • Client Feature: The library used by end users (zkDatabase SDK) and the zkDatabase GUI Management tool.
    • Digital Signature Authorization: This module specifies the permissions and role of each user. We are considering support for Decentralized Identifiers (DID) and underlying Multi-Party Computation (MPC) keying to grant and revoke permissions.
    • Permission: Permissions within the database system are designed to offer a comprehensive control mechanism over data access and manipulation.
    • Lookup prover: A link between the Merkle tree and B-tree validates data lookup, preventing malicious actions.
    • On-Chain Commitment: After generating a proof, we prove the transformation to the new root on-chain via zkApp.
    • MongoDB Adaptor: This serves as our backend storage, where all data is persisted.
    • Monetization: Monetization functionalities within the database system are tailored to facilitate revenue generation and cost management.
    • zkDatabase Management Tool GUI: Our graphical user interface simplifies the data management process by offering a visual representation of the data, allowing for more efficient manipulation, analysis, and collaboration while ensuring accuracy.
    • zkDatabase Smart Contract: This built-in smart contract stores the root of accumulated proofs, ensuring the integrity and immutability of stored data. Users can interact with it via its public key in zkApp or via zkDatabase Client/SDK.
    • zkDatabase Client/SDK: This is an API that allows you to retrieve off-chain data and manage all requests to the serverless backend.

Phase 3: Full Deployment and Scaling

  • Objective: Attain complete operational readiness for zkDatabase, guaranteeing scalability and accessibility to all users.
  • Expected key activities:
    • Officially introduce zkDatabase to the public, making the platform accessible to all.
    • Expand infrastructure to accommodate increasing user base and transaction volumes.
    • Persist in optimization endeavors to uphold elevated standards of performance and security.

Phase 4: Ecosystem Development

  • Objective: Stimulate expansion and creativity within the zkDatabase ecosystem, nurturing the advancement of decentralized applications and services.
  • Expected key activities:
    • Encourage the creation of zkApps on the zkDatabase platform by providing developer assistance programs and incentives.
    • Broaden the ecosystem through strategic alliances and cooperative ventures.
    • Perpetually innovate by integrating novel technologies and functionalities to ensure the platform remains at the forefront of blockchain advancement.

Phase 5: ZK Modular

  • Objective: Implement a fully decentralized zkDatabase that can be integrated with any blockchain and proof system.
  • Expected key activities:
    • ZK Modular Data Availability Layer.
    • Rewrite in Rust.
    • Build up distributed system for data storage.
    • Implement our aBFT consensus.
    • Support Verkle Tree, KZG commitment scheme.
    • Support Kimchi proof-system natively.
    • Support Halo2 proof-system natively.
    • Support Nova Variants proof-system natively.
    • Integrate with hardware accelerator.
    • Chains integration.

Project Progress Report

Budget & milestones

This section should detail the deliverables at the end of the project, mid-point milestones, timeline and the budget requested. It should explain how the budget will be spent.

  • Deliverables:
    • Feature
      • On-chain commitment. zkDatabase Smart Contract for confirming operation to the chain.
      • Expose more endpoints for fine-grained control over zkDatabase system. Link to endpoints
      • Improve Proof-Service to Enhance Scalability and Caching
      • API module
    • Improvements
      • Improve Client SDK. The current implementation is not suitable. Improvements needed include making the SDK more intuitive and user-friendly and integrating o1js to interact with the API. We aim to develop an ORM. To create an effective SDK, we will take an empirical approach: develop several conceptual implementations that explore different use cases, then iterate on them to identify the most intuitive and user-friendly design. This will help us refine the SDK so that it meets developers' needs and is easy to use.
    • CI/CD
      • Automate publishing and deployment
    • UI/UX
      • Improve the UI design to reflect the new version of zkDB
      • Design Monetization UI
      • Develop Monetization UI and Explorer
    • Infrastructure
      • Run the services in the cloud
  • Mid-Point milestones:
    • Expose more endpoints for fine-grained control over zkDatabase system
    • On-chain commitment
    • Design for Explorer and Monetization
    • Improve Client SDK
  • Project Timeline: 2 months
  • Budget Requested: 30000 Mina
  • Budget Breakdown:
    • Infrastructure (6.5k Mina to run the services for 2+ months)
      • k8s with autoscaling for both the serverless and proof services on general-purpose and CPU-optimized droplets. We plan to generate up to 20 proofs simultaneously
      • DigitalOcean Spaces or NFS to store the circuit cache
      • load balancers
      • MongoDB serverless subscription
    • CI/CD (1.5k)
      • Automate publishing and deployment
    • UI/UX (7k)
      • Design Monetization UI and improvements for Explorer (2k)
      • Develop UI (5k)
    • Architecture (2k)
    • Feature (13k):
      • Support and Develop On-Chain Commitment (4k Mina)
      • Expose more endpoints for fine-grained control over zkDatabase system (1.5k Mina)
      • Scaling Proof-Service (2k Mina)
      • Optimize Caching for Proof-Service to share circuit cache across all nodes (2k)
      • Enhance the Client SDK for simplicity and user-friendliness and add new features. (1.5k Mina)
      • API module. It will be utilized by the Client and Explorer for API calls and other management tasks. (2k Mina)
  • Wallet Address: B62qjnxsQE3nYnLj5GYPqyucKcquFcWabY6fdTycLkNp6j5TpnzhGHV

Team Info

  • Proposer Github: magestrio · GitHub

  • Proposer Experience: I hold a Bachelor’s degree in Computer Science and have 5 years of experience in software development. I spent 2 years as a mobile engineer, collaborating with major tech companies like Blackberry. Following that, I worked as a freelancer for 2 years. For over a year, I have been focused on learning zero-knowledge proofs (ZKP) and cryptography. I have participated in Cohort 0 and Cohort 1 and served as a Mina Navigator throughout Season 1. I am proficient in Java, C++, Kotlin, TypeScript, JavaScript, Python, and Rust.

Project: zkDatabase

GitHub: magestrio · GitHub

Team Member:

Chiro Hiro (Architect)

I’m Chiro, CTO and founder of Orochi Network, an R&D company that applies cryptography and ZKP to solve the problems of Web3. I have 15 years of experience in software development and cybersecurity, and 8 years in cryptography and distributed systems. Co-author of the Orand paper; author of EIP-6366, EIP-6617, libecvrf, zkMemory and zkDatabase. Contributor to halo2, Kimchi and Nova.

Risks & Mitigations

  • What risks or dependencies do you foresee with building this project?
  • What are your mitigations, if any?

Although we have reservations about the security and stability of zkDatabase as a service, we still intend to expand it.

great proposal, glad to see you guys continuously working on zkDb!

few questions from me:

  • Where would the proposed access control permissions for off-chain storage be stored?

  • How would you ensure proof generation is fair and valid if personal data (private input) has to be sent over to the server?

  • What is ‘Proof-service’?

  • Proof accumulation is mentioned as a key activity to be explained but is not actually explained anywhere. Could you shed some light on how this would work/look?

A big part of your budget is allocated to infrastructure for proof generation. It would be great to learn more about the design of such a system and how it would differ from another community project, zkCloudWorker.

  1. Permissions are stored off-chain; we plan to add an on-chain commitment in the future.
  2. Public data is prioritized, aligning with the spirit of Web3, followed by private data.
  3. The Proof Service is responsible for generating and accumulating proofs.
  4. Proof Accumulation is actually a recursive proof. All stored documents will be proved recursively.
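
To illustrate only the shape of that recursion, here is a toy hash-chain accumulator. It is not a zero-knowledge proof and not the Proof Service's implementation: SHA-256 stands in for the proof system, and `Accumulated`, `GENESIS`, `step`, and `accumulate` are hypothetical names. The point it shows is that each step consumes the previous result plus one document, so extending the accumulation never requires reprocessing earlier documents.

```typescript
import { createHash } from "node:crypto";

const hash = (data: string): string =>
  createHash("sha256").update(data).digest("hex");

// State after folding in some prefix of the stored documents.
interface Accumulated {
  count: number;  // how many documents have been folded in
  digest: string; // running commitment to all of them, in order
}

const GENESIS: Accumulated = { count: 0, digest: hash("genesis") };

// One recursion step: previous state + one document -> next state.
// In a recursive proof system, this step would also verify the previous proof.
function step(prev: Accumulated, document: string): Accumulated {
  return { count: prev.count + 1, digest: hash(prev.digest + hash(document)) };
}

// Folds all documents into a single constant-size result.
function accumulate(documents: string[]): Accumulated {
  return documents.reduce(step, GENESIS);
}
```

The result stays the same size no matter how many documents are folded in, which is the property that makes a single accumulated proof cheap to check on-chain.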

zkCloudWorker vs. zkDatabase

  • zkCloudWorker: Focuses on cloud-based development tools.
  • zkDatabase: Concentrates on:
    • Data Availability
    • ZK-data-rollups
    • Provable Data
    • Verifiable Data Pipelines
    • Server-Side Proving

After reading this proposal a few questions arise for me:

Where does data availability come into play here? The database being replicated and having high availability isn’t the same as DA in the blockchain sense. The way you architected the system ensures data correctness by proving each insertion/update against an on-chain Merkle root, but in no way does it guarantee that the data is available to any outside observer. Since the data effectively never has to leave the centralized service, censorship of both incoming data and data retrieval is very much in play. That makes it basically unusable for permissionless zkApps or rollups.
Please correct me if I missed any component that makes sure of that.

Why do you need to prove inclusion in the MongoDB B-tree? For correctness, wouldn’t it be enough to retrieve the data from MongoDB and then prove its inclusion in the Poseidon Merkle tree? The reason I’m asking is that the internal MongoDB hashing function is probably rather SNARK-unfriendly and will therefore be inefficient.

How do you enable zkApps to prove that their data has been included in a certain on-chain commitment that was posted by zkDatabase?

Very excited about the proposal simply because there is not enough developer tooling to build on Mina with off-chain stored data.

Personally, I have built with a Vercel Redis database, and the free tier ran out very quickly for me. The bill would definitely skyrocket once the app went to mainnet.

The key problem for me was that I had to query all the keys of the tree to get the key witness and subsequently modify a contract state.

From my perspective, on-chain storage could be used here to improve interactions with off-chain storage.