
Zero-knowledge Machine Learning on the Mina Protocol

ZKML turns decentralized AI into a practical reality by verifying computations while keeping input data and models private. Learn more about Mina’s upcoming ZKML library.

Introduction

As AI continues to evolve, offering powerful tools for data-driven decision-making, it faces a challenge when integrated with blockchain’s decentralized nature. AI thrives on large datasets and centralization for efficiency, while blockchain prioritizes decentralization, transparency, and security—often at the expense of privacy and scalability. These contrasting principles have made combining the two difficult, but new breakthroughs are emerging that could redefine how AI operates in decentralized environments. 

Problems with AI and Blockchain

Historically, three main blockers have hindered the use of AI in a decentralized environment:

 

  1. Redundant computation across all nodes for consensus.
  2. Keeping input data and models private.
  3. Expensive gas fees for implementing machine learning models on-chain.

 

If an AI model is run on a device alongside a blockchain node, the only way to verify correct computation is to rerun the same process on all nodes. This approach doesn’t scale well, as it requires significant computational resources and compromises privacy by requiring the inputs and model to be shared across the entire network.

 

Enter ZKML: machine learning (ML) combined with zero-knowledge (ZK) proofs, a cryptographic method that allows one party to prove to another that they know a piece of information or have performed a computation correctly, without revealing the information itself or any details about the computation.
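To make "proving you know something without revealing it" concrete, here is a minimal sketch of a classic Schnorr proof of knowledge, made non-interactive with the Fiat-Shamir heuristic. It is not the proof system used by Mina or by any ZKML library. The group parameters P, Q, and G and all function names are illustrative assumptions, and the numbers are far too small for real security.

```python
import hashlib
import secrets

# Toy group (assumption for illustration only): P = 2*Q + 1 with P and Q prime,
# and G = 4 generates the subgroup of order Q. Real systems use much larger groups.
P = 2039
Q = 1019
G = 4


def challenge(*values: int) -> int:
    """Derive the challenge by hashing the public transcript (Fiat-Shamir)."""
    data = "|".join(str(v) for v in values).encode()
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % Q


def prove(secret: int):
    """Prove knowledge of `secret` such that public == G**secret mod P."""
    public = pow(G, secret, P)
    nonce = secrets.randbelow(Q - 1) + 1       # fresh randomness hides the secret
    commitment = pow(G, nonce, P)
    c = challenge(G, public, commitment)
    response = (nonce + c * secret) % Q
    return public, (commitment, response)


def verify(public: int, proof) -> bool:
    """Check the proof using only public values; the secret is never seen."""
    commitment, response = proof
    c = challenge(G, public, commitment)
    return pow(G, response, P) == (commitment * pow(public, c, P)) % P


public, proof = prove(secret=123)
print(verify(public, proof))  # True
```

The verifier only ever handles `public` and `proof`. ZKML applies the same prove-and-verify pattern to a much larger statement: "this output came from running this model on this input."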

 

By leveraging ZK technology, you can turn decentralized AI into a practical reality, enhancing both performance and privacy. This approach allows models to run in a ZK execution environment—such as a ZKVM or a ZK circuit—where computations can be verified more efficiently by all nodes. At the same time, it enables users to utilize data that typically falls under strict privacy regulations, like GDPR, to generate proofs that can be shared with third parties without violating any regulations.

 

Figure: How ZK technology can be leveraged in blockchain, reducing the need for computations by only verifying proofs. Private data inputs enter a ZKML system, and only the proofs are verified.

One of the most popular ZKML examples today is Rocky bot, a trading bot that is trained on historical WETH/USDC trading pair data. The bot runs off-chain, and for each decision it makes, it also generates a ZK proof of the computation process.

 

Using ZK, the trading bot Rocky can log proofs of its training data, algorithm, and model weights to an L1 blockchain, allowing all users to validate that the bot is indeed operating correctly.

Figure: How the Rocky bot leverages ZKML technology. The bot is trained on data and a proof is shared with other nodes. If the proof differs from node to node, the nodes attempt to reach consensus; if the proof is the same, an on-chain transaction is posted.

 

Previously, building such a bot was complex; every user needed copies of the training data, input data, and the appropriate hardware just to verify its functionality. Now, with ZKPs, this process is streamlined—minimal data needs to be shared for consensus, and only the proof requires verification, eliminating the need for redundant computations.

 

This model also paves the way for decentralized identity systems, AI-driven artists, and machine learning models capable of generating games, stories, and more.

Figure: A very simplified view of how an AI such as the Rocky bot can use training data to generate proofs, which are then verified by other nodes in the network.

 

 

ZKML - protecting sensitive data while proving ML accuracy

With ZKML, you can prove that you’ve run private data through a specific public model without requiring the validator (the other party to whom you are proving this) to rerun the entire computation to confirm your claim. This is a very useful tool since ML computations are complex and resource-intensive to verify when executed in a trustless environment.

 

ZKML is especially useful when the input data consists of sensitive information like credit data, iris scans, or fingerprints and the model is public. You can configure ZKML circuits to ensure the input data remains private. 

Figure: How a zero-knowledge proof circuit can keep the input data of a neural network private.

When the ML model itself is a critical asset for a company, the model’s weights can be kept private. For instance, participants submitting models to Kaggle competitions may not want to disclose their model weights. In this scenario, models can be deployed privately while still remaining verifiable, and users interacting with the private model can be assured that they receive the correct computation.

Figure: How a zero-knowledge proof circuit can keep the model weights of a neural network private.

The importance of ZKML is clear—it addresses core issues in today’s machine learning landscape, such as verifying interactions with large language models (LLMs), ensuring they perform as expected, and complying with regulatory standards. Since most ML systems rely on third-party servers, ZKML’s ability to enable trust without revealing sensitive data is critical for future AI deployment in decentralized environments.

Mina and ZKML

Mina is a programmable ZK-based blockchain that provides a seamless experience for developers. Through tools like o1js and Protokit, which utilize a TypeScript framework, the complexities of programming ZK applications (zkApps) have been effectively abstracted, making development more accessible.

 

Now, Mina aims to expand into the field of ZKML, enabling developers to build ZKML applications that can run across multiple platforms. Mina supports recursive ZK proofs, a technique in which a ZK proof attests to the validity of other ZK proofs, so that complex computations can be verified very efficiently.

Figure: A simplified overview of how proof recursion works. Zero-knowledge proofs are combined recursively into a single proof for any amount of data.

Above is a simplified diagram showing how SNARK proofs can be recursively used. After generating proofs for data, you can then generate proofs for these proofs until eventually you reach the point of having one recursive proof that can represent any amount of data.

 

Essentially, recursion means that a proof can be generated not only for a single computation but for a series of computations, all while keeping the details private. This makes it possible to verify that each step of a multi-stage process was done correctly without having to repeat or reveal the details of each step. 
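The shape of that recursive folding can be sketched in a few lines. In the sketch below, SHA-256 digests stand in for proofs. This is an assumption made purely to show the structure: hashes provide neither zero-knowledge nor succinct verification, and this is not how Pickles or any SNARK works internally. The function names are hypothetical.

```python
import hashlib


def leaf_proof(chunk: bytes) -> str:
    """Stand-in for a proof about one piece of data (here: just a hash)."""
    return hashlib.sha256(b"leaf:" + chunk).hexdigest()


def merge_proofs(left: str, right: str) -> str:
    """Stand-in for a recursive step: one 'proof' that covers two others."""
    return hashlib.sha256(f"node:{left}:{right}".encode()).hexdigest()


def fold(proofs: list[str]) -> str:
    """Merge proofs pairwise, level by level, until a single one remains."""
    if not proofs:
        raise ValueError("need at least one proof")
    level = list(proofs)
    while len(level) > 1:
        next_level = []
        for i in range(0, len(level), 2):
            if i + 1 < len(level):
                next_level.append(merge_proofs(level[i], level[i + 1]))
            else:
                next_level.append(level[i])  # odd element is carried up unchanged
        level = next_level
    return level[0]


chunks = [b"batch-1", b"batch-2", b"batch-3", b"batch-4", b"batch-5"]
root = fold([leaf_proof(c) for c in chunks])
print(len(root))  # 64 hex characters, regardless of how many chunks went in
```

The final value has a fixed size no matter how many leaves were folded, which mirrors why one recursive proof can represent any amount of data.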

 

Mina has native recursion implemented with Pickles, which allows you to split the ZKML proof into many smaller proofs per layer instead of generating a single large proof. This is not only good for performance, but also for privacy, as each layer does not need to know about the other, allowing Mina developers to build more complex ZKML models with enhanced privacy features.

 

 


Figure: A very simplified overview of how Mina can use proof recursion to generate a single proof from any number of small Mina ZKML models.

 

ZKML Use Cases Enabled by Mina Recursion

Mina’s native recursive proof support allows developers to take on many interesting use cases. 

 

Composable AI workflows

 

In a composable AI workflow, multiple smaller models are chained together, each performing a specific task. Imagine you want to use one model to read data from a PDF, an LLM to analyze the data, and another model to take some action. These models could all be chained together into a single proof showing that the pipeline was indeed executed correctly.

To put this in real-world terms, consider a lawyer who needs to review numerous contracts stored as PDFs, extract relevant clauses, and then analyze them for compliance with certain legal standards.

 

  • The workflow could start with an Optical Character Recognition (OCR) model that reads and extracts text from the PDF contracts.
  • Next, an LLM could analyze the extracted text to identify key clauses, such as termination terms or payment schedules, and flag any non-compliant terms.
  • Finally, an action model could be employed to automatically generate a report or recommend specific actions, such as suggesting contract revisions or alerting the legal team.

 

Recursion-enhanced privacy

 

Recursive proofs can also enhance privacy: developers can compute different layers on different machines without exposing sensitive data. Each layer of an ML model can run securely and privately while remaining verifiable, which maintains the integrity of the entire model. This is particularly useful for complex ML models like convolutional neural networks (CNNs), which analyze visual data (such as images) by automatically detecting patterns like edges, textures, and objects through layers of filters and transformations. Recursive proofs provide this assurance without compromising the model’s internal privacy or exposing intermediate outputs.

 

Applying this to the real world, hospitals often use CNNs to analyze medical images, such as X-rays or MRIs, to detect diseases like cancer. Since medical data is highly sensitive, privacy-preserving techniques are crucial when using these models across different institutions or cloud services.

 

In this case, recursive proofs can ensure that each layer of the CNN processes the image correctly without exposing the sensitive medical data to external parties. For example:

 

  • Layer 1 of the CNN might detect features like edges or shapes from the MRI.
  • Layer 2 could identify more complex patterns associated with abnormalities or tumors.
  • Layer 3 might further refine the diagnosis or likelihood of disease.
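The layer-by-layer idea can be sketched as a chain of records, one per layer, where each record exposes only commitments to that layer's input and output. This is a simplified stand-in under stated assumptions: salted SHA-256 commitments replace real ZK proofs, nothing here proves that a layer was computed correctly, and the helper names, weights, and salts are hypothetical (real salts would be random).

```python
import hashlib
import json


def commit(values, salt: str) -> str:
    """Salted hash commitment to a list of numbers (hides the values)."""
    payload = json.dumps({"salt": salt, "values": values}).encode()
    return hashlib.sha256(payload).hexdigest()


def relu_layer(weights, inputs):
    """One dense layer followed by ReLU, in plain Python."""
    return [max(0.0, sum(w * x for w, x in zip(row, inputs))) for row in weights]


def run_pipeline(layers, inputs, salts):
    """Run each layer and publish only (input, output) commitments per layer."""
    records, current = [], inputs
    for i, weights in enumerate(layers):
        outputs = relu_layer(weights, current)
        records.append({"input": commit(current, salts[i]),
                        "output": commit(outputs, salts[i + 1])})
        current = outputs
    return current, records


def chain_is_linked(records) -> bool:
    """Each layer must start from exactly what the previous layer produced."""
    return all(a["output"] == b["input"] for a, b in zip(records, records[1:]))


layers = [[[1.0, -1.0], [0.5, 0.5]], [[1.0, 1.0]]]
result, records = run_pipeline(layers, [2.0, 1.0], ["s0", "s1", "s2"])
print(result, chain_is_linked(records))  # [2.5] True
```

In an actual recursive setup, each record would be accompanied by a proof that the layer was evaluated correctly, and those proofs would be merged into one. The sketch only shows how intermediate outputs can stay hidden while the layers remain linked.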

 

How ZKML will be built for Mina

Now, Mina’s capabilities are being extended into the field of ML. This opens up new possibilities for developers, allowing them to build ZK applications that leverage predictive analytics for financial forecasting or utilize image recognition to enhance user security—all while preserving the privacy guarantees of zero-knowledge proofs.

 

A ZKML library for Mina is being developed that leverages ONNX, a standardized language for representing ML models. This library will enable developers to convert ONNX representations of ML models into equivalent ZK circuits, streamlining the integration of machine learning into the Mina ecosystem.

What is ONNX?

 

Think of ONNX (Open Neural Network Exchange) as a language that can represent an ML model in a standardized way. There are many ML frameworks, such as PyTorch, Keras, and TensorFlow. ONNX aims to provide a common language that any of them can use to describe its models.

 

Figure: A simplified illustration of how ONNX can represent an ML model in a standardized way.

Under the hood, ONNX represents machine learning models as a computational graph, where the nodes correspond to operations (OP codes) like Add, Mul, Relu, etc. Each node in the graph represents a specific operation, while the edges represent the flow of data (tensors) between operations.
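As a rough illustration of that graph structure, the sketch below models nodes as operations and named tensors as the edges between them, then evaluates the nodes in order. It deliberately does not use the onnx Python package. The `Node` class, the `OPS` table, and `run_graph` are hypothetical names, tensors are plain Python lists, and the nodes are assumed to be listed in topological order.

```python
from dataclasses import dataclass


@dataclass
class Node:
    op: str            # operation name, e.g. "Add", "Mul", "Relu"
    inputs: list[str]  # names of the tensors flowing into this node
    output: str        # name of the tensor this node produces


# Element-wise implementations of a few operations on flat lists.
OPS = {
    "Add": lambda a, b: [x + y for x, y in zip(a, b)],
    "Mul": lambda a, b: [x * y for x, y in zip(a, b)],
    "Relu": lambda a: [max(0.0, x) for x in a],
}


def run_graph(nodes: list[Node], tensors: dict) -> dict:
    """Evaluate nodes in order; each node reads and writes named tensors."""
    values = dict(tensors)
    for node in nodes:
        if node.op not in OPS:
            raise NotImplementedError(f"unsupported operation: {node.op}")
        args = [values[name] for name in node.inputs]
        values[node.output] = OPS[node.op](*args)
    return values


# Graph for y = Relu(x * w + b)
graph = [
    Node("Mul", ["x", "w"], "xw"),
    Node("Add", ["xw", "b"], "z"),
    Node("Relu", ["z"], "y"),
]
result = run_graph(graph, {"x": [1.0, -2.0], "w": [3.0, 3.0], "b": [0.5, 0.5]})
print(result["y"])  # [3.5, 0.0]
```

A compiler that targets ZK circuits walks the same kind of graph, but instead of computing each operation directly it emits the constraints for it.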

Figure: Machine learning models are represented in ONNX as computational graphs, where the nodes correspond to operations.

 

Understanding the ONNX to Circuit compiler 

 

As discussed above, ONNX represents machine learning models as a computational graph.

 

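def onnx_linear_regressor(X):
    """ONNX-style pseudocode for a linear regression."""
    # onnx.MatMul and onnx.Add stand for the ONNX MatMul and Add operators;
    # coefficients and bias are the trained model parameters.
    return onnx.Add(onnx.MatMul(X, coefficients), bias)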

 

The ZKML library needs to convert the ONNX program above into a ZK circuit, which requires implementing the onnx.Add and onnx.MatMul methods as ZK computations to generate a proof.
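One reason these operations need dedicated implementations is that ZK circuits generally work with integers in a finite field rather than floating-point numbers. The sketch below shows one possible way to evaluate the same linear regression with fixed-point values modulo a prime. The article does not describe how the Mina library encodes numbers, so the field prime, the scale factor, and every function name here are assumptions for illustration, and no proof is generated.

```python
# Toy prime field (assumption): 2**61 - 1 is a Mersenne prime.
P = 2**61 - 1
SCALE = 2**16  # fixed-point scale: 16 fractional bits


def encode(x: float, scale: int = SCALE) -> int:
    """Map a real number to a field element via fixed-point encoding."""
    return round(x * scale) % P


def decode(v: int, scale: int) -> float:
    """Map a field element back to a real number (upper half = negatives)."""
    signed = v - P if v > P // 2 else v
    return signed / scale


def field_add(a: int, b: int) -> int:
    """Field version of Add."""
    return (a + b) % P


def field_dot(xs, ws) -> int:
    """Field version of MatMul for a single row: a dot product mod P."""
    acc = 0
    for x, w in zip(xs, ws):
        acc = (acc + x * w) % P
    return acc


def linear_regressor_in_field(x, coefficients, bias) -> float:
    x_enc = [encode(v) for v in x]
    w_enc = [encode(w) for w in coefficients]
    product = field_dot(x_enc, w_enc)        # product carries scale SCALE**2
    bias_enc = encode(bias, SCALE * SCALE)   # encode bias at the same scale
    return decode(field_add(product, bias_enc), SCALE * SCALE)


print(linear_regressor_in_field([1.5, -2.0], [0.25, 0.5], 0.125))  # -0.5
```

The result matches the floating-point computation 1.5 * 0.25 + (-2.0) * 0.5 + 0.125 because every value in this example is exactly representable at the chosen scale. In general, fixed-point encoding introduces small rounding errors.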

 

ONNX defines over 100 OP codes, which would be a substantial workload to implement. Fortunately, Tract, a Rust-based ONNX inference engine, offers a manageable subset of these OP codes, reducing the implementation requirement to just 20 and significantly simplifying the process.

Figure: How the ZKML compiler takes an ONNX file as input and generates a Kimchi circuit.

 

Using the Mina ZKML library

Figure: A proposed way in which ZKML can be used to generate game content for multiple players based on in-game events.

Developers using the ZKML library will be able to deploy their models within their zkApps. For example, consider a game built around the concept of “the model is the game,” where an autonomous AI operates within a zkApp. Users can interact with this AI to generate game lore while verifying that the AI game master is acting correctly and generating actions in accordance with the established rules.

 

Another example would be to use an AI model as a physics engine or to generate random data for a game based on in-game events.

 

With the Mina ZKML library, developers can access a range of out-of-the-box benefits, simplifying the process of building performant ML-driven zkApps.

  • ONNX Parsing: Load trained ONNX models and convert them for use in zkML applications.
  • Kimchi Proof System: Use Mina’s Kimchi proof system as the backend to generate zk-SNARKs.
  • Recursive Proofs: Support recursive proofs to allow for optimized ZKML workflows.
  • Performance Optimization: Contribute to the backend engine, the Kimchi proof system, by introducing lookup tables and improving basic operations to enhance the overall performance of the zkML library.

 

Want to participate in the development? All work is taking place on GitHub, and anyone is welcome to join the conversation on Discord.

Authored in collaboration with Immanuel Segol.


About Mina Protocol

Mina is the world’s lightest blockchain, powered by participants. Rather than apply brute computing force, Mina uses advanced cryptography and recursive zk-SNARKs to design an entire blockchain that is about 22 KB, the size of a couple of tweets. It is the first layer-1 to enable efficient implementation and easy programmability of zero-knowledge smart contracts (zkApps). With its unique privacy features and ability to connect to any website, Mina is building a private gateway between the real world and crypto, and the secure, democratic future we all deserve.
