MNIST Classification with Orion

Orion is a Cairo-based library built specifically for ValidityML; its purpose is to facilitate verifiable inference. For better performance we will work with an 8-bit quantized model. In this tutorial, you will learn how to train a model on the MNIST dataset with Quantization-Aware Training, how to convert the pre-trained model to Cairo 1, and how to perform verifiable inference with Orion.
What is the MNIST dataset?
The MNIST dataset is an extensive collection of handwritten digits, very popular in the field of image processing. Often, it's used as a reference point for machine learning algorithms. This dataset conveniently comes already partitioned into training and testing sets, a feature we'll delve into later in this tutorial.
The MNIST database comprises a collection of 70,000 images of handwritten digits, ranging from 0 to 9. Each image measures 28 x 28 pixels.

Train the model with Quantization-Aware Training
In this tutorial we will use TensorFlow, a very popular deep learning framework, to train a neural network to recognize MNIST's handwritten digits.
Dataset Preparation
In a notebook, import the required libraries and load the dataset.
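A minimal version of this cell, using the standard Keras dataset loader:

```python
import numpy as np
import tensorflow as tf

# MNIST ships already split into training and testing sets.
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()

print(x_train.shape)  # (60000, 28, 28)
print(x_test.shape)   # (10000, 28, 28)
```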
We have a total of 70,000 grayscale images, each with a dimension of 28 x 28 pixels. 60,000 images are for training and the remaining 10,000 are for testing.
We now need to pre-process our data. For the purposes of this tutorial, and for performance, we'll resize the images to 14 x 14 pixels. You'll see later that the neural network's input layer expects a 1D tensor, so we also need to flatten and normalize our data.
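One way to do this, assuming bilinear resizing (the exact resizing method is an implementation choice):

```python
def preprocess(images):
    # Resize 28 x 28 images to 14 x 14, flatten to 196-value vectors,
    # and scale pixel intensities to [0, 1].
    resized = tf.image.resize(images[..., np.newaxis], [14, 14]).numpy()
    return resized.reshape(-1, 14 * 14).astype('float32') / 255.0

x_train = preprocess(x_train)
x_test = preprocess(x_test)
```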
Model Definition and Training
We will design a straightforward feedforward neural network. Here's the model architecture we'll use:

Input -> FC1 (activation = 'relu') -> FC2 (activation = 'softmax') -> Output
This model is composed of an input layer with a shape of 14*14, followed by two dense layers, each containing 10 neurons. The first dense layer uses a ReLU activation function, while the second employs a softmax activation function. Let's implement this architecture in the code.
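A sketch of the model in Keras:

```python
model = tf.keras.Sequential([
    tf.keras.layers.InputLayer(input_shape=(14 * 14,)),
    tf.keras.layers.Dense(10, activation='relu'),
    tf.keras.layers.Dense(10, activation='softmax'),
])
model.summary()
```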
Now let's train this model on our training data.
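For example (the optimizer and number of epochs are assumptions; tune them as you like):

```python
model.compile(
    optimizer='adam',
    loss='sparse_categorical_crossentropy',
    metrics=['accuracy'],
)
model.fit(x_train, y_train, epochs=5, validation_data=(x_test, y_test))
```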
At this point, we have trained a regular model.
Making the model Quantization Aware
The aim of this tutorial is to guide you through the process of performing verifiable inference with the Orion library. As stated before, Orion exclusively performs inference on 8-bit quantized models. Quantization is typically achieved via one of two methods: Quantization-Aware Training (QAT), or Post-Training Quantization (PTQ), which is applied after the training phase. In this tutorial we will use the QAT method.
Concretely, QAT is a method in which the quantization error is emulated during the training phase itself. The weights and activations of the model are quantized, and this information is used during both the forward and backward passes of training. This allows the model to learn and adapt to the quantization error, so that once the model is fully quantized after training, it has already accounted for the effects of quantization, resulting in improved accuracy.
We will use the TensorFlow Model Optimization Toolkit to fine-tune the pre-trained model for QAT.
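Wrapping the trained model is a one-liner with the toolkit's quantize_model helper:

```python
import tensorflow_model_optimization as tfmot

# Insert fake-quantization nodes into the graph so that weights and
# activations experience quantization error during training.
q_aware_model = tfmot.quantization.keras.quantize_model(model)
```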
We have now created a new model, q_aware_model, which is a quantization-aware version of our original model. Now we can train this model exactly like our original model.
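A short fine-tuning run is usually enough (the epoch count here is an assumption):

```python
q_aware_model.compile(
    optimizer='adam',
    loss='sparse_categorical_crossentropy',
    metrics=['accuracy'],
)
q_aware_model.fit(x_train, y_train, epochs=2, validation_data=(x_test, y_test))
```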
Converting to TFLite Format
Now, we will convert our model to TFLite format, which is a format optimized for on-device machine learning.
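A sketch of the conversion, requesting full int8 quantization of inputs and outputs:

```python
converter = tf.lite.TFLiteConverter.from_keras_model(q_aware_model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.int8
converter.inference_output_type = tf.int8

tflite_model = converter.convert()
with open('q_aware_model.tflite', 'wb') as f:
    f.write(tflite_model)
```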
Testing the Quantized Model
Now that we have trained a quantization-aware model and converted it to the TFLite format, we can perform inference using the TensorFlow Lite interpreter to test it.
We first load the TFLite model and allocate the required tensors. The Interpreter class provides methods for loading a model and running inferences.
Next, we get the details of the input and output tensors. Each tensor in a TensorFlow Lite model has a name, index, shape, data type, and quantization parameters. These can be accessed via the get_input_details and get_output_details methods.
Before performing the inference, we need to quantize the input to match the data type of our model's input tensor, which in our case is int8. Then, we use the set_tensor method to provide the input data to the model and run the inference with the invoke method.
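Putting those three steps together, a sketch of the inference cell:

```python
# Load the TFLite model and allocate its tensors.
interpreter = tf.lite.Interpreter(model_path='q_aware_model.tflite')
interpreter.allocate_tensors()

input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

# Quantize one test image with the input tensor's (scale, zero_point).
scale, zero_point = input_details[0]['quantization']
x = np.expand_dims(x_test[0] / scale + zero_point, axis=0).astype(np.int8)

interpreter.set_tensor(input_details[0]['index'], x)
interpreter.invoke()
output = interpreter.get_tensor(output_details[0]['index'])
print('Predicted digit:', np.argmax(output))
```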
Now, we are going to run the inference for the entire test set.
We initialize an array to store the predictions, then iterate over the test set. Each image is quantized to int8, its dimensions are expanded to match the shape of the model's input tensor, and the predicted digit is recorded.
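A sketch of the loop:

```python
predictions = np.zeros(len(x_test), dtype=np.int32)

for i in range(len(x_test)):
    x = np.expand_dims(x_test[i] / scale + zero_point, axis=0).astype(np.int8)
    interpreter.set_tensor(input_details[0]['index'], x)
    interpreter.invoke()
    predictions[i] = np.argmax(interpreter.get_tensor(output_details[0]['index']))

print('Quantized model accuracy:', np.mean(predictions == y_test))
```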
Finally, we use a function to plot the test images along with their predicted labels. This will give us a visual representation of how well our model is performing.
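One possible plotting helper (purely illustrative):

```python
import matplotlib.pyplot as plt

def plot_predictions(images, labels, n=9):
    # Show the first n test images with their predicted digits.
    plt.figure(figsize=(6, 6))
    for i in range(n):
        plt.subplot(3, 3, i + 1)
        plt.imshow(images[i].reshape(14, 14), cmap='gray')
        plt.title(f'Predicted: {labels[i]}')
        plt.axis('off')
    plt.tight_layout()
    plt.show()

plot_predictions(x_test, predictions)
```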

We have successfully trained a quantization-aware model, converted it to the TFLite format, and performed inference using the TensorFlow Lite interpreter.
Now let's convert the pre-trained model to Cairo, in order to perform verifiable inference with the Orion library.
Convert your model to Cairo
In this section, you will generate Cairo files for each bias and weight of the model.
Create a new Scarb project
Scarb is a Cairo package manager. We will use Scarb to run inference with Orion. You can find everything you need to install Scarb and Cairo in the Scarb documentation.
Let's create a new Scarb project. In your terminal run:
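```bash
scarb new mnist_nn
```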
Replace the content in Scarb.toml file with the following code:
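```toml
[package]
name = "mnist_nn"
version = "0.1.0"

[dependencies]
# Pinning a specific tag or rev is recommended; check the Orion
# repository for the snippet matching your version.
orion = { git = "https://github.com/gizatechxyz/orion.git" }
```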
Finally, place the notebook and q_aware_model.tflite file in the mnist_nn directory. We are now ready to generate Cairo files from the pre-trained model.
Generate Cairo files
In a new notebook cell, load the TFLite model and allocate its tensors.
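```python
interpreter = tf.lite.Interpreter(model_path='q_aware_model.tflite')
interpreter.allocate_tensors()
```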
Then, create a dictionary holding a sample input from the dataset along with all the weights and biases.
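For instance (the tensor indices below are placeholders: print interpreter.get_tensor_details() and adjust them to your converted model):

```python
input_details = interpreter.get_input_details()
scale, zero_point = input_details[0]['quantization']

# Quantize one sample exactly as we did for TFLite inference.
sample_input = (x_test[0] / scale + zero_point).astype(np.int8)

# NOTE: the indices are model-specific; inspect get_tensor_details().
tensors = {
    'input': sample_input,
    'fc1_weights': interpreter.get_tensor(2),
    'fc1_bias': interpreter.get_tensor(3),
    'fc2_weights': interpreter.get_tensor(4),
    'fc2_bias': interpreter.get_tensor(5),
}
```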
Now let's generate Cairo files for each tensor in the object.
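A minimal generator, assuming the Orion API at the time of writing (IntegerTrait::new for signed values and a three-argument TensorTrait::new; adjust the imports and constructor to your Orion version):

```python
import os

os.makedirs('src/generated', exist_ok=True)

def to_cairo(name, tensor):
    # Emit a Cairo function that rebuilds this tensor as a Tensor<i32>.
    lines = [
        'use array::ArrayTrait;',
        'use orion::operators::tensor::core::{TensorTrait, Tensor};',
        'use orion::numbers::signed_integer::{integer_trait::IntegerTrait, i32::i32};',
        '',
        f'fn {name}() -> Tensor<i32> {{',
        '    let mut shape = ArrayTrait::<usize>::new();',
    ]
    for dim in tensor.shape:
        lines.append(f'    shape.append({dim});')
    lines.append('    let mut data = ArrayTrait::<i32>::new();')
    for value in tensor.flatten():
        v = int(value)
        lines.append(f'    data.append(IntegerTrait::new({abs(v)}, {str(v < 0).lower()}));')
    lines += [
        '    TensorTrait::new(shape.span(), data.span(), Option::None(()))',
        '}',
    ]
    with open(f'src/generated/{name}.cairo', 'w') as f:
        f.write('\n'.join(lines))

for name, tensor in tensors.items():
    to_cairo(name, tensor)

# Declare each generated file as a submodule of `generated`.
with open('src/generated.cairo', 'w') as f:
    for name in tensors:
        f.write(f'mod {name};\n')
```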
Your Cairo files are generated in the src/generated directory.
In src/lib.cairo replace the content with the following code:
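```rust
mod generated;
```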
lib.cairo now contains a module declaration referencing another module named generated.
Let's analyze the generated files
Here is a file we generated: fc1_bias.cairo
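Abbreviated, with illustrative values, it looks roughly like this (the exact import paths depend on your Orion version):

```rust
use array::ArrayTrait;
use orion::operators::tensor::core::{TensorTrait, Tensor};
use orion::numbers::signed_integer::{integer_trait::IntegerTrait, i32::i32};

fn fc1_bias() -> Tensor<i32> {
    let mut shape = ArrayTrait::<usize>::new();
    shape.append(10);

    let mut data = ArrayTrait::<i32>::new();
    data.append(IntegerTrait::new(1287, true)); // illustrative value
    // ... one append per bias value ...

    TensorTrait::new(shape.span(), data.span(), Option::None(()))
}
```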
fc1_bias is an i32 Tensor. These are two Orion concepts that deserve a closer look.
Signed Integer in Orion
In Cairo, there are no built-in signed integers. However, in the field of machine learning they are very useful, so Orion introduces a full implementation of signed integers, represented by a struct containing both the magnitude and its sign as a boolean.
The magnitude represents the absolute value of the number, and the sign indicates whether the number is positive or negative.
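At the time of writing, Orion's i32 is defined like this:

```rust
// -1287 is represented as i32 { mag: 1287, sign: true }.
struct i32 {
    mag: u32,
    sign: bool,
}
```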
Tensor in Orion
The second concept Orion introduces is the Tensor, which we've used extensively in the previous sections; the tensor is a central object in machine learning. It is represented in Orion as a struct containing the tensor's shape, a flattened array of its data, and extra parameters.
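At the time of writing, the generic Tensor struct looks like this (fields may vary across Orion versions):

```rust
struct Tensor<T> {
    shape: Span<usize>,
    data: Span<T>,
    extra: Option<ExtraParams>,
}
```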
You should now be able to understand the content of the generated files.
Perform Inference with Orion
We have now reached the last part of our tutorial, performing ML inference in Cairo 1.0.
How to Build a Neural Network with Orion
In this subsection, we will reproduce with Orion the same model architecture we defined earlier in the training phase with TensorFlow, since the aim is to perform the inference in Cairo.
In the src folder, create an nn.cairo file and reference the module in lib.cairo as follows:
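```rust
mod generated;
mod nn;
```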
Now, let's build the layers of our neural network in nn.cairo. As a reminder, this was the architecture of the model we defined earlier:
Input -> FC1 (activation = 'relu') -> FC2 (activation = 'softmax') -> Output
Dense Layer 1
In nn.cairo let's create a function fc1 that takes three parameters:
- i: Tensor<i32> - a tensor of i32 values representing the input data.
- w: Tensor<i32> - a tensor of i32 values representing the weights of the first layer.
- b: Tensor<i32> - a tensor of i32 values representing the biases of the first layer.
It should return a Tensor<i32>.
To build the first layer, we need a Linear function and a ReLU from NNTrait.
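A sketch of fc1 (the NNTrait import paths and exact signatures differ between Orion releases; some earlier versions take extra arguments such as a ReLU threshold, so check the docs for yours):

```rust
use orion::operators::tensor::core::Tensor;
use orion::numbers::signed_integer::i32::i32;
use orion::operators::nn::core::NNTrait;
use orion::operators::nn::implementations::impl_nn_i32::NN_i32;

fn fc1(i: Tensor<i32>, w: Tensor<i32>, b: Tensor<i32>) -> Tensor<i32> {
    // Linear transformation followed by ReLU: max(0, i.w + b)
    let x = NNTrait::linear(i, w, b);
    NNTrait::relu(@x)
}
```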
Dense Layer 2
In a similar way, we can build the second layer fc2, which contains a Linear function and a Softmax from NNTrait. We could convert the tensor to fixed point in order to perform softmax, but for this simple tutorial it's not necessary: softmax is monotonic, so the class with the largest logit is also the class with the highest probability.
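Under the same assumptions:

```rust
fn fc2(i: Tensor<i32>, w: Tensor<i32>, b: Tensor<i32>) -> Tensor<i32> {
    // Softmax is skipped: it doesn't change which index is largest.
    NNTrait::linear(i, w, b)
}
```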
We are now ready to perform inference!
Make Prediction
In the src folder, create a test.cairo file and reference the module in lib.cairo as follows:
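```rust
mod generated;
mod nn;
mod test;
```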
In your test file, create a function mnist_nn_test.
Now let's import and set the input data and the parameters generated previously.
Then import and set the neural network we built just above.
Finally, let's make a prediction. The input data represents the digit 7. So the index 7 should have the highest probability.
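Putting the pieces together, a sketch of test.cairo (module paths, the argmax signature, and the gas value are assumptions tied to the Orion version used here):

```rust
use orion::operators::tensor::core::{Tensor, TensorTrait};
use orion::operators::tensor::implementations::impl_tensor_i32::Tensor_i32;

use mnist_nn::generated::input::input;
use mnist_nn::generated::fc1_weights::fc1_weights;
use mnist_nn::generated::fc1_bias::fc1_bias;
use mnist_nn::generated::fc2_weights::fc2_weights;
use mnist_nn::generated::fc2_bias::fc2_bias;
use mnist_nn::nn::{fc1, fc2};

#[test]
#[available_gas(99999999999999)]
fn mnist_nn_test() {
    // Load the sample input and the generated parameters,
    // then run the two layers.
    let input = input();
    let x = fc1(input, fc1_weights(), fc1_bias());
    let x = fc2(x, fc2_weights(), fc2_bias());

    // The sample input encodes a handwritten 7, so index 7 should
    // hold the largest output value.
    let prediction = x.argmax(0, Option::None(()), Option::None(()));
    assert(*prediction.data.at(0) == 7, 'prediction should be 7');
}
```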
Test your model by running scarb test.
Bravo! You can be proud of yourself! You just built your first Neural Network in Cairo 1.0 with Orion.
Orion leverages Cairo to guarantee the reliability of inference, providing developers with a user-friendly framework to build complex and verifiable machine learning models. We invite the community to join us in shaping a future where trustworthy AI becomes a reliable resource for all.