.. _c-array-output-mnist-generation-tutorial: .. role:: raw-html(raw) :format: html 01 – Array Output: Building a Simple Autoencoder for MNIST Digit Generation =========================================================================== In this tutorial, we will explore the capabilities of `EIR` for array output tasks, specifically focusing on MNIST digit generation using a simple autoencoder. Arrays can represent various types of data, including images, time series, and more. This technique allows us to generate new, meaningful arrays based on patterns learned from the training data. .. note:: This tutorial assumes you are familiar with the basics of `EIR`, and have gone through previous tutorials. Not required, but recommended. A - Data -------- Here, we will be using the well known MNIST dataset. The dataset here consists of preprocessed NumPy arrays containing the MNIST handwritten digit images. To download the data, `use this link. `__ After downloading the data, the folder structure should look like this: .. literalinclude:: ../tutorial_files/d_array_output/01_array_mnist_generation/commands/tutorial_folder.txt :language: console B - Training A Simple Autoencoder --------------------------------- Training an autoencoder for MNIST digit generation with `EIR` involves the familiar configuration files and follows a process similar to supervised learning. We'll discuss the key configurations and visualize the training process, including the training curve and generated images at different iterations. The global config provides standard parameters for training: .. literalinclude:: ../tutorial_files/d_array_output/01_array_mnist_generation/globals.yaml :language: yaml :caption: globals.yaml .. note:: One new thing you might notice here is the ``latent_sampling`` configuration in the global configuration, which let's you extract and visualize the latent space of chosen layers during training (computed on the validation set). The input configuration specifies the structure of the MNIST array input: .. literalinclude:: ../tutorial_files/d_array_output/01_array_mnist_generation/input_mnist_array.yaml :language: yaml :caption: input_mnist_array.yaml The output configuration defines the structure and settings for the generated images: .. literalinclude:: ../tutorial_files/d_array_output/01_array_mnist_generation/output.yaml :language: yaml :caption: output.yaml With the configurations in place, we can run the following command to start the training process: .. literalinclude:: ../tutorial_files/d_array_output/01_array_mnist_generation/commands/ARRAY_GENERATION_MNIST_1.txt :language: console I got the following results: .. image:: ../tutorial_files/d_array_output/01_array_mnist_generation/figures/0_autoencoder/training_curve_LOSS_1.png :width: 100% :align: center Since we had that latent space sampling configuration in the global config, the latents are saved and a couple of visualizations are generated, here is one with the t-SNE visualization of the latents at iteration 9000: .. image:: ../tutorial_files/d_array_output/01_array_mnist_generation/figures/0_autoencoder/latents_visualization.png :width: 100% :align: center Here we have colored the latent space by the digit label, and we can see which labels are close to each other in the latent space. For example, it seems that 4, 7 and 9 are close to each other. Now, when we are generating arrays, ``EIR`` will save some of the generated arrays (as well as the corresponding inputs) during training under the ``results/samples/`` folders (the sampling is configurable by the sampling configuration in the output config). We can load these numpy arrays and visualize them. Here is a comparison of generated images at iteration 500: .. image:: ../tutorial_files/d_array_output/01_array_mnist_generation/figures/0_autoencoder/comparison_iteration_500.png :width: 50% :align: center And at iteration 9000, we can observe the improvements in generation: .. image:: ../tutorial_files/d_array_output/01_array_mnist_generation/figures/0_autoencoder/comparison_iteration_9000.png :width: 50% :align: center C - Augmenting Our Autoencoder With More Data --------------------------------------------- In this section, we will explore how to augment our MNIST digit-generating autoencoder with additional data. Specifically, we will add the MNIST labels to the autoencoder, which will allow us to conditionally generate images of specific digits. The global config remains the same as in the previous section: .. literalinclude:: ../tutorial_files/d_array_output/01_array_mnist_generation/globals.yaml :language: yaml :caption: globals.yaml The input configuration now includes additional files to represent the augmented data: .. literalinclude:: ../tutorial_files/d_array_output/01_array_mnist_generation/input_mnist_array_with_label.yaml :language: yaml :caption: input_mnist_array_with_label.yaml .. note:: Here we see another new option, ``modality_dropout_rate``, this will randomly drop out modalities during training, which can be useful for training models that can handle missing modalities. .. literalinclude:: ../tutorial_files/d_array_output/01_array_mnist_generation/input_mnist_label.yaml :language: yaml :caption: input_mnist_label.yaml The output configuration has also been modified to accommodate the augmented data: .. literalinclude:: ../tutorial_files/d_array_output/01_array_mnist_generation/output_with_label.yaml :language: yaml :caption: output_with_label.yaml .. note:: Notice here we are using some manual inputs in the sampling configuration, which will allow us to generate images of specific digits. We can run the following command to start training the augmented autoencoder: .. literalinclude:: ../tutorial_files/d_array_output/01_array_mnist_generation/commands/ARRAY_GENERATION_MNIST_2.txt :language: console I got the following results: .. image:: ../tutorial_files/d_array_output/01_array_mnist_generation/figures/1_autoencoder_augmented/training_curve_LOSS_1.png :width: 100% :align: center Here is a visualization of the latent space: .. image:: ../tutorial_files/d_array_output/01_array_mnist_generation/figures/1_autoencoder_augmented/latents_visualization.png :width: 100% :align: center Here is a comparison of generated images at iteration 500 and 9000: .. image:: ../tutorial_files/d_array_output/01_array_mnist_generation/figures/1_autoencoder_augmented/comparison_iteration_500.png :width: 50% :align: center .. image:: ../tutorial_files/d_array_output/01_array_mnist_generation/figures/1_autoencoder_augmented/comparison_iteration_9000.png :width: 50% :align: center Now, since we added those manual inputs earlier, they are also saved in the ``sample`` folders (under ``manual``), and we can visualize them: .. image:: ../tutorial_files/d_array_output/01_array_mnist_generation/figures/1_autoencoder_augmented/combined_plot.png :width: 75% :align: center So indeed we can see, in the absence of the actual image to encode, the model uses the class label to generate the respective digit. While not immediately obvious, the generated images of the same class are not completely identical (although they are extremely similar), due to some stochasticity injected into the model. D - Serving ----------- In this final section, we demonstrate serving our trained model for MNIST array generation as a web service and interacting with it using HTTP requests. Starting the Web Service """"""""""""""""""""""""" To serve the model, use the following command: .. code-block:: shell eirserve --model-path [MODEL_PATH] Replace `[MODEL_PATH]` with the actual path to your trained model. This command initiates a web service that listens for incoming requests. Here is an example of the command: .. literalinclude:: ../tutorial_files/d_array_output/01_array_mnist_generation/commands/ARRAY_GENERATION_DEPLOY.txt :language: console Sending Requests """""""""""""""" With the server running, we can now send requests with MNIST data arrays. The data arrays are encoded in base64 before sending. Here's an example Python function demonstrating this process: .. code-block:: python import requests import numpy as np import base64 def encode_array_to_base64(file_path: str) -> str: array_np = np.load(file_path) array_bytes = array_np.tobytes() return base64.b64encode(array_bytes).decode('utf-8') def send_request(url: str, payload: dict): response = requests.post(url, json=payload) return response.json() payload = { "mnist": encode_array_to_base64("path/to/mnist_array.npy") } response = send_request('http://localhost:8000/predict', payload) print(response) Retrieving Array Information """"""""""""""""""""""""""""" You can get information about the array type and shape by sending a GET request to the `/info` endpoint: .. code-block:: bash curl -X 'GET' \\ 'http://localhost:8000/info' \\ -H 'accept: application/json' This request will return details about the expected array input and output formats, such as type, shape, and data type. Decoding and Processing the Response """""""""""""""""""""""""""""""""""" After receiving a response, you can decode the base64 encoded array, reshape it, and cast it to the appropriate dtype using the information obtained from the ``/info`` endpoint: .. code-block:: python def decode_array_from_base64(encoded_array: str, shape: tuple): array_bytes = base64.b64decode(encoded_array) return np.frombuffer(array_bytes, dtype=np.float32).reshape(shape) array_np = decode_array_from_base64( response['mnist_output'], shape=(28, 28) ) .. important:: While the original output arrays can be of any dtype, and that information is provided in the ``/info`` endpoint, the response output arrays are always of dtype ``float32``, which is the output dtype of the model itself. The model output is then un-normalized using the training set statistics (assuming normalization was used during training). For example, since these are images originally in ``uint8`` format, we can process the response arrays as follows: .. code-block:: python from PIL import Image array_np = (array_np - array_np.min()) / (array_np.max() - array_np.min()) array_np = (array_np * 255).astype(np.uint8) image = Image.fromarray(array_np) image.show() Analyzing Responses """"""""""""""""""" After sending requests to the served model, the responses can be analyzed. .. literalinclude:: ../tutorial_files/d_array_output/01_array_mnist_generation/serve_results/predictions.json :language: json :caption: predictions.json For example, using the approach described above, we can visualize the generated images from the responses: .. image:: ../tutorial_files/d_array_output/01_array_mnist_generation/serve_results/mnist_output_0.png :width: 33% :align: center .. image:: ../tutorial_files/d_array_output/01_array_mnist_generation/serve_results/mnist_output_1.png :width: 33% :align: center .. image:: ../tutorial_files/d_array_output/01_array_mnist_generation/serve_results/mnist_output_2.png :width: 33% :align: center If you made it this far, thank you for reading! I hope this tutorial was interesting and useful to you!