02 - Sequence to Sequence: Spanish to English Translation

In this tutorial, we will use EIR for sequence-to-sequence tasks. Sequence-to-sequence (seq-to-seq) models transform an input sequence into an output sequence, a setup relevant for numerous applications such as machine translation and summarization.

For this tutorial, our task will be translating Spanish sentences into English, using a dataset from Tatoeba.

A - Data

You can download the data for this tutorial here.

After downloading the data, the folder structure should look like this (we will look at the configs in a bit):

eir_tutorials/c_sequence_output/02_sequence_to_sequence
├── conf
│   ├── fusion.yaml
│   ├── globals.yaml
│   ├── input_spanish.yaml
│   └── output.yaml
└── data
    └── eng-spanish
        ├── english.csv
        └── spanish.csv
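
Before looking at the configurations, it can be useful to peek at the raw data. Below is a minimal sketch using pandas; note that the exact column layout of the CSV files (typically an ID column plus the sentence text) is an assumption here, so inspect the files after downloading:

import pandas as pd

# Quick look at the paired sentence files; column names may differ, so
# adjust to whatever the downloaded CSVs actually contain.
base = "eir_tutorials/c_sequence_output/02_sequence_to_sequence/data/eng-spanish"
english = pd.read_csv(f"{base}/english.csv")
spanish = pd.read_csv(f"{base}/spanish.csv")

print(english.head())
print(spanish.head())
print(f"English rows: {len(english)}, Spanish rows: {len(spanish)}")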

B - Training

Training follows a similar approach to the one we saw in the previous tutorial, 01 – Sequence Generation: Generating Movie Reviews.

First, we will train on only the English data, without any Spanish data to establish a baseline.

For reference, here are the configurations:

globals.yaml
output_folder: eir_tutorials/tutorial_runs/c_sequence_output/02_seq_to_seq
valid_size: 500
n_saved_models: 1
checkpoint_interval: 500
sample_interval: 500
memory_dataset: true
n_epochs: 10
batch_size: 256
lr: 0.0005
optimizer: "adabelief"
device: "mps"  # Apple-silicon GPU backend; change to "cuda" or "cpu" depending on your hardware
fusion.yaml
model_type: "pass-through"
output.yaml
output_info:
  output_source: eir_tutorials/c_sequence_output/02_sequence_to_sequence/data/eng-spanish/english.csv
  output_name: english
  output_type: sequence

output_type_info:
  max_length: 32
  split_on: " "
  sampling_strategy_if_longer: "uniform"
  min_freq: 10

model_config:
  embedding_dim: 128
  model_init_config:
    num_layers: 6

sampling_config:
  generated_sequence_length: 64
  n_eval_inputs: 10
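
To build an intuition for what split_on and min_freq in output_type_info mean, here is a rough, self-contained sketch of whitespace tokenization with a minimum-frequency cutoff. It mirrors the idea rather than EIR's exact implementation, and the toy data and min_freq of 2 are purely illustrative (the tutorial config uses 10 on the real data):

from collections import Counter

sentences = [
    "Tom is here",
    "Tom is not here",
    "Where is Tom",
]

# split_on: " " corresponds to plain whitespace tokenization
tokens = [token for sentence in sentences for token in sentence.split(" ")]
counts = Counter(tokens)

# min_freq: only tokens seen at least this many times enter the vocabulary
min_freq = 2
vocab = {token for token, count in counts.items() if count >= min_freq}

print(counts)  # Counter({'Tom': 3, 'is': 3, 'here': 2, 'not': 1, 'Where': 1})
print(vocab)   # e.g. {'Tom', 'is', 'here'} (set ordering may vary)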

With these configurations, we can train with the following command:

eirtrain \
--global_configs eir_tutorials/c_sequence_output/02_sequence_to_sequence/conf/globals.yaml \
--fusion_configs eir_tutorials/c_sequence_output/02_sequence_to_sequence/conf/fusion.yaml \
--output_configs eir_tutorials/c_sequence_output/02_sequence_to_sequence/conf/output.yaml \
--globals.output_folder=eir_tutorials/tutorial_runs/c_sequence_output/02_seq_to_seq_eng_only

When running the command above, I got the following training curve:

[Training curve (loss), English-only model: training_curve_LOSS_transformer_1_only_english.png]

Here are a couple of examples of sentences generated using only the English data:

Generated English sentence using only English data 1:
Tom

Generated English sentence using only English data 2:
I don't have time to do this.

While the sentences above make some sense, a more interesting task is to use the Spanish data as input and generate the respective English translations. For this, we will include an input configuration for the Spanish data:

input_spanish.yaml
input_info:
  input_source: eir_tutorials/c_sequence_output/02_sequence_to_sequence/data/eng-spanish/spanish.csv
  input_name: spanish
  input_type: sequence

input_type_info:
  max_length: 32
  split_on: " "
  sampling_strategy_if_longer: "uniform"
  min_freq: 10

model_config:
  embedding_dim: 128
  model_init_config:
    num_layers: 6

To train, we will use the following command:

eirtrain \
--global_configs eir_tutorials/c_sequence_output/02_sequence_to_sequence/conf/globals.yaml \
--input_configs eir_tutorials/c_sequence_output/02_sequence_to_sequence/conf/input_spanish.yaml \
--fusion_configs eir_tutorials/c_sequence_output/02_sequence_to_sequence/conf/fusion.yaml \
--output_configs eir_tutorials/c_sequence_output/02_sequence_to_sequence/conf/output.yaml

When running the command above, I got the following training curve:

[Training curve (loss), Spanish-to-English model: training_curve_LOSS_transformer_1_spanish_to_english.png]

We can see that the training curve is better (the loss is lower) than when we used only the English data, indicating that the model makes use of the Spanish input when generating the English sentences.

Now, we can look at some of the generated sentences:

Spanish → English Translation
0. Tom se escapó y tomó unas con los muchachos. → Tom got caught and drank some with the other guys.
1. ¿Por qué Tomás sigue en Boston? → Why is Tom still in Boston?
2. Ella se fue a México sola. → She went to Mexico at the left.
3. fue la madre de la → The was his mother's
4. Todo será como antes. → Everything will be used as possible.
5. Me gustaría ver la de la → I'd like to see the from the
6. No tenía dónde → I didn't have any where I were.
7. Piensa en ello. → in it.
8. de nuevo para mí. → Forget about to me.
9. Si no te gusta, no tienes por qué → If you don't like him, don't need to

While these are not perfect translations, they are perhaps not too bad considering a relatively simple model trained for around an hour on a laptop.

C - Serving

In this final section, we demonstrate serving our trained model for sequence-to-sequence translation as a web service and interacting with it using HTTP requests.

Starting the Web Service

To serve the model, use the following command:

eirserve --model-path [MODEL_PATH]

Replace [MODEL_PATH] with the actual path to your trained model. This command initiates a web service that listens for incoming requests.

Here is an example of the command:

eirserve \
--model-path eir_tutorials/tutorial_runs/c_sequence_output/02_seq_to_seq/saved_models/02_seq_to_seq_model_5000_perf-average=-0.2346.pt
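
If you are scripting against the service, it can help to wait until it is actually accepting connections before sending requests. Below is a minimal sketch using only the Python standard library, assuming the default address of localhost:8000 used throughout this tutorial:

import socket
import time

def wait_for_server(host: str = "localhost", port: int = 8000, timeout: float = 60.0) -> None:
    """Block until a TCP connection to host:port succeeds, or raise after timeout."""
    deadline = time.time() + timeout
    while time.time() < deadline:
        try:
            with socket.create_connection((host, port), timeout=1.0):
                return
        except OSError:
            time.sleep(0.5)
    raise TimeoutError(f"Server at {host}:{port} did not become reachable within {timeout}s")

wait_for_server()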

Sending Requests

With the server running, we can now send requests for translating text from Spanish to English.

Here’s an example Python function demonstrating this process:

import requests

def send_request(url: str, payload: dict):
    """Send a JSON payload to the served model and return the parsed response."""
    response = requests.post(url, json=payload)
    response.raise_for_status()
    return response.json()

# The "english" field is the (here empty) prompt for the generated output,
# while "spanish" holds the sentence we want translated.
example_requests = [
    {"english": "", "spanish": "Tengo mucho hambre"},
    {"english": "", "spanish": "¿Por qué Tomás sigue en Boston?"},
]

for payload in example_requests:
    response = send_request('http://localhost:8000/predict', payload)
    print(f"Spanish: {payload['spanish']}")
    # The generated sequence is nested under "result", matching the
    # predictions.json structure shown further below.
    print(f"Translated to English: {response['result']['english']}\n")

Additionally, you can send requests using bash:

curl -X 'POST' \
  'http://localhost:8000/predict' \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{
      "english": "", "spanish": "Tengo mucho hambre"
  }'

Analyzing Responses

After sending requests to the served model, we can collect and inspect the responses. They give a quick impression of the model's ability to translate from Spanish to English; note that in the third request below, the english field is pre-filled with "Why", which the model then continues.
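
For reference, the predictions.json shown below simply pairs each request with the response returned by the server. A minimal sketch of how such a record could be produced, reusing the send_request helper from the previous section, might look like this:

import json

# The requests below are the same four shown in predictions.json; send_request
# is the helper defined in the "Sending Requests" section above.
example_requests = [
    {"english": "", "spanish": "Tengo mucho hambre"},
    {"english": "", "spanish": "¿Por qué Tomás sigue en Boston?"},
    {"english": "Why", "spanish": "¿Por qué Tomás sigue en Boston?"},
    {"english": "", "spanish": "Un gato muy alto"},
]

records = []
for payload in example_requests:
    response = send_request("http://localhost:8000/predict", payload)
    records.append({"request": payload, "response": response})

with open("predictions.json", "w", encoding="utf-8") as f:
    json.dump(records, f, ensure_ascii=False, indent=4)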

predictions.json
[
    {
        "request": {
            "english": "",
            "spanish": "Tengo mucho hambre"
        },
        "response": {
            "result": {
                "english": "I'm very hungry and"
            }
        }
    },
    {
        "request": {
            "english": "",
            "spanish": "¿Por qué Tomás sigue en Boston?"
        },
        "response": {
            "result": {
                "english": "Why is Tom still in Boston?"
            }
        }
    },
    {
        "request": {
            "english": "Why",
            "spanish": "¿Por qué Tomás sigue en Boston?"
        },
        "response": {
            "result": {
                "english": "Why is Tom still Boston?"
            }
        }
    },
    {
        "request": {
            "english": "",
            "spanish": "Un gato muy alto"
        },
        "response": {
            "result": {
                "english": "A cat was very high."
            }
        }
    }
]

Thanks for reading!