02 - Sequence to Sequence: Spanish to English Translation

In this tutorial, we will use EIR for sequence-to-sequence tasks. Sequence-to-sequence (seq-to-seq) models transform an input sequence into an output sequence, a setup relevant for numerous applications such as machine translation and summarization.

For this tutorial, our task will be translating Spanish sentences into English, using a dataset from Tatoeba.

A - Data

You can download the data for this tutorial here.

After downloading the data, the folder structure should look like this (we will look at the configs in a bit):

eir_tutorials/c_sequence_output/02_sequence_to_sequence
├── conf
│   ├── fusion.yaml
│   ├── globals.yaml
│   ├── input_spanish.yaml
│   └── output.yaml
└── data
    └── eng-spanish
        ├── english.csv
        └── spanish.csv
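
Before looking at the configurations, it can be useful to peek at the raw data. Below is a minimal sketch using pandas; note that the exact column layout of the CSV files (typically an ID column plus the sentence text) is an assumption here, so inspect the files after downloading:

import pandas as pd

# Quick look at the paired sentence files; column names may differ, so
# adjust to whatever the downloaded CSVs actually contain.
base = "eir_tutorials/c_sequence_output/02_sequence_to_sequence/data/eng-spanish"
english = pd.read_csv(f"{base}/english.csv")
spanish = pd.read_csv(f"{base}/spanish.csv")

print(english.head())
print(spanish.head())
print(f"English rows: {len(english)}, Spanish rows: {len(spanish)}")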

B - Training

Training follows a similar approach to the one we saw in the previous tutorial, 01 – Sequence Generation: Generating Movie Reviews.

First, we will train on only the English data, without any Spanish data to establish a baseline.

For reference, here are the configurations:

globals.yaml
output_folder: eir_tutorials/tutorial_runs/c_sequence_output/02_seq_to_seq
valid_size: 500
n_saved_models: 1
checkpoint_interval: 500
sample_interval: 500
memory_dataset: true
n_epochs: 10
batch_size: 256
lr: 0.0005
optimizer: "adabelief"
device: "mps"  # Apple-silicon GPU backend; change to "cuda" or "cpu" depending on your hardware
fusion.yaml
model_type: "pass-through"
output.yaml
output_info:
  output_source: eir_tutorials/c_sequence_output/02_sequence_to_sequence/data/eng-spanish/english.csv
  output_name: english
  output_type: sequence

output_type_info:
  max_length: 32
  split_on: " "
  sampling_strategy_if_longer: "uniform"
  min_freq: 10

model_config:
  embedding_dim: 128
  model_init_config:
    num_layers: 6

sampling_config:
  generated_sequence_length: 64
  n_eval_inputs: 10
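
To build an intuition for what split_on and min_freq in output_type_info mean, here is a rough, self-contained sketch of whitespace tokenization with a minimum-frequency cutoff. It mirrors the idea rather than EIR's exact implementation, and the toy data and min_freq of 2 are purely illustrative (the tutorial config uses 10 on the real data):

from collections import Counter

sentences = [
    "Tom is here",
    "Tom is not here",
    "Where is Tom",
]

# split_on: " " corresponds to plain whitespace tokenization
tokens = [token for sentence in sentences for token in sentence.split(" ")]
counts = Counter(tokens)

# min_freq: only tokens seen at least this many times enter the vocabulary
min_freq = 2
vocab = {token for token, count in counts.items() if count >= min_freq}

print(counts)  # Counter({'Tom': 3, 'is': 3, 'here': 2, 'not': 1, 'Where': 1})
print(vocab)   # e.g. {'Tom', 'is', 'here'} (set ordering may vary)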

With these configurations, we can train with the following command:

eirtrain \
--global_configs eir_tutorials/c_sequence_output/02_sequence_to_sequence/conf/globals.yaml \
--fusion_configs eir_tutorials/c_sequence_output/02_sequence_to_sequence/conf/fusion.yaml \
--output_configs eir_tutorials/c_sequence_output/02_sequence_to_sequence/conf/output.yaml \
--globals.output_folder=eir_tutorials/tutorial_runs/c_sequence_output/02_seq_to_seq_eng_only

When running the command above, I got the following training curve:

[Training curve (loss), English-only model: training_curve_LOSS_transformer_1_only_english.png]

Here are a couple of examples of sentences generated using only the English data:

Generated English sentence using only English data 1:
Tom

Generated English sentence using only English data 2:
I don't have time to do this.

While the sentences above make some sense, a more interesting task is to use the Spanish data as input and generate the respective English translations. For this, we will include an input configuration for the Spanish data:

input_spanish.yaml
input_info:
  input_source: eir_tutorials/c_sequence_output/02_sequence_to_sequence/data/eng-spanish/spanish.csv
  input_name: spanish
  input_type: sequence

input_type_info:
  max_length: 32
  split_on: " "
  sampling_strategy_if_longer: "uniform"
  min_freq: 10

model_config:
  embedding_dim: 128
  model_init_config:
    num_layers: 6

To train, we will use the following command:

eirtrain \
--global_configs eir_tutorials/c_sequence_output/02_sequence_to_sequence/conf/globals.yaml \
--input_configs eir_tutorials/c_sequence_output/02_sequence_to_sequence/conf/input_spanish.yaml \
--fusion_configs eir_tutorials/c_sequence_output/02_sequence_to_sequence/conf/fusion.yaml \
--output_configs eir_tutorials/c_sequence_output/02_sequence_to_sequence/conf/output.yaml

When running the command above, I got the following training curve:

[Training curve (loss), Spanish-to-English model: training_curve_LOSS_transformer_1_spanish_to_english.png]

We can see that the training curve is better (the loss is lower) than when we used only the English data, indicating that the model makes use of the Spanish input when generating the English sentences.

Now, we can look at some of the generated sentences:

Spanish → English Translation
0. Tom se escapó y tomó unas con los muchachos. → Tom got caught and drank some with the other guys.
1. ¿Por qué Tomás sigue en Boston? → Why is Tom still in Boston?
2. Ella se fue a México sola. → She went to Mexico at the left.
3. fue la madre de la → The was his mother's
4. Todo será como antes. → Everything will be used as possible.
5. Me gustaría ver la de la → I'd like to see the from the
6. No tenía dónde → I didn't have any where I were.
7. Piensa en ello. → in it.
8. de nuevo para mí. → Forget about to me.
9. Si no te gusta, no tienes por qué → If you don't like him, don't need to

While these are not perfect translations, they are perhaps not too bad considering a relatively simple model trained for around an hour on a laptop.

C - Serving

In this final section, we demonstrate serving our trained model for sequence-to-sequence translation as a web service and interacting with it using HTTP requests.

Starting the Web Service

To serve the model, use the following command:

eirserve --model-path [MODEL_PATH]

Replace [MODEL_PATH] with the actual path to your trained model. This command initiates a web service that listens for incoming requests.

Here is an example of the command:

eirserve \
--model-path eir_tutorials/tutorial_runs/c_sequence_output/02_seq_to_seq/saved_models/02_seq_to_seq_model_5000_perf-average=-0.2346.pt
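
If you are scripting against the service, it can help to wait until it is actually accepting connections before sending requests. Below is a minimal sketch using only the Python standard library, assuming the default address of localhost:8000 used throughout this tutorial:

import socket
import time

def wait_for_server(host: str = "localhost", port: int = 8000, timeout: float = 60.0) -> None:
    """Block until a TCP connection to host:port succeeds, or raise after timeout."""
    deadline = time.time() + timeout
    while time.time() < deadline:
        try:
            with socket.create_connection((host, port), timeout=1.0):
                return
        except OSError:
            time.sleep(0.5)
    raise TimeoutError(f"Server at {host}:{port} did not become reachable within {timeout}s")

wait_for_server()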

Sending Requests

With the server running, we can now send requests for translating text from Spanish to English.

Here’s an example Python function demonstrating this process:

import requests

def send_request(url: str, payload: dict):
    """Send a JSON payload to the served model and return the parsed response."""
    response = requests.post(url, json=payload)
    response.raise_for_status()
    return response.json()

# The "english" field is the (here empty) prompt for the generated output,
# while "spanish" holds the sentence we want translated.
example_requests = [
    {"english": "", "spanish": "Tengo mucho hambre"},
    {"english": "", "spanish": "¿Por qué Tomás sigue en Boston?"},
]

for payload in example_requests:
    response = send_request('http://localhost:8000/predict', payload)
    print(f"Spanish: {payload['spanish']}")
    # The generated sequence is nested under "result", matching the
    # predictions.json structure shown further below.
    print(f"Translated to English: {response['result']['english']}\n")

Additionally, you can send requests using bash:

curl -X 'POST' \
  'http://localhost:8000/predict' \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{
      "english": "", "spanish": "Tengo mucho hambre"
  }'

Analyzing Responses

After sending requests to the served model, we can collect and inspect the responses. They give a quick impression of the model's ability to translate from Spanish to English; note that in the third request below, the english field is pre-filled with "Why", which the model then continues.
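
For reference, the predictions.json shown below simply pairs each request with the response returned by the server. A minimal sketch of how such a record could be produced, reusing the send_request helper from the previous section, might look like this:

import json

# The requests below are the same four shown in predictions.json; send_request
# is the helper defined in the "Sending Requests" section above.
example_requests = [
    {"english": "", "spanish": "Tengo mucho hambre"},
    {"english": "", "spanish": "¿Por qué Tomás sigue en Boston?"},
    {"english": "Why", "spanish": "¿Por qué Tomás sigue en Boston?"},
    {"english": "", "spanish": "Un gato muy alto"},
]

records = []
for payload in example_requests:
    response = send_request("http://localhost:8000/predict", payload)
    records.append({"request": payload, "response": response})

with open("predictions.json", "w", encoding="utf-8") as f:
    json.dump(records, f, ensure_ascii=False, indent=4)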

predictions.json
[
    {
        "request": {
            "english": "",
            "spanish": "Tengo mucho hambre"
        },
        "response": {
            "result": {
                "english": "I'm very hungry and"
            }
        }
    },
    {
        "request": {
            "english": "",
            "spanish": "¿Por qué Tomás sigue en Boston?"
        },
        "response": {
            "result": {
                "english": "Why is Tom still in Boston?"
            }
        }
    },
    {
        "request": {
            "english": "Why",
            "spanish": "¿Por qué Tomás sigue en Boston?"
        },
        "response": {
            "result": {
                "english": "Why is Tom still Boston?"
            }
        }
    },
    {
        "request": {
            "english": "",
            "spanish": "Un gato muy alto"
        },
        "response": {
            "result": {
                "english": "A cat was very high."
            }
        }
    }
]

Thanks for reading!