Sentence-translation
README.md

Obejective

We have to make a model which translate Italian to English from scratch.

Basic Info

  1. Download the Italian to English translation dataset from here

  2. Encoder and Decoder architecture with attention.

Encoder - with 1 layer LSTM Decoder - with 1 layer LSTM Attention

  1. In Global attention, we have 3 types of scoring functions. As a part of this assignment, we had created 3 models for each scoring function. In model 1 you need to implemnt "dot" score function In model 2 you need to implemnt "general" score function In model 3 you need to implemnt "concat" score function

  2. Using attention weights, we have plot the attention plots.

  3. BLEU score as metric to evaluate the model and SparseCategoricalCrossentropy as a loss.

  4. There is detaile observation under each plot.

How to train your model?

  1. There are many ways to train your model. Say you are translating Hindi to English

  2. Encoder input should be — Hindi sentence

  3. Decoder input should be — what is your name?

  4. Decoder output should be — what is your name?

  5. model.fit([encoder_input,decoder_input],decoder_output)

What is teacher forcing?

If you are having Decoder input and output as same, Say I want to predict Hindi to English. English sentence is — — Hi How are you So at the first-time step, you will pass and you expect your model to predict Hi, not . If you want the model to predict the same input as output then why do you need such a complex network?. So your decoder output will be a one-time step ahead of decoder input.

Observation:

p1