## Obejective
We have to make a model which translate Italian to English from scratch.
## Basic Info
1. Download the Italian to English translation dataset from here
2. Encoder and Decoder architecture with attention.
Encoder - with 1 layer LSTM
Decoder - with 1 layer LSTM
Attention
3. In Global attention, we have 3 types of scoring functions.
As a part of this assignment, we had created 3 models for each scoring function.
In model 1 you need to implemnt "dot" score function
In model 2 you need to implemnt "general" score function
In model 3 you need to implemnt "concat" score function
4. Using attention weights, we have plot the attention plots.
5. BLEU score as metric to evaluate the model and SparseCategoricalCrossentropy as a loss.
6. There is detaile observation under each plot.
## How to train your model?
0. There are many ways to train your model. Say you are translating Hindi to English
1. Encoder input should be — Hindi sentence
2. Decoder input should be — what is your name?
3. Decoder output should be — what is your name?
4. model.fit([encoder_input,decoder_input],decoder_output)
## What is teacher forcing?
If you are having Decoder input and output as same, Say I want to predict Hindi to English.
English sentence is — — Hi How are you
So at the first-time step, you will pass and you expect your model to predict Hi, not .
If you want the model to predict the same input as output then why do you need such a complex network?. So your decoder output will be a one-time step ahead of decoder input.
## Observation:
