## Objective

Build a model that translates Italian to English from scratch.

## Basic Info

1. Download the Italian-to-English translation dataset from here.
2. Encoder-Decoder architecture with attention:
   - Encoder: 1-layer LSTM
   - Decoder: 1-layer LSTM
   - Attention
3. Global attention has 3 types of scoring functions. As part of this assignment, we create 3 models, one for each scoring function (see the attention sketch at the end of this README):
   - Model 1 implements the "dot" score function
   - Model 2 implements the "general" score function
   - Model 3 implements the "concat" score function
4. Using the attention weights, we plot the attention plots.
5. BLEU score is used as the metric to evaluate the model, and SparseCategoricalCrossentropy as the loss.
6. There is a detailed observation under each plot.

## How to train your model?

There are many ways to train your model. Say you are translating Hindi to English:

1. Encoder input should be: the Hindi sentence
2. Decoder input should be: `<start>` what is your name?
3. Decoder output should be: what is your name? `<end>`
4. `model.fit([encoder_input, decoder_input], decoder_output)`

A data-preparation sketch for this setup is given at the end of this README.

## What is teacher forcing?

Suppose the decoder input and output were the same. Say I want to translate Hindi to English, and the English sentence is: `<start>` Hi How are you `<end>`. At the first time step you pass `<start>`, and you expect the model to predict "Hi", not `<start>`. If you wanted the model to predict the same token it was given as input, why would you need such a complex network? So the decoder output is one time step ahead of the decoder input.

## Observation:

![p1](https://user-images.githubusercontent.com/39815040/100618452-e141aa80-3341-11eb-82ff-160ed8bfe5c3.png)
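## Sketch: global attention score functions

The following is a minimal sketch of how the three global (Luong) attention score functions ("dot", "general", "concat") can be written as a single Keras layer. The layer name `LuongAttention` and the arguments `units` and `score_type` are illustrative assumptions, not code taken from the notebook.

```python
import tensorflow as tf

class LuongAttention(tf.keras.layers.Layer):
    """Global attention with a selectable score function: dot, general, or concat."""

    def __init__(self, units, score_type="dot"):
        super().__init__()
        self.score_type = score_type
        if score_type == "general":
            self.Wa = tf.keras.layers.Dense(units)   # score = h_t . (W_a h_s)
        elif score_type == "concat":
            self.Wa = tf.keras.layers.Dense(units)   # W_a [h_t ; h_s]
            self.va = tf.keras.layers.Dense(1)       # v_a^T tanh(...)

    def call(self, decoder_state, encoder_outputs):
        # decoder_state: (batch, units) -> query: (batch, 1, units)
        query = tf.expand_dims(decoder_state, 1)
        if self.score_type == "dot":
            # score = h_t . h_s -> (batch, 1, src_len)
            score = tf.matmul(query, encoder_outputs, transpose_b=True)
        elif self.score_type == "general":
            score = tf.matmul(query, self.Wa(encoder_outputs), transpose_b=True)
        else:  # concat
            src_len = tf.shape(encoder_outputs)[1]
            tiled = tf.tile(query, [1, src_len, 1])
            score = self.va(tf.nn.tanh(
                self.Wa(tf.concat([tiled, encoder_outputs], axis=-1))))
            score = tf.transpose(score, [0, 2, 1])   # (batch, 1, src_len)
        weights = tf.nn.softmax(score, axis=-1)            # attention weights over source
        context = tf.matmul(weights, encoder_outputs)      # (batch, 1, units)
        return tf.squeeze(context, 1), tf.squeeze(weights, 1)
```

Note that the "dot" score assumes the encoder and decoder use the same number of units; the returned weights are what get collected for the attention plots.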
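## Sketch: teacher-forcing data preparation

A minimal sketch of the decoder input/output shift and the `model.fit` call described above. The token ids, padding, and the `model` object are placeholders; the real ids come from the tokenizers built on the dataset.

```python
import numpy as np
import tensorflow as tf

# Illustrative ids for "<start> what is your name ? <end>"
# (1 = <start>, 2 = <end>; real ids come from the English tokenizer).
target = np.array([[1, 7, 12, 9, 15, 4, 2]])

decoder_input  = target[:, :-1]   # <start> what is your name ?
decoder_output = target[:, 1:]    # what is your name ? <end>  (one step ahead)

# Illustrative padded source-sentence ids.
encoder_input = np.array([[5, 21, 8, 3, 0, 0]])

# `model` is assumed to be the encoder-decoder Keras model built elsewhere:
# model.compile(optimizer="adam",
#               loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True))
# model.fit([encoder_input, decoder_input], decoder_output, epochs=10)
```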
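## Sketch: BLEU evaluation

One possible way to compute the BLEU metric is NLTK's sentence-level BLEU; this is an assumption for illustration, not necessarily the exact scoring code used in the notebook.

```python
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction

reference  = [["what", "is", "your", "name", "?"]]   # list of reference token lists
prediction = ["what", "is", "your", "name", "?"]     # tokens produced by the decoder

smoother = SmoothingFunction().method1  # avoids zero scores on short sentences
score = sentence_bleu(reference, prediction, smoothing_function=smoother)
print(f"BLEU: {score:.4f}")
```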