TODO: The project can be split into 3 parts:

1. encoder
   - embedding layer; GloVe, etc.
   - FOR BSC: need to binarize the encodings into {-1, 1}. See equations (2.5-2.8) in text. Gradients can be passed through the channel or left unchanged.
   - FOR AWGN: leave the output as is.
   - encodings will need to be variable length, see page 34
2. channel
   - AWGN (additive white Gaussian noise); just add Gaussian noise with zero mean and variance σ².
   - BSC (binary symmetric channel); use the element-wise product of the noise vector and the codeword. The noise vector has a (modified) Bernoulli density at each index: for each index i, P(x_i = -1) = π and P(x_i = 1) = 1 - π, so each symbol is flipped with probability π.
   - BEC (binary erasure channel); use a dropout layer. The received codeword will have values in {-1, 0, 1}, where 0 is the erasure.
3. decoder
   - I would suggest using an auto-regressive output
   - output message doesn't need to be the same size as the input.
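
A rough sketch of the three channels described above, in plain Python lists so it has no dependencies (a real model would do the same thing on tensors). Everything named here is an assumption for illustration: the function names, the `sigma`, `pi` and `eps` parameters, and in particular `binarize`, which is a simple sign threshold and not necessarily what equations (2.5-2.8) specify.

```python
import random


def binarize(encoding):
    """Map real-valued encodings to {-1, 1} by sign (assumption: 0 maps to 1)."""
    return [1 if v >= 0 else -1 for v in encoding]


def awgn(codeword, sigma, rng):
    """AWGN: add independent zero-mean Gaussian noise with std. dev. sigma."""
    return [c + rng.gauss(0.0, sigma) for c in codeword]


def bsc(codeword, pi, rng):
    """BSC: element-wise product with noise in {-1, 1}, where P(noise = -1) = pi."""
    noise = [-1 if rng.random() < pi else 1 for _ in codeword]
    return [c * n for c, n in zip(codeword, noise)]


def bec(codeword, eps, rng):
    """BEC: replace each symbol with 0 (erasure) with probability eps."""
    return [0 if rng.random() < eps else c for c in codeword]


rng = random.Random(0)                     # seeded so runs are repeatable
x = binarize([0.3, -1.2, 0.0, 2.5])        # [1, -1, 1, 1]
print(awgn(x, sigma=0.5, rng=rng))         # real-valued, noisy
print(bsc(x, pi=0.1, rng=rng))             # values in {-1, 1}
print(bec(x, eps=0.2, rng=rng))            # values in {-1, 0, 1}
```

One thing to watch if the BEC is built from a stock dropout layer: common frameworks rescale the surviving values by 1/(1 - p) during training, so the output has to be scaled back (or the mask applied by hand, as in `bec` above) to keep the symbols in {-1, 0, 1}.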