How to correctly use mask_zero=True for Keras Embedding with pre-trained weights?
Become part of the top 3% of the developers by applying to Toptal https://topt.al/25cXVn
--
Music by Eric Matyas
https://www.soundimage.org
Track title: Puzzle Game 5 Looping
--
Chapters
00:00 Question
02:30 Accepted answer (Score 5)
03:19 Thank you
--
Full question
https://stackoverflow.com/questions/5138...
Accepted answer links:
[implementation of keras' embedding]: https://github.com/keras-team/keras/blob...
--
Content licensed under CC BY-SA
https://meta.stackexchange.com/help/lice...
--
Tags
#python #tensorflow #keras #wordembedding
#avk47
ACCEPTED ANSWER
Score 6
Your second approach is correct. You will want to construct your embedding layer in the following way:
import numpy as np
from keras.layers import Embedding

embedding = Embedding(
    output_dim=embedding_size,
    input_dim=vocabulary_size + 1,   # +1 for the reserved padding index 0
    input_length=input_length,
    mask_zero=True,
    # Prepend a zero row so index 0 maps to an all-zero vector
    weights=[np.vstack((np.zeros((1, embedding_size)),
                        embedding_matrix))],
    name='embedding'
)(input_layer)
where embedding_matrix is the second matrix you provided.
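Since index 0 is now reserved for padding, your token ids must be shifted up by one before they reach the model. A minimal sketch of that preprocessing step, assuming tokenized_sequences is a hypothetical list of 0-based word-index lists:

from tensorflow.keras.preprocessing.sequence import pad_sequences

input_length = 50  # must match the Embedding layer's input_length

# Hypothetical 0-based word indices straight from your tokenizer
tokenized_sequences = [[4, 17, 2], [8, 1]]

# Shift every index up by one so 0 is free to mean "padding"
shifted = [[i + 1 for i in seq] for seq in tokenized_sequences]

# Pad with zeros; mask_zero=True makes downstream layers skip them
padded = pad_sequences(shifted, maxlen=input_length, padding='post', value=0)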
You can see this by looking at the implementation of keras' embedding layer. Notably, mask_zero is only used to literally mask the inputs:
def compute_mask(self, inputs, mask=None):
    if not self.mask_zero:
        return None
    output_mask = K.not_equal(inputs, 0)
    return output_mask
thus the entire kernel is still used in the lookup (equivalently, multiplied by the one-hot input), which means all indices are shifted up by one and row 0 of the kernel must be reserved for padding.
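To check this behavior in practice, here is a quick sketch (assuming TensorFlow 2.x with eager execution) showing that the mask is exactly inputs != 0 while padded positions still look up row 0 of the kernel:

import numpy as np
import tensorflow as tf

# A padded batch: 0 marks padding, real tokens start at 1
inputs = tf.constant([[3, 7, 1, 0, 0]])

emb = tf.keras.layers.Embedding(input_dim=10, output_dim=4, mask_zero=True)
out = emb(inputs)

# Mirrors compute_mask above: True where the input is non-zero
print(emb.compute_mask(inputs).numpy())   # [[ True  True  True False False]]

# Padded positions still index row 0 of the kernel, which is why the
# pre-trained matrix needs a zero row prepended at index 0
print(np.allclose(out[0, 3].numpy(), emb.embeddings.numpy()[0]))  # True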