How to correctly use mask_zero=True for Keras Embedding with pre-trained weights?

Full question
https://stackoverflow.com/questions/5138...

--

Content licensed under CC BY-SA
https://meta.stackexchange.com/help/lice...

--

Tags
#python #tensorflow #keras #wordembedding

ACCEPTED ANSWER

Score 6

Your second approach is correct. Construct the embedding layer as follows:

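import numpy as np
from keras.layers import Embedding

# Index 0 is reserved for padding, so prepend a zero row to the
# pre-trained matrix and grow input_dim by one to match.
embedding = Embedding(
    output_dim=embedding_size,
    input_dim=vocabulary_size + 1,
    input_length=input_length,
    mask_zero=True,
    weights=[np.vstack((np.zeros((1, embedding_size)),
                        embedding_matrix))],
    name='embedding'
)(input_layer)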

where embedding_matrix is the second matrix you provided.
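
To make the shapes concrete, here is a minimal NumPy sketch of how the padded weight matrix is built. The sizes and the random embedding_matrix are made-up stand-ins for real pre-trained vectors, not values from the question.

```python
import numpy as np

vocabulary_size = 5   # hypothetical number of real words
embedding_size = 3    # hypothetical vector dimension

# Stand-in for the pre-trained matrix: one row per real word.
rng = np.random.default_rng(0)
embedding_matrix = rng.normal(size=(vocabulary_size, embedding_size))

# Prepend a zero row so that index 0 is reserved for padding.
padded_weights = np.vstack((np.zeros((1, embedding_size)), embedding_matrix))

# The layer now needs input_dim = vocabulary_size + 1 rows.
print(padded_weights.shape)
```

Row i of embedding_matrix ends up at row i + 1 of padded_weights, so the word indexes fed to the layer should start at 1.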

You can see why by looking at the implementation of Keras's Embedding layer. Notably, mask_zero is used only to compute a mask over the inputs:

def compute_mask(self, inputs, mask=None):
    if not self.mask_zero:
        return None
    output_mask = K.not_equal(inputs, 0)
    return output_mask

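Index 0 is therefore still looked up in the embedding matrix like any other index (equivalent to multiplying a one-hot input by the entire kernel). Row 0 must be reserved for padding, meaning all word indexes are shifted up by one.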
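
The interaction between the lookup and the mask can be sketched in plain Python. This is only an illustration under made-up weights; lookup and the local compute_mask are hypothetical helpers that mimic the behaviour described above, not Keras functions.

```python
def lookup(weights, token_ids):
    """Return one weight row per token id, as a plain embedding lookup would."""
    return [weights[i] for i in token_ids]


def compute_mask(token_ids):
    """Mirror K.not_equal(inputs, 0): True for real tokens, False for padding."""
    return [i != 0 for i in token_ids]


# Hypothetical weights: row 0 is the padding row, rows 1..3 are real words.
weights = [
    [0.0, 0.0],
    [0.1, 0.2],
    [0.3, 0.4],
    [0.5, 0.6],
]

token_ids = [2, 3, 0, 0]  # a two-word sequence padded to length 4

vectors = lookup(weights, token_ids)  # padding positions still fetch row 0
mask = compute_mask(token_ids)        # downstream layers skip the False steps
```

The padding positions still receive a vector from row 0; the mask is what tells later layers to ignore them.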