The Python Oracle

Keras embedding layer masking. Why does input_dim need to be |vocabulary| + 2?


Track title: CC I Beethoven Sonata No 31 in A Flat M

--

Chapters
00:00 Keras Embedding Layer Masking. Why Does Input_dim Need To Be |Vocabulary| + 2?
00:36 Accepted Answer Score 15
01:27 Answer 2 Score 0
01:42 Thank you

--

Full question
https://stackoverflow.com/questions/4322...

--

Content licensed under CC BY-SA
https://meta.stackexchange.com/help/lice...

--

Tags
#python #nlp #deeplearning #keras #keraslayer

#avk47



ACCEPTED ANSWER

Score 15


I believe the docs are a bit misleading there. In the normal case you are mapping your n input data indices [0, 1, 2, ..., n-1] to vectors, so input_dim should be the number of distinct indices you have:

input_dim = len(vocabulary_indices)

An equivalent (but slightly confusing) way to say this, and the way the docs phrase it, is

1 + maximum integer index occurring in the input data.

input_dim = max(vocabulary_indices) + 1
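
For instance, with a hypothetical three-word vocabulary indexed contiguously from 0, both expressions give the same value:

# Hypothetical vocabulary of n = 3 words, indexed contiguously from 0
vocabulary_indices = [0, 1, 2]

len(vocabulary_indices)      # 3
max(vocabulary_indices) + 1  # 3 as well, so both formulas agree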

If you enable masking, the value 0 is treated differently (it is reserved for the mask), so you shift your n indices up by one; the possible input values then span [0, 1, 2, ..., n-1, n], and you need

input_dim = len(vocabulary_indices) + 1

or alternatively

input_dim = max(vocabulary_indices) + 2
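
In Keras this is controlled by the mask_zero flag on the Embedding layer; a minimal sketch, assuming TensorFlow 2.x's bundled Keras and a made-up vocabulary size:

from tensorflow.keras.layers import Embedding

vocab_size = 10000  # hypothetical number of distinct words, indexed 1..vocab_size

# Index 0 is reserved for masking, so the layer must accept indices 0..vocab_size,
# i.e. vocab_size + 1 distinct values
embedding = Embedding(input_dim=vocab_size + 1, output_dim=64, mask_zero=True)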

The docs become especially confusing here as they say

(input_dim should equal |vocabulary| + 2)

where I would interpret |x| as the cardinality of a set (equivalent to len(x)), but the authors seem to mean

2 + maximum integer index occurring in the input data.
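
That reserved 0 is exactly what padding produces, which is why the shift matters in practice; a short sketch, assuming hypothetical 1-based token indices:

from tensorflow.keras.preprocessing.sequence import pad_sequences

# Hypothetical sentences already mapped to 1-based word indices,
# leaving index 0 free to act as the mask value
sequences = [[1, 5, 2], [3, 4]]

# Padding fills with 0, which mask_zero=True tells downstream layers to ignore
padded = pad_sequences(sequences, maxlen=4, padding="post")
# [[1 5 2 0]
#  [3 4 0 0]]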




ANSWER 2

Score 0


Because input_dim is already defined as the vocabulary's maximum index + 1, you just add another +1 for the reserved 0 and end up with the +2.

input_dim: int > 0. Size of the vocabulary, i.e. 1 + maximum integer index occurring in the input data.
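
As a small numeric sketch (with a made-up maximum index):

max_index = 9999                      # hypothetical highest word index in the data
input_dim = max_index + 1             # "size of the vocabulary" per the docs
input_dim_masked = max_index + 1 + 1  # one more slot for the reserved 0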