np.random.rand() or random.random()
--------------------------------------------------
Rise to the top 3% as a developer or hire one of them at Toptal: https://topt.al/25cXVn
--------------------------------------------------
Music by Eric Matyas
https://www.soundimage.org
Track title: Techno Bleepage Open
--
Chapters
00:00 Np.Random.Rand() Or Random.Random()
00:40 Accepted Answer Score 13
01:16 Thank you
--
Full question
https://stackoverflow.com/questions/6908...
--
Content licensed under CC BY-SA
https://meta.stackexchange.com/help/lice...
--
Tags
#python #numpy #random #randomseed
#avk47
ACCEPTED ANSWER
Score 13
np.random.rand(len(df)) returns an array of uniform random numbers in the interval [0, 1), one per row of df. The comparison np.random.rand(len(df)) < 0.8 turns it into an array of booleans.
Each number has an 80% chance of being below 0.8, so about 80% of the values are True (the exact share varies from run to run).
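As a minimal sketch of how such a mask is typically used, the snippet below builds it and splits a frame with it. The DataFrame, its column name "x", and the seed value are assumptions made up for illustration; they are not from the question.

```python
import numpy as np
import pandas as pd

np.random.seed(0)  # arbitrary seed, only so the run is repeatable

df = pd.DataFrame({"x": range(1000)})  # hypothetical example frame

# One uniform draw per row; each comparison is True with probability 0.8
mask = np.random.rand(len(df)) < 0.8

train = df[mask]    # rows where the mask is True (roughly 80%)
test = df[~mask]    # the remaining rows (roughly 20%)

print(mask.dtype)   # bool
print(mask.mean())  # share of True values, near 0.8 but not exactly 0.8
```

Because every row is decided independently, the two parts never overlap, but their sizes are only approximately 80% and 20%.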
A more explicit approach would be to use numpy.random.choice:
np.random.choice([True, False], p=[0.8, 0.2], size=len(df))
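A short sketch of that call in isolation, assuming a made-up row count n in place of len(df) and an arbitrary seed:

```python
import numpy as np

np.random.seed(0)  # arbitrary seed, only so the run is repeatable
n = 1000           # hypothetical row count, standing in for len(df)

# Draw n values from [True, False] with probabilities 0.8 and 0.2
mask = np.random.choice([True, False], p=[0.8, 0.2], size=n)

print(mask.dtype)  # bool
print(mask.sum())  # number of True values, near 800 but usually not exactly 800
```

As with the comparison against 0.8, each element is drawn independently, so the share of True values is 80% only on average.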
An even better approach, if your goal is to subset a dataframe, would be to use:
df.sample(frac=0.8)
To split a DataFrame 0.8/0.2:
df1 = df.sample(frac=0.8)
df2 = df.drop(df1.index)
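A runnable sketch of the same split, assuming a small made-up frame with a unique default index and a fixed random_state for repeatability (both are illustrative choices, not part of the original answer):

```python
import pandas as pd

df = pd.DataFrame({"x": range(10)})  # hypothetical frame, unique index 0..9

# Sample 80% of the rows without replacement; random_state fixes the draw
df1 = df.sample(frac=0.8, random_state=42)

# Everything not sampled, selected by index label
df2 = df.drop(df1.index)

print(len(df1), len(df2))  # 8 2
```

Unlike the boolean-mask approach, this gives an exact 80/20 split by row count. Note that df.drop works on index labels, so the split is only clean when the index has no duplicate labels; calling df.reset_index(drop=True) first is one way to ensure that.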