The Python Oracle

Why were pandas merges in Python faster than data.table merges in R in 2012?

--

Music by Eric Matyas
https://www.soundimage.org
Track title: Puzzle Game Looping

--

Chapters
00:00 Question
00:55 Accepted answer (Score 128)
05:36 Answer 2 (Score 191)
06:59 Answer 3 (Score 39)
08:18 Answer 4 (Score 13)
11:29 Thank you

--

Full question
https://stackoverflow.com/questions/8991...

Question links:
[pandas]: http://pandas.sourceforge.net/
[this benchmark]: http://wesmckinney.com/blog/some-pandas-.../
[data.table]: http://cran.r-project.org/web/packages/d...
[R code]: https://github.com/wesm/pandas/blob/mast...
[Python code]: https://github.com/wesm/pandas/blob/mast...

Accepted answer links:
[NEWS]: https://r-forge.r-project.org/scm/viewvc...

Answer 2 links:
[a fast hash table implementation - klib]: https://github.com/attractivechaos/klib
[Cython]: http://cython.org/
[A look inside pandas design and development]: http://wesmckinney.com/blog/nycpython-11.../

Answer 3 links:
https://github.com/Rdatatable/data.table...
https://github.com/szilard/benchm-databa...
[image]: https://i.stack.imgur.com/dAnZO.png

Answer 4 links:
[db-benchmark]: https://h2oai.github.io/db-benchmark/

--

Content licensed under CC BY-SA
https://meta.stackexchange.com/help/lice...

--

Tags
#python #r #join #datatable #pandas