The Python Oracle

Why is it string.join(list) instead of list.join(string)?

Full question
https://stackoverflow.com/questions/4938...

--

Content licensed under CC BY-SA
https://meta.stackexchange.com/help/lice...

--

Tags
#python #string #list



ACCEPTED ANSWER

Score 1452


It's because any iterable can be joined (e.g., list, tuple, dict, set), but its contents and the "joiner" must be strings.

For example:

>>> '_'.join(['welcome', 'to', 'stack', 'overflow'])
'welcome_to_stack_overflow'
>>> '_'.join(('welcome', 'to', 'stack', 'overflow'))
'welcome_to_stack_overflow'

Using something other than strings will raise the following error:

TypeError: sequence item 0: expected str instance, int found
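
To make the point above concrete, here is a minimal sketch showing the same separator joining several kinds of iterables, plus the usual workaround for non-string items. The variable names are illustrative only and do not come from the original answer.

```python
# Any iterable of strings can be joined, not just lists.
words = ['welcome', 'to', 'stack', 'overflow']

from_list = '_'.join(words)
from_tuple = '_'.join(tuple(words))
from_generator = '_'.join(w for w in words)

# Joining a dict iterates over its keys.
from_dict = '_'.join({'a': 1, 'b': 2})

# Non-string items raise TypeError, so convert them explicitly first.
numbers = [1, 2, 3]
try:
    '-'.join(numbers)
    error_message = None
except TypeError as exc:
    error_message = str(exc)

converted = '-'.join(map(str, numbers))
print(from_list, from_dict, converted)
```

Converting with map(str, ...) is one common choice; a generator expression such as (str(n) for n in numbers) works the same way.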




ANSWER 2

Score 461


This was discussed in the "String methods... finally" thread in the Python-Dev archive, and was accepted by Guido. The thread began in June 1999, and str.join was included in Python 1.6, which was released in September 2000 (and supported Unicode). Python 2.0, which also supported str methods including join, was released in October 2000.

  • There were four options proposed in this thread:
    • separator.join(items)
    • items.join(separator)
    • items.reduce(separator)
    • join as a built-in function
  • Guido wanted to support not only lists and tuples, but all sequences/iterables.
  • items.reduce(separator) is difficult for newcomers.
  • items.join(separator) introduces an unexpected dependency from sequences to str/unicode.
  • join() as a free-standing built-in function would support only specific data types, so it would not be a good use of the built-in namespace. If join() were to support many data types, creating an optimized implementation would be difficult: if implemented using the __add__ method, it would be O(n²).
  • The separator string (separator) should not be omitted. Explicit is better than implicit.
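
The trade-offs in the list above can be sketched roughly as follows. Countdown and join_with_add are hypothetical names made up for this illustration; they are not from the thread or the standard library. The sketch assumes that each + on strings builds a new string, which is what makes a generic __add__-based join quadratic in the general case.

```python
from functools import reduce


class Countdown:
    """Hypothetical user-defined iterable; it has no join method of its own."""

    def __init__(self, start):
        self.start = start

    def __iter__(self):
        return (str(n) for n in range(self.start, 0, -1))


def join_with_add(separator, items):
    """Naive generic join built on +; every step copies the accumulated result."""
    return reduce(lambda acc, item: acc + separator + item, items)


# separator.join(items) works with any iterable of strings, so Countdown
# needs no dependency on str and str needs no knowledge of Countdown.
via_method = ', '.join(Countdown(3))

# The +-based version gives the same result but re-copies the growing string.
via_add = join_with_add(', ', Countdown(3))
print(via_method, via_add)
```

Note that this reduce-based helper raises TypeError on an empty iterable, whereas str.join returns an empty string.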

Here are some additional thoughts (my own, and my friend's):

  • Unicode support was coming, but it was not final. At that time, UTF-8 seemed the most likely candidate to replace UCS-2/-4. To calculate the total buffer length for UTF-8 strings, the method needs to know the character encoding.
  • At that time, Python had already decided on a common sequence interface rule where a user could create a sequence-like (iterable) class. But Python didn't support extending built-in types until 2.2, so it was difficult to provide a basic iterable class (which is mentioned in another comment).

Guido's decision is recorded in a historical mail, deciding on separator.join(items):

Funny, but it does seem right! Barry, go for it...
--Guido van Rossum




ANSWER 3

Score 82


I agree that it's counterintuitive at first, but there's a good reason. Join can't be a method of a list because:

  • it must work for different iterables too (tuples, generators, etc.)
  • it must have different behavior between different types of strings.

There are actually two join methods (Python 3.0):

>>> b"".join
<built-in method join of bytes object at 0x00A46800>
>>> "".join
<built-in method join of str object at 0x00A28D40>

If join were a method of a list, then it would have to inspect its arguments to decide which one of them to call. And you can't join bytes and str together, so the way they have it now makes sense.
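
As a small sketch of that last point (variable names are illustrative only): the type of the separator selects which join runs, and mixing the two types fails.

```python
text_parts = ['a', 'b']
byte_parts = [b'a', b'b']

# The separator's type decides which join method is used.
joined_text = '-'.join(text_parts)
joined_bytes = b'-'.join(byte_parts)

# A str separator cannot join bytes items.
try:
    '-'.join(byte_parts)
    mixing_allowed = True
except TypeError:
    mixing_allowed = False

print(joined_text, joined_bytes, mixing_allowed)
```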




ANSWER 4

Score 26


Think of it as the natural orthogonal operation to split.
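
A quick sketch of that relationship, assuming an explicit separator is passed to split (the variable names are made up for the example):

```python
sentence = 'welcome to stack overflow'

# split and join are both methods on str, taking the same separator.
parts = sentence.split(' ')
round_trip = ' '.join(parts)

print(parts)
print(round_trip == sentence)
```

The round trip is only guaranteed when split is given the same explicit separator; split() with no argument collapses runs of whitespace, so joining with a single space may not restore the original.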

I understand why it is applicable to anything iterable and so can't easily be implemented just on list.

For readability, I'd like to see it in the language, but I don't think that is actually feasible. If iterability were an interface, then join could be added to that interface; but it is just a convention, so there's no central way to add it to the set of things which are iterable.