What is the most efficient string concatenation method in Python?
--
Full question
https://stackoverflow.com/questions/1316...
--
Content licensed under CC BY-SA
https://meta.stackexchange.com/help/lice...
--
Tags
#python #string
ANSWER 1
Score 155
You may be interested in this: An Optimization Anecdote by Guido. Keep in mind, though, that it is an old article and predates things like ''.join (although I guess string.joinfields is more or less the same).
On the strength of that, the array module may be fastest if you can shoehorn your problem into it.  But ''.join is probably fast enough and has the benefit of being idiomatic and thus easier for other Python programmers to understand.
Finally, the golden rule of optimization: don't optimize unless you know you need to, and measure rather than guessing.
You can measure different methods using the timeit module. That can tell you which is fastest, instead of random strangers on the Internet making guesses.
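As a minimal sketch of that advice, the snippet below times two ways of building the same string with timeit.repeat. The sample data and the function names (parts, with_join, with_plus) are made up for illustration; substitute the code you actually care about.

```python
import timeit

# Hypothetical sample data: 1000 short strings.
parts = [str(i) for i in range(1000)]

def with_join():
    """Build the result in one call to str.join."""
    return ''.join(parts)

def with_plus():
    """Build the result by repeated += in a loop."""
    result = ''
    for p in parts:
        result += p
    return result

# repeat() returns one total per run; the minimum is usually the
# least noisy figure to compare.
for func in (with_join, with_plus):
    best = min(timeit.repeat(func, number=200, repeat=5))
    print('%-10s %.4f s per 200 calls' % (func.__name__, best))
```

The absolute numbers depend on the interpreter and the machine, so treat them as relative to each other only.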
ANSWER 2
Score 71
''.join(sequence_of_strings) is what usually works best – simplest and fastest.
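For readers who have not used it, here is a short illustration of the idiom. The variable names are placeholders; the one thing to remember is that every item must already be a string.

```python
words = ['spam', 'eggs', 'ham']

# The string you call join on is the separator; '' means "no separator".
joined = ''.join(words)
csv_line = ','.join(words)

# Non-string items must be converted first, e.g. with a generator expression.
digits = ''.join(str(n) for n in range(5))

print(joined)
print(csv_line)
print(digits)
```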
ANSWER 3
Score 42
It depends on what you're doing.
After Python 2.5, string concatenation with the + operator is pretty fast. If you're just concatenating a couple of values, using the + operator works best:
>>> import timeit
>>> x = timeit.Timer(stmt="'a' + 'b'")
>>> x.timeit()
0.039999961853027344
>>> x = timeit.Timer(stmt="''.join(['a', 'b'])")
>>> x.timeit()
0.76200008392333984
However, if you're putting together a string in a loop, you're better off using the list joining method:
>>> join_stmt = """
... joined_str = ''
... for i in xrange(100000):
...   joined_str += str(i)
... """
>>> x = timeit.Timer(join_stmt)
>>> x.timeit(100)
13.278000116348267
>>> list_stmt = """
... str_list = []
... for i in xrange(100000):
...   str_list.append(str(i))
... ''.join(str_list)
... """
>>> x = timeit.Timer(list_stmt)
>>> x.timeit(100)
12.401000022888184
...but notice that you have to be putting together a relatively high number of strings before the difference becomes noticeable.
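The timings above were taken on Python 2 (note the xrange). A rough Python 3 port that tries several sizes might look like the sketch below; the function names and the chosen sizes are my own, and the results will differ between interpreters and machines.

```python
import timeit

def build_with_plus(n):
    """Concatenate n number strings with += in a loop."""
    s = ''
    for i in range(n):
        s += str(i)
    return s

def build_with_join(n):
    """Collect n number strings in a list, then join once."""
    parts = []
    for i in range(n):
        parts.append(str(i))
    return ''.join(parts)

# Keep the total amount of work roughly constant across sizes.
for n in (10, 1000, 100000):
    runs = max(1, 100000 // n)
    t_plus = timeit.timeit(lambda: build_with_plus(n), number=runs)
    t_join = timeit.timeit(lambda: build_with_join(n), number=runs)
    print('n=%-7d +=: %.4f s   join: %.4f s' % (n, t_plus, t_join))
```

Comparing the rows for small and large n shows whether the gap matters for your workload.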
ANSWER 4
Score 29
As per John Fouhy's answer, don't optimize unless you have to, but if you're here and asking this question, it may be precisely because you have to.
In my case, I needed to assemble some URLs from string variables... fast. I noticed no one (so far) seems to be considering the string format method, so I thought I'd try that and, mostly for mild interest, I thought I'd toss the string interpolation operator in there for good measure.
To be honest, I didn't think either of these would stack up to a direct '+' operation or a ''.join(). But guess what? On my Python 2.7.5 system, the string interpolation operator rules them all and str.format() is the worst performer:
# concatenate_test.py
from __future__ import print_function
import timeit
domain = 'some_really_long_example.com'
lang = 'en'
path = 'some/really/long/path/'
iterations = 1000000
def meth_plus():
    '''Using + operator'''
    return 'http://' + domain + '/' + lang + '/' + path
def meth_join():
    '''Using ''.join()'''
    return ''.join(['http://', domain, '/', lang, '/', path])
def meth_form():
    '''Using string.format'''
    return 'http://{0}/{1}/{2}'.format(domain, lang, path)
def meth_intp():
    '''Using string interpolation'''
    return 'http://%s/%s/%s' % (domain, lang, path)
plus = timeit.Timer(stmt="meth_plus()", setup="from __main__ import meth_plus")
join = timeit.Timer(stmt="meth_join()", setup="from __main__ import meth_join")
form = timeit.Timer(stmt="meth_form()", setup="from __main__ import meth_form")
intp = timeit.Timer(stmt="meth_intp()", setup="from __main__ import meth_intp")
plus.val = plus.timeit(iterations)
join.val = join.timeit(iterations)
form.val = form.timeit(iterations)
intp.val = intp.timeit(iterations)
min_val = min([plus.val, join.val, form.val, intp.val])
print('plus %0.12f (%0.2f%% as fast)' % (plus.val, (100 * min_val / plus.val), ))
print('join %0.12f (%0.2f%% as fast)' % (join.val, (100 * min_val / join.val), ))
print('form %0.12f (%0.2f%% as fast)' % (form.val, (100 * min_val / form.val), ))
print('intp %0.12f (%0.2f%% as fast)' % (intp.val, (100 * min_val / intp.val), ))
The results:
# Python 2.7 concatenate_test.py
plus 0.360787868500 (90.81% as fast)
join 0.452811956406 (72.36% as fast)
form 0.502608060837 (65.19% as fast)
intp 0.327636957169 (100.00% as fast)
If I use a shorter domain and shorter path, interpolation still wins out. The difference is more pronounced, though, with longer strings.
Now that I had a nice test script, I also tested under Python 2.6, 3.3, 3.4 and 3.5; here are the results. In Python 2.6, the plus operator is the fastest! On Python 3, join wins out. Note: these tests are very repeatable on my system. So 'plus' is always fastest on 2.6, 'intp' is always fastest on 2.7 and 'join' is always fastest on Python 3.x.
# Python 2.6 concatenate_test.py
plus 0.338213920593 (100.00% as fast)
join 0.427221059799 (79.17% as fast)
form 0.515371084213 (65.63% as fast)
intp 0.378169059753 (89.43% as fast)
# Python 3.3 concatenate_test.py
plus 0.409130576998 (89.20% as fast)
join 0.364938726001 (100.00% as fast)
form 0.621366866995 (58.73% as fast)
intp 0.419064424001 (87.08% as fast)
# Python 3.4 concatenate_test.py
plus 0.481188605998 (85.14% as fast)
join 0.409673971997 (100.00% as fast)
form 0.652010936996 (62.83% as fast)
intp 0.460400978001 (88.98% as fast)
# Python 3.5 concatenate_test.py
plus 0.417167026084 (93.47% as fast)
join 0.389929617057 (100.00% as fast)
form 0.595661019906 (65.46% as fast)
intp 0.404455224983 (96.41% as fast)
Lessons learned:
- Sometimes, my assumptions are dead wrong.
- Test against the system environment you'll be running in production.
- String interpolation isn't dead yet!
 
tl;dr:
- If you're using Python 2.6, use the '+' operator.
- If you're using Python 2.7, use the '%' operator.
- If you're using Python 3.x, use ''.join().
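
Since the winner changed from one Python version to the next, it is worth re-running the comparison on the interpreter you actually deploy. The sketch below is a condensed variant of the test script, not the original: the rank helper and the candidates dict are names I made up, and it takes the best of several repeats rather than a single run.

```python
import sys
import timeit

domain = 'some_really_long_example.com'
lang = 'en'
path = 'some/really/long/path/'

# The same four expressions as in the test script, keyed by short name.
candidates = {
    'plus': lambda: 'http://' + domain + '/' + lang + '/' + path,
    'join': lambda: ''.join(['http://', domain, '/', lang, '/', path]),
    'form': lambda: 'http://{0}/{1}/{2}'.format(domain, lang, path),
    'intp': lambda: 'http://%s/%s/%s' % (domain, lang, path),
}

def rank(funcs, number=100000, repeat=3):
    """Return (name, best_time) pairs sorted fastest first."""
    timings = {
        name: min(timeit.repeat(func, number=number, repeat=repeat))
        for name, func in funcs.items()
    }
    return sorted(timings.items(), key=lambda item: item[1])

results = rank(candidates)
print('Python %d.%d' % sys.version_info[:2])
for name, seconds in results:
    print('%s %.6f' % (name, seconds))
```

Whatever ordering this prints applies only to the machine and version it ran on.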