The Python Oracle

Python: Finding differences between elements of a list

--------------------------------------------------
Rise to the top 3% as a developer or hire one of them at Toptal: https://topt.al/25cXVn
--------------------------------------------------

Music by Eric Matyas
https://www.soundimage.org
Track title: Puzzle Game Looping

--

Chapters
00:00 Python: Finding Differences Between Elements Of A List
00:27 Accepted Answer Score 204
00:35 Answer 2 Score 151
00:49 Answer 3 Score 43
00:59 Answer 4 Score 17
02:33 Thank you

--

Full question
https://stackoverflow.com/questions/2400...

--

Content licensed under CC BY-SA
https://meta.stackexchange.com/help/lice...

--

Tags
#python #list

#avk47



ACCEPTED ANSWER

Score 204


>>> t
[1, 3, 6]
>>> [j-i for i, j in zip(t[:-1], t[1:])]  # or use itertools.izip in py2k
[2, 3]



ANSWER 2

Score 151


The other answers are correct but if you're doing numerical work, you might want to consider numpy. Using numpy, the answer is:

v = numpy.diff(t)



ANSWER 3

Score 43


If you don't want to use numpy nor zip, you can use the following solution:

>>> t = [1, 3, 6]
>>> v = [t[i+1]-t[i] for i in range(len(t)-1)]
>>> v
[2, 3]



ANSWER 4

Score 17


You can use itertools.tee and zip to efficiently build the result:

from itertools import tee
# python2 only:
#from itertools import izip as zip

def differences(seq):
    iterable, copied = tee(seq)
    next(copied)
    for x, y in zip(iterable, copied):
        yield y - x

Or using itertools.islice instead:

from itertools import islice

def differences(seq):
    nexts = islice(seq, 1, None)
    for x, y in zip(seq, nexts):
        yield y - x

You can also avoid using the itertools module:

def differences(seq):
    iterable = iter(seq)
    prev = next(iterable)
    for element in iterable:
        yield element - prev
        prev = element

All these solution work in constant space if you don't need to store all the results and support infinite iterables.


Here are some micro-benchmarks of the solutions:

In [12]: L = range(10**6)

In [13]: from collections import deque
In [15]: %timeit deque(differences_tee(L), maxlen=0)
10 loops, best of 3: 122 ms per loop

In [16]: %timeit deque(differences_islice(L), maxlen=0)
10 loops, best of 3: 127 ms per loop

In [17]: %timeit deque(differences_no_it(L), maxlen=0)
10 loops, best of 3: 89.9 ms per loop

And the other proposed solutions:

In [18]: %timeit [x[1] - x[0] for x in zip(L[1:], L)]
10 loops, best of 3: 163 ms per loop

In [19]: %timeit [L[i+1]-L[i] for i in range(len(L)-1)]
1 loops, best of 3: 395 ms per loop

In [20]: import numpy as np

In [21]: %timeit np.diff(L)
1 loops, best of 3: 479 ms per loop

In [35]: %%timeit
    ...: res = []
    ...: for i in range(len(L) - 1):
    ...:     res.append(L[i+1] - L[i])
    ...: 
1 loops, best of 3: 234 ms per loop

Note that:

  • zip(L[1:], L) is equivalent to zip(L[1:], L[:-1]) since zip already terminates on the shortest input, however it avoids a whole copy of L.
  • Accessing the single elements by index is very slow because every index access is a method call in python
  • numpy.diff is slow because it has to first convert the list to a ndarray. Obviously if you start with an ndarray it will be much faster:

    In [22]: arr = np.array(L)
    
    In [23]: %timeit np.diff(arr)
    100 loops, best of 3: 3.02 ms per loop