Peak-finding algorithm for Python/SciPy
--
Full question
https://stackoverflow.com/questions/1713...
Question links:
[image]: https://i.stack.imgur.com/eJvFQ.jpg
[quadratic interpolation]: https://ccrma.stanford.edu/~jos/sasp/Qua...
[translated a MATLAB script]: http://gist.github.com/250860
[created a better version]: https://gist.github.com/1178136
Accepted answer links:
[scipy.signal.find_peaks]: https://docs.scipy.org/doc/scipy/referen...
[(topographic) prominence]: https://en.wikipedia.org/wiki/Topographi...
[image]: https://i.stack.imgur.com/c2xE7.png
[image]: https://i.stack.imgur.com/zFU2l.png
Answer 2 links:
[this]: http://www.ncbi.nlm.nih.gov/pmc/articles.../
Answer 3 links:
[http://docs.scipy.org/doc/scipy/referenc...]: http://docs.scipy.org/doc/scipy/referenc...
Answer 4 links:
[https://github.com/MonsieurV/py-findpeak...]: https://github.com/MonsieurV/py-findpeak...
[detect_peaks function]: http://nbviewer.ipython.org/github/demot...
[image]: https://i.stack.imgur.com/flY1u.png
--
Content licensed under CC BY-SA
https://meta.stackexchange.com/help/lice...
--
Tags
#python #scipy #fft #houghtransform
ACCEPTED ANSWER
Score 183
The function scipy.signal.find_peaks, as its name suggests, is useful for this. To get a good peak extraction, though, you need to understand its parameters width, threshold, distance and, above all, prominence.
According to my tests and the documentation, prominence is the most useful concept for keeping the good peaks and discarding the noisy ones.
What is (topographic) prominence? It is "the minimum height necessary to descend to get from the summit to any higher terrain" (the original answer illustrates this with an image).
The idea is: the higher the prominence, the more "important" the peak is.
Test:
I used a (noisy) frequency-varying sinusoid on purpose because it shows many of the difficulties.
The width parameter is not very useful here: if you set the minimum width too high, it cannot track the closely spaced peaks in the high-frequency part, and if you set it too low, you get many unwanted peaks in the left part of the signal. distance has the same problem. threshold only compares a sample with its direct neighbours, which is not useful here. prominence is the one that gives the best solution.
Note that you can combine several of these parameters!
Code:
import numpy as np
import matplotlib.pyplot as plt
from scipy.signal import find_peaks

# Noisy sinusoid whose frequency increases from left to right
x = np.sin(2*np.pi*(2**np.linspace(2,10,1000))*np.arange(1000)/48000) + np.random.normal(0, 1, 1000) * 0.15

# One call per parameter, so each can be judged on its own
peaks, _ = find_peaks(x, distance=20)
peaks2, _ = find_peaks(x, prominence=1) # BEST!
peaks3, _ = find_peaks(x, width=20)
peaks4, _ = find_peaks(x, threshold=0.4) # Required vertical distance to its direct neighbouring samples, pretty useless

# 2x2 grid: detected peaks drawn over the signal, one panel per parameter
plt.subplot(2, 2, 1)
plt.plot(peaks, x[peaks], "xr"); plt.plot(x); plt.legend(['distance'])
plt.subplot(2, 2, 2)
plt.plot(peaks2, x[peaks2], "ob"); plt.plot(x); plt.legend(['prominence'])
plt.subplot(2, 2, 3)
plt.plot(peaks3, x[peaks3], "vg"); plt.plot(x); plt.legend(['width'])
plt.subplot(2, 2, 4)
plt.plot(peaks4, x[peaks4], "xk"); plt.plot(x); plt.legend(['threshold'])
plt.show()
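Since the parameters can be combined, here is a small sketch of what that might look like on the same kind of signal. The seed and the distance value of 5 are arbitrary choices for illustration, not recommendations.

```python
import numpy as np
from scipy.signal import find_peaks

rng = np.random.default_rng(0)  # fixed seed, only so the sketch is repeatable
n = np.arange(1000)
x = np.sin(2 * np.pi * (2 ** np.linspace(2, 10, 1000)) * n / 48000)
x = x + rng.normal(0, 0.15, 1000)

# Prominence alone.
only_prominence, _ = find_peaks(x, prominence=1)

# Prominence plus a minimum spacing of 5 samples between kept peaks.
# The second return value holds the properties computed for each kept peak.
combined, props = find_peaks(x, prominence=1, distance=5)

print(len(only_prominence), len(combined))
print(props["prominences"][:5])
```

Every peak returned by the combined call satisfies all the conditions at once, so adding a condition can only keep the same number of peaks or fewer.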
ANSWER 2
Score 53
I'm looking at a similar problem, and I've found that some of the best references come from chemistry (from peak finding in mass-spec data). For a good, thorough review of peak-finding algorithms, read this. It is one of the clearest reviews of peak-finding techniques that I've run across. (Wavelets are the best for finding peaks of this sort in noisy data.)
It looks like your peaks are clearly defined and aren't hidden in the noise. That being the case, I'd recommend using smooth Savitzky-Golay derivatives to find the peaks. (If you just differentiate the data above, you'll have a mess of false positives.) This is a very effective technique and is pretty easy to implement (you do need a matrix class with basic operations). If you simply find the zero crossings of the first S-G derivative, I think you'll be happy.
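As a rough sketch of that idea, assuming SciPy is available, scipy.signal.savgol_filter can return the smoothed first derivative directly (deriv=1), so no hand-written matrix code is needed. The test signal, window length and polynomial order below are made-up values for illustration; a maximum is taken to be where the derivative goes from positive to non-positive.

```python
import numpy as np
from scipy.signal import savgol_filter

rng = np.random.default_rng(0)  # fixed seed, only so the sketch is repeatable
t = np.linspace(0, 1, 1000)

# Three cycles of a sine plus mild noise: true maxima at t = 1/12, 5/12, 9/12.
y = np.sin(2 * np.pi * 3 * t) + rng.normal(0, 0.02, t.size)

# Smoothed first derivative (per sample) from a local quadratic fit.
dy = savgol_filter(y, window_length=51, polyorder=2, deriv=1)

# A peak is where the derivative crosses zero going downwards.
crossings = np.where((dy[:-1] > 0) & (dy[1:] <= 0))[0]
print(t[crossings])
```

Differentiating y directly with np.diff would change sign at almost every noisy sample, which is the mess of false positives mentioned above. The window length trades noise rejection against the ability to resolve narrow or closely spaced peaks.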
ANSWER 3
Score 23
There is a function in SciPy named scipy.signal.find_peaks_cwt which sounds like it is suitable for your needs. However, I don't have experience with it, so I cannot recommend it.
http://docs.scipy.org/doc/scipy/reference/generated/scipy.signal.find_peaks_cwt.html
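For reference, a minimal call might look like the sketch below, on a noise-free half sine similar to the example in the SciPy documentation. The widths range is an assumption about the expected peak widths in samples and would need tuning for real data.

```python
import numpy as np
from scipy.signal import find_peaks_cwt

# Half a sine period: a single broad peak near x = pi/2.
xs = np.arange(0, np.pi, 0.05)
data = np.sin(xs)

# widths: the range of peak widths (in samples) to look for.
peak_indices = find_peaks_cwt(data, np.arange(1, 10))
print(peak_indices, xs[peak_indices])
```

Unlike find_peaks, this function matches the signal against wavelets of several widths, so the reported index is an estimate of the peak position rather than an exact local maximum.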
ANSWER 4
Score 22
For those not sure which peak-finding algorithm to use in Python, here is a quick overview of the alternatives: https://github.com/MonsieurV/py-findpeaks
Wanting an equivalent to the MATLAB findpeaks function myself, I've found that the detect_peaks function from Marcos Duarte is a good fit.
Pretty easy to use:
import numpy as np
from vector import vector, plot_peaks
from libs import detect_peaks
print('Detect peaks with minimum height and distance filters.')
indexes = detect_peaks.detect_peaks(vector, mph=7, mpd=2)
print('Peaks are: %s' % (indexes))
Which will give you a plot of the detected peaks (image: https://i.stack.imgur.com/flY1u.png).
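If you would rather stay within SciPy, a similar filter can be sketched with find_peaks. This assumes mph and mpd mean minimum peak height and minimum peak distance, as the printed message suggests, and it is not claimed to reproduce detect_peaks exactly (edge and tie handling may differ). The data are made up for illustration.

```python
import numpy as np
from scipy.signal import find_peaks

# Made-up signal; local maxima at indices 2, 6, 9, 11, 13 and 16.
signal = np.array([0, 6, 25, 20, 15, 8, 15, 6, 0, 6,
                   0, 10, 8, 13, 8, 3, 20, 7, 0])

# Minimum height only: drops the low peak at index 9.
by_height, _ = find_peaks(signal, height=7)
print(by_height)  # [ 2  6 11 13 16]

# Minimum height plus at least 3 samples between kept peaks:
# the lower of the two close peaks (index 11) is dropped.
peaks, _ = find_peaks(signal, height=7, distance=3)
print(peaks)      # [ 2  6 13 16]
```

When two peaks are closer than distance, find_peaks keeps the higher one.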


