
Savitzky–Golay filter

From Wikipedia, the free encyclopedia

This is an old revision of this page, as edited by Volker Siegel (talk | contribs) at 18:38, 8 February 2014 (→‎Convolution and correlation: fixing syntax error in formula causing "Failed to parse(unknown error)" message). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

Animation showing smoothing being applied, passing through the data from left to right. The red line represents the local polynomial being used to fit a sub-set of the data. The smoothed values are shown as circles.

A Savitzky–Golay filter is a digital filter that can be applied to a set of digital data points for the purpose of smoothing the data, that is, to increase the signal-to-noise ratio without greatly distorting the signal. This is achieved, in a process known as convolution, by fitting successive sub-sets of adjacent data points with a low-degree polynomial by the method of linear least squares. When the data points are equally spaced, an analytical solution to the least-squares equations can be found, in the form of a single set of "convolution coefficients" that can be applied to all data sub-sets, to give estimates of the smoothed signal (or derivatives of the smoothed signal) at the central point of each sub-set. The method, based on established mathematical procedures,[1][2] was popularized by Abraham Savitzky and Marcel J. E. Golay, who published tables of convolution coefficients for various polynomials and sub-set sizes in 1964.[3][4] Some errors in the tables have been corrected.[5] The method has been extended for the treatment of 2- and 3-dimensional data.

Savitzky and Golay's paper is one of the most widely cited papers in the journal Analytical Chemistry[6] and is classed by that journal as one of its "10 seminal papers" saying "it can be argued that the dawn of the computer-controlled analytical instrument can be traced to this article".[7]

Applications

The data consists of a set of n {xj, yj} points (j = 1, ..., n), where xj is an independent variable and yj is an observed value. They are treated with a set of m convolution coefficients, Ci, according to the expression

$$Y_j = \sum_{i=-(m-1)/2}^{(m-1)/2} C_i\, y_{j+i}, \qquad \frac{m+1}{2} \le j \le n - \frac{m-1}{2}$$

It is easy to apply this formula in a spreadsheet. Selected convolution coefficients are shown in the tables, below. For example, for smoothing by a 5-point quadratic polynomial, m = 5, i = −2, −1, 0, 1, 2 and the jth smoothed data point, Yj, is given by

$$Y_j = \frac{1}{35}\left(-3y_{j-2} + 12y_{j-1} + 17y_j + 12y_{j+1} - 3y_{j+2}\right),$$

where C−2 = −3/35, C−1 = 12/35, etc. There are numerous applications of smoothing, which is performed primarily to make the data appear to be less noisy than it really is. The following are applications of numerical differentiation of data.[8]

(1) Synthetic Lorentzian + noise (blue) and 1st. derivative (green)
(2) Titration curve (blue) for malonic acid and 2nd. derivative (green). The part in the light blue box is magnified 10 times
(3) Lorentzian on exponential baseline (blue) and 2nd. derivative (green)
(4) Sum of two Lorentzians (blue) and 2nd. derivative (green)
(5) 4th. derivative of the sum of two Lorentzians
  1. Location of maxima and minima in experimental data curves. This was the application that first motivated Savitzky.[4] The first derivative of a function is zero at a maximum or minimum. The diagram shows data points belonging to a synthetic Lorentzian curve, with added noise (blue diamonds). Data are plotted on a scale of half width, relative to the peak maximum at zero. The smoothed curve (red line) and 1st. derivative (green) were calculated with 7-point cubic Savitzky–Golay filters. Linear interpolation of the first derivative values at positions either side of the zero-crossing gives the position of the peak maximum. 3rd. derivatives can also be used for this purpose.
  2. Location of an end-point in a titration curve. An end-point is an inflection point where the second derivative of the function is zero.[9] The titration curve for malonic acid illustrates the power of the method. The first end-point at 4 ml is barely visible, but the second derivative allows its value to be easily determined by linear interpolation to find the zero crossing.
  3. Baseline flattening. In analytical chemistry it is sometimes necessary to measure the height of an absorption band against a curved baseline.[10] Because the curvature of the baseline is much less than the curvature of the absorption band, the second derivative effectively flattens the baseline. Three measures of the derivative height, which is proportional to the absorption band height, are the "peak-to-valley" distances h1 and h2, and the height from baseline, h3.[11]
  4. Resolution enhancement in spectroscopy. Bands in the second derivative of a spectroscopic curve are narrower than the bands in the spectrum: they have reduced half-width. This allows partially overlapping bands to be "resolved" into separate (negative) peaks.[12] The diagram illustrates how this may be used also for chemical analysis, using measurement of "peak-to-valley" distances. In this case the valleys are a property of the 2nd. derivative of a Lorentzian. (x-axis position is relative to the position of the peak maximum on a scale of half width at half height).
  5. Resolution enhancement with 4th. derivative (positive peaks). The minima are a property of the 4th derivative of a Lorentzian.
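Application 1 above (locating a maximum from the zero-crossing of the first derivative) can be sketched in Python with NumPy. This is an illustrative sketch, not the procedure used to produce the figure: the 7-point cubic first-derivative coefficients are taken from the table in the Appendix, while the test signal, the noise level, the random seed and the helper names `sg_first_derivative` and `peak_position` are assumptions made for this example.

```python
import numpy as np

# 7-point cubic/quartic first-derivative coefficients for i = -3..3
# (values and normalisation 252 taken from the Appendix table).
C1 = np.array([22, -67, -58, 0, 58, 67, -22]) / 252.0


def sg_first_derivative(y, h):
    """First derivative at the interior points; 3 points are lost at each end."""
    # np.correlate applies the kernel without reversing it.
    return np.correlate(y, C1, mode="valid") / h


def peak_position(x, y):
    """Locate a maximum from the zero-crossing of the first derivative."""
    h = x[1] - x[0]
    d = sg_first_derivative(y, h)
    xc = x[3:-3]  # abscissae of the points where d is defined
    # Brackets where the derivative changes from positive to non-positive.
    idx = np.where((d[:-1] > 0) & (d[1:] <= 0))[0]
    # Noise can create spurious crossings in the tails, so keep the bracket
    # closest to the largest raw data value (an assumption of this sketch).
    k = idx[np.argmin(np.abs(xc[idx] - x[np.argmax(y)]))]
    # Linear interpolation between the points either side of the crossing.
    return xc[k] - d[k] * (xc[k + 1] - xc[k]) / (d[k + 1] - d[k])


# Synthetic Lorentzian (half width 1, maximum at x = 0.3) with added noise.
rng = np.random.default_rng(0)
x = np.linspace(-5.0, 5.0, 51)
y_clean = 1.0 / (1.0 + (x - 0.3) ** 2)
y_noisy = y_clean + rng.normal(0.0, 0.01, x.size)

print(peak_position(x, y_clean), peak_position(x, y_noisy))
```

Both calls should return a value close to the true maximum at 0.3. The same pattern, with second-derivative coefficients, would apply to the end-point search in application 2.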

Derivation of convolution coefficients

When the data points are equally spaced an analytical solution to the least-squares equations can be found.[2] This solution forms the basis of the convolution method of numerical smoothing and differentiation. Suppose that the data consists of a set of n {xj, yj} points (j = 1, ..., n), where x is an independent variable and yj is a datum value. A polynomial will be fitted by linear least squares to a set of m (an odd number) adjacent data points, each separated by an interval h. Firstly, a change of variable is made

$$z = \frac{x - \bar{x}}{h}$$

where $\bar{x}$ is the value of the central point. z takes the values $\frac{1-m}{2}, \ldots, 0, \ldots, \frac{m-1}{2}$ (e.g. m = 5 → z = −2, −1, 0, 1, 2).[note 1] The polynomial, of degree k, is defined as

$$Y = a_0 + a_1 z + a_2 z^2 + \cdots + a_k z^k$$ [note 2]

The coefficients a0, a1 etc. are obtained by solving the normal equations

$$\mathbf{a} = \left(\mathbf{J}^{T}\mathbf{J}\right)^{-1}\mathbf{J}^{T}\mathbf{y}$$

where the ith row of the Jacobian matrix, $\mathbf{J}$, has values $1, z_i, z_i^2, \ldots$

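For example, for a cubic polynomial fitted to 5 points, z = −2, −1, 0, 1, 2, the normal equations are solved as follows.

$$\mathbf{J} = \begin{pmatrix} 1 & -2 & 4 & -8 \\ 1 & -1 & 1 & -1 \\ 1 & 0 & 0 & 0 \\ 1 & 1 & 1 & 1 \\ 1 & 2 & 4 & 8 \end{pmatrix}$$

$$\mathbf{J}^{T}\mathbf{J} = \begin{pmatrix} m & \sum z & \sum z^2 & \sum z^3 \\ \sum z & \sum z^2 & \sum z^3 & \sum z^4 \\ \sum z^2 & \sum z^3 & \sum z^4 & \sum z^5 \\ \sum z^3 & \sum z^4 & \sum z^5 & \sum z^6 \end{pmatrix} = \begin{pmatrix} 5 & 0 & 10 & 0 \\ 0 & 10 & 0 & 34 \\ 10 & 0 & 34 & 0 \\ 0 & 34 & 0 & 130 \end{pmatrix}$$

Now, the normal equations can be factored into two separate sets of equations, by rearranging rows and columns, with

$$\left(\mathbf{J}^{T}\mathbf{J}\right)_{\text{even}} = \begin{pmatrix} 5 & 10 \\ 10 & 34 \end{pmatrix} \quad \text{and} \quad \left(\mathbf{J}^{T}\mathbf{J}\right)_{\text{odd}} = \begin{pmatrix} 10 & 34 \\ 34 & 130 \end{pmatrix}$$

The normal equations become

$$\begin{pmatrix} a_0 \\ a_2 \end{pmatrix} = \frac{1}{70}\begin{pmatrix} 34 & -10 \\ -10 & 5 \end{pmatrix}\begin{pmatrix} 1 & 1 & 1 & 1 & 1 \\ 4 & 1 & 0 & 1 & 4 \end{pmatrix}\begin{pmatrix} y_1 \\ y_2 \\ y_3 \\ y_4 \\ y_5 \end{pmatrix}$$

and

$$\begin{pmatrix} a_1 \\ a_3 \end{pmatrix} = \frac{1}{144}\begin{pmatrix} 130 & -34 \\ -34 & 10 \end{pmatrix}\begin{pmatrix} -2 & -1 & 0 & 1 & 2 \\ -8 & -1 & 0 & 1 & 8 \end{pmatrix}\begin{pmatrix} y_1 \\ y_2 \\ y_3 \\ y_4 \\ y_5 \end{pmatrix}$$

Multiplying out and removing common factors,

$$\begin{aligned} a_0 &= \tfrac{1}{35}\left(-3y_1 + 12y_2 + 17y_3 + 12y_4 - 3y_5\right) \\ a_1 &= \tfrac{1}{12}\left(y_1 - 8y_2 + 8y_4 - y_5\right) \\ a_2 &= \tfrac{1}{14}\left(2y_1 - y_2 - 2y_3 - y_4 + 2y_5\right) \\ a_3 &= \tfrac{1}{12}\left(-y_1 + 2y_2 - 2y_4 + y_5\right) \end{aligned}$$

The coefficients of y in these expressions are known as convolution coefficients. They are elements of the matrix

$$\mathbf{C} = \left(\mathbf{J}^{T}\mathbf{J}\right)^{-1}\mathbf{J}^{T}$$

In general,

$$(C \otimes y)_j = Y_j = \sum_{i=-(m-1)/2}^{(m-1)/2} C_i\, y_{j+i}, \qquad \frac{m+1}{2} \le j \le n - \frac{m-1}{2}$$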

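Tables of convolution coefficients, calculated in the same way for m up to 25, were published for the Savitzky–Golay filter in 1964.[3][5] The value of the central point, z = 0, is obtained from a single set of coefficients, a0 for smoothing, a1 for 1st. derivative etc. The numerical derivatives are obtained by differentiating Y. This means that the derivatives are calculated for the smoothed data curve. For a cubic polynomial

$$\begin{aligned} Y &= a_0 + a_1 z + a_2 z^2 + a_3 z^3 = a_0 \text{ at } z = 0 \\ \frac{dY}{dx} &= \frac{1}{h}\left(a_1 + 2a_2 z + 3a_3 z^2\right) = \frac{1}{h}a_1 \text{ at } z = 0 \\ \frac{d^2Y}{dx^2} &= \frac{1}{h^2}\left(2a_2 + 6a_3 z\right) = \frac{2}{h^2}a_2 \text{ at } z = 0 \\ \frac{d^3Y}{dx^3} &= \frac{6}{h^3}a_3 \end{aligned}$$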

In general, polynomials of degree (0 and 1),[note 3] (2 and 3), (4 and 5) etc. give the same coefficients for smoothing and even derivatives. Polynomials of degree (1 and 2), (3 and 4) etc. give the same coefficients for odd derivatives.
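The matrix of convolution coefficients, (JTJ)−1JT, can also be computed numerically. The following is a minimal NumPy sketch under the definitions of this section; the function name `sg_coefficient_matrix` is an assumption of this example, not part of any published routine.

```python
import numpy as np


def sg_coefficient_matrix(m, k):
    """Return C = (J^T J)^-1 J^T for an m-point window and a degree-k polynomial.

    Row r of C holds the convolution coefficients that give a_r.
    """
    half = (m - 1) // 2
    z = np.arange(-half, half + 1)             # z = -(m-1)/2 .. (m-1)/2
    J = np.vander(z, k + 1, increasing=True)   # columns 1, z, z^2, ..., z^k
    return np.linalg.solve(J.T @ J, J.T)


C = sg_coefficient_matrix(5, 3)
print(C[0] * 35)   # smoothing row scaled by its normalisation factor
print(C[1] * 12)   # first-derivative row scaled by its normalisation factor
```

Comparing `sg_coefficient_matrix(5, 2)[0]` with `sg_coefficient_matrix(5, 3)[0]` illustrates the statement that degrees 2 and 3 give the same smoothing coefficients.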

Note. It is implicit in the above treatment that the data points are given equal weight. Technically, the objective function

$$U = \sum_i w_i \left(Y_i - y_i\right)^2$$

being minimized has unit weights, wi = 1. When weights are not all the same the normal equations become

$$\mathbf{a} = \left(\mathbf{J}^{T}\mathbf{W}\mathbf{J}\right)^{-1}\mathbf{J}^{T}\mathbf{W}\mathbf{y}, \qquad W_{i,i} \ne 1.$$

These equations do not have an analytical solution. To use non-unit weights, the coefficients of the fitting polynomial must be calculated individually for each data sub-set, using local regression.
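A weighted fit for one data sub-set can be sketched as follows. This is only an illustration of the weighted normal equations a = (JTWJ)−1JTWy; the function name `weighted_local_fit` and the example weights are hypothetical.

```python
import numpy as np


def weighted_local_fit(y_window, weights, k):
    """Weighted least-squares polynomial coefficients a_0..a_k for one sub-set."""
    y_window = np.asarray(y_window, dtype=float)
    m = y_window.size
    half = (m - 1) // 2
    z = np.arange(-half, half + 1)
    J = np.vander(z, k + 1, increasing=True).astype(float)
    W = np.diag(np.asarray(weights, dtype=float))
    # a = (J^T W J)^-1 J^T W y
    return np.linalg.solve(J.T @ W @ J, J.T @ W @ y_window)


y = [2.0, 1.0, 4.0, 3.0, 7.0]
# Unit weights: a_0 equals the ordinary 5-point quadratic smoothing value.
print(weighted_local_fit(y, [1, 1, 1, 1, 1], 2)[0])
# Heavier weight near the centre (hypothetical weights).
print(weighted_local_fit(y, [1, 2, 4, 2, 1], 2)[0])
```

Because the weights enter the matrix being inverted, the calculation has to be repeated for every sub-set whenever the weights change from one sub-set to the next.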

Algebraic expressions

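It is not necessary always to use the Savitzky–Golay tables. The summations in the matrix JTJ can be evaluated in closed form,

$$\sum z^2 = \frac{m(m^2-1)}{12}, \qquad \sum z^4 = \frac{m(m^2-1)(3m^2-7)}{240}, \qquad \sum z^6 = \frac{m(m^2-1)(3m^4-18m^2+31)}{1344}$$

so that algebraic formulae can be derived for the convolution coefficients.[13][note 4] For a cubic polynomial the expressions are (with (1 − m)/2 ≤ i ≤ (m − 1)/2)

Smoothing: $C_{0i} = \dfrac{\left(3m^2 - 7 - 20i^2\right)/4}{m\left(m^2-4\right)/3}$
1st derivative: $C_{1i} = \dfrac{5\left(3m^4 - 18m^2 + 31\right)i - 28\left(3m^2-7\right)i^3}{m\left(m^2-1\right)\left(3m^4 - 39m^2 + 108\right)/15}$
2nd. derivative: $C_{2i} = \dfrac{12mi^2 - m\left(m^2-1\right)}{m^2\left(m^2-1\right)\left(m^2-4\right)/30}$
3rd. derivative: $C_{3i} = \dfrac{-\left(3m^2-7\right)i + 20i^3}{m\left(m^2-1\right)\left(3m^4 - 39m^2 + 108\right)/2520}$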

These functions are suitable for use with any curve that has an inflection point. Higher derivatives can be obtained. For example, a fourth derivative can be obtained by performing two passes of a second derivative function.[14]
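The closed-form expressions for a cubic polynomial can be checked against the tabulated values. The sketch below assumes the forms obtained by inverting the even and odd blocks of JTJ with the closed-form sums, scaled so that they give the derivative itself (that is, 2a2 and 6a3 for the 2nd and 3rd derivatives), which is the scaling of the tables in the Appendix; the function name `cubic_sg_coefficients` is an assumption of this example.

```python
import numpy as np


def cubic_sg_coefficients(m, derivative=0):
    """Closed-form convolution coefficients for a cubic polynomial, odd m >= 5."""
    half = (m - 1) // 2
    i = np.arange(-half, half + 1, dtype=float)
    if derivative == 0:
        return ((3 * m**2 - 7 - 20 * i**2) / 4) / (m * (m**2 - 4) / 3)
    if derivative == 1:
        num = 5 * (3 * m**4 - 18 * m**2 + 31) * i - 28 * (3 * m**2 - 7) * i**3
        return num / (m * (m**2 - 1) * (3 * m**4 - 39 * m**2 + 108) / 15)
    if derivative == 2:
        num = 12 * m * i**2 - m * (m**2 - 1)
        return num / (m**2 * (m**2 - 1) * (m**2 - 4) / 30)
    if derivative == 3:
        num = -(3 * m**2 - 7) * i + 20 * i**3
        return num / (m * (m**2 - 1) * (3 * m**4 - 39 * m**2 + 108) / 2520)
    raise ValueError("derivative must be 0, 1, 2 or 3")


# Multiplying by the tabulated normalisation factor recovers the integers.
print(cubic_sg_coefficients(7, 0) * 21)
print(cubic_sg_coefficients(7, 1) * 252)
```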

Use of orthogonal polynomials

An alternative to fitting m data points by a simple polynomial in the subsidiary variable, z, is to use orthogonal polynomials.

$$Y = b_0 P_0(z) + b_1 P_1(z) + \cdots + b_k P_k(z)$$

where P0 .. Pk is a set of mutually orthogonal polynomials of degree 0 .. k. Full details on how to obtain expressions for the orthogonal polynomials and the relationship between the coefficients b and a are given by Guest.[2] Expressions for the convolution coefficients are easily obtained because the normal equations matrix, JTJ, is a diagonal matrix: the sum over the data points of the product of any two different orthogonal polynomials is zero by virtue of their mutual orthogonality. Therefore, each non-zero element of its inverse is simply the reciprocal of the corresponding element in the normal equations matrix. The calculation is further simplified by using recursion to build orthogonal Gram polynomials. The whole calculation can be coded in a few lines of PASCAL, a computer language well-adapted for calculations involving recursion.[15]
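A recursive construction of this kind can be sketched in Python rather than PASCAL. The recurrence and the normalisation factor used below are assumptions of this sketch, written in the spirit of the approach of reference [15] and checked here only against the tabulated 5-point coefficients; the function names are this example's own. Note that m in this code is the half-width, so the window holds 2m + 1 points.

```python
from math import prod


def gram_poly(i, m, k, s):
    """Gram polynomial (s = 0) or its s-th derivative, degree k, at point i.

    The window covers the 2m + 1 points -m..m (m is the half-width here).
    """
    if k > 0:
        return ((4 * k - 2) / (k * (2 * m - k + 1))
                * (i * gram_poly(i, m, k - 1, s) + s * gram_poly(i, m, k - 1, s - 1))
                - ((k - 1) * (2 * m + k)) / (k * (2 * m - k + 1))
                * gram_poly(i, m, k - 2, s))
    return 1.0 if (k == 0 and s == 0) else 0.0


def gen_fact(a, b):
    """Generalised factorial a (a - 1) ... (a - b + 1)."""
    return prod(range(a - b + 1, a + 1))


def weight(i, t, m, n, s):
    """Coefficient of data point i for the s-th derivative evaluated at point t,
    for a polynomial of degree n fitted to 2m + 1 points."""
    return sum((2 * k + 1) * gen_fact(2 * m, k) / gen_fact(2 * m + k + 1, k + 1)
               * gram_poly(i, m, k, 0) * gram_poly(t, m, k, s)
               for k in range(n + 1))


# 5-point quadratic smoothing at the central point (t = 0).
print([weight(i, 0, 2, 2, 0) for i in range(-2, 3)])
```

Setting t to a value other than 0 gives coefficients for points away from the centre of the window, which is one way of treating the first and last points of a data series.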

Treatment of first and last points

Savitzky–Golay filters are most commonly used to obtain the smoothed or derivative value at the central point, z = 0, using a single set of convolution coefficients. (m − 1)/2 points at the start and end of the series cannot be calculated using this process. Various strategies can be employed to avoid this inconvenience.

  • The data could be artificially extended by adding, in reverse order, copies of the first (m − 1)/2 points at the beginning and copies of the last (m − 1)/2 points at the end. For instance, with m = 5, two points are added at the start and end of the data y1, ..., yn.
y3, y2, y1, ..., yn, yn−1, yn−2.
  • Looking again at the fitting polynomial, it is obvious that data can be calculated for all values of z by using all sets of convolution coefficients for a single polynomial, a0 .. ak.
For a cubic polynomial
$$Y = a_0 + a_1 z + a_2 z^2 + a_3 z^3$$
  • Convolution coefficients for the missing first and last points can also be easily obtained.[15] This is also equivalent to fitting the first (m+1)/2 points with the same polynomial, and similarly for the last points.
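Two of the strategies listed above can be sketched with NumPy. The function names are assumptions of this example. `smooth_with_ends` evaluates the polynomials fitted to the first and last windows at z ≠ 0, and `smooth_mirrored` extends the data by reflection using `np.pad` with `mode="reflect"`, which produces the sequence y3, y2, y1, ..., yn, yn−1, yn−2 for m = 5.

```python
import numpy as np


def smooth_with_ends(y, m=5, k=3):
    """Savitzky-Golay smoothing that also returns values for the end points."""
    y = np.asarray(y, dtype=float)
    half = (m - 1) // 2
    z = np.arange(-half, half + 1)
    J = np.vander(z, k + 1, increasing=True).astype(float)
    C = np.linalg.solve(J.T @ J, J.T)   # rows give a_0 .. a_k
    H = J @ C                           # row r: fitted value at z[r]

    out = np.empty_like(y)
    # Interior points: central row of H, i.e. the usual a_0 coefficients.
    out[half:-half] = np.correlate(y, H[half], mode="valid")
    # First and last (m - 1)/2 points: evaluate the polynomials fitted to the
    # first and last windows at z != 0.
    out[:half] = H[:half] @ y[:m]
    out[-half:] = H[-half:] @ y[-m:]
    return out


def smooth_mirrored(y, m=5, k=3):
    """Alternative: extend the data by reflection, then smooth as usual."""
    half = (m - 1) // 2
    padded = np.pad(np.asarray(y, dtype=float), half, mode="reflect")
    z = np.arange(-half, half + 1)
    J = np.vander(z, k + 1, increasing=True).astype(float)
    c0 = np.linalg.solve(J.T @ J, J.T)[0]
    return np.correlate(padded, c0, mode="valid")


x = np.arange(10.0)
y = 1 + 2 * x - 0.5 * x**2 + 0.1 * x**3
print(np.allclose(smooth_with_ends(y), y))   # a cubic is reproduced at every point
```

The two functions agree at the interior points and differ only in the first and last (m − 1)/2 values.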

Two-dimensional convolution coefficients

Two-dimensional smoothing and differentiation can also be applied to tables of data values, such as intensity values in a photographic image which is composed of a rectangular grid of pixels.[16][17] The trick is to transform part of the table into a row by a simple ordering of the indices of the pixels. Whereas the one-dimensional filter coefficients are found by fitting a polynomial in the subsidiary variable, z, to a set of m data points, the two-dimensional coefficients are found by fitting a polynomial in subsidiary variables v and w to a set of m × m data points. The following example, for a bicubic polynomial and m = 5, illustrates the process, which parallels the process for the one-dimensional case, above.[18]

The square of 25 data values, d1 to d25,

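| w \ v | −2 | −1 | 0 | 1 | 2 |
|---|---|---|---|---|---|
| −2 | d1 | d2 | d3 | d4 | d5 |
| −1 | d6 | d7 | d8 | d9 | d10 |
| 0 | d11 | d12 | d13 | d14 | d15 |
| 1 | d16 | d17 | d18 | d19 | d20 |
| 2 | d21 | d22 | d23 | d24 | d25 |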

becomes a vector when the rows are placed one after another.

d = (d1 ... d25)T

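The Jacobian has 10 columns, one for each of the parameters a00 to a03, and 25 rows, one for each pair of v and w values. Each row has the form

$$J_{\text{row}} = 1 \quad v \quad v^2 \quad v^3 \quad w \quad vw \quad v^2w \quad w^2 \quad vw^2 \quad w^3$$

The convolution coefficients are calculated as

$$\mathbf{C} = \left(\mathbf{J}^{T}\mathbf{J}\right)^{-1}\mathbf{J}^{T}$$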

The first row of C contains 25 convolution coefficients which can be multiplied with the 25 data values to provide a smoothed value for the central data point (13) of the 25.

A Matlab[19] routine for computing the coefficients is available. 3-dimensional filters can be obtained with a similar procedure.[16]
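The same calculation can be sketched in Python with NumPy. This is not the Matlab routine cited above. The function name `sg2d_smoothing_kernel` is an assumption of this example, and the column order of the Jacobian (1, v, v², v³, w, vw, v²w, w², vw², w³) is assumed so that the first row of C corresponds to a00.

```python
import numpy as np


def sg2d_smoothing_kernel(m=5):
    """m x m smoothing kernel for the 10-term cubic polynomial in v and w."""
    half = (m - 1) // 2
    z = np.arange(-half, half + 1, dtype=float)
    # Rows of the data square correspond to w, columns to v; flattening row by
    # row gives the ordering d1 .. d25 used in the text.
    w, v = np.meshgrid(z, z, indexing="ij")
    v, w = v.ravel(), w.ravel()
    # One Jacobian row per (v, w) pair.
    J = np.column_stack([np.ones_like(v), v, v**2, v**3,
                         w, v * w, v**2 * w,
                         w**2, v * w**2, w**3])
    C = np.linalg.solve(J.T @ J, J.T)   # shape (10, m*m)
    return C[0].reshape(m, m)           # first row gives a00


K = sg2d_smoothing_kernel(5)
print(K.sum())   # the smoothing coefficients should sum to one
```

Multiplying `K` element by element with a 5 × 5 block of data and summing gives the smoothed value for the central point (13) of the block.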

Some properties of convolution

  1. The sum of convolution coefficients for smoothing is equal to one. The sum of coefficients for odd derivatives is zero.[20]
  2. The sum of squared convolution coefficients for smoothing is equal to the value of the central coefficient.[21]
  3. Smoothing of a function leaves the area under the function unchanged.[20]
  4. Convolution of a symmetric function with even-derivative coefficients conserves the centre of symmetry.[20]
  5. Properties of derivative filters.[22]
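Properties 1 to 3 can be checked numerically for coefficients taken from the tables in the Appendix. The snippet is a small sketch; the Gaussian test signal used for the area check is an assumption of this example.

```python
import numpy as np

smooth9 = np.array([-21, 14, 39, 54, 59, 54, 39, 14, -21]) / 231.0   # degree 2/3
deriv1_7 = np.array([22, -67, -58, 0, 58, 67, -22]) / 252.0           # degree 3/4
deriv3_5 = np.array([-1, 2, 0, -2, 1]) / 2.0                          # degree 3/4

print(smooth9.sum())                      # property 1, smoothing
print(deriv1_7.sum(), deriv3_5.sum())     # property 1, odd derivatives
print((smooth9**2).sum(), smooth9[4])     # property 2

# Property 3: the summed signal (area) is unchanged when the whole of the
# convolved signal, including its tails, is retained.
x = np.linspace(-6.0, 6.0, 121)
y = np.exp(-x**2)
area_before = y.sum()
area_after = np.convolve(y, smooth9, mode="full").sum()
print(area_before, area_after)
```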

Signal distortion and noise reduction

It is inevitable that the signal will be distorted in the convolution process. From property 3 above, when data which has a peak is smoothed the peak height will be reduced and the half-width will be increased. Both the extent of the distortion and S/N (signal-to-noise ratio) improvement:

  • decrease as the degree of the polynomial increases
  • increase as the width, m of the convolution function increases

For example, if the noise in all data points is uncorrelated and has a constant standard deviation, σ, the standard deviation of the noise will be decreased by convolution with an m-point smoothing function to[21]

polynomial degree 0 or 1: $\sqrt{\dfrac{1}{m}}\,\sigma$ (moving average)
polynomial degree 2 or 3: $\sqrt{\dfrac{3(3m^2-7)}{4m(m^2-4)}}\,\sigma$.

Thus, with a 9-point linear function (moving average) two thirds of the noise is removed and with a 9-point quadratic/cubic smoothing function only about half the noise is removed. To remove two thirds of the noise with a quadratic/cubic smoothing function 21 points will be needed.
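The figures quoted above can be reproduced from the square root of the sum of squared smoothing coefficients, which is 1/m for the moving average and 3(3m² − 7)/(4m(m² − 4)) for a quadratic/cubic function. A minimal sketch follows; the function name `noise_factor` is an assumption of this example.

```python
import numpy as np


def noise_factor(m, degree):
    """Factor by which the noise standard deviation is multiplied."""
    if degree in (0, 1):
        return np.sqrt(1.0 / m)
    if degree in (2, 3):
        return np.sqrt(3.0 * (3 * m**2 - 7) / (4.0 * m * (m**2 - 4)))
    raise ValueError("only degrees 0 to 3 are covered by these formulae")


for m, degree in [(9, 1), (9, 2), (21, 2)]:
    print(m, degree, round(noise_factor(m, degree), 3))
```

The three factors are approximately 0.333, 0.505 and 0.328, in line with the statement that 21 points are needed for a quadratic/cubic function to match the 9-point moving average.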

Although the moving average (polynomial order 0 or 1) gives the best noise reduction it is unsuitable for smoothing data which has curvature over m points. A quadratic filter is unsuitable for getting a derivative of a data curve with an inflection point because a quadratic polynomial does not have one. The optimal choice of polynomial order and number of convolution coefficients will be a compromise between noise reduction and distortion.[23]

Multipass filters

One way to mitigate distortion and improve noise removal is to use a filter of smaller width and perform more than one convolution with it. For two passes of the same filter this is equivalent to one pass of a filter obtained by convolution of the original filter with itself.[24] The disadvantage of this process is that the equivalent filter width for n passes of an m-point function is n(m − 1) + 1 so multipassing is subject to greater end-effects. Nevertheless, multipassing has been used to great advantage. For instance, some 40–80 passes on data with a signal-to-noise ratio of only 5 gave useful results.[25] The noise reduction formulae given above do not apply because correlation between calculated data points increases with each pass.
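The equivalence between two passes and a single pass with the self-convolved filter can be checked directly. The sketch below uses the 5-point quadratic/cubic smoothing coefficients and arbitrary random test data (an assumption of this example), and keeps only the points that do not need end-point treatment.

```python
import numpy as np

c = np.array([-3, 12, 17, 12, -3]) / 35.0   # 5-point quadratic/cubic smoothing

rng = np.random.default_rng(1)
y = rng.normal(size=50)                     # arbitrary test data

two_passes = np.convolve(np.convolve(y, c, mode="valid"), c, mode="valid")
equivalent = np.convolve(c, c)              # the filter convolved with itself
one_pass = np.convolve(y, equivalent, mode="valid")

print(equivalent.size)                      # n(m - 1) + 1 with n = 2, m = 5
print(np.allclose(two_passes, one_pass))
```

The equivalent filter has 9 coefficients, so 4 points are lost at each end of the data instead of 2.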

Frequency characteristics of convolution filters

Fourier transform of the 9-point quadratic/cubic smoothing function

Convolution maps to multiplication in the Fourier co-domain. The discrete Fourier transform of a convolution filter is a real-valued function which can be represented as

$$FT(\theta) = \sum_{j=(1-m)/2}^{(m-1)/2} C_j \cos(j\theta)$$

θ runs from 0 to 180 degrees, after which the function merely repeats itself. The plot for a 9-point quadratic/cubic smoothing function is typical. At very low angle, the plot is almost flat, meaning that low-frequency components of the data will be virtually unchanged by the smoothing operation. As the angle increases the value decreases so that higher frequency components are more and more attenuated. This shows that the convolution filter can be described as a low-pass filter: the noise that is removed is primarily high-frequency noise and low-frequency noise passes through the filter.[26] Some high-frequency noise components are attenuated more than others, as shown by undulations in the Fourier transform at large angles. This can give rise to small oscillations in the smoothed data.[27]
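The transform of the 9-point quadratic/cubic smoothing function can be evaluated as the cosine sum of C_j cos(jθ) over j = −4, ..., 4. The sketch below assumes that form; the function name `transfer` is this example's own.

```python
import numpy as np

c = np.array([-21, 14, 39, 54, 59, 54, 39, 14, -21]) / 231.0
j = np.arange(-4, 5)


def transfer(theta):
    """FT(theta) = sum_j C_j cos(j theta), with theta in radians."""
    theta = np.atleast_1d(theta)
    return (c * np.cos(np.outer(theta, j))).sum(axis=1)


theta = np.radians(np.arange(0, 181, 30))
print(np.round(transfer(theta), 3))
```

The value at θ = 0 is the sum of the coefficients, 1, so the lowest frequencies pass unchanged; the small non-zero values at large angles correspond to the undulations mentioned above.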

Convolution and correlation

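Convolution affects the correlation between errors in the data. The effect of convolution can be expressed as a linear transformation.

$$\mathbf{Y} = \mathbf{C}\mathbf{y}, \qquad Y_j = \sum_i C_{ji}\, y_i$$

By the law of error propagation, the variance-covariance matrix of the data, A, will be transformed into B according to

$$\mathbf{B} = \mathbf{C}\mathbf{A}\mathbf{C}^{T}$$

To see how this applies in practice, consider the effect of a 5-point quadratic smoothing function for the first five calculated points, Y3 to Y7.

$$\begin{pmatrix} Y_3 \\ Y_4 \\ Y_5 \\ Y_6 \\ Y_7 \end{pmatrix} = \frac{1}{35}\begin{pmatrix} -3 & 12 & 17 & 12 & -3 & 0 & 0 & 0 & 0 \\ 0 & -3 & 12 & 17 & 12 & -3 & 0 & 0 & 0 \\ 0 & 0 & -3 & 12 & 17 & 12 & -3 & 0 & 0 \\ 0 & 0 & 0 & -3 & 12 & 17 & 12 & -3 & 0 \\ 0 & 0 & 0 & 0 & -3 & 12 & 17 & 12 & -3 \end{pmatrix}\begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_9 \end{pmatrix}$$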

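If one assumes that the data points have equal variance and that there is no correlation between them, A will be an identity matrix multiplied by a constant, σ2, the variance at each point. In that case, $\mathbf{B} = \sigma^2\mathbf{C}\mathbf{C}^{T}$. The covariance between calculated points i and j, in units of σ2, is obtained by vector multiplication of rows i and j of C, and the correlation coefficient, $\rho_{ij}$, follows on dividing by the square root of the product of the two variances.

$$\rho_{ij} = \frac{\sum_k C_{ik}C_{jk}}{\sqrt{\sum_k C_{ik}^2 \, \sum_k C_{jk}^2}}$$

In this example the product of adjacent rows is 0.27 and the variance factor of each calculated point is 17/35, so the correlation coefficient between adjacent points is about 0.56. Thus, the calculated values are correlated even when the observed values are not correlated. The same pattern applies to the rest of the calculated points. The correlation extends over m − 1 calculated points at a time.[28]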
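The propagation of errors through the 5-point quadratic smoothing function can be followed numerically. In this sketch, which assumes unit variance for the observed data, the product of adjacent rows of C comes to about 0.27, and dividing it by the variance factor of the calculated points (17/35) gives a normalised correlation coefficient of about 0.56.

```python
import numpy as np

c = np.array([-3, 12, 17, 12, -3]) / 35.0
n, m = 9, 5

# Row r of C produces Y_{r+3} from y_1 .. y_9.
C = np.zeros((n - m + 1, n))
for r in range(n - m + 1):
    C[r, r:r + m] = c

sigma2 = 1.0
A = sigma2 * np.eye(n)   # uncorrelated data with equal variance
B = C @ A @ C.T          # variance-covariance matrix of Y_3 .. Y_7

adjacent = B[0, 1]                                  # product of rows 1 and 2 of C
normalised = B[0, 1] / np.sqrt(B[0, 0] * B[1, 1])   # divided by the variances
print(round(adjacent, 3), round(normalised, 3))
```

The element `B[0, 4]` is small but not zero, since the first and fifth rows of C still overlap in one column; this is the sense in which the correlation extends over m − 1 calculated points.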

Notes

  1. ^ With even values of m, z will run from 1 − m to m − 1 in steps of 2
  2. ^ The simple moving average is a special case with k = 0, Y = a0. In this case all convolution coefficients are equal to 1/m.
  3. ^ Smoothing using the moving average is equivalent, with equally spaced points, to local fitting with a (sloping) straight line
  4. ^ The expressions given here are different from those of Madden, which are given in terms of the variable m' = (m − 1)/2.

References

  1. ^ Whittaker, E.T (1924). The Calculus Of Observations. Blackie And Son Limited. pp. 291–296. "Graduation Formulae obtained by fitting a Polynomial."
  2. ^ a b c Guest, P.G. (1961). "Chapter 7, Estimation of Polynomial Coefficients". Numerical Methods of Curve Fitting. Cambridge University Press.
  3. ^ a b Savitzky, A.; Golay, M.J.E. (1964). "Smoothing and Differentiation of Data by Simplified Least Squares Procedures". Analytical Chemistry. 36 (8): 1627–1639. doi:10.1021/ac60214a047.
  4. ^ a b Savitzky, Abraham (1989). "A Historic Collaboration". Analytical Chemistry. 61 (15): 921A–923A. doi:10.1021/ac00190a744.
  5. ^ a b Steinier, Jean; Termonia, Yves; Deltour, Jules (1972). "Smoothing and differentiation of data by simplified least square procedure". Analytical Chemistry. 44 (11): 1906–1909. doi:10.1021/ac60319a045.
  6. ^ Larive, Cynthia K.; Sweedler, Jonathan V. (2013). "Celebrating the 75th Anniversary of the ACS Division of Analytical Chemistry: A Special Collection of the Most Highly Cited Analytical Chemistry Papers Published between 1938 and 2012". Analytical Chemistry. 85 (0): 4201–4202. doi:10.1021/ac401048d.
  7. ^ Riordon, James; Zubritsky, Elizabeth; Newman, Alan (2000). "Top 10 Articles". Analytical Chemistry. 72 (9): 24 A–329 A. doi:10.1021/ac002801q.
  8. ^ Talsky, Gerhard. Derivative Spectrophotometry. Wiley. ISBN 3527282947.
  9. ^ Abbaspour, Abdolkarim; Khajehzadeha, Abdolreza (2012). "End point detection of precipitation titration by scanometry method without using indicator". Anal. Methods. 4: 923–932. doi:10.1039/C2AY05492B.
  10. ^ Li, N; Li, XY; Zou, XZ; Lin, LR; Li, YQ (2011). "A novel baseline-correction method for standard addition based derivative spectra and its application to quantitative analysis of benzo(a)pyrene in vegetable oil samples". Analyst. 136 (13): 2802–2810. doi:10.1039/c0an00751j.
  11. ^ Dixit, L. (1985). "Quantitative Analysis by Derivative Electronic Spectroscopy". Applied Spectroscopy Reviews. 21 (4): 311–418. doi:10.1080/05704928508060434.
  12. ^ Giese, Arthur T.; French, C. Stacey (1955). "The Analysis of Overlapping Spectral Absorption Bands by Derivative Spectrophotometry". Appl. Spectrosc. 9 (2): 78–96.
  13. ^ Madden, Hannibal H. (1978). "Comments on the Savitzky–Golay convolution method for least-squares-fit smoothing and differentiation of digital data" (PDF). Anal. Chem. 50 (9): 1383–1386. doi:10.1021/ac50031a048.
  14. ^ Gans, pp. 153–157, "Repeated smoothing and differentiation"
  15. ^ a b Gorry, A. (1990). "General least-squares smoothing and differentiation by the convolution (Savitzky–Golay) method" (PDF). Analytical Chemistry. 62 (6): 570–573. doi:10.1021/ac00205a007.
  16. ^ a b Thornley, David J. "Anisotropic Multidimensional Savitzky Golay kernels for Smoothing, Differentiation and Reconstruction" (PDF). Imperial College Department of Computing, Technical report 2066/8.
  17. ^ Ratzlaff, Kenneth L.; Johnson, Jean T. (1989). "Computation of two-dimensional polynomial least-squares convolution smoothing integers". Anal. Chem. 61 (11): 1303–1305.
  18. ^ Krumm, John. "Savitzky–Golay filters for 2D Images". Microsoft Research, Redmond.
  19. ^ Krumm, John. "Compute Savitzky−Golay coefficients". Microsoft Research, Redmond.
  20. ^ a b c Gans, Appendix 7
  21. ^ a b Ziegler, Horst (1981). "Properties of Digital Smoothing Polynomial (DISPO) Filters". Applied Spectroscopy. 35 (1): 88–92.
  22. ^ Luo, Jianwen (2005). "Properties of Savitzky–Golay digital differentiators" (PDF). Digital Signal Processing. 15: 122–136. doi:10.1016/j.dsp.2004.09.008.
  23. ^ Gans, Peter; Gill, J. Bernard (1983). "Examination of the Convolution Method for Numerical Smoothing and Differentiation of Spectroscopic Data in Theory and in Practice". Applied Spectroscopy. 37 (6): 515–520.
  24. ^ Gans, p. 153.
  25. ^ Procter, Peter M.A.; Sherwood (1980). "Smoothing of digital x-ray photoelectron spectra by an extended sliding least-squares approach". Anal. Chem. 52 (14): 2315–2321. doi:10.1021/ac50064a018.
  26. ^ Gans, p. 207
  27. ^ Bromba, Manfred U.A (1981). "Application hints for Savitzky–Golay digital smoothing filters". Anal. Chem. 53 (11): 1583–1586. doi:10.1021/ac00234a011.
  28. ^ Gans, p.157

Appendix

Tables of selected convolution coefficients

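Consider a set of data points (xj, yj), 1 ≤ j ≤ n. The Savitzky–Golay tables refer to the case that the step xj − xj−1 is constant, h. Examples of the use of the so-called convolution coefficients, with a cubic polynomial and a window size, m, of 5 points are as follows.

Smoothing: $Y_j = \dfrac{1}{35}\left(-3y_{j-2} + 12y_{j-1} + 17y_j + 12y_{j+1} - 3y_{j+2}\right)$;
1st derivative: $Y'_j = \dfrac{1}{12h}\left(y_{j-2} - 8y_{j-1} + 8y_{j+1} - y_{j+2}\right)$;
2nd. derivative: $Y''_j = \dfrac{1}{7h^2}\left(2y_{j-2} - y_{j-1} - 2y_j - y_{j+1} + 2y_{j+2}\right)$.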
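The three 5-point cubic formulae (smoothing, 1st derivative with normalisation 12h, 2nd derivative with normalisation 7h²) can be applied to equally spaced data as sketched below. The test data, a cubic polynomial sampled with step h = 0.5, are an assumption of this example, chosen because a cubic is fitted exactly and the results can be compared with the analytical derivatives.

```python
import numpy as np

h = 0.5
x = np.arange(0.0, 10.0, h)
y = x**3 - 2.0 * x                       # a cubic is fitted exactly

smooth = np.array([-3, 12, 17, 12, -3]) / 35.0
first = np.array([1, -8, 0, 8, -1]) / (12.0 * h)
second = np.array([2, -1, -2, -1, 2]) / (7.0 * h**2)

# np.correlate applies the coefficients in the order listed (i = -2 .. 2);
# mode="valid" omits the first and last two points.
Y = np.correlate(y, smooth, mode="valid")
dY = np.correlate(y, first, mode="valid")
d2Y = np.correlate(y, second, mode="valid")

xc = x[2:-2]
print(np.allclose(Y, xc**3 - 2 * xc),
      np.allclose(dY, 3 * xc**2 - 2),
      np.allclose(d2Y, 6 * xc))
```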

Selected values of the convolution coefficients for polynomials of degree 1, 2, 3, 4 and 5 are given in the following tables.

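Coefficients for smoothing (degree 2/3: quadratic/cubic; degree 4/5: quartic/quintic)

| i | Degree 2/3, m = 5 | Degree 2/3, m = 7 | Degree 2/3, m = 9 | Degree 4/5, m = 7 | Degree 4/5, m = 9 |
|---|---|---|---|---|---|
| −4 | | | −21 | | 15 |
| −3 | | −2 | 14 | 5 | −55 |
| −2 | −3 | 3 | 39 | −30 | 30 |
| −1 | 12 | 6 | 54 | 75 | 135 |
| 0 | 17 | 7 | 59 | 131 | 179 |
| 1 | 12 | 6 | 54 | 75 | 135 |
| 2 | −3 | 3 | 39 | −30 | 30 |
| 3 | | −2 | 14 | 5 | −55 |
| 4 | | | −21 | | 15 |
| Normalisation | 35 | 21 | 231 | 231 | 429 |

Coefficients for 1st derivative (degree 1/2: linear/quadratic; degree 3/4: cubic/quartic)

| i | Degree 1/2, m = 5 | Degree 1/2, m = 7 | Degree 1/2, m = 9 | Degree 3/4, m = 5 | Degree 3/4, m = 7 | Degree 3/4, m = 9 |
|---|---|---|---|---|---|---|
| −4 | | | −4 | | | 86 |
| −3 | | −3 | −3 | | 22 | −142 |
| −2 | −2 | −2 | −2 | 1 | −67 | −193 |
| −1 | −1 | −1 | −1 | −8 | −58 | −126 |
| 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 1 | 1 | 1 | 1 | 8 | 58 | 126 |
| 2 | 2 | 2 | 2 | −1 | 67 | 193 |
| 3 | | 3 | 3 | | −22 | 142 |
| 4 | | | 4 | | | −86 |
| Normalisation | 10 | 28 | 60 | 12 | 252 | 1,188 |
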
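Coefficients for 2nd derivative (degree 2/3: quadratic/cubic; degree 4/5: quartic/quintic)

| i | Degree 2/3, m = 5 | Degree 2/3, m = 7 | Degree 2/3, m = 9 | Degree 4/5, m = 5 | Degree 4/5, m = 7 | Degree 4/5, m = 9 |
|---|---|---|---|---|---|---|
| −4 | | | 28 | | | −4,158 |
| −3 | | 5 | 7 | | −117 | 12,243 |
| −2 | 2 | 0 | −8 | −3 | 603 | 4,983 |
| −1 | −1 | −3 | −17 | 48 | −171 | −6,963 |
| 0 | −2 | −4 | −20 | −90 | −630 | −12,210 |
| 1 | −1 | −3 | −17 | 48 | −171 | −6,963 |
| 2 | 2 | 0 | −8 | −3 | 603 | 4,983 |
| 3 | | 5 | 7 | | −117 | 12,243 |
| 4 | | | 28 | | | −4,158 |
| Normalisation | 7 | 42 | 462 | 36 | 1,188 | 56,628 |
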
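Coefficients for 3rd derivative (degree 3/4: cubic/quartic; degree 5/6: quintic/sextic)

| i | Degree 3/4, m = 5 | Degree 3/4, m = 7 | Degree 3/4, m = 9 | Degree 5/6, m = 7 | Degree 5/6, m = 9 |
|---|---|---|---|---|---|
| −4 | | | −14 | | 100 |
| −3 | | −1 | 7 | 1 | −457 |
| −2 | −1 | 1 | 13 | −8 | 256 |
| −1 | 2 | 1 | 9 | 13 | 459 |
| 0 | 0 | 0 | 0 | 0 | 0 |
| 1 | −2 | −1 | −9 | −13 | −459 |
| 2 | 1 | −1 | −13 | 8 | −256 |
| 3 | | 1 | −7 | −1 | 457 |
| 4 | | | 14 | | −100 |
| Normalisation | 2 | 6 | 198 | 8 | 1,144 |

Coefficients for 4th derivative (degree 4/5: quartic/quintic)

| i | Degree 4/5, m = 7 | Degree 4/5, m = 9 |
|---|---|---|
| −4 | | 14 |
| −3 | 3 | −21 |
| −2 | −7 | −11 |
| −1 | 1 | 9 |
| 0 | 6 | 18 |
| 1 | 1 | 9 |
| 2 | −7 | −11 |
| 3 | 3 | −21 |
| 4 | | 14 |
| Normalisation | 11 | 143 |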