{{Short description|Noise removal process during image processing}}
[[Image:ROF Denoising Example.png|thumb|right|400px|Example of application of the Rudin et al.<ref name="rof92"/> total variation denoising technique to an image corrupted by [[Gaussian noise]]. This example was created using demo_tv.m by Guy Gilboa; see external links.]]

In [[signal processing]], particularly [[image processing]], '''total variation denoising''', also known as '''total variation regularization''' or '''total variation filtering''', is a noise removal process ([[Filter (signal processing)|filter]]). It is based on the principle that signals with excessive and possibly spurious detail have high ''[[total variation]]'', that is, the integral of the [[image gradient]] [[vector magnitude|magnitude]] is high. According to this principle, reducing the total variation of the signal—subject to it being a close match to the original signal—removes unwanted detail whilst preserving important details such as [[image edge|edges]]. The concept was pioneered by L. I. Rudin, [[Stanley Osher|S. Osher]], and E. Fatemi in 1992 and so is today known as the ''ROF model''.<ref name="rof92">{{cite journal|last1=Rudin|first1=L. I.|last2=Osher|first2=S.|last3=Fatemi|first3=E.|year=1992|title=Nonlinear total variation based noise removal algorithms|journal=Physica D|volume=60|issue=1–4|pages=259–268|doi=10.1016/0167-2789(92)90242-f|bibcode=1992PhyD...60..259R|citeseerx=10.1.1.117.1675}}</ref>

This noise removal technique has advantages over simple techniques such as [[Gaussian blur|linear smoothing]] or [[median filter]]ing, which reduce noise but at the same time smooth away edges to a greater or lesser degree. By contrast, total variation denoising is a remarkably effective [[edge-preserving filtering|edge-preserving filter]]; that is, it simultaneously preserves edges whilst smoothing away noise in flat regions, even at low signal-to-noise ratios.<ref name="strong">{{cite journal | first1 = D. | last1 = Strong | first2 = T. | last2 = Chan | title = Edge-preserving and scale-dependent properties of total variation regularization | journal = Inverse Problems | volume = 19 | issue = 6 | pages = S165–S187 | year = 2003 | doi=10.1088/0266-5611/19/6/059| bibcode = 2003InvPr..19S.165S | s2cid = 250761777 }}</ref>


== 1D signal series ==
[[Image:TVD 1D Example.png|thumb|right|300px|Application of 1D total variation denoising to a signal obtained from a single-molecule experiment.<ref name="little10"/> Gray is the original signal, black is the denoised signal.]]
For a [[Digital signal (signal processing)|digital signal]] <math>x_n</math>, we can, for example, define the total variation as

: <math>V(x) = \sum_n |x_{n+1} - x_n|.</math>


Given an input signal <math>x_n</math>, the goal of total variation denoising is to find an approximation, call it <math>y_n</math>, that has smaller total variation than <math>x_n</math> but is "close" to <math>x_n</math>. One measure of closeness is the sum of square errors:

: <math> \operatorname E(x, y) = \frac{1}{2} \sum_n (x_n - y_n)^2.</math>
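For illustration, both quantities are easy to evaluate for a sampled signal; the following minimal Python/NumPy sketch (the function names are illustrative, not standard) computes them directly from the definitions above:
<syntaxhighlight lang="python">
import numpy as np

def total_variation(x):
    """Discrete total variation V(x) = sum of |x[n+1] - x[n]|."""
    x = np.asarray(x, dtype=float)
    return np.sum(np.abs(np.diff(x)))

def sum_square_error(x, y):
    """Closeness measure E(x, y) = (1/2) * sum of (x[n] - y[n])**2."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    return 0.5 * np.sum((x - y) ** 2)
</syntaxhighlight>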


So the total variation denoising problem amounts to minimizing the following discrete functional over the signal <math>y_n</math>:

: <math> \operatorname E(x,y) + \lambda V(y).</math>


By differentiating this functional with respect to <math>y_n</math>, we can derive a corresponding [[Euler–Lagrange equation]], which can be numerically integrated with the original signal <math>x_n</math> as the initial condition. This was the original approach.<ref name="rof92"/> Alternatively, since this is a [[convex function]]al, techniques from [[convex optimization]] can be used to minimize it and find the solution <math>y_n</math>.<ref name="little10">{{cite conference | first1 = M. A. | last1 = Little | first2 = Nick S. | last2 = Jones | year = 2010 | url = http://www.maxlittle.net/publications/little_icassp2010.pdf | title = Sparse Bayesian Step-Filtering for High-Throughput Analysis of Molecular Machine Dynamics | conference = 2010 IEEE International Conference on Acoustics, Speech and Signal Processing | book-title = ICASSP 2010 Proceedings }}</ref>
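For illustration, one elementary way of carrying out such a minimization numerically is a majorization–minimization iteration, in which each absolute value in <math>V(y)</math> is replaced by a quadratic upper bound around the current iterate; the sketch below is not taken from the cited references, and its function name and parameter values are illustrative only.
<syntaxhighlight lang="python">
import numpy as np

def tv_denoise_1d(x, lam, n_iter=50, eps=1e-8):
    """Approximately minimize E(x, y) + lam * V(y) over y by
    majorization-minimization.  Each iteration solves the linear system
    (I + lam * D^T W D) y = x, where D is the first-difference operator
    and W holds weights 1 / |D y| computed from the current iterate."""
    x = np.asarray(x, dtype=float)
    n = x.size
    D = np.diff(np.eye(n), axis=0)          # (D y)[k] = y[k+1] - y[k]
    y = x.copy()
    for _ in range(n_iter):
        w = 1.0 / (np.abs(D @ y) + eps)     # eps guards against division by zero
        A = np.eye(n) + lam * D.T @ (w[:, None] * D)
        y = np.linalg.solve(A, x)           # minimizer of the quadratic bound
    return y
</syntaxhighlight>
The matrix in each linear system is in fact tridiagonal (it is built densely above only for clarity), so each iteration can be carried out in linear time; direct, non-iterative algorithms for the exact 1D solution also exist.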


== Regularization properties ==
The regularization parameter <math>\lambda</math> plays a critical role in the denoising process. When <math>\lambda = 0</math>, there is no smoothing and the result is the same as minimizing the sum of squares. As <math>\lambda \to \infty</math>, however, the total variation term plays an increasingly strong role, which forces the result to have smaller total variation, at the expense of being less like the input (noisy) signal. Thus, the choice of regularization parameter is critical to achieving just the right amount of noise removal.
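For illustration, the effect of <math>\lambda</math> can be seen by applying the 1D sketch above to a noisy piecewise-constant test signal (all values here are illustrative):
<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(0)
clean = np.repeat([0.0, 1.0, 0.2], 100)            # piecewise-constant test signal
noisy = clean + 0.1 * rng.standard_normal(clean.size)

# lam = 0 returns the noisy input unchanged; larger values give flatter output.
for lam in (0.0, 0.5, 5.0):
    y = tv_denoise_1d(noisy, lam)                  # sketch defined in the 1D section
    print(lam, float(np.sum(np.abs(np.diff(y)))))  # total variation of the result
</syntaxhighlight>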
== 2D signal images ==
We now consider 2D signals ''y'', such as images.
The total variation norm proposed by the 1992 article is
: <math>V(y) = \sum_{i,j} \sqrt{|y_{i+1,j} - y_{i,j}|^2 + |y_{i,j+1} - y_{i,j}|^2}</math>
and is [[isotropic]] and not [[Differentiable function|differentiable]]. A variation that is sometimes used, since it may be easier to minimize, is an anisotropic version
: <math>V_\operatorname{aniso}(y) = \sum_{i,j} \sqrt{|y_{i+1,j} - y_{i,j}|^2} + \sqrt{|y_{i,j+1} - y_{i,j}|^2} = \sum_{i,j} |y_{i+1,j} - y_{i,j}| + |y_{i,j+1} - y_{i,j}|.</math>
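For a discrete image stored as a two-dimensional array, both norms can be evaluated directly from these definitions; the sketch below uses one common convention for the pixels in the last row and column, where no forward difference exists:
<syntaxhighlight lang="python">
import numpy as np

def tv_isotropic(y):
    """Isotropic total variation: sum over pixels of the Euclidean length
    of the forward-difference gradient."""
    dx = y[1:, :-1] - y[:-1, :-1]    # y[i+1, j] - y[i, j]
    dy = y[:-1, 1:] - y[:-1, :-1]    # y[i, j+1] - y[i, j]
    return np.sum(np.sqrt(dx ** 2 + dy ** 2))

def tv_anisotropic(y):
    """Anisotropic total variation: sum of absolute forward differences."""
    return np.sum(np.abs(y[1:, :] - y[:-1, :])) + np.sum(np.abs(y[:, 1:] - y[:, :-1]))
</syntaxhighlight>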


The standard total variation denoising problem is still of the form
: <math> \min_y [ \operatorname E(x, y) + \lambda V(y)],</math>
where ''E'' is the 2D [[L2 norm#Euclidean norm|''L''<sub>2</sub> norm]]. In contrast to the 1D case, solving this denoising problem is non-trivial. One algorithm that solves this is the [[primal dual method]].<ref name="chambolle04">{{cite journal | first = A. | last = Chambolle | year = 2004 | citeseerx = 10.1.1.160.5226 | title = An algorithm for total variation minimization and applications | journal = Journal of Mathematical Imaging and Vision | volume = 20 | pages = 89–97 | doi=10.1023/B:JMIV.0000011325.36760.1e| s2cid = 207622122 }}</ref>
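Open-source implementations of such solvers are widely available; for example, the Python library [[scikit-image]] provides <code>denoise_tv_chambolle</code>, used below in a purely illustrative way (the <code>weight</code> parameter sets the strength of the smoothing under the library's own scaling of the problem):
<syntaxhighlight lang="python">
import numpy as np
from skimage import data, util
from skimage.restoration import denoise_tv_chambolle

# Grayscale test image with synthetic additive Gaussian noise.
image = util.img_as_float(data.camera())
rng = np.random.default_rng(0)
noisy = image + 0.1 * rng.standard_normal(image.shape)

# A larger weight gives stronger total variation smoothing and less
# fidelity to the noisy input; the value chosen here is illustrative.
denoised = denoise_tv_chambolle(noisy, weight=0.1)
</syntaxhighlight>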


Due in part to much research in [[compressed sensing]] in the mid-2000s, there are many algorithms, such as the split-[[Bregman method]], that solve variants of this problem.

== Rudin–Osher–Fatemi PDE ==
Suppose that we are given a noisy image <math>f</math> and wish to compute a denoised image <math>u</math> over a 2D space. ROF showed that the minimization problem to be solved is:<ref>{{Cite web|url=https://www.ipol.im/pub/art/2012/g-tvd/article_lr.pdf|title=Rudin–Osher–Fatemi Total Variation Denoising using Split Bregman|last=Getreuer|first=Pascal|date=2012}}</ref>

: <math> \min_{u\in\operatorname{BV}(\Omega)} \; \|u\|_{\operatorname{TV}(\Omega)} + {\lambda \over 2} \int_\Omega(f-u)^2 \, dx</math>

where <math display="inline">\operatorname{BV}(\Omega)</math> is the set of functions with [[bounded variation]] over the domain <math>\Omega</math>, <math display="inline">\operatorname{TV}(\Omega)</math> is the total variation over the domain, and <math display="inline">\lambda</math> is a penalty parameter. When <math display="inline">u</math> is smooth, the total variation is equivalent to the integral of the gradient magnitude:

: <math>\|u\|_{\operatorname{TV}(\Omega)} = \int_\Omega\|\nabla u\| \, dx</math>

where <math display="inline">\|\cdot\|</math> is the [[Euclidean norm]]. Then the objective function of the minimization problem becomes:<math display="block">\min_{u\in\operatorname{BV}(\Omega)} \; \int_\Omega\left[\|\nabla u\| + {\lambda \over 2}(f-u)^2 \right] \, dx</math>From this functional, the Euler–Lagrange equation for minimization – assuming no time-dependence – gives us the nonlinear [[Elliptic partial differential equation|elliptic]] [[partial differential equation]]:
{{Equation box 1|cellpadding|border|indent=:|equation=<math> \begin{cases}
\nabla\cdot\left({\nabla u\over{\|\nabla u\|}} \right ) + \lambda(f-u) = 0, \quad &\text{in } \Omega \\
{\partial u\over{\partial n}} = 0, \quad &\text{on } \partial\Omega
\end{cases} </math>|border colour=#0073CF|background colour=#F5FFFA}}For some numerical algorithms, it is preferable to instead solve the time-dependent version of the ROF equation:<math display="block">{\partial u\over{\partial t}} = \nabla\cdot\left({\nabla u\over{\|\nabla u\|}} \right ) + \lambda(f-u)</math>
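A crude way of realizing this numerically is explicit gradient-descent time stepping with a small regularization of the gradient magnitude; the sketch below is illustrative only (the step size, iteration count, and regularization constant are arbitrary choices, and the step size must be kept small relative to the regularization for the explicit scheme to remain stable):
<syntaxhighlight lang="python">
import numpy as np

def rof_explicit(f, lam, dt=1e-3, n_steps=500, eps=1e-2):
    """Explicit-Euler time stepping of
    du/dt = div(grad u / |grad u|) + lam * (f - u), starting from u = f.
    The gradient magnitude is regularized by eps so the update stays defined
    (and the scheme stable) in flat regions; the boundary handling provided
    by numpy.gradient only approximates the condition du/dn = 0."""
    u = f.astype(float).copy()
    for _ in range(n_steps):
        ux, uy = np.gradient(u)                        # finite-difference gradient
        mag = np.sqrt(ux ** 2 + uy ** 2) + eps         # regularized |grad u|
        # Divergence of the normalized gradient field (the curvature term).
        curvature = np.gradient(ux / mag, axis=0) + np.gradient(uy / mag, axis=1)
        u = u + dt * (curvature + lam * (f - u))
    return u
</syntaxhighlight>
In practice, schemes such as the primal–dual and split Bregman methods mentioned above are preferred over plain explicit stepping.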

== Applications ==
The Rudin–Osher–Fatemi model was a pivotal component in producing the [[Messier 87#Supermassive black hole M87*|first image of a black hole]].<ref>{{Cite web|url=https://www.ipam.ucla.edu/news/rudin-osher-fatemi-model-captures-infinity-and-beyond/|title=Rudin–Osher–Fatemi Model Captures Infinity and Beyond|date=2019-04-15|website=IPAM|language=en|access-date=2019-08-04}}</ref>


== See also ==

* [[Anisotropic diffusion]]
* [[Basis pursuit denoising]]
* [[Bounded variation]]
* [[Chambolle-Pock algorithm]]
* [[Digital image processing]]
* [[Lasso (statistics)]]
* [[Noise reduction]]
* [[Non-local means]]
* [[Signal processing]]
* [[Total variation]]

== References ==
{{Reflist}}

== External links ==

*[http://www.maxlittle.net/software/ TVDIP: Full-featured Matlab 1D total variation denoising implementation.]
*[ftp://ftp.math.ucla.edu/pub/camreport/cam08-34.pdf Efficient Primal-Dual Total Variation]
*[https://www.mathworks.com/matlabcentral/fileexchange/57604-tv-l1-image-denoising-algorithm TV-L1 image denoising algorithm in Matlab]


{{Noise|state=uncollapsed}}

[[Category:Nonlinear filters]]
[[Category:Signal processing]]
[[Category:Image processing]]
[[Category:Partial differential equations]]
