MASTER’S THESIS Compressed sensing – with applications to medical imaging HENG YANG Department of Mathematical sciences Division of Mathematics CHALMERS UNIVERSITY OF TECHNOLOGY UNIVERSITY OF GOTHENBURG Gothenburg, Sweden 2011 Thesis for the Degree of Master of Science Compressed sensing – with applications to medical imaging Heng Yang Department of Mathematical Sciences Chalmers University of Technology and Gothenburg University SE-412 96 Göteborg, Sweden Göteborg, June 2011 Mathematics Sciences Göteborg 2011 Abstract Compressed sensing is a new approach for acquiring signals. It captures and represents signals and images at a rate significantly below Nyquist rate. In certain areas like magnetic resonance imaging (MRI), it is urgent to reduce the time of the patients’ exposure in the electromagnetic radiation. Compressed sensing breaks the canonical rules and effectively reduces the sampling rate without losing the essential information, so it has a wide application in medical imaging. In this project, three different recovery strategies - Orthogonal matching pursuit (OMP), Compressive Sampling Matching Pursuit (CoSaMP) and Model-based recovery will be explored to investigate the performance of algorithms on different MRI images. Peak Signal-to- Noise Ratio is used to measure the quality of reconstruction. Key words: Compressed sensing, Magnetic resonance imaging, Recovery strategy, OMP, CoSaMP, Model-based, PSNR Acknowledgments First I would like to give my sincerest gratitude to my supervisor Prof. Morten Nielsen in Aalborg University. Thank you for leading me to the world of compressed sensing and always be patient to answer my questions. His well organized working process and scientific attitude will be a great treasure for my whole life. In addition, I want to thank my examiner Mohammad Asadzadeh. Thank you for being the bridge between Chalmers and Aalborg University. Without your constant support, I wouldn’t finish the thesis in time. Special thanks for my family and the people who love me and support me always. 1 Contents Notation ......................................................................................................................... 2 1 Introduction ................................................................................................................. 3 1.1 Background ......................................................................................................................... 3 1.1.1 The drawback of Nyquist sampling .................................................................................. 3 1.1.2 The sparsity of signals ...................................................................................................... 3 1.1.3 The inefficiencies of conventional data transform ........................................................... 4 1.1.4 The introduction of compressed sensing ......................................................................... 4 1.2 Purpose ............................................................................................................................... 5 1.3 Outline ................................................................................................................................. 5 2 How to do reconstruction ............................................................................................ 6 2.1 Restricted isometry property (RIP) ...................................................................................... 
6 2.2 The idea of Reconstruction ................................................................................................. 6 3 The reconstruction algorithms .................................................................................... 8 3.1 OMP algorithm .................................................................................................................... 8 3.1.1 Description ....................................................................................................................... 8 3.1.2 Experiments ..................................................................................................................... 9 3.1.3 Conclusion ...................................................................................................................... 12 3.2 CoSaMP algorithm ............................................................................................................. 12 3.2.1 Description ..................................................................................................................... 12 3.2.2 Experiments ................................................................................................................... 13 3.2.3 Conclusion ...................................................................................................................... 16 3.3 Model-based compressive sensing ................................................................................... 16 3.3.1 Description ..................................................................................................................... 16 3.3.2 Experiments ................................................................................................................... 22 3.3.3 Conclusion ...................................................................................................................... 27 4 Experiments on MRI Images .................................................................................... 28 5 Conclusion ................................................................................................................ 36 Reference ..................................................................................................................... 37 2 Notation Real numbers Real valued, finite length signal with length N P-norm Quasi-norm Cardinality of Measurement matrix with size Conjugate transpose of matrix Inner product Pseudoinverse of full-rank matrix . Index set supp(x):= 3 1 Introduction 1.1 Background 1.1.1 The drawback of Nyquist sampling According to the Shannon/Nyquist sampling theorem, in order to reconstruct a bandlimited signal perfectly the sampling rate should be at least two times that of the signal bandwidth [3]. To be more specific, let represent a continuous-time signal and be the continuous Fourier transform of the signal , we have: The signal is said to be bandlimited if there is a B, such that . Figure 1.1 shows an example of a bandlimited signal . The quantity 2B is called the Nyquist rate. The sufficient condition for signal to be perfectly reconstructed from an infinite sequence of samples is the sample rate fs should be larger than 2B. If fs is less than 2B, aliasing will be introduced after reconstruction. Figure 1.1: An example the Fourier transform of a bandlimited signal [3] While in reality, this sampling rate is still so high that too many samples should be achieved. 
Especially in the medical imaging modality, we need to reduce the time of the patients’ exposure in the electromagnetic radiation. So it is desirable to take as few samples as possible without losing essential information. It is interesting to notice that most signals in reality are sparse. When they are represented in some domain (such as the wavelet domain), they contain many coefficients close to or equal to zero. Compressed sensing acquires and reconstructs a signal applying the prior knowledge that it is sparse. It can capture and represents compressible signals at a rate significantly lower than Nyquist rate. 1.1.2 The sparsity of signals Using mathematics to illustrate, we have a discrete-time signal in that can be represented in terms of an orthonormal basis of vectors as follows: http://en.wikipedia.org/wiki/Nyquist_rate 4    N i iis 1 x (1) where is the coefficient sequence of . For simplification, we can write (1) in matrix form as (where is the column vector and is the matrix with as columns). Signal has a K-sparse expansion if only K of the entries in are non-zero and are zero. Real signals are often compressible which means the sequence of coefficients decays quickly. It means the large fraction of small coefficients can be thrown away without much perceptual loss. 1.1.3 The inefficiencies of conventional data transform In traditional data acquisition, the first step is to acquire the full N-sample signal ; then compute the coefficients via and only keep the K largest while discarding the others. The values and locations of the K largest should be encoded. This traditional signal acquisition processing divides the sampling and compression into two separate processes which samples a lot of unnecessary information. This inefficiency is more obvious when the number of samples N is large compared to K. Compressed sensing is a method to skip the sampling step by directly acquiring the compressed signal representation to overcome these inefficiencies. 1.1.4 The introduction of compressed sensing In order to measure all the N coefficients of , we consider the column inner products between and collection of vectors : (2) where is an matrix. is called an measurement matrix with as rows. is fixed and does not depend on the signal , so this process is non-adaptive. This is a great point since if we get a robust result from a measurement matrix , we can apply this measurement matrix to any kinds of signals without worrying about the stability. Figure 1.2 illustrates the process of compressed sensing. Figure 1.2: Compressed sensing measurement process [4] The main task of this thesis is to investigate the algorithms about reconstructing the K- sparse signal by the given measurement vector , with and . 5 1.2 Purpose The purpose of this project is to explore the compressed sensing strategy to reconstruct images stably and efficiently by using as few measurements as possible. Three reconstruction methods Orthogonal Matching Pursuit (OMP), Compressive Sampling Matching Pursuit (CoSaMP) and Model-based algorithms will be illustrated and analyzed. 1.3 Outline Chapter 1 gives a brief background of the thesis. Chapter 2 shows the idea of reconstruction and depicts the theory behind it. Chapter 3 emphasizes on the reconstruction algorithms of OMP, CoSaMP and Model-based, some experiments will be carried out as well. In Chapter 4, some real Magnetic Resonance Imaging (MRI) images will be tested for further investigation by using different recovery strategies. 
Finally, a conclusion will be made in Chapter 5. 6 2 How to do reconstruction 2.1 Restricted isometry property (RIP) The main task of encoding is to transform the K-sparse signal to the measurement by using a proper measurement matrix . The sampling matrix must map two different signals to two different sets of measurements, so all of the column submatrices (containing at most K columns) of should be well-conditioned. Candѐs and Tao proposed a condition for the sampling matrix . For all K-sparse vector , an matrix has the K-restricted isometry property if (3) When is less than 1, the inequalities (3) imply the all of the submatrices of with K columns are well-conditioned and close to an isometry. If , the sampling matrix has a large probability to reconstruct the (K/2)-sparse signal stably. This condition is called Restricted Isometry Property (shorted for RIP). The connection between RIP and Compressed Sensing is if is sufficiently less than 1, all pairwise distances between K-Sparse signals must be well preserved in the measurement space which implies that holds for all K-sparse vectors and . Because and are two different vectors and are always larger than zero, will never equal to zero. It can be said that the sampling matrix should map two different K-sparse signals to different samples. So as to invert the sampling process stably and get a K-sparse signal , we need to get a small restricted isometry constant . However, it is computational difficult to check whether a matrix satisfies the inequality (3). Fortunately, many types of random matrices have a good restricted isometry behavior, and they satisfy the restricted isometry condition with high probability. One of the quintessential examples is Gaussian measurement matrix that the entries of are independent and identically distributed (i.i.d.) random variables from a Gaussian probability density function. An i.i.d. Gaussian matrix has restricted isometry behavior with high probability if where is a constant [1] [2]. This also means K-spare or compressible signals with length N can be recovered with M random Gaussian measurements. In my project, I just pick a random matrix without checking its restricted isometry property. The random sampling matrix is regarded to have a good RIP behavior if the recovery signal is stable and approximately accurate. 2.2 The idea of Reconstruction In order to achieve an optimal recovery algorithm, there are several requirements that should be satisfied. The requirements are illustrated as below: (1) Stability. The algorithm should be stable. That means when the signals or the measurements are perturbed slightly by noise, recovery should still be approximately accurate. (2) Fast. The algorithm should be fast if we want to apply it into practice. (3) Uniform guarantees. When acquiring linear measurements by using a specific method, these 7 linear measurements can apply to all sparse signals. (4) Efficiency. The algorithm should require as few measurements as possible. Now we want to reconstruct a K-sparse signal by the measurement vector . Since the measurement matrix and , the system (2) is underdetermined. That means we have more unknowns than the equations. Theoretically, there are infinitely many that satisfy Eq. (2). However, in our case the additional assumption is that is K-sparse, and then there is often a unique that will suffice to recover . The best solution will be the sparsest vector that means it has the most zero coefficients. 
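Before turning to how such a sparsest solution can be found, the measurement model itself is easy to simulate. The following MATLAB sketch (an illustration using our own variable names, not code from the thesis experiments) builds a K-sparse signal, draws an i.i.d. Gaussian measurement matrix and forms the measurements y = Φx:

```matlab
% Illustrative sketch of the compressed sensing measurement model y = Phi*x.
% All names are chosen for this example; they are not taken from the thesis code.
N = 1024;                      % signal length
K = 40;                        % sparsity level
M = 5*K;                       % number of measurements, M << N

% Build a K-sparse signal: K non-zero entries at random locations
x = zeros(N,1);
support = randperm(N, K);      % random support of size K
x(support) = randn(K,1);       % random non-zero values

% Random Gaussian measurement matrix (i.i.d. entries, scaled columns)
Phi = randn(M,N) / sqrt(M);

% Non-adaptive linear measurements
y = Phi * x;                   % only M numbers are stored, far fewer than N
```

With M much smaller than N the system y = Φx is underdetermined, which is exactly the situation described above; the question is how to single out the sparse solution among the infinitely many candidates.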
Consider the $\ell_0$ "norm" $\|\tilde{x}\|_0$, which counts the number of non-zero entries. The reconstruction problem then becomes

$\hat{x} = \arg\min_{\tilde{x}} \|\tilde{x}\|_0 \quad \text{subject to} \quad y = \Phi\tilde{x}$   (4)

Unfortunately, the $\ell_0$-minimization problem is NP-hard [25][26]: it is computationally intractable to solve Eq. (4) for a general matrix $\Phi$ and vector $y$. Two families of methods can be used as alternatives to Eq. (4). One is basis pursuit, a convex relaxation leading to $\ell_1$-norm minimization [18]; the other is greedy pursuit [9], such as Orthogonal Matching Pursuit (OMP) [15], Stagewise Orthogonal Matching Pursuit (StOMP) [19] and Regularized Orthogonal Matching Pursuit (ROMP) [22][23].

$\ell_1$-minimization approach

As we discussed in Section 2.1, if the RIP holds, $\ell_1$-norm minimization recovers K-sparse signals exactly and approximates compressible signals well with high probability, using only on the order of $K\log(N/K)$ i.i.d. Gaussian measurements. Eq. (4) is then replaced by

$\hat{x} = \arg\min_{\tilde{x}} \|\tilde{x}\|_1 \quad \text{subject to} \quad y = \Phi\tilde{x}$   (5)

Eq. (5) is equivalent to the linear program

$\min \sum_{j=1}^{2N} v_j \quad \text{subject to} \quad [\Phi,\ -\Phi]\,v = y,\ \ v \geq 0$   (6)

where $v$ is a vector of length 2N with non-negative real entries. The signal is obtained from the solution of (6) via $\hat{x}_j = v_j - v_{j+N}$, $j = 1,\dots,N$. So $\ell_1$-norm minimization can be solved as a linear programming problem. Interior-point methods [12], projected gradient methods [13] and iterative thresholding [14] can be used to solve Eq. (6). The $\ell_1$-minimization approach provides stability and uniform guarantees, but its runtime is not bounded linearly in the problem size, so it is not optimally fast. Basis pursuit is well developed and I am not going to discuss it in detail; greedy pursuit is the approach I will focus on.

Greedy pursuit

Greedy pursuit is another approach to reconstructing the signal. It is an iterative recovery scheme that estimates the support of the signal: at each step it makes the locally optimal choice to build up an approximation, and it repeats until a stopping criterion is fulfilled. Once the support $T$ of the signal is known, the signal can be reconstructed by $\hat{x}_T = \Phi_T^{\dagger} y$, where $\Phi_T$ is the measurement matrix restricted to the columns indexed by $T$ and $\Phi_T^{\dagger}$ is the pseudoinverse of $\Phi_T$. The pseudoinverse of a full-rank matrix $\Phi_T$ is defined by the formula $\Phi_T^{\dagger} = (\Phi_T^{*}\Phi_T)^{-1}\Phi_T^{*}$. Greedy pursuit is extremely fast, but it is not optimally stable and does not have uniform guarantees.

3 The reconstruction algorithms

3.1 OMP algorithm

3.1.1 Description

Orthogonal matching pursuit is a so-called greedy algorithm for signal recovery. It was proposed by Mallat and Zhang [20] and analyzed by Gilbert and Tropp [9]. Suppose $x$ is a K-sparse signal in $\mathbb{R}^N$, and let $\Phi$ be an $M \times N$ measurement matrix with columns $\varphi_1,\dots,\varphi_N$. Then the signal can be represented by an M-dimensional measurement vector $y = \Phi x$. Since $x$ has only K non-zero components, $y$ can be regarded as a linear combination of K columns from $\Phi$. So when we do the signal recovery, the most challenging part is to identify the locations of the largest components of the ideal signal $x$; it is important to determine which columns of $\Phi$ participate in the measurement vector $y$. OMP picks the columns of $\Phi$ greedily, by finding the column with the largest correlation to the current residual of $y$. At each iteration, one coordinate of the support of the signal is identified; hopefully, after K iterations the entire support of the signal has been found. Tropp and Gilbert gave a proof of a weak uniform guarantee for OMP [15]. They showed that OMP correctly reconstructs a K-sparse signal from its measurements with probability exceeding $1-2\delta$ if $\Phi$ is an $M \times N$ Gaussian measurement matrix with $M \geq C K \log(N/\delta)$. Here $\delta$ is a fixed constant between 0 and 0.36 and $C$ is an absolute constant.
However, this guarantee holds only for a fixed signal, not for all signals simultaneously; OMP may fail for some sparse signals. It is also unknown whether OMP works for compressible (rather than exactly sparse) signals, or whether it succeeds when the samples contain noise.

The OMP algorithm has 4 major steps during each iteration:

(1) Find the index $\lambda_t$ of the column of $\Phi$ with the largest correlation to the residual of $y$.
(2) Unite the newly chosen index $\lambda_t$ with the index set $T_{t-1}$, and the column $\varphi_{\lambda_t}$ with the matrix $\Phi_{T_{t-1}}$. Here $T_0$ is an empty set.
(3) Form the signal estimate by the least squares method, that is, by projecting $y$ onto the range of $\Phi_{T_t}$. It is easy to recognize that the residual is then always orthogonal to the range of $\Phi_{T_t}$.

Figure 3.1: The least squares method

(4) Calculate the new residual of $y$ and return to (1) if fewer than K iterations have been performed.

Since the residual is always orthogonal to the range of $\Phi_{T_t}$, the first step will never pick the same column twice. Once the support $T$ of the signal is found, the approximation of the signal can be computed as $\hat{x}_T = \Phi_T^{\dagger} y$.

Algorithm 1: OMP
Input: Measurement matrix $\Phi$, measurement $y$, sparsity level K of the ideal signal
Output: index set $T_K$, measurement estimate $a_K$, residual $r_K$
$r_0 = y$, $T_0 = \emptyset$, $t = 0$ {Initialize}
while $t < K$ do
1. $t = t + 1$
2. $\lambda_t = \arg\max_{j=1,\dots,N} |\langle r_{t-1}, \varphi_j \rangle|$ {identify index of largest correlation}
3. $T_t = T_{t-1} \cup \{\lambda_t\}$ {Augment the index set}
4. $\Phi_{T_t} = [\Phi_{T_{t-1}}\ \varphi_{\lambda_t}]$ {Augment the matrix}
5. $x_t = \arg\min_{x} \|y - \Phi_{T_t} x\|_2$, $a_t = \Phi_{T_t} x_t$ {Signal estimate by least squares}
6. $r_t = y - a_t$ {Update the new residual}
end while

3.1.2 Experiments

Figure 3.2 illustrates the results of a simulation study on the impact of the number of measurements M on the performance of OMP recovery. A piecewise polynomial signal of length N=1024 is generated at random. The signal is K-sparse in a piecewise polynomial wavelet basis, with K equal to 42. Figure 3.2(a) to (d) show the performance of OMP recovery from M=2K, M=3K, M=4K and M=5K noise-free measurements, respectively. The figure shows that as M grows, the recovery results become more and more accurate. When M=5K, the recovered and the original signal almost overlap.

Figure 3.2: Example performance of OMP signal recovery. The red dotted line is the original signal, the green solid line is the recovered signal. (a) Signal recovery from M=2K samples. (b) Signal recovery from M=3K samples. (c) Signal recovery from M=4K samples. (d) Signal recovery from M=5K samples.

Next, let us study the behaviour of the OMP algorithm when the original signal is perturbed by noise. Here the Peak Signal-to-Noise Ratio (PSNR) is used to measure the quality of the reconstruction. PSNR is the ratio between the maximum possible power of a signal and the power of the corrupting noise. The PSNR is defined as

$\mathrm{PSNR} = 10\log_{10}\!\left(\dfrac{MAX_I^2}{MSE}\right)$

where $I$ is the signal and $MAX_I$ is its maximum component. $MSE$ is the mean squared error, defined as

$MSE = \dfrac{1}{mn}\sum_{i=0}^{m-1}\sum_{j=0}^{n-1}\,[I(i,j) - K(i,j)]^2$

where $K(i,j)$ is an approximation of the signal $I$. Typical values for the PSNR are between 30 dB and 50 dB, and acceptable values are considered to be about 20 dB to 25 dB. The higher the PSNR, the better the recovery performance [11].

Now, let us examine the stability of OMP by adding noise to the system. Noise is added to the original signal and to the measurements, respectively. In the first case the original signal is perturbed by noise, and the noise then also perturbs the measurements, since the measurements are generated through the multiplication of the signal with the measurement matrix $\Phi$. Figure 3.3 gives the PSNR of OMP under a perturbed signal. The noise power level is increased from 0 to 0.5 step by step. The red pentagram line shows the performance of the algorithm without noise.
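The scripts used to produce Figures 3.2-3.4 are not listed here, but the following MATLAB sketch of Algorithm 1 indicates how such an OMP recovery can be implemented (illustrative only; the function and variable names are ours):

```matlab
function [x_hat, T] = omp_sketch(Phi, y, K)
% Minimal OMP sketch following Algorithm 1 (illustration, not the thesis code).
% Phi : M x N measurement matrix, y : M x 1 measurements, K : sparsity level.
[~, N] = size(Phi);
r = y;                               % residual
T = [];                              % index set (support estimate)
for t = 1:K
    [~, idx] = max(abs(Phi' * r));   % index of largest correlation
    T = union(T, idx);               % augment the index set
    x_T = Phi(:,T) \ y;              % least-squares estimate on the support
    r = y - Phi(:,T) * x_T;          % update the residual
end
x_hat = zeros(N,1);
x_hat(T) = x_T;                      % embed the support estimate in R^N
end
```

With the measurement setup sketched in Chapter 2, such a routine would be called as x_hat = omp_sketch(Phi, y, K). Returning to Figure 3.3: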
The results are acceptable when M is larger than 4K. From Figure 3.3 we can see when M=4K, the PSNR is around 22 dB. But when M grows to 5K, the PSNR suddenly increases to approximately 81 dB. So for the noise free case, when M is bigger than 5K, the recovery performance is really ideal. But as noise gradually adding to the signal, even we have a large size of M, the reconstruction effect is still bad. The PSNR for with noise power level that is bigger than 0.2 is less than 20 dB. It is similar when we add noise directly to the measurements. Figure 3.4 shows the PSNR result with noise measurements and noise power level is increased from 0 to 0.5 with increasing step 11 0.1. The quality of reconstruction is very poor if measurements are interfered by noise. Figure 3.3: The stability tests of the OMP algorithm when random noise is added to the original signal. The noise is a random generated vector with the same size as the signal. The noise power level is increased from 0 to 0.5 step by step. Figure 3.4: The stability tests of the OMP algorithm when we add noise to measurements. The noise is a random generated vector with the same size as the signal. The noise power level is increased from 0 to 0.5 step by step. 12 3.1.3 Conclusion From the results Figure 3.3 and 3.4, it is easy to draw a conclusion that OMP can achieve a really ideal result without interuption of noise when M is big. But this algorithm is not stable. When the signals or the measurements are perturbed slightly by noise, recovery is not accurate any more. 3.2 CoSaMP algorithm 3.2.1 Description CoSaMP, short for the Compressive Sampling Matching Pursuit, is a new reconstruction algorithm based on OMP (Orthogonal Matching Pursuit). CoSaMP was first put forth by Needell and Tropp in 2009 [24]. As in the case of OMP, identifying the K largest components in signal is the most important target. According to the restricted isometry property, by giving a sampling matrix with the restricted isometry constant , the -norm of the largest K entries of vector is close to the -norm of the largest K entries of the K-sparse signal , therefore, can be called the proxy of signal . Here, is the conjugate transpose of matrix . The proxy can be obtained by applying the matrix to the measurement . In order to identify the location of the largest K components of , it is enough to find out the location of the largest K components of the proxy . At each iteration, the algorithm first selects the largest 2K components of the signal proxy and adds the index of these components to the support set. Next using the least squares, we can get a signal estimation . The sparse signal can be obtained by keeping only the largest K components of the estimation to make it sparse. This is called pruning. Needell and Tropp established the following result that for an arbitrary signal with noise samples , CoSaMP produces a 2K-sparse signal approximation that satisfies where is a best K-sparse approximation of , is an sampling matrix with restricted isometry constant . Further, refer to positive constants. This result illustrates without the interrupted of noise, CoSaMP can recover an arbitrary signal with high precision. What is more, the performance of recovery reduces gracefully if we add noise in the samples. The -norm of 2K-sparse signal approximation that is produced by this algorithm is comparable with the -norm of the best K-sparse approximation . The CoSaMP has 5 major steps during each iteration: (1) Find the proxy of the current samples’ residual. 
(2) Locate the largest 2K components of the proxy and unite with the index of the current signal approximation. (3) Estimate the signal on the merged set of components by using the least squares. (4) Choose the K largest components as the signal approximation to prune the signal estimation (5) Update the samples’ residual. 13 Algorithm 2: CoSaMP Input: Measurement matrix , measurement y, sparsity level K of the ideal signal x Output: K-sparse estimate , index set , signal estimation , residual , , , {Initialize} While (halting criterion false) do 1. 2. {Find the proxy} 3. {Identify index of largest 2K components} 4. {Augment the index set} 5. ; {Signal estimate by least squares} 6. {Prune the signal estimation} 7. {Update the new residual} end while return 3.2.2 Experiments Same setup as for the OMP algorithm, we generate a signal of length N=1024 by using Matlab to implement. The recovery results with different size of M are displayed in the Figure 3.5. As compared with the OMP algorithm, the performance of CoSaMP is not as accurate as OMP algorithm. When M is equal to 2K or 3K, the recovery results oscillate up and down. What’s more, it is easy to see that, when M=5K the reconstruction by OMP is almost exact while the result of CoSaMP has a little deviations by comparison to the original signal. (a) (b) 14 (c) (d) Figure 3.5: Example performance of CoSaMP signal recovery. The red dotted line is the orignal signal, the green solid line is the signal recovery. (a) Signal recovery from M=2K samples. (b) Signal recovery from M=3K samples. (c) Signal recovery from M=4K samples. (d) Signal recovery from M=5K samples. Figure 3.6:The stability tests of the CoSaMP algorithm when random noise is added to the original signal. The noise is a random generated vector with the same size as the signal. The noise power level is increased from 0 to 0.5 step by step. Figure 3.6 and 3.7 show the performances of recovery of CoSaMP algorithm when we add random noise in signal and measurements respectively. If the original signal is perturbed by the noise and the noise power level is less than 0.1, the PSNR is more than 20 dB when M is bigger than 5K. But if the noise power level is more than 0.1, the result is not good no matter how big M is. The performance of CoSaMP is better if we add noise to the measurements. The recovery results are acceptable if the noise power level is not more than 0.2 for M is bigger than 4K. CoSaMP is more stable than OMP algorithm. If the signal or the measurements is interrupted by little noise, we can still get an approximately accurate result with a large M. 15 Figure 3.7: The stability tests of the CoSaMP algorithm when random noise is added to the measurements. The noise is a random generated vector with the same size as the signal. The noise power level is increased from 0 to 0.5 step by step. Now let us compare the PSNR of OMP and CoSaMP and the results are given in Figure 3.8. Without the noise, the recovery results by OMP are much more accurate than the results by CoSaMP. For example, the PSNR of OMP is around 78 dB when M=5K while the PSNR of CoSaMP is about 32 dB. Figure 3.9 shows the comparison about runtimes between these two algorithms. CoSaMP recovery is extremely fast. When M=5K, CoSaMP needs less than 0.2 seconds to do the reconstruction but OMP requires more than 1.5 seconds. So if we only need an acceptable result and require more about the cost of calculation, CoSaMP algorithm is a better choice. 
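For comparison with the OMP sketch in Section 3.1, a compact MATLAB sketch of Algorithm 2 is given below. It is again only illustrative: the names are ours, and a fixed number of iterations stands in for the unspecified halting criterion.

```matlab
function x_hat = cosamp_sketch(Phi, y, K, n_iter)
% Minimal CoSaMP sketch following Algorithm 2 (illustration, not the thesis code).
[~, N] = size(Phi);
x_hat = zeros(N,1);
r = y;                                   % residual
for t = 1:n_iter
    p = Phi' * r;                        % signal proxy
    [~, idx] = sort(abs(p), 'descend');
    Omega = idx(1:2*K);                  % indices of the 2K largest proxy entries
    T = union(Omega, find(x_hat));       % merge with the current support
    b = zeros(N,1);
    b(T) = Phi(:,T) \ y;                 % least-squares estimate on the merged support
    [~, idx] = sort(abs(b), 'descend');
    x_hat = zeros(N,1);
    x_hat(idx(1:K)) = b(idx(1:K));       % prune: keep only the K largest entries
    r = y - Phi * x_hat;                 % update the residual
end
end
```

The per-iteration cost is dominated by the two sorts and the least-squares solve on the merged support; because K components are selected at once rather than one per iteration as in OMP, far fewer iterations are needed, which is consistent with the runtimes in Figure 3.9.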
Figure 3.8: The comparison about performance of two algorithms 16 Figure 3.9: The comparison about runtime of two algorithms 3.2.3 Conclusion Compared with OMP, CoSaMP algorithm can’t achieve such a high PSNR as OMP in the noise free case. But CoSaMP has its own advantages. It is more stable when noise interrupts the signal. The recovery result is acceptable if the noise is not too high and measurements are large. The calculation is extremely fast. 3.3 Model-based compressive sensing 3.3.1 Description As we discussed before, in order to get the robust reconstruction, we need measurements. As a matter of a fact, modern wavelet image coders not only focus on the fact that most of wavelets coefficients are small, but also exploit the fact that the locations of the small part of large coefficients have a particular structure. One of the structured sparsity models that have the large wavelet coefficients on a rooted, connected tree structure[17], the other ones that account for the large coefficients are clustered together[7,8]. Based on the structural dependencies between the locations and values of signals, Baraniuk and Duarte [21] proposed a new signal acquisition method called model-based recovery. If the inter-dependency structure among the coefficients is considered in the process of reconstruction, fewer measurements will be required to offer the robust recovery. This is the foundation of model-based compressive sensing. Structured sparse signals A structured sparsity model [5] is defined as , s.t. (7) where denotes the entries of equaling to the set of indices 17 and contains all the allowed support with . represents the complement of the set . is the union all the K-dimensional subspaces and the signals from are called K-structured sparse. Model-based Restricted Isometry Property For all , an matrix has the -restricted isometry property if (8) Blumensath and Davies [10] quantified the number of measurement M required for reconstruction to have -restricted isometry property and pointed out that for any and positive constant , a random i.i.d. sub-Gaussian matrix has the -restricted isometry property with probability at least , if (9) By substituting , inequality (9) can be used for the bound of standard RIP. As we know, is much smaller than since arises from the structure imposed. So the number of measurements needed that satisfied the -RIP is much fewer than that of the standard RIP. Structured compressible signals Richard [5] defined the algorithm to obtain the best K-term structured sparse approximation of by (10) If the error about the best K-term structured sparse approximation and signal is , then So the set of structured compressible signal is define as is called an -structured compressible signal under the structured sparsity model . The value of is selected by minimizing the distance between and . Enlarged union of subspaces The enlarged union of subspaces is defined as It is easy to show that and . is the algorithm to obtain the best approximation of in the enlarged union of subspaces and defined as Rooted tree structure In a rooted tree, there is a unique path linking any two nodes. For a node , all nodes that lie from to the root are called ancestors of ; all nodes that lie from away from the root are 18 named descendants of . The parent of is the ancestor that links directly to and the child of is the node that has as its parent. A node can have several children but only one parent. Nodes without children are called the leaves. The parent of is denoted as . 
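To make these parent/child relations concrete, one common convention (used here purely for illustration, not notation from [5] or [16]) stores a binary tree in an array with the root at index 1, so that parent and child indices can be computed directly:

```matlab
% Array-stored binary tree, root at index 1 (one common convention, for illustration).
parent   = @(i) floor(i/2);        % parent of node i (for i > 1)
children = @(i) [2*i, 2*i + 1];    % children of node i

% Example: the ancestors of node 11 are 5, 2 and finally 1 (the root).
i = 11;
while i > 1
    i = parent(i);
    disp(i)
end
```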
Wavelet coefficient tree The subspace of functions in of the form is denoted , where is the scaling functions at scale . The dilated, translated and normalized scaling functions are . So every function can be written as The general multiresolution analysis gives that a function is said to be a wavelet if the detail space spanned by the functions complements in [6]. Wavelets should be a Riesz basis for . So any can be decomposed as , where is an approximation of and contains the lost details. The detail spaces are defined as the set of functions of the form is the dilated and translated wavelets. According to the above, any function can be decomposed as , where and . The decomposition is not unique. There are many ways to choose the wavelet in the corresponding detail spaces . Here, a special orthogonal wavelet basis- Daubechies wavelet is used in the simulation. We say that has p vanishing moments if for we have This means is orthogonal to any polynomial of degree p-1. Daubechies wavelets (dbN) are defined to have the maximum number of vanishing moments. Daubechies orthogonal wavelets db1­db10 are widely used in practice. For instance, db1 refers to the Haar wavelet and it has one vanishing moment. The index number N refers to the number of vanishing moments. The number of coefficients is twice of the number of coefficients. So db1 has 2 coefficients. Figure 3.10 shows the scaling functions and wavelet functions of db1 to db6. The scaling functions and wavelet functions become smoother and smoother as the increasing of N. db1 is suitable for encoding polynomials of one coefficient so it can be used to encode constant signal components. db2 can encode constant and linear signal components and db3 is good at encoding constant, linear and quadratic signal components. In my thesis, db2 is used in the simulation. 19 Figure 3.10: Scaling functions and wavelet functions of db1 to db6 Now consider a signal at a finest scale I, we can write as by repeating the decomposition until a certain level . Since as and as , the wavelet representation of is given by [6] Using a matrix to illustrate, where is a matrix containing the wavelet functions as the columns and is a vector with the wavelet coefficients as the components. The vector can form a rooted binary wavelet tree with as the root of the tree. Figure 3.11 shows the wavelet coefficients tree. If a coefficient and its parent as well, the coefficients that satisfy this property will form a connected subtree. Then the definition of the structured sparsity model is Here is the index set. In order to recover the tree based signal, we need to solve the optimal problem as follows: (11) 20 Figure 3.11: Binary wavelet tree The structured sparsity model and optimal problem (11) are similar to the structured sparsity model and algorithm . Condensing Sort and Select is an Algorithm (CSSA) that can solve the optimal problem (10) and (11). Because is K-sparse, we need to find K largest absolute value of wavelet coefficients to get the optimal value of signal . The CSSA algorithm If we refer to as the value of node and as the kernel of linear program, the Condensing Sort and Select Algorithm (shorted for CSSA) [16] is a greedy algorithm to find the maximum value of . The kernel should satisfy the nonincreasing constraint that is . Before given the algorithm of CSSA, the notations are given in table 1: Table 1: notations The value of node . The kernel value. The volume of kernel. The volume counter. 
Root of the tree The algorithm of CSSA is as follows: (1) Use to find the node that has the largest data value of all of the nodes with . (2) If , is set to be 1. While if , we merge with to form a larger supernode by and is the supernode value (SNV) of the supernode . Supernode can also contain several nodes. Let denote the number of internal nodes the supernode has and denote the kernel value, its SNV is (3) If is still equal to zero, we continue to merge nodes into supernodes until we get with , then the kernel of can be set to 1. Update the volume of counter to 21 . Return to (1) if . Algorithm 3: CSSA Input: value of nodes , tree, Output: {Initialize} while do 1. {identify the largest supernode} 2. If ; {update the kernel value} 3. else Merge and into a single supernode {condense} end if end while Model-based signal reconstruction algorithm As we have already known, the model-based signal recovery algorithm is based on the CoSaMP algorithm, so we can get the model-based algorithm by replacing the best K-term sparse approximation step in CoSaMP with a best K-term structured sparse approximation. In this way, at each time we merely search subspaces of that are significantly fewer than subspaces of in the conventional recovery algorithms. Fewer measurements are needed to keep the robust of algorithm. Baraniuk and Duarte gave the proof of performance of structured sparse signal recovery. By given a noise measurement , the signal estimation of structured sparse signal obtained from iteration satisfies As the growing of the number of iteration, the -error is becoming smaller and smaller. In the absence of noise, the model-based algorithm can recover a structured sparse signal with high accuracy. The Model-based algorithm has 5 major steps during each iteration: (1) Find the proxy of the current samples’ residual. (2) Obtain the best approximation of the proxy in the enlarge union of and unite the index of the best approximation with the index of the current signal approximation. (3) Estimate the signal on the merged set of components by using the least squares. (4) Choose the best K-term structured sparse approximation of the estimation as the signal approximation to prune the signal estimation (5) Update the samples’ residual. 22 Algorithm 4: Model-based Input: Measurement matrix , measurement y, sparsity level K of the ideal signal x, structured sparse approximation algorithm Output: K-sparse estimate , index set , signal estimation , residual , , , {Initialize} While (halting criterion false) do 1. 2. {Find the proxy} 3. {Prune residual estimate based on structure} 4. {Augment the index set} 5. ; {Signal estimate by least squares} 6. {Prune the signal estimation} 7. {Update the new residual} end while return 3.3.2 Experiments By using the same method as OMP to generate a test signal with length N=1024, Figure 3.12 and 3.13 give the stability tests of Model-based algorithm with different power levels of noise. Compared with OMP and CoSaMP algorithms, Model-based recovery is much more robust. Even with noise, the results of recovery are satisfactory, especially when the noise interrupt the measurements. That means it is possible to get an approximately accurate reconstructed signal with an interference sample. Figure 3.12:The stability tests of the Model-based algorithm when we add random noise to the original signal. The noise is a random generated vector with the same size as the signal. The noise power level is increased from 0 to 0.5 step by step. 
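Figures 3.12 and 3.13, like the corresponding OMP and CoSaMP figures, are obtained by sweeping the noise power level and recording the PSNR of the reconstruction. A sketch of such a test loop is shown below; here `recover` is a placeholder for whichever recovery routine is under test (for instance the CoSaMP sketch above), the variables x, Phi and K are those of the measurement sketch in Chapter 2, and the exact way the noise is scaled is our assumption rather than the thesis code.

```matlab
% Sketch of a stability test: sweep the noise power level and record the PSNR.
% 'recover' is a placeholder for the recovery routine under test.
psnr_db = @(x, x_hat) 10*log10(max(x)^2 / mean((x - x_hat).^2));

levels = 0:0.1:0.5;                          % noise power levels
psnr_signal = zeros(size(levels));
for k = 1:numel(levels)
    noise = levels(k) * randn(size(x));      % perturb the original signal
    y_n   = Phi * (x + noise);               % measurements of the noisy signal
    x_hat = recover(Phi, y_n, K);            % run the recovery algorithm
    psnr_signal(k) = psnr_db(x, x_hat);      % reconstruction quality
end
plot(levels, psnr_signal, '-o');
xlabel('noise power level'); ylabel('PSNR (dB)');
```

Adding the noise vector to the measurements y instead of to the signal x gives the second test case, corresponding to Figure 3.13.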
23 Figure 3.13:The stability tests of the Model-based algorithm when random noise is added to the measurements. The noise is a random generated vector with the same size as the signal. The noise power level is increased from 0 to 0.5 step by step. Figure 3.14 depicts the experiment about the performance of CoSaMP and Model-based algorithms without noise. It is easy to see that M=2K is far fewer than the requirement of CoSaMP to recover the signal accurately. On the contrary, model-based recovery using M=2K can get a pretty good result. Figure 3.15 illustrates how the number of measurements M affects the PSNR of two algorithms. Model-based recovery achieves good recovery at M=3K while CoSaMP gets this performance at M=5K. It is not difficult to tell that comparing with CoSaMP, Model-based recovery uses significantly fewer measurements to offer the same stability. By using the same class of signals, the recovery time of Model-based and CoSaMP algorithms are compared empirically and the results are illustrated in Figure 3.16. In general, Model-based recovery is slower than CoSaMP. The best benefits of model-based recovery are obtained at M=4K. Around this area, Model-based algorithm yields much higher PSNR than CoSaMP and the computational times of two methods are comparable. (a) The original signal of length N=1024 24 (b) Signal recovery from M=2K=126 samples (c) Signal recovery from M=4K=252 samples (d) Signal recovery from M=6K=378 samples Figure 3.14:The comparison about CoSaMP and Model-based algorithms. (a) is the original signal with length N=1024. The signal is K-sparse in wavelet basis and K=63. (b), (c) and (d) are the comparisons of reconstruction with different size of M. Left column is the result of Model-based algorithm and right column is the result of CoSaMP algorithm. 25 Figure 3.15: The comparison about performance of Model-based and CoSaMP algorithms Figure 3.16: The comparison about runtime of Model-based and CoSaMP algorithms Now, Figure 3.17 and Figure 3.18 give the comparisons of stability and recovery time of OMP, CoSaMP and Model-based algorithms. With the noise free measurements, OMP can achieve a really high PSNR when M is larger than 5K. But as we have discussed in section 3.1.2, OMP recovery is very unstable. If the signal or measurements are disturbed by noise, the recovery is not accurate any more even with a large M. When M is less than 4K, Model-based algorithm can get the highest PSNR comparing with the other algorithms. From the Figure 3.17, the running time of Model-based algorithm will decrease as the increasing of the size of M. This is because for small M, the restricted degrees of freedom make it harder for the algorithm to reach the stopping criteria, so it needs more iteration and thus a longer running time. In general, CoSaMP is the fastest algorithm. This is due to the fact that CoSaMP uses the simple K-term approximation. 26 This algorithm picks the largest K components each time instead one component in OMP algorithm and the CSSA step in Model-based recovery is more computational. Comparing Figure 3.17 and 3.18, the near perfect point Model-based algorithm obtains is around M=4K. The recovery error of Model-based approach at this point is smaller than the other algorithms and the running time is acceptable. 
Figure 3.17: The comparison about performance of three algorithms Figure 3.18: The comparison about runtime of three algorithms 27 3.3.3 Conclusion From the above discussions, Model-based algorithm can recover a signal accurately with fewer measurements. This approach is more stable. When the samples are interfered by noise, it is possible to reconstruct the signal properly. The recovery time is much faster than OMP but a little slower than CoSaMP. Overall, Model-based algorithm is a costless, effective and efficient one. 28 4 Experiments on MRI Images In this chapter, the algorithms mentioned above will be tested by series of real 2D medical images. These images are provided by MRI scanner. MRI stands for Magnetic Resonance Imaging and is also known as spin imaging. MRI makes use of the property of Nuclear Magnetic Resonance (NMR) to image nuclei of atoms inside the body and this technique is widely used in medical imaging to visualize detailed internal structures. The main problem is that MRI scanner requires a huge number of measurements to reconstruct an image, so it will take a long time of scanning. Comparing to X-ray and CT (Computed Tomography ), MRI does not expose the patient to the hazards of ionizing radiation. But the long time scanning is still harmful and will bring a lot of discomfort to the patient. Compressed sensing can significantly reduce the number of samples, so it is broadly used in medical image processing. Since OMP is very unstable, it is not accurate if the signal is perturbed slightly by noise. This algorithm is also slow so it is quite expensive compared with the other algorithms, therefore it is not practical in reality. In this part, we only test CoSaMP and Model-based algorithms and compare their performances for different types of images. All of the test images in this thesis are of size . But due to the limited data processing capability of my computer, I resize the image to be one quarter of the original one and the new images are of size . Because the size of images is small, neither the resolution of test images nor the quality of reconstructed results is good. But it is not difficult to tell the difference when we change the sample size M and compare the performance by using different methods. Head First, an example of head image recovery is given in Figure 4.1. This type of image is not smooth and there are a lot of details in the image. In this figure, both PSNR are not higher than 20 dB until . Even , both PSNR of this image are not higher than 23 dB, so the recovery qualities for both algorithms are very poor. More samples are needed to reconstruct this type of image. M=3K, PSNR=17.31dB M=3K, PSNR=13.84dB http://en.wikipedia.org/wiki/Nuclear_magnetic_resonance http://en.wikipedia.org/wiki/Nuclear_magnetic_resonance http://en.wikipedia.org/wiki/Computed_tomography 29 M=5K, PSNR=21.15dB M=5K, PSNR=20.17dB M=7K, PSNR=22.8dB M=7K, PSNR=22.88dB Figure 4.1: Example of Model-based and CoSaMP recovery on images. The left column is Model-based recovery and the right column is CoSaMP recovery. Figure 4.2: Performance of Model-based and CoSaMP recovery on a 2D head image. Brain I The following is the test results of brain (horizontal view of a head which is cut into two pieces). This type of image is quite smooth in the centre, but it has a bright circle which is quite different 30 from the other parts. Model-based recovery can achieve a very good PSNR when while the PSNR of CoSaMP recovery is not acceptable for . 
But the PSNR of CoSaMP is increasing very fast as grows. From the curve in Figure 4.4, the PSNR of CoSaMP is higher than Model-based recovery when exceeds . M=3K, PSNR=17.02dB M=3K, PSNR=12.55dB M=5K, PSNR=24.07dB M=5K, PSNR=23.11dB M=7K, PSNR=25.01dB M=7K, PSNR=26.14dB Figure 4.3: Example of Model-based and CoSaMP recovery on images. The left column is Model-based recovery and the right column is CoSaMP recovery. 31 Figure 4.4: Performance of Model-based and CoSaMP recovery on a 2D brain image. Knee On the whole, the knee image is quite smooth and doesn’t have a lot of curves. The main trend of PSNR is similar as the first type of brain. Both PSNR are growing fast as M increases and the PSNR of CoSaMP is bigger than Model-based at the end when is larger than . The qualities of recovery are good when . M=3K, PSNR=19.33dB M=3K, PSNR=15.33dB M=5K, PSNR=23.96dB M=5K, PSNR=22.97dB 32 M=7K, PSNR=25.08dB M=7K, PSNR=25.82dB Figure 4.5: Example of Model-based and CoSaMP recovery on images. The left column is Model-based recovery and the right column is CoSaMP recovery. Figure 4.6: Performance of Model-based and CoSaMP recovery on a 2D knee image. Lumbar There are some clear objects in the lumbar images and these objects are very smooth. According to the Figure 4.7 and 4.8, Model-based recovery achieves quite good results and the PSNR of this algorithm is more than 20dB when . The PSNR of CoSaMP recovery is very low at , and it can get a comparable PSNR when . Finally, the quality of CoSaMP algorithm exceeds Model-based at . Model-based algorithm does a quite good job on this type of image. 33 M=3K, PSNR=20.39dB M=3K, PSNR=16.70dB M=5K, PSNR=24.56dB M=5K, PSNR=23.93dB M=7K, PSNR=26.33dB M=7K, PSNR=26.63dB Figure 4.7: Example of Model-based and CoSaMP recovery on images. The left column is Model-based recovery and the right column is CoSaMP recovery. 34 Figure 4.8: Performance of Model-based and CoSaMP recovery on a 2D lumbar image. Brain II This is the second type of brain image and it is the vertical view of a head which is cut into two pieces. The top of this image is quite smooth but the bottom part is rough. The results of the experiments are illustrated in Figure 4.9 and 4.10. The PSNR of both algorithms are not more than 20dB until . We need a large size of samples to recovery this type of images. M=3K, PSNR=17.08dB M=3K, PSNR=12.87dB M=5K, PSNR=21.61dB M=5K, PSNR=21.09dB 35 M=7K, PSNR=23.18dB M=7K, PSNR=23.72dB Figure 4.9: Example of Model-based and CoSaMP recovery on images. The left column is Model-based recovery and the right column is CoSaMP recovery. Figure 4.10: Performance of Model-based and CoSaMP recovery on a 2D brain image. 36 5 Conclusion The traditional process of image compression is quite costly. It acquires the entire signal at beginning, then does the compression and throws most of the information away at the end. The new idea of image compression combines signal acquisition and compression as one step which improves the overall cost significantly. The process of new image compression is to find a measurement matrix and multiply it with the signal that we want to compress in order to get linear measurements . is the compressed sample that we wish to get. Since , the system is underdetermined if we want to reverse the process and reconstruct the signal . In theory, there are infinitely many that satisfy this system, so it seems impossible to reconstruct the signal. Fortunately, most of signals in reality are spare or spare under some basis (e.g. wavelets). 
If we can find the location of those non-zero entries, we can reconstruct the signal uniquely. In this thesis, we discussed three reconstruction algorithms and compared the advantages and disadvantages of them. Peak Signal-to-Noise Ratio (PSNR) is used to measure the quality of recovery. PSNR is a ratio between the maximum possible power of a signal and the power of corrupting noise. The higher PSNR value is, the better recovery performance is. Orthogonal matching pursuit (OMP) is lacked in stability guarantees. Even it can achieve very high PSNR when the size of measurements is large in the noise free case, it is not accurate any more if the signal or measurements are perturbed by noise. Since the algorithm picks the optimal entries one by one, it is very slow. So OMP is not an ideal algorithm in reality. Compressive Sampling Matching Pursuit (CoSaMP) is fastest among these three algorithms. The property of stability of CoSaMP is better than OMP. When noise is added to the signal, the PSNR is acceptable if we have a large size of measurements . Model-based algorithm is the most stable algorithm. Though the experiments in 1D and 2D, it is not difficult to tell model-based algorithm can offer a robust recovery by using fewer measurements comparing to CoSaMP. By testing the MRI images, both algorithms can provide satisfactory results when the images are smooth. But when the images are rough or have a lot of details, the recovery results are not good. This kind of images needs more measurements to reconstruct the images. Table 2: The comparison of three algorithms OMP Not optimally fast. Lacked in uniform guarantees. Not stable. PSNR is very high without noise. CoSaMP Need a lot of measurements. Fastest. Uniform guarantees. Possible to be stable. Model-based Fast. Uniform guarantees. Large possibility to be stable. Need fewer measurements. 37 Reference [1] E. Candès, J. Romberg, and T. Tao, “Robust uncertainty principles: Exact signal reconstruction from highly incomplete frequency information,” IEEE Trans. Inform. Theory, vol. 52, no. 2, pp. 489–509, Feb. 2006. [2] D. Donoho, “Compressed sensing,” IEEE Trans. Inform. Theory, vol. 52, no. 4, pp. 1289–1306, Apr. 2006. [3] “Nyquist-shannon sampling theorem.” Wikipedia, 2011. http://en.wikipedia.org/wiki/Nyquist-Shannon_sampling_theorem. [4] Justin Romberg, Michael Wakin, “Compressed Sensing: A Tutorial.” IEEE Statistical Signal Processing Workshop, Aug. 26, 2007. [5] Richard G. Baraniuk, Marco F. Duarte, “Model-Based Compressive Sensing.” arXiv: 0808.3572v5 [cs. IT], Dec. 9, 2009. [6] Joran Bergh, Fredrik Ekstedt, Martin Lindberg, ”Wavelets.” January 1999. [7] M. Stojnic, F. Parvaresh, and B. Hassibi, “On the reconstruction of block-sparse signals with an optimal number of measurements,” IEEE Trans. Signal Processing, vol. 57, no. 8, pp. 3075–3085, Aug. 2009. [8] Y. Eldar and M. Mishali, “Robust recovery of signals from a structured union of subspaces,” IEEE Trans. Info. Theory, vol. 55, no. 11, pp. 5302–5316, Nov. 2009. [9] J. Tropp and A.C. Gilbert, “Signal recovery from partial information via orthogonal matching pursuit,” IEEE Trans. Inform. Theory, vol. 53, no. 12, pp. 4655-4666, 2007. [10] T. Blumensath and M. E. Davies, “Sampling theorems for signals from the union of finite-dimensional linear subspaces,” IEEE Trans. Info. Theory, vol. 55, no. 4, pp. 1872–1882, Apr. 2009. [11] “Peak signal-to noise ratio.” Wikipedia, 2011. http://en.wikipedia.org/wiki/Peak_signal-to-noise_ratio [12] S.-J. Kim, K. Koh, M. Lustig, S. Boyd, and D. 
Gorinevsky. A method for large-scale -regularized least-squares problems with applications in signal processing and statistics. Submitted for publication, 2007. [13] M. A. T. Figueiredo, R. D. Nowak, and S. J. Wright. Gradient projection for sparse reconstruction: Application to compressed sensing and other inverse problems. IEEE J. Selected Topics in Signal Processing: Special Issue on Convex Optimization Methods for Signal Processing, 1(4): 586–598, 2007. [14] I. Daubechies, M. Defrise, and C. De Mol. An iterative thresholding algorithm for linear inverse problems with a sparsity constraint. Comm. Pure Appl. Math., 57:1413-1457, 2004. [15] J. A. Tropp and A. C. Gilbert. Signal recovery from random measurements via orthogonal matching pursuit. IEEE Trans. Info. Theory, 53(12):4655–4666, 2007. [16] R. G. Baraniuk and D. L. Jones, “A signal-dependent time-frequency representation: Fast algorithm for optimal kernel design,” IEEE Trans. Signal Processing, vol. 42, no. 1, pp. 134–146, Jan. 1994. [17] M. S. Crouse, R. D. Nowak, and R. G. Baraniuk, “Wavelet-based statistical signal processing using hidden Markov models,” IEEE Trans. Signal Processing, vol. 46, no. 4, pp. 886–902, Apr. 1998. [18] S. S. Chen, D. L. Donoho, and M. A. Saunders. Atomic decomposition by Basis Pursuit. SIAM J. http://en.wikipedia.org/wiki/Peak_signal-to-noise_ratio 38 Sci. Comput., 20(1):33–61, 1999. [19] D. L. Donoho, Y. Tsaig, I. Drori, and J.-L. Starck. Sparse solution of underdetermined linear equations by stagewise Orthogonal Matching Pursuit (StOMP). Submitted for publication, 2007. [20] S. Mallat and Z. Zhang. “Matching Pursuits with time-frequency dictionaries”. IEEE Trans. Signal Process., 41(12):3397–3415, 1993. [21] Richard G. Baraniuk and Marco F. Duarte. “Model-based Compressive Sensing”. Information theory, IEEE Transactions, April 2010. [22] D. Needell and R. Vershynin. “Signal recovery from incomplete and inaccurate measurements via regularized orthogonal matching pursuit”. Submitted for publication, October 2007. [23] D. Needell and R. Vershynin. “Uniform uncertainty principle and signal recovery via regularized orthogonal matching pursuit”. Submitted for publication, July 2007. [24] D. Needell and J.A. Tropp. "CoSaMP: Iterative signal recovery from incomplete and inaccurate samples". Applied and Computational Harmonic Analysis, 2009. [25] S. G. Mallat and Z. Zhang. Matching pursuits with time-frequency dictionaries. IEEE Trans. Signal Process., 41(12):3397–3415, 1993. [26] B. K. Natarajan. Sparse approximate solutions to linear systems. SIAM J. Comput., 24:227–234, 1995.