Image Restoration Based on 3-D Autoregressive Model
via Low-Rank Minimization

Published at DCC, April 2015.

Introduction

With the rapid development of multimedia technology, massive amounts of digital image and video data are generated every day. Because of complicated transmission environments and diverse user needs, numerous practical restoration tasks have emerged, e.g. image inpainting, interpolation, super-resolution and the removal of salt-and-pepper noise. What these tasks have in common is that many pixels, randomly distributed across the image, are missing.

In this paper, a novel image restoration method based on the 3-D autoregressive model is proposed. For the first time, the 3-D autoregressive model is applied to a single image to simultaneously measure correlations within and between image patches. Specifically, similar image patches are stacked into a data cube, and the 3-D autoregressive model is applied to the data cube to obtain a locally consistent patch set. Moreover, to fully utilize the perceptual statistics preserved in lower-resolution versions of an image, a multiscale structure for image reconstruction utilizing the 2-D autoregressive model is proposed. With the multiscale image reconstruction method and the 3-D autoregressive model, locally consistent low-rank data matrices consisting of similar image patches are formed. An iterative singular value thresholding method is used to solve the low-rank minimization problem.

3-D Autoregressive Model Based Image Restoration

The Formulation of 3-D Autoregressive Model

To restore an image from random samples, the basic approach is to exploit the intrinsic nonlocal similarity that exists in natural images. Consider a reference patch $p_i$ of size $\sqrt{n}\times\sqrt{n}$ centered at location $i$; its nonlocal similar patches are collected from the entire image. Each similar patch can be represented as an $n\times 1$ vector by concatenating all of its columns. If the top $m$ similar patches of $p_i$ are selected, they form an $n\times m$ data matrix. Since all the columns are similar to each other, the data matrix should be low-rank, whereas the noise and the missing pixels are irregular. Thus, low-rank minimization methods can be applied to the data matrix to remove the noise and preserve the intrinsic structures. This approach has been used in many works and has achieved prominent results.
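The construction of the $n\times m$ data matrix can be sketched as follows. This is only an illustration under stated assumptions: the function name `collect_similar_patches`, the use of the top-left corner instead of the patch center, the squared-difference similarity measure and the limited search window (`search_radius`) are all hypothetical choices, not details taken from the paper.

```python
import numpy as np

def collect_similar_patches(image, top_left, patch_side, m, search_radius):
    """Return an n x m matrix (n = patch_side ** 2) whose columns are the
    m candidate patches closest to the reference patch.

    Assumptions (not from the paper): patches are addressed by their
    top-left corner, similarity is the sum of squared differences, and
    the search is limited to a window around the reference patch.
    """
    image = np.asarray(image, dtype=float)
    h, w = image.shape
    r0, c0 = top_left
    ref = image[r0:r0 + patch_side, c0:c0 + patch_side]

    candidates = []
    for r in range(max(0, r0 - search_radius),
                   min(h - patch_side, r0 + search_radius) + 1):
        for c in range(max(0, c0 - search_radius),
                       min(w - patch_side, c0 + search_radius) + 1):
            patch = image[r:r + patch_side, c:c + patch_side]
            candidates.append((float(np.sum((patch - ref) ** 2)), r, c))

    # Smallest distance first; the reference patch itself has distance 0.
    candidates.sort()

    # Each patch becomes an n x 1 vector by concatenating its columns.
    columns = [image[r:r + patch_side, c:c + patch_side].flatten(order="F")
               for _, r, c in candidates[:m]]
    return np.stack(columns, axis=1)
```

For example, `collect_similar_patches(img, (2, 2), 3, 5, 2)` returns a $9\times 5$ matrix for an image `img` that is large enough to contain the search window.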

However, such a method considers only the correlation between patches and neglects the correlation inside a single patch. The 2-D autoregressive (2D-AR) model has been proposed to model the stable local statistics within an image patch. The low-rank data matrix constructed from similar patches also presents stable local statistics in each row (between patches). Nevertheless, given how the data matrix is formed, the stable local statistics do not hold in each column, because the spatial relationships between adjacent entries differ. Most entries that are adjacent in a column are also adjacent in the original image patch, but some are not (e.g. the last pixel in the first column and the first pixel in the second column of the same image patch are adjacent in the data matrix, but not in the image patch). This situation is caused by the information loss in going from 2-D (a patch) to 1-D (a vector). No matter how a patch is unfolded into a vector, the local stationarity assumption cannot be satisfied.

To simultaneously model the stable local statistics within and between image patches, the proposed method combines the 3-D AR model with 2-D AR model based multiscale image reconstruction and low-rank minimization using iterative singular value thresholding (SVT). Figure 1 illustrates the whole process of our work.
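The loss of adjacency under vectorization can be checked on a small example. The sketch below, for an assumed $3\times 3$ patch, unfolds the pixel coordinates column by column and measures how far apart consecutive vector entries are in the patch (Chebyshev distance); a value above 1 marks a pair that is adjacent in the vector but not in the patch.

```python
import numpy as np

side = 3  # assumed patch side length, for illustration only
rows, cols = np.indices((side, side))

# Column-major unfolding, i.e. concatenating the columns of the patch.
r = rows.flatten(order="F")
c = cols.flatten(order="F")

# Spatial (Chebyshev) distance between entries that are consecutive
# in the vector.
spatial_gap = np.maximum(np.abs(np.diff(r)), np.abs(np.diff(c)))
print(spatial_gap.tolist())  # → [1, 1, 2, 1, 1, 2, 1, 1]
```

The two entries equal to 2 correspond to the jumps from the bottom of one column to the top of the next.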

Figure 1 The flowchart of the proposed image restoration based on the 3-D autoregressive model via low-rank minimization.

3-D AR Modeling

The AR model has been widely used in the signal processing field. In image processing, an AR model relates a pixel to its neighbors. Under the assumption that natural images are locally stationary, all pixels in a local area can be estimated from their neighbors with the same weights, i.e. the AR parameters. The general AR model is defined as
\begin{equation}
X_{ij}=\sum_{k\in\mathcal{N}}X_{(i+k)j} \cdot \varphi_k+\sigma,
\end{equation}
where $X_{ij}$ represents the $j^{th}$ pixel value of the patch located at $i$ of the input image $X$, $\mathcal{N}$ is the set of neighbor offsets in the AR model, $\varphi_k$ is the $k^{th}$ AR parameter and $\sigma$ is the estimation error.

Since the collected patches are similar to each other, the aforementioned assumption on natural images can also be extended to another dimension, allowing the AR model to be applied between similar patches. Thus, in our work, the neighbors consist not only of pixels in the same patch, but also of pixels in other similar patches (as illustrated in Figure 1). The proposed 3-D AR model is defined as
\begin{equation}
\label{3DAR}
X_{ij}=\sum_{k\in\mathcal{N}_1}\sum_{l\in\mathcal{N}_2}X_{(i+k)(j+l)} \cdot \varphi_{kl}+\sigma,
\end{equation}
where $\mathcal{N}_1$ and $\mathcal{N}_2$ represent the spatial offsets and the patch offsets, respectively. In our experiments, a $3\times 3\times 3$ AR model (that is, $1$ center pixel and $26$ neighbor pixels) is used.

If we stack all the similar patches into a data cube $C$ and model all the pixels in $C$ (except for those on the boundaries), (\ref{3DAR}) can be transformed into a matrix form and the 3-D AR parameters can be computed by solving the following linear least squares problem:
\begin{equation}
\min_{\varphi}\|\textbf{X}-\textbf{X}_N \cdot \varphi\|_2^2,
\end{equation}
where $\textbf{X}$ consists of all the modeled pixels and each row of $\textbf{X}_N$ consists of the corresponding pixel's AR neighbors. This least squares problem has the closed-form solution
\begin{equation}
\label{AR parameters}
\varphi=(\textbf{X}_N^T\textbf{X}_N)^{-1}\textbf{X}_N^T\textbf{X}.
\end{equation}
Hence, the 3-D AR parameters of $C$ are obtained, and they can be used to reconstruct a more locally consistent $\hat{C}$ by
\begin{equation}
\label{3Dcompute}
\hat{\textbf{X}}=\textbf{X}_N\cdot \varphi.
\end{equation}
Pixels in the original data cube $C$ are replaced by those in $\hat{\textbf{X}}$ to form $\hat{C}$. After that, $\hat{C}$ is reshaped into a data matrix $\textbf{M}$ by representing each patch as a vector, so that the low-rank minimization method can be applied.
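The steps above (building $\textbf{X}$ and $\textbf{X}_N$, solving for $\varphi$, forming $\hat{C}$ and reshaping it into $\textbf{M}$) can be sketched in NumPy as follows. The function names `fit_3d_ar` and `cube_to_matrix` and the cube layout (rows, columns, patches) are assumptions made for illustration. The sketch also uses `np.linalg.lstsq` instead of the explicit normal equations; both give the same $\varphi$ when $\textbf{X}_N$ has full column rank, and `lstsq` remains usable when it does not.

```python
import numpy as np

def fit_3d_ar(cube):
    """Fit a 3x3x3 AR model to a data cube of shape (rows, cols, patches).

    Returns the 26 AR parameters and the cube whose interior pixels are
    replaced by their AR predictions; boundary pixels are left unchanged.
    """
    cube = np.asarray(cube, dtype=float)
    R, C, P = cube.shape
    offsets = [(dr, dc, dp)
               for dr in (-1, 0, 1) for dc in (-1, 0, 1) for dp in (-1, 0, 1)
               if (dr, dc, dp) != (0, 0, 0)]  # 26 neighbours

    interior = cube[1:R - 1, 1:C - 1, 1:P - 1]
    X = interior.ravel()
    # One column of X_N per neighbour offset, one row per modeled pixel.
    X_N = np.stack(
        [cube[1 + dr:R - 1 + dr, 1 + dc:C - 1 + dc, 1 + dp:P - 1 + dp].ravel()
         for dr, dc, dp in offsets],
        axis=1)

    phi, *_ = np.linalg.lstsq(X_N, X, rcond=None)

    cube_hat = cube.copy()
    cube_hat[1:R - 1, 1:C - 1, 1:P - 1] = (X_N @ phi).reshape(interior.shape)
    return phi, cube_hat

def cube_to_matrix(cube):
    """Vectorize each patch column by column; the result is n x m."""
    R, C, P = cube.shape
    return np.reshape(cube, (R * C, P), order="F")
```

A call such as `phi, cube_hat = fit_3d_ar(cube)` followed by `M = cube_to_matrix(cube_hat)` yields the matrix passed on to the low-rank step.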

2-D Autoregressive Model Based Multiscale Image Reconstruction

In the previous section, a 3-D AR model is applied to a set of similar patches to simultaneously measure the correlations within and between patches. However, due to the lack of prior information, the 3-D AR model cannot contribute to the reconstruction of clear edges. Since natural images are scale invariant, a property also known as geometric duality, important second-order statistics (e.g. edges) are well preserved in low-resolution versions of an image. Moreover, although the original image has been damaged, its low-resolution version can be well reconstructed from the available information in the input image. Based on this analysis, we propose a 2-D AR model based multiscale image reconstruction to preserve the perceptual structures of the damaged image.

For a given randomly sampled image $I$, several low-resolution versions are generated by averaging the known pixels in the 8-neighborhood of the corresponding position at the higher resolution. Such averaging may blur some structures. Thus, the higher-resolution image is also downsampled directly, and the known pixels are preserved in the low-resolution image. After that, simple bilinear interpolation is used to generate a complete image at the lowest resolution. Then, a 2-D AR model based interpolation is applied to generate a higher-resolution image. Combining this with the previously downsampled version at that resolution yields a better higher-resolution image, which serves as the lower-resolution input for the next level, until the original resolution is reached (the blue dashed rectangle in Figure 1).

Unlike in the 3-D AR modeling, unknown pixels exist here, so we cannot model all the pixels in the target high-resolution (HR) image. Similar to NEDI [1], unknown HR pixels are interpolated in two steps. The first step interpolates the HR pixels centered among four low-resolution (LR) pixels, and the second step interpolates the remaining pixels. In our experiments, a fourth-order AR model is used in the 2-D AR model based reconstruction. The AR parameters in a local window of the input image can be computed through (\ref{AR parameters}). Owing to the geometric duality between the LR and HR images, HR pixels can be obtained by convolving the AR parameters with the corresponding LR neighbors.
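One level of the downsampling step might look like the sketch below. The description above leaves several details open, so this is only one possible reading: a factor-of-2 decimation is assumed, a known pixel at the sampled position is kept as is, and an unknown one is replaced by the average of the known pixels among its 8 neighbors. The name `downsample_known` and the boolean `mask` convention (True for known pixels) are hypothetical.

```python
import numpy as np

def downsample_known(image, mask):
    """Halve the resolution of a randomly sampled image.

    image: 2-D array; values at unknown positions are ignored.
    mask:  2-D boolean array, True where the pixel is known.
    Returns the low-resolution image and its known-pixel mask.
    """
    image = np.asarray(image, dtype=float)
    mask = np.asarray(mask, dtype=bool)
    h, w = image.shape

    # Zero-pad so that every pixel has a full 3x3 neighbourhood.
    vals = np.pad(np.where(mask, image, 0.0), 1)
    cnt = np.pad(mask.astype(float), 1)

    sum_v = np.zeros((h, w))
    sum_c = np.zeros((h, w))
    for dr in (0, 1, 2):
        for dc in (0, 1, 2):
            if (dr, dc) == (1, 1):
                continue  # skip the centre: only the 8 neighbours count
            sum_v += vals[dr:dr + h, dc:dc + w]
            sum_c += cnt[dr:dr + h, dc:dc + w]

    avg = np.divide(sum_v, sum_c, out=np.zeros((h, w)), where=sum_c > 0)

    # Known pixels are preserved; unknown ones take the neighbour average.
    low = np.where(mask, image, avg)[::2, ::2]
    low_mask = (mask | (sum_c > 0))[::2, ::2]
    return low, low_mask
```

Positions that are still unknown in `low_mask` at the lowest resolution would then be filled by the bilinear interpolation mentioned above.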
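The first interpolation step can be illustrated with a NEDI-style sketch. It assumes that the fourth-order model uses the four diagonal neighbors, estimates the parameters on an LR window by least squares, and then applies the same parameters to the four LR pixels surrounding an unknown HR pixel. The function names and this choice of neighbors are assumptions for illustration, not the exact configuration used in the paper.

```python
import numpy as np

def fit_diagonal_ar(lr_window):
    """Least-squares weights that predict each interior LR pixel from its
    four diagonal neighbours (upper-left, upper-right, lower-left,
    lower-right)."""
    w = np.asarray(lr_window, dtype=float)
    R, C = w.shape
    y = w[1:R - 1, 1:C - 1].ravel()
    A = np.stack([w[0:R - 2, 0:C - 2].ravel(),
                  w[0:R - 2, 2:C].ravel(),
                  w[2:R, 0:C - 2].ravel(),
                  w[2:R, 2:C].ravel()], axis=1)
    phi, *_ = np.linalg.lstsq(A, y, rcond=None)
    return phi

def interpolate_center(lr_window, r, c, phi):
    """HR pixel centred among LR pixels (r, c), (r, c+1), (r+1, c) and
    (r+1, c+1), using the parameters estimated at the LR scale."""
    w = np.asarray(lr_window, dtype=float)
    neighbours = np.array([w[r, c], w[r, c + 1], w[r + 1, c], w[r + 1, c + 1]])
    return float(neighbours @ phi)
```

On a linear intensity ramp the fitted weights reduce to 0.25 each, so the interpolated pixel is the mean of its four LR neighbors, as expected for a region without edges.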

Low-Rank Minimization

Once the preliminary reconstruction of the input image is obtained, similar patches are collected and the 3-D AR model is applied to form a more locally consistent patch set. Even so, the patch set still contains perceptible burrs. In this section, we introduce an iterative SVT to restore a clean image.

Low-rank minimization has been studied extensively in recent years. In this paper, singular value thresholding (SVT) [2] is applied for its simplicity and efficiency. We give a brief review of SVT here. For a given matrix $\textbf{N}_0$, e.g. the data matrix $\textbf{M}$ obtained by the 3-D AR modeling above, SVT gives an iterative solution as follows,
\begin{equation}
\label{iteration}
\textbf{N}_k=\textbf{N}_{k-1}+\delta(\textbf{N}_0-S_\tau(\textbf{N}_{k-1})),
\end{equation}
where $S_\tau(\cdot)$ represents the soft shrinkage process and $\delta$ is an iterative regularization factor. The soft shrinkage process $S_\tau(\cdot)$ first applies the SVD to the target matrix, $\textbf{N}_{k-1}=U\Sigma V^T$, and then uses the threshold $\tau$ to shrink the singular values $\sigma(\cdot)$, as follows,
\begin{equation}
S_\tau(\textbf{N}_{k-1})=U\Sigma_\tau V^T, \quad \Sigma_\tau=\mathrm{diag}(\max(\sigma(\textbf{N}_{k-1})-\tau,0)).
\end{equation}
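The soft shrinkage operator and one update of (\ref{iteration}) can be transcribed into NumPy as below. The function names `soft_shrink` and `svt_step` are hypothetical, and the values of $\tau$ and $\delta$ are left to the caller since they are not specified here.

```python
import numpy as np

def soft_shrink(N, tau):
    """S_tau(N): shrink the singular values of N by tau, clipping at 0."""
    U, s, Vt = np.linalg.svd(N, full_matrices=False)
    # Scaling the columns of U by the shrunk singular values is
    # equivalent to U @ diag(max(s - tau, 0)) @ Vt.
    return (U * np.maximum(s - tau, 0.0)) @ Vt

def svt_step(N_prev, N0, tau, delta):
    """One iteration: N_k = N_{k-1} + delta * (N_0 - S_tau(N_{k-1}))."""
    return N_prev + delta * (N0 - soft_shrink(N_prev, tau))
```

For instance, with $\textbf{N}_0=\mathrm{diag}(3,1)$ and $\tau=2$, the shrinkage gives $\mathrm{diag}(1,0)$: the smaller singular value is removed, which is how the rank is reduced.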

In (\ref{iteration}), the iterative regularization is performed on each data matrix $\textbf{M}$ filtered by the 3-D AR model to restore the clean data matrix $\textbf{N}$. The whole image can be restored by averaging all the overlapping patches after each patch collection is processed. However, in order to make the newly updated data available to other patch collections and to accelerate the convergence of the above iterative procedure, an alternative scheme is adopted. First, we perform SVT on each data matrix $\textbf{M}$ and aggregate all the overlapping patches into a whole image. Then we carry out the iterative regularization on the whole image to produce the new output. The procedure is repeated until the stopping criterion is met. Finally, the image restoration based on the 3-D AR model via low-rank minimization is summarized in detail in Algorithm 1.
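The aggregation of overlapping patches into a whole image can be sketched as a per-pixel average. The function name `aggregate_patches` and the representation of patches by their top-left positions are assumptions; uniform weights are used, since no other weighting is described here.

```python
import numpy as np

def aggregate_patches(patches, positions, image_shape):
    """Average overlapping 2-D patches into one image.

    patches:   iterable of 2-D arrays
    positions: iterable of (row, col) top-left corners, one per patch
    Pixels not covered by any patch are returned as 0.
    """
    acc = np.zeros(image_shape)
    weight = np.zeros(image_shape)
    for patch, (r, c) in zip(patches, positions):
        ph, pw = patch.shape
        acc[r:r + ph, c:c + pw] += patch
        weight[r:r + ph, c:c + pw] += 1.0

    out = np.zeros(image_shape)
    np.divide(acc, weight, out=out, where=weight > 0)
    return out
```

Each column of a processed data matrix would first be reshaped back into a $\sqrt{n}\times\sqrt{n}$ patch (column-major, matching the vectorization) before being passed to this function.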

Experimental Results

Our experiments are implemented in MATLAB. The state-of-the-art methods KR [3], STDC [4] and IRJSM [5] are used for comparison. Four standard test images are used: House ($256\times 256$), Lena ($512\times 512$), Cameraman ($256\times 256$) and Pepper ($512\times 512$). Three pixel missing rates, 60%, 70% and 85%, are tested. The results of KR, STDC and IRJSM are all generated with the original authors' code. Peak signal-to-noise ratio (PSNR) is selected as the objective quality criterion. Subjective comparisons are also shown so that readers can evaluate our work perceptually.

Table I presents the PSNR results of the different methods on the test images under different missing rates. The second column gives the pixel missing rate, i.e. the percentage of missing pixels. As can be observed, the proposed method achieves the highest PSNR in every case except Cameraman at 85%, where IRJSM is 0.08 dB higher. The average PSNR gain over the second-best method, IRJSM, is about 0.4 dB (38.34 dB versus 37.95 dB).

Subjective comparisons for the 60%, 70% and 85% missing rates are shown in Figure 2, Figure 3 and Figure 4, respectively. The PSNR and SSIM values of the displayed image regions are given in parentheses. KR captures edge structures but also produces blurring artifacts. Since STDC regards the whole image as a low-rank matrix, it generates visible artifacts around edges, and its restoration under a high missing rate (e.g. 85%) is unsatisfactory. Although the PSNR results of IRJSM are quite close to those of the proposed method, its subjective quality is not as good, especially in the details of image textures. The proposed method presents the best visual quality, especially in edge structures and texture regions.
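For reference, PSNR can be computed as in the short sketch below. A peak value of 255 (8-bit images) is assumed, and the function name `psnr` is illustrative; the experiments reported here were run in MATLAB, not with this code.

```python
import numpy as np

def psnr(reference, restored, peak=255.0):
    """Peak signal-to-noise ratio in dB between two images of equal shape."""
    reference = np.asarray(reference, dtype=float)
    restored = np.asarray(restored, dtype=float)
    mse = np.mean((reference - restored) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return float(10.0 * np.log10(peak ** 2 / mse))
```

A mean squared error of 1 on an 8-bit image corresponds to about 48.13 dB.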

Table I PSNR (dB) results from different methods under different missing rates. The best result in each case is highlighted in bold.

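\begin{tabular}{llcccc}
\hline
Image & Missing rate & KR & STDC & IRJSM & Proposed \\
\hline
House & 60\% & 37.13 & 36.95 & 41.35 & \textbf{41.90} \\
 & 70\% & 36.86 & 36.05 & 39.70 & \textbf{40.22} \\
 & 85\% & 35.90 & 33.00 & 37.43 & \textbf{38.13} \\
Lena & 60\% & 37.02 & 36.67 & 39.84 & \textbf{40.52} \\
 & 70\% & 36.79 & 35.96 & 38.50 & \textbf{39.15} \\
 & 85\% & 36.00 & 33.91 & 36.28 & \textbf{36.80} \\
Cameraman & 60\% & 34.17 & 34.22 & 37.35 & \textbf{37.47} \\
 & 70\% & 34.01 & 33.42 & 36.15 & \textbf{36.27} \\
 & 85\% & 33.57 & 31.09 & \textbf{34.45} & 34.37 \\
Pepper & 60\% & 36.77 & 36.60 & 39.43 & \textbf{39.73} \\
 & 70\% & 36.59 & 35.75 & 38.32 & \textbf{38.61} \\
 & 85\% & 35.74 & 33.24 & 36.60 & \textbf{36.84} \\
\hline
Average & & 35.88 & 34.74 & 37.95 & \textbf{38.34} \\
\hline
\end{tabular}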

 

Figure 2 The restoration comparison of House image under 60% missing rate.

 

Figure 3 The restoration comparison of Lena image under 70% missing rate.

 

Figure 4 The restoration comparison of Pepper image under 85% missing rate.

References

[1] X. Li and M. Orchard, “New edge-directed interpolation,” IEEE Transactions on Image Processing, vol. 10, no. 10, pp. 1521-1527, October 2001.

[2] J. Cai, E. Candès, and Z. Shen, “A singular value thresholding algorithm for matrix completion,” SIAM Journal on Optimization, vol. 20, no. 4, pp. 1956-1982, March 2010.

[3] H. Takeda, S. Farsiu, and P. Milanfar, “Robust kernel regression for restoration and reconstruction of images from sparse noisy data,” in Proceedings of IEEE International Conference on Image Processing, October 2006, pp. 1257-1260.

[4] Y. Chen, C. Hsu, and H. Liao, “Simultaneous tensor decomposition and completion using factor priors,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 36, no. 3, pp. 577-591, March 2014.

[5] J. Zhang, D. Zhao, R. Xiong, S. Ma, and W. Gao, “Image restoration using joint statistical modeling in a space-transform domain,” IEEE Transactions on Circuits and Systems for Video Technology, vol. 24, no. 6, pp. 915-928, June 2014.
