Joint-Feature Guided Depth Map Super-Resolution With Face Priors


Teaser

Figure 1. Super-resolution of a low-resolution facial depth map. (a) Input raw degraded depth map and the corresponding high-resolution color image. (b) Upsampled result using the proposed method. (c) Textured depth surface using the raw degraded depth map. (d) Textured depth surface using the recovered depth map. Depth maps are color-coded for better visualization.

Abstract

In this paper, we present a novel method to super-resolve and recover facial depth maps. The key idea is to exploit an exemplar-based approach that learns reliable face priors from high-quality facial depth maps to improve the degraded depth image. Specifically, a new neighbor embedding (NE) framework is designed for face prior learning and depth map reconstruction. First, face components are decomposed to form specialized dictionaries and are then reconstructed, respectively. Joint features, i.e., low-level depth and intensity cues together with high-level position cues, are proposed for robust patch similarity measurement. The NE results yield face priors of facial structures and smooth maps, which are combined in a uniform optimization framework to recover high-quality facial depth maps. Finally, an edge enhancement process is applied to estimate the final high-resolution depth map. Experimental results demonstrate the superiority of our method over state-of-the-art depth map super-resolution techniques on both synthetic data and real-world data from Kinect.
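
The patch matching at the heart of the NE step can be illustrated with a short sketch. Below is a minimal Python example of joint-feature nearest-neighbor search followed by neighbor-embedding reconstruction in the style of classic NE super-resolution (Chang et al.); the feature weights (w_depth, w_intensity, w_position), patch sizes, and k are illustrative assumptions, not the paper's exact settings.

    # Minimal neighbor-embedding (NE) sketch with joint patch features.
    # Weights, shapes, and k are hypothetical; the paper's exact feature
    # design and solver may differ.
    import numpy as np

    def joint_feature(depth_patch, intensity_patch, center_xy, image_size,
                      w_depth=1.0, w_intensity=0.5, w_position=0.25):
        # Concatenate low-level depth/intensity cues with a high-level
        # (normalized) patch-position cue into one descriptor.
        pos = np.asarray(center_xy, float) / np.asarray(image_size, float)
        return np.concatenate([w_depth * depth_patch.ravel(),
                               w_intensity * intensity_patch.ravel(),
                               w_position * pos])

    def ne_reconstruct(query_feat, dict_feats, dict_hr_patches, k=5, eps=1e-6):
        # Find the k nearest dictionary features, solve for sum-to-one
        # reconstruction weights (as in classic NE super-resolution),
        # and transfer the weights to the neighbors' HR depth patches.
        d2 = np.sum((dict_feats - query_feat) ** 2, axis=1)
        idx = np.argsort(d2)[:k]
        Z = dict_feats[idx] - query_feat            # k x d local differences
        G = Z @ Z.T                                 # local Gram matrix
        G += eps * (np.trace(G) + 1.0) * np.eye(k)  # regularize for stability
        w = np.linalg.solve(G, np.ones(k))
        w /= w.sum()                                # enforce sum-to-one
        return w @ dict_hr_patches[idx]             # weighted HR combination

    # Toy usage with random stand-ins for the facial exemplar dictionary.
    rng = np.random.default_rng(0)
    feats = rng.standard_normal((200, 5 * 5 * 2 + 2))  # 5x5 depth + 5x5 intensity + (x, y)
    hr = rng.standard_normal((200, 10 * 10))           # matching HR depth patches
    q = joint_feature(rng.standard_normal((5, 5)),
                      rng.standard_normal((5, 5)), (32, 40), (64, 80))
    print(ne_reconstruct(q, feats, hr).shape)          # -> (100,)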

Framework

Figure 2. Framework of the proposed method. We first decompose the whole face into facial components based on the HR color image and reconstruct them, respectively. Joint features, including low-level depth and intensity cues and high-level position cues, are extracted to represent each patch for robust nearest-neighbor search. The face priors of facial structures and the smooth map estimated from these nearest neighbors are used to recover the facial depth map. In addition, our method further enhances depth boundaries, making them clean and sharp using learned edge maps.
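
To make the recovery step concrete, here is a rough sketch of how a structure prior and a smooth map might be combined in one quadratic energy and minimized by gradient descent. The energy form, the weights a and b, and the solver are assumptions for illustration only, not the paper's actual formulation.

    # Illustrative (assumed) quadratic energy combining two face priors:
    #   E(D) = ||D - D_up||^2 + a*||D - D_struct||^2 + b*||S . grad(D)||^2
    # D_up: upsampled input depth, D_struct: NE structure prior,
    # S: smooth map down-weighting smoothing near depth boundaries.
    import numpy as np

    def recover_depth(d_up, d_struct, smooth_map, a=0.5, b=0.2,
                      iters=200, step=0.2):
        D = d_up.copy()
        for _ in range(iters):
            gy, gx = np.gradient(D)
            # divergence of the smooth-map-weighted gradient field
            div = (np.gradient(smooth_map * gy, axis=0) +
                   np.gradient(smooth_map * gx, axis=1))
            grad_E = 2.0 * (D - d_up) + 2.0 * a * (D - d_struct) - 2.0 * b * div
            D -= step * grad_E  # plain gradient descent on E(D)
        return D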

Resources

  • Paper: pdf
  • Data: zip (24.4 MB, containing the inputs and results reported in our paper)
  • BU-3DFE Dataset: Please request the data from the BU-3DFE website
  • Citation

    @article{Yang2018Joint,
      title={Joint-Feature Guided Depth Map Super-Resolution With Face Priors},
      author={Yang, Shuai and Liu, Jiaying and Fang, Yuming and Guo, Zongming},
      journal={IEEE Transactions on Cybernetics},
      volume={48},
      number={1},
      pages={399--411},
      month={January},
      year={2018}
    }

Selected Results

Figure 3. Visual comparison with state-of-the-art methods for 8× upsampling on the degraded BU-3DFE dataset. Our method eliminates the noise while effectively restoring facial structures, and the error maps demonstrate that our reconstructed depth map is highly consistent with the ground truth. (a) Ground truth. (b) Diebel and Thrun [1]. (c) Yang et al. [2]. (d) He et al. [3]. (e) Kiechle et al. [4]. (f) Ma et al. [5]. (g) Ferstl et al. [6]. (h) Proposed method. For visual inspection, regions highlighted by blue rectangles are enlarged, and the error maps between the recovered depth maps and the ground truth are shown below the results.

References

[1] J. Diebel and S. Thrun, An application of Markov random fields to range sensing, NIPS 2005.
[2] Q. Yang, R. Yang, J. Davis, and D. Nister, Spatial-depth super resolution for range images, CVPR 2007.
[3] K. He, J. Sun, and X. Tang, Guided image filtering, TPAMI 2013.
[4] M. Kiechle, S. Hawe, and M. Kleinsteuber, A joint intensity and depth co-sparse analysis model for depth map super-resolution, ICCV 2013.
[5] Z. Ma, K. He, Y. Wei, J. Sun, and E. Wu, Constant time weighted median filtering for stereo matching and beyond, ICCV 2013.
[6] D. Ferstl, C. Reinbacher, R. Ranftl, M. Ruether, and H. Bischof, Image guided depth upsampling using anisotropic total generalized variation, ICCV 2013.