

Spatiotemporal Statistics for Video Quality Assessment

Published: 01 July 2016

Abstract

Designing models for universal no-reference video quality assessment (NR-VQA) is an important task in many video processing and computer vision applications. However, most existing NR-VQA metrics are designed for specific distortion types, which are often not known in practical applications. A further deficiency is that the spatial and temporal information of videos is rarely considered simultaneously. In this paper, we propose a new NR-VQA metric based on spatiotemporal natural video statistics in the 3D discrete cosine transform (3D-DCT) domain. In the proposed method, a set of features is first extracted from a statistical analysis of 3D-DCT coefficients to characterize the spatiotemporal statistics of videos from different views. These features are then used to predict perceived video quality via an efficient linear support vector regression model. The contributions of this paper are: 1) we explore the spatiotemporal statistics of videos in the 3D-DCT domain, which has an inherent spatiotemporal encoding advantage over other widely used 2D transforms; 2) we extract a small set of simple but effective statistical features for video quality prediction; and 3) the proposed method is universal across multiple distortion types and robust across different databases. The proposed method is tested on four widely used video databases, and extensive experimental results demonstrate that it is competitive with state-of-the-art NR-VQA metrics and with top-performing full-reference and reduced-reference VQA metrics.
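The pipeline the abstract describes (block-wise 3D-DCT of a video volume, distribution statistics of the coefficients, linear support vector regression on subjective scores) can be sketched as follows. This is a minimal illustration, not the authors' exact feature set: the block size, the particular coefficient statistics, and the regressor settings are all assumptions, and the scores here are random placeholders.

```python
# Minimal sketch of an NR-VQA pipeline in the spirit of the abstract:
# block-wise 3D-DCT statistics of a video volume feed a linear SVR.
# Block size, feature choice, and regressor settings are illustrative.
import numpy as np
from scipy.fft import dctn
from sklearn.svm import LinearSVR

def block_3d_dct_features(video, bs=4):
    """video: (T, H, W) grayscale array -> small spatiotemporal feature vector."""
    T, H, W = video.shape
    coefs = []
    for t in range(0, T - bs + 1, bs):
        for y in range(0, H - bs + 1, bs):
            for x in range(0, W - bs + 1, bs):
                block = video[t:t+bs, y:y+bs, x:x+bs].astype(np.float64)
                c = dctn(block, norm="ortho").ravel()
                coefs.append(c[1:])  # drop the DC coefficient, keep AC terms
    c = np.concatenate(coefs)
    # Simple shape statistics of the AC coefficient distribution.
    return np.array([
        c.std(),                                  # spread
        np.abs(c).mean(),                         # mean magnitude
        ((c - c.mean()) ** 4).mean() / c.var()**2 # kurtosis (peakedness)
    ])

# Hypothetical usage: features from several clips + placeholder MOS scores.
rng = np.random.default_rng(0)
videos = [rng.random((8, 32, 32)) for _ in range(20)]
X = np.array([block_3d_dct_features(v) for v in videos])
mos = rng.random(20) * 100  # stand-in for subjective quality scores
model = LinearSVR(random_state=0, max_iter=10000).fit(X, mos)
pred = model.predict(X)
print(X.shape, pred.shape)
```

In a real setting the regressor would be trained on features from a VQA database with genuine mean opinion scores and evaluated on held-out videos; the linear SVR is the same family of predictor the abstract names.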




            Published In

IEEE Transactions on Image Processing, Volume 25, Issue 7
            July 2016
            495 pages

            Publisher

            IEEE Press


            Qualifiers

            • Research-article

Cited By

• (2025) A no-reference video quality assessment method with bidirectional hierarchical semantic representation. Signal Processing, vol. 230. DOI: 10.1016/j.sigpro.2024.109819. Online publication date: 1-May-2025.
• (2025) Luminance decomposition and reconstruction for high dynamic range Video Quality Assessment. Pattern Recognition, vol. 158. DOI: 10.1016/j.patcog.2024.111011. Online publication date: 1-Feb-2025.
• (2024) Semantic-Aware and Quality-Aware Interaction Network for Blind Video Quality Assessment. Proceedings of the 32nd ACM International Conference on Multimedia, pp. 9970–9979. DOI: 10.1145/3664647.3680598. Online publication date: 28-Oct-2024.
• (2024) Cooperative Bargaining Game Based Adaptive Video Multicast Over Mobile Edge Networks. IEEE Transactions on Multimedia, vol. 26, pp. 2380–2394. DOI: 10.1109/TMM.2023.3295569. Online publication date: 1-Jan-2024.
• (2024) Pseudo Light Field Image and 4D Wavelet-Transform-Based Reduced-Reference Light Field Image Quality Assessment. IEEE Transactions on Multimedia, vol. 26, pp. 929–943. DOI: 10.1109/TMM.2023.3273855. Online publication date: 1-Jan-2024.
• (2024) Automatic Evaluation of Instructional Videos Based on Video Features and Student Watching Experience. IEEE Transactions on Learning Technologies, vol. 17, pp. 54–62. DOI: 10.1109/TLT.2023.3299359. Online publication date: 1-Jan-2024.
• (2024) Blind Video Quality Prediction by Uncovering Human Video Perceptual Representation. IEEE Transactions on Image Processing, vol. 33, pp. 4998–5013. DOI: 10.1109/TIP.2024.3445738. Online publication date: 1-Jan-2024.
• (2024) Blind video quality assessment based on Spatio-Temporal Feature Resolver. Neurocomputing, vol. 574. DOI: 10.1016/j.neucom.2024.127249. Online publication date: 14-Mar-2024.
• (2024) FAVER. Image Communication, vol. 122. DOI: 10.1016/j.image.2024.117101. Online publication date: 1-Mar-2024.
• (2023) 2BiVQA: Double Bi-LSTM-based Video Quality Assessment of UGC Videos. ACM Transactions on Multimedia Computing, Communications, and Applications, vol. 20, no. 4, pp. 1–22. DOI: 10.1145/3632178. Online publication date: 8-Nov-2023.
