CN108377387A - Virtual reality method for evaluating video quality based on 3D convolutional neural networks - Google Patents
- Publication number
- CN108377387A CN108377387A CN201810240647.XA CN201810240647A CN108377387A CN 108377387 A CN108377387 A CN 108377387A CN 201810240647 A CN201810240647 A CN 201810240647A CN 108377387 A CN108377387 A CN 108377387A
- Authority
- CN
- China
- Prior art keywords
- video
- cnn
- videos
- frame
- virtual reality
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N17/00—Diagnosis, testing or measuring for television systems or their details
- H04N17/004—Diagnosis, testing or measuring for television systems or their details for digital television systems
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/134—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
- H04N19/154—Measured or subjectively estimated visual quality after decoding, e.g. measurement of distortion
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Health & Medical Sciences (AREA)
- Biomedical Technology (AREA)
- General Health & Medical Sciences (AREA)
- Testing, Inspecting, Measuring Of Stereoscopic Televisions And Televisions (AREA)
Abstract
The present invention relates to a stereoscopic video quality evaluation method based on a 3D convolutional neural network (3D CNN), comprising the following steps. Video preprocessing: a VR difference video is obtained from the left-view and right-view videos of a VR video; frames are sampled uniformly from the difference video and each frame is cut into non-overlapping blocks, the blocks at the same position across frames forming one VR video patch, so that enough data are generated to train the 3D CNN. Establishing the 3D CNN model. Training the 3D CNN model: using stochastic gradient descent, the VR video patches serve as input and each patch inherits the quality score of its source video as its label; patches are fed into the network in batches, and after many iterations the weights of every layer are fully optimized, finally yielding a convolutional neural network model that can evaluate virtual reality video quality. Obtaining the final result. The present invention improves the accuracy of objective evaluation methods.
Description
Technical field
The invention belongs to the field of video processing and relates to a virtual reality video quality evaluation method.
Background technology
As a new simulation and interaction technology, virtual reality (VR) is used in many fields such as architecture, gaming and the military. It can create a virtual environment consistent with the rules of the real world, or build a simulated environment completely detached from reality, bringing people a more realistic audiovisual and immersive experience [1]. As an important carrier of virtual reality, panoramic stereoscopic video is currently the form closest to the definition of VR video and plays a huge role. However, during acquisition, storage and transmission, limitations of equipment and processing means inevitably introduce distortions that degrade the quality of VR videos. It is therefore essential to study an evaluation method that can effectively assess virtual reality video quality. Subjective evaluation, however, is easily disturbed by many factors, is time-consuming and laborious, and its results are not sufficiently stable. In contrast, objective evaluation assesses quality in software, requires neither participants nor large-scale subjective tests, is easy to operate, and correlates strongly with subjective evaluation, so it has attracted increasing attention from researchers.
Because virtual reality technology has emerged only in recent years, there is currently no specification standard or objective evaluation system for VR videos [2]. VR videos are characterized by realism, immersion and stereoscopic perception [3]; among conventional multimedia types, stereoscopic video is closest to VR video in its characteristics, so evaluating VR videos needs to draw on current ideas in stereoscopic video quality evaluation. Current objective evaluation methods for stereoscopic video fall into three classes: the first class comprises methods based on the human visual system (HVS); the second class comprises methods based on image features combined with machine learning; the third class comprises methods using deep learning. All of the above provide useful references for the objective evaluation of VR videos.
[1] Minderer M, Harvey C D, Donato F, et al. Neuroscience: Virtual reality explored. Nature, 2016, 533(7603): 324.
[2] X. Ge, L. Pan, Q. Li. Multi-Path Cooperative Communications Networks for Augmented and Virtual Reality Transmission. IEEE Transactions on Multimedia, vol. 19, no. 10, pp. 2345-2358, 2017.
[3] Hosseini M, Swaminathan V. Adaptive 360 VR Video Streaming: Divide and Conquer. In IEEE International Symposium on Multimedia, IEEE, 2017: 107-110.
Invention content
The object of the invention is to establish a VR video quality evaluation method that fully considers the characteristics of virtual reality. The VR video objective quality evaluation method proposed by the present invention uses the deep learning model of 3D convolutional neural networks (3D CNN), letting the machine extract VR video features instead of relying on traditional hand-crafted features; moreover, the 3D CNN deep learning model can fully consider the temporal motion information of video. At the same time, the present invention designs a score fusion strategy fitted to the characteristics of how VR videos are produced and played, so as to make an accurate and objective assessment. The technical solution is as follows:
A stereoscopic video quality evaluation method based on a 3D CNN, the evaluation method comprising the following steps:
1) Video preprocessing: obtain a VR difference video from the left-view and right-view videos of the VR video, sample frames uniformly from the difference video, and cut each frame into non-overlapping blocks; the blocks at the same position across frames constitute one VR video patch, generating enough data for training the 3D CNN.
2) Establish the 3D CNN model: the model comprises two convolutional layers, two pooling layers and two fully connected layers; the activation function is the rectified linear unit (ReLU), and Dropout is used to prevent over-fitting. The layer structure and training parameters of the network are then adjusted to achieve a better classification effect.
3) Train the 3D CNN model: using stochastic gradient descent, take the VR video patches as input, with each patch labeled by the quality score of its source video; feed the patches into the network in batches, so that after many iterations the weights of every layer are fully optimized, finally obtaining a convolutional neural network model that can evaluate virtual reality video quality.
4) Obtain the final result: use the trained 3D CNN to score the VR video patches, then apply the score fusion strategy to assign different weights to patches at different positions, and compute the weighted sum as the final objective quality evaluation score of the virtual reality video.
The VR video objective quality evaluation method proposed by the invention uses a recent deep learning model that can extract higher-dimensional features of VR videos: no hand-crafted features are needed, the machine itself learns which features to extract, and the temporal motion information of the video is fully taken into account. In addition, the present invention combines the way VR videos are produced and played: different video patch scores are given different weights and combined through the score fusion strategy to express the objective quality of the VR video as a whole. The video preprocessing adopted by the invention is simple and highly practical, and the proposed test model is fast and easy to operate. The objective VR video quality scores obtained by this method are highly consistent with subjective evaluation results and can accurately reflect the quality of VR videos.
Description of the drawings
Fig. 1 Flow chart of VR video preprocessing.
Fig. 2 Framework of the 3D CNN network.
Fig. 3 3D grid diagram.
Fig. 4 Scatter plots of subjective versus objective scores: (a) symmetric distortion, (b) asymmetric distortion, (c) H.264 distortion, (d) JPEG2000 distortion.
Specific implementation mode
In the stereoscopic video quality evaluation method based on a 3D CNN provided by the invention, each distorted VR video consists of a left video Vl and a right video Vr. The evaluation method comprises the following steps.
Step 1: build the difference video Vd according to the principle of stereoscopic perception. First, every frame of the original VR videos and the distorted VR videos is converted to grayscale; then the required difference video is computed from the left video Vl and the right video Vr. The value of the difference video Vd at video location (x, y, z) is computed as shown in formula (1):
Vd(x, y, z) = |Vl(x, y, z) - Vr(x, y, z)|    (1)
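Formula (1) maps directly onto an element-wise absolute difference. The following is a minimal NumPy sketch, not part of the patent; the array shapes and the toy data are illustrative assumptions (real inputs would be grayscale frame stacks of the left and right views):

```python
import numpy as np

def difference_video(left: np.ndarray, right: np.ndarray) -> np.ndarray:
    """Compute the VR difference video Vd(x, y, z) = |Vl - Vr|, as in formula (1).

    left, right: float arrays of shape (frames, height, width), already grayscale.
    """
    if left.shape != right.shape:
        raise ValueError("left and right videos must have the same shape")
    return np.abs(left - right)

# Toy example: two 4-frame 8x8 "videos".
rng = np.random.default_rng(0)
vl = rng.random((4, 8, 8))
vr = rng.random((4, 8, 8))
vd = difference_video(vl, vr)
```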
Step 2: cut the VR difference video into blocks to build video patches and expand the capacity of the data set. Specifically, 1 frame is extracted from every 8 frames of each VR difference video, N frames in total. Square image blocks of 32 × 32 pixels are cut at the same positions of the extracted frames; the image blocks from the same video at the same position then constitute one VR video patch. To fully extract the spatial information of the video, each frame is uniformly cut into non-overlapping image blocks, so each frame yields M image blocks. Depending on its resolution, each VR video thus yields M video patches of size 32 × 32 × N.
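Step 2 above can be sketched as follows in NumPy. This is an illustrative implementation, not the patent's code; the toy resolution is an assumption chosen so the block grid divides evenly:

```python
import numpy as np

def extract_patches(diff_video: np.ndarray, frame_step: int = 8, block: int = 32) -> np.ndarray:
    """Cut a difference video into non-overlapping block x block x N patches (Step 2).

    diff_video: array of shape (frames, height, width).
    Returns an array of shape (M, N, block, block): M blocks per frame,
    N sampled frames (1 out of every `frame_step`).
    """
    sampled = diff_video[::frame_step]          # 1 frame out of every 8
    n, h, w = sampled.shape
    rows, cols = h // block, w // block         # non-overlapping grid
    patches = []
    for r in range(rows):
        for c in range(cols):
            # Same spatial block across all sampled frames forms one patch.
            patches.append(sampled[:, r*block:(r+1)*block, c*block:(c+1)*block])
    return np.stack(patches)

video = np.zeros((64, 128, 96))   # toy: 64 frames of 128x96
patches = extract_patches(video)  # N = 64/8 = 8, M = (128//32)*(96//32) = 12
```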
Step 3: build and train the 3D CNN deep learning model. The model of the invention is composed of two 3D convolutional layers, two 3D pooling layers and two fully connected layers. Building on 2D CNNs, a 3D CNN considers the information between multiple inputs and can effectively extract the temporal motion information of the video, so the convolution and pooling processes of the 3D CNN must be explained. The formula of 3D convolution is:
Fi^l = f( Σk Fk^(l-1) * Wi^l + bi^l )    (2)
where k indexes the feature maps in layer (l-1) connected to the current convolution kernel, Fk^(l-1) is the k-th 3D feature map in layer (l-1), Wi^l is the i-th 3D convolution kernel in layer l, and * denotes convolution over Fk^(l-1). An additive bias term bi^l and a nonlinear activation function f(·) are applied to obtain the final feature map; typical choices of f(·) include the sigmoid function, the hyperbolic tangent function and the rectified linear function.
The formula of 3D pooling is:
P^l(x, y, z) = max_(m,n,j)∈R F^l(x + m, y + n, z + j)    (3)
where m, n, j range over the region R of selected points in the feature map.
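Formulas (2) and (3) can be made concrete with a naive NumPy sketch. This is illustrative only: the "valid" (no-padding) convolution, the ReLU default, max pooling, and all shapes are assumptions, and the loops are written for clarity rather than speed (as in most CNNs, the kernel is correlated rather than flipped):

```python
import numpy as np

def conv3d(feature_maps, kernels, bias, f=lambda x: np.maximum(x, 0.0)):
    """3D convolution per formula (2): Fi = f(sum_k Fk * Wi + bi).

    feature_maps: (K, D, H, W) input maps; kernels: (I, K, d, h, w); bias: (I,).
    """
    I, K, d, h, w = kernels.shape
    _, D, H, W = feature_maps.shape
    out = np.zeros((I, D - d + 1, H - h + 1, W - w + 1))
    for i in range(I):
        for z in range(out.shape[1]):
            for y in range(out.shape[2]):
                for x in range(out.shape[3]):
                    region = feature_maps[:, z:z+d, y:y+h, x:x+w]
                    out[i, z, y, x] = np.sum(region * kernels[i]) + bias[i]
    return f(out)

def maxpool3d(fmap, size=2):
    """3D max pooling per formula (3), over non-overlapping size^3 regions."""
    out = np.zeros((fmap.shape[0] // size, fmap.shape[1] // size, fmap.shape[2] // size))
    for z in range(out.shape[0]):
        for y in range(out.shape[1]):
            for x in range(out.shape[2]):
                out[z, y, x] = fmap[z*size:(z+1)*size,
                                    y*size:(y+1)*size,
                                    x*size:(x+1)*size].max()
    return out

fm = np.ones((1, 4, 4, 4))              # one 4x4x4 input map of ones
ker = np.ones((1, 1, 2, 2, 2))          # one 2x2x2 kernel of ones
res = conv3d(fm, ker, bias=np.zeros(1)) # every output value = 2*2*2 = 8
pooled = maxpool3d(res[0], size=3)      # 3x3x3 map pooled to 1x1x1
```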
In the present invention the 3D CNN is trained with stochastic gradient descent and uses ReLU as the activation function. To prevent over-fitting, the invention adopts dropout: a dropout rate of 0.5 is used after each pooling layer, and a dropout rate of 0.25 after the first fully connected layer. The mini-batch size of the network is 128 and the learning rate for model training is set to 0.001. In addition, batch normalization is applied between each convolution and the subsequent activation to accelerate network training.
The objective function adopted by this model is as follows:
L = (1/N) Σi (yi - f(xi))^2 + λ||W||^2
where λ is the regularization parameter, yi is the true quality score and f(xi) is the predicted score. After model construction, 80% of the data are used for training and 20% for testing.
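The regularized mean-squared-error objective above can be evaluated directly. The sketch below is illustrative; in particular, summing the squared L2 norm over a list of flattened weight arrays, and the value of λ, are assumptions consistent with the stated λ-regularization rather than details from the patent:

```python
import numpy as np

def objective(y_true, y_pred, weights, lam=0.01):
    """L2-regularized MSE: mean over i of (yi - f(xi))^2, plus lam * ||W||^2."""
    mse = np.mean((np.asarray(y_true) - np.asarray(y_pred)) ** 2)
    reg = lam * sum(np.sum(w ** 2) for w in weights)
    return mse + reg

y = [3.0, 4.0, 5.0]
pred = [2.5, 4.0, 5.5]
loss = objective(y, pred, weights=[np.array([1.0, -1.0])], lam=0.1)
# mse = (0.25 + 0 + 0.25)/3 = 1/6; reg = 0.1 * 2 = 0.2
```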
Step 4: after the deep model has produced patch scores, obtain the final VR video score through the score fusion strategy. The score fusion strategy used by the invention follows the equirectangular projection pattern of VR videos and assigns different weights to VR video patches at different positions to obtain the final objective quality score. Equirectangular projection significantly stretches the polar regions of the video during projection, which affects the spatial distribution of VR videos in the planar form. Since objective quality evaluation methods take the planar video as input, while subjective scores are based on the perceptual experience of the spherical video, the invention designs the score fusion strategy shown in formula (4):
Sf = Σx,y Wxy Sxy / Σx,y Wxy    (4)
where Sf is the final score, Sxy is the predicted score of the video patch at frame position (x, y), x is the width position and y the height position, and Wxy is the weight of the corresponding position, determined by the ratio of h', the vertical distance from the patch center to the VR video center, to h, the vertical height of the VR video.
Step 5: choose a database. To show that the objective quality scores predicted by the method of the invention are highly consistent with subjective quality scores and can accurately reflect image quality, the method is tested on the VRQ-TJU database. The database contains 13 original VR videos and 364 distorted VR videos; the distortion types are H.264 and JPEG2000, covering both symmetric and asymmetric distortion: 104 symmetrically distorted videos and 260 asymmetrically distorted videos.
Four indices commonly used internationally to assess objective image quality algorithms are adopted to evaluate the performance of the method of the invention: the Pearson linear correlation coefficient (PLCC), the Spearman rank-order correlation coefficient (SRCC), the Kendall rank-order correlation coefficient (KROCC) and the root mean squared error (RMSE). The closer the three correlation coefficients are to 1 and the smaller the RMSE, the more accurate the algorithm.
Step 6: analyze and compare algorithm performance. To verify the specificity and effectiveness of the invention for VR video quality evaluation, the invention is compared on the database against one representative method from each of image quality assessment (IQA), stereoscopic image quality assessment (SIQA), video quality assessment (VQA) and stereoscopic video quality assessment (SVQA), corresponding in turn to [1], [2], [3] and [4] below.
Table 1 Overall data indices
Table 2 Indices of the invention for different distortion types
[1] A. Liu, W. Lin, and M. Narwaria. Image quality assessment based on gradient similarity. IEEE Transactions on Image Processing, 21(4): 1500, 2012.
[2] Alexandre Benoit, Patrick Le Callet, Patrizio Campisi, and Romain Cousseau. Using disparity for quality assessment of stereoscopic images. In IEEE International Conference on Image Processing, pages 389–392, 2008.
[3] Kalpana Seshadrinathan, Rajiv Soundararajan, Alan Conrad Bovik, and Lawrence K. Cormack. Study of subjective and objective quality assessment of video. IEEE Transactions on Image Processing, 19(6): 1427–1441, 2010.
[4] Nukhet Ozbek and A. Murat Tekalp. Unequal inter-view rate allocation using scalable stereo video coding and an objective stereo video quality measure. In IEEE Intern.
Claims (1)
1. A stereoscopic video quality evaluation method based on a 3D CNN, the evaluation method comprising the following steps:
1) video preprocessing: obtaining a VR difference video from the left-view and right-view videos of a VR video, sampling frames uniformly from the difference video, and cutting each frame into non-overlapping blocks, the blocks at the same position across frames constituting one VR video patch, so as to generate enough data for training the 3D CNN;
2) establishing a 3D CNN model: comprising two convolutional layers, two pooling layers and two fully connected layers, the activation function being the rectified linear unit (ReLU), Dropout being used to prevent over-fitting; then adjusting the layer structure and training parameters of the network to achieve a better classification effect;
3) training the 3D CNN model: using stochastic gradient descent, taking the VR video patches as input, each patch being labeled with the quality score of its source video and input into the network in batches, the weights of every layer being fully optimized after many iterations, finally obtaining a convolutional neural network model usable for evaluating virtual reality video quality;
4) obtaining a final result: using the trained 3D CNN to obtain the scores of the VR video patches, then using the score fusion strategy to assign different weights to VR video patches at different positions, and computing the weighted sum as the final objective quality evaluation score of the virtual reality video.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810240647.XA CN108377387A (en) | 2018-03-22 | 2018-03-22 | Virtual reality method for evaluating video quality based on 3D convolutional neural networks |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810240647.XA CN108377387A (en) | 2018-03-22 | 2018-03-22 | Virtual reality method for evaluating video quality based on 3D convolutional neural networks |
Publications (1)
Publication Number | Publication Date |
---|---|
CN108377387A true CN108377387A (en) | 2018-08-07 |
Family
ID=63019046
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810240647.XA Pending CN108377387A (en) | 2018-03-22 | 2018-03-22 | Virtual reality method for evaluating video quality based on 3D convolutional neural networks |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108377387A (en) |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2015017796A2 (en) * | 2013-08-02 | 2015-02-05 | Digimarc Corporation | Learning systems and methods |
CN105898279A (en) * | 2016-06-01 | 2016-08-24 | 宁波大学 | Stereoscopic image quality objective evaluation method |
US20170270653A1 (en) * | 2016-03-15 | 2017-09-21 | International Business Machines Corporation | Retinal image quality assessment, error identification and automatic quality correction |
CN107633513A (en) * | 2017-09-18 | 2018-01-26 | 天津大学 | The measure of 3D rendering quality based on deep learning |
CN107766249A (en) * | 2017-10-27 | 2018-03-06 | 广东电网有限责任公司信息中心 | A kind of software quality comprehensive estimation method of Kernel-based methods monitoring |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109615627A (en) * | 2018-12-14 | 2019-04-12 | 国网山东省电力公司信息通信公司 | A kind of power transmission and transformation inspection image quality evaluating method and system |
US11315354B2 (en) | 2018-12-24 | 2022-04-26 | Samsung Electronics Co., Ltd. | Method and apparatus that controls augmented reality (AR) apparatus based on action prediction |
CN109871124A (en) * | 2019-01-25 | 2019-06-11 | 华南理工大学 | Emotion virtual reality scenario appraisal procedure based on deep learning |
CN109871124B (en) * | 2019-01-25 | 2020-10-27 | 华南理工大学 | Emotion virtual reality scene evaluation method based on deep learning |
CN113853796A (en) * | 2019-05-22 | 2021-12-28 | 诺基亚技术有限公司 | Methods, apparatuses and computer program products for volumetric video encoding and decoding |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN107437092B (en) | The classification method of retina OCT image based on Three dimensional convolution neural network | |
CN110555434B (en) | Method for detecting visual saliency of three-dimensional image through local contrast and global guidance | |
Zhou et al. | Binocular responses for no-reference 3D image quality assessment | |
CN108377387A (en) | Virtual reality method for evaluating video quality based on 3D convolutional neural networks | |
CN109360178B (en) | Fusion image-based non-reference stereo image quality evaluation method | |
CN108389192A (en) | Stereo-picture Comfort Evaluation method based on convolutional neural networks | |
Fang et al. | Stereoscopic image quality assessment by deep convolutional neural network | |
CN109166144A (en) | A kind of image depth estimation method based on generation confrontation network | |
CN108449595A (en) | Virtual reality method for evaluating video quality is referred to entirely based on convolutional neural networks | |
CN106462771A (en) | 3D image significance detection method | |
CN109831664B (en) | Rapid compressed stereo video quality evaluation method based on deep learning | |
Yang et al. | Predicting stereoscopic image quality via stacked auto-encoders based on stereopsis formation | |
CN108259893B (en) | Virtual reality video quality evaluation method based on double-current convolutional neural network | |
Yue et al. | Blind stereoscopic 3D image quality assessment via analysis of naturalness, structure, and binocular asymmetry | |
CN108235003B (en) | Three-dimensional video quality evaluation method based on 3D convolutional neural network | |
CN109389591A (en) | Color image quality evaluation method based on colored description | |
CN110516716A (en) | Non-reference picture quality appraisement method based on multiple-limb similarity network | |
CN109167996A (en) | It is a kind of based on convolutional neural networks without reference stereo image quality evaluation method | |
CN109523513A (en) | Based on the sparse stereo image quality evaluation method for rebuilding color fusion image | |
CN103780895B (en) | A kind of three-dimensional video quality evaluation method | |
CN108520510B (en) | No-reference stereo image quality evaluation method based on overall and local analysis | |
CN110490252A (en) | A kind of occupancy detection method and system based on deep learning | |
Kim et al. | Binocular fusion net: deep learning visual comfort assessment for stereoscopic 3D | |
CN107371016A (en) | Based on asymmetric distortion without with reference to 3D stereo image quality evaluation methods | |
CN109788275A (en) | Naturality, structure and binocular asymmetry are without reference stereo image quality evaluation method |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication |
Application publication date: 20180807 |
|