CN116600107B - HEVC-SCC quick coding method and device based on IPMS-CNN and spatial neighboring CU coding modes - Google Patents
- Publication number
- CN116600107B (application CN202310893891.7A)
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/102—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
- H04N19/103—Selection of coding mode or of prediction mode
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/0464—Convolutional networks [CNN, ConvNet]
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T9/00—Image coding
- G06T9/002—Image coding using neural networks
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/102—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
- H04N19/119—Adaptive subdivision aspects, e.g. subdivision of a picture into rectangular or non-rectangular coding blocks
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/102—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
- H04N19/124—Quantisation
Abstract
The application discloses an HEVC-SCC fast encoding method and device based on IPMS-CNN and spatially neighboring CU coding modes. The method combines convolutional-neural-network prediction of large-size CU modes with prediction of small-size CU modes based on the number of modes used by spatially neighboring CUs, aiming to reduce encoding time and computational complexity while maintaining coding quality. First, a database is constructed and a convolutional neural network model for IBC/PLT mode selection is trained; second, the input CTU is passed through the mode-selection network, which outputs the CTU's mode prediction labels; finally, the mode selected by the current CU is predicted by counting the modes used by the three neighboring CUs. The application saves encoding time and reduces the computational complexity of screen content video coding.
Description
Technical Field
The application relates to the field of video coding, in particular to an HEVC-SCC fast encoding method and device based on IPMS-CNN and spatially neighboring CU coding modes.
Background
In recent years, with the rapid development of computer vision, multimedia technology, and human-computer interaction, screen content video (Screen Content Video, SCV) applications such as screen sharing, wireless display, and remote education have emerged continuously, posing a great challenge to video coding methods designed for natural video. Traditional standards for natural video, such as High Efficiency Video Coding (HEVC), were formulated specifically for compressing natural content captured by cameras. Screen content video, by contrast, is mainly computer-generated and typically exhibits large uniform flat regions, repeated patterns and text, a limited number of colors, high saturation, high contrast, and sharp edges. If screen content video is still processed with conventional video coding standards, compression tends to be poor. To exploit these special characteristics of screen content video, the Screen Content Coding (SCC) standard, HEVC-SCC, was developed on top of HEVC by the video coding groups. The standard adds four new modes: Intra Block Copy (IBC), Palette Mode (PLT), Adaptive Color Transform (ACT), and Adaptive Motion Vector Resolution (AMVR).
Of these four modes, IBC and PLT are the two main tools for improving compression performance. IBC mode helps encode repeated patterns within the same frame, while PLT mode encodes blocks using a limited set of dominant colors. Although these two tools significantly improve SCC coding performance, they also significantly increase coding complexity.
Disclosure of Invention
The application aims to overcome the above defects in the prior art by providing an HEVC-SCC fast encoding method and device based on IPMS-CNN and spatially neighboring CU coding modes, which save encoding time, reduce the computational complexity of screen content video, and accelerate the HEVC-SCC encoding process while preserving subjective quality.
The application adopts the following technical scheme:
In one aspect, an HEVC-SCC fast encoding method based on IPMS-CNN and spatially neighboring CU coding modes includes:
a data set production step: creating video sequence data sets of different resolutions and encoding them to obtain, for each CU of HEVC-SCC under different quantization parameters, a ground-truth label indicating whether the IBC/PLT modes are used;
a network model construction step: constructing a network model, IPMS-CNN, comprising an input layer, a feature extraction layer, and an output expression layer; three convolution layers are built in the feature extraction layer to extract three feature maps, and together with the feature map obtained by downsampling, four feature maps of different sizes are extracted in total;
a network model training step: training the constructed network model on the produced data set to obtain a trained IBC/PLT mode selection convolutional neural network (IPMS-CNN) model;
a network model prediction step: inputting the LCU into the trained IPMS-CNN to obtain mode prediction labels that predict the mode selection of the CTU;
a current CU mode prediction step: calculating the number of IBC/PLT modes used by the three neighboring CUs, NumSCC_Neigh, and the number of Intra modes used by the three neighboring CUs, NumIntra_Neigh, and jointly predicting the current 8×8 CU mode from the relationship between the two counts;
and an encoding step: the encoder invokes the prediction labels from the network model prediction step and, together with the current CU mode prediction step, predicts the CU partitioning result.
Preferably, the data set production step specifically includes:
producing three video sequence data sets of different resolutions, where the data sets comprise a picture data set and a video data set and include three types of video sequences, TGM/M, A, and CC;
and then encoding on a standard encoding software platform, recording the IBC/PLT mode labels of each CU under different quantization parameters QP in the All-Intra configuration.
Preferably, the data set includes: training set, validation set and test set; each of the training set, the validation set, and the test set comprises three subsets; the first subset has a resolution of 1024×576, the second subset has a resolution of 1792×1024, and the third subset has a resolution of 2304×1280.
Preferably, the quantization parameter comprises four quantization levels, 22, 27, 32 and 37 respectively.
Preferably, in the network model construction step, three convolution layers are built in the feature extraction layer to extract three feature maps, while the downsampled feature map is sent directly into the concatenation layer of the network.
Preferably, the output expression layer comprises a fully connected layer, and the quantization parameter QP is appended to the feature vector of the fully connected layer.
Preferably, the loss function of the network model is as follows:
$L = H(y_1, \hat{y}_1) + \sum_{i=1}^{4} H(y_{2,i}, \hat{y}_{2,i}) + \sum_{j=1}^{16} H(y_{3,j}, \hat{y}_{3,j})$

where $H$ denotes the cross entropy between the true and predicted values; $y_1$ is the true mode label of the first-stage 64×64 CTU, $y_{2,1},\dots,y_{2,4}$ are the true mode labels of the four second-stage 32×32 CUs, and $y_{3,1},\dots,y_{3,16}$ are the true mode labels of the sixteen third-stage 16×16 CUs; similarly, $\hat{y}_1$, $\hat{y}_{2,i}$, and $\hat{y}_{3,j}$ are the corresponding prediction labels of the three stages. The prediction labels and the true labels of the network are binarized and lie in the range [0,1].
Preferably, in the network model prediction step, the network model outputs 21 binary labels (1 for the 64×64 CTU, 4 for the 32×32 CUs, and 16 for the 16×16 CUs) indicating whether each CU is to be partitioned and whether the IBC/PLT modes are to be selected on that basis.
Preferably, the step of predicting the current CU mode specifically includes:
when the CU size is 8×8, calculating the numbers of IBC and PLT modes and the number of Intra modes used by the three neighboring CUs;
specifically, when NumSCC_Neigh = 0, the candidate mode is only the Intra mode; when NumSCC_Neigh ≠ 0 and NumIntra_Neigh = 0, the candidate modes are the IBC and PLT modes; when NumSCC_Neigh ≠ 0 and NumIntra_Neigh ≠ 0, the candidate modes are the Intra, IBC, and PLT modes.
In another aspect, an HEVC-SCC fast encoding device based on IPMS-CNN and spatially neighboring CU coding modes comprises:
a data set production module, which creates video sequence data sets of different resolutions and encodes them to obtain, for each CU of HEVC-SCC under different quantization parameters, a ground-truth label indicating whether the IBC/PLT modes are used;
a network model construction module, which constructs a network model, IPMS-CNN, comprising an input layer, a feature extraction layer, and an output expression layer; three convolution layers are built in the feature extraction layer to extract three feature maps;
a network model training module, which trains the constructed network model on the produced data set to obtain the trained network model IPMS-CNN;
a network model prediction module, which inputs the CTU into the trained IPMS-CNN to obtain mode prediction labels that predict the mode selection of the CTU;
a current CU mode prediction module, which calculates the number of IBC/PLT modes used by the three neighboring CUs, NumSCC_Neigh, and the number of Intra modes used by the three neighboring CUs, NumIntra_Neigh, and jointly predicts the current 8×8 CU mode from the relationship between the two counts;
and an encoding module, in which the encoder invokes the prediction labels from the network model prediction step and, together with the current CU mode prediction step, predicts the CU partitioning result.
Compared with the prior art, the application has the following beneficial effects:
(1) A database is first constructed and a convolutional neural network model for IBC/PLT mode selection (IPMS-CNN) is trained; the input CTU then passes through the mode-selection network, which outputs the CTU's mode prediction labels; finally, the mode selected by the current CU is predicted by counting the modes used by the three neighboring CUs, reducing encoding time and the computational complexity of screen content video while maintaining coding quality;
(2) The application adopts a four-scale feature-fusion network structure, in which the downsampled feature map is sent to the concatenation layer together with the feature maps produced by the subsequent convolution layers; the convolution layers supply deep features and the downsampled input supplies shallow features, and combining the two not only increases the amount of training information but also provides more feature information to the fully connected layer, improving the model's feature expression capability and the accuracy of prediction mode selection;
(3) The application appends QP as an external vector to the fully connected layer of the output expression layer, so the model can better learn how to select the optimal coding mode under different QPs and adapt to various QP values, producing higher-quality reconstructed video;
(4) Combining convolutional-neural-network prediction of large-size CU modes with prediction of small-size CU modes based on the number of modes used by spatially neighboring CUs predicts CU modes more accurately and reduces the complexity of mode selection.
Drawings
FIG. 1 is a flow chart of an HEVC-SCC fast encoding method based on IPMS-CNN and spatial neighboring CU encoding modes;
FIG. 2 is a schematic diagram of an IPMS-CNN convolutional neural network according to the present application;
FIG. 3 is a schematic diagram of the current 8×8 CU and its neighboring CUs in the present application;
FIG. 4 is a detailed flowchart of the method of the present application combining convolutional-neural-network prediction of large-size CU modes with prediction of small-size CU modes based on the number of modes used by spatially neighboring CUs;
FIG. 5 is a schematic diagram of an MFF-CNN network architecture according to the present application;
fig. 6 is a block diagram of an HEVC-SCC fast encoding device based on IPMS-CNN and spatial neighboring CU coding modes according to the present application.
Detailed Description
The application will be further illustrated with reference to specific examples. It is to be understood that these examples are illustrative of the present application and are not intended to limit the scope of the present application. Furthermore, it should be understood that various changes and modifications can be made by one skilled in the art after reading the teachings of the present application, and such equivalents are intended to fall within the scope of the application as defined in the appended claims.
To address the high complexity of CU partitioning in HEVC-SCC, this embodiment provides a fast intra-frame CU partitioning and coding method for HEVC-SCC based on multi-scale feature fusion (MFF-CNN), used to accelerate encoding and reduce coding complexity without affecting subjective quality.
Specifically, referring to fig. 1, an HEVC-SCC fast encoding method based on IPMS-CNN and spatial neighboring CU encoding modes includes:
s101, a data set manufacturing step, namely, establishing video sequence data sets with different resolutions, encoding the video sequence data sets, and acquiring whether each CU of HEVC-SCC under different quantization parameters uses a real label of an IBC/PLT mode or not;
s102, a network model construction step, namely constructing a network model IPMS-CNN comprising an input layer, a feature extraction layer and an output expression layer; three convolution layers are built in the feature extraction layer, three feature graphs are extracted by the three convolution layers built in the feature extraction layer, and four feature graphs with different sizes are extracted in total by adding the feature graphs obtained after downsampling;
s103, training a network model, namely training the constructed network model based on the manufactured data set to obtain a trained IBC/PLT mode selection convolutional neural network (IBC/PLT Mode Selection Convolution Neural Network, IPMS-CNN) model;
s104, inputting the LCU into the trained IPMS-CNN to obtain mode prediction labels of 64×64, 32×32 and 16×16CU so as to predict the mode selection of the CTU;
s105,8×8CU mode prediction step, 8×8CU mode prediction is performed by counting spatial phaseThe number of modes employed by the neighboring CU predicts the modes that the current CU may choose. Calculating the number of IBC/PLT modes used by 3 neighboring CUsAnd the number of Intra modes used by the neighboring 3 CUsJointly predicting a current 8×8CU mode according to two quantitative relationships;
s106, an encoding step, wherein the encoder calls a prediction label based on the network model prediction step, and predicts the CU dividing result together with the current CU mode prediction step.
In this embodiment, the data set production step is specifically as follows:
three video sequence data sets of different resolutions are produced, 1024×576, 1792×1024, and 2304×1280; the data sets cover a picture data set and a video data set, and each resolution includes three types of video sequences, TGM/M, A, and CC;
all data set sequences are encoded with HM16.12+SCM8.3 under the All Intra (AI) configuration with quantization parameter QP set to 22, 27, 32, and 37, yielding the IBC/PLT mode labels of each CU at the four QPs for the subsequent network model training.
Referring to FIG. 2, the IPMS-CNN network of this embodiment has three components: an input layer, a feature extraction layer, and an output expression layer. Three convolution layers are built in the feature extraction layer of the network to extract three feature maps, and the downsampled feature map is sent directly into the concatenation layer of the network; the network thus obtains 12 feature maps of different scales. The network adopts a cross-entropy loss function as its objective, with the following formula:
$L = H(y_1, \hat{y}_1) + \sum_{i=1}^{4} H(y_{2,i}, \hat{y}_{2,i}) + \sum_{j=1}^{16} H(y_{3,j}, \hat{y}_{3,j})$

where $H$ denotes the cross entropy between the true and predicted values; $y_1$ is the true mode label of the first-stage 64×64 CTU, $y_{2,1},\dots,y_{2,4}$ are the true mode labels of the four second-stage 32×32 CUs, and $y_{3,1},\dots,y_{3,16}$ are the true mode labels of the sixteen third-stage 16×16 CUs; similarly, $\hat{y}_1$, $\hat{y}_{2,i}$, and $\hat{y}_{3,j}$ are the corresponding prediction labels of the three stages. The prediction labels and the true labels of the network are binarized and lie in the range [0,1].
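Concretely, this staged cross-entropy sum can be computed as follows (a minimal sketch in PyTorch; the tensor layout, one (N, 21) prediction ordered as 1 + 4 + 16 labels, is an assumption):

```python
import torch

def ipms_cnn_loss(pred: torch.Tensor, target: torch.Tensor) -> torch.Tensor:
    """Sum of binary cross entropies over the three label stages.

    pred, target: (N, 21) tensors in [0, 1], assumed ordered as
    [y1 (1 label), y2 (4 labels), y3 (16 labels)] per the formula above.
    """
    bce = torch.nn.functional.binary_cross_entropy
    h1 = bce(pred[:, :1], target[:, :1], reduction="sum")    # 64x64 stage
    h2 = bce(pred[:, 1:5], target[:, 1:5], reduction="sum")  # four 32x32 CUs
    h3 = bce(pred[:, 5:], target[:, 5:], reduction="sum")    # sixteen 16x16 CUs
    return (h1 + h2 + h3) / pred.shape[0]                    # mean over batch
```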
The quantization parameter QP controls video compression quality: the larger the QP, the higher the compression ratio, but the lower the quality of the compressed image. In HEVC-SCC, QP is also very important for SCC mode selection, since it controls the size of the quantization error and thus affects the final image quality. When encoding the same video frame, the larger the QP, the coarser the selected CU modes, and the Intra mode tends to be chosen; the smaller the QP, the more likely the encoder is to find a suitable reference and select the IBC or PLT mode. Therefore, by appending QP as an external vector to the fully connected layer of the output expression layer, the model can better learn how to select the optimal coding mode under different QPs and adapt to various QP values, producing higher-quality reconstructed video.
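For concreteness, the following is a minimal PyTorch sketch of a network in this spirit. The kernel sizes, channel counts, and pooling factor are illustrative assumptions (the text above does not fix them); what does follow the description is the three convolution layers, the downsampled input fed straight to the concatenation layer, the QP appended to the fully connected layer's feature vector, and the 21 sigmoid outputs.

```python
import torch
import torch.nn as nn

class IPMSCNNSketch(nn.Module):
    """Illustrative IPMS-CNN-style network: three conv feature maps plus a
    downsampled copy of the input are concatenated, QP is appended as an
    external vector, and 21 binary labels are emitted."""

    def __init__(self):
        super().__init__()
        self.conv1 = nn.Sequential(nn.Conv2d(1, 16, 3, stride=2, padding=1), nn.ReLU())
        self.conv2 = nn.Sequential(nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU())
        self.conv3 = nn.Sequential(nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU())
        self.down = nn.AvgPool2d(8)  # shallow branch fed straight to the concat layer
        concat_dim = 16 * 32 * 32 + 32 * 16 * 16 + 64 * 8 * 8 + 1 * 8 * 8 + 1  # +1 for QP
        self.fc = nn.Sequential(nn.Linear(concat_dim, 128), nn.ReLU(), nn.Linear(128, 21))

    def forward(self, ctu: torch.Tensor, qp: torch.Tensor) -> torch.Tensor:
        # ctu: (N, 1, 64, 64) luma CTU; qp: (N, 1) quantization parameter.
        f1 = self.conv1(ctu)   # (N, 16, 32, 32) shallow conv features
        f2 = self.conv2(f1)    # (N, 32, 16, 16)
        f3 = self.conv3(f2)    # (N, 64, 8, 8)  deep conv features
        f0 = self.down(ctu)    # (N, 1, 8, 8)   downsampled input
        feats = torch.cat([t.flatten(1) for t in (f1, f2, f3, f0)], dim=1)
        feats = torch.cat([feats, qp], dim=1)  # QP appended as an external vector
        return torch.sigmoid(self.fc(feats))   # 21 labels in [0, 1]

# Usage with the loss sketched above:
# model = IPMSCNNSketch()
# loss = ipms_cnn_loss(model(ctu, qp), labels)
```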
In this embodiment, the network model ultimately outputs 21 binary labels, 1 for the 64×64 CTU, 4 for the 32×32 CUs, and 16 for the 16×16 CUs, indicating whether each is to be partitioned and whether the IBC/PLT modes are to be selected on that basis.
The current CU mode predicting step specifically includes:
when NumSCC_Neigh = 0, the candidate mode is only the Intra mode; when NumSCC_Neigh ≠ 0 and NumIntra_Neigh = 0, the candidate modes are the IBC and PLT modes; when NumSCC_Neigh ≠ 0 and NumIntra_Neigh ≠ 0, the candidate modes are the Intra, IBC, and PLT modes.
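This decision rule maps directly onto a small function; a minimal sketch follows (the choice of the three neighbors, e.g. left, above, and above-left as in FIG. 3, is an assumption here):

```python
from enum import Enum, auto

class Mode(Enum):
    INTRA = auto()
    IBC = auto()
    PLT = auto()

def candidate_modes(neighbor_modes: list[Mode]) -> set[Mode]:
    """Predict the candidate modes of the current 8x8 CU from the modes
    used by its three spatially neighboring CUs."""
    num_scc = sum(m in (Mode.IBC, Mode.PLT) for m in neighbor_modes)  # NumSCC_Neigh
    num_intra = sum(m is Mode.INTRA for m in neighbor_modes)          # NumIntra_Neigh
    if num_scc == 0:
        return {Mode.INTRA}                  # only Intra remains a candidate
    if num_intra == 0:
        return {Mode.IBC, Mode.PLT}          # only the SCC modes remain
    return {Mode.INTRA, Mode.IBC, Mode.PLT}  # all three modes remain

# Example: neighbors used IBC, IBC, Intra -> all three modes are tested.
print(candidate_modes([Mode.IBC, Mode.IBC, Mode.INTRA]))
```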
Referring to fig. 3, a schematic diagram of the current 8×8 CU and its neighboring CUs according to this embodiment is shown.
Referring to fig. 4, the detailed flow of the method of this embodiment, which combines convolutional-neural-network prediction of large-size CU modes with prediction of small-size CU modes based on the number of modes used by spatially neighboring CUs, is as follows:
(a) The LCU is input to the network, and the multi-scale feature fusion convolutional neural network model MFF-CNN (Multi-scale Feature Fusion Convolution Neural Network, see fig. 5) and the IPMS-CNN mode selection model are invoked; 21 binary labels are output, indicating whether the 64×64, 32×32, and 16×16 CUs are to be partitioned and whether the IBC/PLT modes are to be selected on that basis;
(b) When the CU size is 8×8, the numbers of IBC and PLT modes and the number of Intra modes used by the three neighboring CUs are calculated; when NumSCC_Neigh = 0, the candidate mode is only the Intra mode; when NumSCC_Neigh ≠ 0 and NumIntra_Neigh = 0, the candidate modes are the IBC and PLT modes; when NumSCC_Neigh ≠ 0 and NumIntra_Neigh ≠ 0, the candidate modes are the Intra, IBC, and PLT modes;
(c) The encoder invokes the network labels from step (a) and, together with step (b), predicts the CU partitioning result, thereby skipping unnecessary mode traversals, reducing encoding time, and accelerating the encoding of screen content video.
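Putting steps (a)-(c) together, the toy sketch below shows how the encoder-side quadtree recursion might consult the predicted labels to skip rate-distortion tests; the label encoding (a (split, use_scc) pair per CU at the 64/32/16 levels) and the two callbacks are hypothetical stand-ins for real encoder hooks, and candidate_modes is the function sketched above.

```python
def encode_ctu(ctu_labels: dict, neighbor_modes_fn, rd_search_fn,
               x: int = 0, y: int = 0, size: int = 64) -> None:
    """Toy recursion over one CTU quadtree (64x64 down to 8x8).

    ctu_labels maps (x, y, size) -> (split, use_scc) for the 1 + 4 + 16
    network-predicted CUs; neighbor_modes_fn(x, y) returns the modes of the
    three neighboring CUs; rd_search_fn(x, y, size, modes) RD-tests only the
    surviving candidate modes.
    """
    if size > 8:
        split, use_scc = ctu_labels[(x, y, size)]
        if split:  # predicted to be partitioned: recurse, skip mode tests here
            half = size // 2
            for dx, dy in ((0, 0), (half, 0), (0, half), (half, half)):
                encode_ctu(ctu_labels, neighbor_modes_fn, rd_search_fn,
                           x + dx, y + dy, half)
            return
        modes = {Mode.IBC, Mode.PLT} if use_scc else {Mode.INTRA}
    else:  # 8x8 CU: apply the spatial-neighbor rule of step (b)
        modes = candidate_modes(neighbor_modes_fn(x, y))
    rd_search_fn(x, y, size, modes)  # unnecessary mode traversals are skipped
```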
Referring to fig. 5, a schematic diagram of the MFF-CNN network structure according to the present embodiment is shown.
Referring to fig. 6, the embodiment further discloses an HEVC-SCC fast encoding device based on IPMS-CNN and spatial neighboring CU encoding modes, including:
the data set production module 601, which creates video sequence data sets of different resolutions and encodes them to obtain, for each CU of HEVC-SCC under different quantization parameters, a ground-truth label indicating whether the IBC/PLT modes are used;
the network model construction module 602, which constructs a network model, IPMS-CNN, comprising an input layer, a feature extraction layer, and an output expression layer; three convolution layers are built in the feature extraction layer to extract three feature maps;
the network model training module 603 trains the constructed network model based on the produced data set to obtain a trained network model IPMS-CNN;
the network model prediction module 604 inputs the CTU to the trained IPMS-CNN to obtain a mode prediction label to predict the mode selection of the CTU;
the current CU mode prediction module 605, which calculates the number of IBC/PLT modes used by the three neighboring CUs, NumSCC_Neigh, and the number of Intra modes used by the three neighboring CUs, NumIntra_Neigh, and jointly predicts the current 8×8 CU mode from the relationship between the two counts;
the encoding module 606, in which the encoder invokes the prediction labels from the network model prediction step and, together with the current CU mode prediction step, predicts the CU partitioning result.
The specific implementation of each module of the HEVC-SCC fast encoding device based on IPMS-CNN and spatially neighboring CU coding modes is the same as that of the corresponding steps of the HEVC-SCC fast encoding method above, and is not repeated in this embodiment.
The foregoing is merely illustrative of specific embodiments of the present application, but the design concept of the present application is not limited thereto, and any insubstantial modification of the present application by using the design concept shall fall within the scope of the present application.
Claims (8)
1. An HEVC-SCC fast encoding method based on IPMS-CNN and spatially neighboring CU coding modes, characterized by comprising:
a data set production step: creating video sequence data sets of different resolutions and encoding them to obtain, for each CU of HEVC-SCC under different quantization parameters, a ground-truth label indicating whether the IBC/PLT modes are used;
a network model construction step: constructing a network model, IPMS-CNN, comprising an input layer, a feature extraction layer, and an output expression layer; three convolution layers are built in the feature extraction layer to extract three feature maps, and together with the feature map obtained by downsampling, four feature maps of different sizes are extracted in total;
a network model training step: training the constructed network model on the produced data set to obtain a trained IBC/PLT mode selection convolutional neural network (IPMS-CNN) model;
a network model prediction step: inputting the LCU into the trained IPMS-CNN to obtain mode prediction labels that predict the mode selection of the CTU;
a current CU mode prediction step: calculating the number of IBC/PLT modes used by the three neighboring CUs, NumSCC_Neigh, and the number of Intra modes used by the three neighboring CUs, NumIntra_Neigh, and jointly predicting the current 8×8 CU mode from the relationship between the two counts; and an encoding step: the encoder invokes the prediction labels from the network model prediction step and, together with the current CU mode prediction step, predicts the CU partitioning result;
wherein, in the network model prediction step, the network model outputs 21 binary labels indicating whether the 64×64, 32×32, and 16×16 CUs are to be partitioned and whether the IBC/PLT modes are to be selected on that basis;
the current CU mode prediction step specifically includes:
when the CU size is 8×8, calculating the numbers of IBC and PLT modes and the number of Intra modes used by the three neighboring CUs; specifically, when NumSCC_Neigh = 0, the candidate mode is only the Intra mode; when NumSCC_Neigh ≠ 0 and NumIntra_Neigh = 0, the candidate modes are the IBC and PLT modes; when NumSCC_Neigh ≠ 0 and NumIntra_Neigh ≠ 0, the candidate modes are the Intra, IBC, and PLT modes.
2. The HEVC-SCC fast encoding method based on IPMS-CNN and spatially neighboring CU coding modes according to claim 1, wherein the data set production step specifically includes:
producing three video sequence data sets of different resolutions, where the data sets comprise a picture data set and a video data set and include three types of video sequences, TGM/M, A, and CC;
and then encoding on a standard encoding software platform, recording the IBC/PLT mode labels of each CU under different quantization parameters QP in the All-Intra configuration.
3. The HEVC-SCC fast encoding method based on IPMS-CNN and spatially neighboring CU coding modes according to claim 1, wherein the data set includes a training set, a validation set, and a test set; each of the training set, the validation set, and the test set comprises three subsets; the first subset has a resolution of 1024×576, the second subset 1792×1024, and the third subset 2304×1280.
4. The HEVC-SCC fast encoding method based on IPMS-CNN and spatially neighboring CU coding modes according to claim 1, wherein the quantization parameter comprises four quantization levels: 22, 27, 32, and 37.
5. The HEVC-SCC fast encoding method based on IPMS-CNN and spatially neighboring CU coding modes according to claim 1, wherein, in the network model construction step, three convolution layers are built in the feature extraction layer to extract three feature maps, while the downsampled feature map is sent directly into the concatenation layer of the network.
6. The HEVC-SCC fast encoding method based on IPMS-CNN and spatially neighboring CU coding modes according to claim 1, wherein the output expression layer comprises a fully connected layer, and the quantization parameter QP is appended to the feature vector of the fully connected layer.
7. The HEVC-SCC fast encoding method based on IPMS-CNN and spatially neighboring CU coding modes according to claim 1, wherein the loss function of the network model is:
$L = H(y_1, \hat{y}_1) + \sum_{i=1}^{4} H(y_{2,i}, \hat{y}_{2,i}) + \sum_{j=1}^{16} H(y_{3,j}, \hat{y}_{3,j})$
where $H$ represents the cross entropy of the true and predicted values; $y_1$ is the true mode label of the first-stage 64×64 CTU, $y_{2,1},\dots,y_{2,4}$ are the true mode labels of the four second-stage 32×32 CUs, and $y_{3,1},\dots,y_{3,16}$ are the true mode labels of the sixteen third-stage 16×16 CUs; similarly, $\hat{y}_1$, $\hat{y}_{2,i}$, and $\hat{y}_{3,j}$ are the corresponding prediction labels of the three stages; the prediction labels and the true labels of the network are binarized and lie in the range [0,1].
8. An HEVC-SCC fast encoding device based on IPMS-CNN and spatially neighboring CU coding modes, characterized by comprising:
a data set production module, which creates video sequence data sets of different resolutions and encodes them to obtain, for each CU of HEVC-SCC under different quantization parameters, a ground-truth label indicating whether the IBC/PLT modes are used;
a network model construction module, which constructs a network model, IPMS-CNN, comprising an input layer, a feature extraction layer, and an output expression layer; three convolution layers are built in the feature extraction layer to extract three feature maps;
a network model training module, which trains the constructed network model on the produced data set to obtain the trained network model IPMS-CNN;
a network model prediction module, which inputs the CTU into the trained IPMS-CNN to obtain mode prediction labels that predict the mode selection of the CTU;
a current CU mode prediction module, which calculates the number of IBC/PLT modes used by the three neighboring CUs, NumSCC_Neigh, and the number of Intra modes used by the three neighboring CUs, NumIntra_Neigh, and jointly predicts the current 8×8 CU mode from the relationship between the two counts; and an encoding module, in which the encoder invokes the prediction labels from the network model prediction step and, together with the current CU mode prediction step, predicts the CU partitioning result;
wherein, in the network model prediction module, the network model outputs 21 binary labels indicating whether the 64×64, 32×32, and 16×16 CUs are to be partitioned and whether the IBC/PLT modes are to be selected on that basis;
in the current CU mode prediction module, specifically:
when the CU size is 8×8, the numbers of IBC and PLT modes and the number of Intra modes used by the three neighboring CUs are calculated; specifically, when NumSCC_Neigh = 0, the candidate mode is only the Intra mode; when NumSCC_Neigh ≠ 0 and NumIntra_Neigh = 0, the candidate modes are the IBC and PLT modes; when NumSCC_Neigh ≠ 0 and NumIntra_Neigh ≠ 0, the candidate modes are the Intra, IBC, and PLT modes.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310893891.7A CN116600107B (en) | 2023-07-20 | 2023-07-20 | HEVC-SCC quick coding method and device based on IPMS-CNN and spatial neighboring CU coding modes |
Publications (2)
Publication Number | Publication Date |
---|---|
CN116600107A CN116600107A (en) | 2023-08-15 |
CN116600107B true CN116600107B (en) | 2023-11-21 |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117915080B (en) * | 2024-01-31 | 2024-10-01 | 重庆邮电大学 | Quick coding mode decision method suitable for VVC SCC |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107087172A (en) * | 2017-03-22 | 2017-08-22 | 中南大学 | Quick code check code-transferring method and its system based on HEVC SCC |
CN107623850A (en) * | 2017-09-26 | 2018-01-23 | 杭州电子科技大学 | A kind of quick screen contents encoding method based on temporal correlation |
CN112601087A (en) * | 2020-11-23 | 2021-04-02 | 郑州轻工业大学 | Fast CU splitting mode decision method for H.266/VVC |
CN113079373A (en) * | 2021-04-01 | 2021-07-06 | 北京允博瑞捷信息科技有限公司 | Video coding method based on HEVC-SCC |
CN114286093A (en) * | 2021-12-24 | 2022-04-05 | 杭州电子科技大学 | Rapid video coding method based on deep neural network |
CN115314710A (en) * | 2020-01-08 | 2022-11-08 | Oppo广东移动通信有限公司 | Encoding method, decoding method, encoder, decoder, and storage medium |
CN115941943A (en) * | 2022-12-02 | 2023-04-07 | 杭州电子科技大学 | HEVC video coding method |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US11627327B2 (en) * | 2019-08-05 | 2023-04-11 | Qualcomm Incorporated | Palette and prediction mode signaling |
Non-Patent Citations (1)
Title |
---|
Fast intra coding algorithm for screen content based on convolutional neural networks; Zhang Qianyun; China Master's Theses Full-text Database (Information Science and Technology Series), No. 02, 2021; I136-539, sections 4.1-4.6.1 *
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
GR01 | Patent grant | |