CN116982082A - Image encoding/decoding method, encoder, decoder, and storage medium - Google Patents

Image encoding/decoding method, encoder, decoder, and storage medium

Info

Publication number
CN116982082A
Authority
CN
China
Prior art keywords
target
quantization
value
channels
bit width
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202180090510.0A
Other languages
Chinese (zh)
Inventor
虞露
周胜辉
邵宇超
于化龙
戴震宇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang University ZJU
Guangdong Oppo Mobile Telecommunications Corp Ltd
Original Assignee
Zhejiang University ZJU
Guangdong Oppo Mobile Telecommunications Corp Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang University ZJU, Guangdong Oppo Mobile Telecommunications Corp Ltd filed Critical Zhejiang University ZJU
Publication of CN116982082A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T9/00 Image coding
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/124 Quantisation

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Signal Processing (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)

Abstract

The present application provides an image encoding and decoding method, an encoder, a decoder, and a storage medium. A current image to be encoded is acquired and input into a neural network to obtain feature data of the current image, where the feature data of the current image include feature data of N channels; the feature data of at least one of the N channels are quantized; and the quantized feature data of the at least one channel are encoded to obtain a code stream, where the code stream includes first information indicating that the decoding end is to dequantize the feature data of at least one of the N channels. The method and the device quantize the feature data output by an intermediate layer of the neural network, so that the feature data can be encoded by reusing techniques in existing video and image coding standards, thereby improving encoding efficiency.

Description

Image encoding/decoding method, encoder, decoder, and storage medium
Technical Field
The present application relates to the field of video encoding and decoding technologies, and in particular, to an image encoding and decoding method, an encoder, a decoder, and a storage medium.
Background
Digital video technology may be incorporated into a variety of video devices, such as digital televisions, smartphones, computers, e-readers, or video players. With the development of video technology, the amount of video data grows ever larger; to facilitate its transmission, video devices apply video compression techniques so that video data can be transmitted or stored more efficiently.
With the rapid development of visual analysis technology, neural network technology has been combined with image and video compression technology, and video coding frameworks oriented to machine vision have been proposed.
However, current machine-vision-oriented video coding frameworks have low coding efficiency.
Disclosure of Invention
The embodiment of the application provides an image coding and decoding method, an encoder, a decoder and a storage medium, so as to improve coding efficiency.
In a first aspect, the present application provides an image encoding method, including:
acquiring a current image to be coded;
inputting the current image into a neural network to obtain characteristic data of the current image, wherein the characteristic data of the current image comprises characteristic data of N channels, and N is a positive integer;
quantizing the characteristic data of at least one of the N channels;
and encoding the quantized feature data of the at least one channel to obtain a code stream, where the code stream includes first information, the first information being used to indicate that the feature data of the at least one of the N channels are to be dequantized.
In a second aspect, an embodiment of the present application provides an image decoding method, including:
decoding a code stream to obtain characteristic data of a current image, wherein the characteristic data of the current image comprises characteristic data of N channels, and N is a positive integer;
decoding a code stream to obtain first information, wherein the first information is used for indicating feature data of at least one channel in the N channels to be dequantized;
and dequantizing the characteristic data of the at least one channel according to the first information.
In a third aspect, the present application provides a video encoder for performing the method of the first aspect or implementations thereof. In particular, the encoder comprises functional units for performing the method of the first aspect described above or in various implementations thereof.
In a fourth aspect, the present application provides a video decoder for performing the method of the second aspect or implementations thereof. In particular, the decoder comprises functional units for performing the method of the second aspect described above or in various implementations thereof.
In a fifth aspect, a video encoder is provided that includes a processor and a memory. The memory is for storing a computer program and the processor is for calling and running the computer program stored in the memory for performing the method of the first aspect or implementations thereof.
In a sixth aspect, a video decoder is provided that includes a processor and a memory. The memory is for storing a computer program and the processor is for invoking and running the computer program stored in the memory to perform the method of the second aspect or implementations thereof described above.
In a seventh aspect, a video codec system is provided that includes a video encoder and a video decoder. The video encoder is for performing the method of the first aspect described above or in various implementations thereof, and the video decoder is for performing the method of the second aspect described above or in various implementations thereof.
An eighth aspect provides a chip for implementing the method of any one of the first to second aspects or each implementation thereof. Specifically, the chip includes: a processor for calling and running a computer program from a memory, causing a device on which the chip is mounted to perform the method as in any one of the first to second aspects or implementations thereof described above.
In a ninth aspect, a computer-readable storage medium is provided for storing a computer program for causing a computer to perform the method of any one of the above first to second aspects or implementations thereof.
In a tenth aspect, there is provided a computer program product comprising computer program instructions for causing a computer to perform the method of any one of the first to second aspects or implementations thereof.
In an eleventh aspect, there is provided a computer program which, when run on a computer, causes the computer to perform the method of any one of the above-described first to second aspects or implementations thereof.
Based on the above technical solution, a current image to be encoded is acquired and input into a neural network to obtain feature data of the current image, where the feature data of the current image include feature data of N channels; the feature data of at least one of the N channels are quantized; and the quantized feature data of the at least one channel are encoded to obtain a code stream, where the code stream includes first information indicating that the decoding end is to dequantize the feature data of at least one of the N channels. The present application quantizes the feature data output by an intermediate layer of the neural network, so that the feature data can be encoded by reusing techniques in existing video and image coding standards, thereby improving encoding efficiency.
Drawings
FIG. 1 is a schematic diagram of a codec framework that pre-analyzes and then compresses images, in accordance with an embodiment of the present application;
FIG. 2 is a schematic diagram of a potential encoding scheme for MPEG-VCM;
fig. 3 is a flowchart of an image encoding method 300 according to an embodiment of the present application;
fig. 4 is a flowchart of an image encoding method 400 according to an embodiment of the present application;
fig. 5 is a flowchart of an image encoding method 500 according to an embodiment of the present application;
fig. 6 is a flowchart of an image encoding method 600 according to an embodiment of the present application;
fig. 7 is a flowchart of an image decoding method 700 according to an embodiment of the present application;
fig. 8 is a flowchart of an image decoding method 800 according to an embodiment of the present application;
fig. 9 is a flowchart of an image decoding method 900 according to an embodiment of the present application;
fig. 10 is a flowchart of an image decoding method 1000 according to an embodiment of the present application;
fig. 11 is a schematic block diagram of a video encoder 10 provided by an embodiment of the present application;
fig. 12 is a schematic block diagram of video decoder 20 provided by an embodiment of the present application;
FIG. 13 is a schematic block diagram of an electronic device 30 provided by an embodiment of the present application;
fig. 14 is a schematic block diagram of a video codec system 40 provided by an embodiment of the present application.
Detailed Description
The present application can be applied to various fields of video coding and decoding oriented to machine vision and human-machine hybrid vision, combining technologies such as 5G, AI, deep learning, feature extraction, and video analysis with existing video processing and coding technologies. The 5G era is driving the mass adoption of machine-oriented applications, such as machine-vision content for the Internet of Vehicles, autonomous driving, the industrial Internet, smart and safe cities, wearables, and video surveillance. Compared with the increasingly saturated human-oriented video market, these application scenarios are far broader, and machine-vision-oriented video coding is becoming one of the main sources of incremental traffic in the 5G and post-5G era.
For example, the solution of the present application may be combined with the audio video coding standard (AVS), or with the H.264/advanced video coding (AVC), H.265/high efficiency video coding (HEVC), and H.266/versatile video coding (VVC) standards. Alternatively, aspects of the present application may operate in conjunction with other proprietary or industry standards, including ITU-T H.261, ISO/IEC MPEG-1 Visual, ITU-T H.262 or ISO/IEC MPEG-2 Visual, ITU-T H.263, ISO/IEC MPEG-4 Visual, and ITU-T H.264 (also known as ISO/IEC MPEG-4 AVC), including its Scalable Video Coding (SVC) and Multiview Video Coding (MVC) extensions. It should be understood that the techniques of the present application are not limited to any particular codec standard or technique.
Fig. 1 is a schematic diagram of a codec framework that pre-analyzes and then compresses images according to an embodiment of the present application.
In application scenarios oriented to intelligent analysis, videos and images not only need to be presented to users for high-quality viewing, but are also used to analyze and understand the semantic information they contain. To meet the particular requirements that intelligent analysis tasks place on video and image coding, many researchers have shifted from directly compressing images to compressing the feature data output by an intermediate layer of the intelligent-analysis task network.
As shown in fig. 1, a terminal device such as a camera first pre-analyzes the captured or input original video and image data using task networks, for example task network A, task network B, and task network C, extracts feature data sufficient for cloud-side analysis, and then compresses, encodes, and transmits the feature data. After receiving the corresponding code stream, the cloud device reconstructs the corresponding feature data according to the syntax information of the code stream and inputs them into the specific task network for further analysis. Under the codec framework shown in fig. 1, a large amount of feature data is transmitted between the terminal device and the cloud device; the purpose of feature data compression is to compress and encode, in a recoverable manner, the feature data extracted by existing task networks for further intelligent analysis and processing in the cloud.
Aiming at the problem of efficient video and image coding for the intelligent-analysis task scenario shown in fig. 1, MPEG (Moving Picture Experts Group, originally WG 11) under ISO/IEC JTC 1/SC 29 (the subcommittee for coding of audio, picture, multimedia and hypermedia information) established the Video Coding for Machines (VCM) standard working group at its 127th meeting in 2019. Its goal is to define a code stream for compressed video, or for feature information extracted from video, such that multiple intelligent analysis tasks can be executed from the same code stream without significantly reducing the analysis performance of those tasks: the decompressed information is friendlier to intelligent analysis tasks, and the loss of task performance at the same code rate is smaller. Meanwhile, a standards working meeting under China's National Information Technology Standardization Technical Committee was held for the first time in Hangzhou, Zhejiang Province in January 2020, at which the Data Compression for Machines (DCM) standard working group was correspondingly established to study technical applications in this area, supporting machine-intelligence applications and human-machine hybrid intelligence applications through efficient data representation and compression.
Fig. 2 is a schematic diagram of a potential MPEG-VCM coding scheme. The current VCM standard working group has designed the potential coding flow shown in fig. 2 to improve the coding efficiency of videos and images under intelligent analysis tasks. A video or image may pass directly through a video/image encoder optimized for the task, or it may first undergo network pre-analysis to extract and encode feature data, with the decoded feature data then input into the subsequent network for further analysis. If the extracted feature data are to be compressed by reusing existing video and image coding standards, the floating-point feature data must first be converted to fixed-point form.
The image encoding method according to the embodiment of the present application will be described in detail with reference to specific examples.
Taking the encoding end as an example, the encoding process will be described.
Fig. 3 is a flowchart of an image encoding method 300 according to an embodiment of the present application. The execution body of this embodiment may be understood as the encoder shown in fig. 2. As shown in fig. 3, the method includes:
S301, acquiring a current image to be encoded.
S302, inputting a current image into a neural network to obtain feature data of the current image, wherein the feature data of the current image comprise feature data of N channels, and N is a positive integer;
S303, quantizing the characteristic data of at least one channel in the N channels;
S304, encoding the quantized feature data of the at least one channel to obtain a code stream, where the code stream includes first information indicating that the feature data of at least one of the N channels are to be dequantized.
The current image in the present application may be understood as a frame to be encoded in a video stream, or a partial image of such a frame; alternatively, the current image may be understood as a single image to be encoded, or a partial image of the image to be encoded.
The neural network is any task network, such as a classification network, a target detection network, a semantic segmentation network, or the like; the type of the neural network is not limited in the present application.
The current image is input into the neural network to obtain feature data output by the middle layer of the neural network, and in some embodiments, the feature data is of a floating point type, and in order to multiplex the existing video coding frame, the feature data of the floating point type needs to be quantized.
In some embodiments, most existing video coding frameworks compress fixed-point type data when compressing, so an encoder needs to quantize floating-point type feature data into fixed-point type feature data, and encode the fixed-point type feature data.
Optionally, the fixed-point number type feature data includes integer type feature data, i.e. the encoder quantizes floating-point number type feature data into integer type feature data.
In some embodiments, the floating point type feature data of the current image includes N channels of floating point type feature data, where N is a positive integer, and the encoder quantizes the floating point type feature data of at least one of the N channels.
In some embodiments, the manner of quantizing the floating-point feature data of at least one of the N channels in S303 includes, but is not limited to, the following:
In the first mode, the floating-point feature data of all channels in the N channels are quantized using the same quantization mode; in this case, one set of quantization parameters is transmitted in the code stream for all N channels.
In the second mode, the floating-point feature data of each of the N channels are quantized using a separate quantization mode; in this case, one set of quantization parameters is transmitted in the code stream for each of the N channels.
In the third mode, the N channels are divided into groups, and the floating-point feature data of each group of channels are quantized using a separate quantization mode; in this case, one set of quantization parameters is transmitted in the code stream for each group of channels.
In some embodiments, the quantization mode used to quantize the floating-point feature data of the at least one channel may include a linear uniform quantization mode, a nonlinear uniform quantization mode, or a table look-up quantization mode. The nonlinear uniform quantization mode further includes nonlinear exponential function quantization and nonlinear logarithmic function quantization. It should be noted that the quantization modes of the embodiments of the present application include, but are not limited to, the above modes; other quantization modes that quantize floating-point feature data into fixed-point feature data may also be used.
According to the image encoding method of the present application, a current image to be encoded is acquired and input into a neural network to obtain floating-point feature data of the current image, where the floating-point feature data include floating-point feature data of N channels; the floating-point feature data of at least one of the N channels are quantized; and the quantized feature data of the at least one channel are encoded to obtain a code stream. In this way, the feature data output by an intermediate layer of the neural network are converted to fixed-point form, so that they can be encoded by reusing techniques in existing video and image coding standards; quantizing the feature data of at least one of the N channels improves the encoding efficiency of the fixed-point feature data and achieves efficient feature compression. In addition, the present application takes channel information into account during quantization at the encoding end and can treat the feature data of different channels differently, further improving the reliability of feature quantization.
The process of quantizing the floating-point feature data of all of the N channels using the same quantization mode is described in detail below with reference to fig. 4.
Fig. 4 is a flowchart of an image encoding method 400 according to an embodiment of the present application, as shown in fig. 4, including:
S401, acquiring a current image to be encoded.
S402, inputting the current image into a neural network to obtain feature data of floating point number types of N channels of the current image;
S403, quantizing the floating-point feature data of all channels in the N channels using the same quantization mode;
S404, encoding the fixed-point feature data of the current image to obtain a code stream.
Optionally, the quantization mode includes linear uniform quantization, nonlinear function quantization, and table look-up quantization.
The code stream comprises characteristic data of fixed point number type under all channels.
In some embodiments, if the present embodiment quantizes the floating point type feature data of all the N channels in a linear uniform quantization manner, the above S403 includes the following S403-A1 and S403-A2:
S403-A1, acquiring a preset first quantization bit width, and a first characteristic value and a second characteristic value in the floating point number type characteristic data of all channels in N channels;
S403-A2, according to the first characteristic value, the second characteristic value and the first quantization bit width, using a linear uniform quantization mode to quantize the characteristic data of the floating point number type of each of the N channels.
In the embodiment of the application, the characteristic data of the floating point number type of all channels in the N channels are taken as a whole, and the first characteristic value and the second characteristic value are obtained from the characteristic data of the floating point number type of all channels in the N channels.
Optionally, the preset first quantization bit width may be preset and set in a configuration file of the encoder.
Optionally, the first feature value is a minimum feature value in the floating point type feature data of all channels in the N channels of the current image, and the second feature value is a maximum feature value in the floating point type feature data of all channels in the N channels of the current image.
In some embodiments, the encoder quantizes the floating-point feature data of all channels in the N channels according to the following formula (1):

y_cij = Int[ (x_cij − x_cmin1) / (x_cmax1 − x_cmin1 + δ) × (2^bitdepth1 − 1) ]    (1)

where x_cij is the floating-point feature value at row i, column j of the c-th channel; x_cmax1 and x_cmin1 are the second feature value and the first feature value, respectively, in the floating-point feature data of all of the N channels; bitdepth1 is the first quantization bit width; Int[·] denotes the integer function; y_cij is the quantized fixed-point feature value at row i, column j of the c-th channel; and δ is a small value, which may be taken as 0, used to map the floating-point feature data into a left-closed, right-open value interval.
It should be noted that the above formula (1) is only an example; the linear uniform quantization mode of the present application also covers variants of formula (1), for example a rearranged form denoted formula (2), or forms in which one or more coefficients are added to, multiplied into, or divided out of formula (1).
The floating-point feature data of all channels in the N channels of the current image are quantized into fixed-point feature data according to formula (1), and the fixed-point feature data are then encoded to form the code stream.
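To make this concrete, the following is a minimal numpy sketch of linear uniform quantization over all channels, under the reconstruction of formula (1) above (Int[·] taken as floor); the function and variable names are illustrative, not from the patent.

```python
import numpy as np

def linear_quantize_all(feat: np.ndarray, bitdepth: int, delta: float = 0.0):
    """Quantize an (N, H, W) float feature tensor as one whole (mode one).

    Returns the fixed-point tensor plus the (min, max) pair a decoder
    would need for dequantization.
    """
    x_min = float(feat.min())        # first feature value (minimum)
    x_max = float(feat.max())        # second feature value (maximum)
    levels = (1 << bitdepth) - 1     # 2**bitdepth - 1 quantization levels
    y = np.floor((feat - x_min) / (x_max - x_min + delta) * levels)
    return y.astype(np.int32), x_min, x_max

# Usage: quantize 8 channels of 16x16 floating-point features to 8 bits.
feat = np.random.randn(8, 16, 16).astype(np.float32)
q, x_min, x_max = linear_quantize_all(feat, bitdepth=8)
```

One parameter pair (x_min, x_max) plus the bit width suffices for the whole tensor, which is exactly why mode one signals a single set of quantization parameters for all N channels.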
Depending on the nonlinear function used, the nonlinear uniform quantization mode includes a nonlinear logarithmic uniform quantization mode and a nonlinear exponential uniform quantization mode.
In some embodiments, if the present embodiment uses a nonlinear logarithmic uniform quantization method to quantize the floating point number type feature data of all the N channels, the above S403 includes the following S403-B1 and S403-B2:
S403-B1, acquiring a preset second quantization bit width and a first base of a logarithmic function, as well as the first feature value and the second feature value in the floating-point feature data of all channels in the N channels;
S403-B2, quantizing the floating-point feature data of each of the N channels using a nonlinear logarithmic uniform quantization mode according to the first feature value, the second feature value, the second quantization bit width, and the first base of the logarithmic function.
Optionally, the preset second quantization bit width and the first base of the logarithmic function may be preset by a user and set in a configuration file of the encoder. The second quantization bit width may be determined according to the magnitude of the first feature value, and the first base of the logarithmic function may be determined according to the characteristics of the feature data.
In some embodiments, the encoder quantizes the floating-point feature data of each of the N channels according to the following formula (3):

y_cij = Int[ log_base1(x_cij − x_cmin1 + 1) / log_base1(x_cmax1 − x_cmin1 + 1) × (2^bitdepth2 − 1) ]    (3)

where bitdepth2 is the second quantization bit width and log_base1 is the first base of the logarithmic function used in logarithmic quantization.
It should be noted that the above formula (3) is only an example; the nonlinear logarithmic uniform quantization mode of the present application also covers variants of formula (3), for example a rearranged form denoted formula (4), or forms in which one or more coefficients are added to, multiplied into, or divided out of formula (3).
Optionally, the second quantization bit width is equal to the first quantization bit width.
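The log-domain variant could then be sketched as below; the "+1" shift (keeping the logarithm's argument positive) and the choice of floor for Int[·] are assumptions carried over from the reconstruction of formula (3), with log_base playing the role of log_base1.

```python
import numpy as np

def log_quantize_all(feat: np.ndarray, bitdepth: int, log_base: float):
    """Nonlinear logarithmic uniform quantization over all channels:
    uniform steps in the log domain give finer resolution near the
    minimum feature value."""
    x_min = float(feat.min())
    x_max = float(feat.max())
    levels = (1 << bitdepth) - 1
    # logarithm to an arbitrary base via the change-of-base identity
    num = np.log(feat - x_min + 1.0) / np.log(log_base)
    den = np.log(x_max - x_min + 1.0) / np.log(log_base)
    return np.floor(num / den * levels).astype(np.int32), x_min, x_max
```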
In some embodiments, if the present embodiment uses a nonlinear exponential uniform quantization method to quantize the floating point type feature data of all the N channels, the above S403 includes the following S403-C1 and S403-C2:
S403-C1, acquiring a preset third quantization bit width and a first base of an exponential function, as well as the first feature value and the second feature value in the floating-point feature data of all channels in the N channels;
S403-C2, quantizing the floating-point feature data of each of the N channels using a nonlinear exponential uniform quantization mode according to the first feature value, the second feature value, the third quantization bit width, and the first base of the exponential function.
Optionally, the above-mentioned preset third quantization bit width and the first base of the exponential function may be preset by a user and set in a configuration file of the encoder. The third quantization bit width may be determined according to a size of the first feature value, and the first base of the exponential function may be determined according to a characteristic of the feature data.
In some embodiments, the encoder quantizes the floating-point feature data of each of the N channels according to the following formula (5):

y_cij = Int[ (e_base1^x_cij − e_base1^x_cmin1) / (e_base1^x_cmax1 − e_base1^x_cmin1) × (2^bitdepth3 − 1) ]    (5)

where bitdepth3 is the third quantization bit width and e_base1 is the first base of the exponential function used in exponential quantization.
It should be noted that the above formula (5) is only an example; the nonlinear exponential uniform quantization mode of the present application also covers variants of formula (5), for example a rearranged form denoted formula (6), or forms in which one or more coefficients are added to, multiplied into, or divided out of formula (5).
Optionally, the third quantization bit width is equal to the first quantization bit width.
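Mirroring the log sketch, an exponential-domain version could look as follows, under the same assumed reading of formula (5) (illustrative names, floor for Int[·]); it spends resolution near the maximum feature value instead of the minimum.

```python
import numpy as np

def exp_quantize_all(feat: np.ndarray, bitdepth: int, e_base: float):
    """Nonlinear exponential uniform quantization over all channels:
    uniform steps in the e_base**x domain."""
    x_min = float(feat.min())
    x_max = float(feat.max())
    levels = (1 << bitdepth) - 1
    num = np.power(e_base, feat) - np.power(e_base, x_min)
    den = np.power(e_base, x_max) - np.power(e_base, x_min)
    return np.floor(num / den * levels).astype(np.int32), x_min, x_max
```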
In some embodiments, if the present embodiment adopts the table look-up quantization method to quantize the floating point type feature data of all the N channels, the above S403 includes the following S403-D1 to S403-D3:
S403-D1, sorting the feature data of the floating point number type of all channels in the N channels according to the value, and obtaining first feature data after sorting;
S403-D2, dividing the sequenced first characteristic data into a plurality of first quantization intervals, wherein each first quantization interval comprises the same quantity of characteristic data;
S403-D3, for each first quantization interval, quantizing the value of the feature data within the first quantization interval to an index value of the first quantization interval.
In this embodiment, the floating-point feature data of all channels in the N channels are taken as a whole, and all feature values are sorted by value in descending or ascending order to obtain the sorted floating-point feature data of all channels; for ease of description, the sorted feature data are referred to as the sorted first feature data. The sorted first feature data are divided into a plurality of first quantization intervals, each containing the same number of feature values. Each first quantization interval is represented by an index representable within the corresponding quantization bit width, so that each first quantization interval has one index. In this way, during quantization, the values of the feature data in each first quantization interval can be quantized to the index value of that interval.
In the table look-up quantization mode, since zero values account for a relatively large proportion of the feature data, the sorted non-zero feature data may be divided into intervals containing equal numbers of feature values; that is, all zero values in the sorted feature data are marked with index 0, and the reconstruction value corresponding to index 0 is set to 0.
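A hedged sketch of this equal-frequency binning follows, with index 0 reserved for zeros as the paragraph above describes; the bin-edge placement and the choice of bin means as reconstruction values are our assumptions, not specified by the patent.

```python
import numpy as np

def lut_quantize_all(feat: np.ndarray, bitdepth: int):
    """Table look-up (equal-frequency) quantization sketch.

    Index 0 holds all exact zeros; the remaining non-zero values are
    split into bins containing equal numbers of samples. Returns the
    index map and the reconstruction look-up table (here: bin means).
    """
    flat = feat.ravel()
    n_bins = (1 << bitdepth) - 1          # indices 1 .. 2**bitdepth - 1
    nonzero = np.sort(flat[flat != 0])
    # bin edges at equally spaced ranks of the sorted non-zero data
    edges = nonzero[np.linspace(0, len(nonzero) - 1, n_bins + 1).astype(int)]
    idx = np.searchsorted(edges[1:-1], flat, side="right") + 1
    idx[flat == 0] = 0                    # all zeros -> index 0
    table = np.zeros(n_bins + 1)          # table[0] stays 0.0
    for k in range(1, n_bins + 1):
        vals = flat[idx == k]
        table[k] = vals.mean() if vals.size else 0.0
    return idx.reshape(feat.shape), table
```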
The process of quantizing the floating point number type feature data of each of the N channels of the current image using one quantization method is described in detail below with reference to fig. 5.
Fig. 5 is a flowchart of an image encoding method 500 according to an embodiment of the present application, as shown in fig. 5, including:
S501, acquiring a current image to be encoded.
S502, inputting the current image into a neural network to obtain the feature data of the floating point number type of N channels of the current image.
S503, respectively quantizing the feature data of the floating point number type of each of the N channels by using a quantization mode.
S504, encoding the fixed-point feature data of the current image to obtain a code stream.
Optionally, the quantization mode includes linear uniform quantization, nonlinear function quantization, and table look-up quantization.
In some embodiments, if the present embodiment quantizes the floating point type feature data of each of the N channels in a linear uniform quantization manner, the above S503 includes the following S503-A1 and S503-A2:
S503-A1, acquiring a preset fourth quantization bit width, and a third characteristic value and a fourth characteristic value in floating point number type characteristic data of each channel in N channels;
S503-A2, according to the third characteristic value, the fourth characteristic value and the fourth quantization bit width, the characteristic data of the floating point number type of the channel is quantized by using a linear uniform quantization mode.
In the embodiment of the application, the characteristic data of the floating point type of each channel in the N channels is taken as a whole, and the characteristic data of the floating point type of each channel is quantized by using a quantization mode.
It should be noted that, the quantization process of the feature data of each of the N channels is the same, and for convenience of description, one of the N channels is taken as an example. The method comprises the steps of obtaining a maximum characteristic value and a minimum characteristic value from the characteristic data of the floating point number type of the channel, marking the maximum characteristic value as a third characteristic value, and marking the minimum characteristic value as a fourth characteristic value.
Optionally, the preset fourth quantization bit width may be preset by a user and set in a configuration file of the encoder. Wherein the fourth quantization bit width may be determined according to the size of the third feature value.
Optionally, the third feature value is a maximum feature value in the floating point type feature data of the channel, and the fourth feature value is a minimum feature value in the floating point type feature data of the channel.
In some embodiments, the encoder quantizes the floating-point feature data of the channel according to the following formula (7):

y_cij = Int[ (x_cij − x_cmin2) / (x_cmax2 − x_cmin2 + δ) × (2^bitdepth4 − 1) ]    (7)

where the current channel is the c-th channel; x_cij is the floating-point feature value at row i, column j of the channel; x_cmax2 and x_cmin2 are the third feature value (maximum) and the fourth feature value (minimum), respectively, in the floating-point feature data of the channel; bitdepth4 is the fourth quantization bit width; Int[·] denotes the integer function; y_cij is the quantized fixed-point feature value; and δ is a small value, which may be taken as 0, used to map the floating-point feature data into a left-closed, right-open value interval.
It should be noted that the above formula (7) is only an example; the linear uniform quantization mode of the present application also covers variants of formula (7), for example a rearranged form denoted formula (8), or forms in which one or more coefficients are added to, multiplied into, or divided out of formula (7).
The floating-point feature data of the channel are quantized into fixed-point feature data according to formula (7), and the fixed-point feature data are then encoded to form the code stream.
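For mode two, only the scope of the statistics changes; a sketch under the same assumptions as the whole-tensor version above (floor for Int[·], illustrative names):

```python
import numpy as np

def linear_quantize_per_channel(feat: np.ndarray, bitdepth: int,
                                delta: float = 0.0):
    """Mode two: each channel of the (N, H, W) tensor is quantized
    independently, producing one (min, max) parameter pair per channel
    to be signalled in the code stream."""
    levels = (1 << bitdepth) - 1
    q = np.empty(feat.shape, dtype=np.int32)
    params = []
    for c in range(feat.shape[0]):
        x_min = float(feat[c].min())   # fourth feature value of channel c
        x_max = float(feat[c].max())   # third feature value of channel c
        q[c] = np.floor((feat[c] - x_min) / (x_max - x_min + delta) * levels)
        params.append((x_min, x_max))  # signalled per channel
    return q, params
```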
According to the different nonlinear functions, the nonlinear uniform quantization mode comprises a nonlinear logarithmic uniform quantization mode and a nonlinear exponential uniform quantization mode.
In some embodiments, if the present embodiment uses a nonlinear log-uniform quantization method to quantize the floating point type feature data of the channel, the S503 includes the following S503-B1 and S503-B2:
S503-B1, acquiring, for each of the N channels, a preset fifth quantization bit width and a second base of a logarithmic function, as well as the third feature value and the fourth feature value in the floating-point feature data of the channel;
S503-B2, quantizing the floating-point feature data of the channel using a nonlinear logarithmic uniform quantization mode according to the third feature value, the fourth feature value, the fifth quantization bit width, and the second base of the logarithmic function.
Optionally, the preset fifth quantization bit width and the second base of the logarithmic function may be preset by the user and set in the configuration file of the encoder. The fifth quantization bit width may be determined according to the magnitude of the third feature value, and the second base of the logarithmic function is determined according to the characteristics of the feature data in the channel.
In some embodiments, the encoder quantizes the floating-point feature data of the channel according to the following formula (9):

y_cij = Int[ log_base2(x_cij − x_cmin2 + 1) / log_base2(x_cmax2 − x_cmin2 + 1) × (2^bitdepth5 − 1) ]    (9)

where bitdepth5 is the fifth quantization bit width and log_base2 is the second base of the logarithmic function used in logarithmic quantization, e.g., 10.
It should be noted that the above formula (9) is only an example; the nonlinear logarithmic uniform quantization mode of the present application also covers variants of formula (9), for example a rearranged form denoted formula (10), or forms in which one or more coefficients are added to, multiplied into, or divided out of formula (9).
Optionally, the fifth quantization bit width is equal to the fourth quantization bit width.
In some embodiments, if the present embodiment uses a nonlinear exponential uniform quantization method to quantize the floating point type feature data of the channel, the S503 includes the following S503-C1 and S503-C2:
S503-C1, acquiring, for each of the N channels, a preset sixth quantization bit width and a second base of an exponential function, as well as the third feature value and the fourth feature value in the floating-point feature data of the channel;
S503-C2, according to the third characteristic value and the fourth characteristic value and the sixth quantization bit width and the second base of the exponential function, the characteristic data of the floating point number type of the channel is quantized by using a nonlinear exponential uniform quantization mode.
Optionally, the above-mentioned preset sixth quantization bit width and the second base of the exponential function may be preset by a user and set in a configuration file of the encoder. The sixth quantization bit width may be determined according to the magnitude of the third feature value, and the second base of the exponential function may be determined according to the characteristic of the feature data under the channel.
In some embodiments, the encoder quantizes the floating-point feature data of the channel according to the following formula (11):

y_cij = Int[ (e_base2^x_cij − e_base2^x_cmin2) / (e_base2^x_cmax2 − e_base2^x_cmin2) × (2^bitdepth6 − 1) ]    (11)

where bitdepth6 is the sixth quantization bit width and e_base2 is the second base of the exponential function used in exponential quantization.
It should be noted that the above formula (11) is only an example; the nonlinear exponential uniform quantization mode of the present application also covers variants of formula (11), for example a rearranged form denoted formula (12), or forms in which one or more coefficients are added to, multiplied into, or divided out of formula (11).
Optionally, the sixth quantization bit width is equal to the fourth quantization bit width.
In some embodiments, if the present embodiment adopts the table look-up quantization method to quantize the floating point type feature data of the channel, the above S503 includes the following S503-D1 to S503-D3:
S503-D1, for each of the N channels, sorting the floating-point feature data of the channel by value to obtain sorted second feature data of the channel;
S503-D2, dividing the sorted second feature data of the channel into a plurality of second quantization intervals, each containing the same number of feature values;
S503-D3, for each second quantization interval, quantizing the values of the feature data within the second quantization interval to the index value of the second quantization interval.
In this embodiment, the floating-point feature data of the channel are sorted by value in descending or ascending order; for ease of description, the sorted feature data of the channel are referred to as the sorted second feature data. The sorted second feature data of the channel are divided into a plurality of second quantization intervals, each containing the same number of feature values. Each second quantization interval is represented by an index representable within the corresponding quantization bit width, so that each second quantization interval has one index. In this way, during quantization, the values of the feature data in each second quantization interval can be quantized to the index value of that interval.
The process of quantizing the floating-point feature data of each of M groups of channels (obtained by grouping the N channels), each group using its own quantization mode, is described in detail below with reference to fig. 6.
Fig. 6 is a flowchart of an image encoding method 600 according to an embodiment of the present application, as shown in fig. 6, including:
S601, acquiring a current image to be encoded.
S602, inputting the current image into a neural network to obtain the feature data of the floating point number type of N channels of the current image.
S603, respectively quantizing the feature data of the floating point number type of each group of channels by using a quantization mode.
S604, encoding the characteristic data of the fixed point number type of the current image to obtain a code stream.
Optionally, the quantization mode includes linear uniform quantization, nonlinear function quantization, and table look-up quantization.
In some embodiments, if the present embodiment quantizes the floating point type feature data of the set of channels in a linear uniform quantization manner, the above S603 includes the following S603-A1 and S603-A2:
S603-A1, acquiring a preset seventh quantization bit width, and a fifth characteristic value and a sixth characteristic value in the floating point number type characteristic data of each group of channels;
S603-A2, quantizing the floating point number type characteristic data of each channel in the group of channels by using a linear uniform quantization mode according to the fifth characteristic value and the sixth characteristic value and the seventh quantization bit width.
In the embodiment of the application, N channels are divided into a plurality of groups of channels, the characteristic data of the floating point number type of each group of channels is taken as a whole, and the characteristic data of the floating point number type of each group of channels is quantized by using a quantization mode.
It should be noted that the quantization process of the feature data of each set of channels is the same, and for convenience of description, a set of channels is taken as an example. The method comprises the steps of obtaining a maximum characteristic value and a minimum characteristic value from the characteristic data of the floating point number type of the group of channels, marking the maximum characteristic value as a fifth characteristic value, and marking the minimum characteristic value as a sixth characteristic value.
Optionally, the above-mentioned preset seventh quantization bit width may be preset by a user and set in a configuration file of the encoder. Wherein the seventh quantization bit width may be determined according to the size of the fifth feature value.
Optionally, the fifth feature value is a maximum feature value in the floating point type feature data of the set of channels, and the sixth feature value is a minimum feature value in the floating point type feature data of the set of channels.
In some embodiments, the encoder quantizes the floating-point feature data of the group of channels according to the following formula (13):

y_cij = Int[ (x_cij − x_cmin3) / (x_cmax3 − x_cmin3 + δ) × (2^bitdepth7 − 1) ]    (13)

where the c-th channel is one channel of the group; x_cij is the floating-point feature value at row i, column j of the c-th channel; x_cmax3 and x_cmin3 are the fifth feature value (maximum) and the sixth feature value (minimum), respectively, in the floating-point feature data of the group of channels; bitdepth7 is the seventh quantization bit width; Int[·] denotes the integer function; y_cij is the quantized fixed-point feature value; and δ is a small value used to map the floating-point feature data into a left-closed, right-open value interval.
It should be noted that the above formula (13) is only an example; the linear uniform quantization mode of the present application also covers variants of formula (13), for example a rearranged form denoted formula (14), or forms in which one or more coefficients are added to, multiplied into, or divided out of formula (13).
The floating-point feature data of the group of channels are quantized into fixed-point feature data according to formula (13), and the fixed-point feature data are then encoded to form the code stream.
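A sketch of mode three, again under the assumptions of the earlier linear sketches; the list-of-channel-index-lists representation of the groups is our own, since the patent does not fix how the grouping is described.

```python
import numpy as np

def linear_quantize_grouped(feat: np.ndarray, groups, bitdepth: int,
                            delta: float = 0.0):
    """Mode three: the N channels are partitioned into groups and each
    group is quantized as one whole, so one parameter set per group is
    signalled. `groups` is a list of channel-index lists,
    e.g. [[0, 1], [2, 3, 4]] (assumed representation)."""
    levels = (1 << bitdepth) - 1
    q = np.empty(feat.shape, dtype=np.int32)
    params = []
    for g in groups:
        x_max = float(feat[g].max())   # fifth feature value of the group
        x_min = float(feat[g].min())   # sixth feature value of the group
        q[g] = np.floor((feat[g] - x_min) / (x_max - x_min + delta) * levels)
        params.append((x_min, x_max))  # signalled per group
    return q, params
```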
Depending on the nonlinear function used, the nonlinear uniform quantization mode includes a nonlinear logarithmic uniform quantization mode and a nonlinear exponential uniform quantization mode.
In some embodiments, if the present embodiment quantizes the floating point type feature data of the group of channels in a nonlinear log-uniform quantization manner, the above S603 includes the following S603-B1 and S603-B2:
S603-B1, acquiring, for each group of channels, a preset eighth quantization bit width and a third base of a logarithmic function, as well as the fifth feature value and the sixth feature value in the floating-point feature data of the group of channels;
S603-B2, quantizing the floating-point feature data of each channel in the group of channels using a nonlinear logarithmic uniform quantization mode according to the fifth feature value, the sixth feature value, the eighth quantization bit width, and the third base of the logarithmic function.
Optionally, the preset eighth quantization bit width and the third base of the logarithmic function may be preset by the user and set in the configuration file of the encoder. The eighth quantization bit width may be determined based on the magnitude of the fifth feature value, and the third base of the logarithmic function is determined based on the characteristics of the feature data in the group of channels.
In some embodiments, the encoder quantizes the floating-point feature data of the group of channels according to the following formula (15):

y_cij = Int[ log_base3(x_cij − x_cmin3 + 1) / log_base3(x_cmax3 − x_cmin3 + 1) × (2^bitdepth8 − 1) ]    (15)

where bitdepth8 is the eighth quantization bit width and log_base3 is the third base of the logarithmic function used in logarithmic quantization, e.g., 10.
It should be noted that the above formula (15) is only an example; the nonlinear logarithmic uniform quantization mode of the present application also covers variants of formula (15), for example a rearranged form denoted formula (16), or forms in which one or more coefficients are added to, multiplied into, or divided out of formula (15).
Optionally, the eighth quantization bit width is equal to the seventh quantization bit width.
In some embodiments, if the present embodiment quantizes the floating-point feature data of the group of channels in a nonlinear exponential uniform quantization manner, the above S603 includes the following S603-C1 and S603-C2:
S603-C1, acquiring, for each group of channels, a preset ninth quantization bit width and a third base of an exponential function, as well as the fifth feature value and the sixth feature value in the floating-point feature data of the group of channels;
S603-C2, quantizing the floating-point feature data of each channel in the group of channels using a nonlinear exponential uniform quantization mode according to the fifth feature value, the sixth feature value, the ninth quantization bit width, and the third base of the exponential function.
Optionally, the preset ninth quantization bit width and the third base of the exponential function may be preset by the user and set in a configuration file of the encoder. The ninth quantization bit width may be determined based on the magnitude of the fifth feature value, and the third base of the exponential function is determined based on the characteristics of the feature data of the group of channels.
In some embodiments, the encoder quantizes the floating-point feature data of the group of channels according to the following formula (17):

y_cij = Int[ (e_base3^x_cij − e_base3^x_cmin3) / (e_base3^x_cmax3 − e_base3^x_cmin3) × (2^bitdepth9 − 1) ]    (17)

where bitdepth9 is the ninth quantization bit width and e_base3 is the third base of the exponential function used in exponential quantization.
It should be noted that the above formula (17) is only an example; the nonlinear exponential uniform quantization mode of the present application also covers variants of formula (17), for example a rearranged form denoted formula (18), or forms in which one or more coefficients are added to, multiplied into, or divided out of formula (17).
Optionally, the ninth quantization bit width is equal to the seventh quantization bit width.
In some embodiments, if the present embodiment uses a look-up table quantization method to quantize the floating point type feature data of the set of channels, the S603 includes the following S603-D1 to S603-D3:
S603-D1, for each group of channels, sorting the floating-point feature data of the group of channels by value to obtain sorted third feature data of the group;
S603-D2, dividing the sorted third feature data of the group into a plurality of third quantization intervals, each containing the same number of feature values;
S603-D3, for each third quantization interval, quantizing the values of the feature data within the third quantization interval to the index value of the third quantization interval.
In this embodiment, the floating-point feature data of the group of channels are sorted by value in descending or ascending order; for ease of description, the sorted feature data of the group are referred to as the sorted third feature data. The sorted third feature data of the group are divided into a plurality of third quantization intervals, each containing the same number of feature values. Each third quantization interval is represented by an index representable within the corresponding quantization bit width, so that each third quantization interval has one index. In this way, during quantization, the values of the feature data in each third quantization interval can be quantized to the index value of that interval.
The quantization process at the encoding end is described above, and the content indicated by the first information is described below.
The encoding end quantizes the floating-point feature data of at least one channel into fixed-point form according to the above steps, encodes the fixed-point feature data into the code stream, and sends the code stream to the decoding end. Meanwhile, the encoding end carries first information in the code stream, the first information indicating that the fixed-point feature data of the at least one channel are to be dequantized.
In some embodiments, the code stream further comprises second information indicating an inverse quantization manner used in inverse quantizing the fixed point number type of feature data of the at least one channel.
The inverse quantization mode used to dequantize the fixed-point feature data of the at least one channel includes any one of the following: a linear uniform inverse quantization mode, a nonlinear exponential uniform inverse quantization mode, a nonlinear logarithmic uniform inverse quantization mode, and a table look-up inverse quantization mode. It should be noted that the inverse quantization modes of the embodiments of the present application include, but are not limited to, the above modes; other inverse quantization modes that dequantize fixed-point feature data into floating-point feature data may also be used.
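For orientation on the decoder side, here is a minimal sketch of the linear uniform inverse quantization mode, the inverse of the linear quantization sketches above; the scale expression mirrors formula (19) as reconstructed below, so treat its exact form as an assumption.

```python
import numpy as np

def linear_dequantize(q: np.ndarray, x_min: float, x_max: float,
                      bitdepth: int) -> np.ndarray:
    """Linear uniform inverse quantization: map integer levels back to
    floating point. Recovery is exact only up to one quantization step."""
    scale = (x_max - x_min) / ((1 << bitdepth) - 1)   # cf. formula (19)
    return q.astype(np.float32) * scale + x_min
```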
In some embodiments, the first information includes at least one parameter required to dequantize fixed point number type feature data for the at least one channel.
The at least one parameter included in the first information in the present application includes the following cases:
In case 1, the first information indicates that the fixed-point feature data of all channels in the N channels are dequantized. Depending on the inverse quantization mode, the first information includes any one of the following example one, example two, example three, or example four:
In example one, if the inverse quantization mode used to dequantize the fixed-point feature data of all of the N channels is a linear uniform inverse quantization mode, the first information includes a first target feature value, a first target scaling value, and a first target quantization bit width.
The first target feature value is one feature value in the feature data of all channels in the N channels, for example, the first target feature value is the minimum feature value of all channels in the N channels.
The first target scaling value is a scaling value corresponding to the characteristic data of all channels in the N channels during quantization, and the first target quantization bit width is a quantization bit width corresponding to the characteristic data of all channels in the N channels during quantization.
The following describes a process for determining the first target scaling value in combination with the encoding mode of the encoding end.
In one example, if the quantization mode of the encoding end for all channels is a linear uniform quantization mode, the encoding end may determine the first target scaling value according to the first feature value and the second feature value in the feature data of all channels in the N channels and the first target quantization bit width.
Optionally, the first target scaling value s_c1 may be determined according to the following formula (19):

s_c1 = (x_cmax1 − x_cmin1) / (2^bitdepth1 − 1)    (19)

where x_cmin1 and x_cmax1 are the first feature value and the second feature value, respectively, in the feature data of all of the N channels, and the first target quantization bit width may be the first quantization bit width bitdepth1 in formula (1) above.
It should be noted that the above formula (19) is only an example, and the present application determines the first target scaling value s c1 The formula (19) may be modified, or one or more coefficients may be added, multiplied, or divided in the formula (19).
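As an illustration only, the following Python sketch shows how an encoding end might compute the first target feature value and the first target scaling value and then quantize feature data in the linear uniform mode. It assumes equation (19) takes the form reconstructed above; the function and variable names are hypothetical, not part of any standard.

```python
import numpy as np

def linear_uniform_quantize(x, bitdepth):
    # x: floating point feature data of all N channels, e.g. shape (N, H, W)
    x_min = float(x.min())                              # first target feature value
    x_max = float(x.max())
    scale = ((1 << bitdepth) - 1) / (x_max - x_min)     # s_c1 per equation (19)
    q = np.round((x - x_min) * scale).astype(np.int64)  # fixed point feature data
    return q, x_min, scale
```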
In another example, if the quantization mode of the encoding end for all channels is a nonlinear logarithmic uniform quantization mode, the encoding end may determine the first target scaling value according to the first feature value and the second feature value in the feature data of all channels in the N channels, the first target quantization bit width, and the first base of the logarithmic function.
Alternatively, the first target scaling value s_c1 may be determined according to the following equation (20):

s_c1 = (2^{bitdepth2} - 1) / log_{log_base1}(x_{cmax1} - x_{cmin1} + 1)    (20)

where log_base1 is the first base of the logarithmic function, and the first target quantization bit width may be the second quantization bit width bitdepth2 in equation (3) above.

It should be noted that equation (20) is only an example; determining the first target scaling value s_c1 in the present application also includes deforming equation (20), or adding, multiplying, or dividing one or more coefficients in equation (20), and the like.
In another example, if the quantization mode of the encoding end for all channels is a nonlinear exponential uniform quantization mode, the encoding end may determine the first target scaling value according to the first feature value and the second feature value in the feature data of all channels in the N channels, the first target quantization bit width, and the first base of the exponential function.
Alternatively, the first target scaling value s_c1 may be determined according to the following equation (21):

s_c1 = (2^{bitdepth3} - 1) / (e_base1^{x_{cmax1} - x_{cmin1}} - 1)    (21)

where e_base1 is the first base of the exponential function, and the first target quantization bit width may be the third quantization bit width bitdepth3 in equation (5) above.

It should be noted that equation (21) is only an example; determining the first target scaling value s_c1 in the present application also includes deforming equation (21), or adding, multiplying, or dividing one or more coefficients in equation (21), and the like.
In this way, the decoding end can parse the first information from the code stream, and dequantize the feature data of the fixed point number type of all channels in the N channels by using a linear uniform dequantization mode according to the first target feature value, the first target scaling value and the first target quantization bit width included in the first information.
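For concreteness, a minimal sketch of the corresponding decoding-end operation is given below; min_num and scale_num mirror the syntax elements of the same names defined later, and the sketch assumes the linear uniform quantization form used above.

```python
import numpy as np

def linear_uniform_dequantize(q, min_num, scale_num):
    # Invert the linear uniform quantization sketch above:
    # reconstructed value = fixed point value / scaling value + minimum feature value
    return q.astype(np.float32) / scale_num + min_num
```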
In Example 2, if the inverse quantization mode for dequantizing the fixed point number type feature data of all channels in the N channels is a nonlinear logarithmic uniform inverse quantization mode, the first information includes a first target feature value, a first target scaling value, and a first target quantization bit width; or the first information includes a first target feature value, a first target scaling value, a first target quantization bit width, and a first logarithmic base; or the first information includes a first target feature value, a first target scaling value, a first target quantization bit width, and indication information of the first logarithmic base.
Specifically, if the first information includes a first target feature value, a first target scaling value and a first target quantization bit width, the decoding end dequantizes the feature data of the fixed point number type of all channels in the N channels by using a nonlinear logarithmic uniform dequantization mode according to the first target feature value, the first target scaling value, the first target quantization bit width and a default logarithmic base number.
If the first information includes a first target feature value, a first target scaling value, a first target quantization bit width and a first logarithmic base, the decoding end directly uses the first target feature value, the first target scaling value, the first target quantization bit width and the first logarithmic base carried by the first information, and uses a nonlinear logarithmic uniform inverse quantization mode to inversely quantize the feature data of the fixed point number type of all channels in the N channels.
If the first information includes a first target feature value, a first target scaling value, a first target quantization bit width, and indication information of a first logarithmic base, the indication information of the first logarithmic base is used for indicating that the first logarithmic base is determined from a plurality of preset logarithmic bases. In this way, the decoding end analyzes the first information from the code stream, determines the first logarithmic base from a plurality of preset logarithmic bases according to the indication information of the first logarithmic base, and further dequantizes the characteristic data of the fixed point number type of all channels in the N channels by using a nonlinear logarithmic uniform dequantization mode according to the first target characteristic value, the first target scaling value, the first target quantization bit width and the first logarithmic base.
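A minimal sketch of the nonlinear logarithmic uniform dequantization, assuming the forward mapping q = round(s * log_base(x - x_min + 1)) implied by the reconstructed equation (20); the assumed mapping and the names are illustrative only.

```python
import numpy as np

def log_uniform_dequantize(q, min_num, scale_num, log_base):
    # Invert q = round(scale_num * log_base(x - min_num + 1))
    return np.power(log_base, q.astype(np.float32) / scale_num) - 1.0 + min_num
```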
In Example 3, if the inverse quantization mode for dequantizing the fixed point number type feature data of all channels in the N channels is a nonlinear exponential uniform inverse quantization mode, the first information includes a first target feature value, a first target scaling value, and a first target quantization bit width; or the first information includes a first target feature value, a first target scaling value, a first target quantization bit width, and a first exponent base; or the first information includes a first target feature value, a first target scaling value, a first target quantization bit width, and indication information of the first exponent base.
Specifically, if the first information includes a first target feature value, a first target scaling value and a first target quantization bit width, the decoding end dequantizes the feature data of the fixed point number type of all channels in the N channels by using a nonlinear exponential uniform dequantization mode according to the first target feature value, the first target scaling value, the first target quantization bit width and a default exponent base.
If the first information includes a first target feature value, a first target scaling value, a first target quantization bit width and a first exponent base, the decoding end directly uses the first target feature value, the first target scaling value, the first target quantization bit width and the first exponent base carried by the first information, and uses a nonlinear exponent uniform dequantization mode to dequantize the feature data of the fixed point number type of all channels in the N channels.
If the first information includes a first target feature value, a first target scaling value, a first target quantization bit width, and indication information of a first exponent base, the indication information of the first exponent base is used for indicating that the first exponent base is determined from a plurality of preset exponent bases. In this way, the decoding end analyzes the first information from the code stream, determines the first index base from a plurality of preset index bases according to the indication information of the first index base, and further dequantizes the characteristic data of the fixed point number type of all channels in the N channels by using a nonlinear index uniform dequantization mode according to the first target characteristic value, the first target scaling value, the first target quantization bit width and the first index base.
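Similarly, a sketch of the nonlinear exponential uniform dequantization, assuming the forward mapping q = round(s * (e_base^(x - x_min) - 1)) implied by the reconstructed equation (21):

```python
import numpy as np

def exp_uniform_dequantize(q, min_num, scale_num, e_base):
    # Invert q = round(scale_num * (e_base ** (x - min_num) - 1))
    return np.log1p(q.astype(np.float32) / scale_num) / np.log(e_base) + min_num
```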
In Example 4, if the inverse quantization mode for dequantizing the fixed point number type feature data of all channels in the N channels is a table lookup inverse quantization mode, the first information includes a first correspondence between index values of quantization intervals and inverse quantization values of the quantization intervals, where the first correspondence is determined based on the values before quantization and the values after quantization of the feature data of all channels in the N channels. The index of a quantization interval may be understood as the fixed point feature value assigned to that quantization interval, and the inverse quantization value of a quantization interval may be understood as a weighted average of the feature values in the quantization interval, or the feature value corresponding to the center position of the quantization interval. The weighted average of the feature values in a quantization interval may also be referred to as the feature value corresponding to the probability distribution center of the quantization interval. The inverse quantization values may also be referred to as reconstruction values.
For the table lookup inverse quantization mode, since the value 0 accounts for a relatively large proportion of the feature data, the sorted feature data other than the value 0 may be divided into intervals each containing an equal quantity of feature values; that is, all the 0 values of the sorted feature data are marked as index 0, and the corresponding reconstruction value is set to 0.
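Under the convention just described (index 0 reserved for zero values, with reconstruction value 0), table lookup dequantization reduces to indexing the reconstruction codebook; hist_codebook mirrors the syntax element of the same name defined later, and the sketch is illustrative only.

```python
import numpy as np

def lookup_dequantize(q, hist_codebook):
    # q holds quantization interval indices; hist_codebook[0] is assumed to be 0.0
    codebook = np.asarray(hist_codebook, dtype=np.float32)
    return codebook[q]
```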
In a specific embodiment, the quantization mode of the encoding end corresponds one-to-one to the inverse quantization mode of the decoding end. For example, when the encoding end quantizes the floating point number type feature data of all channels in the N channels using a linear quantization mode, the decoding end dequantizes the fixed point number type feature data of all channels in the N channels using a linear inverse quantization mode. If the encoding end quantizes the floating point number type feature data of all channels in the N channels using a nonlinear logarithmic uniform quantization mode, the decoding end dequantizes the fixed point number type feature data of all channels in the N channels using a nonlinear logarithmic uniform inverse quantization mode. If the encoding end quantizes the floating point number type feature data of all channels in the N channels using a nonlinear exponential uniform quantization mode, the decoding end dequantizes the fixed point number type feature data of all channels in the N channels using a nonlinear exponential uniform inverse quantization mode. If the encoding end quantizes the floating point number type feature data of all channels in the N channels using a table lookup quantization mode, the decoding end dequantizes the fixed point number type feature data of all channels in the N channels using a table lookup inverse quantization mode.
The embodiments of the present application may adopt the above linear uniform inverse quantization mode, nonlinear logarithmic inverse quantization mode, nonlinear exponential inverse quantization mode, or table lookup inverse quantization mode to dequantize the fixed point number type feature data of all channels in the N channels of the current image.
In some embodiments, the inverse quantization information of the present application related to dequantizing feature data may be recorded in supplemental enhancement information, for example, in the Supplemental Enhancement Information (SEI) of the existing video coding standards H.265/HEVC and H.266/VVC, or in the Extension Data of the AVS standards.
In one example, a new SEI category, i.e., a feature data quantization SEI message, is added to sei_payload() of sei_message() in sei_rbsp() of the existing video coding standards AVC/HEVC/VVC/EVC, and its payloadType may be defined as any number not used by other SEI messages, e.g., 183. In this case, the sei_payload() syntax structure is as shown in Table 1.
TABLE 1
where feature_data_quantization represents the inverse quantization of the feature data.
When the inverse quantization modes differ, the syntax structures also differ; the syntax structures corresponding to the different inverse quantization modes are described below.
In some embodiments, if all channels are dequantized in a linear uniform dequantization manner, the syntax structure is as shown in table 2:
TABLE 2
The syntax elements may be coded using different efficient entropy coding modes and are defined as follows:

flag_channel: a flag indicating the processing object for the decoding end; a value of 0 indicates that all channels are dequantized uniformly, a value of 1 indicates that each channel is dequantized separately, and a value of 2 indicates that each group of channels is dequantized separately; the flag_channel value here is 0;

flag_iquantization: a flag indicating the inverse quantization method for the decoding end; a value of 0 indicates linear inverse quantization, a value of 1 indicates nonlinear logarithmic inverse quantization, a value of 2 indicates nonlinear exponential inverse quantization, and a value of 3 indicates table lookup inverse quantization; the flag_iquantization value here is 0;

channel_num: the number of channels of the feature data;

scale_num: the scaling value of the feature data over all channels, which can be understood as the first target scaling value described above;

min_num: the minimum value of the feature data over all channels, which can be understood as the first target feature value described above. A parsing sketch of these syntax elements is given below.
In some embodiments, if all channels are dequantized by a nonlinear logarithmic function, the syntax structure is as shown in table 3:
TABLE 3
The syntax elements may be coded using different efficient entropy coding modes and are defined as follows:

flag_channel: a flag indicating the processing object for the decoding end; a value of 0 indicates that all channels are dequantized uniformly, a value of 1 indicates that each channel is dequantized separately, and a value of 2 indicates that each group of channels is dequantized separately; the flag_channel value here is 0;

flag_iquantization: a flag indicating the inverse quantization method for the decoding end; a value of 0 indicates linear inverse quantization, a value of 1 indicates nonlinear logarithmic inverse quantization, a value of 2 indicates nonlinear exponential inverse quantization, and a value of 3 indicates table lookup inverse quantization; the flag_iquantization value here is 1;

channel_num: the number of channels of the feature data;

scale_num: the scaling value of the feature data over all channels, which can be understood as the first target scaling value described above;

min_num: the minimum value of the feature data over all channels, which can be understood as the first target feature value described above;

log_base: the base of the logarithmic function used in logarithmic inverse quantization, which can be understood as the first logarithmic base described above.
In some embodiments, if all channels are dequantized by a nonlinear exponential function inverse quantization mode, the syntax structure is as shown in Table 4:
TABLE 4
The syntax elements may be coded using different efficient entropy coding modes and are defined as follows:

flag_channel: a flag indicating the processing object for the decoding end; a value of 0 indicates that all channels are dequantized uniformly, a value of 1 indicates that each channel is dequantized separately, and a value of 2 indicates that each group of channels is dequantized separately; the flag_channel value here is 0;

flag_iquantization: a flag indicating the inverse quantization method for the decoding end; a value of 0 indicates linear inverse quantization, a value of 1 indicates nonlinear logarithmic inverse quantization, a value of 2 indicates nonlinear exponential inverse quantization, and a value of 3 indicates table lookup inverse quantization; the flag_iquantization value here is 2;

channel_num: the number of channels of the feature data;

scale_num: the scaling value of the feature data over all channels, which can be understood as the first target scaling value described above;

min_num: the minimum value of the feature data over all channels, which can be understood as the first target feature value described above;

e_base: the base of the exponential function used in exponential inverse quantization, which can be understood as the first exponent base described above.
In some embodiments, all channels are dequantized by the table lookup inverse quantization mode; optionally, the table lookup inverse quantization includes histogram equalization inverse quantization. The syntax structure of table lookup inverse quantization is shown in Table 5:
TABLE 5
The syntax elements may be coded using different efficient entropy coding modes and are defined as follows:

flag_channel: a flag indicating the processing object for the decoding end; a value of 0 indicates that all channels are dequantized uniformly, a value of 1 indicates that each channel is dequantized separately, and a value of 2 indicates that each group of channels is dequantized separately; the flag_channel value here is 0;

flag_iquantization: a flag indicating the inverse quantization method for the decoding end; a value of 0 indicates linear inverse quantization, a value of 1 indicates nonlinear logarithmic inverse quantization, a value of 2 indicates nonlinear exponential inverse quantization, and a value of 3 indicates table lookup inverse quantization; the flag_iquantization value here is 3;

channel_num: the number of channels of the feature data;

hist_codebook_num: the number of inverse quantization values contained in the reconstruction codebook formed by the first correspondence between index values of quantization intervals and inverse quantization values of the quantization intervals;

hist_codebook[i]: the inverse quantization value corresponding to the i-th quantization interval index in the reconstruction codebook under table lookup inverse quantization.
In Case 2, the first information indicates that the fixed point number type feature data of each of the N channels is dequantized separately. For each channel, depending on the inverse quantization mode, the first information includes any one of the following Example 1, Example 2, Example 3, or Example 4:
In Example 1, if the inverse quantization mode for dequantizing the fixed point number type feature data of the channel is a linear uniform inverse quantization mode, the first information includes a second target feature value, a second target scaling value, and a second target quantization bit width.
The second target feature value is one feature value in the feature data of the channel, for example, the second target feature value is the minimum feature value of the feature data of the channel.
The second target scaling value is a scaling value corresponding to the characteristic data of the channel during quantization, and the second target quantization bit width is a quantization bit width corresponding to the characteristic data of the channel during quantization.
The following describes the process of determining the second target scaling value in combination with the quantization mode used at the encoding end.
In one example, if the quantization mode of the encoding end for the channel is a linear uniform quantization mode, the encoding end may determine the second target scaling value according to the third feature value and the fourth feature value in the feature data of the channel and the second target quantization bit width.
Alternatively, the second target scaling value s_c2 may be determined according to the following equation (22):

s_c2 = (2^{bitdepth4} - 1) / (x_{cmax2} - x_{cmin2})    (22)

where x_{cmax2} and x_{cmin2} are the third feature value and the fourth feature value in the feature data of the channel, respectively, and the second target quantization bit width may be the fourth quantization bit width bitdepth4 in equation (7) above.

It should be noted that equation (22) is only an example; determining the second target scaling value s_c2 in the present application also includes deforming equation (22), or adding, multiplying, or dividing one or more coefficients in equation (22), and the like.
In another example, if the quantization mode of the encoding end for the channel is a nonlinear logarithmic uniform quantization mode, the encoding end may determine the second target scaling value according to the third feature value and the fourth feature value in the feature data of the channel, the second target quantization bit width, and the second base of the logarithmic function.
Alternatively, the second target scaling value s_c2 may be determined according to the following equation (23):

s_c2 = (2^{bitdepth5} - 1) / log_{log_base2}(x_{cmax2} - x_{cmin2} + 1)    (23)

where log_base2 is the second base of the logarithmic function, and the second target quantization bit width may be the fifth quantization bit width bitdepth5 in equation (9) above.

It should be noted that equation (23) is only an example; determining the second target scaling value s_c2 in the present application also includes deforming equation (23), or adding, multiplying, or dividing one or more coefficients in equation (23), and the like.
In another example, if the quantization mode of the encoding end for the channel is a nonlinear exponential uniform quantization mode, the encoding end may determine the second target scaling value according to the third feature value and the fourth feature value in the feature data of the channel, the second target quantization bit width, and the second base of the exponential function.
Alternatively, the second target scaling value s_c2 may be determined according to the following equation (24):

s_c2 = (2^{bitdepth6} - 1) / (e_base2^{x_{cmax2} - x_{cmin2}} - 1)    (24)

where e_base2 is the second base of the exponential function, and the second target quantization bit width may be the sixth quantization bit width bitdepth6 in equation (11) above.

It should be noted that equation (24) is only an example; determining the second target scaling value s_c2 in the present application also includes deforming equation (24), or adding, multiplying, or dividing one or more coefficients in equation (24), and the like.
In this way, the decoding end can parse the first information from the code stream, and dequantize the feature data of the fixed point number type of the channel by using a linear uniform dequantization mode according to the second target feature value, the second target scaling value and the second target quantization bit width included in the first information.
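Since Case 2 carries one set of parameters per channel, per-channel linear uniform dequantization may be sketched as follows; the array layout and names are assumptions.

```python
import numpy as np

def dequantize_per_channel(q, min_num, scale_num):
    # q: (channel_num, H, W) fixed point feature data;
    # min_num[i], scale_num[i]: second target feature value and second target
    # scaling value of the i-th channel (syntax elements min_num[i], scale_num[i])
    out = np.empty(q.shape, dtype=np.float32)
    for i in range(q.shape[0]):
        out[i] = q[i].astype(np.float32) / scale_num[i] + min_num[i]
    return out
```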
In Example 2, if the inverse quantization mode for dequantizing the fixed point number type feature data of the channel is a nonlinear logarithmic uniform inverse quantization mode, the first information includes a second target feature value, a second target scaling value, and a second target quantization bit width; or the first information includes a second target feature value, a second target scaling value, a second target quantization bit width, and a second logarithmic base; or the first information includes a second target feature value, a second target scaling value, a second target quantization bit width, and indication information of the second logarithmic base.
Specifically, if the first information includes a second target feature value, a second target scaling value and a second target quantization bit width, the decoding end uses a nonlinear logarithmic uniform inverse quantization mode to inversely quantize the feature data of the fixed point number type of the channel according to the second target feature value, the second target scaling value, the second target quantization bit width and a default logarithmic base number.
If the first information includes a second target feature value, a second target scaling value, a second target quantization bit width and a second logarithmic base number, the decoding end directly uses the second target feature value, the second target scaling value, the second target quantization bit width and the second logarithmic base number carried by the first information, and uses a nonlinear logarithmic uniform inverse quantization mode to inversely quantize the feature data of the fixed point number type of the channel.
If the first information includes a second target feature value, a second target scaling value, a second target quantization bit width, and indication information of a second logarithmic base, the indication information of the second logarithmic base is used for indicating to determine the second logarithmic base from a plurality of preset logarithmic bases. In this way, the decoding end analyzes the first information from the code stream, determines the second logarithmic base number from a plurality of preset logarithmic base numbers according to the indication information of the second logarithmic base number, and further dequantizes the fixed point number type characteristic data of the channel by using a nonlinear logarithmic uniform dequantization mode according to the second target characteristic value, the second target scaling value, the second target quantization bit width and the second logarithmic base number.
In Example 3, if the inverse quantization mode for dequantizing the fixed point number type feature data of the channel is a nonlinear exponential uniform inverse quantization mode, the first information includes a second target feature value, a second target scaling value, and a second target quantization bit width; or the first information includes a second target feature value, a second target scaling value, a second target quantization bit width, and a second exponent base; or the first information includes a second target feature value, a second target scaling value, a second target quantization bit width, and indication information of the second exponent base.
Specifically, if the first information includes a second target feature value, a second target scaling value and a second target quantization bit width, the decoding end uses a nonlinear exponential uniform inverse quantization mode to inversely quantize the feature data of the fixed point number type of the channel according to the second target feature value, the second target scaling value, the second target quantization bit width and a default exponent base.
If the first information includes a second target feature value, a second target scaling value, a second target quantization bit width and a second exponent base, the decoding end directly uses the second target feature value, the second target scaling value, the second target quantization bit width and the second exponent base carried by the first information, and uses a nonlinear index uniform inverse quantization mode to inversely quantize the feature data of the fixed point number type of the channel.
If the first information includes a second target feature value, a second target scaling value, a second target quantization bit width, and indication information of a second exponent base, the indication information of the second exponent base is used for indicating to determine the second exponent base from a plurality of preset exponent bases. In this way, the decoding end analyzes the first information from the code stream, determines the second exponent base from a plurality of preset exponent bases according to the indication information of the second exponent base, and further dequantizes the fixed point number type feature data of the channel by using a nonlinear exponent uniform dequantization mode according to the second target feature value, the second target scaling value, the second target quantization bit width and the second exponent base.
In Example 4, if the inverse quantization mode for dequantizing the fixed point number type feature data of the channel is a table lookup inverse quantization mode, the first information includes a second correspondence between index values of quantization intervals and inverse quantization values of the quantization intervals, where the second correspondence is determined based on the values before quantization and the values after quantization of the feature data of the channel. The index of a quantization interval may be understood as the fixed point feature value assigned to that quantization interval, and the inverse quantization value of a quantization interval may be understood as a weighted average of the feature values in the quantization interval, or the feature value corresponding to the center position of the quantization interval. The weighted average of the feature values in a quantization interval may also be referred to as the feature value corresponding to the probability distribution center of the quantization interval.
In a specific embodiment, the quantization mode of the encoding end corresponds to the inverse quantization mode of the decoding end. For example, when the encoding end quantizes the floating point number type feature data of the channel using a linear quantization mode, the decoding end dequantizes the fixed point number type feature data of the channel using a linear inverse quantization mode. If the encoding end quantizes the floating point number type feature data of the channel using a nonlinear logarithmic uniform quantization mode, the decoding end dequantizes the fixed point number type feature data of the channel using a nonlinear logarithmic uniform inverse quantization mode. If the encoding end quantizes the floating point number type feature data of the channel using a nonlinear exponential uniform quantization mode, the decoding end dequantizes the fixed point number type feature data of the channel using a nonlinear exponential uniform inverse quantization mode. If the encoding end quantizes the floating point number type feature data of the channel using a table lookup quantization mode, the decoding end dequantizes the fixed point number type feature data of the channel using a table lookup inverse quantization mode.
The embodiments of the present application may adopt the above linear uniform inverse quantization mode, nonlinear logarithmic inverse quantization mode, nonlinear exponential inverse quantization mode, or table lookup inverse quantization mode to dequantize the fixed point number type feature data of each channel in the N channels of the current image. When the inverse quantization modes differ, the syntax structures also differ; the syntax structures corresponding to the different inverse quantization modes are described below.
In some embodiments, if the inverse quantization mode is linear uniform inverse quantization, the syntax structure is as shown in Table 6:
TABLE 6
The syntax elements may be coded using different efficient entropy coding modes and are defined as follows:

flag_channel: a flag indicating the processing object for the decoding end; a value of 0 indicates that all channels are dequantized uniformly, a value of 1 indicates that each channel is dequantized separately, and a value of 2 indicates that each group of channels is dequantized separately; the flag_channel value here is 1;

flag_iquantization: a flag indicating the inverse quantization method for the decoding end; a value of 0 indicates linear inverse quantization, a value of 1 indicates nonlinear logarithmic inverse quantization, a value of 2 indicates nonlinear exponential inverse quantization, and a value of 3 indicates table lookup inverse quantization; the flag_iquantization value here is 0;

channel_num: the number of channels of the feature data;

scale_num[i]: the scaling value of the feature data of the i-th channel, which can be understood as the second target scaling value described above;

min_num[i]: the minimum value of the feature data of the i-th channel, which can be understood as the second target feature value described above.
In some embodiments, if the inverse quantization is a nonlinear logarithmic function inverse quantization, the syntax structure is as shown in table 7:
TABLE 7
The syntax elements may be coded using different efficient entropy coding modes and are defined as follows:

flag_channel: a flag indicating the processing object for the decoding end; a value of 0 indicates that all channels are dequantized uniformly, a value of 1 indicates that each channel is dequantized separately, and a value of 2 indicates that each group of channels is dequantized separately; the flag_channel value here is 1;

flag_iquantization: a flag indicating the inverse quantization method for the decoding end; a value of 0 indicates linear inverse quantization, a value of 1 indicates nonlinear logarithmic inverse quantization, a value of 2 indicates nonlinear exponential inverse quantization, and a value of 3 indicates table lookup inverse quantization; the flag_iquantization value here is 1;

channel_num: the number of channels of the feature data;

scale_num[i]: the scaling value of the feature data of the i-th channel, which can be understood as the second target scaling value described above;

min_num[i]: the minimum value of the feature data of the i-th channel, which can be understood as the second target feature value described above;

log_base: the base of the logarithmic function used in logarithmic inverse quantization, which can be understood as the second logarithmic base described above.
In some embodiments, if the inverse quantization mode is nonlinear exponential function inverse quantization, the syntax structure is as shown in Table 8:
TABLE 8
The syntax elements may be coded using different efficient entropy coding modes and are defined as follows:

flag_channel: a flag indicating the processing object for the decoding end; a value of 0 indicates that all channels are dequantized uniformly, a value of 1 indicates that each channel is dequantized separately, and a value of 2 indicates that each group of channels is dequantized separately; the flag_channel value here is 1;

flag_iquantization: a flag indicating the inverse quantization method for the decoding end; a value of 0 indicates linear inverse quantization, a value of 1 indicates nonlinear logarithmic inverse quantization, a value of 2 indicates nonlinear exponential inverse quantization, and a value of 3 indicates table lookup inverse quantization; the flag_iquantization value here is 2;

channel_num: the number of channels of the feature data;

scale_num[i]: the scaling value of the feature data of the i-th channel, which can be understood as the second target scaling value described above;

min_num[i]: the minimum value of the feature data of the i-th channel, which can be understood as the second target feature value described above;

e_base: the base of the exponential function used in exponential inverse quantization, which can be understood as the second exponent base described above.
In some embodiments, if the inverse quantization is a table look-up inverse quantization, optionally, the table look-up inverse quantization includes histogram equalization inverse quantization. The syntax structure of the table look-up dequantization is shown in table 9:
TABLE 9
The syntax elements may be coded using different efficient entropy coding modes and are defined as follows:

flag_channel: a flag indicating the processing object for the decoding end; a value of 0 indicates that all channels are dequantized uniformly, a value of 1 indicates that each channel is dequantized separately, and a value of 2 indicates that each group of channels is dequantized separately; the flag_channel value here is 1;

flag_iquantization: a flag indicating the inverse quantization method for the decoding end; a value of 0 indicates linear inverse quantization, a value of 1 indicates nonlinear logarithmic inverse quantization, a value of 2 indicates nonlinear exponential inverse quantization, and a value of 3 indicates table lookup inverse quantization; the flag_iquantization value here is 3;

channel_num: the number of channels of the feature data;

hist_codebook_num[i]: the size of the reconstruction codebook formed by the second correspondence between index values of quantization intervals and inverse quantization values of the quantization intervals for the i-th channel;

hist_codebook[i][j]: the inverse quantization value of the j-th quantization interval index in the reconstruction codebook corresponding to the i-th channel.
In Case 3, the first information indicates that the fixed point number type feature data of each of the M groups of channels is dequantized separately. For each group of channels, depending on the inverse quantization mode, the first information includes any one of the following Example 1, Example 2, Example 3, or Example 4:
In Example 1, if the inverse quantization mode for dequantizing the fixed point number type feature data of the group of channels is a linear uniform inverse quantization mode, the first information includes a third target feature value, a third target scaling value, and a third target quantization bit width.
Wherein the third target feature value is one feature value in the feature data of the set of channels, for example, the third target feature value is a feature data minimum value of the set of channels.
The third target scaling value is a scaling value corresponding to the characteristic data of the group of channels during quantization, and the third target quantization bit width is a quantization bit width corresponding to the characteristic data of the group of channels during quantization.
The following describes the process of determining the third target scaling value in combination with the quantization mode used at the encoding end.
In one example, if the manner in which the encoding end quantizes the set of channels is a linear uniform quantization manner, the encoding end may determine the third target scaling value according to the fifth and sixth feature values in the feature data of the set of channels and the third target quantization bit width.
Alternatively, the third target scaling value s_c3 may be determined according to the following equation (25):

s_c3 = (2^{bitdepth7} - 1) / (x_{cmax3} - x_{cmin3})    (25)

where x_{cmax3} and x_{cmin3} are the fifth feature value and the sixth feature value in the feature data of the group of channels, respectively, and the third target quantization bit width may be the seventh quantization bit width bitdepth7 in equation (13) above.

It should be noted that equation (25) is only an example; determining the third target scaling value s_c3 in the present application also includes deforming equation (25), or adding, multiplying, or dividing one or more coefficients in equation (25), and the like.
In another example, if the quantization mode of the encoding end for the group of channels is a nonlinear logarithmic uniform quantization mode, the encoding end may determine the third target scaling value according to the fifth feature value and the sixth feature value in the feature data of the group of channels, the third target quantization bit width, and the third base of the logarithmic function.
Alternatively, the third target scaling value s_c3 may be determined according to the following equation (26):

s_c3 = (2^{bitdepth8} - 1) / log_{log_base3}(x_{cmax3} - x_{cmin3} + 1)    (26)

where log_base3 is the third base of the logarithmic function, and the third target quantization bit width may be the eighth quantization bit width bitdepth8 in equation (15) above.

It should be noted that equation (26) is only an example; determining the third target scaling value s_c3 in the present application also includes deforming equation (26), or adding, multiplying, or dividing one or more coefficients in equation (26), and the like.
In another example, if the quantization mode of the encoding end for the group of channels is a nonlinear exponential uniform quantization mode, the encoding end may determine the third target scaling value according to the fifth feature value and the sixth feature value in the feature data of the group of channels, the third target quantization bit width, and the third base of the exponential function.
Alternatively, the third target scaling value s_c3 may be determined according to the following equation (27):

s_c3 = (2^{bitdepth9} - 1) / (e_base3^{x_{cmax3} - x_{cmin3}} - 1)    (27)

where e_base3 is the third base of the exponential function, and the third target quantization bit width may be the ninth quantization bit width bitdepth9 in equation (18) above.

It should be noted that equation (27) is only an example; determining the third target scaling value s_c3 in the present application also includes deforming equation (27), or adding, multiplying, or dividing one or more coefficients in equation (27), and the like.
In this way, the decoding end can parse the first information from the code stream, and dequantize the feature data of the fixed point number type of the group of channels by using a linear uniform dequantization mode according to the third target feature value, the third target scaling value and the third target quantization bit width included in the first information.
In Example 2, if the inverse quantization mode for dequantizing the fixed point number type feature data of the group of channels is a nonlinear logarithmic uniform inverse quantization mode, the first information includes a third target feature value, a third target scaling value, and a third target quantization bit width; or the first information includes a third target feature value, a third target scaling value, a third target quantization bit width, and a third logarithmic base; or the first information includes a third target feature value, a third target scaling value, a third target quantization bit width, and indication information of the third logarithmic base.
Specifically, if the first information includes a third target feature value, a third target scaling value, and a third target quantization bit width, the decoding end dequantizes the feature data of the fixed point number type of the group of channels by using a nonlinear logarithmic uniform dequantization mode according to the third target feature value, the third target scaling value, the third target quantization bit width, and a default logarithmic base number.
If the first information includes a third target feature value, a third target scaling value, a third target quantization bit width and a third logarithmic base number, the decoding end directly uses the third target feature value, the third target scaling value, the third target quantization bit width and the third logarithmic base number carried by the first information, and uses a nonlinear logarithmic uniform inverse quantization mode to inversely quantize the feature data of the fixed point number type of the group of channels.
If the first information includes a third target feature value, a third target scaling value, a third target quantization bit width, and indication information of a third logarithmic base, the indication information of the third logarithmic base is used for indicating to determine the third logarithmic base from a plurality of preset logarithmic bases. In this way, the decoding end analyzes the first information from the code stream, determines the third logarithmic base from the preset plurality of logarithmic bases according to the indication information of the third logarithmic base, and further dequantizes the characteristic data of the fixed point number type of the group of channels by using a nonlinear logarithmic uniform dequantization mode according to the third target characteristic value, the third target scaling value, the third target quantization bit width and the third logarithmic base.
In Example 3, if the inverse quantization mode for dequantizing the fixed point number type feature data of the group of channels is a nonlinear exponential uniform inverse quantization mode, the first information includes a third target feature value, a third target scaling value, and a third target quantization bit width; or the first information includes a third target feature value, a third target scaling value, a third target quantization bit width, and a third exponent base; or the first information includes a third target feature value, a third target scaling value, a third target quantization bit width, and indication information of the third exponent base.
Specifically, if the first information includes a third target feature value, a third target scaling value, and a third target quantization bit width, the decoding end dequantizes the feature data of the fixed point number type of the group of channels by using a nonlinear exponential uniform dequantization mode according to the third target feature value, the third target scaling value, the third target quantization bit width, and a default exponent base.
If the first information includes a third target feature value, a third target scaling value, a third target quantization bit width and a third exponent base, the decoding end directly uses the third target feature value, the third target scaling value, the third target quantization bit width and the third exponent base carried by the first information, and uses a nonlinear exponent uniform inverse quantization mode to inversely quantize the feature data of the fixed point number type of the group of channels.
If the first information includes a third target feature value, a third target scaling value, a third target quantization bit width, and indication information of a third exponent base, the indication information of the third exponent base is used for indicating that the third exponent base is determined from a plurality of preset exponent bases. In this way, the decoding end analyzes the first information from the code stream, determines the third index base from the preset plurality of index bases according to the indication information of the third index base, and further dequantizes the fixed point number type characteristic data of the group of channels by using a nonlinear index uniform dequantization mode according to the third target characteristic value, the third target scaling value, the third target quantization bit width and the third index base.
In Example 4, if the inverse quantization mode for dequantizing the fixed point number type feature data of the group of channels is a table lookup inverse quantization mode, the first information includes a third correspondence between index values of quantization intervals and inverse quantization values of the quantization intervals, where the third correspondence is determined based on the values before quantization and the values after quantization of the feature data of the group of channels. The index of a quantization interval may be understood as the fixed point feature value assigned to that quantization interval, and the inverse quantization value of a quantization interval may be understood as a weighted average of the feature values in the quantization interval, or the feature value corresponding to the center position of the quantization interval. The weighted average of the feature values in a quantization interval may also be referred to as the feature value corresponding to the probability distribution center of the quantization interval.
In a specific embodiment, the quantization mode of the encoding end corresponds to the inverse quantization mode of the decoding end. For example, when the encoding end quantizes the floating point number type feature data of the group of channels using a linear quantization mode, the decoding end dequantizes the fixed point number type feature data of the group of channels using a linear inverse quantization mode. If the encoding end quantizes the floating point number type feature data of the group of channels using a nonlinear logarithmic uniform quantization mode, the decoding end dequantizes the fixed point number type feature data of the group of channels using a nonlinear logarithmic uniform inverse quantization mode. If the encoding end quantizes the floating point number type feature data of the group of channels using a nonlinear exponential uniform quantization mode, the decoding end dequantizes the fixed point number type feature data of the group of channels using a nonlinear exponential uniform inverse quantization mode. If the encoding end quantizes the floating point number type feature data of the group of channels using a table lookup quantization mode, the decoding end dequantizes the fixed point number type feature data of the group of channels using a table lookup inverse quantization mode.
The embodiments of the present application may adopt the above linear uniform inverse quantization mode, nonlinear logarithmic inverse quantization mode, nonlinear exponential inverse quantization mode, or table lookup inverse quantization mode to dequantize the fixed point number type feature data of each group of channels in the M groups of channels of the current image. When the inverse quantization modes differ, the syntax structures also differ; the syntax structures corresponding to the different inverse quantization modes are described below.
In some embodiments, if the inverse quantization mode is linear uniform inverse quantization, the syntax structure is as shown in table 10:
TABLE 10
The syntax elements may be coded using different efficient entropy coding modes and are defined as follows:

flag_channel: a flag indicating the processing object for the decoding end; a value of 0 indicates that all channels are dequantized uniformly, a value of 1 indicates that each channel is dequantized separately, and a value of 2 indicates that each group of channels is dequantized separately; the flag_channel value here is 2;

flag_iquantization: a flag indicating the inverse quantization method for the decoding end; a value of 0 indicates linear inverse quantization, a value of 1 indicates nonlinear logarithmic inverse quantization, a value of 2 indicates nonlinear exponential inverse quantization, and a value of 3 indicates table lookup inverse quantization; the flag_iquantization value here is 0;

channel_num: the number of channels of the feature data;

group_num: the number of groups of the feature data;

group_channel: the number of channels in each group of the feature data;

scale_num[i]: the scaling value of the feature data under the i-th group, which can be understood as the third target scaling value described above;

min_num[i]: the minimum value of the feature data of all channels under the i-th group, which can be understood as the third target feature value described above. A sketch of group-wise dequantization is given below.
In some embodiments, if the inverse quantization is a nonlinear logarithmic function inverse quantization, the syntax structure is as shown in table 11:
TABLE 11
The syntax elements may be coded using different efficient entropy coding modes and are defined as follows:

flag_channel: a flag indicating the processing object for the decoding end; a value of 0 indicates that all channels are dequantized uniformly, a value of 1 indicates that each channel is dequantized separately, and a value of 2 indicates that each group of channels is dequantized separately; the flag_channel value here is 2;

flag_iquantization: a flag indicating the inverse quantization method for the decoding end; a value of 0 indicates linear inverse quantization, a value of 1 indicates nonlinear logarithmic inverse quantization, a value of 2 indicates nonlinear exponential inverse quantization, and a value of 3 indicates table lookup inverse quantization; the flag_iquantization value here is 1;

channel_num: the number of channels of the feature data;

group_num: the number of groups of the feature data;

group_channel: the number of channels in each group of the feature data;

scale_num[i]: the scaling value of the feature data under the i-th group, which can be understood as the third target scaling value described above;

min_num[i]: the minimum value of the feature data of all channels under the i-th group, which can be understood as the third target feature value described above;

log_base: the base of the logarithmic function used in logarithmic inverse quantization, which can be understood as the third logarithmic base described above.
In some embodiments, if the inverse quantization mode is inverse quantization of a nonlinear exponential function, the syntax structure is as shown in table 12:
TABLE 12
The syntax elements may be coded using different efficient entropy coding modes and are defined as follows:

flag_channel: a flag indicating the processing object for the decoding end; a value of 0 indicates that all channels are dequantized uniformly, a value of 1 indicates that each channel is dequantized separately, and a value of 2 indicates that each group of channels is dequantized separately; the flag_channel value here is 2;

flag_iquantization: a flag indicating the inverse quantization method for the decoding end; a value of 0 indicates linear inverse quantization, a value of 1 indicates nonlinear logarithmic inverse quantization, a value of 2 indicates nonlinear exponential inverse quantization, and a value of 3 indicates table lookup inverse quantization; the flag_iquantization value here is 2;

channel_num: the number of channels of the feature data;

group_num: the number of groups of the feature data;

group_channel: the number of channels in each group of the feature data;

scale_num[i]: the scaling value of the feature data under the i-th group, which can be understood as the third target scaling value described above;

min_num[i]: the minimum value of the feature data of all channels under the i-th group, which can be understood as the third target feature value described above;

e_base: the base of the exponential function used in exponential inverse quantization, which can be understood as the third exponent base described above.
In some embodiments, if the inverse quantization mode is table look-up inverse quantization, the syntax structure is as shown in Table 14:
Table 14
The syntax elements may be encoded using different efficient entropy coding modes, where the syntax elements are:
flag_channel: a flag used to indicate the processing object at the decoding end; a value of 0 indicates that all channels are inverse quantized uniformly, a value of 1 indicates that each channel is inverse quantized separately, and a value of 2 indicates that each group of channels is inverse quantized separately; here the flag_channel value is 2;
flag_iquantization: a flag used to indicate the inverse quantization method at the decoding end; a value of 0 indicates linear inverse quantization, a value of 1 indicates nonlinear logarithmic inverse quantization, a value of 2 indicates nonlinear exponential inverse quantization, and a value of 3 indicates table look-up inverse quantization; here the flag_iquantization value is 3;
channel_num: describes the number of channels of the feature data;
group_num: describes the number of groups of the feature data;
group_channel: describes the number of channels in each group of the feature data;
hist_codebook[i]: describes the reconstruction codebook for the i-th group, formed by the third correspondence between index values of quantization intervals and the inverse quantization values of the quantization intervals;
hist_codebook_num[i]: describes the size of the reconstruction codebook for the i-th group;
hist_codebook[i][j]: describes the inverse quantization value of the j-th quantization interval index in the reconstruction codebook corresponding to the i-th group.
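For illustration, the per-group look-up-table header above can be sketched as the following parse routine; the BitReader object and its read_uint()/read_float() methods are hypothetical stand-ins for whatever entropy coding the code stream actually uses, and are not part of any specification.

```python
from dataclasses import dataclass, field

# Hypothetical sketch of parsing the Table 14 style syntax elements.
# BitReader, read_uint() and read_float() are illustrative stand-ins;
# the real binarization of each element is defined by the code stream.

@dataclass
class DequantHeader:
    flag_channel: int = 0        # 0: all channels, 1: per channel, 2: per group
    flag_iquantization: int = 0  # 0: linear, 1: log, 2: exp, 3: look-up table
    channel_num: int = 0
    group_num: int = 0
    group_channel: int = 0
    hist_codebook: list = field(default_factory=list)

def parse_lut_header(reader) -> DequantHeader:
    h = DequantHeader()
    h.flag_channel = reader.read_uint()        # expected to be 2 here
    h.flag_iquantization = reader.read_uint()  # expected to be 3 here
    h.channel_num = reader.read_uint()
    h.group_num = reader.read_uint()
    h.group_channel = reader.read_uint()
    for _ in range(h.group_num):
        size = reader.read_uint()              # hist_codebook_num[i]
        h.hist_codebook.append([reader.read_float()   # hist_codebook[i][j]
                                for _ in range(size)])
    return h
```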
In some embodiments, the decoding side dequantizes the fixed point number type of feature data for at least one channel using a default dequantization scheme.
The image encoding process has been described above with reference to fig. 3 to 6; the image decoding process at the decoding end is described below on the basis of the above-described embodiments.
The image decoding process at the decoding end may be performed by the decoder shown in fig. 2.
Fig. 7 is a flowchart of an image decoding method 700 according to an embodiment of the present application, as shown in fig. 7, including:
S701, decoding a code stream to obtain feature data of a current image, wherein the feature data of the current image includes feature data of N channels, and N is a positive integer;
S702, decoding the code stream to obtain first information, wherein the first information is used for indicating that feature data of at least one channel in the N channels is to be dequantized;
S703, dequantizing the feature data of the at least one channel according to the first information.
As can be seen from the above, the encoding end quantizes the feature data of the current image with the channels of the feature data taken into account. Accordingly, when the decoding end dequantizes the feature data, it also takes the channels into account. Specifically, the decoder parses the code stream to obtain the feature data of the N channels of the current image and the first information, and dequantizes the feature data of at least one of the N channels according to the first information.
In some embodiments, the dequantizing the feature data of the at least one channel according to the first information in S703 includes: and according to the first information, the fixed point number type characteristic data of at least one channel is inversely quantized into the floating point number type characteristic data of at least one channel.
In some embodiments, the inverse quantization used by the decoder in inverse quantizing the fixed point number type of feature data of the at least one channel includes any one of: linear uniform inverse quantization, nonlinear uniform inverse quantization, or look-up table inverse quantization. The nonlinear uniform dequantization mode further comprises nonlinear exponential function dequantization and nonlinear logarithmic function dequantization. It should be noted that, the dequantization modes in the embodiment of the present application include, but are not limited to, the above several dequantization modes, and other dequantization modes may be used to dequantize the feature data of the fixed point number type, and the present application is not limited to the dequantization modes.
In some embodiments, the dequantization mode is default, i.e., the decoder dequantizes the fixed point number type feature data of the at least one channel using the default dequantization mode according to the first information.
In some embodiments, the code stream includes second information, where the second information is used to indicate an inverse quantization mode used when performing inverse quantization on the fixed-point number type feature data of the at least one channel, and the decoder may perform inverse quantization on the fixed-point number type feature data of the at least one channel according to the first information using the inverse quantization mode indicated by the second information.
In some embodiments, the first information in the code stream includes at least one parameter required to dequantize fixed point number type feature data for the at least one channel. For example, the first information includes a parameter corresponding to the inverse quantization mode.
In some embodiments, the manner of dequantizing the fixed point number type feature data of at least one channel of the N channels according to the first information in S703 includes, but is not limited to, the following:
if the first information indicates that feature data of fixed point number types of all channels in the N channels are dequantized, dequantizing the feature data of fixed point number types of all channels in the N channels by using the same dequantizing mode;
if the first information indicates that the fixed point number type characteristic data of each of the N channels are respectively dequantized, dequantizing the fixed point number type characteristic data of each channel by using a dequantization mode corresponding to the channel;
If the first information indicates that the fixed point number type characteristic data of the M groups of channels are respectively dequantized, dividing the N channels into the M groups of channels, and dequantizing the fixed point number type characteristic data of each group of channels by using a dequantization mode corresponding to the group of channels for each group of channels.
According to the image decoding method provided by the application, the fixed point number type feature data of the N channels of the current image is obtained by decoding the code stream; the first information is also obtained by decoding the code stream, and indicates that the fixed point number type feature data of at least one of the N channels is to be dequantized, so that the decoder dequantizes the fixed point number type feature data of the at least one channel according to the first information to obtain the floating point number type feature data of the current image. Because the feature data output by the intermediate layer of the neural network has been converted to fixed point, the feature data can be decoded by reusing techniques in existing video and image coding and decoding standards, and dequantizing the fixed point number type feature data of the N channels with at least one inverse quantization mode improves the decoding efficiency of the fixed-point feature data. In addition, the application takes the channel information of the feature data into account in the inverse quantization process at the decoding end and can process the feature data of different channels differently, thereby improving the reliability of the inverse quantization of the feature data.
The process of dequantizing the fixed point number type feature data of all of the N channels using one dequantization mode is described in detail below with reference to fig. 8.
Fig. 8 is a flowchart of an image decoding method 800 according to an embodiment of the present application, including:
S801, decoding a code stream to obtain first information;
S802, dequantizing, according to the first information, the fixed point number type feature data of all of the N channels using one dequantization mode.
The parameters included in the first information may be different for different dequantization modes, and the process of dequantizing the fixed point number type feature data of all the N channels using different dequantization modes by the decoder is described below.
In some embodiments, if the inverse quantization mode is a linear uniform inverse quantization mode, the above S802 includes S802-A1 and S802-A2 as follows:
S802-A1, parsing the first information to obtain a first target feature value, a first target scaling value and a first target quantization bit width;
S802-A2, according to the first target characteristic value, the first target scaling value and the first target quantization bit width, using a linear uniform dequantization mode to dequantize the characteristic data of the fixed point number type of all channels in the N channels.
The first target scaling value is a scaling value corresponding to the characteristic data of all channels in the N channels during quantization, and the first target quantization bit width is a quantization bit width corresponding to the characteristic data of all channels in the N channels during quantization.
Optionally, the first target feature value is the minimum feature value in the feature data of all of the N channels of the current image.
In this embodiment, the first information includes the first target feature value, the first target scaling value and the first target quantization bit width required by the linear uniform dequantization mode, so that the decoder can dequantize the fixed point number type feature data of all of the N channels using the linear uniform dequantization mode according to the first target feature value, the first target scaling value and the first target quantization bit width carried in the first information. For example, the decoder determines from the first target quantization bit width how many bits constitute one quantized value, and then dequantizes the feature data of all of the N channels using the linear uniform dequantization mode according to the first target feature value and the first target scaling value.
For example, the decoder dequantizes the fixed point number type feature data of all channels according to the following equation (28):
where y_cij is the quantized value at row i and column j of the c-th channel, s_c1 is the first target scaling value of the feature data under all channels, x1_cmin is the first target feature value of the feature data under all channels, and x_cij is the reconstructed value (inverse quantized value) at row i and column j of the c-th channel.
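For illustration, a minimal NumPy sketch of this step follows; since the body of equation (28) is not reproduced in this text, the reconstruction formula x = y / s + x_min is an assumption consistent with the symbol definitions above, as are the array shapes.

```python
import numpy as np

# Linear uniform inverse quantization over all channels: assuming each
# fixed-point value y maps back to x = y / s_c1 + x1_cmin, with one scaling
# value and one minimum shared by all N channels.

def dequant_linear_all(y: np.ndarray, s_c1: float, x1_cmin: float) -> np.ndarray:
    # y has shape (N, H, W); the same parameters apply to every channel
    return y.astype(np.float32) / s_c1 + x1_cmin
```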
Depending on the nonlinear function used, the nonlinear uniform inverse quantization mode includes a nonlinear logarithmic uniform inverse quantization mode and a nonlinear exponential uniform inverse quantization mode.
In some embodiments, if the inverse quantization is a nonlinear log-uniform inverse quantization, the above S802 includes S802-B1 and S802-B2 as follows:
S802-B1, determining a first target characteristic value, a first target scaling value, a first target quantization bit width and a first logarithmic base number according to first information;
S802-B2, according to the first target characteristic value, the first target scaling value, the first target quantization bit width and the first logarithmic base number, the characteristic data of the fixed point number type of all channels in the N channels is dequantized by using a nonlinear logarithmic uniform dequantization mode.
In S802-B1, the first target feature value, the first target scaling value, the first target quantization bit width and the first logarithmic base may be determined according to the first information in, but not limited to, the following ways:
In one mode, if the first information includes the first target feature value, the first target scaling value, the first target quantization bit width, and the first logarithmic base number, the decoder may directly obtain the first target feature value, the first target scaling value, the first target quantization bit width, and the first logarithmic base number by parsing the first information.
In a second mode, if the first information includes the first target feature value, the first target scaling value, the first target quantization bit width and the indication information of the first logarithmic base number, the decoder analyzes the first information to obtain the first target feature value, the first target scaling value, the first target quantization bit width and the indication information of the first logarithmic base number; and determining the first logarithmic base from a plurality of preset logarithmic bases according to the indication information of the first logarithmic base.
In a third mode, if the first information includes the first target feature value, the first target scaling value and the first target quantization bit width, and does not include the first logarithmic base number, the decoder analyzes the first information to obtain the first target feature value, the first target scaling value and the first target quantization bit width, and determines the default logarithmic base number as the first logarithmic base number.
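These three ways amount to a simple fall-through, sketched below for illustration; the field names, the preset base table and the default base are hypothetical values, not signalled quantities.

```python
# Hypothetical sketch of resolving the first logarithmic base from the
# first information: an explicit value wins, then indication information
# indexing a preset table, then a default. PRESET_LOG_BASES and
# DEFAULT_LOG_BASE are illustrative, not part of any specification.

PRESET_LOG_BASES = [2.0, 2.718281828459045, 10.0]
DEFAULT_LOG_BASE = 2.0

def resolve_log_base(info: dict) -> float:
    if "log_base" in info:         # mode 1: base carried directly
        return info["log_base"]
    if "log_base_idx" in info:     # mode 2: indication information
        return PRESET_LOG_BASES[info["log_base_idx"]]
    return DEFAULT_LOG_BASE        # mode 3: default base
```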
After determining the first target feature value, the first target scaling value, the first target quantization bit width and the first logarithmic base according to the above manner, the decoder uses a nonlinear logarithmic uniform inverse quantization manner to inverse-quantize the feature data of the fixed point number type of all channels in the N channels according to the first target feature value, the first target scaling value, the first target quantization bit width and the first logarithmic base.
For example, the decoder dequantizes the fixed point number type feature data of all channels according to the following equation (29):
where log_base1 is the first logarithmic base.
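A sketch of this step follows; since the body of equation (29) is not reproduced in this text, the sketch assumes the encoder quantized s_c1 * log_base1(x − x1_cmin + 1), so the inverse exponentiates, and the "+1" offset is likewise an assumption.

```python
import numpy as np

# Nonlinear logarithmic uniform inverse quantization, assuming the encoder
# computed y = round(s_c1 * log_base1(x - x1_cmin + 1)); the inverse raises
# the base to y / s_c1 and undoes the assumed "+1" offset.

def dequant_log_all(y: np.ndarray, s_c1: float, x1_cmin: float,
                    log_base1: float) -> np.ndarray:
    return np.power(log_base1, y.astype(np.float32) / s_c1) - 1.0 + x1_cmin
```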
In some embodiments, if the inverse quantization is a nonlinear exponential uniform inverse quantization, the above S802 includes S802-C1 and S802-C2 as follows:
S802-C1, determining a first target characteristic value, a first target scaling value, a first target quantization bit width and a first exponent base according to first information;
S802-C2, according to the first target characteristic value, the first target scaling value, the first target quantization bit width and the first exponent base, using a nonlinear exponent uniform inverse quantization mode to inverse quantize the characteristic data of the fixed point number type of all channels in the N channels.
In some embodiments, the determining the first target feature value, the first target scaling value, the first target quantization bit width and the first exponent base according to the first information in the above S802-C1 includes, but is not limited to, the following ways:
in one mode, if the first information includes a first target feature value, a first target scaling value, a first target quantization bit width, and a first exponent base, the decoder directly parses the first information to obtain the first target feature value, the first target scaling value, the first target quantization bit width, and the first exponent base.
In a second mode, if the first information includes the first target feature value, the first target scaling value, the first target quantization bit width and the indication information of the first exponent base, the decoder analyzes the first information to obtain the first target feature value, the first target scaling value, the first target quantization bit width and the indication information of the first exponent base; and determining the first index base from a plurality of preset index bases according to the indication information of the first index base.
In a third mode, if the first information includes the first target feature value, the first target scaling value, and the first target quantization bit width, the decoder parses the first information to obtain the first target feature value, the first target scaling value, and the first target quantization bit width, and determines the default exponent base as the first exponent base.
After determining the first target feature value, the first target scaling value, the first target quantization bit width and the first exponent base according to the above manner, the decoder uses a nonlinear exponent uniform inverse quantization manner to inverse quantize the feature data of the fixed point number type of all channels in the N channels according to the first target feature value, the first target scaling value, the first target quantization bit width and the first exponent base.
For example, the decoder dequantizes the fixed point number type feature data of all channels according to the following equation (30):
where e_base1 is the first exponent base.
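Mirroring the logarithmic case, a sketch follows; the assumed forward rule y = round(s_c1 * (e_base1^(x − x1_cmin) − 1)) is an illustration, as the body of equation (30) is not reproduced in this text.

```python
import numpy as np

# Nonlinear exponential uniform inverse quantization, assuming the encoder
# computed y = round(s_c1 * (e_base1**(x - x1_cmin) - 1)); the inverse takes
# the logarithm to base e_base1 via the change-of-base rule.

def dequant_exp_all(y: np.ndarray, s_c1: float, x1_cmin: float,
                    e_base1: float) -> np.ndarray:
    return np.log(y.astype(np.float32) / s_c1 + 1.0) / np.log(e_base1) + x1_cmin
```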
In some embodiments, if the inverse quantization is a look-up table inverse quantization, the step S802 includes the following steps S802-D1 to S802-D3:
S802-D1, determining a first correspondence between index values of quantization intervals and inverse quantization values of the quantization intervals, wherein the first correspondence is determined based on the values before quantization and the values after quantization of the feature data of all of the N channels;
S802-D2, for each piece of fixed point number type feature data in all of the N channels, using the value of the fixed point number type feature data as a quantization interval index, and looking up, in the first correspondence, the target inverse quantization value corresponding to the value of the fixed point number type feature data;
S802-D3, determining the target inverse quantization value as the floating point number type value of the fixed point number type feature data.
The correspondence between index values of quantization intervals and inverse quantization values of the quantization intervals may be a default; alternatively, the first information includes the correspondence between index values of quantization intervals and inverse quantization values of the quantization intervals.
Optionally, the inverse quantization value of the quantization interval is a feature value corresponding to a central position in the quantization interval, or is a weighted average of feature values in the quantization interval. The weighted average of the feature values in the quantization interval may also be referred to as a feature value corresponding to the probability distribution center of the quantization interval.
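For illustration, with the correspondence materialized as an array, the look-up reduces to indexing, as in the following sketch (the array form is an assumption):

```python
import numpy as np

# Look-up-table inverse quantization: the fixed-point value itself is the
# index of a quantization interval, and the codebook maps each index to its
# reconstruction value (interval centre or probability-weighted mean).

def dequant_lut(y: np.ndarray, codebook: np.ndarray) -> np.ndarray:
    # y holds integer interval indices; fancy indexing applies the table
    return codebook[y]
```

In the grouped case described later, the codebook for a channel would be, for example, np.asarray(hist_codebook[i]) for the group containing that channel.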
The process of dequantizing the fixed point number type feature data of each of the N channels of the current image using an inverse quantization mode is described in detail below with reference to fig. 9.
Fig. 9 is a flowchart of an image decoding method 900 according to an embodiment of the present application, including:
S901, for each channel in the N channels, decoding a code stream to obtain the fixed point number type feature data of the channel;
S902, dequantizing, according to the first information, the fixed point number type feature data of the channel using the inverse quantization mode corresponding to the channel.
Alternatively, the inverse quantization method includes linear uniform inverse quantization, nonlinear function inverse quantization, and table look-up inverse quantization.
In some embodiments, if the inverse quantization is a linear uniform inverse quantization, the above S902 includes the following S902-A1 and S902-A2:
S902-A1, analyzing the first information to obtain a second target characteristic value, a second target scaling value and a second target quantization bit width;
S902-A2, dequantizing the fixed point number type characteristic data of the channel by using a linear uniform dequantizing mode according to the second target characteristic value, the second target scaling value and the second target quantization bit width.
The second target feature value is one feature value in the feature data of the channel, the second target scaling value is the scaling value corresponding to the feature data of the channel during quantization, and the second target quantization bit width is the quantization bit width corresponding to the feature data of the channel during quantization.
Optionally, the second target feature value is a minimum feature value in the feature data of the channel.
In this embodiment, the first information includes the second target feature value, the second target scaling value and the second target quantization bit width required by the linear uniform dequantization mode, so that the decoder can dequantize the fixed point number type feature data of the channel using the linear uniform dequantization mode according to the second target feature value, the second target scaling value and the second target quantization bit width carried in the first information. For example, the decoder determines from the second target quantization bit width how many bits constitute one quantized value, and then dequantizes the feature data of the channel using the linear uniform dequantization mode according to the second target feature value and the second target scaling value.
For example, the fixed point number type feature data of the channel is dequantized according to the following equation (31):
where, assuming the current channel is the c-th channel, y_cij is the quantized value at row i and column j of the c-th channel, s_c2 is the second target scaling value of the feature data under the channel, x2_cmin is the second target feature value of the feature data under the channel, and x_cij is the reconstructed value at row i and column j of the c-th channel.
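A per-channel sketch follows, again assuming the linear reconstruction x = y / s + x_min, now with one scale and one minimum per channel:

```python
import numpy as np

# Per-channel linear inverse quantization: channel c has its own scale s[c]
# (second target scaling value) and minimum x_min[c] (second target feature
# value), broadcast over the spatial dimensions.

def dequant_linear_per_channel(y: np.ndarray, s: np.ndarray,
                               x_min: np.ndarray) -> np.ndarray:
    # y: (N, H, W); s and x_min: (N,)
    return y.astype(np.float32) / s[:, None, None] + x_min[:, None, None]
```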
Depending on the nonlinear function used, the nonlinear uniform inverse quantization mode includes a nonlinear logarithmic uniform inverse quantization mode and a nonlinear exponential uniform inverse quantization mode.
In some embodiments, if the inverse quantization is a nonlinear log-uniform inverse quantization, the above S902 includes S902-B1 and S902-B2 as follows:
S902-B1, determining a second target characteristic value, a second target scaling value, a second target quantization bit width and a second logarithmic base number according to the first information;
S902-B2, according to the second target characteristic value, the second target scaling value, the second target quantization bit width and the second logarithmic base, using a nonlinear logarithmic uniform inverse quantization mode to inversely quantize the characteristic data of the fixed point number type of the channel.
In S902-B1, the second target feature value, the second target scaling value, the second target quantization bit width and the second logarithmic base may be determined according to the first information in, but not limited to, the following ways:
In one mode, if the first information includes a second target feature value, a second target scaling value, a second target quantization bit width, and a second logarithmic base, the decoder directly parses the first information to obtain the second target feature value, the second target scaling value, the second target quantization bit width, and the second logarithmic base.
In a second mode, if the first information includes the second target feature value, the second target scaling value, the second target quantization bit width and the indication information of the second logarithmic base number, the decoder analyzes the first information to obtain the second target feature value, the second target scaling value, the second target quantization bit width and the indication information of the second logarithmic base number; and determining the second logarithmic base from a plurality of preset logarithmic bases according to the indication information of the second logarithmic base.
In a third mode, if the first information includes the second target feature value, the second target scaling value and the second target quantization bit width, the decoder parses the first information to obtain the second target feature value, the second target scaling value and the second target quantization bit width, and determines a default logarithmic base number as the second logarithmic base number.
After determining the second target feature value, the second target scaling value, the second target quantization bit width and the second logarithmic base according to the above manner, the decoder dequantizes the feature data of the fixed point number type of the channel by using a nonlinear logarithmic uniform dequantization manner according to the second target feature value, the second target scaling value, the second target quantization bit width and the second logarithmic base.
For example, the decoder dequantizes the fixed point number type feature data of the channel according to the following equation (32):
where log_base2 is the second logarithmic base.
In some embodiments, if the inverse quantization mode corresponding to the channel is a nonlinear exponential uniform inverse quantization mode, the above S902 includes the following S902-C1 and S902-C2:
S902-C1, determining a second target characteristic value, a second target scaling value, a second target quantization bit width and a second exponent base according to the first information;
S902-C2, dequantizing the feature data of the fixed point number type of the channel by using a nonlinear index uniform dequantization mode according to the second target feature value, the second target scaling value, the second target quantization bit width and the second index base.
In some embodiments, the determining the second target feature value, the second target scaling value, the second target quantization bit width and the second exponent base according to the first information in S902-C1 includes, but is not limited to, the following ways:
in one mode, if the first information includes a second target feature value, a second target scaling value, a second target quantization bit width, and a second exponent base, the decoder directly parses the first information to obtain the second target feature value, the second target scaling value, the second target quantization bit width, and the second exponent base.
In a second mode, if the first information includes the second target feature value, the second target scaling value, the second target quantization bit width and indication information of the second exponent base, the decoder parses the first information to obtain the second target feature value, the second target scaling value, the second target quantization bit width and the indication information of the second exponent base; and determines the second exponent base from a plurality of preset exponent bases according to the indication information of the second exponent base.
In a third mode, if the first information includes a second target feature value, a second target scaling value, and a second target quantization bit width, the decoder parses the first information to obtain the second target feature value, the second target scaling value, and the second target quantization bit width, and determines a default exponent base as the second exponent base.
After determining the second target feature value, the second target scaling value, the second target quantization bit width and the second exponent base according to the above manner, the decoder dequantizes the feature data of the fixed point number type of the channel by using a nonlinear exponent uniform dequantization manner according to the second target feature value, the second target scaling value, the second target quantization bit width and the second exponent base.
For example, the decoder dequantizes the fixed point number type feature data of the channel according to the following equation (33):
where e_base2 is the second exponent base.
In some embodiments, if the inverse quantization corresponding to the channel is a look-up table inverse quantization, the step S902 includes steps S902-D1 to S902-D3:
S902-D1, determining a second correspondence between index values of quantization intervals and inverse quantization values of the quantization intervals, wherein the second correspondence is determined based on the values before quantization and the values after quantization of the feature data of the channel;
S902-D2, for each piece of fixed point number type feature data in the channel, using the value of the fixed point number type feature data as a quantization interval index, and looking up, in the second correspondence, the target inverse quantization value corresponding to the value of the fixed point number type feature data;
S902-D3, determining the target inverse quantization value as the floating point number type value of the fixed point number type feature data.
The correspondence between index values of quantization intervals and inverse quantization values of the quantization intervals may be a default; alternatively, the first information includes the correspondence between index values of quantization intervals and inverse quantization values of the quantization intervals.
Optionally, the inverse quantization value of the quantization interval is a feature value corresponding to a central position in the quantization interval, or is a weighted average of feature values in the quantization interval. The weighted average of the feature values in the quantization interval may also be referred to as a feature value corresponding to the probability distribution center of the quantization interval.
The process of dequantizing the fixed point number type feature data of each group of channels using an inverse quantization mode is described in detail below in conjunction with fig. 10.
Fig. 10 is a flowchart of an image decoding method 1000 according to an embodiment of the present application, including:
S101, for each group of channels, decoding a code stream to obtain the fixed point number type feature data of the group of channels;
S102, dequantizing, according to the first information, the fixed point number type feature data of the group of channels using the inverse quantization mode corresponding to the group of channels.
Alternatively, the inverse quantization method includes linear uniform inverse quantization, nonlinear function inverse quantization, and table look-up inverse quantization.
In some embodiments, if the inverse quantization corresponding to the set of channels is a linear uniform inverse quantization, the step S102 includes the following steps S102-A1 and S102-A2:
S102-A1, analyzing the first information to obtain a third target characteristic value, a third target scaling value and a third target quantization bit width;
S102-A2, dequantizing the fixed point number type characteristic data of the group of channels by using a linear uniform dequantizing mode according to the third target characteristic value, the third target scaling value and the third target quantization bit width.
The third target characteristic value is one characteristic value in the characteristic data of the group of channels, the third target scaling value is a scaling value corresponding to the characteristic data of the group of channels during quantization, and the third target quantization bit width is a quantization bit width corresponding to the characteristic data of the group of channels during quantization.
Optionally, the third target feature value is a minimum feature value in the feature data of the set of channels.
In this embodiment, the first information includes the third target feature value, the third target scaling value and the third target quantization bit width required by the linear uniform dequantization mode, so that the decoder can dequantize the fixed point number type feature data of the group of channels using the linear uniform dequantization mode according to the third target feature value, the third target scaling value and the third target quantization bit width carried in the first information. For example, the decoder determines from the third target quantization bit width how many bits constitute one quantized value, and then dequantizes the feature data of the group of channels using the linear uniform dequantization mode according to the third target feature value and the third target scaling value.
For example, the fixed point number type feature data of the group of channels is dequantized according to the following equation (34):
where the c-th channel is one channel in the current group of channels, y_cij is the quantized value at row i and column j of the c-th channel, s_c3 is the third target scaling value of the feature data under the group of channels, x3_cmin is the third target feature value of the feature data under the group of channels, and x_cij is the reconstructed value at row i and column j of the c-th channel.
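A per-group sketch follows; mapping a channel index to its group by integer division assumes the groups consist of group_channel consecutive channels, which is an illustrative assumption rather than a signalled rule.

```python
import numpy as np

# Per-group linear inverse quantization: each group shares one scale
# (third target scaling value) and one minimum (third target feature
# value). Channels are assumed to be grouped consecutively.

def dequant_linear_per_group(y: np.ndarray, s_group: np.ndarray,
                             x_min_group: np.ndarray,
                             group_channel: int) -> np.ndarray:
    g = np.arange(y.shape[0]) // group_channel  # channel index -> group index
    return (y.astype(np.float32) / s_group[g][:, None, None]
            + x_min_group[g][:, None, None])
```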
The nonlinear uniform inverse quantization mode includes a nonlinear logarithmic uniform inverse quantization mode and a nonlinear exponential uniform inverse quantization mode.
In some embodiments, if the inverse quantization corresponding to the set of channels is a nonlinear logarithmic uniform inverse quantization, the step S102 includes the following steps S102-B1 and S102-B2:
S102-B1, determining a third target characteristic value, a third target scaling value, a third target quantization bit width and a third logarithmic base number according to the first information;
S102-B2, dequantizing the feature data of the fixed point number type of the group of channels by using a nonlinear logarithmic uniform dequantization mode according to the third target feature value, the third target scaling value, the third target quantization bit width and the third logarithmic base number.
In some embodiments, the manners of determining the third target feature value, the third target scaling value, the third target quantization bit width, and the third logarithmic base in S102-B1 described above include, but are not limited to, the following:
In one mode, if the first information includes a third target feature value, a third target scaling value, a third target quantization bit width, and a third logarithmic base, the decoder directly parses the first information to obtain the third target feature value, the third target scaling value, the third target quantization bit width, and the third logarithmic base.
In a second mode, if the first information includes the third target feature value, the third target scaling value, the third target quantization bit width and the indication information of the third logarithmic base number, the decoder analyzes the first information to obtain the third target feature value, the third target scaling value, the third target quantization bit width and the indication information of the third logarithmic base number; determining a third logarithmic base from a plurality of preset logarithmic bases according to the indication information of the third logarithmic base;
in a third mode, if the first information includes a third target feature value, a third target scaling value, and a third target quantization bit width, the decoder parses the first information to obtain the third target feature value, the third target scaling value, and the third target quantization bit width, and determines a default logarithmic base number as a third logarithmic base number.
After determining the third target feature value, the third target scaling value, the third target quantization bit width and the third logarithmic base according to the above manner, the decoder dequantizes the feature data of the fixed point number type of the group of channels by using a nonlinear logarithmic uniform dequantization manner according to the third target feature value, the third target scaling value, the third target quantization bit width and the third logarithmic base.
For example, the fixed point number type feature data of the group of channels is dequantized according to the following equation (35):
where log_base3 is the third logarithmic base.
In some embodiments, if the inverse quantization mode corresponding to the set of channels is a nonlinear exponential uniform inverse quantization mode, the step S102 includes the following steps S102-C1 and S102-C2:
S102-C1, determining a third target characteristic value, a third target scaling value, a third target quantization bit width and a third exponent base according to the first information;
S102-C2, according to the third target characteristic value, the third target scaling value, the third target quantization bit width and the third exponent base, performing dequantization on the characteristic data of the fixed point number type of the group of channels by using a nonlinear exponent uniform dequantization mode.
In some embodiments, the manners of determining the third target feature value, the third target scaling value, the third target quantization bit width and the third exponent base in S102-C1 described above include, but are not limited to, the following manners:
in one mode, if the first information includes a third target feature value, a third target scaling value, a third target quantization bit width, and a third exponent base, the decoder directly parses the first information to obtain the third target feature value, the third target scaling value, the third target quantization bit width, and the third exponent base.
In a second mode, if the first information includes the third target feature value, the third target scaling value, the third target quantization bit width and indication information of the third exponent base, the decoder parses the first information to obtain the third target feature value, the third target scaling value, the third target quantization bit width and the indication information of the third exponent base; and determines the third exponent base from a plurality of preset exponent bases according to the indication information of the third exponent base;
in a third mode, if the first information includes a third target feature value, a third target scaling value, and a third target quantization bit width, the decoder parses the first information to obtain the third target feature value, the third target scaling value, and the third target quantization bit width, and determines a default exponent base as the third exponent base.
After determining the third target feature value, the third target scaling value, the third target quantization bit width and the third exponent base in the above manner, the decoder dequantizes the fixed point number type feature data of the group of channels using the nonlinear exponential uniform dequantization mode according to the third target feature value, the third target scaling value, the third target quantization bit width and the third exponent base.
For example, the fixed point number type feature data of the group of channels is dequantized according to the following equation (36):
where e_base3 is the third exponent base.
In some embodiments, if the inverse quantization mode corresponding to the set of channels is a look-up table inverse quantization mode, the step S102 includes the following steps S102-D1 to S102-D3:
S102-D1, determining a third correspondence between index values of quantization intervals and inverse quantization values of the quantization intervals, wherein the third correspondence is determined based on the values before quantization and the values after quantization of the feature data of the group of channels;
S102-D2, for each piece of fixed point number type feature data in the group of channels, using the value of the fixed point number type feature data as a quantization interval index, and looking up, in the third correspondence, the target inverse quantization value corresponding to the value of the fixed point number type feature data;
S102-D3, determining the target inverse quantization value as the floating point number type value of the fixed point number type feature data.
The correspondence between index values of quantization intervals and inverse quantization values of the quantization intervals may be a default; alternatively, the first information includes the correspondence between index values of quantization intervals and inverse quantization values of the quantization intervals.
Optionally, the inverse quantization value of the quantization interval is a feature value corresponding to a central position in the quantization interval, or is a weighted average of feature values in the quantization interval. The weighted average of the feature values in the quantization interval may also be referred to as a feature value corresponding to the probability distribution center of the quantization interval.
It should be understood that the above-described fig. 3 to 10 are only examples of the present application and should not be construed as limiting the present application.
The preferred embodiments of the present application have been described in detail above with reference to the accompanying drawings, but the present application is not limited to the specific details of the above embodiments, and various simple modifications can be made to the technical solution of the present application within the scope of the technical concept of the present application, and all the simple modifications belong to the protection scope of the present application. For example, the specific features described in the above embodiments may be combined in any suitable manner, and in order to avoid unnecessary repetition, various possible combinations are not described further. As another example, any combination of the various embodiments of the present application may be made without departing from the spirit of the present application, which should also be regarded as the disclosure of the present application.
It should be further understood that, in the various method embodiments of the present application, the sequence numbers of the foregoing processes do not mean the order of execution, and the order of execution of the processes should be determined by the functions and internal logic of the processes, and should not constitute any limitation on the implementation process of the embodiments of the present application. In addition, in the embodiment of the present application, the term "and/or" is merely an association relationship describing the association object, which means that three relationships may exist. Specifically, a and/or B may represent: a exists alone, A and B exist together, and B exists alone. In addition, the character "/" herein generally indicates that the front and rear associated objects are an "or" relationship.
The method embodiments of the present application are described in detail above with reference to fig. 3 to 10, and the apparatus embodiments of the present application are described in detail below with reference to fig. 11 to 14.
Fig. 11 is a schematic block diagram of a video encoder 10 provided by an embodiment of the present application.
As shown in fig. 11, the video encoder 10 includes:
an acquisition unit 110 for acquiring a current image to be encoded;
the feature extraction unit 120 is configured to input the current image into a neural network to obtain feature data of the current image, where the feature data of the current image includes feature data of N channels, and N is a positive integer;
A quantization unit 130, configured to quantize feature data of at least one channel of the N channels;
the encoding unit 140 is configured to encode the quantized feature data of the at least one channel to obtain a code stream, where the code stream includes first information, and the first information is used to instruct dequantizing the feature data of the at least one channel in the N channels.
In some embodiments, the quantization unit 130 is specifically configured to quantize the floating point type feature data of at least one of the N channels into fixed point type feature data.
In some embodiments, the quantization mode for quantizing the floating point number type of the feature data of at least one of the N channels includes any one of the following: linear uniform quantization mode, nonlinear exponential uniform quantization mode, nonlinear logarithmic uniform quantization mode, and table look-up quantization mode.
In some embodiments, the quantization unit 130 is specifically configured to quantize the feature data of the floating point number type of all channels in the N channels using the same quantization mode; or, respectively quantizing the feature data of the floating point number type of each channel in the N channels by using one quantization mode; or grouping the N channels, and quantizing the feature data of the floating point number type of each group of channels by using a quantization mode.
In some embodiments, if the quantization mode is a linear uniform quantization mode, the quantization unit 130 is specifically configured to obtain a preset first quantization bit width, and a first feature value and a second feature value in the floating point number type feature data of all channels in the N channels; and quantizing the feature data of the floating point number type of each of the N channels by using the linear uniform quantization mode according to the first feature value, the second feature value and the first quantization bit width.
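For illustration, the encoder-side counterpart of the linear mode can be sketched as follows; the scale definition scale = (2^b − 1) / (max − min) and the rounding rule are assumptions, with the first and second feature values taken as the minimum and maximum (matching the optional reading given below).

```python
import numpy as np

# Encoder-side linear uniform quantization sketch: with quantization bit
# width b, map the floating-point range [x_min, x_max] onto the integers
# 0 .. 2**b - 1. The scale definition and rounding rule are assumptions.

def quant_linear(x: np.ndarray, bit_width: int):
    x_min = float(x.min())                        # first feature value
    x_max = float(x.max())                        # second feature value
    scale = (2**bit_width - 1) / (x_max - x_min)  # assumes x_max > x_min
    y = np.round((x - x_min) * scale).astype(np.int32)
    return y, scale, x_min
```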
In some embodiments, if the quantization mode is a nonlinear logarithmic uniform quantization mode, the quantization unit 130 is specifically configured to obtain a first base of a preset second quantization bit width and logarithmic function, and a first feature value and a second feature value in the feature data of the floating point number type of all channels in the N channels; and quantizing the feature data of the floating point number type of each of the N channels by using the nonlinear logarithmic uniform quantization mode according to the first feature value and the second feature value, the second quantization bit width and the first base of the logarithmic function.
In some embodiments, if the quantization mode is a nonlinear exponential uniform quantization mode, the quantization unit 130 is specifically configured to obtain a first base of a preset third quantization bit width and exponential function, and a first feature value and a second feature value in the feature data of the floating point number type of all channels in the N channels; and quantizing the feature data of the floating point number type of each of the N channels by using the nonlinear index uniform quantization mode according to the first feature value, the second feature value, the third quantization bit width and the first base of the index function.
In some embodiments, if the quantization mode is a table look-up quantization mode, the quantization unit 130 is specifically configured to sort the floating point number type feature data of all of the N channels by value, to obtain sorted first feature data; divide the sorted first feature data into a plurality of first quantization intervals, wherein each first quantization interval contains the same number of feature data; and, for each first quantization interval, quantize the values of the feature data within the first quantization interval to the index value of the first quantization interval.
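A sketch of this equal-count interval construction follows; taking the interval mean as the reconstruction value is one of the options described in the text, and the boundary handling is an illustrative choice.

```python
import numpy as np

# Table look-up quantization sketch: sort all values, cut them into
# intervals holding roughly the same number of samples, and quantize each
# value to its interval index. Each interval's reconstruction value is
# taken here as the mean of the values falling in it.

def quant_lut(x: np.ndarray, num_intervals: int):
    flat = np.sort(x.ravel())
    # boundaries placed so each interval holds ~flat.size/num_intervals samples
    edges = flat[np.linspace(0, flat.size - 1, num_intervals + 1).astype(int)]
    idx = np.clip(np.searchsorted(edges, x, side="right") - 1,
                  0, num_intervals - 1)
    codebook = np.array([x[idx == k].mean() if np.any(idx == k) else edges[k]
                         for k in range(num_intervals)])
    return idx, codebook
```

The decoding side would then reconstruct with dequant_lut(idx, codebook) from the earlier sketch.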
Optionally, the first characteristic value is a minimum characteristic value in the floating point type characteristic data of all channels in the N channels, and the second characteristic value is a maximum characteristic value in the floating point type characteristic data of all channels in the N channels.
In some embodiments, if the quantization mode is a linear uniform quantization mode, the quantization unit 130 obtains a preset fourth quantization bit width, and a third feature value and a fourth feature value in the floating-point number type feature data of the channel for each of the N channels; and quantizing the floating point number type characteristic data of the channel by using the linear uniform quantization mode according to the third characteristic value, the fourth characteristic value and the fourth quantization bit width.
In some embodiments, if the quantization mode is a nonlinear logarithmic uniform quantization mode, the quantization unit 130 obtains, for each of the N channels, a second base of a preset fifth quantization bit width and logarithmic function, and a third feature value and a fourth feature value in the floating-point number type feature data of the channel; and quantizing the floating point number type characteristic data of the channel by using the nonlinear logarithmic uniform quantization mode according to the third characteristic value, the fourth characteristic value, the fifth quantization bit width and the second base of the logarithmic function.
In some embodiments, if the quantization mode is a nonlinear exponential uniform quantization mode, the quantization unit 130 obtains, for each of the N channels, a second base of a preset sixth quantization bit width and exponential function, and a third feature value and a fourth feature value in the floating-point number type feature data of the channel; and quantizing the floating point number type characteristic data of the channel by using the nonlinear index uniform quantization mode according to the third characteristic value, the fourth characteristic value, the sixth quantization bit width and the second base of the index function.
In some embodiments, if the quantization mode is a table look-up quantization mode, the quantization unit 130 is specifically configured to, for each channel of the N channels, sort the floating point number type feature data of the channel by value, to obtain sorted second feature data under the channel; divide the sorted second feature data under the channel into a plurality of second quantization intervals, wherein each second quantization interval contains the same number of feature data; and, for each second quantization interval, quantize the values of the feature data within the second quantization interval to the index value of the second quantization interval.
Optionally, the third feature value is a maximum feature value in the floating point type feature data of the channel, and the fourth feature value is a minimum feature value in the floating point type feature data of the channel.
In some embodiments, if the quantization mode is a linear uniform quantization mode, the quantization unit 130 obtains a preset seventh quantization bit width for each group of channels, and a fifth feature value and a sixth feature value in the floating point number type feature data of the group of channels; and quantizing the floating point number type characteristic data of each channel in the group of channels by using the linear uniform quantization mode according to the fifth characteristic value, the sixth characteristic value and the seventh quantization bit width.
In some embodiments, if the quantization mode is a nonlinear logarithmic uniform quantization mode, the quantization unit 130 obtains, for each group of channels, a third base of a preset eighth quantization bit width and logarithmic function, and a fifth feature value and a sixth feature value in the feature data of the floating point number type of the group of channels; and quantizing the floating point number type characteristic data of each channel in the group of channels by using the nonlinear logarithm uniform quantization mode according to the fifth characteristic value and the sixth characteristic value and the eighth quantization bit width and the third base of the logarithmic function.
In some embodiments, if the quantization mode is a nonlinear exponential uniform quantization mode, the quantization unit 130 obtains, for each group of channels, a preset ninth quantization bit width and a third base of the exponential function, and a fifth feature value and a sixth feature value in the floating point number type feature data of the group of channels; and quantizes the floating point number type feature data of each channel in the group of channels using the nonlinear exponential uniform quantization mode according to the fifth feature value, the sixth feature value, the ninth quantization bit width and the third base of the exponential function.
In some embodiments, if the quantization mode is a table look-up quantization mode, the quantization unit 130 is specifically configured to, for each group of channels, sort the floating point number type feature data of the group of channels by value, to obtain sorted third feature data under the group of channels; divide the sorted third feature data under the group of channels into a plurality of third quantization intervals, wherein each third quantization interval contains the same number of feature data; and, for each third quantization interval, quantize the values of the feature data within the third quantization interval to the index value of the third quantization interval.
Optionally, the fifth characteristic value is a maximum characteristic value in the floating point type characteristic data of the group of channels, and the sixth characteristic value is a minimum characteristic value in the floating point type characteristic data of the group of channels.
In some embodiments, the first information indicates dequantizing feature data of a fixed point number type for all of the N channels; or, the first information indicates that feature data of a fixed point number type of each of the N channels is respectively dequantized; or, the first information indicates that feature data of a fixed point number type of each of M groups of channels is respectively dequantized, where the M groups of channels are obtained by grouping the N channels, and each group of channels includes at least one channel of the N channels.
In some embodiments, the inverse quantization method used in the inverse quantization of the fixed point number type of feature data of the at least one channel includes any one of the following: a linear uniform inverse quantization mode, a nonlinear exponential uniform inverse quantization mode, a nonlinear logarithmic uniform inverse quantization mode and a table lookup inverse quantization mode.
In some embodiments, the first information includes at least one parameter required to dequantize fixed point number type feature data for the at least one channel.
In some embodiments, the first information indicates that feature data of a fixed point number type of all channels in the N channels is dequantized, and the first information includes any one of the following:
if the inverse quantization mode for inverse quantizing the fixed point number type feature data of all of the N channels is a linear uniform inverse quantization mode, the first information includes a first target feature value, a first target scaling value and a first target quantization bit width;
if the inverse quantization mode for inverse quantizing the fixed point number type feature data of all of the N channels is a nonlinear logarithmic uniform inverse quantization mode, the first information includes a first target feature value, a first target scaling value and a first target quantization bit width, or the first information includes a first target feature value, a first target scaling value, a first target quantization bit width and a first logarithmic base, or the first information includes a first target feature value, a first target scaling value, a first target quantization bit width and indication information of the first logarithmic base;
if the inverse quantization mode for inverse quantizing the fixed point number type feature data of all of the N channels is a nonlinear exponential uniform inverse quantization mode, the first information includes a first target feature value, a first target scaling value and a first target quantization bit width, or the first information includes a first target feature value, a first target scaling value, a first target quantization bit width and a first exponent base, or the first information includes a first target feature value, a first target scaling value, a first target quantization bit width and indication information of the first exponent base;
if the inverse quantization mode of performing inverse quantization on the fixed point number type feature data of all the N channels is a table look-up inverse quantization mode, the first information includes a first correspondence between an index value of a quantization interval and an inverse quantization value of the quantization interval, and the first correspondence is determined based on a value before quantization and a value after quantization of the feature data of all the N channels;
the first target scaling value is a scaling value corresponding to the characteristic data of all channels in the N channels during quantization, and the first target quantization bit width is a quantization bit width corresponding to the characteristic data of all channels in the N channels during quantization.
Optionally, the first target feature value is the minimum value of the feature data of all channels in the N channels.
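As a toy illustration of how these three signalled parameters suffice for linear uniform inverse quantization, the sketch below reconstructs floating point values from the fixed point data; the reconstruction formula and the way the scaling value was derived at the encoder are assumptions for illustration only.

```python
import numpy as np

def linear_uniform_dequantize(q, target_min, scale, bit_width):
    """Hypothetical sketch: invert a linear uniform quantizer.

    q          -- fixed point number type feature data from the bitstream
    target_min -- first target feature value (assumed: pre-quantization minimum)
    scale      -- first target scaling value, e.g. (2**bit_width - 1)/(max - min)
    bit_width  -- first target quantization bit width, used here only to clip
    """
    levels = (1 << bit_width) - 1
    q = np.clip(q, 0, levels)              # guard against out-of-range values
    return q.astype(np.float32) / scale + target_min
```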
In some embodiments, the first information indicates that feature data of a fixed point number type of each of the N channels is respectively dequantized, and for each channel, the first information includes any one of the following:
if the inverse quantization mode for inverse-quantizing the fixed point number type feature data of the channel is a linear uniform inverse quantization mode, the first information includes a second target feature value, a second target scaling value and a second target quantization bit width;
if the inverse quantization mode for inverse-quantizing the fixed point number type feature data of the channel is a nonlinear logarithmic uniform inverse quantization mode, the first information includes a second target feature value, a second target scaling value and a second target quantization bit width; or the first information includes a second target feature value, a second target scaling value, a second target quantization bit width and a second logarithmic base; or the first information includes a second target feature value, a second target scaling value, a second target quantization bit width and indication information of the second logarithmic base;
if the inverse quantization mode for inverse-quantizing the fixed point number type feature data of the channel is a nonlinear exponential uniform inverse quantization mode, the first information includes a second target feature value, a second target scaling value and a second target quantization bit width; or the first information includes a second target feature value, a second target scaling value, a second target quantization bit width and a second exponent base; or the first information includes a second target feature value, a second target scaling value, a second target quantization bit width and indication information of the second exponent base;
if the inverse quantization mode for inverse-quantizing the fixed point number type feature data of the channel is a table look-up inverse quantization mode, the first information includes a second correspondence between the index value of a quantization interval and the inverse quantization value of the quantization interval, where the second correspondence is determined based on the pre-quantization values and post-quantization values of the feature data of the channel;
where the second target feature value is one feature value in the feature data of the channel, the second target scaling value is the scaling value used for the feature data of the channel during quantization, and the second target quantization bit width is the quantization bit width used for the feature data of the channel during quantization.
Optionally, the second target feature value is the minimum value of the feature data of the channel.
In some embodiments, the first information indicates that feature data of fixed point number type of M groups of channels are respectively dequantized, and for each group of channels, the first information includes any one of the following:
if the inverse quantization mode for inverse-quantizing the fixed point number type feature data of the group of channels is a linear uniform inverse quantization mode, the first information includes a third target feature value, a third target scaling value and a third target quantization bit width;
if the inverse quantization mode for inverse-quantizing the fixed point number type feature data of the group of channels is a nonlinear logarithmic uniform inverse quantization mode, the first information includes a third target feature value, a third target scaling value and a third target quantization bit width; or the first information includes a third target feature value, a third target scaling value, a third target quantization bit width and a third logarithmic base; or the first information includes a third target feature value, a third target scaling value, a third target quantization bit width and indication information of the third logarithmic base;
if the inverse quantization mode for inverse-quantizing the fixed point number type feature data of the group of channels is a nonlinear exponential uniform inverse quantization mode, the first information includes a third target feature value, a third target scaling value and a third target quantization bit width; or the first information includes a third target feature value, a third target scaling value, a third target quantization bit width and a third exponent base; or the first information includes a third target feature value, a third target scaling value, a third target quantization bit width and indication information of the third exponent base;
if the inverse quantization mode for inverse-quantizing the fixed point number type feature data of the group of channels is a table look-up inverse quantization mode, the first information includes a third correspondence between the index value of a quantization interval and the inverse quantization value of the quantization interval, where the third correspondence is determined based on the pre-quantization values and post-quantization values of the feature data of the group of channels;
where the M groups of channels are obtained by grouping the N channels, each group of channels includes at least one channel of the N channels, the third target feature value is one feature value in the feature data of the group of channels, the third target scaling value is the scaling value used for the feature data of the group of channels during quantization, and the third target quantization bit width is the quantization bit width used for the feature data of the group of channels during quantization.
Optionally, the third target feature value is the minimum value of the feature data of the group of channels.
In some embodiments, the code stream further comprises second information indicating an inverse quantization manner used when inverse quantizing the fixed point number type of feature data of the at least one channel.
It should be understood that the apparatus embodiments and the method embodiments may correspond to each other, and similar descriptions may refer to the method embodiments. To avoid repetition, no further description is provided here. Specifically, the video encoder 10 shown in fig. 11 may perform the methods according to the embodiments of the present application, and the foregoing and other operations and/or functions of each unit in the video encoder 10 are respectively for implementing the corresponding flows in the methods, such as the methods 300 to 600, which are not described herein for brevity.
Fig. 12 is a schematic block diagram of video decoder 20 provided by an embodiment of the present application.
As shown in fig. 12, the video decoder 20 may include:
a decoding unit 210, configured to decode a code stream to obtain feature data of a current image, where the feature data of the current image includes feature data of N channels, and N is a positive integer;
the decoding unit 210 is further configured to decode the code stream to obtain first information, where the first information is used to indicate that feature data of at least one channel of the N channels is dequantized;
an inverse quantization unit 220, configured to dequantize the feature data of the at least one channel according to the first information.
In some embodiments, the inverse quantization unit 220 is specifically configured to dequantize the fixed point number type feature data of the at least one channel into floating point number type feature data of the at least one channel according to the first information.
In some embodiments, the inverse quantization method used in the inverse quantization of the fixed point number type of feature data of the at least one channel includes any one of the following: a linear uniform inverse quantization mode, a nonlinear exponential uniform inverse quantization mode, a nonlinear logarithmic uniform inverse quantization mode and a table lookup inverse quantization mode.
In some embodiments, the inverse quantization unit 220 is specifically configured to dequantize the fixed point number type feature data of the at least one channel according to the first information by using a default inverse quantization mode.
In some embodiments, the code stream further includes second information, where the second information is used to indicate the inverse quantization mode used when dequantizing the fixed point number type feature data of the at least one channel; correspondingly, the inverse quantization unit 220 is specifically configured to dequantize the fixed point number type feature data of the at least one channel according to the first information by using the inverse quantization mode indicated by the second information.
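As a toy illustration of how a decoder could act on such second information, the following sketch maps a signalled code point to a handler; the code point values and the handler registry are hypothetical, since the application does not fix a syntax here.

```python
from enum import IntEnum
from typing import Callable, Dict

class DequantMode(IntEnum):            # hypothetical code points for second information
    LINEAR_UNIFORM = 0
    NONLINEAR_EXPONENTIAL = 1
    NONLINEAR_LOGARITHMIC = 2
    LOOKUP_TABLE = 3

def select_dequantizer(second_info: int,
                       registry: Dict[DequantMode, Callable]) -> Callable:
    """Resolve the signalled inverse quantization mode to a routine."""
    return registry[DequantMode(second_info)]
```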
In some embodiments, the first information includes at least one parameter required to dequantize fixed point number type feature data for the at least one channel.
In some embodiments, the inverse quantization unit 220 is specifically configured to: if the first information indicates that the fixed point number type feature data of all channels in the N channels is dequantized, dequantize the fixed point number type feature data of all channels in the N channels by using the same inverse quantization mode; or, if the first information indicates that the fixed point number type feature data of each of the N channels is respectively dequantized, dequantize the fixed point number type feature data of each channel by using the inverse quantization mode corresponding to that channel; or, if the first information indicates that the fixed point number type feature data of the M groups of channels is respectively dequantized, divide the N channels into the M groups of channels and, for each group of channels, dequantize the fixed point number type feature data of the group of channels by using the inverse quantization mode corresponding to that group.
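The three granularities can be pictured with the following toy dispatch; the dictionary layout of the parsed first information, the argument names and the helper callable are all hypothetical, not a layout defined by this application.

```python
import numpy as np

def dequantize_feature(feat, first_info, dequant_one, groups=None):
    """Hypothetical sketch: route N-channel fixed point feature data to the
    per-tensor, per-channel or per-group inverse quantization described above.

    feat        -- fixed point feature data, shape (N, H, W)
    first_info  -- parsed first information (dict layout assumed)
    dequant_one -- callable applying one inverse quantization mode
    groups      -- list of channel-index lists when M groups are signalled
    """
    if first_info['granularity'] == 'all':            # same mode, all N channels
        return dequant_one(feat, first_info['params'])
    if first_info['granularity'] == 'per_channel':    # one parameter set per channel
        return np.stack([dequant_one(feat[c], first_info['params'][c])
                         for c in range(feat.shape[0])])
    out = np.empty(feat.shape, dtype=np.float32)      # one parameter set per group
    for g, chans in enumerate(groups):
        out[chans] = dequant_one(feat[chans], first_info['params'][g])
    return out
```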
In some embodiments, if the inverse quantization mode for dequantizing the fixed point number type feature data of all the N channels is a linear uniform inverse quantization mode, the inverse quantization unit 220 is specifically configured to parse the first information to obtain a first target feature value, a first target scaling value and a first target quantization bit width; and, according to the first target feature value, the first target scaling value and the first target quantization bit width, dequantize the fixed point number type feature data of all the N channels by using the linear uniform inverse quantization mode.
In some embodiments, if the inverse quantization mode for dequantizing the fixed point number type feature data of all the N channels is a nonlinear logarithmic uniform inverse quantization mode, the inverse quantization unit 220 is specifically configured to determine a first target feature value, a first target scaling value, a first target quantization bit width and a first logarithmic base according to the first information; and, according to the first target feature value, the first target scaling value, the first target quantization bit width and the first logarithmic base, dequantize the fixed point number type feature data of all the N channels by using the nonlinear logarithmic uniform inverse quantization mode.
The inverse quantization unit 220 is specifically configured to: parse the first information to obtain the first target feature value, the first target scaling value, the first target quantization bit width and the first logarithmic base; or, parse the first information to obtain the first target feature value, the first target scaling value, the first target quantization bit width and the indication information of the first logarithmic base, and determine the first logarithmic base from a plurality of preset logarithmic bases according to the indication information of the first logarithmic base; or, parse the first information to obtain the first target feature value, the first target scaling value and the first target quantization bit width, and determine a default logarithmic base as the first logarithmic base.
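A compact sketch of these three signalling options and of the log-domain reconstruction is given below; the preset base table, the dict keys and the inverse mapping formula are assumptions for illustration only.

```python
import numpy as np

PRESET_LOG_BASES = (2.0, np.e, 10.0)     # hypothetical preset logarithmic bases

def resolve_log_base(first_info, default_base=2.0):
    """Hypothetical sketch of the three options: explicit base, indication
    information indexing a preset table, or a default base."""
    if 'log_base' in first_info:
        return first_info['log_base']
    if 'log_base_idx' in first_info:
        return PRESET_LOG_BASES[first_info['log_base_idx']]
    return default_base

def log_uniform_dequantize(q, target_min, scale, base):
    """Assumed inverse of a log-domain uniform quantizer: undo the scaling,
    then apply the exponential that inverts the logarithm."""
    return np.power(base, q.astype(np.float64) / scale) + target_min - 1.0
```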
In some embodiments, if the inverse quantization mode for dequantizing the fixed point number type feature data of all the N channels is a nonlinear exponential uniform inverse quantization mode, the inverse quantization unit 220 is specifically configured to determine a first target feature value, a first target scaling value, a first target quantization bit width and a first exponent base according to the first information; and, according to the first target feature value, the first target scaling value, the first target quantization bit width and the first exponent base, dequantize the fixed point number type feature data of all the N channels by using the nonlinear exponential uniform inverse quantization mode.
The inverse quantization unit 220 is specifically configured to: parse the first information to obtain the first target feature value, the first target scaling value, the first target quantization bit width and the first exponent base; or, parse the first information to obtain the first target feature value, the first target scaling value, the first target quantization bit width and the indication information of the first exponent base, and determine the first exponent base from a plurality of preset exponent bases according to the indication information of the first exponent base; or, parse the first information to obtain the first target feature value, the first target scaling value and the first target quantization bit width, and determine a default exponent base as the first exponent base.
In some embodiments, the first target feature value is one feature value of feature data of all channels in the N channels, the first target scaling value is a scaling value corresponding to feature data of all channels in the N channels when quantized, and the first target quantization bit width is a quantization bit width corresponding to feature data of all channels in the N channels when quantized.
Optionally, the first target feature value is a minimum feature value in feature data of all channels in the N channels.
In some embodiments, if the inverse quantization mode for dequantizing the fixed point number type feature data of all the N channels is a table look-up inverse quantization mode, the inverse quantization unit 220 is specifically configured to determine a first correspondence between the index value of a quantization interval and the inverse quantization value of the quantization interval, where the first correspondence is determined based on the pre-quantization values and post-quantization values of the feature data of all the N channels; for each piece of fixed point number type feature data of all the N channels, use the value of the fixed point number type feature data as the index of a quantization interval, and query, in the first correspondence, the target inverse quantization value corresponding to the value of the fixed point number type feature data; and determine the target inverse quantization value as the floating point number type value of the fixed point number type feature data.
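For illustration, a minimal sketch of this look-up is shown below; it assumes the correspondence is materialized as an array whose i-th entry is the reconstruction value for interval i (for instance, the mean of the pre-quantization values that fell into that interval at the encoder), which is one plausible choice rather than a mandated one.

```python
import numpy as np

def build_dequant_table(pre_q_values, post_q_indices, num_intervals):
    """Hypothetical sketch: derive the interval index -> reconstruction value
    table from the pre- and post-quantization values seen at the encoder."""
    table = np.zeros(num_intervals, dtype=np.float32)
    for i in range(num_intervals):
        members = pre_q_values[post_q_indices == i]
        if members.size:
            table[i] = members.mean()      # interval mean, as one possible choice
    return table

def lookup_dequantize(q, dequant_table):
    """The fixed point values act directly as indices into the table."""
    return dequant_table[q]
```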
In some embodiments, if the inverse quantization mode corresponding to the channel is a linear uniform inverse quantization mode, the inverse quantization unit 220 is specifically configured to parse the first information to obtain a second target feature value, a second target scaling value and a second target quantization bit width; and, according to the second target feature value, the second target scaling value and the second target quantization bit width, dequantize the fixed point number type feature data of the channel by using the linear uniform inverse quantization mode.
In some embodiments, if the inverse quantization mode corresponding to the channel is a nonlinear logarithmic uniform inverse quantization mode, the inverse quantization unit 220 is specifically configured to determine a second target feature value, a second target scaling value, a second target quantization bit width and a second logarithmic base according to the first information; and, according to the second target feature value, the second target scaling value, the second target quantization bit width and the second logarithmic base, dequantize the fixed point number type feature data of the channel by using the nonlinear logarithmic uniform inverse quantization mode.
The inverse quantization unit 220 is specifically configured to: parse the first information to obtain the second target feature value, the second target scaling value, the second target quantization bit width and the second logarithmic base; or, parse the first information to obtain the second target feature value, the second target scaling value, the second target quantization bit width and the indication information of the second logarithmic base, and determine the second logarithmic base from a plurality of preset logarithmic bases according to the indication information of the second logarithmic base; or, parse the first information to obtain the second target feature value, the second target scaling value and the second target quantization bit width, and determine a default logarithmic base as the second logarithmic base.
In some embodiments, if the inverse quantization mode corresponding to the channel is a nonlinear exponential uniform inverse quantization mode, the inverse quantization unit 220 is specifically configured to determine a second target feature value, a second target scaling value, a second target quantization bit width and a second exponent base according to the first information; and, according to the second target feature value, the second target scaling value, the second target quantization bit width and the second exponent base, dequantize the fixed point number type feature data of the channel by using the nonlinear exponential uniform inverse quantization mode.
The inverse quantization unit 220 is specifically configured to: parse the first information to obtain the second target feature value, the second target scaling value, the second target quantization bit width and the second exponent base; or, parse the first information to obtain the second target feature value, the second target scaling value, the second target quantization bit width and the indication information of the second exponent base, and determine the second exponent base from a plurality of preset exponent bases according to the indication information of the second exponent base; or, parse the first information to obtain the second target feature value, the second target scaling value and the second target quantization bit width, and determine a default exponent base as the second exponent base.
In some embodiments, the second target feature value is one feature value in the feature data of the channel, the second target scaling value is the scaling value used for the feature data of the channel during quantization, and the second target quantization bit width is the quantization bit width used for the feature data of the channel during quantization.
Optionally, the second target feature value is a minimum feature value in feature data of the channel.
In some embodiments, if the inverse quantization mode corresponding to the channel is a table look-up inverse quantization mode, the inverse quantization unit 220 is specifically configured to determine a second correspondence between the index value of a quantization interval and the inverse quantization value of the quantization interval, where the second correspondence is determined based on the pre-quantization values and post-quantization values of the feature data of the channel; for each piece of fixed point number type feature data in the channel, use the value of the fixed point number type feature data as the index of a quantization interval, and query, in the second correspondence, the target inverse quantization value corresponding to the value of the fixed point number type feature data; and determine the target inverse quantization value as the floating point number type value of the fixed point number type feature data.
In some embodiments, if the inverse quantization mode corresponding to the group of channels is a linear uniform inverse quantization mode, the inverse quantization unit 220 is specifically configured to parse the first information to obtain a third target feature value, a third target scaling value and a third target quantization bit width; and, according to the third target feature value, the third target scaling value and the third target quantization bit width, dequantize the fixed point number type feature data of the group of channels by using the linear uniform inverse quantization mode.
In some embodiments, if the inverse quantization mode corresponding to the group of channels is a nonlinear logarithmic uniform inverse quantization mode, the inverse quantization unit 220 is specifically configured to determine a third target feature value, a third target scaling value, a third target quantization bit width and a third logarithmic base according to the first information; and, according to the third target feature value, the third target scaling value, the third target quantization bit width and the third logarithmic base, dequantize the fixed point number type feature data of the group of channels by using the nonlinear logarithmic uniform inverse quantization mode.
The inverse quantization unit 220 is specifically configured to: parse the first information to obtain the third target feature value, the third target scaling value, the third target quantization bit width and the third logarithmic base; or, parse the first information to obtain the third target feature value, the third target scaling value, the third target quantization bit width and the indication information of the third logarithmic base, and determine the third logarithmic base from a plurality of preset logarithmic bases according to the indication information of the third logarithmic base; or, parse the first information to obtain the third target feature value, the third target scaling value and the third target quantization bit width, and determine a default logarithmic base as the third logarithmic base.
In some embodiments, if the inverse quantization mode corresponding to the group of channels is a nonlinear exponential uniform inverse quantization mode, the inverse quantization unit 220 is specifically configured to determine a third target feature value, a third target scaling value, a third target quantization bit width and a third exponent base according to the first information; and, according to the third target feature value, the third target scaling value, the third target quantization bit width and the third exponent base, dequantize the fixed point number type feature data of the group of channels by using the nonlinear exponential uniform inverse quantization mode.
The inverse quantization unit 220 is specifically configured to: parse the first information to obtain the third target feature value, the third target scaling value, the third target quantization bit width and the third exponent base; or, parse the first information to obtain the third target feature value, the third target scaling value, the third target quantization bit width and the indication information of the third exponent base, and determine the third exponent base from a plurality of preset exponent bases according to the indication information of the third exponent base; or, parse the first information to obtain the third target feature value, the third target scaling value and the third target quantization bit width, and determine a default exponent base as the third exponent base.
In some embodiments, the third target feature value is one feature value in the feature data of the set of channels, the third target scaling value is a scaling value corresponding to the feature data of the set of channels at the time of quantization, and the third target quantization bit width is a quantization bit width corresponding to the feature data of the set of channels at the time of quantization.
Optionally, the third target feature value is a minimum feature value in the feature data of the set of channels.
In some embodiments, if the inverse quantization mode corresponding to the set of channels is a look-up table inverse quantization mode, the inverse quantization unit 220 is specifically configured to determine a third correspondence between the index value of the quantization interval and the inverse quantization value of the quantization interval, where the third correspondence is determined based on the pre-quantization value and the post-quantization value of the feature data of the set of channels;
for each fixed point type characteristic data in the group of channels, taking the value of the fixed point type characteristic data as an index of a quantization interval, and inquiring a target inverse quantization value corresponding to the value of the fixed point type characteristic data in the third corresponding relation;
and determining the target dequantization value as a floating point number type value of the fixed point number type characteristic data.
Optionally, the corresponding relation between the index value of the quantization interval and the inverse quantization value of the quantization interval is default; alternatively, the first information includes a correspondence between an index value of the quantization interval and an inverse quantization value of the quantization interval.
It should be understood that the apparatus embodiments and the method embodiments may correspond to each other, and similar descriptions may refer to the method embodiments. To avoid repetition, no further description is provided here. Specifically, the video decoder 20 shown in fig. 12 may perform the methods 700 to 1000 according to the embodiments of the present application, and the foregoing and other operations and/or functions of the respective units in the video decoder 20 are respectively for implementing the corresponding flows in the methods 700 to 1000, which are not described herein for brevity.
The apparatus and system of the embodiments of the present application are described above in terms of functional units with reference to the accompanying drawings. It should be understood that the functional units may be implemented in hardware, in software instructions, or in a combination of hardware and software units. Specifically, each step of the method embodiments in the embodiments of the present application may be completed by an integrated logic circuit of hardware in a processor and/or by instructions in software form, and the steps of the methods disclosed in connection with the embodiments of the present application may be directly performed by a hardware decoding processor, or performed by a combination of hardware and software units in a decoding processor. Optionally, the software units may reside in a storage medium well established in the art, such as a random access memory, a flash memory, a read-only memory, a programmable read-only memory, an electrically erasable programmable memory or a register. The storage medium is located in a memory, and the processor reads the information in the memory and completes the steps in the above method embodiments in combination with its hardware.
Fig. 13 is a schematic block diagram of an electronic device 30 provided by an embodiment of the present application.
As shown in fig. 13, the electronic device 30 may be a video encoder or a video decoder according to an embodiment of the present application, and the electronic device 30 may include:
a memory 33 and a processor 32, where the memory 33 is configured to store a computer program 34 and to transmit the computer program 34 to the processor 32. In other words, the processor 32 may call and run the computer program 34 from the memory 33 to implement the methods of the embodiments of the present application.
For example, the processor 32 may be configured to perform the steps of the methods described above in accordance with instructions in the computer program 34.
In some embodiments of the present application, the processor 32 may include, but is not limited to:
a general purpose processor, digital signal processor (Digital Signal Processor, DSP), application specific integrated circuit (Application Specific Integrated Circuit, ASIC), field programmable gate array (Field Programmable Gate Array, FPGA) or other programmable logic device, discrete gate or transistor logic device, discrete hardware components, or the like.
In some embodiments of the present application, the memory 33 includes, but is not limited to:
Volatile memory and/or nonvolatile memory. The nonvolatile memory may be a Read-Only Memory (ROM), a Programmable ROM (PROM), an Erasable PROM (EPROM), an Electrically Erasable PROM (EEPROM) or a flash memory. The volatile memory may be a Random Access Memory (RAM), which acts as an external cache. By way of example, and not limitation, many forms of RAM are available, such as Static RAM (SRAM), Dynamic RAM (DRAM), Synchronous DRAM (SDRAM), Double Data Rate SDRAM (DDR SDRAM), Enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM) and Direct Rambus RAM (DR RAM).
In some embodiments of the application, the computer program 34 may be partitioned into one or more units that are stored in the memory 33 and executed by the processor 32 to perform the methods provided by the application. The one or more elements may be a series of computer program instruction segments capable of performing the specified functions, which instruction segments describe the execution of the computer program 34 in the electronic device 30.
As shown in fig. 13, the electronic device 30 may further include:
a transceiver 33, the transceiver 33 being connectable to the processor 32 or the memory 33.
The processor 32 may control the transceiver 33 to communicate with other devices, and in particular, may send information or data to other devices or receive information or data sent by other devices. The transceiver 33 may include a transmitter and a receiver. The transceiver 33 may further include antennas, the number of which may be one or more.
It will be appreciated that the various components in the electronic device 30 are connected by a bus system that includes, in addition to a data bus, a power bus, a control bus, and a status signal bus.
Fig. 14 is a schematic block diagram of a video codec system 40 provided by an embodiment of the present application.
As shown in fig. 14, the video codec system 40 may include: a video encoder 41 and a video decoder 42, wherein the video encoder 41 is used for executing the video encoding method according to the embodiment of the present application, and the video decoder 42 is used for executing the video decoding method according to the embodiment of the present application.
The present application also provides a computer storage medium having stored thereon a computer program which, when executed by a computer, enables the computer to perform the method of the above-described method embodiments. Alternatively, embodiments of the present application also provide a computer program product comprising instructions which, when executed by a computer, cause the computer to perform the method of the method embodiments described above.
When implemented in software, the above embodiments may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, the flows or functions according to the embodiments of the present application are produced in whole or in part. The computer may be a general purpose computer, a special purpose computer, a computer network or other programmable apparatus. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another computer-readable storage medium; for example, the computer instructions may be transmitted from one website, computer, server or data center to another website, computer, server or data center in a wired (e.g., coaxial cable, optical fiber, digital subscriber line (DSL)) or wireless (e.g., infrared, radio, microwave) manner. The computer-readable storage medium may be any available medium accessible to a computer, or a data storage device such as a server or a data center integrating one or more available media. The available medium may be a magnetic medium (e.g., a floppy disk, a hard disk, a magnetic tape), an optical medium (e.g., a digital video disc (DVD)) or a semiconductor medium (e.g., a solid state disk (SSD)), etc.
Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
In the several embodiments provided by the present application, it should be understood that the disclosed systems, devices, and methods may be implemented in other manners. For example, the apparatus embodiments described above are merely illustrative, e.g., the division of the units is merely a logical function division, and there may be additional divisions when actually implemented, e.g., multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. Alternatively, the coupling or direct coupling or communication connection shown or discussed with each other may be an indirect coupling or communication connection via some interfaces, devices or units, which may be in electrical, mechanical or other form.
The units described as separate components may or may not be physically separate, and the components displayed as units may or may not be physical units; they may be located in one place or distributed over a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment. For example, the functional units in the various embodiments of the present application may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit.
The above is only a specific embodiment of the present application, but the protection scope of the present application is not limited thereto, and any person skilled in the art can easily think about changes or substitutions within the technical scope of the present application, and the changes and substitutions are intended to be covered by the protection scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (65)

  1. An image encoding method, comprising:
    acquiring a current image to be coded;
    inputting the current image into a neural network to obtain characteristic data of the current image, wherein the characteristic data of the current image comprises characteristic data of N channels, and N is a positive integer;
    Quantizing the characteristic data of at least one of the N channels;
    and encoding the quantized characteristic data of the at least one channel to obtain a code stream, wherein the code stream comprises first information, and the first information is used for indicating to dequantize the characteristic data of the at least one channel in the N channels.
  2. The method of claim 1, wherein the quantizing the characteristic data of at least one of the N channels comprises:
    and quantifying the feature data of the floating point number type of at least one channel in the N channels to obtain the feature data of the fixed point number type of the at least one channel.
  3. The method of claim 2, wherein the quantization mode for quantizing the floating point type of feature data of at least one of the N channels includes any one of: linear uniform quantization mode, nonlinear exponential uniform quantization mode, nonlinear logarithmic uniform quantization mode, and table look-up quantization mode.
  4. A method according to claim 2 or 3, wherein said quantifying feature data of the floating point type for at least one of said N channels comprises:
    Quantizing the feature data of the floating point number type of all channels in the N channels by using the same quantization mode; or,
    respectively quantizing the feature data of the floating point number type of each channel in the N channels by using one quantization mode;
    or grouping the N channels, and quantizing the feature data of the floating point number type of each group of channels by using a quantization mode.
  5. The method of claim 4, wherein if the quantization mode is a linear uniform quantization mode, the quantizing the floating point number type feature data of all channels in the N channels using the same quantization mode comprises:
    acquiring a preset first quantization bit width, and a first characteristic value and a second characteristic value in the floating point number type characteristic data of all channels in the N channels;
    and quantizing the feature data of the floating point number type of each of the N channels by using the linear uniform quantization mode according to the first feature value, the second feature value and the first quantization bit width.
  6. The method of claim 4, wherein if the quantization mode is a nonlinear logarithmic uniform quantization mode, the quantizing the floating point number type feature data of all the N channels using the same quantization mode comprises:
    Acquiring a first base number of a preset second quantization bit width and logarithmic function, and a first characteristic value and a second characteristic value in the floating point number type characteristic data of all channels in the N channels;
    and quantizing the feature data of the floating point number type of each of the N channels by using the nonlinear logarithmic uniform quantization mode according to the first feature value and the second feature value, the second quantization bit width and the first base of the logarithmic function.
  7. The method of claim 4, wherein if the quantization mode is a nonlinear exponential uniform quantization mode, the quantizing the floating point number type feature data of all the N channels using the same quantization mode comprises:
    acquiring a first base number of a preset third quantization bit width and an exponential function, and a first characteristic value and a second characteristic value in the floating point number type characteristic data of all channels in the N channels;
    and quantizing the feature data of the floating point number type of each of the N channels by using the nonlinear exponential uniform quantization mode according to the first feature value, the second feature value, the third quantization bit width and the first base of the exponential function.
  8. The method of claim 4, wherein if the quantization mode is a look-up table quantization mode, the quantizing the floating point type feature data of all channels in the N channels using the same quantization mode comprises:
    sorting the feature data of the floating point number type of all channels in the N channels according to the value, and obtaining first feature data after sorting;
    dividing the ordered first characteristic data into a plurality of first quantization intervals, wherein each first quantization interval comprises the same number of characteristic data;
    for each of the first quantization intervals, quantizing values of the feature data within the first quantization interval to index values of the first quantization interval.
  9. The method of any of claims 5-7, wherein the first characteristic value is a minimum characteristic value in the floating point number type characteristic data of all channels in the N channels, and the second characteristic value is a maximum characteristic value in the floating point number type characteristic data of all channels in the N channels.
  10. The method of claim 4, wherein if the quantization mode is a linear uniform quantization mode, the quantizing the floating point number type feature data of each of the N channels by using one quantization mode, respectively, comprises:
    Acquiring a preset fourth quantization bit width, and a third characteristic value and a fourth characteristic value in floating point number type characteristic data of each channel in the N channels;
    and quantizing the floating point number type characteristic data of the channel by using the linear uniform quantization mode according to the third characteristic value, the fourth characteristic value and the fourth quantization bit width.
  11. The method of claim 4, wherein if the quantization mode is a nonlinear logarithmic uniform quantization mode, the quantizing the floating point number type feature data of each of the N channels by using one quantization mode, respectively, comprises:
    for each channel in the N channels, acquiring a second base number of a preset fifth quantization bit width and logarithmic function, and a third characteristic value and a fourth characteristic value in the characteristic data of the floating point number type of the channel;
    and quantizing the floating point number type characteristic data of the channel by using the nonlinear logarithmic uniform quantization mode according to the third characteristic value, the fourth characteristic value, the fifth quantization bit width and the second base of the logarithmic function.
  12. The method of claim 4, wherein if the quantization mode is a nonlinear exponential uniform quantization mode, the quantizing the floating point type feature data of each of the N channels by using one quantization mode, respectively, comprises:
    For each channel in the N channels, acquiring a preset second base number of a sixth quantization bit width and an exponential function, and a third characteristic value and a fourth characteristic value in the floating point number type characteristic data of the channel;
    and quantizing the floating point number type characteristic data of the channel by using the nonlinear exponential uniform quantization mode according to the third characteristic value, the fourth characteristic value, the sixth quantization bit width and the second base of the exponential function.
  13. The method of claim 4, wherein if the quantization mode is a look-up table quantization mode, the quantizing the floating point type feature data of each of the N channels by using one quantization mode, respectively, comprises:
    aiming at each channel in the N channels, sorting the feature data of the floating point number type of the channel according to the value, and obtaining second feature data after sorting under the channel;
    dividing the ordered second characteristic data under the channel into a plurality of second quantization intervals, wherein each second quantization interval comprises the same number of characteristic data;
    for each of the second quantization intervals, quantizing values of the feature data within the second quantization interval to index values of the second quantization interval.
  14. The method of any of claims 10-12, wherein the third characteristic value is a maximum characteristic value in the floating point number type characteristic data of the channel, and the fourth characteristic value is a minimum characteristic value in the floating point number type characteristic data of the channel.
  15. The method of claim 4, wherein if the quantization mode is a linear uniform quantization mode, the quantizing the floating point number type of feature data of each group of channels by using one quantization mode, respectively, comprises:
    acquiring a preset seventh quantization bit width, and a fifth characteristic value and a sixth characteristic value in the floating point number type characteristic data of each group of channels;
    and quantizing the floating point number type characteristic data of each channel in the group of channels by using the linear uniform quantization mode according to the fifth characteristic value, the sixth characteristic value and the seventh quantization bit width.
  16. The method of claim 4, wherein if the quantization mode is a nonlinear logarithmic uniform quantization mode, the quantizing the floating point number type of feature data of each group of channels using one quantization mode, respectively, comprises:
    For each group of channels, acquiring a third base number of a preset eighth quantization bit width and logarithmic function, and a fifth characteristic value and a sixth characteristic value in the floating point number type characteristic data of the group of channels;
    and quantizing the floating point number type characteristic data of each channel in the group of channels by using the nonlinear logarithm uniform quantization mode according to the fifth characteristic value and the sixth characteristic value and the eighth quantization bit width and the third base of the logarithmic function.
  17. The method of claim 4, wherein if the quantization mode is a nonlinear exponential uniform quantization mode, the quantizing the floating point number type of feature data of each group of channels by using one quantization mode, respectively, comprises:
    for each group of channels, acquiring a third base number of a preset ninth quantized bit width and an exponential function, and a fifth characteristic value and a sixth characteristic value in the floating point number type characteristic data of the group of channels;
    and quantizing the floating point number type characteristic data of each channel in the group of channels by using the nonlinear exponential uniform quantization mode according to the fifth characteristic value, the sixth characteristic value, the ninth quantization bit width and the third base of the exponential function.
  18. The method of claim 4, wherein if the quantization mode is a look-up table quantization mode, the quantizing the floating point number type of feature data of each group of channels by using one quantization mode, respectively, comprises:
    aiming at each group of channels, sorting the feature data of the floating point number type of the group of channels according to the value, and obtaining third feature data of the group of channels after sorting;
    dividing the third feature data of the group of channels after the ordering into a plurality of third quantization intervals, wherein each third quantization interval comprises the same number of feature data;
    for each of the third quantization intervals, quantizing values of the feature data within the third quantization interval to index values of the third quantization interval.
  19. The method of any of claims 15-17, wherein the fifth characteristic value is a maximum characteristic value in the floating point number type characteristic data of the group of channels, and the sixth characteristic value is a minimum characteristic value in the floating point number type characteristic data of the group of channels.
  20. The method according to any of claims 2-19, wherein the first information indicates dequantizing feature data of a fixed point number type for all of the N channels; or,
    The first information indicates that feature data of a fixed point number type of each of the N channels is respectively dequantized; or,
    the first information indicates that feature data of a fixed point number type of each of M groups of channels is respectively dequantized, wherein the M groups of channels are obtained by grouping the N channels, and each group of channels comprises at least one channel of the N channels.
  21. The method according to any of claims 2-20, wherein the inverse quantization used in inverse quantizing the fixed point number type of feature data of the at least one channel comprises any of: a linear uniform inverse quantization mode, a nonlinear exponential uniform inverse quantization mode, a nonlinear logarithmic uniform inverse quantization mode and a table lookup inverse quantization mode.
  22. The method according to any of claims 2-21, wherein the first information comprises at least one parameter required for dequantizing fixed point number type feature data of the at least one channel.
  23. The method of claim 22, wherein the first information indicates that feature data of a fixed point number type for all of the N channels is dequantized, and the first information includes any one of:
    if the inverse quantization mode for inverse-quantizing the fixed point number type feature data of all the N channels is a linear uniform inverse quantization mode, the first information includes a first target feature value, a first target scaling value and a first target quantization bit width;
    if the inverse quantization mode for inverse-quantizing the fixed point number type feature data of all the N channels is a nonlinear logarithmic uniform inverse quantization mode, the first information includes a first target feature value, a first target scaling value and a first target quantization bit width; or the first information includes a first target feature value, a first target scaling value, a first target quantization bit width and a first logarithmic base; or the first information includes a first target feature value, a first target scaling value, a first target quantization bit width and indication information of the first logarithmic base;
    if the inverse quantization mode for inverse-quantizing the fixed point number type feature data of all the N channels is a nonlinear exponential uniform inverse quantization mode, the first information includes a first target feature value, a first target scaling value and a first target quantization bit width; or the first information includes a first target feature value, a first target scaling value, a first target quantization bit width and a first exponential base; or the first information includes a first target feature value, a first target scaling value, a first target quantization bit width and indication information of the first exponential base;
    if the inverse quantization mode for inverse-quantizing the fixed point number type feature data of all the N channels is a table look-up inverse quantization mode, the first information includes a first correspondence between the index value of a quantization interval and the inverse quantization value of the quantization interval, where the first correspondence is determined based on the pre-quantization values and post-quantization values of the feature data of all the N channels;
    where the first target feature value is one feature value in the feature data of all channels in the N channels, the first target scaling value is the scaling value used for the feature data of all channels in the N channels during quantization, and the first target quantization bit width is the quantization bit width used for the feature data of all channels in the N channels during quantization.
  24. The method of claim 23, wherein the first target feature value is the minimum value of the feature data of all channels in the N channels.
  25. The method of claim 22, wherein the first information indicates that feature data of a fixed point number type of each of the N channels is respectively dequantized, and for each channel, the first information includes any one of:
    if the inverse quantization mode for inverse-quantizing the fixed point number type feature data of the channel is a linear uniform inverse quantization mode, the first information includes a second target feature value, a second target scaling value and a second target quantization bit width;
    if the inverse quantization mode for inverse-quantizing the fixed point number type feature data of the channel is a nonlinear logarithmic uniform inverse quantization mode, the first information includes a second target feature value, a second target scaling value and a second target quantization bit width; or the first information includes a second target feature value, a second target scaling value, a second target quantization bit width and a second logarithmic base; or the first information includes a second target feature value, a second target scaling value, a second target quantization bit width and indication information of the second logarithmic base;
    if the inverse quantization mode for inverse-quantizing the fixed point number type feature data of the channel is a nonlinear exponential uniform inverse quantization mode, the first information includes a second target feature value, a second target scaling value and a second target quantization bit width; or the first information includes a second target feature value, a second target scaling value, a second target quantization bit width and a second exponent base; or the first information includes a second target feature value, a second target scaling value, a second target quantization bit width and indication information of the second exponent base;
    if the inverse quantization mode for inverse-quantizing the fixed point number type feature data of the channel is a table look-up inverse quantization mode, the first information includes a second correspondence between the index value of a quantization interval and the inverse quantization value of the quantization interval, where the second correspondence is determined based on the pre-quantization values and post-quantization values of the feature data of the channel;
    where the second target feature value is one feature value in the feature data of the channel, the second target scaling value is the scaling value used for the feature data of the channel during quantization, and the second target quantization bit width is the quantization bit width used for the feature data of the channel during quantization.
  26. The method of claim 25, wherein the second target feature value is the minimum value of the feature data of the channel.
  27. The method of claim 22, wherein, when the first information indicates that the fixed point number type feature data of M groups of channels is dequantized separately, the first information includes, for each group of channels, any one of the following:
    if the inverse quantization mode for inverse quantizing the fixed point number type feature data of the group of channels is a linear uniform inverse quantization mode, the first information includes a third target feature value, a third target scaling value, and a third target quantization bit width;
    if the inverse quantization mode for inverse quantizing the fixed point number type feature data of the group of channels is a nonlinear logarithmic uniform inverse quantization mode, the first information includes a third target feature value, a third target scaling value, and a third target quantization bit width; or the first information includes a third target feature value, a third target scaling value, a third target quantization bit width, and a third logarithmic base; or the first information includes a third target feature value, a third target scaling value, a third target quantization bit width, and indication information of the third logarithmic base;
    if the inverse quantization mode for inverse quantizing the fixed point number type feature data of the group of channels is a nonlinear exponential uniform inverse quantization mode, the first information includes a third target feature value, a third target scaling value, and a third target quantization bit width; or the first information includes a third target feature value, a third target scaling value, a third target quantization bit width, and a third exponent base; or the first information includes a third target feature value, a third target scaling value, a third target quantization bit width, and indication information of the third exponent base;
    if the inverse quantization mode for inverse quantizing the fixed point number type feature data of the group of channels is a table look-up inverse quantization mode, the first information includes a third correspondence between index values of quantization intervals and inverse quantization values of the quantization intervals, the third correspondence being determined based on pre-quantization values and post-quantization values of the feature data of the group of channels;
    wherein the M groups of channels are obtained by grouping the N channels, each group of channels includes at least one of the N channels, the third target feature value is one feature value in the feature data of the group of channels, the third target scaling value is a scaling value corresponding to the feature data of the group of channels during quantization, and the third target quantization bit width is a quantization bit width corresponding to the feature data of the group of channels during quantization.
  28. The method of claim 27, wherein the third target feature value is the minimum feature value in the feature data of the group of channels.
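Claims 27-28 fix what is signalled per group, not how the groups are formed or quantized. As one hedged illustration, the sketch below groups channels by contiguous index (a pure assumption) and derives one (minimum, scaling value, bit width) triple per group, reusing linear_uniform_quantize from the earlier sketch; the list of parameter triples is what the first information would then carry.

```python
import numpy as np

def quantize_per_group(features: np.ndarray, bit_width: int, group_size: int):
    """features has shape (N, H, W); channels are grouped by contiguous
    index (an assumption) and each group gets its own parameter triple."""
    quantized, params = [], []
    for start in range(0, features.shape[0], group_size):
        block = features[start:start + group_size]
        q, x_min, scale, bw = linear_uniform_quantize(block, bit_width)
        quantized.append(q)
        params.append((x_min, scale, bw))  # one triple per group, signalled
    return np.concatenate(quantized, axis=0), params
```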
  29. The method according to any one of claims 2-28, wherein the code stream further comprises second information indicating an inverse quantization mode used in inverse quantizing the fixed point number type feature data of the at least one channel.
  30. An image decoding method, comprising:
    decoding a code stream to obtain fixed point number type feature data of a current image, wherein the feature data of the current image comprises feature data of N channels, and N is a positive integer;
    decoding the code stream to obtain first information, wherein the first information indicates that feature data of at least one of the N channels is to be dequantized;
    and dequantizing the feature data of the at least one channel according to the first information.
  31. The method of claim 30, wherein the dequantizing the feature data of the at least one channel according to the first information comprises:
    dequantizing the fixed point number type feature data of the at least one channel according to the first information to obtain floating point number type feature data of the at least one channel.
  32. The method of claim 31, wherein the inverse quantization mode used in inverse quantizing the fixed point number type feature data of the at least one channel comprises any one of: a linear uniform inverse quantization mode, a nonlinear exponential uniform inverse quantization mode, a nonlinear logarithmic uniform inverse quantization mode, and a table look-up inverse quantization mode.
  33. The method according to claim 31 or 32, wherein the dequantizing the fixed point number type feature data of the at least one channel according to the first information comprises:
    dequantizing the fixed point number type feature data of the at least one channel using a default inverse quantization mode according to the first information.
  34. The method according to claim 31 or 32, wherein the code stream further comprises second information indicating an inverse quantization mode used in inverse quantizing the fixed point number type feature data of the at least one channel, and wherein the dequantizing the fixed point number type feature data of the at least one channel according to the first information comprises:
    dequantizing the fixed point number type feature data of the at least one channel using the inverse quantization mode indicated by the second information according to the first information.
  35. The method according to claim 31 or 34, wherein the first information comprises at least one parameter required for dequantizing the fixed point number type feature data of the at least one channel.
  36. The method according to any one of claims 31-35, wherein the dequantizing the fixed point number type feature data of the at least one channel according to the first information comprises:
    if the first information indicates that the fixed point number type feature data of all of the N channels is dequantized, dequantizing the fixed point number type feature data of all of the N channels using the same inverse quantization mode; or,
    if the first information indicates that the fixed point number type feature data of each of the N channels is dequantized separately, dequantizing the fixed point number type feature data of each channel using the inverse quantization mode corresponding to that channel; or,
    if the first information indicates that the fixed point number type feature data of M groups of channels is dequantized separately, dividing the N channels into the M groups of channels and, for each group of channels, dequantizing the fixed point number type feature data of the group of channels using the inverse quantization mode corresponding to that group of channels.
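The three branches of claim 36 amount to a dispatch on the granularity signalled by the first information. A minimal sketch of that dispatch follows; the dictionary layout of first_info and the helper dequant_one (standing in for whichever single-tensor inverse quantization mode applies) are hypothetical names introduced here for illustration only.

```python
import numpy as np

def dequantize_features(q: np.ndarray, first_info: dict) -> np.ndarray:
    """Dispatch on the granularity signalled by the (hypothetical)
    first_info dict: "all", "per_channel", or "per_group"."""
    granularity = first_info["granularity"]
    if granularity == "all":
        # one parameter set shared by all N channels
        return dequant_one(q, first_info["params"])
    if granularity == "per_channel":
        # one parameter set per channel
        return np.stack([dequant_one(q[c], p)
                         for c, p in enumerate(first_info["params"])])
    # "per_group": each entry pairs a list of channel indices with one
    # parameter set shared by that group
    out = np.empty(q.shape, dtype=np.float32)
    for channel_idx, group_params in first_info["params"]:
        out[channel_idx] = dequant_one(q[channel_idx], group_params)
    return out
```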
  37. The method of claim 36, wherein if the inverse quantization mode for inverse quantizing the fixed point number type feature data of all of the N channels is a linear uniform inverse quantization mode, the dequantizing the fixed point number type feature data of all of the N channels using the same inverse quantization mode comprises:
    parsing the first information to obtain a first target feature value, a first target scaling value, and a first target quantization bit width;
    and dequantizing the fixed point number type feature data of all of the N channels using the linear uniform inverse quantization mode according to the first target feature value, the first target scaling value, and the first target quantization bit width.
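A hedged sketch of the linear uniform inverse quantization step of claim 37, assuming it inverts the round-and-scale quantizer sketched after claim 24; the formula q / scale + x_min is an assumption consistent with that sketch, not a formula quoted from the claims.

```python
import numpy as np

def linear_uniform_dequantize(q: np.ndarray, x_min: float, scale: float,
                              bit_width: int) -> np.ndarray:
    """Linear uniform inverse quantization: fixed point -> floating point.
    x_min, scale, and bit_width are the first target feature value, scaling
    value, and quantization bit width parsed from the first information."""
    assert int(q.max()) < (1 << bit_width), "value exceeds the quantization bit width"
    return q.astype(np.float32) / scale + x_min
```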
  38. The method of claim 36, wherein if the inverse quantization mode for inverse quantizing the fixed point number type feature data of all of the N channels is a nonlinear logarithmic uniform inverse quantization mode, the dequantizing the fixed point number type feature data of all of the N channels using the same inverse quantization mode comprises:
    determining a first target feature value, a first target scaling value, a first target quantization bit width, and a first logarithmic base according to the first information;
    and dequantizing the fixed point number type feature data of all of the N channels using the nonlinear logarithmic uniform inverse quantization mode according to the first target feature value, the first target scaling value, the first target quantization bit width, and the first logarithmic base.
  39. The method of claim 38, wherein the determining a first target feature value, a first target scaling value, a first target quantization bit width, and a first logarithmic base according to the first information comprises:
    parsing the first information to obtain the first target feature value, the first target scaling value, the first target quantization bit width, and the first logarithmic base; or,
    parsing the first information to obtain the first target feature value, the first target scaling value, the first target quantization bit width, and indication information of the first logarithmic base, and determining the first logarithmic base from a plurality of preset logarithmic bases according to the indication information of the first logarithmic base; or,
    parsing the first information to obtain the first target feature value, the first target scaling value, and the first target quantization bit width, and determining a default logarithmic base as the first logarithmic base.
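Claims 38-39 do not give the logarithmic mapping itself, so the sketch below adopts one plausible form: the encoder is assumed to have uniformly quantized log_base(x - x_min + 1), which the decoder inverts. The preset base table, the default base, and the function signature are all assumptions; the branches of the base argument mirror the three parsing options of claim 39.

```python
import numpy as np

PRESET_LOG_BASES = (2.0, 2.718281828459045, 10.0)  # assumed preset table
DEFAULT_LOG_BASE = 2.0                             # assumed default base

def log_uniform_dequantize(q, x_min, scale, base=None, base_idx=None):
    """Nonlinear logarithmic uniform inverse quantization (one plausible
    form): invert q = round(scale * log_base(x - x_min + 1))."""
    if base is None:  # base signalled by a preset index, or defaulted (claim 39)
        base = PRESET_LOG_BASES[base_idx] if base_idx is not None else DEFAULT_LOG_BASE
    return np.power(base, q.astype(np.float32) / scale) - 1.0 + x_min
```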
  40. The method of claim 36, wherein if the inverse quantization mode for inverse quantizing the fixed point number type feature data of all of the N channels is a nonlinear exponential uniform inverse quantization mode, the dequantizing the fixed point number type feature data of all of the N channels using the same inverse quantization mode comprises:
    determining a first target feature value, a first target scaling value, a first target quantization bit width, and a first exponent base according to the first information;
    and dequantizing the fixed point number type feature data of all of the N channels using the nonlinear exponential uniform inverse quantization mode according to the first target feature value, the first target scaling value, the first target quantization bit width, and the first exponent base.
  41. The method of claim 40, wherein the determining a first target feature value, a first target scaling value, a first target quantization bit width, and a first exponent base according to the first information comprises:
    parsing the first information to obtain the first target feature value, the first target scaling value, the first target quantization bit width, and the first exponent base; or,
    parsing the first information to obtain the first target feature value, the first target scaling value, the first target quantization bit width, and indication information of the first exponent base, and determining the first exponent base from a plurality of preset exponent bases according to the indication information of the first exponent base; or,
    parsing the first information to obtain the first target feature value, the first target scaling value, and the first target quantization bit width, and determining a default exponent base as the first exponent base.
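As with the logarithmic mode, claims 40-41 leave the exponential mapping unspecified. One plausible construction, assumed here purely for illustration, is that the encoder uniformly quantized base**(x - x_min) - 1, so the decoder returns to the feature domain with a logarithm; the signature and the default base of 2 are likewise assumptions.

```python
import numpy as np

def exp_uniform_dequantize(q, x_min, scale, base=2.0):
    """Nonlinear exponential uniform inverse quantization (one plausible
    form): invert q = round(scale * (base ** (x - x_min) - 1))."""
    lin = q.astype(np.float32) / scale           # back to base**(x - x_min) - 1
    return x_min + np.log1p(lin) / np.log(base)  # logarithm returns to x
```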
  42. The method of any one of claims 37-41, wherein the first target feature value is one feature value in the feature data of all of the N channels, the first target scaling value is a scaling value corresponding to the feature data of all of the N channels during quantization, and the first target quantization bit width is a quantization bit width corresponding to the feature data of all of the N channels during quantization.
  43. The method of claim 42, wherein the first target feature value is the minimum feature value in the feature data of all of the N channels.
  44. The method of claim 36, wherein if the inverse quantization mode for inverse quantizing the fixed point number type feature data of all of the N channels is a table look-up inverse quantization mode, the dequantizing the fixed point number type feature data of all of the N channels using the same inverse quantization mode comprises:
    determining a first correspondence between index values of quantization intervals and inverse quantization values of the quantization intervals, the first correspondence being determined based on pre-quantization values and post-quantization values of the feature data of all of the N channels;
    for each fixed point number type feature value of all of the N channels, taking the value of the fixed point number type feature data as the index of its quantization interval, and querying the first correspondence for the target inverse quantization value corresponding to that value;
    and determining the target inverse quantization value as the floating point number type value of the fixed point number type feature data.
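A hedged sketch of the table look-up mode of claim 44. The claims only say the correspondence maps a quantization-interval index to an inverse quantization value and is derived from pre- and post-quantization values; taking the per-interval mean of the original values as that inverse quantization value is an assumption made here for illustration.

```python
import numpy as np

def build_lookup_table(pre_quant: np.ndarray, post_quant: np.ndarray) -> dict:
    """First correspondence: map each quantization-interval index to an
    inverse quantization value (assumed here: the mean of the original
    values that fell into that interval)."""
    return {int(idx): float(pre_quant[post_quant == idx].mean())
            for idx in np.unique(post_quant)}

def lookup_dequantize(q: np.ndarray, table: dict) -> np.ndarray:
    """Each fixed point value indexes its quantization interval; the table
    yields the floating point reconstruction."""
    flat = np.array([table[int(v)] for v in q.ravel()], dtype=np.float32)
    return flat.reshape(q.shape)
```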
  45. The method of claim 36, wherein if the inverse quantization mode corresponding to the channel is a linear uniform inverse quantization mode, the dequantizing the fixed point number type feature data of the channel using the inverse quantization mode corresponding to the channel comprises:
    parsing the first information to obtain a second target feature value, a second target scaling value, and a second target quantization bit width;
    and dequantizing the fixed point number type feature data of the channel using the linear uniform inverse quantization mode according to the second target feature value, the second target scaling value, and the second target quantization bit width.
  46. The method of claim 36, wherein if the inverse quantization mode corresponding to the channel is a nonlinear logarithmic uniform inverse quantization mode, the dequantizing the fixed point number type feature data of the channel using the inverse quantization mode corresponding to the channel comprises:
    determining a second target feature value, a second target scaling value, a second target quantization bit width, and a second logarithmic base according to the first information;
    and dequantizing the fixed point number type feature data of the channel using the nonlinear logarithmic uniform inverse quantization mode according to the second target feature value, the second target scaling value, the second target quantization bit width, and the second logarithmic base.
  47. The method of claim 46, wherein the determining a second target feature value, a second target scaling value, a second target quantization bit width, and a second logarithmic base according to the first information comprises:
    parsing the first information to obtain the second target feature value, the second target scaling value, the second target quantization bit width, and the second logarithmic base; or,
    parsing the first information to obtain the second target feature value, the second target scaling value, the second target quantization bit width, and indication information of the second logarithmic base, and determining the second logarithmic base from a plurality of preset logarithmic bases according to the indication information of the second logarithmic base; or,
    parsing the first information to obtain the second target feature value, the second target scaling value, and the second target quantization bit width, and determining a default logarithmic base as the second logarithmic base.
  48. The method of claim 36, wherein if the inverse quantization mode corresponding to the channel is a nonlinear exponential uniform inverse quantization mode, the dequantizing the fixed point number type feature data of the channel using the inverse quantization mode corresponding to the channel comprises:
    determining a second target feature value, a second target scaling value, a second target quantization bit width, and a second exponent base according to the first information;
    and dequantizing the fixed point number type feature data of the channel using the nonlinear exponential uniform inverse quantization mode according to the second target feature value, the second target scaling value, the second target quantization bit width, and the second exponent base.
  49. The method of claim 48, wherein the determining a second target feature value, a second target scaling value, a second target quantization bit width, and a second exponent base according to the first information comprises:
    parsing the first information to obtain the second target feature value, the second target scaling value, the second target quantization bit width, and the second exponent base; or,
    parsing the first information to obtain the second target feature value, the second target scaling value, the second target quantization bit width, and indication information of the second exponent base, and determining the second exponent base from a plurality of preset exponent bases according to the indication information of the second exponent base; or,
    parsing the first information to obtain the second target feature value, the second target scaling value, and the second target quantization bit width, and determining a default exponent base as the second exponent base.
  50. The method according to any one of claims 45-49, wherein the second target feature value is one feature value in the feature data of the channel, the second target scaling value is a scaling value corresponding to the feature data of the channel during quantization, and the second target quantization bit width is a quantization bit width corresponding to the feature data of the channel during quantization.
  51. The method of claim 50, wherein the second target feature value is the minimum feature value in the feature data of the channel.
  52. The method of claim 36, wherein if the inverse quantization mode corresponding to the channel is a table look-up inverse quantization mode, the dequantizing the fixed point number type feature data of the channel using the inverse quantization mode corresponding to the channel comprises:
    determining a second correspondence between index values of quantization intervals and inverse quantization values of the quantization intervals, the second correspondence being determined based on pre-quantization values and post-quantization values of the feature data of the channel;
    for each fixed point number type feature value of the channel, taking the value of the fixed point number type feature data as the index of its quantization interval, and querying the second correspondence for the target inverse quantization value corresponding to that value;
    and determining the target inverse quantization value as the floating point number type value of the fixed point number type feature data.
  53. The method of claim 36, wherein if the inverse quantization mode corresponding to the group of channels is a linear uniform inverse quantization mode, the dequantizing the fixed point number type feature data of the group of channels using the inverse quantization mode corresponding to the group of channels comprises:
    parsing the first information to obtain a third target feature value, a third target scaling value, and a third target quantization bit width;
    and dequantizing the fixed point number type feature data of the group of channels using the linear uniform inverse quantization mode according to the third target feature value, the third target scaling value, and the third target quantization bit width.
  54. The method of claim 36, wherein if the inverse quantization mode corresponding to the group of channels is a nonlinear logarithmic uniform inverse quantization mode, the dequantizing the fixed point number type feature data of the group of channels using the inverse quantization mode corresponding to the group of channels comprises:
    determining a third target feature value, a third target scaling value, a third target quantization bit width, and a third logarithmic base according to the first information;
    and dequantizing the fixed point number type feature data of the group of channels using the nonlinear logarithmic uniform inverse quantization mode according to the third target feature value, the third target scaling value, the third target quantization bit width, and the third logarithmic base.
  55. The method of claim 54, wherein the determining a third target feature value, a third target scaling value, a third target quantization bit width, and a third logarithmic base according to the first information comprises:
    parsing the first information to obtain the third target feature value, the third target scaling value, the third target quantization bit width, and the third logarithmic base; or,
    parsing the first information to obtain the third target feature value, the third target scaling value, the third target quantization bit width, and indication information of the third logarithmic base, and determining the third logarithmic base from a plurality of preset logarithmic bases according to the indication information of the third logarithmic base; or,
    parsing the first information to obtain the third target feature value, the third target scaling value, and the third target quantization bit width, and determining a default logarithmic base as the third logarithmic base.
  56. The method of claim 36, wherein if the inverse quantization mode corresponding to the group of channels is a nonlinear exponential uniform inverse quantization mode, the dequantizing the fixed point number type feature data of the group of channels using the inverse quantization mode corresponding to the group of channels comprises:
    determining a third target feature value, a third target scaling value, a third target quantization bit width, and a third exponent base according to the first information;
    and dequantizing the fixed point number type feature data of the group of channels using the nonlinear exponential uniform inverse quantization mode according to the third target feature value, the third target scaling value, the third target quantization bit width, and the third exponent base.
  57. The method of claim 56, wherein the determining a third target feature value, a third target scaling value, a third target quantization bit width, and a third exponent base according to the first information comprises:
    parsing the first information to obtain the third target feature value, the third target scaling value, the third target quantization bit width, and the third exponent base; or,
    parsing the first information to obtain the third target feature value, the third target scaling value, the third target quantization bit width, and indication information of the third exponent base, and determining the third exponent base from a plurality of preset exponent bases according to the indication information of the third exponent base; or,
    parsing the first information to obtain the third target feature value, the third target scaling value, and the third target quantization bit width, and determining a default exponent base as the third exponent base.
  58. The method of any one of claims 53-57, wherein the third target feature value is one feature value in the feature data of the group of channels, the third target scaling value is a scaling value corresponding to the feature data of the group of channels during quantization, and the third target quantization bit width is a quantization bit width corresponding to the feature data of the group of channels during quantization.
  59. The method of claim 58, wherein the third target feature value is the minimum feature value in the feature data of the group of channels.
  60. The method of claim 36, wherein if the inverse quantization mode corresponding to the group of channels is a table look-up inverse quantization mode, the dequantizing the fixed point number type feature data of the group of channels using the inverse quantization mode corresponding to the group of channels comprises:
    determining a third correspondence between index values of quantization intervals and inverse quantization values of the quantization intervals, the third correspondence being determined based on pre-quantization values and post-quantization values of the feature data of the group of channels;
    for each fixed point number type feature value of the group of channels, taking the value of the fixed point number type feature data as the index of its quantization interval, and querying the third correspondence for the target inverse quantization value corresponding to that value;
    and determining the target inverse quantization value as the floating point number type value of the fixed point number type feature data.
  61. The method of claim 44, 52, or 60, wherein the target correspondence between the index value of the quantization interval and the inverse quantization value of the quantization interval is a default correspondence; or the first information includes the target correspondence between the index value of the quantization interval and the inverse quantization value of the quantization interval, wherein the target correspondence includes the first correspondence, the second correspondence, or the third correspondence.
  62. An image encoder, comprising:
    an acquisition unit configured to acquire a current image to be encoded;
    a feature extraction unit configured to input the current image into a neural network to obtain feature data of the current image, wherein the feature data of the current image comprises feature data of N channels, and N is a positive integer;
    a quantization unit configured to quantize feature data of at least one channel of the N channels;
    and an encoding unit configured to encode the quantized feature data of the at least one channel to obtain a code stream, wherein the code stream comprises first information indicating that the feature data of at least one of the N channels is to be dequantized.
  63. An image decoder, comprising:
    a decoding unit configured to decode a code stream to obtain feature data of a current image, wherein the feature data of the current image comprises feature data of N channels, and N is a positive integer, and to decode the code stream to obtain first information indicating that feature data of at least one of the N channels is to be dequantized;
    and an inverse quantization unit configured to dequantize the feature data of the at least one channel according to the first information.
  64. An image codec system, comprising:
    the image encoder of claim 62;
    and the image decoder of claim 63.
  65. A computer readable storage medium having stored therein computer executable instructions which, when executed by a processor, implement the method of any one of claims 1 to 61.
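To tie the sketches together, a short usage example follows: it quantizes a stand-in feature map with the encoder-side sketch after claim 24, dequantizes it with the decoder-side sketch after claim 37, and checks the reconstruction error, which for the assumed linear uniform scheme is bounded by half a quantization step. The random feature map and the 8-bit width are illustrative choices, not values taken from the claims.

```python
import numpy as np

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    feats = rng.normal(size=(8, 16, 16)).astype(np.float32)  # stand-in feature map
    q, x_min, scale, bw = linear_uniform_quantize(feats, bit_width=8)
    rec = linear_uniform_dequantize(q, x_min, scale, bw)
    # for linear uniform quantization the error is bounded by 0.5 / scale
    print("max abs error:", float(np.abs(rec - feats).max()))
```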