
CN113660498B - Inter-frame image universal coding method and system based on significance detection - Google Patents

Inter-frame image universal coding method and system based on significance detection

Info

Publication number
CN113660498B
CN113660498B (application CN202111218449.1A)
Authority
CN
China
Prior art keywords
coding tree
tree unit
preset
region
moving object
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202111218449.1A
Other languages
Chinese (zh)
Other versions
CN113660498A (en)
Inventor
蒋先涛
蔡佩华
张纪庄
郭咏梅
郭咏阳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ningbo Kangda kaineng Medical Technology Co.,Ltd.
Original Assignee
Kangda Intercontinental Medical Devices Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Kangda Intercontinental Medical Devices Co ltd filed Critical Kangda Intercontinental Medical Devices Co ltd
Priority to CN202111218449.1A priority Critical patent/CN113660498B/en
Publication of CN113660498A publication Critical patent/CN113660498A/en
Application granted granted Critical
Publication of CN113660498B publication Critical patent/CN113660498B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/90Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using coding techniques not provided for in groups H04N19/10-H04N19/85, e.g. fractals
    • H04N19/96Tree coding, e.g. quad-tree coding
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/20Analysis of motion
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/124Quantisation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10016Video; Image sequence
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20084Artificial neural networks [ANN]

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

The invention discloses a general inter-frame image coding method based on saliency detection, relating to the technical field of image processing. The method mainly comprises the following steps: extracting, through a saliency detector, the set of moving objects that fall within a preset range threshold after the current frame image is parameterized; screening out, from all coding tree units, the set of coding tree units that cover or are covered by any moving object, according to the overlapping relation between each moving object and the coding tree unit where it is located; acquiring the relative overlapping degree between each coding tree unit in the set and its corresponding moving object; extracting each coding tree unit whose relative overlapping degree exceeds the preset overlapping degree as a saliency region; and correcting the preset quantization parameter with a preset correction value according to the extracted saliency regions, then coding the current frame image with the corrected quantization parameter. During video coding, more bit resources are thereby allocated to the saliency regions, while the data resources spent on non-saliency regions are reduced.

Description

Inter-frame image universal coding method and system based on significance detection
Technical Field
The invention relates to the technical field of image processing, in particular to a method and a system for universal coding of interframe images based on significance detection.
Background
In conventional video coding, each inter-frame picture is usually coded with a constant Quantization Parameter (QP) that depends on the base QP selected by the user. When a Coding Tree Unit (CTU) covers a non-salient region, a block-wise QP Adaptation scheme (QPA) based on the human visual system can transmit that region at lower visual quality without degrading the subjectively perceived coding quality, as long as a human is the final observer. The QPA function is included in the latest video coding standard, Versatile Video Coding (VVC), and its reference software, the VVC Test Model (VTM).
H.266/VVC (Versatile Video Coding) is a standard jointly developed by VCEG and MPEG, and it is currently the latest generation of international video coding standards. Today, more and more multimedia traffic is consumed not by human observers but by computer vision algorithms that analyze the data to solve different tasks, for example in smart-machine applications in the surveillance or autonomous-driving fields. MPEG therefore set up a dedicated group on this so-called Video Coding for Machines (VCM) task to optimize video codecs for machine communication scenarios.
As the amount of data to be processed and the associated latency requirements grow, so does the demand for machine-oriented video compression, and selecting an appropriate algorithm to detect the salient regions before encoding is crucial. On the one hand it must precisely find the important areas containing the relevant objects; on the other hand the saliency detector must be fast enough for real-time applications. This is the direction of the present invention.
Disclosure of Invention
In order to realize high-aging and high-quality processing of a video frame image by a machine during video coding, the invention provides an interframe image universal coding method based on significance detection, which comprises the following steps:
s1: extracting a moving target set in a preset range threshold value after the parameterization of the current frame image through a significance detector;
s2: screening out a coding tree unit set covered with any moving target or covered by any moving target from each coding tree unit according to the overlapping relation between each moving target in the moving target set and the coding tree unit where the moving target is located;
s3: acquiring the relative overlapping degree of each coding tree unit in the coding tree unit set and the corresponding moving target;
s4: extracting the corresponding coding tree unit with the relative overlapping degree larger than the preset overlapping degree as a significance region;
s5: and correcting the preset quantization parameter by using a preset correction value according to the extracted significance region, and coding the current frame image by using the corrected preset quantization parameter.
Furthermore, the saliency detector is a single-target detection network, and the moving target extracted by the single-target detection network contains classification information, area information and a bounding box.
Further, step S1 is followed by the step:
s11: and scaling the width of the moving object bounding box to a preset size.
Further, the preset range threshold includes a non-maximum suppression threshold and an intersection threshold, and the parameterized extraction range of the moving object is as follows:
the moving object identification score is above a non-maximum rejection threshold, and the moving object overlap portion is greater than the intersection threshold.
Further, in step S3, the relative overlapping degree is obtained by a first formula, expressed as:

O_{i,k} = A(overlap(CTU_k, det_i)) / min(A(CTU_k), A(det_i))

where CTU denotes a coding tree unit, det denotes a moving object, overlap denotes the intersection, i is the number of the moving object, k is the number of the coding tree unit, O_{i,k} is the relative overlapping degree of the moving object numbered i with the coding tree unit numbered k, A(CTU_k) is the area of the coding tree unit numbered k, A(det_i) is the area of the moving object numbered i, overlap(CTU_k, det_i) is the intersection region between CTU_k and det_i, and min() takes the minimum of the two areas.
Further, in step S4, the determination of the saliency region is expressed by a second formula:

S_k = 1 if max_i O_{i,k} > T; S_k = 0 otherwise

where S_k is the region type of the coding tree unit numbered k (1 for a saliency region, 0 for a non-saliency region), max_i O_{i,k} is the maximum relative overlapping degree between coding tree unit k and any moving object i, and T is the preset overlapping degree.
Further, in step S5, correcting the preset quantization parameter with the preset correction value according to the extracted saliency region is expressed by a third formula:

QP_k = QP − ΔQP if S_k = 1; QP_k = QP otherwise

where QP is the preset quantization parameter, ΔQP is the preset correction value, and QP_k is the corrected quantization parameter of the coding tree unit numbered k.
The invention also provides a general coding system for interframe images based on significance detection, which comprises the following steps:
the saliency detector is used for extracting a moving target set within a preset range threshold after the current frame image is parameterized;
the coding tree screening unit is used for screening a coding tree unit set covered with any moving target or covered by any moving target from each coding tree unit according to the overlapping relation between each moving target in the moving target set and the coding tree unit;
the overlapping degree calculating unit is used for acquiring the relative overlapping degree of each coding tree unit and the corresponding moving target in the coding tree unit set;
the region extraction unit is used for extracting the corresponding coding tree unit with the relative overlapping degree larger than the preset overlapping degree as a significance region;
and the coding unit is used for correcting the preset quantization parameter by using the preset correction value according to the extracted significance region and coding the current frame image by using the corrected preset quantization parameter.
Furthermore, the saliency detector is a single-target detection network, the moving target extracted by the single-target detection network contains classification information, area information and a boundary box, and the saliency detector further comprises a frame scaling unit for scaling the width of the boundary box of the moving target to a preset size.
Further, the preset range threshold includes a non-maximum suppression threshold and an intersection threshold, and the parameterized extraction range of the moving object is as follows:
the moving object identification score is above a non-maximum rejection threshold, and the moving object overlap portion is greater than the intersection threshold.
Compared with the prior art, the invention at least has the following beneficial effects:
(1) according to the interframe image general coding method and system based on significance detection, a single-target detection network is selected as a significance detector, so that parameterized data processing can be better adapted, and the consumption of data resources can be reduced in the data processing process;
(2) by screening for fully covering or fully covered coding tree units and computing the corresponding overlapping degree, the coding tree units that best represent the moving objects are selected and their quantization parameters are adjusted under the preset rule, so that more data resources are allocated to the saliency regions while fewer are allocated to the non-saliency regions, improving data-resource utilization;
(3) the timeliness of detecting the moving target is effectively improved by using the single-target detection network.
Drawings
FIG. 1 is a diagram of method steps for a method for universal coding of inter-frame images based on saliency detection;
FIG. 2 is a system block diagram of a general inter-frame image coding system based on saliency detection;
fig. 3 is a schematic diagram of window extraction of a sliding window.
Detailed Description
The following are specific embodiments of the present invention and are further described with reference to the drawings, but the present invention is not limited to these embodiments.
Example one
In order to meet the increasing requirements on timeliness and precision of video coding in the existing machine communication scene, as shown in fig. 1, the invention provides a general inter-frame image coding method based on significance detection, which comprises the following steps:
s1: extracting a moving target set in a preset range threshold value after the parameterization of the current frame image through a significance detector;
s2: screening out a coding tree unit set covered with any moving target or covered by any moving target from each coding tree unit according to the overlapping relation between each moving target in the moving target set and the coding tree unit where the moving target is located;
s3: acquiring the relative overlapping degree of each coding tree unit in the coding tree unit set and the corresponding moving target;
s4: extracting the corresponding coding tree unit with the relative overlapping degree larger than the preset overlapping degree as a significance region;
s5: and correcting the preset quantization parameter by using a preset correction value according to the extracted significance region, and coding the current frame image by using the corrected preset quantization parameter.
Based on the latest video coding standard, Versatile Video Coding (VVC), the present invention proposes the above coding steps for video coding in machine communication scenarios. To acquire the initial salient regions more efficiently before encoding, the invention selects a single-object detection network (YOLO) to extract the moving objects.
The earlier R-CNN, Fast R-CNN and Faster R-CNN networks roughly split detection into two sub-problems solved in sequence: object classification and object localization (obtained through regression). The two parts have a causal relationship, so the second can only be solved after the first is completed. The single-object detection network differs from these earlier networks in that it treats object detection directly as a regression problem: after a single inference pass over the input image, the region information of all objects in the image, their categories and the corresponding confidence probabilities are obtained, which gives it an inherent speed advantage over the earlier networks.
Meanwhile, since the moving objects are extracted from the parameterized inter-frame images, YOLO's strength in processing parameterized data applies; and because it is a single-step method with strong real-time performance, the moving objects can be extracted with less data resource (bit) consumption during processing.
However, when moving objects are extracted through the single-object detection network, each extracted object carries a bounding box, and the bounding box has a certain line width. If the bounding box is too wide, it occupies part of the region when the subsequent overlapping degree is calculated and can bias the result (the bounding-box width is constant while the moving-object area is not, so moving objects of different areas paired with proportionally identical coding tree units would yield different relative overlapping degrees), leading to saliency-region determination errors. Therefore, step S1 is followed by the step:
S11: scaling the width of the moving-object bounding box to a preset size (the preset size is set manually according to the precision requirement, so its exact value is not limited here).
Scaling the bounding boxes of the extracted moving objects prevents them from influencing the subsequent relative-overlap calculation and improves the overall accuracy of saliency-region extraction.
After the bounding boxes are refined, the invention selects the objects most likely to be moving objects according to the non-maximum suppression threshold and the intersection threshold. Non-Maximum Suppression (NMS), as the name implies, suppresses candidates that are not local maxima, and can be understood as a local-maximum search over a neighbourhood with two variable parameters: the dimensionality of the neighbourhood and its size. For example, in the face detection shown in fig. 3, after features are extracted from each sliding window and passed through the classifier, every window receives a score. Sliding windows, however, produce many windows that contain, or largely intersect, other windows. NMS is then used to keep the window with the highest score in each area (i.e. the highest probability of being the target) and to suppress the low-scoring ones. On top of the non-maximum suppression threshold, the intersection threshold refines the screened windows once more: because the extracted windows overlap during the sliding process, a window closer to the true moving object overlaps more of the other windows and does so more frequently, so the higher the overlap, the higher the probability that the region is a moving object. Windows whose overlapping degree exceeds the intersection threshold are therefore extracted as the final moving-object set.
In this embodiment, multiple comparison experiments were performed to extract the moving objects better; the non-maximum suppression threshold is set to 0.1 and the intersection threshold to 0.5. This setting ensures that, when windows overlap each other too much, the window with the higher confidence is preferentially taken as the final extraction result. Compared with default thresholds without specific tuning, these two settings noticeably improve the recall rate of the coding tree units.
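The window screening described in this embodiment can be sketched as a standard greedy non-maximum suppression pass. This is a generic illustration with hypothetical names, not the patent's code; it uses the 0.1 suppression threshold mentioned above as its default.

```python
# Greedy NMS sketch: repeatedly keep the highest-scoring window and suppress
# remaining windows whose IoU with it exceeds the threshold. Illustrative only.

def iou(a, b):
    ix0, iy0 = max(a[0], b[0]), max(a[1], b[1])
    ix1, iy1 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix1 - ix0) * max(0, iy1 - iy0)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union else 0.0

def nms(boxes, scores, thresh=0.1):
    # Visit windows in decreasing score order
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)
        keep.append(best)
        # Suppress every remaining window that overlaps the kept one too much
        order = [i for i in order if iou(boxes[best], boxes[i]) <= thresh]
    return keep
```

With a low threshold such as 0.1, heavily overlapping windows collapse onto the single most confident one, as described above.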
Based on the extracted moving-object set, the key question is how to extract the suitable coding tree units, among the many coding tree units of the current frame image, to serve as saliency regions. The overlapping region of a coding tree unit and a moving object is the intersection of the two:

overlap(CTU_k, det_i) = CTU_k ∩ det_i

where CTU denotes a coding tree unit, det a moving object, overlap the intersection, i the number of the moving object, k the number of the coding tree unit, A(CTU_k) the area of the coding tree unit numbered k, A(det_i) the area of the moving object numbered i, and overlap(CTU_k, det_i) the intersection region between them.
To decide when a coding tree unit should be defined as a saliency region, a suitable threshold must be found, and two cases can be considered salient. In the first case, the moving-object region extracted by the single-object detection network is smaller than the coding tree unit, so the overlapping region overlap(CTU_k, det_i) cannot be larger than the moving-object area A(det_i). In the second case, the moving-object region is larger than the coding tree unit, so the overlapping region cannot be larger than the coding-tree-unit area A(CTU_k). Based on this, the present invention proposes the concept of relative overlapping degree to determine the saliency region, expressed by the first formula:

O_{i,k} = A(overlap(CTU_k, det_i)) / min(A(CTU_k), A(det_i))

where O_{i,k} is the relative overlapping degree of the moving object numbered i with the coding tree unit numbered k, and min() takes the minimum of the two areas.
For the above two cases, when the moving object lies entirely within the coding tree unit, or the coding tree unit is entirely covered by the moving object, and the relative overlapping degree exceeds the preset overlapping degree (set manually according to the precision requirement), the unit is regarded as a saliency region; when the moving object and the coding tree unit do not overlap sufficiently, it is regarded as a non-saliency region. This is expressed by the second formula:

S_k = 1 if max_i O_{i,k} > T; S_k = 0 otherwise

where S_k is the region type of the coding tree unit numbered k (1 for a saliency region, 0 for a non-saliency region), max_i O_{i,k} is the maximum relative overlapping degree between coding tree unit k and any moving object i, and T is the preset overlapping degree.
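A minimal sketch of the first and second formulas, assuming axis-aligned boxes given as (x0, y0, x1, y1) tuples; the function names are illustrative, not the patent's implementation.

```python
# First formula: relative overlap O_{i,k}, normalised by the smaller area.
# Second formula: S_k = 1 iff the maximum O_{i,k} exceeds the preset overlap T.

def box_area(box):
    return max(0, box[2] - box[0]) * max(0, box[3] - box[1])

def relative_overlap(ctu, det):
    ix0, iy0 = max(ctu[0], det[0]), max(ctu[1], det[1])
    ix1, iy1 = min(ctu[2], det[2]), min(ctu[3], det[3])
    inter = max(0, ix1 - ix0) * max(0, iy1 - iy0)
    return inter / min(box_area(ctu), box_area(det))

def region_type(ctu, dets, preset_overlap):
    """S_k: 1 for a saliency region, 0 for a non-saliency region."""
    return int(max(relative_overlap(ctu, d) for d in dets) > preset_overlap)
```

An object fully inside a CTU, or a CTU fully covered by an object, yields a relative overlap of 1.0, matching both limiting cases discussed above.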
Then, according to the final classification result S_k, the quantization parameter of each coding tree unit is adjusted under the preset rule, expressed by the third formula:

QP_k = QP − ΔQP if S_k = 1; QP_k = QP otherwise

where QP is the preset quantization parameter, ΔQP is the preset correction value, and QP_k is the corrected quantization parameter of the coding tree unit numbered k. Each coding tree unit can then be encoded with its corrected quantization parameter.
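The third formula amounts to a one-line per-CTU correction; the sign convention (subtracting the correction value for salient units so they receive more bits) follows the stated goal of the invention, and the names are illustrative.

```python
# Third formula sketch: QP_k = QP - dQP when S_k = 1, else the base QP.

def corrected_qp(s_k, base_qp, delta_qp):
    return base_qp - delta_qp if s_k == 1 else base_qp
```

Each coding tree unit is then encoded with the QP this function returns.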
The ultimate purpose of the invention is to reduce the data resources (bits) allocated to non-saliency regions: moving objects are screened throughout the whole process, the optimal coding tree units are selected, and through this chain of data-resource controls the overall data-resource consumption of inter-frame image coding is greatly reduced.
Example two
In order to better understand the technical content of the present invention, this embodiment explains the present invention by way of a system structure, as shown in fig. 2, a general coding system for inter-frame images based on saliency detection includes:
the saliency detector is used for extracting a moving target set within a preset range threshold after the current frame image is parameterized;
the coding tree screening unit is used for screening a coding tree unit set covered with any moving target or covered by any moving target from each coding tree unit according to the overlapping relation between each moving target in the moving target set and the coding tree unit;
the overlapping degree calculating unit is used for acquiring the relative overlapping degree of each coding tree unit and the corresponding moving target in the coding tree unit set;
the region extraction unit is used for extracting the corresponding coding tree unit with the relative overlapping degree larger than the preset overlapping degree as a significance region;
and the coding unit is used for correcting the preset quantization parameter by using the preset correction value according to the extracted significance region and coding the current frame image by using the corrected preset quantization parameter.
The saliency detector selects a single-target detection network, a moving target extracted by the single-target detection network contains classification information, area information and a boundary frame, and the saliency detector also comprises a frame scaling unit used for scaling the width of the boundary frame of the moving target to a preset size.
Meanwhile, the preset range threshold comprises a non-maximum inhibition threshold and an intersection threshold, and the parameterized extraction range of the moving target is as follows:
the moving object identification score is above a non-maximum rejection threshold, and the moving object overlap portion is greater than the intersection threshold.
In summary, the method and system for coding inter-frame images based on significance detection in the present invention select a single-target detection network as the significance detector, so as to better adapt to parameterized data processing, and reduce the consumption of data resources in the data processing process.
By screening for fully covering or fully covered coding tree units and computing the corresponding overlapping degree, the coding tree units that best represent the moving objects are selected and their quantization parameters are adjusted under the preset rule, so that more data resources are allocated to the saliency regions while fewer are allocated to the non-saliency regions, improving data-resource utilization. Meanwhile, using the single-target detection network effectively improves the timeliness of moving-object detection.
It should be noted that all the directional indicators (such as up, down, left, right, front, and rear … …) in the embodiment of the present invention are only used to explain the relative position relationship between the components, the movement situation, etc. in a specific posture (as shown in the drawing), and if the specific posture is changed, the directional indicator is changed accordingly.
Moreover, descriptions of the present invention as relating to "first," "second," "a," etc. are for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include at least one such feature. In the description of the present invention, "a plurality" means at least two, e.g., two, three, etc., unless specifically limited otherwise.
In the present invention, unless otherwise expressly stated or limited, the terms "connected," "secured," and the like are to be construed broadly, and for example, "secured" may be a fixed connection, a removable connection, or an integral part; can be mechanically or electrically connected; they may be directly connected or indirectly connected through intervening media, or they may be connected internally or in any other suitable relationship, unless expressly stated otherwise. The specific meanings of the above terms in the present invention can be understood by those skilled in the art according to specific situations.
In addition, the technical solutions in the embodiments of the present invention may be combined with each other, but it must be based on the realization of those skilled in the art, and when the technical solutions are contradictory or cannot be realized, such a combination of technical solutions should not be considered to exist, and is not within the protection scope of the present invention.

Claims (8)

1. A method for coding an inter-frame image based on saliency detection is characterized by comprising the following steps:
s1: extracting a moving target set in a preset range threshold value after the parameterization of the current frame image through a significance detector;
s2: screening out a coding tree unit set covered with any moving target or covered by any moving target from each coding tree unit according to the overlapping relation between each moving target in the moving target set and the coding tree unit where the moving target is located;
s3: acquiring the relative overlapping degree of each coding tree unit in the coding tree unit set and the corresponding moving target;
s4: extracting the corresponding coding tree unit with the relative overlapping degree larger than the preset overlapping degree as a significance region;
s5: correcting the preset quantization parameter by using a preset correction value according to the extracted significance region, and coding the current frame image by using the corrected preset quantization parameter;
the saliency detector is a single-target detection network, and moving targets extracted by the single-target detection network contain classification information, area information and a boundary frame;
in step S3, the relative overlapping degree is obtained by a first formula:

O_{i,k} = A(overlap(CTU_k, det_i)) / min(A(CTU_k), A(det_i))

where CTU denotes a coding tree unit, det denotes a moving object, overlap denotes the intersection, i is the number of the moving object, k is the number of the coding tree unit, O_{i,k} is the relative overlapping degree of the moving object numbered i with the coding tree unit numbered k, A(CTU_k) is the area of the coding tree unit numbered k, A(det_i) is the area of the moving object numbered i, overlap(CTU_k, det_i) is the intersection region between them, and min() takes the minimum of the two areas.
2. The method as claimed in claim 1, wherein the step S1 is further followed by:
S11: scaling the width of the moving object bounding box to a preset size.
3. The method as claimed in claim 1, wherein the preset range threshold includes a non-maximum suppression threshold and an intersection threshold, and the parameterized extraction range of the moving object is:
the moving object recognition score is above the non-maximum suppression threshold, and the moving object overlap portion is greater than the intersection threshold.
4. The method as claimed in claim 1, wherein in the step S4, the determination of the saliency region is expressed by a second formula:

S_k = 1 if max_i O(i,k) > O_th, otherwise S_k = 0

in the formula, S_k is the region type of the coding tree unit numbered k, where 1 denotes a saliency region and 0 a non-saliency region, max_i O(i,k) is the maximum relative overlap degree between the coding tree unit numbered k and the moving objects numbered i, and O_th is the preset overlap degree.
5. The method as claimed in claim 4, wherein in the step S5, correcting the preset quantization parameter with the preset correction value according to the extracted saliency region is expressed by a third formula:

QP_k = QP - ΔQP if the coding tree unit numbered k is a saliency region, otherwise QP_k = QP

in the formula, QP is the preset quantization parameter, ΔQP is the preset correction value, and QP_k is the corrected preset quantization parameter of the coding tree unit numbered k.
6. A system for universal coding of inter-frame images based on saliency detection, comprising:
the saliency detector is used for extracting a set of moving objects within a preset range threshold after the current frame image is parameterized;
the coding tree screening unit is used for screening out, from the coding tree units, a set of coding tree units that cover or are covered by any moving object, according to the overlapping relation between each moving object in the moving object set and the coding tree unit where it is located;
the overlap degree calculation unit is used for acquiring the relative overlap degree between each coding tree unit in the coding tree unit set and its corresponding moving object;
the region extraction unit is used for extracting the coding tree units whose relative overlap degree is larger than a preset overlap degree as saliency regions;
the coding unit is used for correcting a preset quantization parameter with a preset correction value according to the extracted saliency regions, and coding the current frame image with the corrected preset quantization parameter;
wherein the saliency detector is a single-object detection network, and the moving objects extracted by the single-object detection network contain classification information, region information and a bounding box;
in the overlap degree calculation unit, the relative overlap degree is obtained by a first formula, expressed as:

O(i,k) = overlap(CTU_k, det_i) / min(CTU_k, det_i)

in the formula, CTU represents a coding tree unit, det represents a moving object, overlap represents intersection, i is the number of the moving object, k is the number of the coding tree unit, O(i,k) is the relative overlap degree of the moving object numbered i with the coding tree unit numbered k, CTU_k is the region of the coding tree unit numbered k, det_i is the region of the moving object numbered i, overlap(CTU_k, det_i) is the intersection region between CTU_k and det_i, and min() takes the minimum of the two regions.
7. The system as claimed in claim 6, wherein the saliency detector further comprises a bounding box scaling unit for scaling the width of the moving object bounding box to a preset size.
8. The system as claimed in claim 6, wherein the preset range threshold includes a non-maximum suppression threshold and an intersection threshold, and the parameterized extraction range of the moving object is:
the moving object recognition score is above the non-maximum suppression threshold, and the moving object overlap portion is greater than the intersection threshold.
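Taken together, the first, second, and third formulas in the claims above describe a per-CTU pipeline: compute the relative overlap of a coding tree unit with each detected moving object, mark the CTU as a saliency region when the best overlap exceeds the preset threshold, and adjust its quantization parameter by the preset correction value. The sketch below is illustrative only, under stated assumptions: axis-aligned (x, y, w, h) rectangles, the hypothetical parameter names qp_base, delta_qp, and overlap_threshold, and a subtractive QP correction in saliency regions; none of these details are fixed by the patent text.

```python
# Illustrative sketch of the claimed overlap / saliency / QP-correction steps.
# Rectangles are (x, y, w, h) tuples; the parameter names below (qp_base,
# delta_qp, overlap_threshold) are assumptions, not taken from the claims.

def intersect_area(a, b):
    """Area of the intersection of two axis-aligned rectangles (x, y, w, h)."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    w = min(ax + aw, bx + bw) - max(ax, bx)
    h = min(ay + ah, by + bh) - max(ay, by)
    return max(w, 0) * max(h, 0)

def relative_overlap(ctu, det):
    """First formula: intersection area over the smaller of the two areas."""
    area_ctu = ctu[2] * ctu[3]
    area_det = det[2] * det[3]
    return intersect_area(ctu, det) / min(area_ctu, area_det)

def corrected_qp(ctu, detections, qp_base, delta_qp, overlap_threshold):
    """Second and third formulas: mark the CTU salient when its best relative
    overlap exceeds the preset overlap degree, and lower its QP by the preset
    correction value in that case (assumed subtractive correction)."""
    best = max((relative_overlap(ctu, det) for det in detections), default=0.0)
    s_k = 1 if best > overlap_threshold else 0       # second formula
    return qp_base - delta_qp if s_k else qp_base    # third formula
```

For example, a 64x64 CTU that fully contains a 32x32 detection has relative overlap 1.0 (the detection is the smaller region), so with overlap_threshold 0.5 the CTU would be coded at qp_base - delta_qp.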
CN202111218449.1A 2021-10-20 2021-10-20 Inter-frame image universal coding method and system based on significance detection Active CN113660498B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111218449.1A CN113660498B (en) 2021-10-20 2021-10-20 Inter-frame image universal coding method and system based on significance detection

Publications (2)

Publication Number Publication Date
CN113660498A CN113660498A (en) 2021-11-16
CN113660498B true CN113660498B (en) 2022-02-11

Family

ID=78484279

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111218449.1A Active CN113660498B (en) 2021-10-20 2021-10-20 Inter-frame image universal coding method and system based on significance detection

Country Status (1)

Country Link
CN (1) CN113660498B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104853196A (en) * 2014-02-18 2015-08-19 华为技术有限公司 Coding and decoding method and device
CN111432207A (en) * 2020-03-30 2020-07-17 北京航空航天大学 Perceptual high-definition video coding method based on salient target detection and salient guidance
CN111726633A (en) * 2020-05-11 2020-09-29 河南大学 Compressed video stream recoding method based on deep learning and significance perception

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP3425911A1 (en) * 2017-07-06 2019-01-09 Thomson Licensing A method and a device for picture encoding and decoding


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20220302

Address after: 315800 building 6, No. 88 Kangda Road, Meishan bonded port area, Ningbo, Zhejiang

Patentee after: Ningbo Kangda kaineng Medical Technology Co.,Ltd.

Address before: 315800 No. 88, Kangda Road, Meishan bonded port area, Beilun District, Ningbo City, Zhejiang Province

Patentee before: Kangda Intercontinental Medical Devices Co.,Ltd.
